The available server environment (ASE) logger daemon
(aselogger) tracks the ASE messages generated by all the
member systems.
A logger daemon can be run on one or more member systems.
If you have more than one member system running a logger daemon, you will
have virtually duplicate logs on the member systems.
Messages appear in the
log files in the order that they were logged, not necessarily in the order
that they occurred.
During installation, the TruCluster software installation procedure prompted you to specify whether to run the ASE logger daemon each time the system is booted. If you chose not to run the logger daemon at that time, enter the following command and then reboot the system; the logger daemon then starts each time the system is booted:
# rcmgr set ASELOGGER 1
Note
A temporary network outage or a high network load may cause the
aselogger daemon to overflow its message queue, resulting in the loss of some log messages on the system running the daemon. To avoid losing messages, run the aselogger daemon on each member system.
The ASE logger daemon logs messages generated by the
asemgr
utility, the director daemon, the agent daemon, and the logger
daemon.
Messages generated by the host status monitor (HSM) daemon and the
availability manager (AM) driver are logged to the local system.
In addition,
if the ASE logger daemon stops, all daemon messages are logged only to the
system on which they occurred.
Note that when the TruCluster software first
starts, the initial messages that are generated may be logged only to the
local system.
The logger daemon uses the DIGITAL UNIX event logging facility,
syslog, to collect messages that are logged by the various kernel,
command, utility, and application processes.
Messages are either logged
to a local file or forwarded to a remote system, as specified in the
/etc/syslog.conf
file on each member system running the logger
daemon.
The
/etc/syslog.conf
event logging configuration
file specifies how a member system logs messages.
If you use the default
logging configuration, all
asemgr
utility and ASE daemon
messages are logged to the
/var/adm/syslog.dated/date
/daemon.log
file.
The AM
driver messages are logged to the
/kern.log
file in the
same directory.
In addition, you
can set the severity level of ASE error logging by using the
asemgr
utility.
This allows you to limit the ASE messages that are logged.
See
Section 12.1.3
for more information.
To examine the ASE messages generated by the
asemgr
utility and the logger, director, and agent daemons, check the event logging
files of any member system that is running a logger daemon.
To examine the
ASE messages generated by the HSM daemon and the AM driver on a particular
member system, check that system's event logging files.
Appendix A
contains a partial list of important event messages and their descriptions.
The following example shows a remote message and a local message on
member system
gideontc, which is running a logger daemon:
Jan 27 11:22:27 gideontc ASE: pigeon Agent Error: HSM reported state
change of zen.tst.com, a non-member host
Jan 27 11:22:29 gideontc ASE: local AseLogger Notice: connected to Agent
The ASE messages are logged in a specific format and include the following information:
Date and timestamp.
Local system name.
Identifier (not used in messages from the AM driver).
System that generated the message--Note
that
local
is specified if the message was logged locally
or was not logged using the logger daemon.
This information is not specified
in messages from the AM driver.
Source of the message--The following components can generate an ASE message:
| AseMgr | The
asemgr
utility |
| Director | The ASE director daemon |
| Agent | The ASE agent daemon |
| HSM | The HSM daemon |
| AseLogger | The logger daemon |
| AM | The AM driver |
| vmunix | The kernel |
| AseUtility | A process or daemon unrelated to ASE |
Severity of the message--The severity level is not included in messages from the AM driver. Messages can have the following severity levels:
| info | A low-level informational message |
| notice | A high-level informational message about significant activity in the ASE |
| warning | A message about activity in the ASE that may indicate an error condition |
| error | A message about an error that was detected |
| alert | A message about a critical condition that requires immediate attention |
Message text.
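The field layout described above can be illustrated with a short shell sketch. It is not part of the TruCluster software: parse_ase_line is a hypothetical helper, and it assumes that each message occupies a single line in the log file and that the fields are separated by spaces, as in the example. Messages from the AM driver do not carry the ASE identifier, so the sketch skips them.

```shell
#!/bin/sh
# Sketch only: split one ASE log line into the fields described above.
# parse_ase_line is a hypothetical helper, not a TruCluster command.
parse_ase_line() {
    printf '%s\n' "$1" | awk '
    $5 == "ASE:" {                  # AM driver messages lack this identifier
        sev = $8
        sub(/:$/, "", sev)          # drop the colon that ends the severity
        msg = ""
        for (i = 9; i <= NF; i++)   # the rest of the line is the message text
            msg = msg (i > 9 ? " " : "") $i
        printf "date=%s %s %s\n", $1, $2, $3
        printf "logged_on=%s\n", $4
        printf "origin=%s\n", $6
        printf "source=%s\n", $7
        printf "severity=%s\n", sev
        printf "text=%s\n", msg
    }'
}

parse_ase_line 'Jan 27 11:22:29 gideontc ASE: local AseLogger Notice: connected to Agent'
```

For the local message from the example, the sketch reports gideontc as the system that logged the message, local as the originating system, AseLogger as the source, and Notice as the severity.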
The ASE action scripts capture any output from the commands that
they execute.
If the action script fails, the command output is logged as
errors and the source of the message is specified in the log files as
AseUtility.
By default, the ASE logger daemon logs alert messages in the
daemon.log
file in the
/var/adm/syslog.dated/date
directory and notification is sent to root on
the local system.
You can use the
mailsetup
program to
configure mail so that the superuser can receive error alert messages from
the ASE logger daemon.
(You can use the Mail option on the
setup
utility menu to run this program.) See
mailsetup(8)
for more
information.
For information on setting up mail to fail over, see
Section 5.5.
To determine which member systems are running a logger daemon, choose the "Obtaining ASE Status" item from the ASE Main Menu and then choose the "Display the location(s) of the logger" item.
ASE message logging uses the DIGITAL UNIX
syslog
function and
syslogd
daemon.
However,
you can use the
asemgr
utility to specify the lowest severity
level of the messages that you want the ASE logger daemon to log, which limits
the messages that are logged.
There are five possible severity levels that can be logged, as described in Section 12.1.1. The following table describes the types of messages associated with the possible severity levels:
| Message Type | Description |
| Informational | Logs messages of all severity levels. This is the default. |
| Notice, warning, and error logging | Logs messages with the
notice,
warning,
error, and
alert
severity levels. |
| Warning and error logging | Logs messages with the
warning,
error, and
alert
severity levels. |
| Error logging only | Logs messages with the
error
and
alert
severity levels. |
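The effect of each logging level in the table above can be sketched as a filter over messages that have already been logged. This is only an illustration: the logger daemon does the real filtering, ase_filter is a hypothetical helper, and the single-letter level codes (i, n, w, e) are an assumption chosen for brevity. The sketch also assumes that the severity appears as the eighth space-separated field, as in the example messages in Section 12.1.1.

```shell
#!/bin/sh
# Sketch only: keep the ASE log lines that a given logging level would record.
# ase_filter is a hypothetical helper; it reads log lines on standard input.
# Level codes (an assumption): i = informational, n = notice and above,
# w = warning and above, e = error and alert only.
ase_filter() {
    awk -v min="$1" '
    BEGIN {
        rank["info"] = 1; rank["notice"] = 2; rank["warning"] = 3
        rank["error"] = 4; rank["alert"] = 5
        threshold["i"] = 1; threshold["n"] = 2
        threshold["w"] = 3; threshold["e"] = 4
        if (!(min in threshold)) exit 2     # unknown level code
    }
    $5 == "ASE:" {
        sev = tolower($8)
        sub(/:$/, "", sev)                  # "Error:" becomes "error"
        if ((sev in rank) && rank[sev] >= threshold[min]) print
    }'
}

printf '%s\n' \
    'Jan 27 11:22:27 gideontc ASE: pigeon Agent Error: HSM reported state change' \
    'Jan 27 11:22:29 gideontc ASE: local AseLogger Notice: connected to Agent' |
    ase_filter w
```

With the w level, only the Error message from the pigeon system passes the filter; with the n or i level, both messages pass.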
To set the severity level for message logging, choose the "Set the logging level" item from the Managing the ASE menu. Example 12-1 shows how to set the severity level for message logging.
Enter the logging level for the ASE:
i) Informational (log everything)
n) Notice, warning, and error logging
w) Warning and error logging
e) Error logging only
x) Exit to Managing the ASE
Enter your choice [i]: n
You can display the severity level of the messages being logged by choosing the "Display the level of logging" item from the Obtaining ASE Status menu.
To disable ASE event logging on a member system, you must stop ASE
services, which stops the enabled logger daemon, reset the
ASELOGGER
parameter to zero, and restart ASE services for the change to take
effect.
Enter the following commands to disable ASE event logging on a member system:
# /sbin/init.d/asemember stop
# rcmgr set ASELOGGER 0
# /sbin/init.d/asemember start
TruCluster software provides you with a script that
executes a specified task when an error with the alert severity level occurs.
You use the
asemgr
utility to edit and test the script.
The default error alert script sends mail to users that you specify in the script. You can edit the script to specify which users will receive the mail, and you can specify some other action to take when a severe error occurs.
To edit the error alert script, choose the "Edit the error alert script"
item from the Managing the ASE menu.
The
asemgr
utility
invokes the
vi
editor or the editor defined by the
EDITOR
environment variable, and you can specify the users to which
you want mail sent or you can make other changes to the script.
Example 12-2 shows how to edit the error alert script.
# Define ADMIN on next line to get mail for critical ASE errors
ADMIN=root
PATH=/sbin:/usr/sbin:/usr/bin
export PATH
ERR_FILE=/var/ase/tmp/alertMsg
TIME=`date +"%D %T"`
HSM_STATUS=`awk -F: '{print $2}' ${ERR_FILE} | sed 's/ //g'`
case "${HSM_STATUS}" in
HSM_NI_STATUS)
awk -f /var/ase/lib/ni_status_awk ${ERR_FILE}
;;
HSM_PATH_STATUS)
awk -f /var/ase/lib/path_status_awk ${ERR_FILE}
;;
esac
if [ -n "${ADMIN}" ]; then
if [ ! -f "${ERR_FILE}" ]; then
echo "Critical ASE error detected on `date`" > ${ERR_FILE}
fi
mailx -s "***Critical ASE error - ${TIME}" ${ADMIN} < ${ERR_FILE}
fi
rm -f ${ERR_FILE}
:wq
To test the error alert script, choose the "Test the error alert script" item from the Managing the ASE menu. ASE sends a test alert message to the logger daemon and invokes the error alert script.
You can reset the available server environment (ASE) daemons on a member system if problems occur in the ASE. Resetting the ASE daemons stops the ASE director, logger, and host status monitor (HSM) daemons and initializes the ASE agent daemons on a system. The agent daemons then restart all the daemons to make the ASE fully operational. If resetting the ASE daemons does not fix the problem, you can initialize or reboot the member system.
To reset the ASE daemons on a member system, use the following command:
# /sbin/init.d/asemember restart
You must ensure that the available server environment (ASE) daemons do not time out because other system processes have a higher scheduling priority. The ASE daemons must have a scheduling priority that is higher than that of normal system processes so that they can respond to administrative commands and other events in the ASE. The daemons' high priority enables the ASE to operate even when the member systems are busy. See the DIGITAL UNIX System Administration manual for information about scheduling processes.
If there are processes other than those generated by the ASE with a scheduling priority that is higher than the priority of the ASE daemons, the daemons could time out while waiting to run. If this occurs, messages such as the following are written to the log file, indicating that operations are timing out:
Mar 8 13:09:28 surry ASE: surry AseMgr Error: ASE timeout -
Unable to stop service.
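One way to check whether operations are timing out is to count these messages in a log file. The following is a minimal sketch, not a TruCluster tool: count_ase_timeouts is a hypothetical helper, and it assumes that the text "ASE timeout" appears on the same line as the rest of the message, as in the example above.

```shell
#!/bin/sh
# Sketch only: count ASE timeout messages.
# count_ase_timeouts is a hypothetical helper. It reads the named log
# files, or standard input if no file is given.
count_ase_timeouts() {
    awk '$5 == "ASE:" && /ASE timeout/ { n++ } END { print n + 0 }' "$@"
}

printf '%s\n' \
    'Mar 8 13:09:28 surry ASE: surry AseMgr Error: ASE timeout - Unable to stop service.' \
    'Mar 8 13:10:02 surry ASE: local AseLogger Notice: connected to Agent' |
    count_ase_timeouts
```

To check a real log, pass the path of a daemon.log file as the argument.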
The ASE agent daemon (aseagent) and logger daemon (aselogger) are started
in the
/sbin/init.d/asemember
script with a "nice"
value of -5, which raises the priority of the daemons.
The processes that
descend from the ASE daemons inherit the raised scheduling priority.
For
example, the director daemon (asedirector) and any programs
or scripts started by the ASE daemons have the same raised priority as the
agent and logger daemons.
You can raise the scheduling priority of the ASE daemons
by changing the "nice" value specified in the lines in the
/sbin/init.d/asemember
file that start the
aseagent
and
aselogger
daemons.
See
nice(1)
for more information
about scheduling priorities.
Note that ASE daemons started with a "nice" priority will
not always stay at that priority.
Over time, if the member systems do not
reboot, the daemons' priority may return to the average run priority.
When
the member systems reboot, the daemons' priority is raised again according
to the "nice" value in the
/sbin/init.d/asemember
script.
Therefore, the default
/sbin/init.d/asemember
script
contains the following command, which supersedes the "nice" value
for the
asehsm
daemon and runs the daemon with a fixed
high priority that does not degrade over time:
aseagent -p hsm
If you do not want the fixed high priority for the
asehsm
daemon, remove this command from the
/sbin/init.d/asemember
script.
You can also raise and fix the priority of the
aseagent,
asedirector, and
asehsm
daemons by including
the following command in the
/sbin/init.d/asemember
script:
The connection
manager's monitor daemon,
cnxmond,
is started on all cluster
members by the
/sbin/init.d/clumember
script.
The
cnxmond
daemon has the following two options that, when multiplied,
specify the longest duration that communications can be inoperative between
a system and the connection manager monitor daemon:
The
-p
option specifies a ping interval;
that is, the interval during which at least one ping must be received from
the
cnxpingd
daemon on a member system.
(Normally, the
cnxpingd
daemon sends two pings during this interval.)
The
-D
option is a multiplier that determines
a timeout interval, which is based on the ping interval.
cnxmond -p 10 -D 6
This command specifies a ping interval of 10 seconds and a multiplier of 6, which results in a 60-second timeout interval.
When started by the
clumember
script, the
cnxmond
daemon searches the
/etc/rc.config
file to determine the values for the
-p
and
-D
options.
The value for the
-p
option is obtained from the
CNX_INTERVAL
variable;
the value for the
-D
option is obtained from the
CNX_WAVES
variable.
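The relationship between the two variables and the resulting timeout interval can be sketched in the shell. This is an illustration only: rc_value and cnx_timeout are hypothetical helpers, and the sketch assumes that the configuration file stores each variable as a NAME="value" assignment on its own line. The sample values are the ones used in the example above, not necessarily the defaults.

```shell
#!/bin/sh
# Sketch only: compute the cnxmond timeout interval from a configuration file.
# rc_value and cnx_timeout are hypothetical helpers.

# Print the value assigned to variable $1 in file $2 (NAME="value" lines).
rc_value() {
    awk -F= -v name="$1" '
    $1 == name { v = $2; gsub(/"/, "", v); val = v }
    END { print val }' "$2"
}

# timeout interval = ping interval (-p) multiplied by the multiplier (-D)
cnx_timeout() {
    interval=$(rc_value CNX_INTERVAL "$1")
    waves=$(rc_value CNX_WAVES "$1")
    [ -n "$interval" ] && [ -n "$waves" ] || return 1
    echo $(( ${interval} * ${waves} ))
}

# Try the helpers on a sample file rather than on /etc/rc.config.
sample=${TMPDIR:-/tmp}/rc.config.sample.$$
printf 'CNX_INTERVAL="10"\nCNX_WAVES="6"\n' > "$sample"
cnx_timeout "$sample"
rm -f "$sample"
```

With the sample values, the sketch prints 60, which matches the 60-second timeout interval in the example.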
When the
cnxmond
daemon detects an interruption in
communications with a system (that is, no ping is received during the timeout
interval), the connection manager removes the system from the cluster.
Investigate
the source of the communications problem and, if necessary, use the
rcmgr set
command to increase the value of the
CNX_WAVES
variable.
For example, to change the value of
CNX_WAVES
to 10, enter the following command:
# rcmgr set CNX_WAVES 10