[Contents] [Prev. Chapter] [Next Section] [Next Chapter] [Index] [Help]

2    Managing ASE Members and Cluster Members

An available server environment (ASE) manages a collection of systems and the shared SCSI buses to which they are connected, and provides an environment in which services can be started, stopped, and automatically relocated in response to a software or hardware failure. ASEs provide high availability, increase storage capacity, reduce performance bottlenecks, and permit a wider range of configurations.

An Available Server configuration contains only a single ASE. A Production Server configuration contains one or more ASEs. As discussed in the TruCluster Software Products Software Installation manual, when you install the Production Server Software, you identify each ASE that is to exist within the cluster by assigning an ASE identifier (ASE_ID) to those members that are to participate in an ASE.

The ASE_ID is a value from 0 to 63 that cluster software uses to uniquely identify the ASE in which the system resides within the cluster. Each ASE has a unique ASE_ID; all systems in the same ASE share the same ASE_ID. (In an Available Server configuration, all systems have an ASE_ID of 0. You are not prompted to supply an ASE_ID during software installation.)
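The range rule above is easy to check before you commit to a value. The following is a minimal sketch, assuming a POSIX shell; is_valid_ase_id is a hypothetical helper name, not a TruCluster command.

```shell
# Return success (0) if $1 is a whole number from 0 to 63, the ASE_ID range
# stated above; return failure (1) otherwise.
# is_valid_ase_id is a hypothetical helper, not part of the cluster software.
is_valid_ase_id() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;    # empty, or contains a non-digit character
    esac
    # Reject long digit strings before the numeric comparison.
    [ "${#1}" -le 2 ] || return 1
    [ "$1" -le 63 ]
}
```

For example, is_valid_ase_id 12 succeeds, and is_valid_ase_id 64 fails.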

The following sections describe how to manage the membership of ASEs and clusters.



2.1    Adding a New System to a Cluster (PS)

To add a new system to an existing Production Server configuration, follow these steps:

  1. Follow the instructions in the TruCluster Software Products Hardware Configuration manual to physically connect the new system to existing shared storage, the MEMORY CHANNEL interconnect, and external networks.

  2. Connect the new system to the MEMORY CHANNEL subnet, turn on the power, and boot the system.

  3. Install the same versions of the DIGITAL UNIX operating system and TruCluster software on the new system as you installed on existing cluster member systems. Decide which available server environment (ASE), if any, the new system will belong to and specify the correct ASE_ID during cluster base subset configuration.

  4. Ensure that the settings of the following /etc/sysconfigtab attributes are the same on the new system as on all current member systems:

    Warning

    Failure to match the current configuration can cause one or more systems to panic when you attempt to add the new system to the cluster.

  5. Modify the /etc/hosts file on the new system to add entries for the IP address and hostname associated with the MEMORY CHANNEL subnet for each member system, including the new system itself.

  6. Modify the /etc/hosts file on existing cluster member systems to include an entry for the new system's MEMORY CHANNEL IP address and hostname.

  7. If the new cluster member is to belong to an ASE, run the asemgr utility on an existing member of the ASE to which the new system will belong and add the new member. The updated ASE database will be propagated to all ASE members when the new system is rebooted.

See Section 2.2 for instructions on adding a new member to an ASE.
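Steps 5 and 6 both depend on every member's MEMORY CHANNEL hostname being present in each /etc/hosts file. The following is a minimal sketch of a check you could run on each system, assuming a POSIX shell with sed and awk; missing_hosts_entries is a hypothetical helper, and the hostnames you pass to it are whatever names your site uses.

```shell
# Print each given hostname that has no entry in the given hosts file.
# Exit status is 0 when every name is present, 1 otherwise.
# missing_hosts_entries is a hypothetical helper, not part of TruCluster.
missing_hosts_entries() {
    hosts_file=$1
    shift
    rc=0
    for name in "$@"; do
        # Strip comments, then look for the name in any field after the address.
        if ! sed 's/#.*//' "$hosts_file" |
             awk -v n="$name" '{ for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
                               END { exit !found }'; then
            echo "$name"
            rc=1
        fi
    done
    return $rc
}
```

A call such as missing_hosts_entries /etc/hosts name1 name2 prints nothing when both names are present, and prints each absent name otherwise.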



2.2    Adding Member Systems to an ASE

The following requirements pertain to adding member systems to an available server environment (ASE):

Use the asemgr utility to add one member system at a time to the ASE. The system on which you run the asemgr utility for the first time is your first member system. You must add at least one other member system to your ASE. Add all the member systems from the same system. Do not run the asemgr utility on one system and add one member system, and then run the asemgr utility on another system and add a different member system.

In an Available Server configuration, after you enter the names of all the member systems, you are prompted for additional network interfaces for each member system. Before you add a member system, all network interfaces must be configured on the system. See Section 3.3.3 for information about using multiple networks in an Available Server configuration.

If you want to add member systems to an existing ASE, choose the "Add a member" item from the Managing the ASE menu. A list of the current member systems is displayed and you can add additional member systems, one at a time. You are then prompted for additional network interfaces for the new member systems. Example 2-1 shows how to add member systems to the ASE.

Example 2-1:  Adding Member Systems to the ASE

                Managing the ASE
 
    a)  Add a member
    d)  Delete a member
    n)  Modify the network configuration
    m)  Display the status of the members
    C)  Display the configuration of the ASE database
    l)  Set the logging level
    e)  Edit the error alert script
    t)  Test the error alert script
     )  Enable ASE V1.5 functionality
 
    q)  Quit (back to the Main Menu) 
    x)  Exit to the Main Menu              ?)  Help
 
 
Enter your choice [x]: a
 
Member List: tototc, gideontc
 
Enter a new member: daffytc
 
Member List: tototc, gideontc, daffytc
 
Is this correct (y/n) [y]: y
 
Would you like to define any other network interfaces to daffytc
 for ASE use (y/n)? [n]: n
 
                ASE Network Configuration
 
    Member Name          Interface Name       Monitor
    ___________          ______________       _______
    tototc                 tototc             Yes
    gideontc               gideontc           Yes
    daffytc                daffytc            Yes
 
Is this configuration correct (y|n)? [y]: y

Note

After an ASE member system has been deleted from an ASE, attempts to add it back into the ASE may fail. To resolve this problem, perform one of the following actions before trying to add the affected system back into the ASE:



2.3    Enabling ASE on an Existing Cluster Member (PS)

When you install the TruCluster Production Server software on a member system, the installation procedure asks if you intend to run the available server environment (ASE) on that system. If you answer "no" at that time, and later decide to run ASE on that system, you must perform the following steps to enable ASE:

  1. Select a value for the member system's ASE_ID. The member's ASE_ID must be the same as the ASE_IDs of the other member systems in the ASE it is joining.

  2. Determine whether this system should run the ASE logger.

  3. Enter the following commands, specifying the selected ASE_ID for <var>. Enter the last command only if you want this system to run the logging daemon:

    # rcmgr set ASE on
    # rcmgr set ASE_ID <var>
    # rcmgr set ASELOGGER 1

  4. Rebuild the kernel using the doconfig program. See the DIGITAL UNIX System Administration manual for instructions on running the doconfig program.

  5. Reboot the system.
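The rcmgr commands in step 3 can be generated for review before you enter them. The following is a minimal sketch, assuming a POSIX shell; ase_enable_cmds is a hypothetical helper that only prints the commands named in step 3 and runs nothing.

```shell
# Print the rcmgr commands from step 3 for a given ASE_ID.
# Pass "logger" as the second argument to include the ASELOGGER setting.
# ase_enable_cmds is a hypothetical helper; it prints commands and runs none.
ase_enable_cmds() {
    case "$1" in
        ''|*[!0-9]*)
            echo "ASE_ID must be a number from 0 to 63" >&2
            return 1
            ;;
    esac
    if [ "${#1}" -gt 2 ] || [ "$1" -gt 63 ]; then
        echo "ASE_ID must be a number from 0 to 63" >&2
        return 1
    fi
    echo "rcmgr set ASE on"
    echo "rcmgr set ASE_ID $1"
    if [ "${2:-}" = logger ]; then
        echo "rcmgr set ASELOGGER 1"
    fi
}
```

Review the printed commands, enter them as root, and then continue with the kernel rebuild and reboot in steps 4 and 5.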



2.4    Deleting Member Systems from an ASE

To delete member systems from the available server environment (ASE), choose the "Delete a member" item from the Managing the ASE menu. You then specify the number associated with the member system you want to delete. If a member system is running a service, ASE relocates the service to another member system.

You cannot delete the member system on which you are running the asemgr utility. To delete the last member system in an ASE, you must delete the TruCluster software subsets from that system.



2.5    Displaying the Status of the Member Systems

To display the status of the member systems in an available server environment (ASE), choose the "Display the status of the members" item from the Managing the ASE menu. The status of each member system and the agent daemon running on the system are displayed. See Section 1.2 for a description of the ASE daemons.

Example 2-2 shows a sample member system status display.

Example 2-2:  Displaying Member System Status

                 Member Status
 
Member:             Host Status:     Agent Status:
tototc              UP               RUNNING
daffytc             UP               RUNNING

The director daemon obtains system status from the host status monitor (HSM) daemons running on all the member systems. The following table describes the information for the Host Status field:

Host Status     Description
___________     ___________
UP              The member system is up and can be accessed by the member system that is running the ASE director daemon using the cluster interconnect. The member system can be queried over the cluster interconnect, and can add, delete, start, and stop services.
DOWN            The member system cannot be accessed by the member system that is running the director daemon using any network or shared SCSI bus. The member system does not answer queries over the cluster interconnect or SCSI bus, and cannot start or stop services.
DISCONNECTED    The member system is disconnected from all monitored networks. Any services running on the member system are stopped, and no services can be added, deleted, or started on the member system.
NETPAR          There is a network partition between the member system and the member system running the director daemon, although the member systems can communicate using SCSI bus queries. Services that are currently running on the member system remain running, but the member system cannot start or stop any service until it leaves this state.

The director daemon determines the status of the agent daemons running on the member systems. The following table describes the information in the Agent Status field:

Agent Status    Description
____________    ___________
RUNNING         The ASE agent daemon is running on the member system.
DOWN            The ASE agent daemon is not running on the member system.
INITIALIZING    The ASE agent daemon that is running on the member system is in its initialization phase and will be running soon.
UNKNOWN         The ASE director daemon cannot determine the state of the agent daemon on the member system.
INVALID         The ASE director daemon reports an invalid state for the agent daemon on the member system.
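If you save a status display in the layout of Example 2-2 to a file, a short filter can pick out the members that need attention. The following is a minimal sketch, assuming a POSIX shell with awk and assuming the three-column layout shown in Example 2-2; unhealthy_members is a hypothetical helper, not an asemgr feature.

```shell
# Read a member status listing in the layout of Example 2-2 on standard input
# and print every member whose host status is not UP or whose agent status is
# not RUNNING. Exit status is 0 if all members are healthy, 1 otherwise.
# unhealthy_members is a hypothetical helper, not part of asemgr.
unhealthy_members() {
    awk '
        $1 == "Member:" { in_table = 1; next }    # rows follow the header line
        in_table && NF == 3 {
            if ($2 != "UP" || $3 != "RUNNING") {
                print $1 ": host " $2 ", agent " $3
                bad = 1
            }
        }
        END { exit bad + 0 }
    '
}
```

The filter treats any host status other than UP as a problem because, according to the table above, only a member in the UP state can start and stop services.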



2.6    Initializing ASE Member Systems

If an available server environment (ASE) does not work correctly, make sure that you have adhered to the requirements in this manual and in the TruCluster Software Products Hardware Configuration manual. You should also read the TruCluster Software Products Release Notes. Running the clu_ivp utility may reveal the cause of errors. If you cannot fix the problem, you can initialize one or all of the member systems in an ASE.

Initializing a system stops any running ASE daemons and removes any member system and service information from the ASE database on the system. After you initialize a system, it can be added to an existing ASE or used in a new ASE.

You may want to initialize a system if you cannot add it to an ASE, or if the ASE database is corrupted on the member system. However, you may need to initialize all the ASE member systems to solve the problem.

The following sections describe how to initialize one or all of the ASE systems.



2.6.1    Initializing One System

To initialize one system, follow these steps:

  1. If the system is already a member system, use the asemgr utility to delete the member system from the ASE. If you cannot delete the member system, you cannot initialize this system alone; initialize all the member systems instead, as described in Section 2.6.2.

  2. If the system is not an ASE member system, delete the /usr/var/ase/config/asecdb ASE database file, if it exists, from the system.

  3. Invoke the /usr/sbin/asesetup command on the system.

  4. Run the asemgr utility on an existing member system and add the initialized system to the ASE.



2.6.2    Initializing All the Member Systems

Initializing all the member systems returns the ASE to a state that includes no member systems or services. After you do this, you must add the member systems and set up your services again.

To initialize all the member systems in an ASE, follow these steps:

  1. If possible, use the asemgr utility to display the status of the member systems, network, and services in the ASE. This information will help you to re-create your ASE.

  2. If possible, use the asemgr utility to delete all the services from the ASE. This allows you to save any Logical Storage Manager (LSM) or Advanced File System (AdvFS) disk configurations on a specific system.

  3. Delete the /usr/var/ase/config/asecdb ASE database file from all the systems.

  4. Invoke the /usr/sbin/asesetup command on each system.

  5. Run the asemgr utility on a system, add the other initialized systems to the ASE, one at a time, and set up your services.
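The database deletion in step 3 has to be repeated on every system, so it can help to wrap it in a function that reports what it did. The following is a minimal sketch, assuming a POSIX shell; remove_ase_db is a hypothetical helper, and the optional path argument is an addition for trying the function against a copy.

```shell
# Delete the ASE database file if it exists and report what was done.
# The path defaults to the location named in the steps above; an alternate
# path can be passed as $1, for example to try the function on a copy.
# remove_ase_db is a hypothetical helper, not a TruCluster command.
remove_ase_db() {
    db=${1:-/usr/var/ase/config/asecdb}
    if [ -f "$db" ]; then
        rm -f "$db" && echo "removed $db"
    else
        echo "no database file at $db"
    fi
}
```

After the file is removed on a system, invoke /usr/sbin/asesetup on that system as described in step 4.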



2.7    Stopping and Restarting ASE Activity

To change your available server environment (ASE) hardware configuration or perform maintenance, you may have to stop all activity in the ASE.

To stop all ASE activity, follow these steps:

  1. Use the asemgr utility to place each ASE service off line, stopping the services.

  2. Invoke the /sbin/init.d/asemember stop command on all the member systems.

After you stop ASE activity, you can perform the desired maintenance.

To restart ASE activity, follow these steps:

  1. Invoke the /sbin/init.d/asemember start command on all the member systems.

  2. Use the asemgr utility to place the ASE services on line.
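The order of operations matters: services go off line before the member daemons stop, and the daemons start before services go back on line. The following is a minimal sketch that prints that order as a checklist for a given set of members, assuming a POSIX shell; ase_activity_plan is a hypothetical helper that executes nothing.

```shell
# Print, in order, the actions for stopping or restarting ASE activity on the
# named member systems. Nothing is executed; the output is a checklist.
# ase_activity_plan is a hypothetical helper, not a TruCluster command.
ase_activity_plan() {
    if [ $# -lt 1 ]; then
        echo "usage: ase_activity_plan stop|start member..." >&2
        return 1
    fi
    action=$1
    shift
    case "$action" in
        stop)
            echo "asemgr: place each ASE service off line"
            for host in "$@"; do
                echo "$host: /sbin/init.d/asemember stop"
            done
            ;;
        start)
            for host in "$@"; do
                echo "$host: /sbin/init.d/asemember start"
            done
            echo "asemgr: place the ASE services on line"
            ;;
        *)
            echo "usage: ase_activity_plan stop|start member..." >&2
            return 1
            ;;
    esac
}
```

For example, ase_activity_plan stop tototc daffytc lists the asemgr action first, followed by one asemember stop line for each member.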



2.8    Shutting Down a Cluster Member (PS)

Because each cluster member must maintain a kernel state regarding the clusterwide activities of the connection manager and distributed lock manager (DLM), you cannot shut a cluster member down to single-user mode and then bring it back up to multiuser mode. A full halt or complete reboot is required.

All normal methods of shutting down and rebooting a single system work for a cluster member. That is, the shutdown -h and shutdown -r commands (and the halt and reboot console operations) work normally for systems running TruCluster Production Server Software, with one exception.

On multiprocessing systems that are cluster members, pressing the halt button (or typing Ctrl/P at an AlphaServer 8200/8400 system console) does not cause a full halt of the member. To bring a multiprocessing system in a cluster to a full halt, enter one of the following console commands immediately after pressing the halt button:

