The distributed raw disk (DRD) subsystem of a TruCluster Production Server cluster allows a disk-based, user-level application to run within a cluster, regardless of where in the cluster the physical storage it depends upon is located. A DRD service enables you to provide to applications, such as database and transaction processing (TP) monitor systems, parallel access to storage media from multiple cluster members.
Applications that perform I/O involving sets of large data files, random access to records within these files, and concurrent read/write data sharing can benefit from using the features of DRD. The DRD subsystem driver is a pseudodevice driver that provides an abstraction of the physical storage throughout the cluster.
The available server environment (ASE) manager utility (asemgr) allows you to set up a DRD service within an ASE in a cluster,
and make it highly available and eligible for failover among the cluster
member systems within that ASE.
When creating a DRD service, you specify
the physical media that the service will provide clusterwide.
The
asemgr
utility sets up the service name and the device special
files by which the service is accessed.
After a DRD service has been established
within a given ASE, cluster members both within and outside that ASE can
access the disk storage it provides.
This chapter discusses the following topics:
The key concepts of the DRD subsystem implementation (Section 8.1)
The semantics of DRD device volume names (Section 8.2)
How to add a DRD service to a cluster (Section 8.3)
How to modify an existing DRD service (Section 8.4)
How to remove a DRD service from a cluster (Section 8.5)
How to tune the DRD subsystem (Section 8.6)
How to measure and test the DRD subsystem's performance (Section 8.7)
Other considerations for managing DRD services (Section 8.8)
How to locate and interpret DRD messages and troubleshoot the DRD subsystem (Section 8.9)
The distributed raw disk (DRD) subsystem driver provides character device
driver interfaces, receiving user requests through conventional system calls
such as
open,
close,
read,
write, and
ioctl.
The DRD
subsystem provides only raw disk capabilities: that is, file systems cannot
be mounted on DRD devices.
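For example, assuming a hypothetical DRD device at
/dev/rdrd/drd1, raw reads through the special file behave as they
would on any character device, while an attempt to mount a file system
on the device fails (the output shown here is illustrative):
# dd if=/dev/rdrd/drd1 of=/dev/null bs=8k count=1
1+0 records in
1+0 records out
# mount /dev/rdrd/drd1 /mnt
/dev/rdrd/drd1 on /mnt: Block device required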
Upon receipt of a user-level request, the DRD driver determines which member system is the server of the physical device, with the following results:
If the physical device is being served by the member system that received the user request, it is the server and the request is considered a local request. The local request is passed on to the underlying physical device driver (for instance, the SCSI CAM driver or LSM driver).
If the physical device is not being served by the member system that received the user request, another member system is the server and the request is considered a remote request. That member system could be in the same available server environment (ASE) as the client, or it could be in a different ASE within the cluster. A remote request is sent across a network transport to the server. The server then passes the request to the underlying physical device driver. When the local physical driver completes the request, the server returns the results and status to the client. Finally, the client returns results and status to the calling user-level program.
A distributed raw disk (DRD) device is a device special file that provides clusterwide access to a single underlying physical device (either a SCSI disk or Logical Storage Manager (LSM) volume). There is a one-to-one correspondence between a DRD device and an underlying physical device.
The
asemgr,
drd_mknod, and
drd_ivp
utilities create the DRD device special files that disk-based,
clusterwide application programs open and to which they issue read and write
system calls.
These files reside in the
/dev/rdrd/
directory
and are assigned names such as
drd1,
drd2,
and
drd30.
(The DRD subsystem reserves device number 0
for subsystem control purposes.)
The collection of DRD device special files forms the DRD namespace.
As DRD services are added, the
asemgr
utility assigns
DRD special file names sequentially.
For example, if the file names
/dev/rdrd/drd1,
/dev/rdrd/drd2, and
/dev/rdrd/drd3
are in use, the next new DRD service will be added
as
/dev/rdrd/drd4.
To minimize holes in the DRD namespace,
the
asemgr
utility reuses DRD device numbers for services
that have been deleted.
The DRD namespace must be unique across the cluster.
Therefore, in
cluster configurations that include multiple available server environments
(ASEs), the
/dev/rdrd/drd1
special file must refer to the
same disk regardless of which ASE is serving it.
For this reason, the
asemgr
utility partitions the DRD namespace on a per-ASE basis.
The first ASE is limited to DRD numbers
drd1
through
drd9999; the second ASE is limited to DRD numbers
drd10000
through
drd19999; and so on.
The DRD namespace
implementation accommodates a cluster consisting of 64 separate ASEs.
Each
ASE can have 9999 separate DRD services.
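The partitioning rule amounts to simple arithmetic: an ASE with
0-based identifier n (ASE_IDs are reported by the
drd_ivp
utility) owns DRD numbers n*10000 through n*10000+9999, except that
number 0 is reserved.
The following sketch merely encodes the ranges stated above:
# n=1
# echo "drd`expr $n \* 10000` through drd`expr $n \* 10000 + 9999`"
drd10000 through drd19999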
The
asemgr
utility creates the DRD device special
files corresponding to DRD services within a single ASE.
In cluster configurations
consisting of a single ASE, the cluster administrator need not perform any
explicit tasks to create these special files throughout the cluster.
However,
in cluster configurations consisting of multiple ASEs, the
asemgr
utility cannot create the special files for DRD services provided
by one ASE on cluster members outside the serving ASE.
In these configurations,
the cluster administrator must perform the procedures detailed in
Section 8.3,
Section 8.4, and
Section 8.5
to manage DRD services clusterwide.
Do not use the
mknod
command to create and assign
names to DRD device special files.
Unlike the SCSI device driver, the DRD
device driver dynamically obtains a major number during system startup.
This number can change each time the system is rebooted, and it can vary
from member system to member system.
For example, if you used the
mknod
command to create a DRD device special file, its file handle would
become stale at the next reboot because its major number would have changed.
In some instances, you might obtain inconsistent results when using
mknod
for this purpose, including data corruption.
If you need to adjust DRD special filenames, use the
drd_mknod
command.
If you need to create your own form of the DRD device
special file namespace that better matches your usage model, set up symbolic
links to the actual DRD device special files.
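For example, the following hypothetical commands give an application
a stable, descriptive name for a DRD device (the names
payroll_db
and
drd1
are illustrative):
# ln -s /dev/rdrd/drd1 /dev/rdrd/payroll_db
# ls -l /dev/rdrd/payroll_db
lrwxrwxrwx   1 root     system   14 May 19 07:20 /dev/rdrd/payroll_db -> /dev/rdrd/drd1
Because the link resolves to the official special file by name, it remains valid as long as the DRD subsystem maintains that file.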
As described in Section 8.2, a distributed raw disk (DRD) device is a device special file that provides clusterwide access to a single, underlying physical device (either a SCSI disk or LSM volume).
A DRD service is a highly available service created and managed by means
of the
asemgr
utility.
A DRD service consists of one
or more underlying DRD devices (each of which represents a single SCSI disk
or LSM volume).
As a result, there is not necessarily a one-to-one correspondence
between a DRD service and an underlying physical device.
A cluster administrator may group various physical devices into a single DRD service for functional or organizational reasons. For example, an administrator may create a DRD service consisting of all physical devices within a single storage enclosure to force them to be served by a single member system within the available server environment (ASE).
For systems with large numbers of physical devices, it is often useful to group devices into DRD services to limit the number of services that need to be maintained. Note that when a DRD service consists of multiple underlying physical devices, all devices are relocated and failed over as a group. As a result, any DRD service is available only if all of its underlying physical devices are operational; failure of any underlying physical device disables a DRD service.
Note
When a DRD service consists of multiple underlying physical devices, the devices will not be considered for relocation by the
drd_balance
utility.
See Section 8.6.1 for a discussion of DRD service placement and a description of the
drd_balance
utility.
Similarly, for DRD services that use Logical Storage Manager (LSM) volumes, the unit of failover and relocation is the LSM disk group. All disks within the disk group are relocated as one set. Because of this, you cannot assign separate LSM volumes within the same disk group to separate DRD services. However, you can have a single DRD service that consists of multiple LSM volumes within the same disk group, and you can employ multiple disk groups in a single DRD service.
When you add a DRD service, the
asemgr
utility prompts
you for the following information:
DRD service name--The name can be up to 64 characters
long.
The
asemgr
utility uses a single service name to
identify and manage a DRD service, regardless of how many underlying devices
participate in it.
Physical disk(s) or LSM volume(s) to be used--You can
associate a single physical disk with no more than one DRD service.
Although
you can specify multiple, nonoverlapping disk partitions in the device special
file names you supply when creating a single DRD service, you cannot assign
multiple partitions of the same physical disk to multiple DRD services.
The
asemgr
utility assigns a discrete DRD device special file to each
underlying physical device participating in the service.
The more disks that are configured into a single DRD service, the longer it takes for the service to relocate. For this reason, DIGITAL recommends that no more than 50 DRD disks participate in a single service.
An application accesses DRD devices by referring to the DRD device special files, and does not use the DRD service name. If you specify an LSM volume for the underlying physical device, the utility asks you to confirm the list of physical devices that comprise the associated disk group. This question is meant to remind you of the underlying storage configuration.
Automatic Service Placement (ASP) policy to use for the service--Note that, when a DRD service consists of multiple underlying physical devices, all devices are relocated and failed over as a group.
After you enter the required information, the
asemgr
utility displays the physical devices that you selected, the DRD
service name, and the device special files that will be used to provide the
service.
Note that the DRD service name and the DRD device special file names
are defined separately.
The DRD service name is assigned by the system administrator,
and the DRD device special file names are automatically assigned by the DRD
subsystem.
There is no direct correlation between the two names.
For example,
the administrator may select the name
drd2
as the DRD
service name.
However, if this is the first DRD service in the cluster, it
is likely that the DRD subsystem will automatically assign a device special
file name of
drd1
to the service.
It is also possible
to have a single DRD service name that contains multiple disk devices.
In
this case, there is not a one-to-one mapping between a service name and a
device special file name.
After it creates a unique DRD device special file for each underlying
physical device or LSM volume participating in the new service (for example,
drd20004), the
asemgr
utility ensures that member
systems within the same ASE as the DRD server know about the service by creating
the appropriate DRD device special file (for instance,
/dev/rdrd/drd20004) on all member systems within the same ASE as the server.
In other
words, the device special file is created on all member systems with the
same ASE_ID as the server.
On each member system that is outside of the available server environment
(ASE) providing the DRD service,
you must execute the
drd_mknod
command to
create the device special files.
As a last step in DRD service configuration,
the
asemgr
utility tells you the command you must execute
on each member system that is not in the server's ASE.
For example:
NOTE: In order to access this DRD service from cluster members outside
of this ASE execute the following on each node which is
not a member of this ASE:
drd_mknod -f drd20004
To configure a large cluster that consists of multiple ASEs with large
numbers of DRD devices, first create within each ASE the DRD services you
intend that ASE to provide.
Next, use the
drd_ivp
utility on each cluster member.
The
drd_ivp -r
command automates the creation of DRD
device special files and provides a faster and less error-prone mechanism
to configure large clusters than the
drd_mknod
utility.
The
drd_ivp -r
command compiles a list of member systems
within the cluster, polls each ASE for its DRD service configuration, and
invokes the
drd_mknod
utility, as needed, to create all
required DRD device special files on the member system from which it is executed.
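For example, using the member names that appear in the examples in
this chapter (reaching each member with
rsh
is an assumption; you can equally log in to each member and enter the
command directly):
# for member in mcclu11 mcclu12 mcclu3 mcclu4
> do
>     rsh $member drd_ivp -r
> done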
Example 8-1 shows how to add a DRD service.
# asemgr
.
.
.
Adding a service

Select the type of service:

 1) NFS service
 2) Disk service
 3) User-defined service
 4) DRD service
 5) Tape service

 q) Quit without adding a service
 x) Exit
 ?) Help

Enter your choice [1]: 4

You are now adding a new DRD disk service to your ASE. A DRD disk
service is comprised of any number of DRDs which can be created from a
single raw disk partition or LSM volume which will be accessible from
all members in the cluster.

Note: If using a raw disk partition please be sure that the character
device special file exists on all members which are in this ASE.

DRD Service Name

The name of a DRD disk service must be a unique service name.

Enter the DRD disk service name: drd_svc_1

You will now be prompted to enter a list of devices comprising the DRD
service, select q when you have completed the list.

Enter an existing character device special file for one of the following:
        a physical device (ie /dev/rrz1c)
        a LSM volume (ie /dev/rvol/dg/vol01)
To end the list, press the Return key at the prompt.

Enter character device special file: /dev/rrz9h

Enter an existing character device special file for one of the following:
        a physical device (ie /dev/rrz1c)
        a LSM volume (ie /dev/rvol/dg/vol01)
To end the list, press the Return key at the prompt.

Enter character device special file:

DRD Device Special File: /dev/rdrd/drd10011
Underlying Storage:      /dev/rrz9h

NOTE: In order to access the DRD devices in this service from cluster
      members outside of this ASE execute the following on each cluster
      node which is not a member of this ASE:

      drd_mknod -f drd10011

Selecting an Automatic Service Placement (ASP) Policy

Select the policy you want ASE to use when choosing a member to run
this service:

 b) Balanced Service Distribution
 f) Favor Members
 r) Restrict to Favored Members

 x) Exit to Service Configuration
 ?) Help

Enter your choice [b]:

Selecting an Automatic Service Placement (ASP) Policy

Do you want ASE to consider relocating this service to another member
if one becomes available while this service is running (y/n/?): y

Enter 'y' to add Service 'drd_svc_1' (y/n): y

Adding service...
Starting service...
Saving the updated database...

Service drd_svc_1 successfully added...
Example 8-1
shows how to create a DRD service named
drd_svc_1
in the local ASE.
I/O requests to the DRD device in
this service use the raw disk interface for the physical device
rrz9h.
The DRD special file used to access the DRD device is
/dev/rdrd/drd10011.
To allow clusterwide access to the service,
you must execute one of the following commands on each member system outside
of the ASE in which the DRD service's server resides:
# drd_mknod -f drd10011
# drd_ivp -r
To modify the properties of a distributed raw disk (DRD) service, use
the
asemgr
utility.
The utility's
modify
option allows you to change the Automatic Service Placement (ASP) policy
as well as the service configuration description.
Changes to the service
configuration description include the following:
Adding, modifying, and deleting physical disks used by services
Adding, modifying, and deleting LSM volumes used by DRD services. See Section 10.6.3.3 for more information.
Changing the service name
The following behavior is specific to modifying DRD services:
The physical disks specified in DRD services can be disk partitions.
If you are modifying a DRD service to add an additional disk partition,
you must run the
asemgr
utility on the member system that
provides the DRD service.
This allows the necessary partition overlap checks
to be performed.
When a DRD service is modified, the file permissions, ownership, and group of all of its DRD device special files are reset to the default settings. If you have modified these attributes of the device special file, you must reset them on all member systems in the ASE.
Example 8-2 shows how to modify the configuration information for a DRD service without interrupting the service's availability.
Service Configuration
a) Add a new service
m) Modify a service
o) Modify a service without interrupting its availability
d) Delete a service
s) Display the status of a service
x) Exit to Managing ASE Services ?) Help
Enter your choice [x]: o
Online Service Modification
Select the service you want to modify:
1) ase1 on fgreg1
2) greg on fgreg2
3) drd1 on fgreg1
x) Exit to Service Configuration
Enter your choice [x]: 3
Select what you want to modify in service `drd1`:
g) General service information
a) Automatic service placement (ASP) policy
x) Exit without modifications
Enter your choice [x]: g
The following lists the current configuration for DRD service "drd1"
Enter the option you wish to modify
) /dev/rdrd/drd1 -> /dev/rrz21b
a) Add LSM volume or physical disk
d) Delete LSM volume or physical disk
) Service name -> drd1
q) Quit without making any changes
x) Exit (done with modifications)
Enter your choice [x]: [Return]
Enter 'y' to modify service 'drd1' (y/n): y
Stopping old service information...
Deleting old service information...
Adding new service information...
Starting new service information...
Service successfully updated.
Storage configuration for DRD service `drd1`
DRD Device Special File: /dev/rdrd/drd1
Underlying Storage: /dev/rrz21b
NOTE: In order to access the DRD devices in this service from cluster
members outside of this ASE execute the following on each cluster
node which is not a member of this ASE:
drd_mknod -f drd1
Use the
asemgr
utility to delete a distributed raw
disk (DRD) service in the same manner as you would delete any other available
server environment (ASE) service.
See
Section 10.7
for instructions.
When you delete a DRD service from an ASE, the
asemgr
utility deletes the corresponding device special file on all member systems
within the same ASE as the DRD service's server.
To delete the service from
member systems in other ASEs in the cluster, you must manually execute the
drd_mknod
command on each member system.
As a last step in DRD service deletion, the
asemgr
utility provides you with the command you must execute on
each member system that is not in the same ASE.
For example:
NOTE: In order to remove the device special file associated with this
service on cluster nodes which are not a member of this ASE,
execute the following on each node which is not a member of this ASE:
drd_mknod -d -f drd1
If you do not run the
drd_mknod
-d
command, stale DRD device special files remain on member systems outside
the serving ASE.
If an application attempts to access the stale DRD file,
it finds no server for the request, and the request times out with an error.
The
drd_ivp -r
command does not delete device special
files that correspond to DRD services that have been deleted.
A cluster
administrator must use the
drd_mknod
utility to remove
the device special files.
You can tune the performance of the distributed raw disk (DRD) subsystem
by setting any of a number of DRD-related parameters in the
/etc/sysconfigtab
file.
The default settings of these parameters should be sufficient
for most applications.
See
drd(7)
for a list of the parameters and additional
information.
To maximize the performance of DRD devices, set up a DRD service on the member system that will be initiating the majority of the I/O operations that utilize the service. Local requests for a DRD service are inherently faster than remote requests for the same service, because they bypass the remote communication codepath and access the DRD device directly.
If you can identify the member system that issues the most I/O requests
to a DRD device, use the
asemgr
utility to identify that
member system as the favored member participating in the DRD service's Automatic
Service Placement (ASP) policy.
When a given DRD service has many clients,
it is difficult to identify a single major client, especially as the service's
access patterns vary over time.
For such DRD devices, specifying a favored
member may not yield maximum performance.
In these cases, use the
drd_balance
utility for help in relocating the
DRD service.
The
drd_balance
utility periodically polls
for I/O usage patterns for DRD devices, can make recommendations for optimally
relocating DRD services, and can optionally attempt the recommended relocations
itself.
Using the
drd_balance
utility on a given DRD service
differs from selecting the available server environment (ASE) balanced service
ASP.
When the ASE balanced service ASP is selected, ASE tries to evenly
distribute the number of services across member systems, as services are
started.
It does not take into account the actual system resources required
to provide a service, and it cannot relocate a service that has already started.
By contrast, the
drd_balance
utility will relocate a
DRD service that has already started, based on its I/O access pattern.
Note
When a DRD service consists of multiple underlying physical devices, the devices will not be considered for relocation by the
drd_balance
utility.
The actual observed performance benefits of using the
drd_balance
utility vary based on your operating environment.
In cases where
the DRD device access varies considerably over short periods of time, the
benefits may be small.
However, if access patterns are relatively constant,
there could be considerable benefit in environments encountering system constraints.
A suggested approach is to benchmark your application with and without
running the
drd_balance
utility.
You could then use the
drd_balance
utility once to obtain information about the DRD device
usage patterns, and then use that information in designating favored members
as the ASE servers.
See
drd_balance(8)
for additional information.
You must not run the
drd_balance
utility while a
Logical Storage Manager (LSM) volume is in the middle of a
volsave
or
volrestore
operation.
The volume save and
restore operations can interfere with the cluster's ability to properly relocate
DRD services.
When invoked with the
-t
flag, the
drd_ivp
utility collects and analyzes selected DRD performance statistics,
which can reveal the cause of performance bottlenecks.
It identifies potential
performance problems and suggests a resolution for each.
To obtain an optimal
analysis of the DRD subsystem, run the
drd_ivp
utility
with the
-t
flag while the cluster is under peak
load.
See
drd_ivp(8)
for additional details.
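For example, during a period of peak cluster load, enter:
# drd_ivp -t
The specific statistics and suggestions in the output depend on your configuration and workload.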
To change the default values of DRD attributes, specify entries in the
/etc/sysconfigtab
file.
After specifying new values for these
attributes, you must reboot the system for them to take effect.
See
drd(7)
for a list of tunable DRD attributes and instructions for viewing their values
and modifying them.
Peer-to-peer DMA (direct memory access) is a performance enhancement that can be used on a DRD server machine that meets certain configuration restrictions. When peer-to-peer DMA is enabled on a DRD server, the data read from a disk is sent directly from the host storage controller on the peripheral component interconnect (PCI) bus to the MEMORY CHANNEL controller on the same PCI bus without transferring via main memory on the server machine. The CPU load on the server machine is thus diminished. (Peer-to-peer DMA may be used when reading a remote disk, but never when writing to it due to lack of hardware support in the MEMORY CHANNEL.)
In order for peer-to-peer DMA to be enabled on a DRD server, the host
storage controller and the MEMORY CHANNEL controller must be on the same PCI bus.
Peer-to-peer DMA is a global attribute for DRD; that is, all host storage
(for example, SCSI) and MEMORY CHANNEL controllers used by DRD must be on the
same PCI bus.
The
drd_dma
utility runs at boot time before
the TruCluster software starts.
This utility analyzes the hardware configuration
and automatically enables peer-to-peer DMA, if the configuration restriction
is met.
In some cases, the
drd_dma
utility does not enable
peer-to-peer DMA when the hardware configuration would actually support it.
You can manually enable peer-to-peer DMA by setting the value of the
drd-bss-rm-peer2peer
parameter in the
/etc/sysconfigtab
file.
Make sure no DRD disks are active before making the modification
to the
/etc/sysconfigtab
file; otherwise, the system
may panic.
See
drd_dma(7)
for more information.
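The following sketch shows the general form such an entry might take;
the
drd
stanza name and the value 1 (enable) are assumptions based on
sysconfigtab conventions, so verify the exact syntax in
drd_dma(7)
before editing the file:
drd:
        drd-bss-rm-peer2peer = 1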
You
can use the
diskx
utility, provided in the optional System
Exercisers subset of the DIGITAL UNIX operating system, to test and measure
the performance of distributed raw disk (DRD) devices.
The
diskx
utility performs testing in the following functional areas, depending
on the flags that are specified in its command line:
Read testing
Write testing
Seek testing
Performance analysis
disktab
entry verification
Other flags determine how the selected tests are run and specify test parameters.
You need root privilege to run the
diskx
utility.
For a complete description of the tests it performs and a full list of the
flags it accepts, enter the following command:
# /usr/field/diskx -h
The following example invokes the
diskx
utility to
perform write testing on the
/dev/rdrd/drd3
DRD device:
# cd /dev/rdrd
# /usr/field/diskx -f drd3 -w -X -x -max_xfer 8k -num_blocks 10000 -debug 1

DISKX - DEC OSF/1 Disk Exerciser.
Testing disk device drd3.
Program output level is 1.

Wed Mar 12 10:09:40 1997

-------------------------------------------------------------------------
                         Write Transfer Testing

This test verifies that writes will succeed.  The data is first written
to disk.  After all writes have completed the data will be read back
for validation.

Since this test writes to the disk there is potential for file system
corruption if a file system exists on the disk that is being tested.

Writes will be done using random size transfers.  The write size will
be randomly selected from the range 512 to 8192 bytes.

Writes will be issued to random locations on the disk.  To accomplish
this a seek will be issued before each write to force a write of a
different disk region.

Testing will continue until an interrupt signal is received.

Sequentially write to partition A.
Sequentially read verify partition A.
The initial write and read verification has succeeded.
Perform random writes to partition A
Random writes completed without error.
Perform random reads to partition A
Random reads completed without error.
Stopping testing due to receipt of a termination signal.
Disk Transfer Statistics
Part  Seeks  Seek_Er  Writes  Writ_Er  MB_Write  Reads  Read_Er  MB_Read  Data_Er
      29850  0        30000   0        39.1      29833  0        38.4     0
-------------------------------------------------------------------------
Wed Mar 12 10:18:47 1997
Terminating disk exerciser.
The
diskx
utility is useful for measuring observed
DRD performance, but it does not produce the maximum possible read and write
throughput.
To achieve maximum read and write throughput, an application
should use asynchronous I/O operations instead of synchronous
read
and
write
system calls.
For a complete description
of asynchronous I/O, see
aio_read(3),
aio_write(3), and the DIGITAL UNIX
Guide to Realtime Programming.
You may need to take into account the following operational concerns when preparing the distributed raw disk (DRD) services in a cluster:
A device contributing to a DRD service cannot contribute to any other available server environment (ASE) service. For example, you cannot specify a DRD device as part of another DRD service, disk service, tape service, or Network File System (NFS) service.
Like the underlying device drivers on which it is layered (such as the SCSI CAM driver and the Logical Storage Manager (LSM) driver), the DRD subsystem itself does not guarantee to service requests in any given order. For example, if two cluster applications (executing either on the same member system or on separate member systems) issue writes to the same disk block at the same time, the DRD subsystem may complete the writes in any order, possibly with unintended results. (A hypothetical illustration of this hazard appears after this list.)
If an application must share access to a given set of disk blocks with another application, and the ordering of their I/O requests is important, both applications should use distributed lock manager (DLM) services to lock and synchronize their access to the disk. DLM services are discussed in the TruCluster Production Server Software Application Programming Interfaces manual.
After creating a DRD service, all I/O operations to the device
should be performed using the DRD device special file name (for example,
/dev/rdrd/drd1).
Do not access the device by means of its underlying
physical device name (for example,
/dev/rrz17c).
To protect
against data corruption resulting from unsynchronized simultaneous access
to a device, ensure that cooperating applications use DLM services to coordinate
access to DRD device special files.
The underlying physical device used in a DRD service must not be enabled for use by the Prestoserve I/O acceleration hardware. Prestoserve hardware consists of a local disk cache on the system providing the disk service. Other cluster members cannot directly access this cache. If the cluster member providing a DRD service fails, the service cannot be relocated to another member system without risk of data corruption because that system cannot access the cache contents.
The underlying physical device used in a DRD service must not be attached to a disk controller that uses a volatile writeback cache. Use of a volatile writeback cache optimizes performance at the expense of fault tolerance. Certain failures, such as a power loss, will cause data to be lost, inasmuch as it had not been preserved in nonvolatile storage on the disk device.
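The following hypothetical commands illustrate the write-ordering
hazard noted in the preceding list (the device and input file names
are placeholders, and the commands destroy whatever data is in the
target block):
# dd if=fileA of=/dev/rdrd/drd1 bs=512 count=1 &
# dd if=fileB of=/dev/rdrd/drd1 bs=512 count=1 &
# wait
After both commands complete, the block contains the data from whichever write the DRD subsystem completed last; DLM locking is the supported way to make the outcome deterministic.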
This section provides information to assist you in troubleshooting the distributed raw disk (DRD) subsystem of a cluster.
TruCluster software extends the
kdbx
debugger to
allow it to display the contents of the DRD map table.
You can use this
feature on both crash dumps and the running system.
(Note, though, that
the
asemgr
utility is the preferred means of obtaining
status information on DRD services.)
Note
The
kdbx
debugger is included in the optional base system subset titled "Kernel Debugging Tools".
You can use the DRD extension to the
kdbx
debugger only if you have previously installed that subset.
The
drd
extension to the
kdbx
debugger is defined as follows:
drd
[flags] [number]
Displays the DRD map table.
Valid
flags
for the
drd
extension are as follows:
-full
Displays all map entries in long form.
-terse
Displays all map entries in brief form.
This is the default.
The number argument causes the DRD map for only the specified DRD number to be displayed.
The following example shows the short form display of a DRD map listing:
(kdbx) pr /var/ase/sbin/drd
Name                Server    Local Name            Struct address
/dev/rdrd/drd1      rclu4     /dev/rvol/dg1/vol01   0xfffffc000ff54c80
/dev/rdrd/drd10001  rclu12    /dev/rrz9h            0xfffffc0005202000
/dev/rdrd/drd2      rclu4     /dev/rvol/dg1/vol02   0xfffffc0005202640
/dev/rdrd/drd10002  rclu12    /dev/rrz10h           0xfffffc000ff552c0
The following example shows the short form display of a map listing for a specific DRD number:
(kdbx) pr /var/ase/sbin/drd 7
/dev/rdrd/drd7      rclu3     /dev/rvol/dg4/vol01   0xfffffc000d5aec80
The following example shows the long form display of a map listing for a specific DRD number:
(kdbx) pr /var/ase/sbin/drd 7 -full
-----------------------------------------------------
Name:                   /dev/rdrd/drd7
Minor Number:           7
Structure address:      0xfffffc000d5aec80
drd_local_devt:         0x0, (0, 0)
drd_local_bdev:         0x4200007, (66, 7)
Local Device Name:      /dev/rvol/dg4/vol01
Server Hostname:        rclu3
State Flags:            0x2
Management State Flags: 0x501
Ref Count:              0
Drain Pending:          0
Delete Count:           0
MC Node Number:         4
Maxphys:                65536
Spare:                  0
A DRD service can fail to work properly for any of the following reasons:
The DRD subsystem was not properly installed or configured. See the TruCluster Software Products Software Installation manual for a discussion about how you can verify correct installation and configuration.
A failure has occurred in another cluster subsystem.
A failure has occurred in the DRD subsystem.
The underlying physical disk (or Logical Storage Manager (LSM) volume) is nonoperational.
Because DRD errors are often a symptom of problems within other cluster subsystems, look for related problems, as follows:
Check the console messages and error log files on each cluster
member.
Log files are typically found in the
/var/adm/syslog.dated
directory.
Run the cluster installation verification procedure (clu_ivp) to see if it points out any abnormalities.
See
clu_ivp(8)
for more information on the
clu_ivp
utility.
To verify that the various DRD system components are operational, use the following command:
# drd_ivp -p -v -c
Cluster Configuration Information
Hostname ASE_ID BSSD BSSD DRD Lic
Reg Resp Conf Reg
----------------------------------------------------------
mcclu11 0 Yes Yes Yes Yes
mcclu12 0 Yes Yes Yes Yes
mcclu3 1 Yes Yes Yes Yes
mcclu4 1 Yes Yes Yes Yes
DRD configuration validation tests succeeded.
ASE_ID validation tests succeeded.
For more information, see
drd_ivp(8).
Use the
asemgr
utility to query the status of a DRD
service.
Select the "Display the status of a service" option from the Managing
ASE Services menu.
The following example shows the status display for a
DRD service named
drd_svc_4:
Status for DRD service `drd_svc_4`
Status: Relocate: Placement Policy: Favored Member(s):
on mcclu12 yes Balanced_Services None
Storage configuration for DRD service `drd_svc_4`
DRD Device Special File: /dev/rdrd/drd4
Underlying Storage: /dev/rrz13g
NOTE: In order to access the DRD devices in this service from cluster
members outside of this ASE execute the following on each cluster
node which is not a member of this ASE:
drd_mknod -f drd4
Of particular importance in the
asemgr
utility's
output is the
Status
field.
This field indicates that
member system
mcclu12
is the server.
If the
Status
field indicates that the service is off line or unassigned,
use the
asemgr
utility to try to bring the service on
line.
Keep in mind that the status of a service can change from one moment
to the next, given the load balancing and failover that can occur within
an available server environment (ASE).
For example, DRD service
drd4
may be served by
mcclu12
at the time of
the status request, but it may be relocated to
mcclu11
at the very next instant.
If a specific DRD service is not working properly, follow these steps to identify the problem:
Run the
clu_ivp
utility.
This utility checks
a wide range of cluster functions.
See Section 2.6 for more information.
Verify network connectivity.
Verify on each member system
that you can ping the other member systems over the MEMORY CHANNEL interface
(mc0).
For example:
# ping mcclu11
PING mcclu11.sun.ra.com (4.0.0.11): 56 data bytes
64 bytes from 4.0.0.11: icmp_seq=0 ttl=255 time=0 ms

----mcclu11.sun.ra.com PING Statistics----
1 packets transmitted, 1 packets received, 0% packet loss
round-trip (ms)  min/avg/max = 0/0/0 ms
Verify that the physical devices participating in the service
are working.
For example, in
Section 8.9.2.3, the
asemgr
utility showed the following disk as participating in DRD
service
drd4:
Underlying Storage: /dev/rrz13g
Use the
file
command on the member system that
is serving the DRD service to verify that the disk device is at least minimally
operational:
# file /dev/rrz13g
/dev/rrz13g: character special (8/21510) SCSI #1 RZ28 disk #104
(SCSI ID #5)
This output shows that the device can identify itself as a SCSI disk of type
RZ28.
It is likely that the disk itself is operational.
If the device were
not present, or if a
SCSI device
reservation is being
held by another member of the ASE, the
file
command would
display output like the following:
/dev/rrz30a: character special (8/55296)
Because this output does not show the disk type, the disk may not be configured. Check the system startup messages to see if the disk was identified along with the other disks.
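For example, you might search the dated kernel logs for the device
name; the log location follows the
/var/adm/syslog.dated
layout mentioned earlier, but the exact file name is an assumption:
# grep rz13 /var/adm/syslog.dated/*/kern.log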
Verify DRD device special file access.
Use the
file
command on the member system that is serving the DRD service to
verify that the DRD device special file name is properly recognized:
# file /dev/rdrd/drd4
/dev/rdrd/drd4: character special (65/4) SCSI #1 RZ28 disk #104
(SCSI ID #5)
If the underlying physical device is an LSM volume, the device identification differs from the SCSI identification. For example:
# file /dev/rdrd/drd2
/dev/rdrd/drd2: character special (66/2) special_device #255
In
the following example, the
file
command fails to identify
the DRD service:
# file /dev/rdrd/drd99
/dev/rdrd/drd99: character special (63/99)
In this case, it may take
a long time before the
file
command completes.
This type
of output indicates that a user-level program has attempted to access a DRD
service that has no server.
The DRD subsystem attempts to determine which
member is the nonexistent server, but it eventually gives up.
This situation
usually indicates the existence of stale files in the
/dev/rdrd
directory.
You may also see the following error from the
file
command:
# file /dev/rdrd/drd99
file: Cannot get file status on /dev/rdrd/drd99.
/dev/rdrd/drd99: cannot open for reading
This message indicates that
the device special file does not exist.
For DRD services within the same
ASE, the
asemgr
utility creates the appropriate device
special files.
If your cluster is composed of multiple ASEs, you must explicitly
create the device special files on those member systems that are in other
ASEs.
To do this, enter a
drd_mknod
or
drd_ivp -r
command on these member systems.
After you verify access to the DRD devices in the DRD service
from the server, enter the
file
command on the member
systems to verify that they, too, have access to the DRD devices.
The
asemgr
utility creates DRD device special files
with the permissions, ownership, and group as shown in the following example:
# ls -l /dev/rdrd/drd2
crw-r--r--   1 root     system    66,  2 May 19 07:17 /dev/rdrd/drd2
You may use the
chmod,
chown,
or
chgrp
command to modify the permissions of the DRD
device special files to make them more accessible.
Note that if you change
the permissions of the DRD device special files on one member system, these
changes are not automatically propagated to other member systems.
To allow
highly available applications to fail over to other member systems and keep
the same access permissions, you must modify the permissions in the same
way on the device special files on each member system.
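For example, to apply the same permissions on every member system, you
might repeat the change on each one (the member names and mode are
illustrative, and
rsh
access is assumed):
# rsh mcclu11 chmod 664 /dev/rdrd/drd2
# rsh mcclu12 chmod 664 /dev/rdrd/drd2
# rsh mcclu3 chmod 664 /dev/rdrd/drd2
# rsh mcclu4 chmod 664 /dev/rdrd/drd2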
When a DRD service configuration is modified using
asemgr,
the DRD device special file permissions are reset to the default value.
After successfully completing the steps in the previous sections, try
to read from the DRD disk.
Enter
the
dd
command first on the server and then on each other
member system.
The following example issues 20 8-KB read requests:
# dd if=/dev/rdrd/drd4 of=/dev/null bs=8k count=20
20+0 records in
20+0 records out
In this example, the 20 read operations succeeded.
In the following example, the
dd
command attempts
to read from a DRD device for which there is no server.
This command takes
a long time to return with the error message, because the DRD retries the
command in order to determine which member system is the server until it
times out.
An error such as this may occur if the DRD device special file
(for example,
/dev/rdrd/drd99) corresponds to a DRD service
that is provided by another ASE that is not up, or a service that has been
deleted from another ASE.
# dd if=/dev/rdrd/drd99 of=/dev/null bs=8k count=20
/dev/rdrd/drd99: No such device
In general, performing read requests is enough to verify correct DRD operation.
Caution
Although reading from a DRD disk is a rather harmless operation, writing to a disk can be destructive, unless you exercise the appropriate caution. Before you attempt to write to the disk, ensure that you will not be writing over valid data.
The following example performs 20 8-KB write requests:
# dd if=/dev/zero of=/dev/rdrd/drd4 bs=8k count=20
20+0 records in
20+0 records out
In this example, the 20 write operations succeeded.
A write to a DRD device may fail if there is a disk label on the underlying physical device. For example:
# dd if=/dev/zero of=/dev/rdrd/drd4 bs=8k count=20
dd write error: Read-only file system
8+0 records in
0+0 records out
In this example, none of the write operations succeeded.
The
Read-only file system
error message indicates
that the cause of this problem was an attempt to write to the first block
of a disk with a disk label.
To write to block 0, you must delete the disk
label by first placing the DRD service off line, and then zeroing the label,
as follows:
# disklabel -z /dev/rrz13c
Note that this is not a DRD-specific behavior.
The same error would
have occurred had you specified
/dev/rrz13c
in the
dd
command line.
The
diskx
utility is useful for performing more comprehensive
read/write data validation testing.
See the description of using
diskx
with DRD in
Section 8.7.