High Availability Cluster Multi-Processing for AIX Administration guide


  • Note: Before using this information and the product it supports, read the information in “Notices” on page 481.

    This edition applies to HACMP 6.1 for AIX and to all subsequent releases and modifications until otherwise indicated in new editions.

    © Copyright IBM Corporation 2004, 2015. US Government Users Restricted Rights – Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.

  • Contents

    About this document
        Highlighting
        Case-sensitivity in AIX
        ISO 9000
        HACMP publications
        HACMP/XD publications
        HACMP Smart Assist publications

    Administration guide
        What's new in Administering HACMP
        Administering an HACMP cluster
            Options for configuring an HACMP cluster
            Configuration tasks
            Maintaining an HACMP cluster
            Monitoring the cluster
            AIX files modified by HACMP
            Changing script behavior for fatal errors in HACMP
        Managing HACMP using WebSMIT
            Working with WebSMIT
            Managing multiple clusters with WebSMIT
            Using the Enterprise view
            Configuring HACMP using WebSMIT
            Viewing the cluster components
            Viewing cluster configuration information in WebSMIT
            Viewing HACMP documentation in WebSMIT
            Customizing WebSMIT colors
            Enabling Internationalization in WebSMIT
        Configuring an HACMP cluster (standard)
            Overview of configuring a cluster
            Configuring a two-node cluster or using Smart Assists
            Defining HACMP cluster topology (standard)
            Configuring HACMP resources (standard)
            Configuring HACMP resource groups (standard)
            Configuring resources in resource groups (standard)
            Verifying and synchronizing the standard configuration
            Viewing the HACMP configuration
        Configuring HACMP cluster topology and resources (extended)
            Understanding the Extended Configuration options
            Configuring an HACMP cluster using the Extended SMIT menu
            Discovering HACMP-related information
            Configuring cluster topology (extended)
            Configuring HACMP resources (extended)
        Configuring HACMP resource groups (extended)
            Configuring resource groups
            Configuring resource group runtime policies
            Configuring dependencies between resource groups
            Adding resources and attributes to resource groups using the extended path
            Customizing inter-site resource group recovery
            Reliable NFS function
            Forcing a varyon of volume groups
            Running a resource group in an AIX WPAR
            Testing your configuration
        Configuring cluster events
            Considerations for pre- and post-event scripts
            Configuring pre- and post-event commands
            Configuring pre-event and post-event processing
            Tuning event duration time until warning
            Configuring a custom remote notification method
        Verifying and synchronizing an HACMP cluster
            Running cluster verification
            Automatic verification and synchronization
            Verifying the HACMP configuration using SMIT
            Inactive components report
            Managing HACMP file collections
            Adding a custom verification method
            List of reserved words
        Testing an HACMP cluster
            Overview for testing a cluster
            Running automated tests
            Understanding automated testing
            Setting up custom cluster testing
            Description of tests
            Running custom test procedures
            Evaluating results
            Recovering the control node after cluster manager stops
            Error logging
            Fixing problems when running cluster tests
        Starting and stopping cluster services
            Starting cluster services
            Stopping cluster services
            Maintaining cluster information services
        Monitoring an HACMP cluster
            Periodically monitoring an HACMP cluster
            Monitoring clusters with Tivoli distributed monitoring
            Monitoring clusters with clstat
            Monitoring applications
            Displaying an application-centric cluster view
            Measuring Application Availability
            Using resource groups information commands
            Using HACMP topology information commands
            Monitoring cluster services
            HACMP log files
        Managing shared LVM components
            Shared LVM: Overview
            Understanding C-SPOC
            Maintaining shared volume groups
            Maintaining logical volumes
            Maintaining shared file systems
            Maintaining physical volumes
            Configuring cross-site LVM mirroring
        Managing shared LVM components in a concurrent access environment
            Understanding concurrent access and HACMP scripts
            Maintaining concurrent volume groups with C-SPOC
            Maintaining concurrent access volume groups
        Managing the cluster topology
            Reconfiguring a cluster dynamically
            Viewing the cluster topology
            Managing communication interfaces in HACMP
            Changing a cluster name
            Changing the configuration of cluster nodes
            Changing the configuration of an HACMP network
            Changing the configuration of communication interfaces
            Managing persistent node IP labels
            Changing the configuration of a global network
            Changing the configuration of a network module
            Changing the configuration of a site
            Removing a site definition
            Synchronizing the cluster configuration
            Dynamic reconfiguration issues and synchronization
        Managing the cluster resources
            Reconfiguring a cluster dynamically
            Requirements before reconfiguring
            Dynamic cluster resource changes
            Reconfiguring application servers
            Changing or removing application monitors
            Reconfiguring service IP labels as resources in resource groups
            Reconfiguring communication links
            Reconfiguring tape drive resources
            Using NFS with HACMP
            Reconfiguring resources in clusters with dependent resource groups
            Synchronizing cluster resources
        Managing resource groups in a cluster
            Changing a resource groups
            Resource group migration
        Managing users and groups
            Overview for AIX users and groups
            Managing user accounts across a cluster
            Managing password changes for users
            Changing the password for your own user account
            Managing group accounts
        Managing cluster security
            Configuring cluster security
            Standard security mode
            Setting up Cluster Communications over a VPN
            Configuring message authentication and encryption
        Saving and restoring cluster configurations
            Relationship between the OLPW cluster definition file and a cluster snapshot
            Information saved in a cluster snapshot
            Format of a cluster snapshot
            clconvert_snapshot utility
            Defining a custom snapshot method
            Changing or removing a custom snapshot method
            Creating a snapshot of the cluster configuration
            Restoring the cluster configuration from a snapshot
            Changing a snapshot of the cluster configuration
            Removing a snapshot of the cluster configuration
        7x24 maintenance
            Planning for 7x24 maintenance
            Runtime maintenance
            Hardware maintenance
            Preventive maintenance
        Resource group behavior during cluster events
            Resource group event handling and recovery
            Selective fallover for handling resource groups
            Handling of resource group acquisition failures
            Recovering resource groups when nodes join the cluster
            Handling of resource groups configured with IPAT via IP aliases
            Examples of location dependency and resource group behavior
        HACMP for AIX commands
            Overview of contents
            HACMP for AIX commands
            HACMP for AIX C-SPOC commands
        Using DLPAR and CUoD in an HACMP cluster
            Overview of DLPAR and CUoD
            HACMP integration with the CUoD function
            Planning for CUoD and DLPAR
            Configure CUoD in HACMP
            Application provisioning in HACMP
            Using pre- and post-event scripts
            Troubleshooting DLPAR and CUoD operations in HACMP
        Live Partition Mobility

    Notices
        Privacy policy considerations
        Trademarks

    Index

  • About this document

    This guide introduces the High Availability Cluster Multi-Processing for AIX (HACMP) software. This information is also available on the documentation CD that is shipped with the operating system.

    Highlighting

    The following highlighting conventions are used in this document:

    Bold
        Identifies commands, subroutines, keywords, files, structures, directories, and other items whose names are predefined by the system. Also identifies graphical objects such as buttons, labels, and icons that the user selects.

    Italics
        Identifies parameters whose actual names or values are to be supplied by the user.

    Monospace
        Identifies examples of specific data values, examples of text similar to what you might see displayed, examples of portions of program code similar to what you might write as a programmer, messages from the system, or information you should actually type.

    Case-sensitivity in AIX

    Everything in the AIX® operating system is case-sensitive, which means that it distinguishes between uppercase and lowercase letters. For example, you can use the ls command to list files. If you type LS, the system responds that the command is not found. Likewise, FILEA, FiLea, and filea are three distinct file names, even if they reside in the same directory. To avoid causing undesirable actions to be performed, always ensure that you use the correct case.

    ISO 9000

    ISO 9000 registered quality systems were used in the development and manufacturing of this product.

    HACMP publications

    The HACMP™ software comes with the following publications:
    - HACMP for AIX Release Notes® in /usr/es/sbin/cluster/release_notes describe issues relevant to HACMP on the AIX platform: latest hardware and software requirements, last-minute information on installation, product usage, and known issues.
    - HACMP for AIX: Administration Guide
    - HACMP for AIX: Concepts and Facilities Guide
    - HACMP for AIX: Installation Guide
    - HACMP for AIX: Master Glossary
    - HACMP for AIX: Planning Guide
    - HACMP for AIX: Programming Client Applications
    - HACMP for AIX: Troubleshooting Guide
    - HACMP on Linux: Installation and Administration Guide
    - HACMP for AIX: Smart Assist Developer’s Guide

    HACMP/XD publications

    The HACMP Extended Distance (HACMP/XD) software solutions for disaster recovery, added to the base HACMP software, enable a cluster to operate over extended distances at two sites. HACMP/XD publications include the following:
    - HACMP/XD for Geographic LVM (GLVM): Planning and Administration Guide
    - HACMP/XD for Metro Mirror: Planning and Administration Guide

    HACMP Smart Assist publications

    The HACMP Smart Assist software helps you quickly add an instance of certain applications to your HACMP configuration so that HACMP can manage their availability. The HACMP Smart Assist publications include the following:
    - HACMP Smart Assist for DB2® User’s Guide
    - HACMP Smart Assist for Oracle User’s Guide
    - HACMP Smart Assist for WebSphere® User’s Guide
    - HACMP Smart Assist Release Notes in /usr/es/sbin/cluster/release_notes_assist

  • Administration guide

    This guide provides information necessary to configure, manage, and troubleshoot the High Availability Cluster Multi-Processing for AIX (HACMP) software.

    Note: PowerHA® SystemMirror® is the new name for HACMP. This book will continue to refer to HACMP.

    What's new in Administering HACMP

    Read about new or significantly changed information for the Administering HACMP topic collection.

    How to see what's new or changed

    In this PDF file, you might see revision bars (|) in the left margin that identify new and changed information.

    June 2015

    The following information is a summary of the updates that are made to this topic collection:
    - Added information about the migration process that uses Live Partition Mobility (LPM) on an HACMP node in a cluster in the “Live Partition Mobility” on page 479 topic.

    June 2014

    The following information is a summary of the updates made to this topic collection:
    - Added information about configuring pre-event and post-event processing in the “Configuring pre-event and post-event processing” on page 129 topic.
    - Updated information about configuring authorization for users in the “Configuring authorization for users” on page 360 topic.
    - Updated information about disabling first failure data capture with the FFDC_COLLECTION environment variable in the “First failure data capture” on page 190 topic.

    Administering an HACMP cluster

    These topics provide a list of the tasks you perform to configure, maintain, monitor, and troubleshoot an HACMP system, related administrative tasks, and a list of AIX files modified by HACMP.

    Options for configuring an HACMP cluster

    In HACMP, you can configure a cluster using one of several different HACMP tools.

    These tools include:
    - HACMP SMIT user interface.
    - WebSMIT utility: For information on using this utility, see Administering a cluster using WebSMIT.
    - Online Planning Worksheets (OLPW): This tool provides a convenient method for documenting your cluster configuration. You can use the tool to configure a new cluster or to document an existing cluster. For instructions, see the chapter on Using Online Planning Worksheets in the Planning Guide.
    - Two-Node Cluster Configuration Assistant: Use this tool to configure a basic two-node HACMP cluster. You supply the minimum information required to define a cluster, and HACMP discovers the remainder of the information for you. See the section on Using the Two-Node Cluster Configuration Assistant in the chapter on Creating a Basic HACMP Cluster in the Installation Guide.
    - GLVM Cluster Configuration Assistant: Use this tool to configure a basic Two-Site HACMP configuration and perform automatic GLVM mirroring for existing volume groups. See the section Creating a basic Two-Site HACMP configuration with GLVM mirroring in the Installation Guide.
    - General Configuration Smart Assist: Start with your installed application and configure a basic cluster (any number of nodes). If you are configuring a WebSphere, DB2 UDB or Oracle application, see the corresponding HACMP Smart Assist guide. See Configuring an HACMP cluster (standard).
    - Cluster Snapshot Utility: If you have a snapshot of the HACMP cluster configuration taken from a prior release, you can use the Cluster Snapshot utility to perform the initial configuration.

    Related concepts:
    “Managing HACMP using WebSMIT” on page 11
        HACMP includes a Web-enabled user interface (WebSMIT).
    “Configuring an HACMP cluster (standard)” on page 34
        These topics describe how to configure an HACMP cluster using the SMIT Initialization and Standard Configuration path.

    Related reference:
    “Saving and restoring cluster configurations” on page 375
        You can use the cluster snapshot utility to save and restore cluster configurations. The cluster snapshot utility allows you to save to a file a record of all the data that defines a particular cluster configuration. This facility gives you the ability to recreate a particular cluster configuration, provided the cluster is configured with the requisite hardware and software to support the configuration.

    Related information:
    Using Online Planning Worksheets
    Installation guide
    Creating a basic Two-Site HACMP configuration with GLVM mirroring

    Configuration tasks

    The HACMP configuration tasks are described in these topics. You can choose to use either the standard or the extended SMIT path for the initial configuration, although the standard configuration path is recommended.

    The major steps in the process are:
    - First, configure the cluster topology, and then HACMP resources and resource groups using the standard configuration path
      or
      First, configure the cluster topology, and then HACMP resources and resource groups using the extended configuration path.
    - (Optional) Configure pre- and post-events, remote notification, HACMP File Collections, cluster verification with automatic corrective action, and other optional settings.
    - Verify and synchronize the HACMP configuration.
    - Test the cluster.

    Configuring HACMP using the Standard Configuration path

    Using the options under the Initialization and Standard Configuration SMIT menu, you can add the basic components of the HACMP cluster to the HACMP Configuration Database (ODM) in a few steps. This configuration path significantly automates the discovery and selection of configuration information and chooses default behaviors.

    The prerequisites and default settings of this path are:
    - Connectivity for communication must already be established between all cluster nodes. Automatic discovery of cluster information runs by default. That is, once you have configured communication interfaces/devices and established communication paths to other nodes, HACMP automatically collects HACMP-related information and automatically configures the cluster nodes and networks based on physical connectivity. All discovered networks are added to the cluster configuration. This helps you in the configuration process.
      To understand how HACMP maintains the security of incoming connections, see Maintaining an HACMP cluster.
    - IP aliasing is used as the default mechanism for binding service IP labels/addresses to network interfaces.
    - You can configure the most common types of resources. Customization of resource group fallover/fallback behavior supports the most common scenarios.

    Configuring an HACMP cluster (standard) takes you through the configuration process if you plan to use the Initialization and Standard Configuration path in SMIT. Once you have configured the basic components, you can use the Extended Configuration path to customize your configuration.

    Related concepts:
    “Maintaining an HACMP cluster” on page 5
        HACMP systems have different maintenance tasks.
    “Configuring an HACMP cluster (standard)” on page 34
        These topics describe how to configure an HACMP cluster using the SMIT Initialization and Standard Configuration path.

    Related information:
    Planning cluster network connectivity

    Configuring HACMP using the Extended Configuration path

    In order to configure the less common HACMP elements, or if connectivity to each of the cluster nodes is unavailable, you can manually enter the information. When using the menu panels under the Extended Configuration SMIT path, if any components are on remote nodes, you must manually initiate the discovery of cluster information. That is, the discovery process used by HACMP is optional when using this path (rather than automatic, as it is when using the Initialization and Standard Configuration SMIT path).

    Using the options under the Extended Configuration SMIT menu, you can add basic components to the HACMP Configuration Database (ODM), as well as additional types of resources. Use the Extended Configuration path to customize the cluster for all the components, policies, and options that are not included in the standard configuration menus.

    Configuring topology and resources

    Configuring an HACMP cluster (standard) describes all the SMIT menus and options available for configuring cluster topology and all the various types of resources supported by the software.

    There is an option to configure a distribution preference for the aliases of the service IP labels that are placed under HACMP control. A distribution preference for service IP label aliases is a network-wide attribute used to control the placement of the service IP label aliases on the physical network interface cards on the cluster nodes.

    For more information, see Distribution preference for service IP label aliases: Overview.

    Configuring resource groups and assigning resources

    Configuring HACMP resource groups (extended) describes how to configure different types of resource groups. The Extended Configuration menus include options for configuring various runtime policies for resource groups as well as for customizing fallover, fallback and startup behavior. It also includes the procedure for adding resources to a resource group.

    Configuring dynamic LPAR and Capacity Upgrade on Demand resources

    Using DLPAR and CUoD in an HACMP cluster describes how to plan, integrate, configure, and troubleshoot application provisioning for HACMP through the use of dynamic LPAR (DLPAR) and Capacity Upgrade on Demand (CUoD) functions available on some System p servers. It also includes examples and recommendations about customizing your existing pre- and post-event scripts.

    Related concepts:
    “Configuring an HACMP cluster (standard)” on page 34
        These topics describe how to configure an HACMP cluster using the SMIT Initialization and Standard Configuration path.
    “Using DLPAR and CUoD in an HACMP cluster” on page 459
        These topics describe how to configure and use HACMP in a hardware and software configuration that uses Dynamic Logical Partitions (DLPARs) and the Capacity Upgrade on Demand (CUoD) function.

    Related reference:
    “Distribution preference for service IP label aliases: Overview” on page 67
        You can configure a distribution preference for the service IP labels that are placed under HACMP control. HACMP lets you specify the distribution preference for the service IP label aliases. These are the service IP labels that are part of HACMP resource groups and that belong to IPAT via IP Aliasing networks.
    “Configuring HACMP resource groups (extended)” on page 96
        You may have already used the Standard Configuration panels to configure some resources and groups automatically. Use the Extended Configuration SMIT panels to add more resources and groups, to make changes, or to add more extensive customization.

    Configuring cluster events

    The HACMP system is event-driven. An event is a change of status within a cluster. When the Cluster Manager detects a change in cluster status, it executes the designated script to handle the event and initiates any user-defined customized processing.

    To configure customized cluster events, you indicate the script that handles the event and any additional processing that should accompany an event. Configuring cluster events describes the procedures for customization of event handling in HACMP.
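The event model described above can be illustrated with a small sketch. It is not HACMP code: the `EventConfig` class, the `handle_event` function, and the `node_down` label are hypothetical names chosen for illustration, and the pre-event, event script, post-event ordering is an assumption based on the names of the pre- and post-event options mentioned in this guide.

```python
from typing import Callable, Dict, List

Handler = Callable[[str], None]


class EventConfig:
    """Holds the handling script and optional pre/post commands for one event."""

    def __init__(self, script: Handler) -> None:
        self.script = script
        self.pre: List[Handler] = []   # user-defined processing run before the script
        self.post: List[Handler] = []  # user-defined processing run after the script


def handle_event(name: str, table: Dict[str, EventConfig]) -> None:
    """Run the pre-event commands, the event script, then the post-event commands."""
    config = table[name]
    for step in [*config.pre, config.script, *config.post]:
        step(name)


# Hypothetical example: record the order in which the steps run.
log: List[str] = []
table = {"node_down": EventConfig(lambda e: log.append(f"script:{e}"))}
table["node_down"].pre.append(lambda e: log.append(f"pre:{e}"))
table["node_down"].post.append(lambda e: log.append(f"post:{e}"))
handle_event("node_down", table)
```

In this sketch each event name maps to exactly one handling script, while any number of additional commands can be attached around it.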

    Configuring remote notification for cluster events

    The remote notification function allows you to direct SMS text-message notifications to any address including your cell phone.

    With previous versions of HACMP, you could alter event scripts to send email when connected to the Internet. Alternatively, the remote notification subsystem could send numeric or alphanumeric pages through a dialer modem, which uses the standard Telocator Alphanumeric Protocol (TAP).

    For more information, see Defining a new remote notification method.

    Related tasks:
    “Defining a new remote notification method” on page 134
        You can define a new remote notification method using SMIT.

    Related reference:
    “Configuring cluster events” on page 127
        The HACMP system is event-driven. An event is a change of status within a cluster. When the Cluster Manager detects a change in cluster status, it executes the designated script to handle the event and initiates any user-defined customized processing.

    Verifying and synchronizing the configuration

    Verifying the cluster configuration assures you that all resources used by HACMP are properly configured, and that ownership and takeover of those resources are defined and are in agreement across all nodes. By default, if the verification is successful, the configuration is automatically synchronized.

    You should verify the configuration after making changes to a cluster or node. The Verifying and synchronizing an HACMP cluster section describes the SMIT menus for verification, explains the contents and uses of the clverify.log file, and describes how to verify your cluster.

    Verifying and synchronizing an HACMP cluster also explains how to create and maintain HACMP File Collections. Using the HACMP File Collections utility, you can request that a list of files is automatically kept synchronized across the cluster. You no longer have to manually copy an updated file to every cluster node, verify that the file is properly copied, and confirm that each node has the same version of it. If you use the HACMP File Collections utility, HACMP can detect and warn you during cluster verification if one or more files in a collection are deleted or have a zero value on one or more cluster nodes.

    Related reference:
    “Verifying and synchronizing an HACMP cluster” on page 137
        Verifying and synchronizing your HACMP cluster assures you that all resources used by HACMP are configured appropriately and that rules regarding resource ownership and resource takeover are in agreement across all nodes. You should verify and synchronize your cluster configuration after making any change within a cluster. For example, any change to the hardware, operating system, node configuration, or cluster configuration.
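The kind of check that File Collections verification performs can be pictured with a minimal sketch. This is not the HACMP implementation: the `find_problem_files` function, the node names, and the report layout are hypothetical, and "zero value" is interpreted here as a zero-length file, which is an assumption.

```python
from typing import Dict, List, Optional, Tuple

# For each node, map a file path to its size in bytes, or None if the file is absent.
NodeReport = Dict[str, Optional[int]]


def find_problem_files(reports: Dict[str, NodeReport]) -> List[Tuple[str, str, str]]:
    """Return (node, path, problem) for every missing or zero-length file."""
    problems = []
    for node in sorted(reports):
        for path in sorted(reports[node]):
            size = reports[node][path]
            if size is None:
                problems.append((node, path, "missing"))
            elif size == 0:
                problems.append((node, path, "zero length"))
    return problems


# Hypothetical two-node report for a collection of two files.
reports = {
    "nodeA": {"/etc/hosts": 412, "/etc/services": 0},
    "nodeB": {"/etc/hosts": None, "/etc/services": 9301},
}
problems = find_problem_files(reports)
```

Taking per-node size reports as input keeps the check independent of how the sizes are gathered from each node.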

    Testing the cluster

    HACMP includes the Cluster Test Tool to help you test the recovery procedures for a new cluster before the cluster becomes part of your production environment.

    You can also use the tool to test configuration changes in an existing cluster, when the cluster services are not running. Testing an HACMP cluster explains how to use the Cluster Test Tool.

    Related reference:
    “Testing an HACMP cluster” on page 157
        These topics describe how to use the Cluster Test Tool to test the recovery capabilities of an HACMP cluster.

    Maintaining an HACMP cluster

    HACMP systems have different maintenance tasks.

    Starting and stopping cluster services
        Various methods for starting and stopping cluster services are available.

    Maintaining shared logical volume manager components
        Any changes to logical volume components must be synchronized across all nodes in the cluster. Using C-SPOC (the Cluster Single Point of Control) to configure the cluster components on one node and then synchronize the cluster saves you time and effort.

    Managing the cluster topology
        Any changes to cluster configuration must be propagated across all nodes. Managing the cluster topology describes how to modify cluster topology after the initial configuration. You can make most changes on one node and then synchronize the cluster.

        These topics also include information about the HACMP Communication Interface Management SMIT menu that lets you configure communication interfaces/devices to AIX without leaving HACMP SMIT.

    Managing cluster resources
        Any changes to cluster resources require updating the cluster across all nodes. You can make most changes on one node and then synchronize the cluster.

    Managing cluster resource groups
        The Managing resource groups in a cluster section describes how to modify cluster resource groups after the initial configuration. You can add or delete resources and change the runtime policies of resource groups.

        You can dynamically migrate resource groups to other nodes and take them online or offline, using the Resource Group Management utility (clRGmove) from the command line or through SMIT.

    Managing users and groups in a cluster
        HACMP allows you to manage user accounts for a cluster from a Single Point of Control (C-SPOC). Use the C-SPOC SMIT panels on any node to create, change, or remove users and groups from all cluster nodes by executing a C-SPOC command on any single cluster node.

    Managing cluster security and inter-node communications
        You can protect access to your HACMP cluster by setting up security for cluster communications between nodes. HACMP provides security for connections between nodes, with higher levels of security for inter-node communications provided through virtual private networks (VPN). In addition, you can configure authentication and encryption of the messages sent between nodes.

    Understanding the /usr/es/sbin/cluster/etc/rhosts file
        A Cluster Communications daemon (clcomd) runs on each HACMP node to transparently manage inter-node communications for HACMP.

        In other words, HACMP manages connections for you automatically:
        - If the /usr/es/sbin/cluster/etc/rhosts file is empty (this is the initial state of this file, upon installation), then clcomd accepts the first connection from another node and adds entries to the /usr/es/sbin/cluster/etc/rhosts file. Since this file is empty upon installation, the first connection from another node adds IP addresses to this file. The first connection usually is performed for verification and synchronization purposes, and this way, for all subsequent connections, HACMP already has entries for node connection addresses in its Configuration Database.
        - clcomd validates the addresses of the incoming connections to ensure that they are received from a node in the cluster. The rules for validation are based on the presence and contents of the /usr/es/sbin/cluster/etc/rhosts file.
        - In addition, HACMP includes in the /usr/es/sbin/cluster/etc/rhosts file the addresses for all network interface cards from the communicating nodes.
        - If the /usr/es/sbin/cluster/etc/rhosts file is not empty, then clcomd compares the incoming address with the addresses/labels found in the HACMP Configuration Database (ODM) and then in the /usr/es/sbin/cluster/etc/rhosts file and allows only listed connections. In other words, after installation, HACMP accepts connections from another HACMP node and adds the incoming address(es) to the local file, thus allowing you to configure the cluster without ever editing the file directly.
        - If the /usr/es/sbin/cluster/etc/rhosts file is not present, clcomd rejects all connections.

        Typically, you do not manually add entries to the /usr/es/sbin/cluster/etc/rhosts file unless you have specific security needs or concerns.

    If you are especially concerned about network security (for instance, you are configuring a cluster on an unsecured network), then prior to configuring the cluster, you may wish to manually add all the IP addresses/labels for the nodes to the empty /usr/es/sbin/cluster/etc/rhosts file. For information on how to do it, see the section Manually configuring /usr/es/sbin/cluster/etc/rhosts file on individual nodes.

    After you synchronize the cluster, you can empty the /usr/es/sbin/cluster/etc/rhosts file (but not remove it), because the information present in the HACMP Configuration Database would be sufficient for all future connections.
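    The validation rules described above can be summarized as a small decision function. The following is a simplified model for illustration only, not the clcomd implementation. The names Decision, read_rhosts, and decide are hypothetical, and treating an emptied file on a synchronized node as "accept only Configuration Database addresses" is an interpretation of the text rather than documented behavior.

```python
from enum import Enum
from pathlib import Path
from typing import Iterable, List, Optional


class Decision(Enum):
    REJECT = "reject"
    ACCEPT = "accept"
    ACCEPT_AND_RECORD = "accept and add to rhosts"


def read_rhosts(path: Path) -> Optional[List[str]]:
    """Return the non-blank entries of the file, or None if it is absent."""
    if not path.exists():
        return None
    return [line.strip() for line in path.read_text().splitlines() if line.strip()]


def decide(address: str,
           rhosts_entries: Optional[List[str]],
           odm_addresses: Iterable[str]) -> Decision:
    """Model the documented rules for one incoming connection (hypothetical)."""
    known = set(odm_addresses)
    if rhosts_entries is None:
        # File not present: every connection is rejected.
        return Decision.REJECT
    if address in known or address in rhosts_entries:
        # Address is listed in the Configuration Database or in the file.
        return Decision.ACCEPT
    if not rhosts_entries and not known:
        # Freshly installed node: the first connection is accepted and recorded.
        return Decision.ACCEPT_AND_RECORD
    return Decision.REJECT
```

    In this model, emptying the file after synchronization keeps known nodes connected, while removing the file locks every node out, which matches the advice to empty the file but not remove it.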


    If the configuration for AIX adapters was changed after the cluster has been synchronized, HACMP may issue an error. See the section Troubleshooting the Cluster Communications daemon or Checking the cluster communications daemon in the Troubleshooting Guide for information on refreshing the clcomd utility and updating /usr/es/sbin/cluster/etc/rhosts.

    The ~/.rhosts File

    HACMP does not use native AIX remote execution (rsh), so you do not need to configure a ~/.rhosts file unless you intend to use Workload Partitions (WPAR), which have their own requirements on this file.

    Saving and restoring HACMP cluster configurations

    After you configure the topology and resources of a cluster, you can save the cluster configuration by taking a cluster snapshot. You can later restore the saved configuration, if needed, by applying the cluster snapshot. A cluster snapshot can also be applied to an active cluster to dynamically reconfigure the cluster.

    Additional HACMP maintenance tasks

    Additional tasks that you can perform to maintain an HACMP system include changing the log file attributes for a node and performance tuning.

    Related reference:
    “Starting and stopping cluster services” on page 193
    These topics explain how to start and stop cluster services on cluster nodes and clients.
    “Managing the cluster topology” on page 283
    These topics describe how to reconfigure the cluster topology.
    “Managing the cluster resources” on page 312
    Use these topics to manage the resources in your cluster. The first part describes the dynamic reconfiguration process. The second part describes procedures for making changes to individual cluster resources.
    “Managing resource groups in a cluster” on page 329
    These topics describe how to reconfigure the cluster resource groups. They describe adding and removing resource groups, and changing resource group attributes and processing order.
    “Troubleshooting the Cluster Communications daemon” on page 367
    In some cases, if you change or remove IP addresses in the AIX adapter configuration, and this takes place after the cluster has been synchronized, the Cluster Communications daemon cannot validate these addresses against the /usr/es/sbin/cluster/etc/rhosts file or against the entries in the HACMP Configuration Database, and HACMP issues an error.
    “Saving and restoring cluster configurations” on page 375
    You can use the cluster snapshot utility to save and restore cluster configurations. The cluster snapshot utility allows you to save to a file a record of all the data that defines a particular cluster configuration. This facility gives you the ability to recreate a particular cluster configuration, provided the cluster is configured with the requisite hardware and software to support the configuration.

    Related information:
    Checking the cluster communications daemon

    Monitoring the cluster

    By design, failures of components in the cluster are handled automatically, but you need to be aware of all such events.

    Monitoring an HACMP cluster describes various tools you can use to check the status of an HACMP cluster, the nodes, networks, and resource groups within that cluster, and the daemons that run on the nodes.

    The HACMP software includes the Cluster Information Program (Clinfo), based on SNMP. The HACMP for AIX software provides the HACMP for AIX MIB, associated with and maintained by HACMP. Clinfo retrieves this information from the HACMP for AIX Management Information Base (MIB).


    The Cluster Manager gathers information relative to cluster state changes of nodes and interfaces. The Cluster Information Program (Clinfo) gets this information from the Cluster Manager and allows clients communicating with Clinfo to be aware of a cluster's state changes. This cluster state information is stored in the HACMP MIB.

    Clinfo runs on cluster server nodes and on HACMP client machines. It makes information about the state of an HACMP cluster and its components available to clients and applications via an application programming interface (API). Clinfo and its associated APIs enable you to write applications that recognize and respond to changes within a cluster.

    The Clinfo program, the HACMP MIB, and the APIs are described in the Programming Client Applications Guide.

    Although the combination of HACMP and the high availability features built into the AIX system keeps single points of failure to a minimum, there are still failures that, although detected, can cause other problems.

    For suggestions on customizing error notification for various problems not handled by the HACMP events, see the Planning Guide.

    Related reference:
    “Monitoring an HACMP cluster” on page 205
    These topics describe tools you can use to monitor an HACMP cluster.

    Related information:
    Programming client applications
    Planning guide

    AIX files modified by HACMP

    These topics discuss the AIX files that are modified to support HACMP. They are not distributed with HACMP.

    /etc/hosts

    The cluster event scripts use the /etc/hosts file for name resolution. All cluster node IP interfaces must be added to this file on each node.

    HACMP may modify this file to ensure that all nodes have the necessary information in their /etc/hosts file, for proper HACMP operations.

    If you delete service IP labels from the cluster configuration using SMIT, we recommend that you also remove them from /etc/hosts. This reduces the possibility of having conflicting entries if the labels are reused with different addresses in a future configuration.

    Note that DNS and NIS are disabled during HACMP-related name resolution. This is why HACMP IP addresses must be maintained locally.

    /etc/inittab

    The /etc/inittab file is modified in several different cases.

    These cases include:
    v HACMP is configured for IP address takeover.
    v The Start at System Restart option is chosen on the SMIT System Management (C-SPOC) > Manage HACMP Services > Start Cluster Services panel.
    v The /etc/inittab file has the following entry for the /usr/es/sbin/cluster/etc/rc.init script:

    hacmp:2:once:/usr/es/sbin/cluster/etc/rc.init


    This entry starts the HACMP Communications Daemon, clcomd, and the clstrmgr subsystem.

    Modifications to the /etc/inittab file due to IP address takeover

    The following entry is added to the /etc/inittab file for HACMP network startup with IP address takeover:

    harc:2:wait:/usr/es/sbin/cluster/etc/harc.net # HACMP network startup

    Modifications to the /etc/inittab file due to system boot

    The /etc/inittab file is used by the init process to control the startup of processes at boot time.

    When the system boots, the /etc/inittab file calls the /usr/es/sbin/cluster/etc/rc.cluster script to start HACMP. The entry is added to the /etc/inittab file if the Start at system restart option is chosen on the SMIT System Management (C-SPOC) > Manage HACMP Services > Start Cluster Services panel or when the system boots:

    hacmp:2:once:/usr/es/sbin/cluster/etc/rc.init

    This starts the HACMP Communications Daemon, clcomd, and the clstrmgr subsystem.
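    To check which of these entries are present on a node, the colon-separated records can be split into their four fields (identifier, run level, action, command). The sketch below is a minimal illustration that works on the text of an inittab file; the names InittabEntry, parse_inittab_line, and hacmp_entries are hypothetical, and skipping records with an empty identifier or fewer than four fields is an assumption about how to treat comments and malformed lines.

```python
from typing import Dict, NamedTuple, Optional


class InittabEntry(NamedTuple):
    identifier: str
    runlevels: str
    action: str
    command: str


def parse_inittab_line(line: str) -> Optional[InittabEntry]:
    """Split one record into its four fields; return None if it does not fit."""
    fields = line.rstrip("\n").split(":", 3)
    if len(fields) != 4 or not fields[0].strip():
        return None
    identifier, runlevels, action, command = fields
    return InittabEntry(identifier.strip(), runlevels, action, command.strip())


def hacmp_entries(text: str) -> Dict[str, InittabEntry]:
    """Return the 'hacmp' and 'harc' records found in the inittab text."""
    found = {}
    for line in text.splitlines():
        entry = parse_inittab_line(line)
        if entry is not None and entry.identifier in ("hacmp", "harc"):
            found[entry.identifier] = entry
    return found
```

    Passing the contents of /etc/inittab to hacmp_entries returns a dictionary that contains a "harc" key only when the IP address takeover entry has been added.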

    Because some of the daemons that are started by rc.tcpip are needed at boot up, HACMP adds an inittab entry for the harc.net script with a runlevel of 2. The harc.net script runs at boot time and starts these subsystems:
    v syslogd
    v portmap
    v inetd

    The harc.net script also has code to start the following daemons:
    v nfsd
    v rpc.mountd
    v rpc.statd
    v rpc.lockd

    The code to start these NFS-related daemons is commented out, and is only uncommented if needed.

    Only the syslogd, portmap, and inetd subsystems are common to the rc.tcpip and harc.net scripts, but there is always the possibility that the NFS-related subsystems could have been added to the rc.tcpip script by the customer.

    See the Starting and stopping cluster services section for more information about the files involved in starting and stopping HACMP.

    Related reference:
    “Starting and stopping cluster services” on page 193
    These topics explain how to start and stop cluster services on cluster nodes and clients.

    /etc/services

    The /etc/services file defines the sockets and protocols used for network services on a system. The ports and protocols used by the HACMP components are defined here.

    clinfo_deadman 6176/tcp
    clinfo_client 6174/tcp
    clsmuxpd 6270/tcp
    clm_lkm 6150/tcp
    clm_smux 6175/tcp
    godm 6177/tcp
    topsvcs 6178/udp
    grpsvcs 6179/udp
    emsvcs 6180/udp
    clcomd 6191/tcp
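    A quick way to confirm that a node's /etc/services file still carries these definitions is to compare it against the list above. The following sketch is illustrative only; HACMP_SERVICES simply restates the entries from this section, and the function names are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Service names and port/protocol pairs as listed in this section.
HACMP_SERVICES: Dict[str, str] = {
    "clinfo_deadman": "6176/tcp",
    "clinfo_client": "6174/tcp",
    "clsmuxpd": "6270/tcp",
    "clm_lkm": "6150/tcp",
    "clm_smux": "6175/tcp",
    "godm": "6177/tcp",
    "topsvcs": "6178/udp",
    "grpsvcs": "6179/udp",
    "emsvcs": "6180/udp",
    "clcomd": "6191/tcp",
}


def parse_services(text: str) -> Set[Tuple[str, str]]:
    """Collect (name, port/protocol) pairs, ignoring comments and blank lines."""
    entries = set()
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2:
            entries.add((fields[0], fields[1]))
    return entries


def missing_services(text: str, expected: Dict[str, str] = HACMP_SERVICES) -> List[str]:
    """Return the expected service names that are absent or use another port."""
    present = parse_services(text)
    return [name for name, port in expected.items() if (name, port) not in present]
```

    An empty result from missing_services means that every listed entry is present with the expected port and protocol.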

    Note: If, in addition to HACMP, you install HACMP/XD for GLVM, the following entry for the port number and connection protocol is automatically added to the /etc/services file on each node on the local and remote sites on which you installed the software:

    rpv 6192/tcp

    This default value enables the RPV server and RPV client to start immediately after they are configured, that is, to be in the available state. For more information, see HACMP/XD for GLVM Planning and Administration Guide.

    Related information:
    Geographic LVM Planning and administration

    /etc/snmpd.conf

    The SNMP daemon reads the /etc/snmpd.conf configuration file when it starts up and when a refresh or kill -1 signal is issued. This file specifies the community names and associated access privileges and views, hosts for trap notification, logging attributes, snmpd-specific parameter configurations, and SMUX configurations for the snmpd.

    Note: The default version of the snmpd.conf file for AIX is snmpdv3.conf.

    The HACMP installation process adds a clsmuxpd password to this file. The following entry is added to the end of the file, to include the HACMP MIB supervised by the Cluster Manager:

    smux 1.3.6.1.4.1.2.3.1.2.1.5 clsmuxpd_password # HACMP/ES for AIX clsmuxpd

    The Simple Network Management Protocol (SNMP) community name used by HACMP depends on the version of SNMP you are running on your system. The SNMP community name is determined as follows:
    v If your system is running SNMP V1, the community name is the first name found that is not private or system in the output of the lssrc -ls snmpd command.
    v If your system is running SNMP V3, the community name is found in the VACM_GROUP entry in the /etc/snmpdv3.conf file.

    The Clinfo service also gets the SNMP community name in the same manner. The Clinfo service supports the -c option for specifying the SNMP community name, but its use is not required. The use of the -c option is considered a security risk because doing a ps command could find the SNMP community name. If it is important to keep the SNMP community name protected, change permissions on /var/hacmp/log/hacmp.out, /etc/snmpd.conf, /smit.log, and /usr/tmp/snmpd.log to not be world readable.

    Related information:
    snmpd.conf file
    SNMP for network management
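    The permission advice for the files that can expose the SNMP community name can be checked with a few lines of code. This is a minimal sketch that tests only the "read by others" permission bit; the function names are hypothetical, and the file list repeats the paths named in this section.

```python
import os
import stat
from typing import Iterable, List

# Files named in this section as possibly exposing the SNMP community name.
SENSITIVE_FILES = [
    "/var/hacmp/log/hacmp.out",
    "/etc/snmpd.conf",
    "/smit.log",
    "/usr/tmp/snmpd.log",
]


def is_world_readable(path: str) -> bool:
    """True if the 'read by others' permission bit is set on the file."""
    return bool(os.stat(path).st_mode & stat.S_IROTH)


def world_readable(paths: Iterable[str]) -> List[str]:
    """Return the existing files from the list that are world readable."""
    return [p for p in paths if os.path.exists(p) and is_world_readable(p)]
```

    Calling world_readable(SENSITIVE_FILES) on a node lists the files whose permissions still need to be tightened; files that do not exist are skipped.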

    /etc/snmpd.peers

    The /etc/snmpd.peers file configures snmpd SMUX peers.

    During installation, HACMP adds the following entry to include the clsmuxpd password to this file:

    clsmuxpd 1.3.6.1.4.1.2.3.1.2.1.5 "clsmuxpd_password" # HACMP/ES for AIX clsmuxpd

    /etc/syslog.conf

    The /etc/syslog.conf configuration file is used to control output of the syslogd daemon, which logs system messages.


    During the install process, HACMP adds entries to this file that direct the output from HACMP-related problems to certain files.

    # example:
    # "mail messages, at debug or higher, go to Log file. File must exist."
    # "all facilities, at debug and higher, go to console"
    # "all facilities, at crit or higher, go to all users"
    # mail.debug /usr/spool/mqueue/syslog
    # *.debug /dev/console
    # *.crit *
    # *.debug /tmp/syslog.out rotate size 100k files 4
    # *.crit /tmp/syslog.out rotate time 1d
    local0.crit /dev/console
    local0.info /var/hacmp/adm/cluster.log
    user.notice /var/hacmp/adm/cluster.log
    daemon.notice /var/hacmp/adm/cluster.log

    The /etc/syslog.conf file should be identical on all cluster nodes.
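    Because the file should be identical on all cluster nodes, it can help to compare only the active rules and ignore comments. The sketch below is one way to do that on the text of two copies of the file; the function names are hypothetical, and treating each rule as a selector followed by whitespace and a destination is a simplification.

```python
from typing import List, Tuple


def active_rules(text: str) -> List[Tuple[str, str]]:
    """Return (selector, destination) pairs for the uncommented lines."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(None, 1)
        if len(fields) == 2:
            rules.append((fields[0], fields[1].strip()))
    return rules


def same_rules(node_a_text: str, node_b_text: str) -> bool:
    """True if two copies of the file contain the same active rules in order."""
    return active_rules(node_a_text) == active_rules(node_b_text)
```

    For the entries shown above, active_rules returns the four uncommented rules, three of which write to /var/hacmp/adm/cluster.log.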

    /var/spool/cron/crontabs/root

    The /var/spool/cron/crontabs/root file contains commands needed for basic system control. The installation process adds HACMP logfile rotation to the file.

    During the install process, HACMP adds the following entry to this file, which runs the clcycle utility every day at midnight:

    0 0 * * * /usr/es/sbin/cluster/utilities/clcycle 1>/dev/null 2>/dev/null # HACMP for AIX Logfile rotation

    Changing script behavior for fatal errors in HACMP

    If the SRC detects that the clstrmgr daemon has exited abnormally, it executes the /usr/es/sbin/cluster/utilities/clexit.rc script to halt the system. If the SRC detects that any other HACMP daemon has exited abnormally, it executes the clexit.rc script to stop these processes, but does not halt the system.

    You can change the default behavior of the clexit.rc script by configuring the /usr/es/sbin/cluster/etc/hacmp.term file to be called when the HACMP cluster services terminate abnormally. You can customize the hacmp.term file so that HACMP will take actions specific to your installation.

    Managing HACMP using WebSMIT

    HACMP includes a Web-enabled user interface (WebSMIT).

    WebSMIT provides many capabilities, including:
    v HACMP SMIT configuration and management functions
    v Interactive cluster status display, including self-updating status
    v Intelligent HACMP online documentation
    v Interactive graphical displays of cluster topology and resource group dependencies.
    v User authorization settings. In addition to giving users unrestricted access, administrators can specify a group of users that have read-only access. Those users have permissions to view configuration and status, and navigate through SMIT panels, but cannot execute commands or make changes.
    v Support for Mozilla-based browsers (Mozilla 1.7.3 for AIX and Firefox 1.0.6 and higher) in addition to Internet Explorer versions 6.0 and higher. For the latest version information, see the WebSMIT README.
    v Option of installing WebSMIT outside of a cluster. Additionally, the architecture has been changed to one-to-many; one instance/installation of WebSMIT can be used to manage multiple clusters (unlimited) and can handle multiple, concurrent user connections.
    v Interactive, consolidated view of multiple clusters
    v Single sign-on capability, as each registered WebSMIT user only needs to log in once to have access to multiple clusters.
    v A view of multiple clusters, collecting and displaying cluster and node status for each one. This provides an "at a glance" health check for the entire, deployed HACMP enterprise.

    Because WebSMIT runs in a Web browser, you can access it from any platform.

    You can install the WebSMIT interface either on an HACMP cluster node, or on a stand-alone, non-cluster server.

    See the /usr/es/sbin/cluster/wsm/README file for information on setting up WebSMIT to work with your web server, the default security mechanisms in place when installing HACMP, and the configuration files available for customization. For more information about configuring WebSMIT, see Installing and Configuring WebSMIT in the Installation Guide.

    Working with WebSMIT

    This topic describes the WebSMIT display.

    Here is the WebSMIT main display page:

    The WebSMIT display is divided into three frames: the header frame, the navigation frame, and the activity frame.

    Header frame

    The Header frame appears at the top of WebSMIT. The Header frame displays the current connection information. For example, in the screen above, zodiac_cluster is the name of the cluster and (aquarius) is the name of the node that is currently connected. If the cluster is not currently connected, then No cluster connection appears. The top frame displays the name of the cluster and provides a logout link.


    Below the cluster and connection information is the current authorization level. In the screen above, the Access Mode displayed is unrestricted. If the current node connection changes, the connection information is automatically updated with the new node that WebSMIT fails over to.

    Clicking either IBM or HACMP directs you to the IBM and HACMP home pages.

    You can log out of WebSMIT by selecting the logout icon on the right.

    Navigation frame

    The left frame displays three tabbed views from which you can navigate your cluster, as well as configuration menus. These navigation tabs display items in an expandable, hierarchical view. Selecting an item updates the content displayed in the activity frame to reflect the current selection. Open and contract any of the trees in this frame, as needed, to show or hide the subcomponents by clicking on the + or - symbol or by selecting ↓ or ↑. In the previous figure, the WebSMIT display shows the navigation frame with the Nodes and Networks tab selected.

    You may select the following tabs from the navigation frame:
    v SMIT tab. Provides hierarchical navigation of the SMIT menus to configure and manage your cluster. Clicking on a menu item from the SMIT tab displays the corresponding SMIT panel in the activity frame in the Configuration tab.
    v N&N (Nodes and Networks) tab. Provides an expandable hierarchical view based on the cluster topology (either site- or node-centric depending on the cluster definition). The icons to the left of the hierarchical menu items indicate the type and state of the corresponding topological object. Clicking on the items displayed within the N&N tree results in a textual report being displayed in the Details tab, and a graphical report in the Associations tab, both within the activity frame.
    v RGs (Resource Groups) tab. Provides an expandable hierarchical view, based on the cluster resources. The icons to the left of the hierarchical menu items indicate the type and state of the corresponding cluster resource. Clicking on the items displayed within the RGs tree results in a textual report being displayed in the Details tab, and a graphical report in the Associations tab, both within the activity frame.

    Activity frame

    The right frame displays five tabbed views from which you can configure and manage your clusters. You may select the following tabs from the activity frame:
    v Configuration tab. Displays SMIT panels. The SMIT panels can be selected from within this tab. The panels can also originate from selections made in any of the tabs in the navigation frame or from the Enterprise tab.
    v Details tab. Displays detailed reports pertaining to cluster components selected in the N&N or RGs tabs.
    v Associations tab. Displays a graphical representation of a cluster component selected from the N&N or RGs tabs. Logical and physical relationships may be displayed, as well as resource contents.
    v Documentation tab. Displays the HACMP for AIX documentation bookshelf page, providing access to the HACMP documentation installed on the WebSMIT server, or to equivalent online documentation, when available. The displayed product documentation links are always made version appropriate, to match the current cluster connection (if any). This tab also provides links to various HACMP for AIX online resources.
    v Enterprise tab. Displays all the clusters that have been registered with this WebSMIT server that you are authorized to see. Current status is displayed graphically for each cluster and that cluster's nodes. A full WebSMIT connection may be established to any of the displayed clusters at any time.


    Help is available where indicated by a question mark, slightly smaller than the surrounding text. Click on the question mark to view the available help text in a small window.

    Note: When help is available for an item, the pointer changes to the help available cursor (typically a question mark) when you mouse over that item. When this occurs, clicking on that item results in a help window.

    To display a context-sensitive popup menu of actions that you may take for a given item, right-click on the item. This capability is provided for the items in the N&N, RGs, Associations, and Enterprise tabs. Not all items will provide a context-sensitive menu.

    Additionally, when you right-click in the background of the Enterprise tab, you are given actions that either affect multiple clusters or the WebSMIT server behavior.

    Related information:
    Installing and configuring WebSMIT

    Managing multiple clusters with WebSMIT

    You can use a single installation of WebSMIT to manage more than one cluster.

    You can also install multiple WebSMIT servers, each with redundant cluster registrations. Then, if one of those WebSMIT servers goes down, administrators can simply log in to another one.

    Cluster registrations are initiated through the Enterprise tab. After registration, clusters are then displayed in the Enterprise tab.

    You can perform any of the traditional tasks for any of the clusters that you manage.

    Related information:
    Planning for WebSMIT

    Using the Enterprise view

    Use the Enterprise view to register and manage all available clusters.

    From the Enterprise view, you can add/remove clusters, change user access, and view cluster and node status.


    Registering and deregistering clusters

    You can control which clusters are available in WebSMIT by registering the clusters. If WebSMIT is installed on a cluster node, that cluster node is automatically registered. Additionally, clusters that are upgraded are automatically registered in WebSMIT.

    WebSMIT includes a configuration variable, RESTRICTED_ACCESS, to manage upgrades. While upgrading, the variable is set to 0 to ensure cluster registration and user access for the local cluster. After upgrading, you should consider changing the variable to 1 to fully engage your access controls.

    You can add any number of clusters to the cluster registration. Be aware, however, that including a large number of registered clusters has the potential to affect WebSMIT performance. You will need to balance the number of registered clusters against the performance of WebSMIT.

    To register a cluster with WebSMIT, follow these steps:
    1. On one node within the remote cluster, use the /usr/es/sbin/cluster/utilities/wsm_gateway utility to allow the WebSMIT server to communicate with that cluster.
    2. In WebSMIT, right-click over any blank space within the Enterprise view, and select Add a Cluster.
    3. On the WebSMIT Cluster Registration panel, enter the host name of the node used in step 1.


    Once the registration is complete, the cluster is immediately available for those administrators that have "ALL" access (the next time they log in to WebSMIT, or reload the Enterprise view). Note that while the cluster has been registered, that only makes it available for use. Access to it must still be granted to the appropriate administrators.

    Note: You can also manually register clusters using the /usr/es/sbin/cluster/wsm/utils/wsm_register utility. However, using this approach will not automatically update the Enterprise view; you will need to refresh the view.

    You can deregister a cluster by right-clicking the cluster icon and selecting Remove Cluster or by using the –r option in the wsm_register utility. Removing a cluster removes all cluster registration data on the WebSMIT server and removes WebSMIT access capability from each node in the remote cluster.

    Creating user access

    In addition to enabling WebSMIT login capability for a new user (for example, by including that user in the ACCEPTED_USERS list in the WebSMIT configuration file, wsm_smit.conf, or by using htaccess), you must also give a user specific access to the clusters that are registered with WebSMIT.

    You must establish some level of access for each WebSMIT user or they will not be able to work with any HACMP clusters through WebSMIT. Access may be granted via the WebSMIT GUI itself, or by using the command line.

    To grant access using the WebSMIT GUI, follow these steps:
    1. In WebSMIT, right-click over any blank space within the Enterprise view, and select Add Access.
    2. On the Add Access to WebSMIT panel, enter the user's login ID in the Accessor Name box, select the user Accessor Type, and select the cluster(s) and/or cluster group(s) that the user is allowed to access. Note that to create a cluster group, choose group for the Accessor Type. To allow access to all clusters, including those clusters that are created in the future, use ALL for the Cluster / Group IDs field.

    To grant access using the command line, use the wsm/utils/wsm_access utility. This utility is used by the Access options available in the Enterprise tab's right-click menu.

    If a user belongs to a system group (/etc/group file), that user is automatically given access to that group’s clusters, provided that the system group has also been defined as a cluster group in WebSMIT.

    Connecting to a cluster

    In order to perform tasks for a remote cluster, it must be connected to WebSMIT. After you have registered a cluster, you can connect to it.

    To connect to a cluster, right-click the cluster in the Enterprise view and select Connect to Cluster.


    WebSMIT queries the active nodes on that cluster until it is able to connect to one of them. If WebSMIT is unable to connect to any of the cluster nodes, the connection fails.

    Note: During the course of the connection, if communications are lost with that node, WebSMIT automatically attempts to switch the connection to another node within that cluster. The connection information in the Header Frame is updated accordingly.
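    The connection behavior described here amounts to trying each candidate node in turn and keeping the first one that answers. The sketch below models that idea only; it is not WebSMIT code, and connect_to_cluster, try_connect, and the node name pisces are hypothetical (aquarius is the example node mentioned earlier in this topic).

```python
from typing import Callable, Iterable, Optional


def connect_to_cluster(nodes: Iterable[str],
                       try_connect: Callable[[str], bool]) -> Optional[str]:
    """Return the first node that accepts a connection, or None if none do."""
    for node in nodes:
        if try_connect(node):
            return node
    return None


def fail_over(current: str,
              nodes: Iterable[str],
              try_connect: Callable[[str], bool]) -> Optional[str]:
    """After losing 'current', try the remaining nodes of the same cluster."""
    remaining = [node for node in nodes if node != current]
    return connect_to_cluster(remaining, try_connect)
```

    A result of None corresponds to the case where the connection fails because no node in the cluster can be reached.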

    Viewing cluster status

    You can use the Enterprise tab to view the status of all the currently displayed clusters. View node status by moving your cursor over an active cluster.

    View status by moving your cursor over a connected cluster.

    The Enterprise tab polls for status at regular intervals, polling all the clusters that it currently has on display. This polling interval is controlled by the ENTERPRISE_POLLING_INTERVAL option in the wsm_smit.conf file (values must be in whole seconds).

    These values may be adjusted to either improve the performance of the WebSMIT GUI (by increasing the polling intervals) or to improve the response/notification time of cluster changes (by decreasing the polling intervals).

    Note that because a cluster is a logical construction that is representative of two or more physical nodes, there is no guarantee which node is used to provide cluster status. However, since the nodes within a cluster are periodically synchronized, it should rarely make any difference which node is used.


    Detailed status

    If you want to know more about a cluster's status, you can retrieve and display detailed information. For example, knowing that a cluster has an error status can be helpful, but you might need more information in order to diagnose and correct the error.

    To view detailed status, right-click any cluster within the Enterprise view and select a log or report from the menu. The log or report is displayed in the Details view. You can retrieve detailed status for any cluster that is currently displayed, even if there is an active connection to a different cluster.

    Configuring HACMP using WebSMIT

    Clicking the SMIT tab from the navigation frame displays the expandable SMIT menus used to configure and manage your cluster. Clicking a SMIT menu item displays the corresponding SMIT panel in the activity frame on the Configuration tab.

    A fastpath text box is located at the bottom of the display area. You can enter another SMIT panel fastpath in this text box to display that SMIT panel.

    Note: WebSMIT FastPath is enabled for all HACMP SMIT panels and may work with panels outside of HACMP, but is not tested and supported outside of HACMP. SMIT panels that use interactive input, such as entering a password, are not supported.

    Common WebSMIT panel options

    Look here for common WebSMIT panel options.


    The common WebSMIT panel options are mapped as follows:

    Key or Command      Action
    F1 help             Displays context help for the current screen. For menus, a popup menu (tell me more) displays the help text in a separate window.
    F4 list             Pressing F4 or selecting List next to an item creates a popup selection list.
    F6 show command     Displays the command that was created from the input provided.
    Enter = do          Runs the command by pressing Enter.
    Fast Paths          At the bottom of each panel is a text entry where you can enter an HACMP SMIT fast path. The current panel ID is displayed. Clicking on the Home icon restores the initial view of WebSMIT.

    Browser controls

    Clicking the browser Stop button stops the current page from loading; it does not stop any commands that are already running on the server. You cannot stop a command from WebSMIT after it has been executed, even if the command originated from WebSMIT.

    Clicking the browser reload button reloads the current page. If the page being reloaded has just executed a command, then that same command is executed again.

    Functional limitations

    SMIT panels that use interactive input, such as entering a password or Mount volume 2 on cd0 and press ENTER to continue, are not supported. WebSMIT displays a default page when you attempt to access these pages directly.

    In WebSMIT, you cannot run the cluster verification process in an interactive mode.

    WebSMIT logs

    All operations of the WebSMIT interface are logged to the wsm_smit.log file and are equivalent to the logging done with smitty -v. Script commands are also captured in the wsm_smit.script log file. All logs go to the /usr/es/sbin/cluster/wsm/logs directory. The location of the WebSMIT log files cannot be modified.

    The wsm_log file contains authentication and authorization information. Both the wsm_smit.log and wsm_log files must be owned by the same user that the HTTP server runs as (such as nobody or apache), as defined in the httpd.wsm.conf file.

    The client_log contains information about activities occurring within each connected browser. This file must be owned by the same user that the HTTP server runs as (such as nobody or apache), as defined in the httpd.wsm.conf file.
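The ownership requirement described above can be checked from the command line. The following is a minimal sketch, not something shipped with HACMP: the function name check_log_owner is hypothetical, and the user name passed to it must be whatever user your httpd.wsm.conf file actually defines (nobody is used only as an example). It flags every file in the directory, which is stricter than the requirement stated for the three named log files.

```shell
#!/bin/sh
# check_log_owner DIR USER  (hypothetical helper, not part of HACMP)
# Prints every regular file under DIR that is NOT owned by USER.
# Returns 0 if all files are owned by USER, 1 otherwise.
check_log_owner() {
    dir=$1
    user=$2
    # "! -user" selects files owned by anyone other than USER.
    bad=$(find "$dir" -type f ! -user "$user" -print)
    if [ -n "$bad" ]; then
        printf '%s\n' "$bad"
        return 1
    fi
    return 0
}

# Example (run on the WebSMIT server; "nobody" is an assumed HTTP server user):
#   check_log_owner /usr/es/sbin/cluster/wsm/logs nobody
```

If the function prints any file names, ownership of those files differs from the user given and may need to be corrected before WebSMIT can write to them.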

    All WebSMIT logs employ restricted growth, and roll over periodically (resulting in backups of the logs, with a numeric extension; for example, wsm_smit.log.3). This is to prevent runaway growth from filling up the available disk space.
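To see which rolled-over copies of a log exist, the numeric extensions can be listed in order. This is a small sketch under stated assumptions: list_log_backups is a hypothetical helper name, and the guide does not say which extension number holds the most recent backup, so the function only sorts by number and makes no claim about age.

```shell
#!/bin/sh
# list_log_backups DIR BASE  (hypothetical helper, not part of HACMP)
# Prints the rolled-over copies of BASE in DIR (BASE.1, BASE.2, ...),
# sorted by their numeric extension. Files with a non-numeric
# extension are ignored.
list_log_backups() {
    dir=$1
    base=$2
    for f in "$dir/$base".*; do
        [ -e "$f" ] || continue          # glob matched nothing
        ext=${f##*.}                     # text after the last dot
        case $ext in
            ''|*[!0-9]*) continue ;;     # not a purely numeric extension
        esac
        printf '%s %s\n' "$ext" "$f"
    done | sort -n | cut -d' ' -f2-
}

# Example:
#   list_log_backups /usr/es/sbin/cluster/wsm/logs wsm_smit.log
```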

    Configuring and managing nodes and networks in WebSMIT

    The N&N (Nodes and Networks) tab contains an expandable hierarchical view based on the cluster topology, and the status of the cluster sites, nodes, and resource groups for clusters that are connected.

    If a connection has not been established with any cluster, then you need to click the Enterprise tab and establish a connection.

    The cluster object displays first, followed by the next item in the hierarchy: Sites if one or more sites exist, or Nodes otherwise. WebSMIT displays the N&N tab by default upon login and updates the contents automatically, as long as a connection to the cluster exists.

    Note: The tree is not a representation of "configuration" but of state. Therefore, if the cluster is down, the hierarchy will not be shown.

    To view configuration information of an item selected in the N&N tab, select the Details tab from the activity frame on the right.

    To view a site- or node-centric graphical display of an item selected in the N&N tab, select the Associations tab from the activity frame.

    The following figure shows the N&N tab displayed in the navigation frame and the Details tab displayed in the activity frame.

    WebSMIT nodes and networks status indicators

    Status icons adjacent to the items in the hierarchy on the N&N tab indicate the status of each item, according to the icon's color and shape. A flashing icon indicates that a sub-item is in an ERROR state. Items not available are grayed out.

    Note: You must have cluster services running on at least one node for the hierarchy to display. Additionally, you must have clinfoES running and functional, because SNMP queries are used to obtain the information displayed in the N&N tab.

    If "clstat -a" is not working on at least one node within the cluster, then it will probably not be possible for WebSMIT to retrieve any information about that cluster.
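The clinfoES prerequisite above can be checked on a node before opening the N&N tab. The sketch below assumes that the AIX System Resource Controller command lssrc -s clinfoES prints a header line followed by one line whose first field is the subsystem name and whose last field is its status (for example, active or inoperative). That output layout is an assumption, and subsystem_active is a hypothetical helper name.

```shell
#!/bin/sh
# subsystem_active NAME  (hypothetical helper, not part of HACMP)
# Reads "lssrc -s NAME" style output on standard input and succeeds
# only if a line for NAME reports the status "active".
subsystem_active() {
    awk -v name="$1" '
        $1 == name && $NF == "active" { found = 1 }
        END { exit(found ? 0 : 1) }
    '
}

# Example (on a cluster node):
#   lssrc -s clinfoES | subsystem_active clinfoES || echo "clinfoES is not active"
```

A passing check only shows that the subsystem is running; it does not prove that SNMP queries succeed, so clstat remains the more direct test.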

    WebSMIT N&N tab right-click SMIT options

    Right-click on a menu item to display a list of SMIT entries. Selecting a SMIT menu displays the appropriate SMIT panel in the activity frame on the Configuration tab as shown below:

    Menu Item         SMIT Entries

    Nodes             Start Cluster Services
                      Stop Cluster Services
                      Add a Node to the HACMP Cluster
                      Remove a Node in the HACMP Cluster

    Node              Start Cluster Services
                      Stop Cluster Services
                      Add a Node to the HACMP Cluster
                      Change/Show a Node in the HACMP Cluster
                      Remove a Node in the HACMP Cluster

    Cluster           Start Cluster Services
                      Stop Cluster Services
                      Show Cluster Services
                      Add/Change/Show an HACMP Cluster
                      Remove an HACMP Cluster
                      Add a Site
                      Add Nodes to an HACMP Cluster
                      Manage HACMP Services
                      Discover HACMP-related Information from Configured Nodes
                      Verify and Synchronize HACMP Configuration

    Site              Add a Site
                      Change/Show a Site
                      Remove a Site

    Network           Add a Network to the HACMP Cluster
                      Change/Show a Network in the HACMP Cluster
                      Remove a Network from the HACMP Cluster
                      Add Communication Interfaces/Devices

    Resource Group    Add a Resource Group
                      Change/Show a Resource Group
                      Change/Show Resources and Attributes for a Resource Group
                      Remove a Resource Group
                      Move a Resource Group to Another Node
                      Bring a Resource Group Online
                      Bring a Resource Group Offline

    Configuring and managing resources in WebSMIT

    The RGs (resource groups) tab contains an expandable hierarchical menu based on the cluster resource groups, and the status of the cluster resource components. The contents of this tab update automatically as changes occur in clusters that are connected.

    If a connection has not been established with any cluster, then you need to click the Enterprise tab and establish a connection.

    Selecting a menu item from the RGs tab displays its configuration information under the Details tab in the activity frame.

    To view a resource group-centric graphical display of the hierarchy of an item selected in the RGs tab, select the Associations tab from the right pane.

    The following figure shows the RGs tab displayed in the navigation frame and the Details tab displayed in the activity frame.

    WebSMIT RGs tab status indicators

    The status icons displayed on the RGs View tab indicate the state of the top-level cluster object. The status icons also display adjacent to the items in the tree to indicate the status of each item. These status indicators give you the state of the cluster object. Items not available are grayed out.

    Note: You must have cluster services running on at least one node for the hierarchy to display. Additionally, you must have clinfoES running and functional, because SNMP queries are used to obtain the information displayed in the RGs tab.

    If "clstat -a" is not working on at least one node within the cluster, then it will probably not be possible for WebSMIT to retrieve any information about that cluster.

    WebSMIT RGs tab right-click SMIT options

    Right-click on a menu item to display a list of SMIT entries that, when selected, displays the appropriate SMIT panel in the activity frame on the Configuration tab as shown below:

    Menu Item         SMIT Entries

    Nodes             Start Cluster Services
                      Stop Cluster Services
                      Add a Node to the HACMP Cluster
                      Remove a Node in the HACMP Cluster

    Node              Start Cluster Services
                      Stop Cluster Services
                      Add a Node to the HACMP Cluster
                      Change/Show a Node in the HACMP Cluster
                      Remove a Node in the HACMP Cluster

    Cluster           Start Cluster Services
                      Stop Cluster Services
                      Show Cluster Services
                      Add/Change/Show an HACMP Cluster
                      Remove an HACMP Cluster
                      Add a Site
                      Add Nodes to an HACMP Cluster
                      Manage HACMP Services
                      Discover HACMP-related Information from Configured Nodes
                      Verify and Synchronize HACMP Configuration

    Site              Add a Site
                      Change/Show a Site
                      Remove a Site

    Network           Add a Network to the HACMP Cluster
                      Change/Show a Network in the HACMP Cluster
                      Remove a Network from the HACMP Cluster
                      Add Communication Interfaces/Devices

    Resource Group    Add a Resource Group
                      Change/Show a Resource Group
                      Change/Show Resources and Attributes for a Resource Group
                      Remove a Resource Group
                      Bring a Resource Group Online
                      Bring a Resource Group Offline

    Viewing the cluster components

    From the activity frame, click the Associations tab to view a graphical display of the cluster components hierarchy corresponding to the item selected in the navigation frame: the N&N tab or the RGs tab. WebSMIT continuously updates the navigation frame status icons to indicate the state of the current cluster components that are connected.

    If a connection has not been established with any cluster, then you need to click the Enterprise tab and establish a connection.

    Both the N&N view and the RGs view show the cluster components in a hierarchical manner. The difference between the Associations generated from the two views is:
    - The N&N selections result in the display of site- or node-centric, topological associations.
    - The RGs selections result in the display of a resource group-centric summary, including where each resource group is online and any resource group dependencies.

    When reviewing the N&N associations, if the graph becomes too complicated, you can remove the Application Servers, Storage, and the Networks from this view by de-selecting the check boxes at the bottom of the page. Similarly, when reviewing the RGs associations, if the graph becomes too complicated, you can remove the Parent/Child, Online on Different Nodes, Online on Same Nodes, and Online on Same Site from the view by de-selecting the check boxes.

    The following figures show examples of the associations displays:

    Viewing cluster configuration information in WebSMIT

    The Details tab provides configuration information about the item selected in the navigation frame.

    The following table shows the command(s) used to retrieve the information for each cluster component, along with a brief description of that information.

    Component            Command                                 Details
    Cluster              cltopinfo                               Lists the cluster topology information.
    Nodes                cltopinfo                               Lists all node topology information.
    Node                 cltopinfo -n                            Shows the configuration for the specified node and displays the status of the HACMP subsystems.
    Node                 clshowsrv -v                            Shows all HACMP-related subsystems.
    Network              cltopinfo -w, cllsif                    Shows detailed information about the networks configured in the cluster.
    Resource Groups      clshowres                               Shows the resources defined for all groups.
    Resource Group       clshowres -g                            Shows the resources defined for the selected group.
    Volume Group         cl_lsvg                                 Shows the status of the volume group.
    Service IP           cl_harvestIP_scripts -u, cllsif -cSn    Lists the Service IP information.
    Boot IP              cltopinfo -i                            Shows all interfaces configured in the cluster.
    Application Server   cllsserv -n                             Lists application servers by name.
    Site                 cllssite -uc                            Displays information about the selected site.
    Sites                cllssite                                Displays information about all available sites.
    File System          cllsfs -n, cllsfs -g                    Displays information about the selected file system; -n displays information for all file systems.
    Tape                 cllstape                                Displays information about the selected tape resource.

    For more information on these commands, see the man page or the description in HACMP for AIX commands.
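The component-to-command table above can be written as a small lookup, which is handy when you want to run the same query by hand. This is only a sketch of the table: details_commands is a hypothetical helper name, and the way WebSMIT itself invokes these commands (including the node or group name that -n and -g presumably take) is not described here, so those arguments are left out.

```shell
#!/bin/sh
# details_commands COMPONENT  (hypothetical helper, not part of HACMP)
# Prints, one per line, the command(s) that the table above lists for
# COMPONENT. Returns 1 for a component that is not in the table.
details_commands() {
    case $1 in
        Cluster|Nodes)                       echo "cltopinfo" ;;
        Node)                                printf '%s\n' "cltopinfo -n" "clshowsrv -v" ;;
        Network)                             printf '%s\n' "cltopinfo -w" "cllsif" ;;
        "Resource Groups"|"resource groups") echo "clshowres" ;;
        "Resource Group")                    echo "clshowres -g" ;;
        "Volume Group")                      echo "cl_lsvg" ;;
        "Service IP")                        printf '%s\n' "cl_harvestIP_scripts -u" "cllsif -cSn" ;;
        "Boot IP")                           echo "cltopinfo -i" ;;
        "Application Server")                echo "cllsserv -n" ;;
        Site)                                echo "cllssite -uc" ;;
        Sites)                               echo "cllssite" ;;
        "File System")                       printf '%s\n' "cllsfs -n" "cllsfs -g" ;;
        Tape)                                echo "cllstape" ;;
        *)                                   return 1 ;;
    esac
}

# Example:
#   details_commands "Resource Group"
```

Where a table row lists two commands, the sketch prints both, one per line.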

    The Details tab contains static information about the component selected in the left pane. This information is not automatically updated as changes are made to the cluster. To view updated information, click the item again.

    The following example shows the detail information shown for a specific resource group:

    Related reference:

    “HACMP for AIX commands” on page 436
    Look here for a quick reference to commands commonly used to obtain information about the cluster environment or to execute a specific function. Syntax diagrams and examples for using each command are included.

    Viewing HACMP documentation in WebSMIT

    Through WebSMIT, you can access the HACMP documentation installed on the WebSMIT server, as well as online.

    To view the WebSMIT bookshelf page, click the Documentation tab in the Activity frame.

    WebSMIT attempts to find the level of documentation that corresponds to the level of the cluster that you are currently connected to. For example, if you are using WebSMIT to connect to a cluster that is at version 5.3, WebSMIT attempts to find the version 5.3 documentation and to display it here.

    If the information is not installed locally, WebSMIT attempts to find the documentation online. If an online source is found, WebSMIT provides a link to the documentation. You need external Internet access to link to the information.

    If WebSMIT cannot find the documentation locally installed or online, the title of the documentation is displayed for informational purposes, but no links are provided.

    Customizing WebSMIT colors

    The colors used for the various frames offered in WebSMIT can be customized to suit the preferences of the viewer. This can be useful to compensate for a vision deficit, or a faulty monitor. Customizations are stored as a cookie in the browser, and are not associated with the user ID. Therefore, logging in to WebSMIT through a different browser could result in a different set of customizations.

    The customization panel can be accessed through the Customize WebSMIT option, within the Extended Configuration panel. The customization panel is shown below:

    In this figure, the Customize WebSMIT option is on display in the navigation frame on the left.

    The panel provides a large, scrollable list of available colors on the left, and to the right, a list of the regions or resource types that are currently available for customization. Between the two lists, three text areas are provided for displaying the current color selections. These text areas are editable, allowing any text to be entered, to test how that text looks with the chosen colors.

    Original colors

    This area displays the colors that are in effect for the selected region at the time the customization panel was loaded. These colors can be restored to the selected region at any time by clicking Restore.

    Current colors

    This area displays the colors that have been chosen for the selected region during this session that have been either previewed or saved. Previewed colors displayed in this area can be undone by clicking Reset. Saved colors displayed in this area can be undone by clicking Defaults.

    New colors

    This area displays the colors that have been chosen for the selected region during this session, but not yet saved. The colors displayed in this area can be undone by clicking Reset.

    When a customizable region is selected, its original and current colors are displayed in the appropriate text areas, along with any associated colors. For example, when a region is chosen that applies a background color, whenever its background color is on display, the matching text color is also displayed. Because these colors are a set, it is important to display them with each other, to see how well they work together. When a region is selected, its current color selection is also automatically highlighted in the list of available colors.

    As new color selections are made for a customizable region, they are displayed only in the New Colors text area. This remains true until the colors are either previewed or saved.

    The Customize WebSMIT panel provides a row of buttons at the bottom of the frame that control the color selections that have been made. Most of these buttons work globally: they do not operate only on the currently selected region, but affect all of the regions that have been customized. The only exception is Restore, which is limited to the selected customizable region or resource.

    Reset

    The Reset button reverses any new color selections that are made, even if they were previewed. Clicking this button puts all of the original colors (the colors that were in effect when the customization panel was loaded) back into place. This button does not reverse any customizations that have been saved.

    Defaults

    The Defaults button reverses all new and saved color selections, erasing the cookie that is created by the Save button. This results in the default WebSMIT colors being restored.

    Display

    The Display button opens a window that shows all the regions or resources that have been customized, but not saved. Each region is displayed in the window, along with the new color that is chosen for that region.

    Preview

    The Preview button applies each new color selection to the appropriate region or resource in the WebSMIT GUI. The new colors are temporary until saved, and remain in effect only until the affected region or resource is reloaded.

    Restore

    The Restore button applies the original colors (that were in effect when the customization panel was loaded) to the currently selected region or resource in the WebSMIT GUI. This effectively reverses any changes made to that region during the current session.

    Save

    The Save button applies each new color selection to the appropriate region/resource in the WebSMIT GUI. The new colors are saved in a cookie within the browser, and are automatically restored the next time WebSMIT is accessed from that browser. Because the customizations are stored in a cookie in the browser, they are not available in any other browser.

    In addition to the graphical customization approach described above, WebSMIT provides administrators with a manual customization approach in the htdocs/en_US/wsm_custom.css file, a cascading style sheet. Any changes made in this file will affect all users of WebSMIT, unless they have overridden the customizations locally, using the panel described above. The file contains some examples of what can be done. Only simple, visual changes can be made in this file, such as colors and fonts. Any other changes might affect WebSMIT in a functional manner and are not supported. Changes made to the associated wsm_default.css file are not supported.

    The standard WebSMIT link bar is provided at the bottom of the customization panel. If you select Refresh, any previewed color settings might be lost, even though they may still be on display in WebSMIT. If this problem occurs, reload WebSMIT.

    Enabling Internationalization in WebSMIT

    WebSMIT uses English (en_US) messages by default.

    To enable non-English messages in WebSMIT, three steps are required:
    1. Install the desired locale filesets for the operating system (for example, bos.loc.iso.ja, bos.loc.com.JP, bos.loc.pc).
    2. Install the desired message catalog filesets for HACMP (for example, cluster.msg.Ja_JP.es.server, cluster.msg.Ja_JP.cspoc).
    3. Set the primary language and encoding in the browser.

    These three steps are described in more detail below:
    1. Install the desired locale filesets for the operating system
       a. Retrieve and install the desired locale filesets using your standard installation procedures. The typical method for accomplishing this is by using SMIT/smitty: System Environments -> Manage Language Environment -> Add Additional Language Environments
          The installation CDs will be needed for this operation.
          The installation of the needed filesets may be verified by using the lslpp command. For example:
              lslpp -l bos.loc.iso.ja
       b. Once the filesets are installed, you can verify the new locale by using the following commands (using Japanese for the example):
              export LC_ALL=Ja_JP
              locale
          You should see something like the following from the locale command:
              LANG=en_US
              LC_COLLATE="Ja_JP"
              LC_CTYPE="Ja_JP"
              LC_MONETARY="Ja_JP"
              LC_NUMERIC="Ja_JP"
              LC_TIME="Ja_JP"
              LC_MESSAGES="Ja_JP"
              LC_ALL=Ja_JP
       c. If all the needed locale filesets are installed, then all of the LC_* variables should be set to the locale that you chose (for example, Ja_JP). Note that LANG is set independently of the locale variables, so LC_ALL does not affect it.
          The WebSMIT server is now enabled for the locale that you installed.

    2. Set the primary language and encoding in the browser
       To display messages in WebSMIT using the new locale, it is necessary to select the desired language in the browser being used, along with the appropriate, matching encoding.
       For example, in order to switch to Japanese messages in the browser, it is necessary to set not only the primary language to Japanese, but also the encoding. Follow the instructions below for your browser:
       - Firefox:
         a. Language settings are managed under Tools->Options->Advanced->General->Choose Language. If necessary, add Japanese. Then select Japanese in the list, and move it up to the top of the list.
         b. Japanese encoding can be selected via View->Character Encoding->More Encodings->East Asian->Japanese.
       - Mozilla:
         a. Language settings are managed under Edit->Preferences->Navigator->Languages. If necessary, add Japanese. Then select Japanese in the list, and move it up to the top of the list.
         b. Japanese encoding can be selected via View->Character Encoding->More->East Asian->Japanese.
       - Internet Explorer:
         a. Language settings are managed under Tools->Internet Options->Languages. If necessary, add Japanese. Then select Japanese in the list, and move it up to the top of the list.
         b. Japanese encoding can be selected via View->Encoding->More->Japanese.

    Note: If the Ja_JP locale is in use, then the encoding will need to be set to the Shift_JIS variant, if it is available. For ja_JP, use EUC-JP. Some browsers are capable of detecting the variations automatically, so they only offer a single Japanese encoding.

    When the encoding is changed, most browsers will typically refresh all of the displayed pages immediately. If that fails to happen, simply reload WebSMIT to begin using it with the new language.
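The two Japanese locale names in the note above differ only in the case of the first letter, which makes them easy to confuse. The following lookup restates that pairing as a sketch; browser_encoding is a hypothetical helper name, and only the two locales named in the note are covered.

```shell
#!/bin/sh
# browser_encoding LOCALE  (hypothetical helper, not part of HACMP)
# Prints the browser character encoding that the note above pairs with
# a Japanese locale name. The match is case-sensitive on purpose.
# Returns 1 for any other locale.
browser_encoding() {
    case $1 in
        Ja_JP) echo "Shift_JIS" ;;
        ja_JP) echo "EUC-JP" ;;
        *)     return 1 ;;
    esac
}

# Example:
#   browser_encoding Ja_JP
```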

    Note: WebSMIT relies on the presence of the HACMP message catalogs for any given locale to determine if it is valid to display messages for that locale. For example, if a browser connects to WebSMIT and requests Japanese messages, whether or not WebSMIT will attempt to display Japanese messages depends entirely upon the presence and validity of the appropriate HACMP Japanese message catalogs. No checking is performed to validate the operating system settings on the server. It is assumed that if the administrator installed the Ja_JP HACMP catalogs, they have also installed and verified the corresponding locales such that the server is fully prepared at the operating system level to handle the Ja_JP locale.

    Related information:
    Cannot change languages in WebSMIT
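Because WebSMIT does not validate the operating system settings on the server, it can help to script the locale check from step 1. The sketch below follows the rule given there: every LC_* variable reported by the locale command should equal the chosen locale, while LANG is ignored. The helper name locale_ok is hypothetical, and the function reads the locale output from standard input so that it can be tried against saved output as well.

```shell
#!/bin/sh
# locale_ok LOCALE  (hypothetical helper, not part of HACMP)
# Reads the output of the "locale" command on standard input and succeeds
# only if at least one LC_* line is present and every LC_* value equals
# LOCALE. LANG is ignored, because it is set independently.
locale_ok() {
    awk -F= -v want="$1" '
        /^LC_/ {
            val = $2
            gsub(/"/, "", val)    # strip the quotes around reported values
            seen++
            if (val != want) bad++
        }
        END { exit((seen > 0 && bad == 0) ? 0 : 1) }
    '
}

# Example (on the WebSMIT server):
#   LC_ALL=Ja_JP locale | locale_ok Ja_JP && echo "Ja_JP looks usable"
```

A failing check suggests that one or more locale filesets are missing, in which case the lslpp verification from step 1 is the next thing to try.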

    Configuring an HACMP cluster (standard)

    These topics describe how to configure an HACMP cluster using the SMIT Initialization and Standard Configuration path.

    Have your planning worksheets ready to help you through the configuration process. See the Planning Guide for details if you have not completed this step.

    Overview of configuring a cluster

    Using the options under the SMIT Initialization and Standard Configuration menu, you can add the basic components of a cluster to the HACMP Configuration Database (ODM) in a few steps. This HACMP configuration path significantly automates the discovery and selection of configuration information and chooses default behaviors.

    If you are setting up a basic two-node cluster, use the Two-Node Cluster Configuration Assistant to simplify the configuration process. For more information, see the section Using the Two-Node Cluster Configuration Assistant in the Installation Guide.

    You can also use the General Configuration Smart Assist to quickly set up your application. You are not limited to a two-node cluster with this Assist.

    You can use either ASCII SMIT or WebSMIT to configure the cluster. For more information about WebSMIT, see Administering a cluster using WebSMIT.

    Related concepts:
    “Managing HACMP using WebSMIT” on page 11
    HACMP includes a Web-enabled user interface (WebSMIT).

    Related information:
    Using the two-node cluster configuration assistant

    Prerequisite tasks for using the Standard Path

    This topic discusses several prerequisite tasks for using the Standard Path.

    Before using the Standard Configuration path, HACMP must be installed on all the nodes, and connectivity must exist between the node where you are performing the configuration and all other nodes to be included in the cluster. That is, network interfaces must be both physically and logically configured (to AIX) so that you can successfully communicate from one node to each of the other nodes. The HACMP discovery process runs on all server nodes, not just the local node.
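A quick reachability pass over the planned nodes can catch basic connectivity problems before you start the Standard Configuration path. This is a minimal sketch: unreachable_nodes and the PROBE variable are hypothetical names, the node names in the example are placeholders, and a successful ping shows only IP reachability, which is necessary for HACMP communication but does not prove it works.

```shell
#!/bin/sh
# unreachable_nodes NODE...  (hypothetical helper, not part of HACMP)
# Prints each NODE that does not answer the probe command.
# PROBE defaults to a single ping; set PROBE to use another check.
unreachable_nodes() {
    for node in "$@"; do
        # $PROBE is left unquoted on purpose so "ping -c 1" splits into words.
        ${PROBE:-ping -c 1} "$node" >/dev/null 2>&1 || printf '%s\n' "$node"
    done
}

# Example (nodeA and nodeB are placeholder host names):
#   unreachable_nodes nodeA nodeB
```

An empty result means every listed node answered the probe from the node where you ran it.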

    Once you have configured and powered on all disks, communication devices, serial networks and also configured communication paths to other nodes in AIX, HACMP automatically coll