Installation Checklist for SunCluster 3.2 Systems

Customer:

Sun Order Number:

CASE Number:

Technician:

Version EIS-DVD:

Date:

• It is recommended that the EIS web pages be checked for the latest version of this checklist prior to commencing the installation.

• It is assumed that the installation is carried out with the help of the current EIS-DVD.
• The idea behind this checklist is to help the installer achieve a "good" installation.
• It is assumed that the installer has attended the appropriate training classes.
• It is not intended that this checklist be handed over to the customer.
• This installation checklist for SunCluster is to be used together with the installation checklist(s) for Sun Fire Server, workgroup server, storage etc.
• This checklist is for Sun Cluster 3.2 (SPARC & x86) up to & including Sun Cluster 3.2 1/09 Update2 in combination with Solaris 9 8/05 Update8 (SPARC only) or Solaris 10 5/08 Update5 (or higher). For supported configurations please refer to the official configuration guide for Sun Cluster.

• If this cluster is to be installed as part of a Geo-Cluster, refer also to the EIS installation checklist for Sun Cluster Geographic Edition 3.2.

• If Sun Cluster is to be installed in LDom guest domains please refer to http://wikis.sun.com/display/SunCluster/%28English%29+Sun+Cluster+3.2+2-08+Release+Notes#%28English%29SunCluster3.22-08ReleaseNotes-optguestdomain for details.

• A series of tests for the completed cluster are available in the Cluster Verification document (StarOffice format) from within the EIS web site (under EIS Planning Documents). This document should be available during the installation.

Serial# / hostid / hostname
Admin Workstation:
Node1:
Node2:
Node3:
Node4:
Node5:
Node6:
Node7:
Node8:

Sun Internal and Approved Partners Only Page 1 of 45 Vn 1.11 Created: 20. Apr. 2009


Task Comment Check

PREPARATION

Site-Audit complete?

System planning complete?
• Cluster Build document
• Cluster Verification document

EIS Planning documents for Sun / Solaris Cluster (on EIS website).

Ensure that the Configuration Guide for Sun Cluster was consulted.

http://sundoc.central/SunWINPublicView.jsp?token=126955

If non-Sun storage is involved ensure that the information on the Open Storage Program was consulted.

http://www.sun.com/software/cluster/osp

Installation Specification Document signed?

Part of the EIS Administrative Planning Document (on EIS website).

Provide appropriate Service List to customer.

http://www.sun.com/service/servicelist

MANDATORY: Ensure that the licenses are available for Sun Cluster Framework & Agents.

Task Comment Check

INSTALLATION OF THE ADMINISTRATION WORKSTATION

Solaris-Ready Installation as for server. Installation according to the appropriate checklist (Entire Distribution).

Install the SunCluster Console Panel from the product CD.

pkgadd -d . SUNWccon
Optional: SUNWscman

If ssh is needed in the Console Panel then install the required Sun Cluster 3.2 core patches on the administration workstation with EIS-DVD ≥26JUN07. Note the structure under .../sun/patch/SunCluster/3.2. The SC3.2u1 package includes the ssh feature.

Configure PATH: /opt/SUNWcluster/bin
Configure MANPATH: /usr/cluster/man /opt/SUNWcluster/man

Probably already in file /.profile-EIS for user root.

Configure in directory /etc the files clusters, serialports and hosts.
E.g. for /etc/clusters:
clustname host host
E.g. for /etc/serialports:
host tc tcport
- for a SSP or SF15K SC: domain ssp 23
- for a SunFire with access to the SC console: domain tc tcport
- for a SunFire without access to the SC console: domain ssc domain_port
  (domain_port = 5001 for domain A; 5002 for domain B; 5003 for domain C; 5004 for domain D)

Detailed explanation in the man page.
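Once /etc/clusters and /etc/serialports are populated, a quick hedged check from the admin workstation (the cluster name "clustname" is the placeholder from the example above; DISPLAY must point to a usable X display):
# cconsole clustname
This should open the common window plus one console window per host listed for that cluster.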

Configure the terminal server. If using Micro Annex please refer to (also available on the EIS-DVD: ...sun/docs/MISC):
http://sunweb.germany/SSS/MCSC/ET/suncluster/clusttips/annexXL.html#Inst

Sun Internal and Approved Partners Only Page 2 of 45 Vn 1.11 Created: 20. Apr. 2009


Task Comment Node

1 2 3 4

BASIC CLUSTER CONFIGURATION

ATTENTION: Due to the following issues with Sun Cluster 3.2 Update2, EIS recommends installing Sun Cluster 3.2 Update1 with the SC core patches -19. The product and the patches are on EIS-DVD 16DEC08 and EIS-DVD 28APR09.
• Sun Alert 254908 / EIS-Alert#240: memory leak in rgmd.
• Sun Alert 256368: wrong mount order of nested mounts (HAStoragePlus).
• Bug 6825948: Invalid infrastructure when setting numvirtualclusters=0.
• EIS-Alert#235: scalable resource does not fail over in case of network outage (new SC3.2u2 core patches which fix this issue have already been released).
If the customer needs the features or fixes of Sun Cluster 3.2 Update2 then use EIS-DVD 31MAR09 for the product and EIS-DVD 28APR09 for the SC3.2u2 core patches, which are located in /sun/patch/SunCluster/3.2/update2corepatches. Furthermore, open a SR and an escalation to get an IDR for the rgmd memory leak.

Solaris-Ready Installation. Installation according to the appropriate server checklist(s). Please note:
• If you install Oracle 10g RAC:
  1. It is NOT possible to use capitals within the hostname. Use lower case letters (Oracle Bug 4632899).
  2. /var should NOT be a separate partition (Bug 4891227).
• Do not forget the 512 MB (SC3.2 strongly recommended) partition for /globaldevices.
  Note: In the case of Sun Cluster 3.2 u2 a lofi mount can be used instead of the /globaldevices slice; scinstall prompts for this during installation.

• If using SDS/SVM reserve slice 7 with 32MB (EIS recommendation) for metaDB/replicas.

• Ensure that the SAN Foundation Suite packages are installed and patched. (SAN is necessary for fibre server/storage or VxVM.)

Minimum required MetaCluster:
• Full Distribution + OEM Support (SUNWCXall) for E10000 & SF15K/12K/E25K/E20K.
• End User System Support (SUNWCuser) for other systems1.

EIS recommends SUNWCXall for all installations.

In the case of upgrades you may not be able to increase the metaDB slice size. Choose a suitable configuration for your upgrade.

The absolute minimum for metaDB is 20MB.

1 The Sun Cluster Software Installation Guide for Solaris OS (820-2555), Chapter 1, Section "Planning the Solaris OS": You might need to install other Solaris software packages that are not part of the End User Solaris Software Group. The Apache HTTP server packages (SUNWapchr & SUNWapchu) are one example. Third-party software, such as ORACLE®, might also require additional Solaris software packages. See your third-party documentation for any Solaris software requirements. TIP: To avoid the need to manually install Solaris software packages, install the Entire Solaris Software Group Plus OEM Support. See also Technical Instruction 206774.

Sun Internal and Approved Partners Only Page 3 of 45 Vn 1.11 Created: 20. Apr. 2009


Task Comment Node

1 2 3 4

Only Sun Cluster 3.2 u2: ZFS boot can be used. Please refer to the EIS standard "Boot Disk Layout & Mirroring" for details. Notes regarding the various combinations of shared storage with ZFS boot:

• ZFS shared storage: Good combination if a global filesystem is not required. Only failover filesystems are possible.

• SVM shared storage will require additional partitions/disks for local replicas.

• VxVM shared storage is not recommended until bug 6803117 is fixed or a suitable work-around is available.

Not in the case of a lofi-mounted /global/.devices/node@<nodeid>:
Expand the number of inodes on the /globaldevices partition, e.g:
# newfs -i 512 /dev/rdsk/c0t0d0s6

Before newfs: umount the filesystem and make its /etc/vfstab entry inactive. After newfs: mount /globaldevices again.

Configure /etc/inet/hosts. Additionally, for Solaris 10 11/06 Update3 and lower, configure /etc/inet/ipnodes.
Note: Since Solaris 10 8/07 Update4 ipnodes is a link to the file hosts.

Enter all cluster nodes and all the logical hosts. Also the logical hosts of non-global zones.

Define the scsi-initiator-id.

For SPARC: in the OBP. Start probe-scsi-all on both nodes simultaneously. Completed without error?

For x86: in "LSI Logic Configuration" utilities menu within the bootup. Also add scsi-initiator-id=<x> into file /kernel/drv/mpt.conf due to bug 6222114.

Only necessary for dual-hosted SCSI devices! Technical Instructions 210263, 207207.

The SE3310 SCSI RAID can isolate the host channels from each other. No action required!
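On a SPARC node whose initiator ID must change, the global OBP setting is typically adjusted as sketched below (6 is just the commonly used alternative ID; any internal, non-shared SCSI controllers must be set back to 7 via nvramrc as described in Technical Instruction 210263):
ok printenv scsi-initiator-id
ok setenv scsi-initiator-id 6
ok reset-all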

Only for Solaris x86 cluster: Edit /kernel/drv/sd.conf if you have more than one LUN per target.

If you are using a Dual Ultra3 SCSI HBA you must use Method 2 of Technical Instruction 210263 or, for JBOD, use:
http://docs.sun.com/app/docs/doc/817-5681/6ml4dh4j1?q=SCSI+JBOD&a=view

SunAlert 52687. Note: Use the 370-5396-02 revision of the SE3310 JBOD I/O module or use the out ports of the 3310 JBOD. Problem Resolution 2230462.

If you are using a Dual Ultra3 SCSI HBA (X6758 / PN375-3057) set the jumpers J4, J5, J8 and J9 to 2-3!

SunAlert 52681

Sun Internal and Approved Partners Only Page 4 of 45 Vn 1.11 Created: 20. Apr. 2009


Task Comment Node

1 2 3 4

When using shared SCSI storage add the following entries to file /etc/system:

set sd:sd_io_time=30
set sd:sd_retry_count=3 (not Solaris10)
set vxdmp:dmp_retry_count=1
set scsi_reset_delay=500

Additionally, if you use the glm driver, add to file /kernel/drv/glm.conf:

scsi-selection-timeout=64

Bug 4529645: Cluster hangs for some minutes if D1000 is powered off.

Bug 6437172: A failover should not happen when a scsi cable is pulled out.

Hardware configuration of PCI-SCI, e.g. for one adapter:
connect (node1 IN) to (node2/Switch OUT)
connect (node1 OUT) to (node2/Switch IN)

Software installation with pkgadd:
• SUNWrsm SUNWrsmo SUNWrsmox SUNWrsmx from the Solaris CD.
• SUNWscrif SUNWscrdt SUNWsci SUNWscid SUNWscidx from the Cluster CD.
Within the scinstall setup do not accept the default when you are prompted for the adapter connection. Instead, provide the port name (0,1,2,3) found on the Dolphin switch itself, to which the node is physically cabled.
Verify that the SCI driver is loaded:
# prtconf | grep pci11
pci11c8,0, instance #0
pci11c8,0, instance #1
Note: The instance numbers must be in the same order on all nodes if you want to connect sci0 to sci0!

Don't forget the RSM Patch 111796 for Sol8!

You may have to configure the file sci.conf in directory /kernel/drv. Technical Instruction 206995.

Only configure sci switches when you have installed switches!

Use the -G option for pkgadd in case of Solaris10.

If you use Infiniband as cluster transport:
• You must use one Infiniband switch for each private connection. If one HCA is installed, each of its two ports must be connected to a different InfiniBand switch.
• If two HCAs are installed, leave the second port unused.
• You cannot directly connect the HCAs to each other.
• Jumbo frames are not supported.
• VLANs are not supported on a cluster which uses Infiniband switches.
• Sun Infiniband switches support up to 8 nodes in a cluster.

When using Sun StorEdge 69x0 the minimum “Service Processor Image version 2.1.2” is required.

FIN# I0870-1
FIN# I0876-1

Sun Internal and Approved Partners Only Page 5 of 45 Vn 1.11 Created: 20. Apr. 2009


Task Comment Node

1 2 3 4

When using SE99x0 (HDS) set System Options mode:
• 185 was only needed for non-Leadville HBAs on SE9910/9960 (can always be on).
• 186 is for VCS only. More details in Technical Instruction 208789.

Since microcode 21-08-05-00/00 the default settings (185=on, 186=off, 254=off) should work for SC3.x. The modes 186 and 254 are only necessary for Veritas Cluster Server.

For 9985/9990 only, no special mode settings are required for SC 3.x. All three modes should be left off! (Mode 185 is not defined. Modes 186 and 254 have been renumbered and changed to host group option modes.)

Solaris10 requires V08 microcode 50-08-05 V07+1 (50-07-72 is NOT sufficient).

Experience on this issue can be found at:http://sunweb.germany/SSS/MCSC/ET/suncluster/clusttips/sc3x_99x0modes.html

FIN: I1130-1

System Option 111 is for LUN security. Must be on for SC 3.x and 9910/9960 only. The min. required microcode is 1-10-00-00/10.

Host Mode should be set to 09 for Sun Cluster ports (also for Veritas Cluster). If you use Veritas Cluster Server and Sun Cluster on the same storage unit you should use Host Mode 29 for the Sun Cluster ports (min. required microcode 21-11-02-00/00).

Please give feedback if you have problems with these modes!

If using active/passive storage controllers like SE6140, SE6540 or FLX380:
Force scsi3 reservations (see Technical Instruction 211284).
# cluster set -p global_fencing=prefer3
Firmware 6.60.11.xx (part of CAM 6.1) is required for 6140, 6540, FLX380 (SunAlert 231801).

An alternative workaround is to set the option "Allow Reservation on Unowned LUNs" to 0x00 in NVSRAM. Details in CR 6704488.

Solaris 10 only:When using StorEdge 2530 apply patch 125081-14 (sparc) or 125082-14 (x86) or later version.

SunAlert 200159

These patches are on EIS-DVD ≥29JAN08.

Sun Internal and Approved Partners Only Page 6 of 45 Vn 1.11 Created: 20. Apr. 2009


Task Comment Node

1 2 3 4

If desired you can use MPxIO.
Solaris 9:
• To activate, edit /kernel/drv/scsi_vhci.conf:
  mpxio-disable="no";
• Exclude controllers from MPxIO in /kernel/drv/qlc.conf, e.g:
  name="qlc" parent="/pci@6,2000" unit-address="2" mpxio-disable="yes";
  or in /kernel/drv/fp.conf (recommended for SAN 4.3 and higher), e.g:
  name="fp" parent="/pci@6,2000/SUNW,qlc@2" port=0 mpxio-disable="yes";
Solaris 10:
# stmsboot -D fp -e
Do not run this command if SunCluster is already installed.

Activate MPxIO before installing VxVM.

This enables MPxIO on all (external & internal) FC controller ports!
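One hedged way to confirm multipathing after the reboot on Solaris 10 (mpathadm is part of the Solaris 10 multipathing stack; the logical-unit name below is a placeholder taken from the list output):
# mpathadm list lu
# mpathadm show lu <logical_unit_from_the_list>
The path count per logical unit should match the number of FC paths to the shared storage.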

If using MPxIO on shared devices set
auto-failback="disable"
in /kernel/drv/scsi_vhci.conf

This prevents reservation conflict panics. Bug 6661528.

For safety reasons clean the devices with devfsadm -C on all nodes.

If the command stmsboot -L is not working after devfsadm you have hit bug 6253821. Use cfgadm or luxadm instead of stmsboot.

You may experience problems within the Sun Cluster installation if the devices are not clean.

Notice: If you use VLANs for the cluster transport:
• one separate VLAN per cluster transport,
• no network traffic should be possible between the different cluster transport VLANs,
• VLANs should behave like separate physical segments.
TAKE CARE of SunAlert 253948: KU patch 13888[89] may stop a Sun Cluster node joining the cluster. Affected are all GLD(v3) interfaces like e1000g, nxge, bge or ixgb. The best workaround at the moment: Configure VLAN tagging on the ethernet switch and on the Sun Cluster interconnect interfaces (EIS-Alert #235); see the interface-naming note below.
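For reference, Solaris encodes the VLAN tag in the interface name: the VLAN interface instance is 1000 * VLAN-ID + device instance (a hedged illustration; on the interconnect side scinstall/clsetup ask for the VLAN ID). E.g. VLAN 2 on e1000g0 is addressed as e1000g2000:
# ifconfig e1000g2000 plumb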

If you have a back-to-back connection for cluster transport AND if you use the ce driver (no cross-over cable required) do NOT disable auto-negotiation. Otherwise you must specify the link master.

If you intend to use DR with your cluster you should bring the "kernel cage" AND the default "bindings" of the SC onto one SB. This means you should install the SC software with only one active SB.Details in Technical Instruction 216797.

This checklist is written so that Sun Cluster will be installed before the VxVM/SDS/SVM software. This approach is also used by the Installation Guide. However it is also possible to install VxVM/SDS/SVM before the Sun Cluster Software. In this case you can also use this checklist. The approach is a little different but the hints mentioned still apply.

Sun Internal and Approved Partners Only Page 7 of 45 Vn 1.11 Created: 20. Apr. 2009


Task Comment Node

1 2 3 4

When using supported network adapters (X1027A-Z/X4447A-Z) which use the *nxge* driver for private transport, install patch 138048-03 (sparc)/138049-03 (x86) or higher (EIS-DVD ≥29JUL08).

Solaris 10 8/07 update 4 or later is required for the nxge driver.

Bug 6525548Technical Instruction 213624.

Use the Java ES installer for software installation.

# cd /cdrom/cdrom0/<Solaris_arch>
# ./installer -nodisplay

Do not install the full set of JES! Select at least, in the Main Menu: Sun Cluster 3.2 & All Shared Components.

Choose 'Configure Later'. 'Configure Now' does not work this time!

Optional: To use the GUI of the installer be sure that the DISPLAY is set correctly.

Optional: Install Sun Cluster Manager on all nodes or on none!

Note: The Java installer stores files in /var/sadm/prod/entsys. If you need to uninstall enter:
SC3.2: /var/sadm/prod/entsys/uninstall -nodisplay
SC3.2U1: /var/sadm/prod/SUNWentsyssc32u1/uninstall
SC3.2U2: /var/sadm/prod/SUNWentsyssc32u2/uninstall

Only Sun Cluster 3.2 Update1: Reinstall the latest Service Tags packages: Run the script setup-standard again from EIS-DVD ≥30SEP08. Affected packages are SUNWservicetagr, SUNWservicetagu, SUNWstosreg.

Bug: 6749968 – SC3.2u1 JES installer removes all Service Tags which are higher than 1.1.1.

Fixed in SC3.2u2

Only Solaris 10 10/08 Update6 & SC 3.2u1: (fixed in SC3.2u2)

Bug 6584336: Java ES installer reinstalls the wrong version of SUNWcacaort package.

On all nodes:
# /usr/sbin/smcwebserver stop
# /usr/sbin/cacaoadm stop
# pkgrm SUNWcacaort (version of Java ES installer)
# pkgadd SUNWcacaort (version of Solaris 10 10/08 Update6)
Also on EIS-DVD ≥16DEC08 under ../sun/tools/SunCluster

On one node:
# cd /etc/cacao/instances/default
# tar cf /tmp/security.tar security
Copy the security.tar file to all other nodes into the same directory and unpack it there.

On all nodes:
# /usr/sbin/smcwebserver start
# /usr/sbin/cacaoadm start

Only Solaris 9: Ignore the following message at boot time:
cacao: Error: Fail to start cacao agent. (instance default)
Error: Fail to start cacao agent. (instance default)

In CR 6461294 the work-around to copy the security keys does not work. No solution is available at this time.

Sun Internal and Approved Partners Only Page 8 of 45 Vn 1.11 Created: 20. Apr. 2009


Task Comment Node

1 2 3 4

Install required Sun Cluster 3.2 patches with EIS-DVD ≥26JUN07.
Note the structure under .../sun/patch/SunCluster/3.2 and .../sun/patch/cacao/2.0 if Sun Cluster Manager is used.
If using Solaris 10 5/08 Update5 the EIS-DVD ≥29APR08 is required. For details see Sun Alert 239307.
For SC3.2u1 use the -19 core patches (EIS-DVD 16DEC08 or 28APR09).
For SC3.2u2 use EIS-DVD 28APR09.

Configure PATH:
SunCluster 3.x: /usr/cluster/bin /usr/cluster/lib/sc /usr/sbin
Diagnostic Toolkit: /usr/cluster/dtk/bin
VxVM 3.x: /opt/VRTSvmsa/bin /opt/VRTSvxvm/bin /etc/vx/bin

Configure MANPATH:
SunCluster 3.x: /usr/cluster/man /usr/man
Diagnostic Toolkit: /usr/cluster/dtk/man
VxVM 3.2: /opt/VRTS/man

Normally provided from the EIS-DVD by .profile-EIS. Just log out & back in to set the environment correctly.

Note: /usr/cluster/man should come before /usr/man in order to get the cluster command manpages.

Solaris10 only: Ensure that the local_only property of rpcbind is set to false:
# svcprop network/rpc/bind:default | grep local_only

If not false run:
# svccfg
svc:> select network/rpc/bind
svc:/network/rpc/bind> setprop config/local_only=false
svc:/network/rpc/bind> quit
# svcadm refresh network/rpc/bind:default

This is not false if you have installed with “secure by default” option of Solaris10u3.

Is needed for cluster communication.

Note: With the “netservices open” command you can reverse the “secure by default” feature which was enabled by the Solaris 10 U3 (or higher) installation.
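After the refresh the property can be re-checked directly (a sketch using the property and FMRI from above; the expected output is shown on the second line):
# svcprop -p config/local_only network/rpc/bind:default
false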

Sun Cluster 3.2 u2 only (Bug 6825948): ATTENTION: If using a non-default netmask for the cluster interconnect the following question comes up within scinstall:
Maximum number of virtual clusters expected [12]? 1
Answer the question with a value between 1 and 12 even if NOT using the zone clusters feature. DO NOT ENTER 0, this will corrupt the CCR.

Establish SunCluster 3.2 on node1 using scinstall. Select menu 1) & 2).
Interactive Q+A.

Alternative: Use the "create a new cluster" option to configure all nodes at once. Beware: the node which runs the scinstall command gets the highest nodeid. The last node in the order list within scinstall will be nodeid 1. If you use this approach you can skip the next step.

(Partition for /globaldevices required.)

The sponsor node (node1) must be rebooted and member of the cluster before starting SC 3.2 installation on the other nodes.

In case of Quorum Server or NAS device (NetApp filer only): you must disable quorum auto-configuration.

Sun Internal and Approved Partners Only Page 9 of 45 Vn 1.11 Created: 20. Apr. 2009


Task Comment Node

1 2 3 4

Install Sun Cluster 3.x on all additional nodes via scinstall. Select “Add this machine as a node in an existing cluster”

Note: Autodiscovery may not work for nxge interfaces. Use the manual approach to select the interfaces (Bug 6449139).

Interactive Q+A (Partition for global devices required.)

Configure Quorum via clsetup if not already done!
The rule is number of nodes minus 1. But it also depends on the topology!

Only on 1 node!(min. 1 Quorum).

Optional for 3nodes or more.

Enable automatic node reboot if ALL monitored disk paths fail:
# clnode set -p reboot_on_path_failure=enabled <node1> <node2>

And disable the monitoring on all local disks:
# cldev status (to look for single-attached disks)
# cldev unmonitor <did_device>

To verify, run clnode show or scdpm -p all:all

See Bug 6479327: the node will only be rebooted if local disks are unmonitored. Fixed in: Solaris 9, 125510-01 & Solaris 10, 125511-01 & Solaris 10_x86, 125512-01 – all on EIS-DVD ≥26JUN07.

See Bug 6563949: memory leak in scdpmd. Fixed in: Solaris 9, 126105-04 & Solaris 10, 126106-04 & Solaris 10_x86, 126107-04 – all on EIS-DVD ≥26FEB08.

Also fixed with SC3.2u1!

If using active/passive storage controllers like SE6140, SE6540 or FLX380:

Force scsi3 reservations (see Technical Instruction 211284).
# cluster set -p global_fencing=prefer3

Firmware 6.60.11.xx (part of CAM 6.1) is required for 6140, 6540, FLX380. (SunAlert 231801).

An alternative workaround is to set the option "Allow Reservation on Unowned LUNs" to 0x00 in NVSRAM.

Details in CR 6704488.

Ensure that the Perl packages are installed. The scsnapshot command needs the packages SUNWpl5u, SUNWpl5v and SUNWpl5p.

They are NOT included in Solaris10 End User Software Group (CR 6477905)

Enable DRP (Dynamic Resource Pools):
# svcadm enable pools/dynamic
Otherwise the following message appears during a node shutdown:
Oct 3 14:44:09 Cluster.RGM.fed: SCSLM thread WARNING pools facility is disabled

CR 6616774

The problem exists after installation of the patch sparc: 120629-08 or 120011-14; x86: 120012-14.

Fixed in Solaris10 5/08 Update5
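A quick check that the service is really online afterwards (FMRI as used above):
# svcs pools/dynamic
The STATE column should show online for svc:/system/pools/dynamic:default.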

Sun Internal and Approved Partners Only Page 10 of 45 Vn 1.11 Created: 20. Apr. 2009


Task Comment Node

1 2 3 4

Ensure that the tcp_listen property of webconsole is set to true:
# svcprop /system/webconsole:console | grep tcp_listen

If not true run:
# svccfg
svc:> select system/webconsole
svc:/system/webconsole> setprop options/tcp_listen=true
svc:/system/webconsole> quit
# svcadm refresh
# /usr/sbin/smcwebserver restart

This is not set true if you have installed with “secure by default” option of Solaris10u3.

Is needed for Sun Cluster Manager communication.

To verify that the port is listening on *.6789 you can execute:
# netstat -a | grep 6789

For Sun Cluster Manager GUI do: (Fixed in Sun Cluster 1/09 Update2)

# cd /usr/cluster/lib/SunClusterManager/WEB-INF/classes
# rm ds
# cp -r /usr/cluster/lib/ds ds
# /usr/sbin/smcwebserver restart

Bug: 6701123

The ds directory should NOT be a link!

Ensure that Cluster Manager is working. Access via: https://<nodename_or_IP>:6789

Check that cacao is enabled at startup. Use:
# cacaoadm status

To start and stop cacao:
# /opt/SUNWcacao/bin/cacaoadm stop
# /opt/SUNWcacao/bin/cacaoadm start

If it is DISABLED run cacaoadm enable.

To check the versions run:
# java -version
# cacaoadm -V

Setup the NTP Configuration:

You are free to configure NTP as best meets your individual needs.

If you do not have your own /etc/inet/ntp.conf file, the /etc/inet/ntp.conf.cluster file will be used as your NTP configuration file automatically.

NOTE: All nodes must be synchronized to the same time!

Do not rename the ntp.conf.cluster file to ntp.conf!

Further information: Technical Instructions 208564 and 213924. Problem Resolution 230332.

POST INSTALLATION TASKS AND CHECK POINTS

SPARC only (not x86): Set/Check local-mac-address in OBP. scinstall sets it to true automatically!

Only true is supported in SC3.2 and higher.

Check file /etc/name_to_major for Global Devices: did 300
When using SDS/SVM: md 85
Note: The above numbers are examples!

All nodes must have identical numbers. Numbers are chosen by the installation scripts.

Do not configure cluster nodes as routers! Set up the file /etc/defaultrouter.

scinstall will touch the file /etc/notrouter by default.

Sun Internal and Approved Partners Only Page 11 of 45 Vn 1.11 Created: 20. Apr. 2009


Task Comment Node

1 2 3 4

Check/Modify /etc/nsswitch.conf if applicable:
hosts: cluster files <any other hosts database>
netmasks: cluster files <any other netmasks database>

Solaris10: if the cluster nodes are NIS clients:
ipnodes: files nis [TRYAGAIN=0] (see bug 6257112)

Note:
• scinstall enters cluster as the first entry for hosts & netmasks.
• files should always be in the second place following cluster.
• Enter dns when the node is a dns client.
• All further databases (e.g: nis, nisplus ...) can be added to the end of the line.

Additional requirements which depend on the dataservice:
HA-NFS:
hosts: cluster files [SUCCESS=return] (see bug 4511699)
rpc: files
HA-Oracle:
passwd: files (setup help for passwd in Problem Resolution 229579)
group: files
publickey: files
project: files
HA-SAP, HA-Siebel, HA-Sybase:
group: files [NOTFOUND=return]
HA-LiveCache:
protocols: files nis
passwd: files nis [TRYAGAIN=0] (see bug 4904975)
group: files [NOTFOUND=return]
project: files [NOTFOUND=return]
publickey: files [NOTFOUND=return] (see bug 4836272)
HA-Samba:
passwd: files winbind
group: files winbind

Fixed in Sun Cluster 3.2 1/09: Due to the bugs 6632298 & 6634592 (nscd performance degradation), with Solaris 10 8/07 or KU 120011-14 onwards (EIS-DVD ≥25SEP07).
The example is for a 2 node cluster with default setup. If you have more nodes or changed the default values for the cluster interconnect, use the ifconfig and scconf -pvv | grep -i private commands on all nodes to identify the values. Modify the following files on all nodes.
Add to /etc/hosts:
172.16.0.129 clusternode1-priv-physical1
172.16.1.1 clusternode1-priv-physical2
172.16.4.1 clusternode1-priv
172.16.0.130 clusternode2-priv-physical1
172.16.1.2 clusternode2-priv-physical2
172.16.4.2 clusternode2-priv
Add to /etc/netmasks:
172.16.0.128 255.255.255.128
172.16.1.0 255.255.255.128
172.16.4.0 255.255.254.0
Remove the 'cluster' entry for hosts and netmasks in /etc/nsswitch.conf, e.g:
hosts: files <any other hosts database>
netmasks: files <any other netmasks database>

Sun Internal and Approved Partners Only Page 12 of 45 Vn 1.11 Created: 20. Apr. 2009


Task Comment Node

1 2 3 4

Check/Configure:
# ndd -set /dev/ip ip_strict_dst_multihoming 0

To check the value enter:
# ndd /dev/ip ip_strict_dst_multihoming

You should do this in a start script. You can use S68net-tune from the EIS-DVD. See also Problem Resolution 230655.
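A minimal sketch of such a start script (illustrative only; the S68net-tune script on the EIS-DVD is the supported version and may set further tunables):
#!/sbin/sh
# /etc/rc2.d/S68net-tune (sketch): re-apply IP tunables at every boot
case "$1" in
start)
        /usr/sbin/ndd -set /dev/ip ip_strict_dst_multihoming 0
        ;;
esac
exit 0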

When using supported network adapters which use the *ce* driver for private transport, uncomment in /etc/system:
set ce:ce_taskq_disable=1

IMPORTANT: Remove the entry from /etc/system if Solaris10 5/08 Update5 or higher & Sun Cluster core patches 12610[67]-18 are used (use EIS-DVD ≥16DEC08).

Note: In case of Solaris 9 the required patches are: 112817-32, 126849-01 and 126105-18

Bug 4746175. If you use ce only for the public network, ce_taskq_disable=0 (the default) is ok. See also Technical Instruction 229074.

With the mentioned patches Sun Cluster automatically chooses the best configuration for the ce driver (RFEs 6281341 & 6487117).

When using supported network adapters which use the *ixge* driver for private transport, uncomment in /etc/system:
set ixge:ixge_taskq_disable=1

Bug 6672126. If you use ixge only for the public network, ixge_taskq_disable=0 (the default) is ok.

Sun Internal and Approved Partners Only Page 13 of 45 Vn 1.11 Created: 20. Apr. 2009


Task Comment Node

1 2 3 4

Configure IPMP (probe-based or link-based): Set up IPMP groups on all nodes for all public network interfaces which are used for a HA dataservice.

Example probe-based IPMP group, active-active, with interfaces qfe0 and qfe4 and one production IP:
Entry of /etc/hostname.qfe0:
<production_IP_host> netmask + broadcast + group ipmp1 up \
addif <test_IP_host> netmask + broadcast + deprecated -failover up
Entry of /etc/hostname.qfe4:
<test_IP_host> netmask + broadcast + group ipmp1 deprecated -failover up
The IPMP group name "ipmp1" is freely chosen in this example!

If the defaultrouter is NOT 100% available please read Technical Instructions 214668 and 202448.
Notes:
• Do not use the test IP for normal applications.
• There is no need for an IPMP test address if you have only 1 NIC in the IPMP group (RFE 4511634, 4741473). E.g. /etc/hostname.qfe0 entry:
<production_IP_host> netmask + broadcast + group ipmp1 up

• Test IP for all adapters in the same IPMP group must belong to a single IP subnet.

Example link-based IPMP group, active-active, with interfaces qfe0 and qfe4 and one production IP:
Entry of /etc/hostname.qfe0:
<production_IP_host> netmask + broadcast + group ipmp1 up
Entry of /etc/hostname.qfe4:
<dummy_IP_host> netmask + broadcast + deprecated group ipmp1 up
Notes:
• Do NOT use the 0.0.0.0 IP as dummy_IP_host for link-based IPMP (Bug 6457375).
• The recommendation is to use valid IP addresses instead of a dummy IP address.
• Bug 6457375 is fixed with the next KU 13888[89].

Further details in Technical Instruction 211105.
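One hedged way to exercise a probe-based group after configuration (interface names from the example above; if_mpadm is part of the Solaris IPMP tools):
# ifconfig qfe0 (the output should contain "groupname ipmp1")
# if_mpadm -d qfe0 (detach; the addresses should fail over to qfe4)
# if_mpadm -r qfe0 (reattach)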

Hints / Checkpoints for all configurations:
• You need an additional IP for each logical host.
• If there is a firewall being used between clients and a HA service running on this cluster, and if this HA service is using UDP and does not bind to a specific address, the IP stack chooses the source address for all outgoing packets from the routing table. So, as there is no guarantee that the same source address is chosen for all packets (the routing table might change), it is necessary to configure all addresses available on a network interface as valid source addresses in the firewall.
• IPMP groups in an active-standby configuration are also possible.
• In the /etc/default/mpathd file, the value of TRACK_INTERFACES_ONLY_WITH_GROUPS must be yes (default).
• Esc 1-14949272: Use only one IPMP group in the same subnet. It is not supported to use more IPMP groups in the same subnet.
• The SC installer adds an IPMP group to all public network adapters. If desired, remove the IPMP configuration for network adapters that will NOT be used for HA dataservices.
• Remove IPMP groups from dman interfaces (SunFire 12/15/20/25K) if they exist (Bug 6309869).

Sun Internal and Approved Partners Only Page 14 of 45 Vn 1.11 Created: 20. Apr. 2009


Task Comment Node

1 2 3 4

No cross mounts? Should not be necessary with global FS.

NIS/NIS+ configuration OK? Only NIS/NIS+ client is supported!

Beware: no other real-time processes are allowed on the cluster!

Exception: xntpd

Sun Cluster Security Hardening is supported with SC3.x. This is part of the CIS service offered by SunPS. Detailed Information at: http://www.sun.com/security/blueprints

Sun Internal and Approved Partners Only Page 15 of 45 Vn 1.11 Created: 20. Apr. 2009


Task Comment Node

1 2 3 4

CONFIGURATION STEPS FOR QUORUM SERVER
You can install the quorum server software on any system connected to your cluster through the public network. The quorum server should NOT run on a cluster node.

NOTE: The SC3.2u2 framework delivers a quorum server monitor feature. Earlier releases of SC do not monitor the quorum server!

Install the Quorum Server via the JES installer.
# cd /cdrom/cdrom0/<Solaris_arch>
# ./installer -nodisplay

Select “Quorum Server”

Choose 'Configure Later'

Install the required SC3.2 quorum server patch with EIS-DVD ≥28AUG07.
Note the structure under .../sun/patch/SunCluster/3.2
Currently SC3.2u2 has all patches integrated.

Configure PATH & MANPATH:
Quorum Server PATH: /usr/cluster/bin
Quorum Server MANPATH: /usr/cluster/man

Normally provided from the EIS-DVD by .profile-EIS. Just log out & back in to set the environment correctly.

Configure the Quorum Server. Add to /etc/scqsd/scqsd.conf:
/usr/cluster/lib/sc/scqsd -i QS -p 9000 -d /var/scqsd
Explanation:
-i instancename: Unique name for the quorum server instance. Can be freely chosen. (optional)
-p port: The port number on which the quorum server listens. (required)
-d quorumdirectory: Must be unique for each quorum server. (required)

• You can add multiple quorum servers on a single host.

• It's recommended to use different quorum servers for different clusters. Only one cluster per quorum server.

• If you want more than one quorum server per cluster you need multiple single hosts as quorum servers.

Start the Quorum Server:
# clqs start QS or # clqs start +

If you set up the Quorum Server in a different subnet you should check the functionality several times. Boot all nodes several times in order and monitor the reservations on the Quorum Server.

Bugs 6449588 & 6371284: The bugs report problems when the quorum server is in a different subnet.

Check with # clqs show
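On the cluster side the quorum server is then registered as a quorum device, either via clsetup or, as a hedged sketch, with clquorum (instance name QS and port 9000 are taken from the scqsd.conf example above; <qs_hostname> is assumed):
# clquorum add -t quorum_server -p qshost=<qs_hostname> -p port=9000 QS
# clquorum status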

Manually configure routing information about the QS on all cluster nodes:
• Add the QS host name to /etc/inet/hosts and /etc/inet/ipnodes.
  Note: Since Solaris 10 8/07 Update4 ipnodes is a link to the file hosts.
• Add the quorum server host netmask to /etc/inet/netmasks.
• Add "netmask +" to the /etc/hostname.<adapter> file(s).

Add the cluster nodes' netmasks to /etc/inet/netmasks on the QS host machine.

If present, the “Spanning Tree Algorithm” on any router or switch between the cluster nodes and the QS host must be disabled.

Sun Internal and Approved Partners Only Page 16 of 45 Vn 1.11 Created: 20. Apr. 2009


Task Comment Node

1 2 3 4

CONFIGURATION STEPS FOR NETAPP FILER
• All SC dataservices are supported on the NAS device EXCEPT NFS itself.
• The NAS device must always be booted before the Sun Cluster.

Requirements on the NetApp Filer:
• It is assumed that the NetApp Filer has already been installed.
• Cluster physical nodenames should be in the same LAN as the filer.
• The NTP server should be the same as for SC.
• DNS is configured.
• The package SUNWzlib is required (not in the End User Solaris Software Group!).
• The HTTP administration facility must be enabled for the user you use for SC.
• Add physical node names into /etc/hosts.
• Export NAS directories, for use with SC, with read/write permissions, e.g:
  -sec=sys,rw=<phys_hostname1>:<phys_hostname2>,root=<phys_hostname1>:<phys_hostname2>
  (Use exportfs to check.)

Register each Filer in the cluster (only on one node):# clnasdevice add -t netapp -p userid=<fileruser_used_for_sc> <filer_hostname>

Register NetApp Filer directories into SC.
• Add the filesystems to /etc/vfstab, e.g:
<filer_hostname>:/vol/data/1 - /data/1 nfs 1 yes bg,hard,intr,rsize=32768,wsize=32768,forcedirectio,proto=tcp,timeo=60
• Run this for all NAS directories:
# clnasdevice add-dir -d <filer_volume> <filer_hostname>

This is a recommendation for the mount options! In the case of Oracle RAC data files you should use at least forcedirectio, noac and proto=tcp.

For the -d option do NOT use the mountpoint!
Check with # clnasdevice list

Additional requirements on the NetApp Filer when used as quorum:
• Install the iSCSI license on the Filer.
• Add physical node names into the hosts.equiv file, e.g:
  <phys_hostname1> root
  <phys_hostname2> root
• Create an iSCSI LUN with lun setup.
  Image must be: solaris
  Lun Size: Choose the smallest available Lun Size
  Initiator Group: Should be the clustername
  Type of Initiator: iSCSI
  Initiator Node Name must be: iqn.1986-05.com.sun:cluster-iscsi-quorum.<phys_hostname1>, iqn.1986-05.com.sun:cluster-iscsi-quorum.<phys_hostname2>
  Each phys_hostname of the cluster which should access the quorum device on the Filer must be configured (only one quorum device per filer is possible).

Install the NTAPclnas package. This should be obtained from Network Appliance (http://www.netapp.com). A new NTAPquorum binary is needed for Data ONTap 7.0.2p4.

Add NetApp Filer hostnames to /etc/hosts.

Check that the Filer is registered:
# clnasdevice show -v

If not registered, see the registration steps above.

Sun Internal and Approved Partners Only Page 17 of 45 Vn 1.11 Created: 20. Apr. 2009


Task Comment Node

1 2 3 4

Register the quorum device. Use clsetup or:
# clquorum add -t netapp_nas -p filer=<filer_hostname> -p lun_id=<0> <Initiator_Group_Name(NAS_quorum)>

Bug 6504743: scconf/clquorum needs support for SSH for adding a NAS quorum. On the filer run "options rsh.enable on". You can disable it after quorum registration.
Note: rsh is needed again for quorum changes!

Commands on the Filer to find out the names:

* lun show -v or -m
* igroup show
* iscsi show

This is fixed with SC3.2 core patch 125510-02 (S9), 125511-02 (S10), 125512-02 (S10 x86).

Above patches are on EIS-DVD ≥26JUN07.

Sun Internal and Approved Partners Only Page 18 of 45 Vn 1.11 Created: 20. Apr. 2009


Task Comment Node

1 2 3 4

CONFIGURATION STEPS FOR USING SVM
Maybe adjust /kernel/drv/md.conf:
nmd=128 default (max. metadevices)
md_nsets=4 default (max. disksets)
Changes require boot -r. Solaris10 creates volumes dynamically!

Possible max values:
nmd=8192
md_nsets=32

Set the values to fit your requirements!

Run scgdevs so that the changes from md.conf will work properly.

This is due to bug 4678724.

Create replicas on the local disks:
# metadb -afc 3 c0t0d0s7 c1t0d0s7 c2t0d0s7

Recommendation: Place replica on slice7.

Always put the replicas on different physical drives and if possible on 3 different controllers.

Mirror the root file systems. For the mirror of /, swap and /global/.devices/node@<id> (and all other local filesystems) the physical devices must be used, not the did devices!

Attention: Metadevices MUST be unique (RFE 4979330) within the cluster for /global/.devices/node@<id>.
E.g. for the (c0t0d0s0) root filesystem:
metainit -f d101 1 1 c0t0d0s0
metainit d102 1 1 c1t0d0s0
metainit d100 -m d101
metaroot d100 (only for root)
(for swap and other filesystems you must update the file /etc/vfstab)
lockfs -fa
init 6
metattach d100 d102
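The same pattern applies to swap and the other local filesystems (a sketch using the example disks above; d110/d111/d112 are placeholder metadevice names):
metainit -f d111 1 1 c0t0d0s1
metainit d112 1 1 c1t0d0s1
metainit d110 -m d111
(change the swap entry in /etc/vfstab to /dev/md/dsk/d110, then init 6)
metattach d110 d112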

Additionally on x86 for the rootmirror:
# installgrub -fm /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c1t0d0s0

SPARC: Technical Instruction 216346.
X86: Technical Instruction 210887.
SDS templates: .../sun/tools/MISC/SDS

The boot mirrors should be on different controllers!

If available you can also use the internal RAID1 feature for boot disk mirroring.

Recommendation: Use unique metadevices within the cluster, e.g:
d10-d19 for Node1
d20-d29 for Node2

or the setup:
d100-d199 for Node1
d200-d299 for Node2
(Take care: jumpstart can only create devices in the name range d0 ... d127.)
-> Think about a wise config because it depends on the amount of devices!

SPARC: Do not forget to set the boot-device in OBP for both boot drives!
X86: Set altbootpath in bootenv.rc. SEE page 23!

Update the dump device: dumpadm

Optional: Only if you have an even number of disks or controllers for local replicas, consider adding the following to the file /etc/system:
set md:mirrored_root_flag=1

See Technical Instruction 214357 and discuss implications with customer.

Additional helpful information:
http://www.sun.com/blueprints/1102/817-0656.pdf

Managing Shared Storage in a SC3.x/SVM.

Sun Internal and Approved Partners Only Page 19 of 45 Vn 1.11 Created: 20. Apr. 2009


Task Comment Node

1 2 3 4

Only for active-active IPMP groups with 2 production IPs and SVM (Bug 4897681). Ignore the following message after a meta command:
ERROR: <hostname>: rpc.metacld 13

Normally you use only one production IP in an IPMP group.

Campus clusters using SDS/SVM for managing shared data: it is recommended to use a quorum server instead of a traditional quorum device. In the case of a quorum disk you should consider creating a preferred site by putting an additional statedb in each of the metasets. The preferred site should be the same site as that which contains the quorum disk. Doing this overcomes the problem of instantaneous loss of half your storage and servers in the preferred site. If the remaining site still happens to panic, the preferred site can still boot (more than half of the statedbs are available).

Adding an extra statedb to a metaset requires slice 7 to be larger than the 4 MB created by the standard metainit command. So, prior to using metainit, format the disk with a 20 MB slice 7 and mark the slice with the wu flags. Then initialise the metadbs with the commands in the next step.

When using Oban please jump to RAC install section.

Set up the metasets:
metaset -s <setname> -a -h <node1> <node2>
metaset -s <setname> -a <drivename1> <drivename2> ...
# remove the one copy if necessary
metadb -s <setname> -d -f <drivename>
# add back two copies (maybe you must expand slice 7)
metadb -s <setname> -afc 2 <drivename>

Use the did devices, e.g: /dev/did/dsk/d0

Slice 7(VTOC) or Slice 6(EFI) size for metadb is fixed when it's added to the metaset.

Check replicas of the disksets!

Set up the Mediator Host for each diskset when the cluster matches the two-hosts and two-strings criteria:
metaset -s <setname> -a -m <node1> <node2>

See also % man mediator

Create metadevices. Recommendation for slices:
Slice 7 - Replica (starts at cylinder 0)
Slice 0 - Metadevice

md.tab in /etc/lvm

Replicas are pre-configured on Slice 7 (VTOC) or Slice 6 (EFI).

If you need to, you can slice the disk up! Beware of CR6407862 in case of Solaris10 and KU118833-17 or greater. Set NOINUSE_CHECK=1 in your shell environment to prevent warning/error messages of the filesystem utilities commands.
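A hedged sketch of building a mirrored metadevice inside a diskset using did devices (set name, did numbers and metadevice numbers are placeholders; equivalent entries can go into md.tab):
# metainit -s <setname> d11 1 1 /dev/did/rdsk/d10s0
# metainit -s <setname> d12 1 1 /dev/did/rdsk/d20s0
# metainit -s <setname> d10 -m d11
# metattach -s <setname> d10 d12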

Consider installing the metacheck script & activate via CRON.

On EIS-DVD under...sun/tools/MISC/SDS

Jump to common actions for filesystems and boot devices on page 24.

Sun Internal and Approved Partners Only Page 20 of 45 Vn 1.11 Created: 20. Apr. 2009


Task Comment Node

1 2 3 4

CONFIGURATION STEPS FOR VXVM/CVM
This checklist covers only VxVM 4.1 or higher:
• A rootdg is optional. Therefore you can use SVM for the rootmirror and VxVM for the shared devices (this protects you from the VxVM encapsulation administration).
• SC3.2u1 is the first release which supports VxVM 5.0 MP1 on x86.

HEADS-UP about Bug 6626457: boot time increases exponentially with S10U4 & VxVM 5.0 & SunCluster 3.2. Please examine the bug for details, but boot times up to 30 minutes have been observed with VxVM 5.0. Also with VxVM 4.1 the boot time can increase to more than 20 minutes. So, if nothing happens on the console for several minutes then VxVM is starting. Fixed in 127718-04 for sparc (EIS-DVD ≥26FEB08).

Solaris 9: Ensure that the SAN Foundation Suite has been installed and patched.

On EIS-DVD in .../sun/progs/SAN &.../sun/patch/SAN/9

VxVM install: Use the installer from the storage_foundation directory or the installvm script from the Veritas Volume Manager directory. Configure VxVM later!

License required!!!

Circa 1.1 GB required in / filesystem!

Install required VxVM patches from EIS-DVD (sun/patch/veritas)

VxVM 4.1 only: Ensure that you have installed the required ASL for your storage - see EIS-DVD at sun/tools/storage/ASL.

Refer to the storage checklist for details.

VxVM 5.0 has all ASLs included.

Enable enclosure-based naming. Check that vxconfigd is running; if not, start it.
cd /opt/VRTS/install
./installvm -configure

It's not necessary to configure a default diskgroup!

Note: Refer to Problem Resolution 208533 if you have problems with the enclosure-based naming of internal disks.

Consult SunAlert 102151. Maybe it's necessary to use native device names in case of EMC.

The root disk group is optional.

Reboot the nodes with init 6. The vxencap will fail if you do not reboot the nodes after VxVM installation/activation.

Sun Internal and Approved Partners Only Page 21 of 45 Vn 1.11 Created: 20. Apr. 2009


Task Comment Node

1 2 3 4

Initializing VxVM on a cluster node:
# clvxvm initialize -v

Note: If using zfs boot the command fails due to bug 6803117. Check that /etc/name_to_major has the same vxio number on all nodes.

OR: Initializing VxVM on a cluster node and encapsulating the root disk:
# clvxvm encapsulate -v

This selects a cluster-wide vxio number.

Only SC32u2: Bug 6824660: clvxvm encapsulate tries to rename /.globaldevices directory in /etc/vfstab if using lofi based globaldevices.

Bug 6735342: The command fails if root is already mirrored with SVM. In this case change the following line (609) in the file /usr/cluster/lib/sc/clvxvm_impl
from: if [[ "${special}" != /dev/dsk/c*t*d*s[0-7] ]]; then
to: if [[ "${special}" != /dev/md/dsk/d* ]]; then

Check if encapsulation is working correctly # init 6

CONFIGURATION STEPS FOR ROOTMIRROR

Notice concerning CDS (Cross Platform Data Sharing) format, which is the default for disks with VxVM 4.x and higher: the vxmksdpart command does NOT work for CDS disks. It is NOT possible to write a partition table to a CDS disk.

To change the default for rootdisks do:
# vxdisksetup -i c0t1d0s0 format=sliced
or create the file /etc/default/vxdisk with the entry:
format=sliced

Also for the vxdg command the default is CDS. To change, do:
# vxdg init <datadg> c1t5d0 cds=off
or create the file /etc/default/vxdg with the entry:
cds=off

Note: CDS format and sliced format can NOT be mixed in the same diskgroup.

Initialise the rootmirror disk:

# vxdisksetup -i <cxtxdx> format=sliced

Add the disk to the diskgroup containing the boot disk:

# vxdg -g <vxvm_bootdg> adddisk rootmirror=<cxtxdxs2>

Mirror the boot disk: # vxmirror -g <vxvm_bootdg> rootdisk rootmirror

Sun Internal and Approved Partners Only Page 22 of 45 Vn 1.11 Created: 20. Apr. 2009


Task Comment Node

1 2 3 4

Ensure that there are underlying partitions on the rootmirror.

Use vxmksdpart to create the underlying partitions on the rootmirror disk where they are not already present. The rootvol will already have a partition but it may be necessary to create partitions for the remaining volumes that have been mirrored onto the rootmirror disk. Ensure that you use the correct slice number and appropriate 'tag' and 'flag' values. E.g. for /var on the rootmirror disk that uses a sub-disk rootmirror-03, partition 5 could be created using:

# vxmksdpart -g <rootdg> rootmirror-03 5 0x07 0x00

See Internal Technical Instruction 2202882 for an explanation of using vxmksdpart and prtvtoc(1m) for the list of valid tags and flags.

CONFIGURATION STEPS FOR VXVM DISKGROUP SETUP

If you use MPxIO, EMC PowerPath or HDLM ensure that DMP is enabled. Be sure that you have installed all necessary VxVM ASL libraries for your storage. Details are available in the storage checklists.
Note: In the case of EMC Clarion Arrays refer to Problem Resolution 203669 for setup details.

Exception: The use of Dynamic Multipathing (*DMP*) alone to manage multiple I/O paths per node to the shared storage is not supported.

When using Hitachi HDLM use the dlmsetconf utility which deletes logical device files of sd or ssd devices. ALSO exclude devices other than the one on the primary path from VxVM.

Refer to the HDLM USER'S GUIDE for more information.

See also Technical Instruction 204950.

When using CVM please jump to RAC install section.

Set up disk groups and volumes. Enable DRL!

Check that private regions are distributed on all used storage devices. Execute:
# vxdg list <diskgroup>
and check the line 'config disk'. The state 'clean online' should be on more than one array.

If this is not the case, add disks from the different storage devices alternately (first a disk from one device, then a disk from another, ...) to the diskgroup.

To activate private regions on all disks refer to Problem Resolution 204270. Beware: in larger configurations this can have disadvantages!

Sun Internal and Approved Partners Only Page 23 of 45 Vn 1.11 Created: 20. Apr. 2009


Task Comment Node

1 2 3 4

COMMON ACTIONS for filesystems & boot devices

Check/Set the localonly flag for all local did devices (which means all disks which are managed on only one node). At least necessary for the root & root mirror disks:
• find the matching between physical and did device:
# cldev list -v <physical device or did device> (scdidadm -l)
• make sure only the local node is in the node list:
# cldg show dsk/d<N>
• if other nodes are in the node list, remove them:
# cldg remove-node -n <other_phy_host> dsk/d<N>
• set localonly & autogen to true if not already done:
# cldg set -p localonly=true dsk/d<N>
# cldg set -p autogen=true dsk/d<N>

NOTE: To get the did device use cldev list -n <phy_host> -v (scdidadm -l)

Run command cldev populate on one node. This causes the new diskgroups to be incorporated into the global devices.

This is for safety reasons! Normally the cluster does this automatically for metadevices!

Only VxVM: Register disk groups as "device group" in the SunCluster 3.x configuration via clsetup. Do not forget to synchronise each VxVM object after you create it via clsetup!

Interactive Q+A. Start with item 5 and continue with item 1. In the case of RAC do NOT register the shared diskgroup. Refer to the RAC section!

Create file systems. On global file systems you can use UFS or VxFS 4.1 or 5.0.
Check /etc/system after installation of VxFS. Change the rpcmod entry to:
set rpcmod:svc_default_stksize=0x8000
set lwp_default_stksize=0x6000

Check mount option restrictions in the Release Notes!

For x86 clusters:
VxFS 4.1 is supported!
VxFS 5.0 requires SC3.2u1.

Enter global filesystems into /etc/vfstab on all nodes. Recommendation: mount all FS under /global! Create all mountpoints on all nodes.

Add global and logging option in /etc/vfstab if you use UFS.

Test switching of the disk groups with:
cldg switch -n <phy.host> <devicegroup>

SPARC: Modify NVRAM parameters: Set boot-device & diag-device to both sides of the mirror.

X86: If using software root mirror add altbootpath to bootenv.rc with:
# eeprom altbootpath="<path_to_rootmirror>"

Suggested naming convention SPARC:

• Primary: rootdisk

• Secondary: rootmirror

X86:

• Techdoc 210887

• VxVM 5.0MP3 is required (patch 127336-02 or higher)

Sun Internal and Approved Partners Only Page 24 of 45 Vn 1.11 Created: 20. Apr. 2009


Task Comment Node

1 2 3 4

Only Solaris10 x86: add to /boot/grub/menu.lst the kmdb flag to enable crashdump writing, e.g:
#-----------------------------------------
title Solaris 10 6/06 cluster mode
root (hd0,0,a)
kernel /platform/i86pc/multiboot kmdb
module /platform/i86pc/boot_archive
#-----------------------------------------

Further boot possibilities, e.g:
#-----------------------------------------
title Solaris 10 6/06 non-cluster-mode
root (hd0,0,a)
kernel /platform/i86pc/multiboot kmdb -x
module /platform/i86pc/boot_archive
#-----------------------------------------
title Solaris 10 6/06 cluster-mode single-user
root (hd0,0,a)
kernel /platform/i86pc/multiboot kmdb -sv
module /platform/i86pc/boot_archive
#-----------------------------------------
title Solaris 10 6/06 non-cluster-mode single-user
root (hd0,0,a)
kernel /platform/i86pc/multiboot kmdb -xsv
module /platform/i86pc/boot_archive
#-----------------------------------------

IMPORTANT: If kernel /platform/i86pc/multiboot kmdb does not work then use: kernel /platform/i86pc/multiboot -k

See also Technical Instruction 2204226 & 217327 and the EIS-checklist of the used X86 Server.

See also Technical Instruction 211199.

Modify the example as required!

Bug 6553399: kmdb changed to -k in Solaris 10 5/08 Update5.

Test booting using the new aliases.

Sun Internal and Approved Partners Only Page 25 of 45 Vn 1.11 Created: 20. Apr. 2009


Task Comment Node

1 2 3 4

CONFIGURATION STEPS FOR LOCAL VXVM DISKGROUP

Set up a local diskgroup (e.g: localdg) and volumes:
vxdisksetup -i Disk_5
vxdg init localdg localdg01=Disk_5
vxassist -g localdg make vol01 500m localdg01

It's highly recommended to have unique local diskgroup names throughout the cluster for better administration.

If you use a multihosted disk, remove all other nodes from the did device which is used in the localdg. E.g:
Look for the did device of the vxvm device:
# vxdisk list Disk_5
# cldev list -v c1t1d1 (scdidadm -l)
# cldg list -v dsk/d16
Remove other nodes (which should not access localdg) from the list:
# cldg remove-node -n phy-host dsk/d16
# cldg list -v dsk/d16

In this example:Disk_5 = c1t1d1 = dsk/d16

Set the localonly property to a single node:
# cldg create -t vxvm -p localonly=true -n phy-host localdg

Verify with:
# cldg show dsk/d16
# cldg list -v dsk/d16
# cldg status (should NOT show localdg)

You can also use clsetup

5) Device groups and volumes
7) Set a VxVM disk group as a local disk group

Create the filesystem:
newfs /dev/vx/rdsk/localdg/vol01

Verify that the autoimport flag is set:
vxdisk list c1t1d1s2

The flags line must include "autoimport".

Add the local filesystem to /etc/vfstab on the connected phy-host and mount it. E.g:
/dev/vx/dsk/localdg/vol01 /dev/vx/rdsk/localdg/vol01 /localfs ufs 2 yes logging
mkdir /localfs; mount /localfs

Verify that the system is able to boot. init 6

Sun Internal and Approved Partners Only Page 26 of 45 Vn 1.11 Created: 20. Apr. 2009


Task Comment Node

1 2 3 4

CONFIGURATION NOTICE FOR ALL HASTORAGEPLUS FILESYSTEMS

Configure the 'fsck pass' option in /etc/vfstab for HAStoragePlus filesystems:
• if 'fsck pass' is set to 1 --> fsck will run sequentially
• if 'fsck pass' is set to 2 or greater --> fsck will run in parallel

More details in Technical Instruction 207488.
Note: Due to Bug 6572900 the parallel fsck can end up in an exit (127) of a SUNW.HAStoragePlus resource. WORKAROUNDS:
a) Do not set the locales LC_CTYPE LC_NUMERIC LC_TIME LC_COLLATE LC_MESSAGES LC_ALL, or
b) use sequential fsck.
An IDR patch is available through normal support channels. (Fixed in SC32u2)

CONFIGURATION STEPS FOR FAILOVER ZFS WITH HASTORAGEPLUS

Notes:
• Do not add a configured quorum device to zfs because zfs will relabel the disk. You can add the quorum after creation of the zfs.
• It's recommended to use a full disk instead of a disk slice.
• HAStoragePlus does not support file systems created on zfs volumes; only whole zfs pools can be used.
• You can not use the value 'legacy' or 'none' for the mountpoint property.
• It's recommended to use cXtXdX devices for ZFS.
• Do not use did devices!

Create a ZFS storage pool, e.g:
stripe setup:
# zpool create <data> cXtXdX cXtXdX
mirror setup:
# zpool create <data> mirror cXtXdX cXtXdX
raidz / raidz2 setup:
# zpool create <data> raidz2 cXtXdX cXtXdX cXtXdX

Use # cldev list -v to find out shared devices.

Refer to ZFS documentation for details...

Create ZFS filesystems in the pool:
# zfs create data/home
# zfs create data/home/user1
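If a non-default mountpoint is wanted it can be set on the dataset (a sketch with an assumed path; remember the note above that 'legacy' and 'none' are not allowed):
# zfs set mountpoint=/export/home data/home
# zfs get mountpoint data/home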

Register HAStoragePlus:
# clrt register SUNW.HAStoragePlus

Create a failover resource group:
# clrg create zfs1-rg

Create HAStoragePlus resource# clrs create -g zfs1-rg -t SUNW.HAStoragePlus -p Zpools=<data> zfs1-rs

Note: If HASP validation fails at this point, export the zpool and try again (Bug 6745570).

Do NOT use FilesystemMountPoints property for ZFS!

Switch the resource group online:
# clrg online -M zfs1-rg

Test switching the resource group:
# clrg switch -n <other_node> zfs1-rg


CONFIGURATION STEPS OF NON-GLOBAL ZONES WITH HASTORAGEPLUS

This setup explains the configuration of non-global zones. The failover filesystems, the logical host and the GDS resource will be switched between non-global zones on different cluster nodes.

It's also possible to set up failover zones with the Sun Cluster HA for Solaris Containers agent, but the failover zones configuration is hard to update/patch. EIS currently does not cover the failover zones setup. It is recommended to use the approach below without the Container Agent.

Create a non-global zone on each node:
# mkdir -p /zones/zone1
# chmod 700 /zones/zone1
# zonecfg -z zone1
zone1: No such zone configured
Use 'create' to begin configuring a new zone.
zonecfg:zone1> create
zonecfg:zone1> set zonepath=/zones/zone1
zonecfg:zone1> set autoboot=true

Optional: Set up a network to access zone1 at all times, independent of a Sun Cluster logical host:
zonecfg:zone1> add net
zonecfg:zone1:net> set physical=e1000g0
zonecfg:zone1:net> set address=<ip_address>
zonecfg:zone1:net> end

zonecfg:zone1> verify
zonecfg:zone1> commit
zonecfg:zone1> exit
# zoneadm -z zone1 install
# zoneadm -z zone1 boot
# zlogin -C zone1

The zone is located in the local filesystem /zones of the global zone.

The zone can have the same name on each node.

Due to autoboot=true the non-global zone is booting with the system.

You should configure your zone. e.g: Define the Locale, Term, TZ, root password. Please refer to the Zones/Container EIS checklist for more details.

Configure /etc/nsswitch.conf of the non-global zone:
hosts: cluster files <any other hosts database>
netmasks: cluster files <any other netmasks database>
ipnodes: files nis [TRYAGAIN=0] (see bug 6257112)

Refer to page 10 in case of nscd performance problems!

Configure /etc/inet/hosts and /etc/inet/ipnodes of the non-global zone.
Note: Since Solaris 10 8/07 Update4, ipnodes is a link to the file hosts.

Enter all the logical hosts of non-global zones.
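For illustration only (the address and name below are placeholders, not taken from this checklist):
192.168.10.20 zone1-lh
Add one such entry for each logical host that a non-global zone has to resolve.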

Create a resource group:
# clrg create -n phy-host-a:zone1,phy-host-b:zone1 zone1-rg

Register HAStoragePlus:
# clrt register SUNW.HAStoragePlus


Set up the logical host resource and bind it to the zone1 resource group:
# clrslh create -g zone1-rg -h <logical_host> -N ipmp1@phy-host0,ipmp1@phy-host1 <logical_host>-rs

The logical host is for the non-global zone. The -N option is used for the ipmp group in the global zone.

Note: You can also configure a shared address with clrssa.
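As a hedged sketch only (the resource and host names are placeholders), the shared-address variant of the command above might look like:
# clrssa create -g zone1-rg -h <shared_host> -N ipmp1@phy-host0,ipmp1@phy-host1 <shared_host>-rs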

Create the SUNW.HAStoragePlus resource. You have 2 choices for mountpoints in zone1.
Example A: (identical mountpoints)
Entries for /etc/vfstab in the global zone:
/dev/md/zone1ds/dsk/d0 /dev/md/zone1ds/rdsk/d0 /global/db ufs 2 no logging
/dev/md/zone1ds/dsk/d1 /dev/md/zone1ds/rdsk/d1 /global/db/data ufs 2 no logging

Create the same mountpoints in the global zone and in non-global zone1:
# mkdir -p /global/db/data
Create the SUNW.HAStoragePlus resource:
# clrs create -g zone1-rg -t SUNW.HAStoragePlus -p AffinityOn=TRUE -p FilesystemMountPoints=/global/db,/global/db/data zone1-hasp-rs

Example B: (dedicated mountpath hierarchy)
Entries for /etc/vfstab in the global zone:
/dev/md/zone1ds/dsk/d0 /dev/md/zone1ds/rdsk/d0 /zone1fs/db ufs 2 no logging
/dev/md/zone1ds/dsk/d1 /dev/md/zone1ds/rdsk/d1 /zone1fs/db/data ufs 2 no logging

Create the mountpoints in the global zone:
# mkdir -p /zone1fs/db/data
Create the mountpoints in non-global zone1:
# mkdir -p /db/data
Create the SUNW.HAStoragePlus resource:
# clrs create -g zone1-rg -t SUNW.HAStoragePlus -p AffinityOn=TRUE -p FilesystemMountPoints=/db:/zone1fs/db,/db/data:/zone1fs/db/data zone1-hasp-rs

Keep in mind: The /zone1fs/db and the /zone1fs/db/data are mounted in the global zone. The /db and /db/data are mounted in the non-global zone. Beware that different zones can have different data in /db and /db/data.

Note: The filesystems in the Solaris global zone can be either a global or failover filesystem.

Alternative: It is also possible to use zfs with non-global zones:
# clrs create -g zone1-rg -t SUNW.HAStoragePlus -p Zpools=<zpool_name> <resource_name>
This command imports the zpool into the non-global zone1 with an alternate root path. The zfs setup is described one section earlier.

Switch the resource group online:
# clrg online -eM zone1-rg

OPTIONAL: Configure additional supported Sun Cluster Agents (resources) in the non-global zone.

Refer to SC configuration guide for supported SC agents in non-global zones.

Test via switch of the resource group:
# clrg switch -n phy-host:zone1 zone1-rg


CONFIGURATION STEPS FOR DATA SERVICE HA-NFS

IF
• HA-NFS is configured with HAStoragePlus FFS (failover file system) AND
• automountd is running/used,
THEN add
exclude: lofs
to the file /etc/system
(Bug 6222652)

If both of these conditions are met, LOFS must be disabled to avoid switchover problems or other failures. If one but not both of these conditions is met, it is safe to enable LOFS.

If you require both LOFS and the automountd daemon to be enabled, exclude from the automounter map all files that are part of the highly available local file system that is exported by HA-NFS.

If you use ZFS as exported file system you must set sharenfs property to off. e.g:
# zfs set sharenfs=off <zfs>

Verify with # zfs get sharenfs <zfs>

Use NFSv4 to mount external NFS file systems on cluster nodes to prevent interrupting NFS services! This is the default in /etc/default/nfs for NFS clients.

It's recommended to use NFS v3 for the NFS server. Activate this in /etc/default/nfs with NFS_SERVER_VERSMAX=3.
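A minimal sketch of the relevant /etc/default/nfs entry (the client comment is only a reminder of the default; adapt values to the site):
NFS_SERVER_VERSMAX=3
# NFS_CLIENT_VERSMAX is left at its default of 4 so that external mounts use NFSv4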

Bug 6619911: Failover fails because nfssys(NFS4_SVC_REQUEST_QUIESCE) failed. -> Fixed in 127111-10 (sparc) or higher.

Bugs 6325299 and 6297741: NFSv4 failover is much slower than NFSv3.

Check /etc/nsswitch.conf. Refer to page 10.

Edit /etc/hosts
In case of SF12K/15K & E25/20K add the internal dman network address. e.g:
10.10.10.8 xcat-g

In case of Mx000 add the internal sppp0 address. e.g:
192.168.224.2 sppp0

In case of more than one network interface in an IPMP group add the following dummy entry:
0.0.0.0 dummy

Problem Resolution 216754.

Bug 6181327

If you do not add these entries, HA-NFS will not fail over correctly.

Note: HA-NFS resolves all configured network addresses which are displayed by the ifconfig command! Additional entries may be necessary.

Optional: Customize the nfsd or lockd startup options.
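If such tuning is needed, a hypothetical /etc/default/nfs example could be (the parameter values are illustrative only, not a recommendation):
NFSD_SERVERS=512
LOCKD_LISTEN_BACKLOG=64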

Install the NFS Data Service via the JES installer on all nodes.

Choose Configure Later

Install the SunCluster Agent patches from the EIS-DVD if available.
Note the structure under .../sun/patch/SunCluster/3.2

Register the resource type nfs:
# clrt register SUNW.nfs


Set up the resource group:
# clrg create -n phy-host0,phy-host1 -p Pathprefix=/global/nfs1 nfs1-rg

The name of the resource group can be freely chosen!

Set up the logical host resource and bind it to the resource group:
# clrslh create -g nfs1-rg -h <logical_host> -N ipmp1@phy-host0,ipmp1@phy-host1 <logical_host>-rs

Use the <logical_host> for the resource name and add -rs on the end. The ipmp groups (-N option is optional) are automatically created at this point if they do not already exist.

Optional: Setup HAStoragePlus resource:
# clrt register SUNW.HAStoragePlus (check with clrt list)

for failover filesystem:

e.g. /etc/vfstab entry:
/dev/md/nfsset/dsk/d10 /dev/md/nfsset/rdsk/d10 /global/nfs1 ufs 2 no logging

for global filesystem:
e.g. /etc/vfstab entry:
/dev/md/nfsset/dsk/d10 /dev/md/nfsset/rdsk/d10 /global/nfs1 ufs 2 no global,logging
# clrs create -g nfs1-rg -t SUNW.HAStoragePlus -p FilesystemMountPoints=/global/nfs1 -p AffinityOn=True nfs1-hastp-rs

In case of ZFS:
# clrs create -g nfs1-rg -t SUNW.HAStoragePlus -p Zpools=nfs1zpool -p AffinityOn=True nfs1-hastp-rs

Bring the resource group online:
# clrg online -M nfs1-rg

Set up the directory for NFS state information in an online shared filesystem:
# cd /global/nfs1
# mkdir SUNW.nfs
# cd SUNW.nfs
# vi dfstab.nfs1-server-rs
(The name of the resource can be freely chosen; the dfstab file is named after it.)
share -F nfs -o rw /global/nfs1/data
# cd; mkdir /global/nfs1/data

The SUNW.nfs directory must be in the device group. Recommendation: the SUNW.nfs directory should NOT be visible to the NFS clients!

Create the resource of type SUNW.nfs in the resource group, matching dfstab.nfs1-server-rs:
# clrs create -g nfs1-rg -t SUNW.nfs nfs1-server-rs

Or, if HAStoragePlus is used, set up the required dependencies:
# clrs create -g nfs1-rg -t SUNW.nfs -p Resource_dependencies=nfs1-hastp-rs nfs1-server-rs

It's recommended to use Failover_mode SOFT for the nfs server resource:
# clrs set -p Failover_mode=SOFT nfs1-server-rs

Test via switch of the resource group:
# clrg switch -n phy-host nfs1-rg


CONFIGURATION STEPS FOR HA-ORACLE WITH SUNW.HASTORAGEPLUS (THIS DOES NOT DESCRIBE THE INSTALLATION OF ORACLE)

Change permissions for the Oracle installation, e.g. when using raw devices.
In case of SVM:
# chown oracle:dba /dev/md/metaset/rdsk/dn
# chmod 600 /dev/md/metaset/rdsk/dn

In case of VxVM:
# vxedit -g oradg set user=oracle group=dba volume
# cldg sync oradg

Verify with: # ls -lL <raw device>

SVM ONLY: If you lose the settings after a reconfiguration boot, edit the file /etc/minor_perm, add a line of the form md:0,XX.blk perms owner group, then run reboot -- -r
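For illustration only (XX stands for the minor number in question, and the permissions/owner/group must match your setup), the added line could look like:
md:0,XX.blk 0600 oracle dba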

Setup preconfiguration tasks:
• Oracle user and environment
• /etc/system values for Oracle
• install Oracle
• Oracle or Solaris authentication
• update listener.ora & tnsnames.ora

Test start and stop of Oracle DB outside the cluster!

Ensure you have kernel parameters in /etc/system for Oracle e.g:

set shmsys:shminfo_shmmax=4294967295
set shmsys:shminfo_shmmni=100
set semsys:seminfo_semmni=100
set semsys:seminfo_semmsl=256

Alternative for Solaris 10:

Use Solaris Resource Management (SRM). Example for the default project which will be used by HA-Oracle.

# projmod -s -K "project.max-shm-memory=(priv,4294967295,deny);\
project.max-shm-ids=(priv,100,deny);\
project.max-sem-ids=(priv,100,deny);\
process.max-sem-nsems=(priv,256,deny)" default

Check status via: # projects -l default

Note: You can also specify your own project for Oracle. Then associate the project with a resource group via the property RG_project_name, or with a resource via the property Resource_project_name.
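A minimal sketch, assuming a project named ora-proj has already been created on all nodes (the project name is a placeholder):
# clrg set -p RG_project_name=ora-proj oraDB1-rg
or, per resource:
# clrs set -p Resource_project_name=ora-proj DB1-server-rs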

Install the Oracle Data Service via JES installer on all nodes.

Select Configure Later.

Install the SunCluster Agent patches from the EIS-DVD ≥26JUN07.
Note the structure under .../sun/patch/SunCluster/3.2

You can either use clsetup to configure HA-Oracle with the Wizard (not in case of ZFS) or use the following commands:

Register the resource types:
# clrt register SUNW.oracle_server
# clrt register SUNW.oracle_listener
# clrt register SUNW.HAStoragePlus


Create the resource group:
# clrg create oraDB1-rg

Use the -n option if not run on all nodes.

Set up the logical host resource and bind it to the resource group:
# clrslh create -g oraDB1-rg -h <logical_host> -N ipmp1@phy-host0,ipmp1@phy-host1 <logical_host>-rs

Use the <logical_host> for the resource name and add -rs on the end. The ipmp groups (-N option is optional) are automatically created at this point if they do not already exist.

Create the HAStoragePlus resource for a failover filesystem:
# clrs create -g oraDB1-rg -t SUNW.HAStoragePlus -p FilesystemMountPoints=/global/oracle -p AffinityOn=TRUE DB1-hastp-rs
e.g. /etc/vfstab entry:
/dev/md/oradg/dsk/d10 /dev/md/oradg/rdsk/d10 /global/oracle ufs 2 no logging

If your database is on raw devices:
# clrs create -g oraDB1-rg -t SUNW.HAStoragePlus -p GlobalDevicePaths=oraset,/dev/md/oradg/rdsk/d10 -p AffinityOn=TRUE DB1-hastp-rs

Remember the filesystems should not be mounted!

You can use forcedirectio for database files NOT for binaries! See also Problem Resolution 212289 & 214434.
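As a sketch only (the metadevice and mount point are hypothetical), a separate data-only filesystem mounted with forcedirectio could be added to /etc/vfstab like this:
/dev/md/oradg/dsk/d20 /dev/md/oradg/rdsk/d20 /global/oradata ufs 2 no forcedirectio,logging
Keep the Oracle binaries on a filesystem that is mounted without forcedirectio.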

Bring resource group online:

# clrg online -emM oraDB1-rg

Now the filesystems are mounted!

Setup oracle server and listener on the node which has the oracle resource group online.

# clrs create -g oraDB1-rg -t SUNW.oracle_server -p ORACLE_HOME=/global/oracle/10.2.0 -p ORACLE_SID=DB1 -p Alert_log_file=/global/oracle/10.2.0/admin/DB1/bdump/alert_DB1.log -p Connect_string=scott/tiger -p Resource_dependencies=DB1-hastp-rs DB1-server-rs

# clrs create -g oraDB1-rg -t SUNW.oracle_listener -p ORACLE_HOME=/global/oracle/10.2.0 -p LISTENER_NAME=LISTENER -p Resource_dependencies=DB1-hastp-rs DB1-lsnr-rs

The fault monitor user should be present in Oracle.

Think about the timeouts of the Oracle server/listener resources. The defaults of SC3.1u4 are:
SUNW.oracle_server:6:Thorough_probe_interval default: 30
SUNW.oracle_server:6:Retry_count default: 2
SUNW.oracle_server:6:Retry_interval default: 1330
SUNW.oracle_server:6:Probe_timeout default: 300
SUNW.oracle_listener:5:Thorough_probe_interval default: 30
SUNW.oracle_listener:5:Retry_count default: -1
SUNW.oracle_listener:5:Retry_interval default: 600
SUNW.oracle_listener:5:Probe_timeout default: 180

To change a value use e.g:
# clrs set -p Probe_timeout=300 DB1-server-rs
# clrs set -p Thorough_probe_interval=30 DB1-server-rs

The timeout values depend on the specific configuration of the system. You may have to test which values work best. The default values of SC3.2 are normally fine.

Switch/test the resource group:
# clrg switch -n <other_node> oraDB1-rg


CONFIGURATION STEPS FOR ORACLE/SAP CLUSTER

Refer to the document on EIS-DVD in .../sun-internal/docs/SAP-INSTALL

We are currently enquiring whether this document is applicable to SunCluster 3.2


CONFIGURATION STEPS FOR RAC CLUSTER

• Oracle RAC is only supported and working in the global zone.
• CVM is only supported on SPARC.
• In case of 10g you should install the Oracle CRS before the database.
• Procedure only tested with Oracle10g R2.
• EIS does NOT cover Oracle 9i.

Oracle requirements:
• The public & private interface names associated with the network adapters for each network must be the same on all nodes. e.g: Node1 (ce1) to Node2 (ce1). See http://dba-services.berkeley.edu/docs/oracle/manual-10gR2/install.102/b14205/presolar.htm#BABJHGBE
• If you install Oracle 10g RAC then it's NOT possible to use capitals within the hostname. Use lower case letters! (Oracle Bug 4632899).
• /var should NOT be a separate partition (Bug 4891227).
• Switches for Oracle10g & Sun Cluster private interconnect.

Check that the CVM license is installed in case of VxVM.

vxlicrep, vxlicinst

Set up the Oracle Group and User, e.g:
Add the following lines to /etc/group:
dba:*:520:root,oracle
oinstall:*:520:oracle
# useradd -u 120 -c "Oracle DBA" \
-d /oracle -m -g oinstall -G dba \
-s /bin/sh oracle
# chown -R oracle:dba /oracle
# passwd oracle

Create the .rhosts file for the oracle user in the oracle user's home directory, e.g:
<node0> oracle
<node1> oracle
This is necessary so that oracle can use rcp (without a password) between the cluster nodes during the Oracle installation. Details are available in the Oracle documents!

Alternative:
On all nodes run:
# su - oracle
# ssh-keygen -t rsa

Important at this step: Hit enter 3 times to use ssh without a password.
On node1:
# cat ~/.ssh/id_rsa.pub | ssh node0 "cat >> ~/.ssh/authorized_keys"
On node0:
# cat ~/.ssh/id_rsa.pub | ssh node1 "cat >> ~/.ssh/authorized_keys"

Take care of your IPMP config and add all interfaces of the used IPMP group to the .rhosts file. If necessary add entries for each host, e.g:
node0-ce0
node0-ce1
Test the login without a password before starting the Oracle installation.

Keep in mind that the authorized_keys file will be written on the remote host.


Typical example of .profile for Oracle10g RAC:
ORACLE_BASE=/oracle
ORACLE_HOME=$ORACLE_BASE/product/10.2.0/db_1
CRS_HOME=$ORACLE_BASE/product/10.2.0/CRS
TNS_ADMIN=$ORACLE_HOME/network/admin
DISPLAY=clustergw:2
if [ `/usr/sbin/clinfo -n` -eq 1 ]; then
  ORACLE_SID=sun1
fi
if [ `/usr/sbin/clinfo -n` -eq 2 ]; then
  ORACLE_SID=sun2
fi
PATH=/usr/ccs/bin:$ORACLE_HOME/bin:$CRS_HOME/bin:/usr/bin:/usr/sbin
export ORACLE_BASE ORACLE_HOME TNS_ADMIN CRS_HOME
export ORACLE_SID PATH DISPLAY
Available on the DVD under .../sun/tools/MISC/profile-oracleRAC

Bypass NIS in /etc/nsswitch.conf when NIS is used:
passwd: files nis [TRYAGAIN=0]
publickey: files nis [TRYAGAIN=0]
project: files nis [TRYAGAIN=0]
group: files

Install the RAC Service from the CD via the JES installer on all nodes. You always need SUNWscucm SUNWudlm SUNWudlmr.
• For SVM you require SUNWscmd.
• For CVM you require SUNWcvm & SUNWcvmr.
• For Hardware RAID support no package is required!
• SUNWscor is needed so that the data service wizard works correctly.

Shared QFS is currently supported on Hardware RAID or on top of SVM. Not working with CVM!

On SPARC platforms only: Install the ORACLE udlm:
# pkgadd -d . ORCLudlm

Version 3.3.4.9 for Oracle 64bit. Available on EIS-DVD (internal).

Install the SunCluster Agent patches from the EIS-DVD if available.
Note the structure under .../sun/patch/SunCluster/3.2

Set up shared memory in /etc/system. Example:
set shmsys:shminfo_shmmax=268435456
For Oracle 10g:
set noexec_user_stack=1
Optional:
set semsys:seminfo_semmap=1024
set semsys:seminfo_semmni=2048
set semsys:seminfo_semmns=2048
set semsys:seminfo_semmsl=2048
set semsys:seminfo_semmnu=2048
set semsys:seminfo_semume=200
set shmsys:shminfo_shmmin=200
set shmsys:shminfo_shmmni=200
set shmsys:shminfo_shmseg=200
forceload: sys/shmsys
forceload: sys/semsys
forceload: sys/msgsys

Alternative for Solaris 10:

Use Solaris Resource Management (SRM). An Example is mentioned in the HA-Oracle section on page 29.


Reboot required:
• to activate the /etc/system values.
• to activate the CVM licence.

SC3.2 and higher does NOT start udlm automatically. You must configure the rgm before udlm will start.

SETUP RGM FOR ORACLE RAC DATASERVICE

Configure the RAC Manageability Feature. Run clsetup and use Option 3 'Data Services' or the following commands, e.g. for a 2-node cluster with OBAN:
# clrt register SUNW.rac_framework
# clrt register SUNW.rac_udlm
# clrt register SUNW.rac_svm
# clrg create -n phy-host1,phy-host2 -p maximum_primaries=2 -p desired_primaries=2 -p rg_mode=Scalable rac-framework-rg
# clrs create -g rac-framework-rg -t SUNW.rac_framework rac-framework-rs
# clrs create -g rac-framework-rg -t SUNW.rac_udlm -p port=7000 -p resource_dependencies=rac-framework-rs rac-udlm-rs
# clrs create -g rac-framework-rg -t SUNW.rac_svm -p resource_dependencies=rac-framework-rs rac-svm-rs

If you use clsetup then your udlm port is 6000, which may conflict with sshd; see SunAlert 200810. If you need another port you should run the commands above, which use port 7000 for udlm. The port cannot be changed while the dlm is running!

Use SUNW.rac_cvm for CVM instead of SUNW.rac_svm.

Switch the resource group online:
# clrg online -emM rac-framework-rg

Increase step4 timeout if using SVM/CVM and RAC is managed by the RGM.

# clrs set -p Svm_step4_timeout=360 rac-svm-rs

The value 360 is an example! See FIN I1139-1. Determine the best value for your configuration!

This is necessary when your OBAN/CVM shared disk group setup is large or complex.

Use the cvm_step4_timeout property in case of CVM.
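By analogy with the SVM command above (the resource name is a placeholder and 360 remains an example value):
# clrs set -p Cvm_step4_timeout=360 rac-cvm-rs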

TASKS WHEN USING SVM (OBAN) IN RAC CLUSTER

Create a multi-owner disk set:
# metaset -s <setname> -M -a -h <nodelist>
Refer to page 20 for normal disk set and volume setup.

NOTE: If you get the message
metaset: host1: rac-ds: node host2 is not in membership list.
you should add to /var/run/nodelist:
1 <host1> 172.16.193.1
2 <host2> 172.16.193.2

Check if the multi-owner disk set is correctly configured with:
# cldg show <setname>
and
# cldg status

The nodelist will be created after configuration of SUNW.rac_framework.
Technical Instruction 214832.

Set up the Mediator Host for each diskset when the cluster matches the two-hosts and two-strings criteria:
# metaset -s <setname> -a -m <node1> <node2>

See also % man mediator


The owner of the volumes and disksets must be oracle. Run the following commands on all nodes which can own the diskset for oracle:
# chown oracle:dba volume-list
# chmod u+rw volume-list
e.g. volume-list: /dev/md/racset/[r]dsk/d0, which links to /devices/pseudo/md@0:3*

Restrictions for a multi-owner disk set:
• only raw devices supported!
• do not create filesystems!

Check the permissions with:
# ls -lL /dev/md/racset/rdsk

So permissions should be set on /devices/...

TASKS WHEN USING CVM IN RAC CLUSTER

ONLY on nodes which have NO storage connection you can do the following instead of a whole VxVM installation:
Add the clusterwide vxio number to /etc/name_to_major:
# vi /etc/name_to_major
vxio NNN
Initialize the vxio entry:
# drvconfig -b -i vxio -m NNN

You must enable VxVM cluster feature on all nodes!

Set up the shared disk group and volumes. Example:
For safety reasons run vxdctl enable on all nodes before you start with the setup.
# vxdisksetup -i <see vxdisk list>
e.g:
# vxdisksetup -i HDS99600_0
# vxdisksetup -i T40_5
# vxdisksetup -i SUN35100_3
# vxdg -s init racdg <list your disks>
Attention VxVM 4.1: It's possible that your VxVM master will panic. See CR 6513521. Fixed with VxVM 4.1 MP2 (EIS-DVD ≥ 27MAR07).

Only possible on the Master! Check via:
# vxdctl -c mode
Restrictions for a shared diskgroup:
• only raw devices supported!
• do not create filesystems!
• do not register the diskgroup in the cluster.
• cannot be managed as a cluster device group.

Volumes can be set up via the GUI. Enable DRL!
The usetype of the volumes must be "gen". e.g:
# vxassist -g racdg -U gen make system 400m

The owner of the raw devices must be the oracle user. e.g:
# vxedit -g racdg set user=oracle group=dba system

TASKS WHEN USING SHARED QFS IN RAC CLUSTER

Requirements (for all details refer to the Sun Cluster configuration guide):
• Attention Bug 6420952: If you have problems starting QFS on OBAN within cluster reconfiguration you need IDR123295-01 (Esc: 1-19515211).
• Can be used on top of HWraid or OBAN with Solaris 10. Not working on top of CVM.
• QFS 4.2 or higher for HWraid, or QFS 4.4 or higher on top of OBAN.
• Oracle9i RAC (9.2.0.5+OraclePatch3566420)/Oracle10gR1RAC (10.1.0.3) and above.
• Solaris10x86 min. requires QFS4.5 and Oracle10gR2RAC (10.2.0.1) and above.
• A QFS license is available in Sun Cluster Advanced Edition for Oracle RAC, which does not include the Sun Cluster Server license. Otherwise you need a separate license for QFS.

DO ALL STEPS FOR QFS ON ALL NODES!


Install the QFS software on all nodes: packages SUNWqfsr and SUNWqfsu.

See also EIS checklist for QFS for more details...

Install QFS patches from EIS-DVD if applicable.

Setup shared QFS, e.g:
* Laying out partitions
* Edit /etc/opt/SUNWsamfs/mcf:
racfs 10 ma racfs - shared
/dev/did/dsk/d2s0 11 mm racfs -
/dev/did/dsk/d3s6 12 mr racfs -

For the filesystem layout talk to Oracle!

In case of Multi-Owner Diskset use /dev/md/racds/dsk/d0

Edit /etc/opt/SUNWsamfs/samfs.cmd:
sync_meta = 1
mh_write
forcedirectio (can have bad performance issues in case of Oracle/SAP)
qwrite
stripe = 1
nstreams = 1024
rdlease = 600
wrlease = 600
aplease = 600

Set rdlease, wrlease and aplease for optimum performance.

Further details available in the EIS checklist for QFS

Make the new configuration available:
# samd config

Use sam-fsd command for validation.

Create hosts.family-set-name for each QFS.
Edit /etc/opt/SUNWsamfs/hosts.racfs:
phy-host1 clusternode1-priv 1 - server
phy-host2 clusternode2-priv 2 -

Create the QFS filesystem (on 1 node only):
# sammkfs -S racfs

Create the global mountpoint:
# mkdir /global/racfs

Add the filesystem to /etc/vfstab:
racfs - /global/racfs samfs - no shared

Mount the filesystem (QFS metadata server first).

# mount racfs
To determine the Metadata Server run:
# samsharefs -R racfs

Set ownership and permissions:
# chown oracle:dba /global/racfs
# chmod 755 /global/racfs


Setup Storage Resources for Oracle Files

Configure the storage resources using clsetup:
Select 3) Data Services, 4) Oracle Real Application Clusters, 1) Oracle RAC Create Configuration, 2) Storage Resources for Oracle Files

or use the following commands.

A) In case of a multi-owner diskset or CVM, set up a scalable device group resource group:

# clrg create -n phy-host1,phy-host2 -p Desired_primaries=2 -p Maximum_primaries=2 -p RG_affinities=++rac-framework-rg -p RG_mode=Scalable scal-racdg-rg

# clrt register SUNW.ScalDeviceGroup

In case of a multi-owner diskset:
# clrs create -g scal-racdg-rg -t SUNW.ScalDeviceGroup -p Resource_dependencies=rac-svm-rs -p DiskGroupName=racdg scal-racdg-rs

In case of a CVM diskgroup:
# clrs create -g scal-racdg-rg -t SUNW.ScalDeviceGroup -p Resource_dependencies=rac-cvm-rs -p DiskGroupName=racdg scal-racdg-rs

# clrg online -emM scal-racdg-rg

B) In case of shared QFS, set up the qfs resource group:

# clrg create -n phy-host1,phy-host2 qfs-meta-rg
(Add -p RG_affinities=++scal-racdg-rg if you use QFS on top of SVM)

# clrt register SUNW.qfs

# clrs create -g qfs-meta-rg -t SUNW.qfs -p qfsfilesystem=/global/racfs qfs-racfs-rs
(Add -p Resource_dependencies_offline_restart=scal-racdg-rs if you use QFS on top of SVM)

# clrg online -emM qfs-meta-rg


C) In case of a shared QFS or NAS filesystem, set up a scalable mountpoint resource group:

# clrg create -n phy-host1,phy-host2 -p Desired_primaries=2 -p Maximum_primaries=2 -p RG_mode=Scalable scal-racfs-rg
(Add -p RG_affinities=++scal-racdg-rg if you also use a scalable device group)

# clrt register SUNW.ScalMountPoint

Resource for QFS:
# clrs create -g scal-racfs-rg -t SUNW.ScalMountPoint -p mountpointdir=/global/racfs -p filesystemtype=s-qfs -p targetfilesystem=racfs -p Resource_dependencies=qfs-racfs-rs scal-racfs-rs
(Add -p Resource_dependencies_offline_restart=scal-racdg-rs if you also use a scalable device group)

Resource for NAS:
# clrs create -g scal-racfs-rg -t SUNW.ScalMountPoint -p mountpointdir=/global/racfs -p filesystemtype=nas -p targetfilesystem=<nas-device>:racfs scal-racfs-rs
(Add -p Resource_dependencies_offline_restart=scal-racdg-rs if you also use a scalable device group)
# clrg online -emM scal-racfs-rg

Setup of Resources for Interoperation with Oracle 10g R2

Configure the Oracle RAC database resources using clsetup:
Select 3) Data Services, 4) Oracle Real Application Clusters, 1) Oracle RAC Create Configuration, 3) Oracle RAC Database Resources

or use the following commands (verification still in progress!)

Create crs resource:

# clrt register SUNW.crs_framework

# clrs create -g rac-framework-rg -t SUNW.crs_framework -p Resource_dependencies=rac-framework-rs -p Resource_dependencies_offline_restart=scal-racdg-rs{local_node} crs_framework-rs

Resource_dependencies_offline_restart is only required if you have set up a scalable device group resource group.

Create the proxy resource group for the RAC database server.
Note: If you do not use a scalable device group then remove the dependencies on it.

# clrg create -n phy-host1,phy-host2 -p Maximum_primaries=2 -p Desired_primaries=2 -p RG_mode=Scalable -p RG_affinities=++rac-framework-rg,++scal-racdg-rg rac-proxy-rg

# clrt register SUNW.scalable_rac_server_proxy

# clrs create -g rac-proxy-rg -t SUNW.scalable_rac_server_proxy -p ORACLE_HOME=/oracle/product/10.2.0/db -p CRS_HOME=/oracle/product/10.2.0/crs -p DB_NAME=sun -p ORACLE_SID{phy-host1}=sun1 -p ORACLE_SID{phy-host2}=sun2 -p Resource_dependencies=rac-framework-rs -p Resource_dependencies_offline_restart=scal-racdg-rs,crs_framework-rs rac-proxy-rs

# clrg online -emM rac-proxy-rg


Create the Oracle CRS resource. This part is still in progress and will be added in the future. In the meantime refer to the Oracle documentation.


UPGRADE TO SUNCLUSTER 3.2
A Sun-internal-only document is available for download at: https://cepedia.sfbay.sun.com/index.php?title=Sun_Cluster_upgrade

The document will become public soon. It's in the Blueprint process.


FINS/FABS FOR SUNCLUSTER

Nr# Date Description Implementation

239646 Jul/14/08 Dual hosted Sun StorageTek 3120 JBOD in Single-Bus configurations may experience parity errors.

REACTIVE

I1162-1 Mar/09/05 A two-node Sun Cluster may encounter SCSI-2 Reservation Conflicts when connected to IBM Shark Storage Products.

REACTIVE (As Required)

I1139-1 Oct/22/04 No end-user supported procedure to set the UCMM timeouts on running Sun Cluster prior to the version 3.1 update 1.

REACTIVE (As Required)

I0657-1 Mar/30/01 SCSI Bus resets in multi-initiator SCSI configurations might lead to a system panic.

CONTROLLED PRO-ACTIVE (per Sun Geo Plan)


SYSTEM COMPLETION

Make a copy of the cluster configuration:
# cluster export -o /var/cluster/Config_after_EIS

Reboot => is everything OK? Test the Cluster Manager.
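As a quick post-reboot sanity check (a sketch only; adapt to the actual configuration), the status can also be verified from the command line:
# cluster status
# clrg status
# clrs status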

EXPLORER / INSTINFO & ANALYSIS

Run explorer with the instinfo option (see relevant server checklist):

On SunFire 3800-6800: explorer -i -w default,scextended
On SunFire V1280 / E2900: explorer -i -w default,1280extended
On other platforms: explorer -i

Refer also to Technical Instruction 209663 for importance of Explorer data.

Run SRAS/CLI (from EIS-DVD) locally to analyse the explorer output files.

cd /cdrom/.../sun/tools
cd SRAS
sh run-sras.sh

Examine the results. If necessary repair & repeat the Explorer/SRAS sequence.

Examine the resulting report:

cd /var/tmp/SRAS
more *EIS.Report.txt

Mail Explorer file to:EMEA: [email protected]: [email protected]: [email protected]

The explorer output file is normally in directory /opt/SUNWexplo/output with filename explorer.<hostid>.<hostname>-<date>.tar.gz

If e-mail from the customer site is not possible, please transport the file to your office & send the e-mail from there.

If this is a JES (Orion) cluster installation, update/close the GETS opportunity.

This is needed so that support entitlement can take place.


HANDOVER

MANDATORY: Ensure that the licenses are available for Sun Cluster Framework & Agents.

Short briefing: the configuration.

Perform Installation Assessment tests as described in the EIS “Cluster Verification” document.

See EIS planning documentation.

Short briefing: procedure for opening calls.

Hand over telephone number for warranty customers.

If temporary licenses have been installed then draw customer's attention to this fact!

Complete documentation and hand over to customer.

Obtain customer sign-off for service completion.

Copies of the checklists are available on the EIS web pages or on the EIS-DVD. We recommend that you always check the web pages for the latest version.

Comments & RFEs are welcome. Please use ServiceDesk (search Tasktype & enter "EIS") or mail to [email protected] if no SWAN access is available (typically for a partner).

Thanks are due to Jürgen Schleich, SunCluster PRE, TSC-Storage Munich, Germany.
