
Best Practice ETERNUS DX S3 Storage Cluster


  • Best practice [ETERNUS DX S3 Storage Cluster] FUJITSU CONFIDENTIAL

    Page 1 of 27 fujitsu.com/eternus

    Best practice FUJITSU Storage ETERNUS DX S3 Storage Cluster Technical Info

    This document will give you some technical information about the new ETERNUS Storage Cluster feature. It will help you to understand how to configure, manage and use this new function which enables an ETERNUS DX S3 storage system to get high availability by connecting two ETERNUS DX S3 storage devices.

    Content

    Introduction
    Overview
    Requirements

    Software
    Licenses

    Storage Cluster setup and configuration
    Storage Cluster configuration
    Storage Cluster allocating Business Volumes
    Storage Cluster Controller setup
    Storage Cluster processing
    Storage Cluster bi-directional information and setup

    Recovery procedure caused by defect RAID Group
    Preconditions
    1. Step
    2. Step
    3. Step
    4. Step
    5. Step
    6. Step

    Appendix
    Fibre Channel Switch read-only discovery
    Status of TFO Group Information
    Recommendations
    TFO Checklist
    Abbreviations


    Introduction

    The ETERNUS Storage Cluster is a high-availability feature of the ETERNUS DX S3 family of storage devices. Volumes used in a Storage Cluster configuration are called Transparent Failover Volumes (TFOV). They are paired and mirrored from the Primary storage system to the Secondary storage system by Remote Equivalent Copy (REC), the array-based replication of the Advanced Copy function, running in synchronous mode. In the normal state, the Fibre Channel (FC) ports configured on the Primary site are linked up and the ones on the Secondary site are linked down, so that business servers issue I/O to the Primary storage system.

    The Storage Cluster Controller is connected by LAN to both ETERNUS DX S3 storage devices for heartbeat monitoring. In a Storage Cluster configuration set up in automatic failover mode, it is responsible for avoiding any kind of split-brain scenario. If the Primary storage system crashes, the Storage Cluster Controller is required for the decision to switch operation over to the Secondary storage system (automatic failover). If the Storage Cluster Controller is not connected, a user has to perform a manual failover to the Secondary storage system.

    When a failover is invoked, the FC ports configured on the Primary storage system link down and the ones on the Secondary storage system link up, taking over the volume information including the WWN/WWPN of the Primary site, so that business servers issue I/O to the Secondary storage system. To achieve this functionality, a user needs to configure the Storage Cluster feature using ETERNUS SF.

    Overview

    The ETERNUS Storage Cluster is a function that gives the storage system high availability by connecting two ETERNUS DX S3 storage systems. One of them is the Primary storage system and the other is the Secondary storage system.

    If the Primary (active) storage system is no longer available due to a hardware failure or an unexpected disaster, the I/O paths (host connections) of the working business servers are switched to the mirrored Secondary (standby) storage system. In Auto Mode configuration this failover is transparent for both servers and applications and ensures uninterrupted operations.

    Additionally, a user can initiate a manual failover from the Primary (active) storage system to the Secondary (standby) storage system at any time. One case is a RAID Group hosting the volumes (TFOV) used in the Storage Cluster configuration being destroyed by several disk failures while the ETERNUS DX S3 storage system itself is still up and running. Another case is storage system downtime for hardware maintenance or firmware upgrades.

    The picture above illustrates the functional design of the Storage Cluster feature in a single-sided Transparent Failover (TFO) configuration.


    Requirements

    The Storage Cluster feature requires two ETERNUS DX S3 storage systems connected as a pair. Each of them can be an ETERNUS DX100 S3, ETERNUS DX200 S3, ETERNUS DX500 S3 or ETERNUS DX600 S3. The Storage Cluster feature requires firmware version V10L20-000 or later.

    In addition you need a server running the ETERNUS SF V16.1 Manager software. The operating system of that server can be Windows, Linux or Solaris. Read the ETERNUS SF Express V16 / Storage Cruiser V16 / AdvancedCopy Manager V16 Installation and Setup Guide for details about the supported versions of each operating system.

    It is strongly recommended to use a dedicated server for running the Storage Cluster Monitoring (ETERNUS SF V16.1 Storage Cruiser Agent) software.

    Note: The operating system of this server must be Windows based. This might change in the future.

    The zoning at the Fibre Channel (FC) switches for the business server connections to the ETERNUS DX S3 storage systems must be WWPN-based FC zoning only.

    Note: FC ports used by the Storage Cluster feature must not be members of any Port Group on either ETERNUS DX S3 storage system and should have exactly the same settings (speed, topology etc.) at the Primary (active) and the Secondary (standby) storage system. In addition, Host Affinity must be enabled for these FC ports. This can be checked within ETERNUS SF V16.1 for each ETERNUS DX S3 storage system under Connectivity -> FC Port.

    The volumes (TFOV) used by the Storage Cluster feature must be created with identical size, and the host LUN numbers used in each LUN Group must be identical on both ETERNUS DX S3 storage systems.

    While configuring the Storage Cluster functionality, make sure that nobody holds a lock on (is working with the ETERNUS DX S3 HW-GUI of) either of the two ETERNUS DX S3 storage systems involved.

    Ports used for the REC Path of the Storage Cluster feature can be configured as RA or CA/RA. The latter is not recommended because it affects the performance of the Storage Cluster volumes (TFOV) used by the business servers. In addition, business servers using the Storage Cluster feature cannot be attached to CA/RA ports. These business server connections need dedicated CA ports. The REC Path must be configured using ETERNUS SF V16.1 or the ETERNUS DX S3 HW-GUI, otherwise the Storage Cluster setup cannot be configured.

    Note: Do not remove LUNs from a LUN Group used by a TFO Group which is in Phase = Maintenance!

    Software

    The Storage Cluster functionality can be set up, configured, managed and checked through the Web Console of the ETERNUS SF V16.1 Manager software. There are two options for initiating a failover from the Primary (active) storage system to the Secondary (standby) storage system:

    - Automatic Failover
    - Manual Failover

    The Storage Cluster Monitoring function is provided by the ETERNUS SF V16.1 Storage Cruiser Agent software.

    Licenses

    For each discovered ETERNUS DX S3 storage system used for the Storage Cluster feature you need to purchase and register the following licenses:

    - ETERNUS SF Storage Cruiser V16 Standard License
    - ETERNUS SF Storage Cruiser V16 Storage Cluster Option
    - ETERNUS SF AdvancedCopy Manager V16 Remote Copy License


    Storage Cluster setup and configuration

    Install the ETERNUS SF V16.1 Manager software on one of your servers. Additional information about the installation can be found in the ETERNUS SF Express V16 / Storage Cruiser V16 / AdvancedCopy Manager V16 Installation and Setup Guide. Afterwards, discover your two ETERNUS DX S3 storage systems and register, for both of them, all licenses required by the Storage Cluster feature.

    It is strongly recommended to discover the Fibre Channel (FC) switches as well, in order to receive SNMP traps in case of problems at the switches. This can be done in a read-only way so that ETERNUS SF V16.1 is not able to modify any switch configuration. See the Appendix for details about the read-only discovery of FC switches.

    It is also recommended to install the ETERNUS SF V16 Storage Cruiser Agent software on your business servers and to discover these servers in ETERNUS SF V16.1 as well. This enables the graphical end-to-end correlation view within the ETERNUS SF V16 Manager GUI for these servers.

    Set up the WWPN zoning between the ETERNUS DX S3 storage systems and your business servers at the FC switches first. Afterwards, start the setup of the Storage Cluster functionality.

    The setup and configuration of the Storage Cluster feature is done in the ETERNUS SF V16.1 Manager GUI. All related settings can be found under the Connectivity and the Storage Cluster selections in the category pane of a discovered ETERNUS DX S3 storage system.

    As a rule of thumb, use self-explanatory names for all configuration elements used by the Storage Cluster functionality, such as FC Hosts (e.g. PRI_SRV01_HBA0 and SEC_SRV01_HBA0), LUN Groups (e.g. PRI_SRV01_LG and SEC_SRV01_LG) and TFO Groups (e.g. DX600_to_DX500 or DX500#1_DX500#2). Be aware that names used for the configuration of the Storage Cluster feature should not exceed 16 characters. In addition, always start the setup of the Storage Cluster feature at the Primary (active) ETERNUS DX S3 storage system within the ETERNUS SF V16.1 Manager GUI.
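The 16-character limit is easy to overlook when names are planned in a spreadsheet or text file first. The following is a minimal sketch, not part of ETERNUS SF, that checks a list of planned names against that limit. The function name `names_too_long` and the last sample name are hypothetical; the other names are the examples used in this document.

```python
MAX_NAME_LENGTH = 16  # limit stated in this document for Storage Cluster names


def names_too_long(names, limit=MAX_NAME_LENGTH):
    """Return the names that exceed the character limit, in input order."""
    return [name for name in names if len(name) > limit]


planned_names = [
    "PRI_SRV01_HBA0",         # FC Host on the Primary system
    "SEC_SRV01_HBA0",         # FC Host on the Secondary system
    "PRI_SRV01_LG",           # LUN Group
    "DX600_to_DX500",         # TFO Group
    "DX500#1_DX500#2",        # TFO Group
    "PRIMARY_SERVER01_HBA0",  # hypothetical name that is too long
]

rejected = names_too_long(planned_names)
for name in rejected:
    print(f"{name}: {len(name)} characters, limit is {MAX_NAME_LENGTH}")
```

Running the check before entering the names in the GUI avoids renaming FC Hosts or LUN Groups after they are already referenced by a Host Affinity.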


    Storage Cluster configuration

    Switch to the Primary (active) ETERNUS DX S3 storage system within the ETERNUS SF V16.1 Manager GUI and enter the Storage Cluster section. There you will find everything related to the Storage Cluster TFO Group and an entry point for creating a REC Path used by the Storage Cluster feature.

    Start by creating a REC Path. The REC Path is configured with the well-known REC Path configuration wizard. Additional information about the creation of an ETERNUS DX S3 REC Path is available in the ETERNUS SF V16 documentation. Supported protocols for the REC Path of the Storage Cluster feature are FC and iSCSI. It is strongly recommended to use at least one port of each CM of the two ETERNUS DX S3 storage systems for the REC Path configuration.

    Note: Because the REC configuration always runs in synchronous mode, you should change the Priority Level to the highest number at each ETERNUS DX S3 storage system involved in the Storage Cluster functionality. This setting can be made using the ETERNUS DX S3 HW-GUI only. Refer to Advanced Copy -> Settings -> Copy Path -> Modify REC Multiplicity to modify the Priority Level.

    After the REC Path configuration is done, create your Storage Cluster TFO Group. The Set button in the Action pane can be used to create a new Storage Cluster TFO Group or to modify an existing, selected one.

    Select the Remote Disk Array from the list of available storage systems. Because we started the creation of our Storage Cluster TFO Group at the Primary (active) ETERNUS DX S3 storage system, the Local option must be selected for the Primary Disk Array. Enter the name of this TFO Group and choose your Failover Mode.

    The Split Mode settings are related to the status of the REC Path. To achieve application consistency in any case of automatic failover to the Secondary (standby) ETERNUS DX S3 storage system, you may select Read as the Split Mode.

    Note: If you select the Read option, the business servers will get an I/O error for write requests if the REC Path is broken.


    Last but not least you need to select the Fibre Channel (FC) port pairs used by the Storage Cluster feature.

    Note: You cannot use CA/RA ports for the creation of the Fibre Channel (FC) port pairs used by the Storage Cluster feature.


    Storage Cluster allocating Business Volumes

    The next step is to register the WWNs of all Host Bus Adapters (HBA) of your business servers. Open the Connectivity category, select Host and press the Add FC Host button to set up the FC Host.

    You should find the WWNs of your business servers' HBAs already connected to Fibre Channel (FC) ports (e.g. CM#0 CA#0 Port#3) at the Primary (active) ETERNUS DX S3 storage system.

    Identify the Channel Adapter (CA) port on that ETERNUS DX S3 storage system and register a name for each WWN. As mentioned above, use dedicated names (e.g. PRI_SRV01_HBA0) to identify these FC Hosts for future reference. Note down the WWNs, because each FC Host has to be registered manually at the Secondary (standby) ETERNUS DX S3 storage system later on.

    Enter the name of the FC Host, select the Host Response, press the Next button and confirm your settings on the next screen. Repeat these steps for the WWNs of all HBAs of your business servers connected to the Primary (active) ETERNUS DX S3 storage system.

    After you have finished this part, set up the LUN Group including the volumes for your business servers. Select Affinity/LUN Group in the Connectivity category of the Primary (active) ETERNUS DX S3 storage system and create the LUN Group.


    Again, choose a self-explanatory name for the LUN Group used by the Storage Cluster feature. Enter the Host LUN Number for the selected volumes and add them to the list of Assigned Volumes. Press the Next button and confirm your settings on the next screen.

    Note: Write down the LUN No. and the Capacity of each volume added to the list of Assigned Volumes. You will need them to create the corresponding LUN Group at the Secondary (standby) ETERNUS DX S3 storage system later on.

    Note: After adding the volumes to the LUN Group, check the reservation status of each volume using the ETERNUS DX S3 HW-GUI. If persistent reservations are still left over, remove them from each volume first. Select the volume and use the Release Reservation action button for this purpose. The picture below shows details about Reservation.


    After finishing this part of the Storage Cluster feature setup, you need to create the Host Affinity for your business servers. Switch to the Host Affinity section in the Connectivity category pane of the Primary (active) ETERNUS DX S3 storage system.

    Press the Create button and configure the Host Affinity using the created FC Host and the associated LUN Group attached to the Fibre Channel (FC) port of the Primary (active) ETERNUS DX S3 storage system. Repeat this process for the WWN of each Host Bus Adapter (HBA) of your business servers.

    Note: You cannot create the Host Affinity using Host Group, Port Group and LUN Group at the ETERNUS DX S3 HW-GUI. This does not work with the Storage Cluster feature.

    Important Note: After removing a TFO volume from the LUN Group at the Secondary (standby) ETERNUS DX S3 storage system, you must change the unique identifier (UID) of that volume before using it as a Standard Volume. For this purpose you can use the set volume command of the ETERNUS DX HW CLI with the -uid parameter. (See the ETERNUS CLI User's Guide for details.)


    The Storage Cluster feature configuration is now nearly done at the Primary (active) ETERNUS DX S3 storage system. Next, make the corresponding settings at the Secondary (standby) ETERNUS DX S3 storage system. Manually register the WWNs of your business servers' Host Bus Adapters (HBA), which you noted down while creating each Host at the Primary (active) ETERNUS DX S3 storage system. Use self-explanatory names (e.g. SEC_SRV01_HBA0, SEC_SRV01_HBA1) for this process.

    Enter all needed information in the input fields of each FC Host and add it to the list. Press the Next button to confirm your settings.

    Afterwards, create the corresponding Affinity/LUN Group and the Host Affinity for each Host Bus Adapter (HBA) of your business servers at the Secondary (standby) ETERNUS DX S3 storage system. Keep in mind that the corresponding Affinity/LUN Group (e.g. SEC_SRV01_LG) must use the same number of volumes, with exactly the same LUN No. and exactly the same Capacity for each volume added to the list of Assigned Volumes. The procedure for all of these tasks is the same as at the Primary (active) ETERNUS DX S3 storage system.
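Because the LUN No. and Capacity values are noted down by hand on the Primary side and re-entered on the Secondary side, a quick comparison of the two notes can catch mismatches before the TFO Group is used. The sketch below is an illustration only: it assumes the notes are kept as a mapping of host LUN number to capacity in MB, and the function name and capacity values are hypothetical.

```python
def compare_lun_groups(primary, secondary):
    """Compare two LUN Group notes given as {host_lun_number: capacity_mb}.

    Returns a list of differences; an empty list means both sides match.
    """
    problems = []
    for lun in sorted(set(primary) | set(secondary)):
        if lun not in secondary:
            problems.append(f"LUN {lun}: missing on Secondary")
        elif lun not in primary:
            problems.append(f"LUN {lun}: missing on Primary")
        elif primary[lun] != secondary[lun]:
            problems.append(
                f"LUN {lun}: capacity differs "
                f"({primary[lun]} MB vs {secondary[lun]} MB)"
            )
    return problems


# Hypothetical notes for PRI_SRV01_LG and SEC_SRV01_LG
pri_srv01_lg = {0: 30720, 1: 30720, 2: 51200}
sec_srv01_lg = {0: 30720, 1: 20480}

differences = compare_lun_groups(pri_srv01_lg, sec_srv01_lg)
for line in differences:
    print(line)
```

An empty result only confirms that the notes agree. The actual settings still have to be verified in the ETERNUS SF V16.1 Manager GUI.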


    Storage Cluster Controller setup

    As mentioned above, you should install the ETERNUS SF V16.1 Storage Cruiser Agent software used for Storage Cluster Monitoring on a dedicated server. Information about the installation of that software can be found in the ETERNUS SF Express V16 / Storage Cruiser V16 / AdvancedCopy Manager V16 Installation and Setup Guide.

    After the installation has succeeded, you need to modify two configuration files. This enables the ETERNUS SF V16.1 Storage Cruiser Agent to act as the Storage Cluster Controller for your environment. The default installation directory of the ETERNUS SF V16.1 Storage Cruiser Agent software is C:\ETERNUS_SF. With the default installation the two files (Correlation.ini and TFOConfig.ini) are located in the C:\ETERNUS_SF\ESC\Agent\etc directory.

    Add the following lines at the end of the Correlation.ini file:

        #----------------
        # Storage Cluster Controller Server configuration
        #----------------
        StorageClusterController=ON

    The TFOConfig.ini file identifies the two ETERNUS DX S3 storage systems used for the Storage Cluster functionality. Add the Master IP address of each ETERNUS DX S3 storage system to that file. The entries should look like this example:

        IP=192.168.100.60
        IP=192.168.200.50

    After both files have been modified, restart the ETERNUS SF V16.1 Storage Cruiser Agent so that the settings take effect. Additional details can be found in the ETERNUS SF Storage Cruiser V16 Operation Guide. In addition you should discover the Storage Cluster Controller (using the Storage Cruiser Agent functionality) within the ETERNUS SF V16 Manager software.
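A typo in one of the IP entries is hard to spot once the agent has been restarted. The following sketch checks the text of a TFOConfig.ini-style file before it is copied to the Storage Cluster Controller. It is not a Fujitsu tool. It assumes one IP= entry per line, that lines starting with # can be skipped as comments, and that exactly two entries are expected for the two-system setup described here. The function names are hypothetical.

```python
import ipaddress


def read_tfo_ips(text):
    """Collect the values of all IP= lines, skipping blank lines and # comments."""
    ips = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        key, separator, value = line.partition("=")
        if separator and key.strip() == "IP":
            # ip_address() raises ValueError for a malformed address
            ips.append(str(ipaddress.ip_address(value.strip())))
    return ips


def check_tfo_config(text):
    """Return the two configured addresses or raise ValueError."""
    ips = read_tfo_ips(text)
    if len(ips) != 2:
        raise ValueError(f"expected 2 IP entries, found {len(ips)}")
    if ips[0] == ips[1]:
        raise ValueError("both IP entries are identical")
    return ips


sample = "IP=192.168.100.60\nIP=192.168.200.50\n"
configured = check_tfo_config(sample)
print(configured)
```

To check a real file, read its content with `open(path).read()` and pass it to `check_tfo_config`.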


    Storage Cluster processing

    Management of a TFO Group (such as changing the TFO Group Name, Failover Mode or Split Mode) can be done at either the Primary (active) or the Secondary (standby) ETERNUS DX S3 storage system. Select the TFO Group and press the Set button in the Action pane for this purpose.

    Modifications of LUN Groups used by the Storage Cluster feature (such as adding or removing volumes) should always be started at the Primary (active) ETERNUS DX S3 storage system. Afterwards, modify the corresponding LUN Group at the Secondary (standby) ETERNUS DX S3 storage system.

    Check the status of your Storage Cluster configuration at the Storage Cluster Controller. With the default installation of the ETERNUS SF V16.1 Storage Cruiser Agent software, the CLI script is located in C:\ETERNUS_SF\ESC\Agent\bin. This is an example output of the agtpatrol.bat CLI script:

        C:\ETERNUS_SF\ESC\Agent\bin> agtpatrol.bat
        --------------------------------------------------------------------------------
        INTERVAL=1000
        TARGET IP:
        192.168.100.60
        192.168.200.50
        --------------------------------------------------------------------------------
        TARGET TFO GROUP:
        IP ADDRESS=192.168.100.60
        GROUP NAME=DX600_to_DX500
        TYPE=Primary
        PAIR IP ADDRESS=192.168.200.50
        PAIR GROUP NAME=DX600_to_DX500
        STATUS=Normal
        INTERVAL=1000
        UPDATE TIME=Mon Jun 02 10:54:26 CEST 2014
        IP ADDRESS=192.168.200.50
        GROUP NAME=DX600_to_DX500
        TYPE=Secondary
        PAIR IP ADDRESS=192.168.100.60
        PAIR GROUP NAME=DX600_to_DX500
        STATUS=Normal
        INTERVAL=1000
        UPDATE TIME=Mon Jun 02 10:54:26 CEST 2014

    Note: INTERVAL is the heartbeat rate in milliseconds configured on each ETERNUS DX S3 storage system.
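If the agtpatrol.bat output is captured to a file, the STATUS of each TFO Group entry can be checked by a small script. The sketch below is an assumption-laden illustration: it assumes the output contains one KEY=VALUE pair per line and that every entry starts with an IP ADDRESS= line, which may differ from the real layout. The function names are hypothetical.

```python
def parse_tfo_groups(output):
    """Split agtpatrol-style text into one dict per TFO Group entry."""
    groups = []
    current = None
    in_group_section = False
    for raw_line in output.splitlines():
        line = raw_line.strip()
        if line.startswith("TARGET TFO GROUP"):
            in_group_section = True
            continue
        if not in_group_section or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key == "IP ADDRESS":  # assumed to be the first key of every entry
            current = {}
            groups.append(current)
        if current is not None:
            current[key] = value
    return groups


def abnormal_groups(groups):
    """Return the entries whose STATUS is not Normal."""
    return [group for group in groups if group.get("STATUS") != "Normal"]


sample_output = """\
INTERVAL=1000
TARGET IP:
192.168.100.60
192.168.200.50
TARGET TFO GROUP:
IP ADDRESS=192.168.100.60
GROUP NAME=DX600_to_DX500
TYPE=Primary
PAIR IP ADDRESS=192.168.200.50
STATUS=Normal
IP ADDRESS=192.168.200.50
GROUP NAME=DX600_to_DX500
TYPE=Secondary
PAIR IP ADDRESS=192.168.100.60
STATUS=Normal
"""

tfo_groups = parse_tfo_groups(sample_output)
for group in abnormal_groups(tfo_groups):
    print(f"{group['GROUP NAME']} at {group['IP ADDRESS']}: {group['STATUS']}")
```

With the sample above nothing is printed, because both entries report STATUS=Normal.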


    Use the Refresh button in the Action pane to update the TFO Group Status so that you always see the current status of your TFO Groups. This creates a job running in the background that updates the TFO Group Status.

    Manual Failover can only be triggered using the Storage Cluster section at the Primary (active) ETERNUS DX S3 storage system. You will not be able to press the Failover or Force-Failover button at the Secondary (standby) ETERNUS DX S3 storage system.


    After a failover has taken place, whether manually or in auto mode, you will want to switch the business servers' host connections back to the original ETERNUS DX S3 storage system. Before you can start this operation, check the following preconditions. Make sure that the Primary (active) ETERNUS DX S3 storage system is up and running without any hardware-related issues. The REC Path connection between the two ETERNUS DX S3 storage systems must be available and the volumes used by the Storage Cluster feature must be in sync (Equivalent). The latter must be checked in the details of the associated TFO Group, where you can verify the status of the REC copy process for each volume belonging to this TFO Group (switch the view from Ports to Volumes). If everything is ready for switching the business servers' host connections back to the Primary (active) ETERNUS DX S3 storage system (Status = Active and Phase = Equivalent), you can start the failback. As the picture below shows, this action is not available at the Primary (active) ETERNUS DX S3 storage system.

    Therefore you need to switch the ETERNUS SF V16.1 GUI to the Secondary (standby) ETERNUS DX S3 storage system and start the failback action using the associated TFO Group from there.
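The failback preconditions described above can be written down as a simple checklist function. This is a sketch for documentation purposes only, not an interface of ETERNUS SF: all values are assumed to be read from the GUI and entered by hand, and the function and parameter names are hypothetical.

```python
def failback_blockers(primary_healthy, rec_path_available,
                      group_status, group_phase, volume_phases):
    """Return the reasons why a failback should not be started yet.

    volume_phases maps a volume name to the phase shown in the Volumes view.
    An empty result means all preconditions named in this document are met.
    """
    blockers = []
    if not primary_healthy:
        blockers.append("Primary storage system reports hardware issues")
    if not rec_path_available:
        blockers.append("REC Path is not available")
    if group_status != "Active":
        blockers.append(f"TFO Group Status is {group_status}, expected Active")
    if group_phase != "Equivalent":
        blockers.append(f"TFO Group Phase is {group_phase}, expected Equivalent")
    for volume, phase in sorted(volume_phases.items()):
        if phase != "Equivalent":
            blockers.append(f"Volume {volume} is in phase {phase}, expected Equivalent")
    return blockers


observed = failback_blockers(
    primary_healthy=True,
    rec_path_available=True,
    group_status="Active",
    group_phase="Maintenance",
    volume_phases={"RM_TFO_VOL00": "Copying", "RM_TFO_VOL05": "Equivalent"},
)
for reason in observed:
    print(reason)
```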


    Storage Cluster bi-directional information and setup

    The Storage Cluster feature can be configured bi-directionally as well. There is no need to create a new REC Path configuration for this purpose. The existing REC Path between the two ETERNUS DX S3 systems can be shared. However, you must use dedicated Fibre Channel (FC) ports for the second TFO Group on each ETERNUS DX S3 system. FC ports cannot be shared among TFO Groups. As a rule of thumb, use dedicated RAID Groups for active and passive TFO Volumes on each ETERNUS DX S3 system involved in a bi-directional Storage Cluster setup. The additional TFO Group, including all required resources (Volumes, LUN Groups, FC Hosts and Host Affinity), needs to be set up in the same way as described for the single-sided configuration. The picture below gives an example of what such a configuration could look like.

    Note: The status of the two TFO Groups above differs. To always have the latest status, press the Refresh button of the TFO Group Status, which creates a job to update all your TFO Groups. This needs to be done from time to time.


    Recovery procedure caused by defect RAID Group

    If a Transparent Failover took place (automatic or manual) due to a broken RAID Group at the Primary (active) ETERNUS DX S3 storage system, special treatment is needed to recover the Storage Cluster configuration after the broken RAID Group has been repaired. All related configuration steps must be done at the Secondary (passive) ETERNUS DX S3 storage system. First of all, log in to the CLI of that ETERNUS DX S3 storage system using a User Account of the Maintainer Role, because all these steps must be executed using CLI commands.

    Preconditions

    If your TFO Group is configured with Failover Mode = Manual, you need to start a forced failover in advance. Identify your TFO Group and then start the manual failover. In any case you must make sure that all TFO Volumes are hosted by the Secondary (passive) ETERNUS DX S3 storage system. (The Status of the Secondary TFO Group must be Active.)

        CLI> show tfo-groups
        TFO Group No. [0]
        TFO Group Name [DX600_to_DX500]
        Type [Secondary]
        Status [Standby]
        Phase [Maintenance]
        Condition [Normal]
        Failover Mode [Manual]
        Split Mode [Read/Write]
        Monitor Interval [-]
        Pair Box ID [00ETERNUSDXMS3ET603SAU####OF4621352001##]
        Own Pair Port [CM#0 CA#0 Port#1 CM#0 CA#0 Port#1]
                      [CM#1 CA#0 Port#1 CM#1 CA#0 Port#1]

        CLI> forced tfo-group-activate -tfog-number 0 -active-mode manual-failover

        CLI> show tfo-groups
        TFO Group No. [0]
        TFO Group Name [DX600_to_DX500]
        Type [Secondary]
        Status [Active]
        Phase [Maintenance]
        Condition [Normal]
        Failover Mode [Manual]
        Split Mode [Read/Write]
        Monitor Interval [-]
        Pair Box ID [00ETERNUSDXMS3ET603SAU####OF4621352001##]
        Own Pair Port [CM#0 CA#0 Port#1 CM#0 CA#0 Port#1]
                      [CM#1 CA#0 Port#1 CM#1 CA#0 Port#1]

    Afterwards, stop the Transparent Failover Replication of the TFO Volumes (Status = Error Suspend) located on the broken RAID Group of the Primary (active) ETERNUS DX S3 storage system. Follow these steps to fulfill this requirement:

    1. Step

    Identify the volumes located on the broken RAID Group which are in Error Suspend status.

        CLI> show tfo-pair -tfog-number 0
        TFO Group Name [DX600_to_DX500]

        Host No. [8] Host Name [SEC_SRV01_HBA0]
        Own Volume                             Pair Volume SID   Status        Phase            Error
        No.   Name                             No.                                              Code
        ----- -------------------------------- ----------- ----- ------------- ---------------- -----
        11    RM_TFO_VOL00                     9           13    Error Suspend Equivalent       0x00
        16    RM_TFO_VOL05                     14          1     Active        Equivalent       0x00

        Host No. [10] Host Name [SEC_SRV01_HBA1]
        Own Volume                             Pair Volume SID   Status        Phase            Error
        No.   Name                             No.                                              Code
        ----- -------------------------------- ----------- ----- ------------- ---------------- -----
        11    RM_TFO_VOL00                     9           13    Error Suspend Equivalent       0x00
        16    RM_TFO_VOL05                     14          1     Active        Equivalent       0x00


    2. Step

    Get detailed information about the TFO copy session with Status = Error Suspend.

        CLI> show tfo-pair -session-id 13
        Own Volume No. [11]
        Own Volume Name [RM_TFO_VOL00]
        Pair Volume No. [9]
        Status [Error Suspend]
        Phase [Equivalent]
        Error Code [0x26]
        Source Block Address [0x0000000000000000LBA]
        Destination Block Address [0x0000000000000000LBA]
        Total Data Size [30720MB]
        Copied Data Size [29184MB]
        Direction [From Local/To Remote]
        Sync [Sync]
        Recovery Mode [Automatic]
        Split Mode [Automatic]
        Remote Session-ID [13]
        Remote Box-ID [00ETERNUSDXMS3ET603SAU####OF4621352001##]
        Time Stamp [2014-08-25 16:45:29]
        Elapsed Time [31 day 7 hour 36 min 30 sec]
        Copy Range [Totally]
        Secondary Access Permission [Read Only at Equivalency]
        Concurrent Suspend Status [Normal]

    3. Step

    Release the copy sessions of the TFO Volumes that have Status = Error Suspend.

        CLI> release tfo-pair -port 001 -host-number 8 -volume-number 11

    4. Step

    Restore the broken RAID Group and the associated volumes used as TFO Volumes at the Primary (active) ETERNUS DX S3 storage system. See the maintenance manual for RAID Group recovery. There are two possibilities for recovering the failed RAID Group:

    - RAID Forced Recovery
    - Recovery by [DISK Hot Maintenance]

    You can use the RAID Forced Recovery option if you think the disks are still OK and the RAID Group was broken by another event, e.g. a DE failure. If you think the disks are really broken, choose Recovery by [DISK Hot Maintenance]. The next screenshots are examples of Recovery by [DISK Hot Maintenance] from the ETERNUS DX HW-GUI.


    Identify and exchange the broken disks.

    After the broken disks have been exchanged, the status of the RAID Group is Available, but the volumes are in status Readying. The next screenshot is an example of how this information is shown in the ETERNUS DX HW-GUI.

    All volumes which are in status Readying must be formatted first. Note: The volume must be formatted using the ETERNUS CLI or the ETERNUS SF V16.x Manager. If you try to perform the format using the ETERNUS HW-GUI, you will get the following error message:

    5. Step

    Go back to the CLI of the Secondary (passive) ETERNUS DX S3 storage system using a User Account of the Maintainer Role and restart the Transparent Failover Replication of the TFO Volumes.

        CLI> recover tfo-pair -port 001 -host-number 8 -volume-number 11 -recovery-target primary

    This starts a new initial copy of the TFO Volumes hosted by the formerly broken RAID Group. Once the copy has succeeded, you can switch back (Failback) the host access to the Primary (active) ETERNUS DX S3 storage system again.


    6. Step

    Afterwards you may want to check the status of the TFO Group and the associated TFO Volumes belonging to that TFO Group.

        CLI> show tfo-pair -tfog-number 0
        TFO Group Name [DX600_to_DX500]

        Host No. [8] Host Name [SEC_SRV01_HBA0]
        Own Volume                             Pair Volume SID   Status        Phase            Error
        No.   Name                             No.                                              Code
        ----- -------------------------------- ----------- ----- ------------- ---------------- -----
        11    RM_TFO_VOL00                     9           2     Copying       Equivalent       0x00
        16    RM_TFO_VOL05                     14          1     Active        Equivalent       0x00

        Host No. [10] Host Name [SEC_SRV01_HBA1]
        Own Volume                             Pair Volume SID   Status        Phase            Error
        No.   Name                             No.                                              Code
        ----- -------------------------------- ----------- ----- ------------- ---------------- -----
        11    RM_TFO_VOL00                     9           2     Copying       Equivalent       0x00
        16    RM_TFO_VOL05                     14          1     Active        Equivalent       0x00

        CLI> show tfo-pair -session-id 2
        Own Volume No. [11]
        Own Volume Name [RM_TFO_VOL00]
        Pair Volume No. [9]
        Status [Active]
        Phase [Copying]
        Error Code [0x00]
        Source Block Address [0x0000000000000000LBA]
        Destination Block Address [0x0000000000000000LBA]
        Total Data Size [30720MB]
        Copied Data Size [6144MB]
        Direction [From Local/To Remote]
        Sync [Sync]
        Recovery Mode [Automatic]
        Split Mode [Automatic]
        Remote Session-ID [6]
        Remote Box-ID [00ETERNUSDXMS3ET603SAU####OF4621352001##]
        Time Stamp [0000-00-00 00:00:00]
        Elapsed Time [0 day 0 hour 1 min 51 sec]
        Copy Range [Totally]
        Secondary Access Permission [Read Only at Equivalency]
        Concurrent Suspend Status [Normal]
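The Total Data Size and Copied Data Size fields of the session details show how far the new initial copy has progressed. The sketch below extracts the `Name [value]` fields from captured CLI text and computes the progress in percent. It is an illustration under assumptions: it expects one field per line and sizes given in MB, as in the example output of this document. The function names are hypothetical.

```python
import re

# One "Field Name [value]" pair per line is assumed.
FIELD_PATTERN = re.compile(r"^\s*(.+?)\s*\[(.*)\]\s*$")


def parse_fields(output):
    """Return a dict of field name to value for all 'Name [value]' lines."""
    fields = {}
    for line in output.splitlines():
        match = FIELD_PATTERN.match(line)
        if match:
            fields[match.group(1)] = match.group(2)
    return fields


def size_in_mb(value):
    """Convert a value such as '30720MB' to an integer number of MB."""
    if not value.endswith("MB"):
        raise ValueError(f"unexpected size format: {value}")
    return int(value[:-2])


def copy_progress_percent(fields):
    """Return the copied share of the total data size in percent."""
    total = size_in_mb(fields["Total Data Size"])
    copied = size_in_mb(fields["Copied Data Size"])
    return 100.0 * copied / total


session_output = """\
CLI> show tfo-pair -session-id 2
Own Volume Name [RM_TFO_VOL00]
Status [Active]
Phase [Copying]
Total Data Size [30720MB]
Copied Data Size [6144MB]
"""

session = parse_fields(session_output)
progress = copy_progress_percent(session)
print(f"{session['Own Volume Name']}: {session['Phase']}, {progress:.1f} % copied")
```

For the values shown in this step, 6144 MB of 30720 MB corresponds to 20 % of the initial copy.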


    Appendix
    Here are some helpful hints for dealing with the Storage Cluster feature of ETERNUS DX S3 storage systems.

    You should examine the Fibre Channel (FC) zone configuration of your FC switches from time to time. In particular, check the FC switch ports involved in the Storage Cluster functionality for issues such as "Duplicate Port WWN detected". See this example for details:

    Switch01:admin> switchshow
    switchName:     Switch01
    switchType:     66.1
    switchState:    Online
    switchMode:     Native
    switchRole:     Subordinate
    switchDomain:   169
    switchId:       fffca9
    switchWwn:      10:00:00:05:1e:83:12:aa
    zoning:         ON (My_Fabric2)
    switchBeacon:   OFF
    FC Router:      OFF
    FC Router BB Fabric ID: 1
    Address Mode:   0
    Fabric Name:    My_Fabric1

    Index Port Address Media Speed State     Proto
    ==================================================
        0    0 a90000  id    8G    Online    FC  F-Port  10:00:00:90:fa:50:34:52
        1    1 a90100  id    8G    Online    FC  F-Port  10:00:00:90:fa:50:3e:60
        2    2 a90200  id    8G    Online    FC  F-Port  21:00:00:24:ff:53:36:6f
        3    3 a90300  id    8G    No_Sync   FC  Disabled (Persistent)
        4    4 a90400  id    8G    No_Sync   FC  Disabled (Persistent)
        5    5 a90500  id    8G    Online    FC  F-Port  21:00:00:24:ff:53:36:71
        6    6 a90600  id    8G    No_Sync   FC  Disabled (Persistent)
        7    7 a90700  id    8G    In_Sync   FC  Disabled (Persistent)
        8    8 a90800  id    8G    No_Light  FC  Disabled (Persistent)
        9    9 a90900  id    8G    In_Sync   FC  Disabled (Persistent)
       10   10 a90a00  id    8G    Online    FC  F-Port  50:00:00:e0:da:80:68:20
       11   11 a90b00  id    8G    Online    FC  F-Port  50:00:00:e0:da:80:43:20
       12   12 a90c00  id    8G    No_Light  FC
       13   13 a90d00  id    8G    Online    FC  F-Port  10:00:00:90:fa:50:34:1d
       14   14 a90e00  id    8G    No_Sync   FC  Disabled
       15   15 a90f00  id    8G    Online    FC  F-Port  50:00:00:e0:da:80:43:23
       16   16 a91000  id    8G    No_Sync   FC  Disabled (Persistent) (Duplicate Port WWN detected)
       17   17 a91100  --    8G    No_Module FC
       18   18 a91200  --    8G    No_Module FC
       19   19 a91300  --    8G    No_Module FC
       20   20 a91400  --    8G    No_Module FC
       21   21 a91500  --    8G    No_Module FC
       22   22 a91600  --    8G    No_Module FC
       23   23 a91700  --    8G    No_Module FC
       24   24 a91800  id    N8    Online    FC  F-Port  50:00:00:e0:d4:00:01:91
       25   25 a91900  id    N8    Online    FC  F-Port  50:00:00:e0:d4:00:01:92
       26   26 a91a00  id    8G    No_Light  FC
       27   27 a91b00  id    8G    No_Light  FC
       28   28 a91c00  --    8G    No_Module FC
       29   29 a91d00  --    8G    No_Module FC
       30   30 a91e00  --    8G    No_Module FC
       31   31 a91f00  --    8G    No_Module FC
       32   32 a92000  id    8G    No_Light  FC
       33   33 a92100  id    8G    No_Light  FC
       34   34 a92200  id    N8    Online    FC  E-Port  10:00:00:27:f8:3d:bb:a7 "Switch99" (upstream)(Trunk master)
       35   35 a92300  id    N8    Online    FC  E-Port  (Trunk port, master is Port 34 )
       36   36 a92400  id    8G    Online    FC  F-Port  21:00:00:24:ff:53:36:58
       37   37 a92500  id    8G    Online    FC  F-Port  21:00:00:24:ff:53:37:2a
       38   38 a92600  id    N8    Online    FC  F-Port  50:00:00:e0:d4:00:00:90
       39   39 a92700  id    8G    No_Light  FC
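Scanning a 40-port listing by eye for the "Duplicate Port WWN detected" remark is error-prone, so a small script can help with the periodic check. The following is a rough sketch under stated assumptions: the switchshow output has been saved as text with one port per line in the column order shown above, and the function name flagged_ports is hypothetical.

```python
import re

# Assumption: one port per line, columns as in the switchshow example:
# Index Port Address Media Speed State Proto <remarks>
PORT_LINE = re.compile(
    r"^\s*(\d+)\s+(\d+)\s+([0-9a-f]{6})\s+(\S+)\s+(\S+)\s+(\S+)\s+FC\b(.*)$"
)


def flagged_ports(switchshow_text, marker="Duplicate Port WWN detected"):
    """Return (index, address, state) for each port whose remarks contain marker."""
    hits = []
    for line in switchshow_text.splitlines():
        match = PORT_LINE.match(line)
        if match and marker in match.group(7):
            hits.append((int(match.group(1)), match.group(3), match.group(6)))
    return hits


# Two port lines taken from the example output above.
SAMPLE = """\
   15   15 a90f00  id    8G    Online    FC  F-Port  50:00:00:e0:da:80:43:23
   16   16 a91000  id    8G    No_Sync   FC  Disabled (Persistent) (Duplicate Port WWN detected)
"""

for index, address, state in flagged_ports(SAMPLE):
    print(f"Port {index} ({address}, {state}): duplicate port WWN reported")
```

In the example listing only port 16 carries the remark, so that is the only port the sketch reports.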

    As already mentioned above, use the Refresh button in the Storage Cluster Overview section of the ETERNUS SF V16.1 Manager GUI to update the status of your TFO Groups. The Set action can be used to create a new TFO Group or to modify an existing one; an existing TFO Group must be selected (checked) before you press the Set button.


    Fibre Channel Switch read-only discovery
    First you need to configure your Fibre Channel (FC) switch. To do so, log in to the switch using an administrator account and create the user account that ETERNUS SF V16 will use later on. If you are using the CLI of the switch, the command for creating the user looks like this:

    MySwitch:admin> userconfig --add ETSFuser -r user
    [Syntax: userconfig --add -r user]

    Afterwards you need to set a password for this user. The CLI command for this is:

    MySwitch:admin> passwd ETSFuser
    [Syntax: passwd ]

    The last configuration step on the FC switch is to create a read-only SNMP community. Again you can use the CLI of the switch to create the dedicated read-only SNMP community that ETERNUS SF V16 will use later on. ETERNUS SF V16 requires an SNMPv1 community. For this purpose you can modify the well-known read-only community "public" and change it to, for example, ETSFsnmp. The CLI command for this is:

    MySwitch:admin> snmpconfig --set snmpv1

    Community (rw): [Secret C0de]
    Trap Recipient's IP address : [0.0.0.0]
    Community (rw): [OrigEquipMfr]
    Trap Recipient's IP address : [0.0.0.0]
    Community (rw): [private]
    Trap Recipient's IP address : [0.0.0.0]
    Community (ro): [public] ETSFsnmp
    Trap Recipient's IP address : [0.0.0.0]
    Community (ro): [common]
    Trap Recipient's IP address : [0.0.0.0]
    Community (ro): [FibreChannel]
    Trap Recipient's IP address : [0.0.0.0]

    You might need to call snmpconfig --set accessControl to set or change access-control-related parameters afterwards.


    Open the ETERNUS SF V16 Manager GUI and discover the Fibre Channel (FC) switch using the user account and SNMP community just created on that switch.


    Status of TFO Group Information
    [Tables not reproduced: Active/Standby, Phase, Status, Halt Factor]

    *1: "Unknown" has a meaning common to all the statuses, so is omitted hereinafter.


    Recommendations

    Notes on the multipath settings of your Business Servers:

    Linux:

    no_path_retry
    Specifies the number of retries before queueing is disabled. Use "fail" for immediate failure (no queueing) or "queue" to never stop queueing. Default is 0.

    For the Storage Cluster function with an ETERNUS DX S3 storage system you need to specify: "no_path_retry 10"

    fast_io_fail_tmo
    The default fast_io_fail_tmo setting for an FC remote port, in seconds. If an rport has vanished from the fabric, all I/O to the devices on that port is terminated after this timeout. It should be smaller than the dev_loss_tmo setting. Default is 5.

    Information from the "(Fibre Channel/FCoE/iSCSI/SAS) for Linux device-mapper multipath" document: "fast_io_fail_tmo 1"

    Windows:

    Windows Server 2012 R2 / Windows Server 2012 / Windows Server 2008 R2 / Windows Server 2008 Standard Multipath Driver (msdsm) Notes

    Various settings, such as the load balance policy and retry count, can be adjusted by using the standard multipath drivers (msdsm) for Windows Server 2012 R2, Windows Server 2012, Windows Server 2008 R2 or Windows Server 2008. However, the following settings should not be changed from their default values.

    Screen name: MPIO tab of Multi-Path Disk Device properties
    Parameters that may not be changed: Load balance policy, [Details] button, [Edit] button

    Screen name: Details of DSM
    Parameters that may not be changed: Timer counter (path checking period, enable path checking, number of retries, retry interval, PDO deletion period)

    Screen name: Details of MPIO paths
    Parameters that may not be changed: Path status

    Notes for Host Response Settings:
    Do not use different Host Response settings (Active-Active (A-A) or Active-Active Preferred (A-A/P)) for the Primary (active) and the Secondary (passive) ETERNUS DX S3 storage system.
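For the Linux values recommended above (no_path_retry 10 and fast_io_fail_tmo 1), a quick comparison against the settings actually in use can catch leftovers from the defaults. The following is a simplified sketch, not a full multipath.conf parser: it assumes simple "keyword value" lines, ignores section nesting (the last occurrence of a keyword wins), and the names read_settings and deviations are hypothetical.

```python
import re

# Values recommended in this section for the Storage Cluster function.
RECOMMENDED = {"no_path_retry": "10", "fast_io_fail_tmo": "1"}

# Assumption: plain "keyword value" lines, optionally quoted, optional comment.
SETTING = re.compile(r'^\s*(\w+)\s+"?([^"#\s]+)"?\s*(?:#.*)?$')


def read_settings(conf_text):
    """Collect keyword/value pairs; section nesting is deliberately ignored."""
    settings = {}
    for line in conf_text.splitlines():
        match = SETTING.match(line)
        if match:
            settings[match.group(1)] = match.group(2)
    return settings


def deviations(conf_text, recommended=RECOMMENDED):
    """Map each deviating keyword to a (found, recommended) pair."""
    found = read_settings(conf_text)
    return {
        key: (found.get(key), wanted)
        for key, wanted in recommended.items()
        if found.get(key) != wanted
    }


# Illustrative input only: fast_io_fail_tmo is still at its default of 5.
SAMPLE_CONF = """\
no_path_retry 10
fast_io_fail_tmo 5
"""

print(deviations(SAMPLE_CONF))
```

A keyword that is missing from the input is reported with None as the found value, so an incomplete configuration is flagged as well.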


    TFO Checklist

    Quick Checklist for TFO Configurations

    Step  Action / Important Note

    1     Check the firmware of both ETERNUS DX S3 systems
          Note: A minimum of V10L20-0000 is required

    2     Check that the latest version of ETERNUS SF Manager, including all latest patches, is installed
          Note: ETERNUS SF V16.1 or higher

    3     Discover both ETERNUS DX S3 systems in ETERNUS SF Manager

    4     Discover the FC Switches
          Note: Read-Only Mode is recommended

    5     Check the licenses for both ETERNUS DX S3 systems
          Note: ETERNUS SF Storage Cruiser V16 Standard License, ETERNUS SF Storage Cruiser V16 Storage Cluster Option, ETERNUS SF AdvancedCopy Manager V16 Remote Copy License

    6     Configure FC Zoning or Direct Cabling for the REC Path
          Note: A minimum of 1 path per CM is recommended

    7     Configure the REC Path with ETERNUS SF Manager or the ETERNUS DX S3 HW-GUI
          Note: RA only Ports are recommended

    8     Configure a Storage Cluster TFO Group
          Note: Use Split Mode --> Read to achieve Application Consistency. Only CA Ports can be used.

    9     Configure FC Zoning from the Business Server(s) to the Primary ETERNUS DX S3 system
          Note: Only WWPN based zoning is supported for TFO

    10    Create the Business LUNs on both ETERNUS DX S3 systems
          Note: Be sure that the LUNs on both ETERNUS DX S3 systems have the same size

    11    Register the HBAs of the Business Server(s) on the Primary ETERNUS DX S3
          Note: Be sure to use the same Host Response settings on both ETERNUS DX S3 arrays. Note down the Host WWPNs used for later usage.

    12    Create an Affinity/LUN Group on the Primary ETERNUS DX S3 and add the Business LUNs
          Note: Note down the Host LUN Numbers used for later usage

    13    Check the reservation status of each volume using the ETERNUS DX S3 HW-GUI
          Note: Remove existing reservations from each volume

    14    Create a Host Affinity for the registered HBAs, the ports and the created LUN Group on the Primary ETERNUS DX S3
          Note: It is not possible to use the Host Group, Port Group and LUN Group mechanism from the HW-GUI in TFO configurations

    15    Register the HBAs of the Business Server(s) on the Secondary ETERNUS DX S3
          Note: Be sure to use the same Host Response settings on both ETERNUS DX S3 arrays. Add the Host WWPNs manually (info from step 11)

    16    Create an Affinity/LUN Group on the Secondary ETERNUS DX S3 and add the Business LUNs
          Note: Be sure to use the same Host LUN Numbers as configured for the Primary ETERNUS DX S3

    17    Create a Host Affinity for the registered HBAs, the ports and the created LUN Group on the Secondary ETERNUS DX S3
          Note: It is not possible to use Port Groups in TFO configurations. Be sure to use the Standby Ports for this Host Affinity.

    18    Check the TFO Group and TFO Volume Status

    19    Install the ETERNUS SF Storage Cruiser Agent as Monitoring instance (Storage Cluster Controller)
          Note: Only Windows OS is supported

    20    Modify the ETERNUS SF Storage Cruiser Agent Configuration files
          Note: Correlation.ini & TFOConfig.ini

    21    Restart the ETERNUS SF Storage Cruiser Agent Service

    22    Discover the Storage Cluster Controller Server in the ETERNUS SF Manager
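Steps 10, 12 and 16 of the checklist require that the Business LUNs have the same size and the same Host LUN Numbers on both systems. If the values noted down in step 12 are kept in a simple form, they can be compared mechanically. The following is a minimal sketch under that assumption: each system is represented by a hand-maintained mapping from Host LUN Number to size in MB, and the function name lun_mismatches is hypothetical; nothing is queried from the storage systems.

```python
def lun_mismatches(primary, secondary):
    """Compare two {host_lun_number: size_in_mb} mappings.

    Returns a list of human-readable problems; an empty list means the
    Host LUN Numbers and sizes match on both systems.
    """
    problems = []
    for lun in sorted(set(primary) | set(secondary)):
        if lun not in primary:
            problems.append(f"Host LUN {lun}: missing on Primary")
        elif lun not in secondary:
            problems.append(f"Host LUN {lun}: missing on Secondary")
        elif primary[lun] != secondary[lun]:
            problems.append(
                f"Host LUN {lun}: size differs "
                f"({primary[lun]} MB vs {secondary[lun]} MB)"
            )
    return problems


# Illustrative values only, not taken from a real configuration.
primary_luns = {0: 30720, 1: 10240}
secondary_luns = {0: 30720, 1: 20480, 2: 5120}

for problem in lun_mismatches(primary_luns, secondary_luns):
    print(problem)
```

An empty result is a precondition check only; the TFO Group and TFO Volume status from step 18 remains the authoritative confirmation.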


    Abbreviations

    Shortcut       Description

    CA             Abbreviation of ETERNUS DX Channel Adapter

    CM             Abbreviation of ETERNUS DX Controller Module

    LUN            Abbreviation of Logical Unit Number

    TFO            Abbreviation of Transparent Failover. For the Storage Cluster feature, it means that failover is performed transparently to the operation server.

    TFOV           Abbreviation of TFO Volume. A volume assigned in a Storage Cluster configuration.

    TFO Group      A group managing connection configuration, policies, states and maintenance for failover. It includes one or more Fibre Channel (FC) CA ports and the volumes that may be accessed through these CA ports. The state of a TFO Group is Active (accessible from the operation server) or Standby (not accessible from the operation server).

    CA Port Pair   The Storage Cluster feature performs failover by sharing a common WWN/WWPN between the Fibre Channel (FC) CA ports of two ETERNUS DX S3 storage systems and controlling the link state of each FC CA port. This operation is called CA Port Pairing, and a pair of FC CA ports sharing a common WWN/WWPN is called a CA Port Pair.

    WWN / WWPN     Abbreviation of World Wide Name / World Wide Port Name

    The diagram below illustrates the different components used by a TFO Group of the Storage Cluster feature, such as:

    TFO Group including TFOVs, Affinity Groups and CA Port Pairs

    Contact
    FUJITSU Limited
    Address: Shiodome City Center, 5-2, Higashi-shimbashi 1-Chome, Minato-ku, Tokyo 105-7123, Japan
    Website: www.fujitsu.com/eternus

    2014 Fujitsu, the Fujitsu logo, [other Fujitsu trademarks /registered trademarks] are trademarks or registered trademarks of Fujitsu Limited in Japan and other countries. Other company, product and service names may be trademarks or registered trademarks of their respective owners. Technical data subject to modification and delivery subject to availability. Any liability that the data and illustrations are complete, actual or correct is excluded. Designations may be trademarks and/or copyrights of the respective manufacturer, the use of which by third parties for their own purposes may infringe the rights of such owner.

    [Diagram: Primary storage and Secondary storage, each containing a TFO Group with CA #0 and CA #1, TFOV#0 to TFOV#2, and Affinity Group #0 and Affinity Group #1. The CA ports of the two systems are joined as CA Port Pairs, the two TFO Groups correspond to each other, and one is Active while the other is Standby.]
