Module 13 Troubleshooting and Tips 20 Pages

  • 7/29/2019 Module 13 Troubleshooting and Tips 20 Pages

    Series Storage Troubleshooting 12

    The peer storage architecture allows PS Series volumes to be connected to multiple hosts simultaneously.

    Because the PS Series array is a block storage device, it has no knowledge of the format of the data or file system(s) stored on it, and so it does not provide any means to interlock accesses from different hosts.

    Because ordinary, non-distributed file systems such as NTFS, FAT32, or Ext2 do not have the necessary interlocking mechanism, you must use host-based software that will interlock read and write accesses to the volume.

    If you need to share files among multiple users, either:

    Mount the iSCSI volume on one server only and then install a network-sharing file system such as CIFS or NFS on that server to make the data available to multiple network users, or

    Use distributed file server software such as Microsoft DFS, Red Hat GFS, or Sun Microsystems QFS.

    Most initiators allow you to establish an iSCSI connection either for the duration of the current system boot or as a persistent connection that is restored automatically on reboot. The problem discussed here commonly occurs when the connection is not configured to be restored automatically on a system restart.

    A related issue involves ensuring that the configured persistent iSCSI device connection is restored before the startup of the application or service that depends on the iSCSI devices. With software initiators, which typically start later in the system boot process, you need to ensure that the software initiator starts automatically at boot time and finishes connecting to your iSCSI targets before your service, application, or file service starts.

    The solution to these problems varies among operating systems:

    UNIX/Linux systems require that the "start order" of your software initiator precede your file system mounts and the startup of the application services.

    In Linux, if you are using version 3.6.2, you might consider using an adjunct tool, such as devlabel, to help ensure that the devices are properly restored before the boot process proceeds. Read the initiator release notes for further details.

    In Windows, the iSCSI connection needs to be set as persistent in the iSCSI control panel applet. Bind Volumes should also be selected once the system and application are up and running properly. This ensures that the application restart waits until the bound volumes are re-attached.

    In the iSCSI documentation, Microsoft shows how to adjust the startup order of services so that they depend on, or wait for, the iSCSI service to start first. In addition to the above steps, this ensures that the iSCSI service is started before application services such as LAN Manager, Exchange, etc. This is done by editing registry keys to set the dependency order, and is described in the iSCSI User's Guide (uguide.doc). Dell EqualLogic also provides examples in our Knowledge Base articles:

    Shares are not maintained across reboots for group volumes connected using the Microsoft iSCSI Software Initiator

    Using the Microsoft iSCSI Software Initiator with Exchange or SQL Server

    With proper initiator settings, application startup will be reliable across system restarts.
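One way to picture the ordering requirement is a small gate that holds back the dependent service until the iSCSI block device is visible to the host. The following is a minimal Python sketch under stated assumptions, not a procedure from any initiator's documentation: the function name `wait_for_device` and any device path passed to it are hypothetical, and a real deployment would normally rely on the operating system's own start-order or service-dependency mechanism described above.

```python
import os
import time


def wait_for_device(path, timeout_s=60.0, poll_s=1.0):
    """Return True once `path` exists, or False if `timeout_s` elapses first.

    `path` is whatever device node or mount point the dependent service
    needs (a hypothetical example would be "/dev/sdb").
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if os.path.exists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        # Sleep briefly so the loop does not spin while the initiator logs in.
        time.sleep(poll_s)
```

A start-up wrapper could call this function and launch the application only when it returns True, logging an error otherwise, so the service never starts against a missing volume.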

    Sample show running command output:

    version 12.2
    no service pad
    service timestamps debug uptime
    service timestamps log uptime
    no service password-encryption
    !
    hostname SANSWITCH
    !
    ip subnet-zero
    !
    spanning-tree mode pvst ----- [or rapid-pvst]
    spanning-tree portfast default ----- [or specify on each port]
    no spanning-tree optimize bpdu transmission
    spanning-tree extend system-id
    !
    interface GigabitEthernet1/0/1
     switchport access vlan 2
     switchport mode access
     flowcontrol receive desired
     spanning-tree portfast ----- [or specify default above]
    !
    interface GigabitEthernet1/0/2
     switchport access vlan 2
     switchport mode access
     flowcontrol receive desired
     spanning-tree portfast
    !
    ----- [etc...]
    !
    interface Vlan1
     no ip address
    !
    interface Vlan2
     ip address 172.20.0.10 255.255.0.0
    !
    end
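When many ports are involved, a saved copy of the running configuration can be checked mechanically for the two per-port settings shown in the sample (flow control and portfast). The sketch below is an illustration only: the function names are hypothetical, it assumes the configuration text has one command per line with "!" separators, and it does not account for a global "spanning-tree portfast default" line, which would make the per-port portfast line unnecessary.

```python
REQUIRED = ("flowcontrol receive desired", "spanning-tree portfast")


def parse_interfaces(config_text):
    """Map each interface name to the list of commands configured under it."""
    interfaces = {}
    current = None
    for raw in config_text.splitlines():
        line = raw.strip()
        if line.startswith("interface "):
            current = line.split(None, 1)[1]
            interfaces[current] = []
        elif line == "!" or not line:
            current = None  # a separator ends the interface stanza
        elif current is not None:
            interfaces[current].append(line)
    return interfaces


def ports_missing_settings(config_text, required=REQUIRED):
    """Return {interface: [missing commands]} for access ports only."""
    missing = {}
    for name, lines in parse_interfaces(config_text).items():
        if "switchport mode access" not in lines:
            continue  # skip VLAN interfaces and anything not an access port
        absent = [cmd for cmd in required if cmd not in lines]
        if absent:
            missing[name] = absent
    return missing


SAMPLE = """\
interface GigabitEthernet1/0/1
 switchport access vlan 2
 switchport mode access
 flowcontrol receive desired
 spanning-tree portfast
!
interface GigabitEthernet1/0/2
 switchport access vlan 2
 switchport mode access
!
interface Vlan2
 ip address 172.20.0.10 255.255.0.0
!
"""
```

Calling `ports_missing_settings(SAMPLE)` reports only GigabitEthernet1/0/2, because that stanza lacks both required lines while the first port has them and the VLAN interface is not an access port.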

    When reporting or resolving a problem, be prepared to provide the Dell EqualLogic support team with as much information as possible.

    The Dell EqualLogic customer support team may also ask you to gather diagnostic information from one or more Dell EqualLogic storage arrays by using the Group Manager command line interface (CLI) diag command.

    The diag command runs a program that gathers internal state and configuration data from an array, encodes it for transmission, and segments it into a number of files, which are stored on the array in an area reserved for diagnostic use.

    The data gathered does NOT include any user data, either from the disks or from the cache, nor does it include any group account passwords or other access information.

    The diag command can be run as a CLI command via telnet or the console and, starting with firmware version 4.x, as a tool in the Group Manager GUI.

    The diag command gathers data only from the array on which it is run. If you

    have a group with multiple members, you may need to run the command

    separately on each array, and you may be instructed to do so.

    The output files from the diag command are kept on the array until they are manually

    deleted or overwritten by the next invocation of the command.

    Once the diag command completes, there are three options available for retrieving

    the data from the array:

    If e-mail notification is enabled on the array, and the array has an active

    network connection, the diag command will try to send the output segments

    to the addresses on the e-mail notification list.

    There will be a minimum of 3 e-mails and there can be up to 6 e-mails,

    depending on how much data is being delivered.

    If the array has an active network connection, you also have the option of

    using FTP to retrieve the data from the array. Steps for using FTP appear later

    in this article.

    As the diag command completes, you are given the option of having the output directed (dumped) to the console, where you can use the text capture feature of your Telnet or SSH client or terminal emulator program to capture the output.
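For the FTP option, the retrieval can be scripted with Python's standard `ftplib`. This is a hedged sketch rather than the documented procedure: the text above does not state the login account, the directory, or the names of the output segments, so the example simply downloads every file visible in the directory the FTP session starts in. The function names and that behaviour are assumptions to adjust once the actual steps are known.

```python
import os
from ftplib import FTP


def fetch_all(ftp, dest_dir):
    """Download every file listed in the FTP session's current directory.

    `ftp` is any object offering nlst() and retrbinary(), such as ftplib.FTP.
    """
    os.makedirs(dest_dir, exist_ok=True)
    saved = []
    for name in ftp.nlst():
        local_path = os.path.join(dest_dir, os.path.basename(name))
        with open(local_path, "wb") as out:
            # Binary mode keeps the encoded segments byte-for-byte intact.
            ftp.retrbinary("RETR " + name, out.write)
        saved.append(local_path)
    return saved


def fetch_diag_output(host, user, password, dest_dir):
    """Connect to the array (assumed reachable by FTP) and fetch the files."""
    with FTP(host) as ftp:
        ftp.login(user, password)
        return fetch_all(ftp, dest_dir)
```

Splitting the transfer loop from the connection step keeps the loop usable with a stand-in object, which makes it possible to try the logic without an array on the network.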

    The Diagnostic Tool in the Group Manager GUI runs a program that gathers internal state and configuration data from an array, encodes it for transmission, and segments it into a number of files, which are stored on the array in an area reserved for diagnostic use.

    The data gathered does NOT include any user data, either from the disks or from the cache, nor does it include any group account passwords or other access information.

    The diag command gathers data only from the array on which it is run. If you have a group with multiple members, you may need to run the command separately on each array, and you may be instructed to do so.

    The output files from the Diagnostic Tool are kept on the array until they are manually deleted or overwritten by the next invocation of the tool.

    Once the Diagnostic Tool completes, it sends the output to the appropriate e-mail recipients using the configured e-mail notification information:

    If e-mail notification is enabled on the array, and the array has an active network connection, the Diagnostic Tool will try to send the output segments to the addresses on the e-mail notification list.

    There will be a minimum of 3 e-mails and there can be up to 6 e-mails, depending on how much data is being delivered.

    Invoking this feature causes the diag command to generate a short, clear-text report that contains only data specific to a particular type of error, rather than the normal, longer diagnostic report that contains information about everything. This short, clear-text report is called an abbreviated diag.

    The audience for abbreviated diags consists mainly of two groups:

    1 - Support areas that will be handling routine issues such as faulty hardware component replacements (drives, power supplies) and environmental issues (for instance, the array reports that the temperature is too high).

    2 - Secure/government sites that need a human-readable version of all data destined for delivery outside of the secure area, so that it may be examined, possibly edited, and approved by a security officer before being sent.

    For firmware version 4.0.2 and above, there are currently four types of abbreviated report available: disk failure, Controller Module (CM) failure, power supply failure, and temperature event (too high or too low). More types may be added in the future as needed.

    To gather an abbreviated diag report, give the "-a" option on the command line, followed by a space and then a one-letter code that indicates the type ("d" = disk, "c" = CM, "p" = power supply, "t" = temperature). So, from the CLI:

    Example: grpname> diag "-a d"

    The abbreviated diag report will be delivered using the same mechanisms as the normal diag output on that array (that is, e-mail to the notification address list if available, fetch with ftp/scp, or capture from console). By far the best method is to turn on the capture feature of your terminal emulator and have the output sent to the serial port; this works well because the output is very short.

    Unlike the normal diag output, the abbreviated diag will consist of only one file or e-mail message, and it will be in plain text.
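The mapping from failure type to one-letter code can be captured in a small lookup so that support scripts or runbooks always produce the command in the form shown in the example above. This is an illustrative sketch; the dictionary keys and the function name are hypothetical, while the codes and the quoted "-a" form come from the text.

```python
# One-letter codes as listed above: d = disk, c = CM, p = power supply,
# t = temperature. The dictionary keys are names chosen for this sketch.
ABBREVIATED_TYPES = {
    "disk": "d",
    "controller": "c",
    "power": "p",
    "temperature": "t",
}


def abbreviated_diag_command(failure_type):
    """Return the CLI text to type at the group name prompt."""
    try:
        code = ABBREVIATED_TYPES[failure_type]
    except KeyError:
        raise ValueError("unknown failure type: " + repr(failure_type))
    return 'diag "-a {}"'.format(code)
```

For a disk failure, `abbreviated_diag_command("disk")` yields `diag "-a d"`, matching the example; an unrecognized type raises an error instead of producing a malformed command.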

    What is it?

    The abbreviated diag output is a short, clear-text report that contains only data specific to a particular type of hardware error.

    When can it be used?

    It can be used for issues where a customer reports hardware errors with disk drives. Like the full version of the diag, the abbreviated version is non-intrusive to the operation of the array and will not degrade performance while it is running.

    How do you use it?

    To gather the abbreviated diag report:

    1. Log in to the array reporting the disk failure through a telnet/SSH session or the console port, using the grpadmin account.

    2. Ensure that you connect to the member array that is having the problem (i.e., use the IP address of eth0, 1, 2, or 3; do not use the group IP address).

    3. Once logged in to the array, at the group name prompt, use the diag "-a d" command: the "-a" (abbreviated) option followed by a space and then the one-letter code "d" (disk).

    4. Search the clear-text output for the text string GetAbbrevDiskinfo to locate and confirm the failed disk.

    5. Save the abbreviated diag output file for possible later use.
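Step 4 can be automated against a captured copy of the output. The sketch below only locates the GetAbbrevDiskinfo string and returns the lines that follow it; the layout of the report itself is not described here, so nothing is parsed beyond that, and the function name and the `context` parameter are hypothetical.

```python
def find_marker(report_text, marker="GetAbbrevDiskinfo", context=3):
    """Return (line_number, lines) for each line containing `marker`.

    `lines` holds the matching line plus up to `context` lines after it,
    so the reader can inspect the disk details that follow the marker.
    """
    lines = report_text.splitlines()
    hits = []
    for index, line in enumerate(lines):
        if marker in line:
            hits.append((index + 1, lines[index:index + 1 + context]))
    return hits
```

An empty result means the marker was not found in the capture, which is worth checking before concluding that no disk has failed, since an incomplete capture would also produce it.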

    Starting with firmware version 3.1.1, the save-config command was introduced. This command allows you to save the group configuration to a file and use the information to restore a group in the event of a complete group failure.

    The save-config command creates a restoration file on the array, which contains the CLI commands needed to recreate the group. For safekeeping, use ftp or scp to copy the file from the array to a host (specify the group IP address in the ftp open or scp command).

    By default, the restoration file is named config.cli, but you can specify a different file name. For example, you can name the file for the date or group on which it was created. Also, if you have several PS Series groups, you can give each group's file a unique name to prevent confusion.
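The naming advice can be applied on the host side when copying the file off the array. The following sketch builds a dated, per-group local file name and the matching scp command line. It assumes the restoration file keeps its default name, config.cli, that it is reachable at that path over scp, and that the grpadmin account is used; the helper names and the example address are hypothetical.

```python
import datetime
import shlex


def local_backup_name(group_name, day=None):
    """Build a local file name such as 'group1-2019-07-29.cli'."""
    day = day or datetime.date.today()
    return "{}-{}.cli".format(group_name, day.isoformat())


def scp_command(group_ip, local_name, remote_name="config.cli", account="grpadmin"):
    """Return an scp command line that copies the file from the group IP."""
    remote = "{}@{}:{}".format(account, group_ip, remote_name)
    # shlex.quote guards against spaces or shell metacharacters in names.
    return "scp {} {}".format(shlex.quote(remote), shlex.quote(local_name))
```

For a group called group1 at the documentation address 192.0.2.10, the two helpers combine into `scp grpadmin@192.0.2.10:config.cli group1-<date>.cli`, giving each saved configuration a name that identifies both the group and the day it was taken.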

    The save-config command can be run in default or group-only mode. Default mode saves the configuration as a set of commands that you can run to restore the configuration. Group-only mode saves some of the information as commands and some as commented text, which you must manually edit.

    In default mode, the resulting file will automatically restore the following:

    Replication partner configuration
    Storage pools
    Member RAID level, pool, and network interface configuration
    Group customization, including lists of servers
    Volume configuration, including access control records
    Volume collection configuration
    Schedules for snapshots and replication
    Local CHAP account configuration
    Event settings
    Account configuration

    Whenever any group configuration information changes, run the save-config command again to create an updated restoration file.

    If you specify the save-config command with the -grouponly parameter, member and pool configuration information will be saved as comments instead of commands, so you must manually restore these parts of the configuration. A member's RAID level and pool must be selected before you can use the storage.

    The save-config command will not restore the basic member network configuration or the group configuration, but it will save this information as comments in the restoration file. The following information must be manually supplied to each member by running the setup utility:

    Group name and IP address
    Passwords
    Member name, IP address, default gateway, and netmask

    A save-config file may not be able to successfully restore a member that is running a firmware version different from the firmware that was running on the array that generated the restoration file.
