
Page 1: Architecting Fibre Channel HA Solutions

Architecting Fibre Channel HA Solutions

Rick Jooss

[email protected]

Page 2: Architecting Fibre Channel HA Solutions

NetApp Confidential -- Do Not Distribute

Agenda

CFModes

Single System Image

Multipathing

Host Clustering

Storage System Backend HA

Q&A


Page 4: Architecting Fibre Channel HA Solutions

CFMODE – Cluster Failover Mode

What is CFMODE?
– FCP setting
– Determines the behavior of FC target ports, particularly during a CFO event

Why is there more than one CFMODE?
– The original CFMODE (standby) did not work for all host types (HP-UX, AIX)
– The original CFMODE did not work with the 270C because it has only a single FC port

Page 5: Architecting Fibre Channel HA Solutions

Available Paths - Standby Mode

[Diagram: the host connects through Switch/Fabric 1 and Switch/Fabric 2 to the HA configuration of Controller 1 and Controller 2; each controller has target ports 0a, 0b, 0c, and 0d and its own LUNs.]

Solid blue lines are paths to the LUNs being served by Controller 1.

Dashed purple lines are paths to the LUNs being served by Controller 2.

Page 6: Architecting Fibre Channel HA Solutions

Path Access (Switch Failure) – Standby Mode

[Diagram: the host connects through Switch/Fabric 1 and Switch/Fabric 2 to the HA configuration of Controller 1 and Controller 2; each controller has target ports 0a, 0b, 0c, and 0d and its own LUNs.]

Solid blue lines are paths to the LUNs being served by Controller 1.

Dashed purple lines are paths to the LUNs being served by Controller 2.

Switch/Fabric 1 will experience a failure.

The MP layer works around the failure.

Page 7: Architecting Fibre Channel HA Solutions

Path Access (CFO event) - Standby Mode

[Diagram: the host connects through Switch/Fabric 1 and Switch/Fabric 2 to the HA configuration of Controller 1 and Controller 2; each controller has target ports 0a, 0b, 0c, and 0d and its own LUNs.]

Solid blue lines are paths to the LUNs being served by Controller 1.

Dashed purple lines are paths to the LUNs being served by Controller 2.

Controller 1 will experience a failure.

Controller 2 takes over all operations.

Page 8: Architecting Fibre Channel HA Solutions

Path Access (CFO event) - Standby Mode

[Diagram: the host connects through Switch/Fabric 1 and Switch/Fabric 2 to the HA configuration of Controller 1 and Controller 2; each controller has target ports 0a, 0b, 0c, and 0d and its own LUNs. The eight target ports are labeled WWN1 through WWN8.]

Solid blue lines are paths to the LUNs being served by Controller 1.

Dashed purple lines are paths to the LUNs being served by Controller 2.

Controller 1 will experience a failure.

Controller 2 takes over all operations.

The MP layer is not involved in the switchover.

Page 9: Architecting Fibre Channel HA Solutions

Available Paths - Partner Mode

[Diagram: the host connects through Switch/Fabric 1 and Switch/Fabric 2 to the HA configuration of Controller 1 and Controller 2; each controller has target ports 0a, 0b, 0c, and 0d and its own LUNs.]

Solid blue lines are paths to the LUNs being served by Controller 1.

Dashed purple lines are paths to the LUNs being served by Controller 2.

Page 10: Architecting Fibre Channel HA Solutions

Available Paths - Partner Mode – FAS3000 Default Configuration

[Diagram: the host connects through Switch/Fabric 1 and Switch/Fabric 2 to the HA configuration of Controller 1 and Controller 2; each controller has target ports 0c and 0d and its own LUNs.]

Solid blue lines are paths to the LUNs being served by Controller 1.

Dashed purple lines are paths to the LUNs being served by Controller 2.

Page 11: Architecting Fibre Channel HA Solutions

Available Paths - Dual Fabric

[Diagram: the host connects through Switch/Fabric 1 and Switch/Fabric 2 to the HA configuration of Controller 1 and Controller 2, each with its own LUNs. The target port labels 0c_0 and 0c_2 each appear twice.]

Solid blue lines are paths to the LUNs being served by Controller 1.

Dashed purple lines are paths to the LUNs being served by Controller 2.


Page 13: Architecting Fibre Channel HA Solutions

What is the single system image cfmode?

Universal cfmode
– Works on all HA storage systems
– Works on all switches

Presents the HA configuration as a single target

All LUNs are visible on all controller ports

All hosts require multipathing software
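
To make "all LUNs are visible on all controller ports" concrete, the following Python sketch enumerates the paths a host could see to one LUN under single system image. It is only an illustration: the port names, the `Path` type, and the `ssi_paths` helper are hypothetical, and the "primary"/"proxy" labels borrow the terminology used later in this deck for ports on the owning controller and on its partner.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Path:
    controller: str  # controller whose target port the path ends on
    port: str        # target port name, e.g. "0c"
    kind: str        # "primary" if that controller owns the LUN, else "proxy"

def ssi_paths(owner, ports_by_controller):
    """List one path per target port for a LUN owned by `owner`.

    Under single system image the LUN is visible on the ports of both
    controllers, so every port contributes a path.
    """
    paths = []
    for controller, ports in ports_by_controller.items():
        kind = "primary" if controller == owner else "proxy"
        for port in ports:
            paths.append(Path(controller, port, kind))
    return paths

# Hypothetical single-card layout: two target ports per controller.
ports = {"controller1": ["0c", "0d"], "controller2": ["0c", "0d"]}
paths = ssi_paths("controller1", ports)
primary = [p for p in paths if p.kind == "primary"]
proxy = [p for p in paths if p.kind == "proxy"]
print(len(paths), "paths:", len(primary), "primary,", len(proxy), "proxy")
```

Because the host sees several paths to the same LUN, it needs multipathing software to treat them as one device, which is why the slide lists it as a requirement.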

Page 14: Architecting Fibre Channel HA Solutions

Available Paths - Single System Image – Single Card

[Diagram: the host connects through Switch/Fabric 1 and Switch/Fabric 2 to the HA configuration of Controller 1 and Controller 2; each controller has target ports 0c and 0d and its own LUNs.]

Solid blue lines are paths to the LUNs being served by Controller 1.

Dashed purple lines are paths to the LUNs being served by Controller 2.

Page 15: Architecting Fibre Channel HA Solutions

Path Access (Switch Failure) - Single System Image – Single Card

[Diagram: the host connects through Switch/Fabric 1 and Switch/Fabric 2 to the HA configuration of Controller 1 and Controller 2; each controller has target ports 0c and 0d and its own LUNs.]

Solid blue lines are paths to the LUNs being served by Controller 1.

Dashed purple lines are paths to the LUNs being served by Controller 2.

Switch/Fabric 1 will experience a failure.

The MP layer works around the failure.

Page 16: Architecting Fibre Channel HA Solutions

Path Access (CFO event) - Single System Image – Single Card

[Diagram: the host connects through Switch/Fabric 1 and Switch/Fabric 2 to the HA configuration of Controller 1 and Controller 2; each controller has target ports 0c and 0d and its own LUNs.]

Solid blue lines are paths to the LUNs being served by Controller 1.

Dashed purple lines are paths to the LUNs being served by Controller 2.

Controller 1 will experience a failure.

Controller 2 takes over all operations.

The MP layer works around the failure.

Page 17: Architecting Fibre Channel HA Solutions

Available Paths - Single System Image – Single Port

[Diagram: the host connects through Switch/Fabric 1 and Switch/Fabric 2 to the HA configuration of Controller 1 and Controller 2; each controller has a single target port, 0d, and its own LUNs.]

Solid blue lines are paths to the LUNs being served by Controller 1.

Dashed purple lines are paths to the LUNs being served by Controller 2.

Page 18: Architecting Fibre Channel HA Solutions

Available Paths - Single System Image – Single Port

[Diagram: the host connects to target port 0d on Controller 1 and on Controller 2 (HA configuration, each with its own LUNs); no switches are shown, and both links are labeled Loop Mode.]

Solid blue lines are paths to the LUNs being served by Controller 1.

Dashed purple lines are paths to the LUNs being served by Controller 2.

Page 19: Architecting Fibre Channel HA Solutions

Why SSI mode?

Works in all configurations

Makes us look more like other SAN vendors

Reduces port burn without using FC Loop
– A fully redundant config requires only 1 “wire” per controller, instead of 2

Simpler wiring: no a/b port distinctions, and no requirement to run the same cables from each controller to the same switch

Page 20: Architecting Fibre Channel HA Solutions

Management changes

Unified LUN mapping address space across the HA configuration
– The controller prevents LUN mapping conflicts by checking with the partner controller

If the controller interconnect is down, some operations are disabled by default
– igroup add, lun map, lun online, igroup set ostype
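
The partner check described above can be pictured with a small Python model. This is a sketch under stated assumptions, not Data ONTAP behavior: the `Controller` class, its `lun_map` method, and the choice to key mappings on an (igroup name, LUN ID) pair are all hypothetical simplifications.

```python
class InterconnectDown(Exception):
    """Raised when the partner controller cannot be consulted."""

class Controller:
    def __init__(self, name):
        self.name = name
        self.partner = None
        self.interconnect_up = True
        self.maps = {}  # (igroup, lun_id) -> LUN path

    def lun_map(self, lun_path, igroup, lun_id):
        # Without the interconnect the partner's mappings cannot be checked,
        # so the operation is refused (modeling "disabled by default").
        if not self.interconnect_up:
            raise InterconnectDown("lun map disabled: interconnect is down")
        key = (igroup, lun_id)
        # The address space is shared, so look at both controllers.
        for ctrl in filter(None, (self, self.partner)):
            if key in ctrl.maps:
                raise ValueError(
                    f"LUN ID {lun_id} already mapped to {igroup} on {ctrl.name}")
        self.maps[key] = lun_path

c1, c2 = Controller("controller1"), Controller("controller2")
c1.partner, c2.partner = c2, c1
c1.lun_map("/vol/vol1/lun0", "host_a", 0)
try:
    # Same igroup and LUN ID on the partner: rejected as a conflict.
    c2.lun_map("/vol/vol2/lun0", "host_a", 0)
except ValueError as err:
    print(err)
```

The model also suggests why the listed operations are blocked when the interconnect is down: each of them could create a mapping the partner has no chance to veto.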

Page 21: Architecting Fibre Channel HA Solutions


SSI Roadmap

Introduced in ONTAP 7.1

Refer to FCP host compatibility matrix http://now.netapp.com/NOW/knowledge/docs/san/fcp_iscsi_config/index.shtml for specific host support


Page 23: Architecting Fibre Channel HA Solutions

Multipathing

Multipathing provides multiple paths from the host to the external storage device

Provides high availability
– Protects against path failures
– Ensures high availability of applications and data by eliminating single points of failure

Provides improved performance
– Increases potential performance by utilizing multiple paths

Page 24: Architecting Fibre Channel HA Solutions

Multipathing

[Diagram: the host connects through Switch/Fabric 1 and Switch/Fabric 2 to the HA configuration of Controller 1 and Controller 2; each controller has target ports 0c and 0d and its own LUNs.]

Page 25: Architecting Fibre Channel HA Solutions

A/P (active passive) policy – Single LUN

[Diagram: hosts connect through Switch/Fabric 1 and Switch/Fabric 2 to the HA configuration of Controller 1 and Controller 2; each controller has target ports 0c and 0d and its own LUNs.]

Page 26: Architecting Fibre Channel HA Solutions

A/P (active passive) policy – No Round Robining

[Diagram: hosts connect through Switch/Fabric 1 and Switch/Fabric 2 to the HA configuration of Controller 1 and Controller 2; each controller has target ports 0c and 0d. The LUNs shown are LUN1, LUN2, LUN3, and LUN4.]

Page 27: Architecting Fibre Channel HA Solutions

A/P (active passive) policy - Round Robining

[Diagram: hosts connect through Switch/Fabric 1 and Switch/Fabric 2 to the HA configuration of Controller 1 and Controller 2; each controller has target ports 0c and 0d. The LUNs shown are LUN1, LUN2, LUN3, and LUN4.]

Page 28: Architecting Fibre Channel HA Solutions

A/P (active/passive)

Active/passive configuration
– 1 active path to a single LUN
  • Performance to a LUN is limited by that path's capability (HBA, switch, target port)
– Possible to round robin multiple LUNs across multiple paths
– All other paths to the LUN are passive
– On failover
  • Primary paths are tried first
  • Secondary paths are used if no primary paths are available
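
The active/passive rules above (one active path per LUN, primary paths tried before secondary ones, LUNs optionally spread across paths) can be sketched in a few lines of Python. The path names such as "c1:0c" and both helper functions are hypothetical; no particular multipathing product works exactly this way.

```python
from itertools import cycle

def pick_active(paths, failed):
    """Choose the single active path for one LUN.

    `paths` is a list of (name, kind) tuples, kind being "primary" or
    "secondary"; `failed` is the set of path names known to be down.
    Primary paths are tried first, then secondary ones.
    """
    for wanted in ("primary", "secondary"):
        for name, kind in paths:
            if kind == wanted and name not in failed:
                return name
    return None  # no usable path left

def spread_luns(luns, primary_paths):
    """Round-robin LUNs over the primary paths, one active path per LUN."""
    return dict(zip(luns, cycle(primary_paths)))

paths = [("c1:0c", "primary"), ("c1:0d", "primary"),
         ("c2:0c", "secondary"), ("c2:0d", "secondary")]

print(pick_active(paths, failed=set()))
print(pick_active(paths, failed={"c1:0c", "c1:0d"}))
print(spread_luns(["LUN1", "LUN2", "LUN3", "LUN4"], ["c1:0c", "c1:0d"]))
```

Spreading LUNs this way does not raise the ceiling for any single LUN, which is still bound by its one active path, but it lets the aggregate load use more than one path.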

Page 29: Architecting Fibre Channel HA Solutions

A/A (Active active) policy (cfmode = standby)

[Diagram: hosts connect through Switch/Fabric 1 and Switch/Fabric 2 to the HA configuration of Controller 1 and Controller 2; each controller has target ports 0a, 0b, 0c, and 0d and its own LUNs.]

Page 30: Architecting Fibre Channel HA Solutions

A/A (active/active)

Host accesses data from a single LUN across multiple paths simultaneously
– Typically used for load balancing
  • Round Robin
  • Least Queue Depth
  • Weighted
– On failure, I/Os are sent down the remaining available paths
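
The three load-balancing policies named above differ only in how the next path is chosen. The Python sketch below shows one plausible reading of each; the function names, the path names "p1" to "p3", and the use of random selection for the weighted policy are assumptions for illustration, not a description of any vendor's implementation.

```python
import itertools
import random

def available(paths, failed):
    """Paths that remain usable after removing failed ones."""
    return [p for p in paths if p not in failed]

def round_robin(paths):
    """Iterator that hands out the paths in turn, forever."""
    return itertools.cycle(paths)

def least_queue_depth(paths, outstanding):
    """Path with the fewest outstanding I/Os."""
    return min(paths, key=lambda p: outstanding[p])

def weighted(paths, weights, rng):
    """Random path, chosen in proportion to its weight."""
    return rng.choices(paths, weights=[weights[p] for p in paths], k=1)[0]

all_paths = ["p1", "p2", "p3"]

rr = round_robin(all_paths)
print([next(rr) for _ in range(4)])

print(least_queue_depth(all_paths, {"p1": 3, "p2": 1, "p3": 2}))

rng = random.Random(0)
print(weighted(all_paths, {"p1": 1, "p2": 0, "p3": 0}, rng))

# On failure, the same policies simply run over the remaining paths.
remaining = available(all_paths, failed={"p2"})
print(least_queue_depth(remaining, {"p1": 3, "p2": 1, "p3": 2}))
```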

Page 31: Architecting Fibre Channel HA Solutions

A/A/A (asymmetric active active)

[Diagram: the host connects through Switch/Fabric 1 and Switch/Fabric 2 to the HA configuration of Controller 1 and Controller 2; each controller has target ports 0c and 0d and its own LUNs.]

Page 32: Architecting Fibre Channel HA Solutions


A/A/A (asymmetric active active)

Distinguishes between primary and secondary paths

Does active/active across primary paths only

Only uses secondary paths when no primary are available
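
A compact way to read the three rules above: run active/active over whichever tier is still alive, preferring the primary tier. The following Python sketch does this with simple round robin; the class name, the path names, and the choice of round robin within a tier are illustrative assumptions.

```python
import itertools

class AsymmetricActiveActive:
    """Round-robin over live primary paths; fall back to secondary paths
    only when no primary path is usable."""

    def __init__(self, primary, secondary):
        self.primary = list(primary)
        self.secondary = list(secondary)
        self.failed = set()
        self._turn = itertools.count()

    def next_path(self):
        live_primary = [p for p in self.primary if p not in self.failed]
        live_secondary = [p for p in self.secondary if p not in self.failed]
        pool = live_primary or live_secondary
        if not pool:
            raise RuntimeError("no usable path")
        return pool[next(self._turn) % len(pool)]

selector = AsymmetricActiveActive(primary=["c1:0c", "c1:0d"],
                                  secondary=["c2:0c", "c2:0d"])
before = [selector.next_path() for _ in range(4)]

# Lose both primary paths: I/O moves to the secondary tier.
selector.failed.update({"c1:0c", "c1:0d"})
after = [selector.next_path() for _ in range(2)]
print(before, after)
```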

Page 33: Architecting Fibre Channel HA Solutions

NetApp’s Multipathing Strategy

Two-pronged strategy
– Support for “native” solutions
  • What most customers rightly feel best about
– Support for a host- and storage-independent solution
  • VERITAS
  • Allows a common solution across various server as well as storage variants

Page 34: Architecting Fibre Channel HA Solutions

Multipathing For Windows

Windows MPIO
– Uses the Microsoft standard infrastructure
– A/P policy
– Automatically chooses primary paths for failover before trying proxy ones
– In standby, the LUNs are automatically round-robined across all paths

                      MPIO
Partner/SSI cfmode    A/P
Standby cfmode        A/P
Dual Fabric cfmode    A/P

Page 35: Architecting Fibre Channel HA Solutions

MultiPathing For Solaris

                      DMP 4.0   MPxIO
Partner/SSI cfmode    A/A/A     A/P
Standby cfmode        A/A       N/A
Dual Fabric cfmode    A/P       A/P

Page 36: Architecting Fibre Channel HA Solutions

MultiPathing For Solaris

VERITAS DMP 4.0
– NetApp ASL 4.0
– Supports A/P, A/A, & A/A/A (Active Passive Concurrent)

SUN Native MPxIO
– Not supported with standby cfmode
– Supports A/P
– Can be A/A but requires manual failback
– Manual configuration required
– Round robining of the LUNs possible
– Sometimes called
  • Traffic Manager
  • Leadville Stack

Page 37: Architecting Fibre Channel HA Solutions

MultiPathing For Linux

QLogic
– A/P policy
– Manually configured
– Round robining of LUNs is possible

DCM
– Linux native solution

                      QLogic    DM
Partner/SSI cfmode    A/P       A/A/A
Standby cfmode        A/P       A/A
Dual Fabric cfmode    A/P       A/P

Page 38: Architecting Fibre Channel HA Solutions

MultiPathing For AIX

                      DMP 4.0   SANpath   MPIO
Partner/SSI cfmode    A/A/A     A/A/A     A/A/A
Standby cfmode        N/A       N/A       N/A
Dual Fabric cfmode    A/P       A/P       A/P

Page 39: Architecting Fibre Channel HA Solutions

MultiPathing For AIX

SANpath
– A/A/A
– Automatically chooses primary paths for failover before trying proxy ones
– Special policy for SCSI-2 reservation
  • Required for host clustering HACMP
  • Can only use A/P

VERITAS DMP 4.0
– Only supports A/A/A

IBM MPIO
– IBM native solution with NetApp PCM

Page 40: Architecting Fibre Channel HA Solutions

Multipathing for HP-UX

                      PVLinks   DMP 3.5
Partner/SSI cfmode    A/P       A/P
Standby cfmode        N/A       N/A
Dual Fabric cfmode    A/P       A/P

Page 41: Architecting Fibre Channel HA Solutions

Multipathing for HP-UX

PVLinks/LVM
– A/P policy
– Single active path per LUN, user controlled
– Ordering for remaining paths for failover
– ntap_config_paths
  • NetApp script to define path ordering based on filer path types: primary, proxy
  • Automatically round-robins primary paths among all LUNs
– Supports both FCP and iSCSI paths

VERITAS DMP 3.5
– A/P policy
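
The path ordering that the slide attributes to ntap_config_paths can be illustrated with a short Python sketch. This is not the script's actual logic, which the deck does not show; `pvlinks_order` and the path names are hypothetical, and the rotation scheme is just one simple way to put primary paths first while varying the active path from LUN to LUN.

```python
def pvlinks_order(luns, primary, proxy):
    """Return {lun: ordered path list}; the first entry is the active path.

    Primary paths come first, rotated per LUN so that consecutive LUNs get
    different active paths; proxy paths follow as last-resort alternates.
    """
    if not primary:
        raise ValueError("at least one primary path is required")
    order = {}
    for i, lun in enumerate(luns):
        shift = i % len(primary)
        rotated = primary[shift:] + primary[:shift]
        order[lun] = rotated + list(proxy)
    return order

order = pvlinks_order(["lun0", "lun1", "lun2"],
                      primary=["pri_a", "pri_b"],
                      proxy=["prx_a", "prx_b"])
for lun, path_list in order.items():
    print(lun, path_list)
```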

Page 42: Architecting Fibre Channel HA Solutions

Multipathing for VMware

VMware
– A/P policy
– Manually configured
– Round robining of LUNs possible

                      VMware
Partner/SSI cfmode    A/P
Standby cfmode        A/P
Dual Fabric cfmode    A/P

Page 43: Architecting Fibre Channel HA Solutions

Multipathing for Netware

Novell
– A/P policy
– Manually configured
– Round robining of LUNs possible

                      Novell
Partner/SSI cfmode    A/P
Standby cfmode        A/P
Dual Fabric cfmode    A/P

Page 44: Architecting Fibre Channel HA Solutions

Fibre Channel SAN Host Support

Host multipathing solution       Partner/SSI cfmode   Standby cfmode   Dual Fabric cfmode
Windows “NTAP DSM”               A/P                  A/P              A/P
Solaris “DMP”                    A/A/A                A/A              A/P
Solaris “MPxIO”                  A/P                  N/A              A/P
Linux: QLogic “Failover Mode”    A/P                  A/P              A/P
VMware Multipathing              A/P                  A/P              A/P
AIX “SANpath”                    A/A/A                N/A              A/P
HP-UX “PVLinks”                  A/P                  N/A              A/P
Novell                           A/P                  A/P              A/P
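
For planning purposes, the support matrix above can be captured as a lookup table. The Python sketch below transcribes the values from this slide as read together with the per-OS slides earlier in the deck; the dictionary keys, `policy`, and `stacks_supporting` are hypothetical names, and the current compatibility matrix should be treated as the authority.

```python
# Multipathing policy per host stack and cfmode, transcribed from the slide.
POLICY = {
    "Windows NTAP DSM":      {"partner/ssi": "A/P",   "standby": "A/P", "dual fabric": "A/P"},
    "Solaris DMP":           {"partner/ssi": "A/A/A", "standby": "A/A", "dual fabric": "A/P"},
    "Solaris MPxIO":         {"partner/ssi": "A/P",   "standby": "N/A", "dual fabric": "A/P"},
    "Linux QLogic":          {"partner/ssi": "A/P",   "standby": "A/P", "dual fabric": "A/P"},
    "VMware":                {"partner/ssi": "A/P",   "standby": "A/P", "dual fabric": "A/P"},
    "AIX SANpath":           {"partner/ssi": "A/A/A", "standby": "N/A", "dual fabric": "A/P"},
    "HP-UX PVLinks":         {"partner/ssi": "A/P",   "standby": "N/A", "dual fabric": "A/P"},
    "Novell":                {"partner/ssi": "A/P",   "standby": "A/P", "dual fabric": "A/P"},
}

def policy(stack, cfmode):
    """Policy string for a host stack under a cfmode (case-insensitive)."""
    return POLICY[stack][cfmode.lower()]

def stacks_supporting(cfmode):
    """Host stacks whose entry for the cfmode is anything other than N/A."""
    mode = cfmode.lower()
    return sorted(s for s, row in POLICY.items() if row[mode] != "N/A")

print(policy("Solaris DMP", "Standby"))
print(stacks_supporting("standby"))
```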


Page 46: Architecting Fibre Channel HA Solutions

Host Clustering & Storage

LUNs need to be made visible to the hosts simultaneously

Some host clustering solutions require SCSI reservations to avoid split brain

[Diagram: Host 1 and Host 2 connect through Switch/Fabric 1 and Switch/Fabric 2 to Controller 1 and Controller 2; each controller has ports 0a, 0b, 0c, and 0d. The back end shows the LUNs on the Controller 1 active shelf(s) and the Controller 2 active shelf(s).]

Page 47: Architecting Fibre Channel HA Solutions

Host Clustering for Microsoft

Microsoft Cluster
– SnapDrive is integrated to help with configuration
– WIN2K3 allows a single HBA for both boot device & shared storage
– Cannot grow a LUN online in a cluster
  • SnapDrive's ability to grow a LUN very quickly minimizes the pain this causes

Page 48: Architecting Fibre Channel HA Solutions

Host Clustering for VERITAS

VCS
– By default does not use I/O fencing to protect against split brain
– I/O fencing requires SCSI-3 reservations
– 7.0.3 will have SCSI-3 reservations that are compatible with VERITAS
– Does not do failover on FC links

Page 49: Architecting Fibre Channel HA Solutions

Host Clustering for HP-UX

ServiceGuard
– 1 to 3 node clusters using SCSI-2 locks as arbitrator to avoid split brain
– Does not do failover on dead FC links

Page 50: Architecting Fibre Channel HA Solutions

Host Clustering for AIX

HACMP
– Uses SCSI-2 locks as arbitrator to avoid split brain
  • “setsp –b2” to enable locks with SANpath
  • SCSI-2 locks and active/active are mutually exclusive
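
How a reservation acts as an arbitrator can be shown with a toy Python model. It is a deliberately simplified picture of an exclusive SCSI-2 style reservation, under the assumption that only one initiator can hold it at a time; the `Lun` class and `arbitrate` function are hypothetical and do not model real SCSI command handling.

```python
class Lun:
    """Toy LUN that honors a single exclusive reservation."""

    def __init__(self):
        self.holder = None

    def reserve(self, node):
        # Succeeds if the LUN is free or already held by the same node.
        if self.holder in (None, node):
            self.holder = node
            return True
        return False  # reservation conflict

    def release(self, node):
        # Only the current holder can release the reservation.
        if self.holder == node:
            self.holder = None

def arbitrate(lun, nodes):
    """Each node tries to reserve the lock LUN; only winners keep running."""
    return [n for n in nodes if lun.reserve(n)]

lock_lun = Lun()
survivors = arbitrate(lock_lun, ["node1", "node2"])
print(survivors)
```

Since the reservation is tied to one holder at a time, the model also hints at why the slide calls SCSI-2 locks and active/active mutually exclusive.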

Page 51: Architecting Fibre Channel HA Solutions

Fibre Channel SAN Host Support

[Table: the OS Vendor column held vendor logos, which are not preserved; each row below is one OS.]

OS Vendor   HBA              Multipath                 Host Cluster                    Volume Mgr          File System
[logo]      Native           SANpath                   HACMP                           LVM                 JFS/2, Raw
[logo]      QLogic           QLogic                    Novell Clusters                                     NSS
[logo]      Emulex           MPIO                      MSCS                            MMC                 NTFS
[logo]      Emulex           Veritas DMP               Veritas VCS                     Veritas VxVM        Veritas VxFS
[logo]      Native           HP PVLinks, Veritas DMP   MC ServiceGuard, Veritas VCS    LVM, Veritas VxVM   JFS/HFS, Raw, Veritas VxFS
[logo]      QLogic           QLogic                    Oracle 9i, 10g RAC                                  ext3, ext2, Reiser
[logo]      QLogic           QLogic                    Oracle 9i, 10g RAC                                  ext3, ext2, Reiser
[logo]      Emulex, QLogic   VMware                    MSCS, VirtualCenter (VMotion)                       VMware VMFS 2.x, Raw

Page 52: Architecting Fibre Channel HA Solutions

Shared Storage


Page 54: Architecting Fibre Channel HA Solutions

Enables Dual Path HA

Key Benefits
– Full storage hardware redundancy in HA systems
– Prevent cluster failover events due to many storage issues
– Complements CFO for improved HA and resiliency

[Diagram: back-end Loop 1, Loop 2, Loop 3, and Loop 4, with an X marking each of the following failure points.]
– Protect against cable pulls or breaks
– Protect against single HBA failure
– Protect against storage controller (e.g. ESH2) hot swap

Page 55: Architecting Fibre Channel HA Solutions

Switched Back-End

Dual Active Paths for HA Environments
– Reduces the number of HA failovers
– Improves overall HA performance
– Data ONTAP tries to balance load across paths

SyncMirror
– SyncMirror requires 100% disk overhead
– Proper configuration survives all single failures
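
As a quick worked example of the "100% disk overhead" figure, the Python sketch below doubles the disk count for a mirrored configuration. The function names are hypothetical, and the arithmetic deliberately ignores parity disks and spares, which the slide does not discuss.

```python
def syncmirror_disk_count(disks_without_mirror):
    """Disks needed when every disk is mirrored (100% overhead)."""
    return 2 * disks_without_mirror

def usable_fraction(disks_without_mirror):
    """Share of the mirrored disk count that holds unique data."""
    return disks_without_mirror / syncmirror_disk_count(disks_without_mirror)

# A hypothetical shelf of 14 disks needs 28 disks once mirrored.
print(syncmirror_disk_count(14), usable_fraction(14))
```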

Page 56: Architecting Fibre Channel HA Solutions

Agenda

CFModes

Single System Image

Multipathing

Host Clustering

Storage System Backend HA

Q&A?