Storage Area Network Baoquan Zhang


Outline

• What is a SAN?

• Why SAN?

• What is a SAN composed of?

• SAN, NAS or DAS?

What is a SAN?

• Any high-performance network whose primary purpose is to enable storage devices to communicate with computer systems and with each other. *

• A high-speed network, an extension to the storage bus, allows the establishment of direct connections between storage devices and processors (servers). **

• A network that provides access to consolidated, block level data storage. ***

* www.snia.org
** Khattar, Ravi Kumar, et al. Introduction to Storage Area Network, SAN. IBM Corporation, International Technical Support Organization, 1999.
*** https://en.wikipedia.org/wiki/Storage_area_network


Why SAN?

• Industry recognition: three-tier architecture
  • Presentation: Desktop (PC, NC)
  • Processing: Application Servers
  • Data Storage: Storage Devices

Why SAN?

Client/Server Computing with DAS

[Diagram: Clients, Client Access LAN, Application Servers, Storage Devices (DAS); servers A and B each form an information island]

• Limited data transmission distance (SCSI: 1.5 m to 25 m)
• Poor scalability: disks must be added for each server
• Hard to share data (information islands)
  • Extra resources spent copying and transmitting data
  • Risk of working with out-of-date data

Why SAN?

Storage Area Network: Universal Storage Connectivity

• Good scalability: scale performance and capacity
• Relatively long-distance data transmission
  • IP: Internet-based, long-distance
  • FC: 15 m to 10 km
  • IB: 15 m to 10 km
• No-copy data sharing: shared storage pool

[Diagram: Clients, Client Access LAN, Application Servers, Storage Area Network, Storage]

Storage Model 1: Direct Attached Storage (DAS)

• All storage stranded behind server

• Proprietary access (vendor specific)

• Storage sharing creates CPU overhead

• Network burdened with disk I/O traffic

• Limited scalability and low performance

Server

Storage Model 2: Fibre Channel SAN

• Replaces parallel SCSI transport

• SAN is DAS from servers’ perspective

• Optimized for movement of data from server to disk or tape

• Facilitates storage clustering & LAN-free backup

• Typically does not use LAN protocols, relies on serial SCSI (SCSI-3)

[Diagram: three Servers, each with its own SAN, connected to the Internet and the Intranet]

Storage Model 2: FC SAN Limitations

• Creates a 3rd network (LAN, WAN, SAN)

• Pre-Gigabit Ethernet bandwidth assumptions

• Management nightmare

• Limited interoperability

• Minimal storage security

• Creates “SAN Islands”

[Diagram: three Servers, each with its own SAN, connected to the Internet and the Intranet]

Storage Model 3: IP-SAN

• Best features of Fibre Channel & IP networks

• Multiple server operating systems supported

• Maintain IT infrastructure, security & interoperability

• Ease of configuration and management

• Servers used optimally

• Support IP Quality of Service, Error detection & Prioritization

[Diagram: Linux, Windows 2000, Sun, and NT 4.0 hosts; an IP network carrying storage, data, video, and voice; an SN 5420; Fibre Channel storage]

Active Disk with OSD Capability as an Example of Intelligent Storage Devices

• IP network attached
• More processing power and memory

Storage Area Network

• Server Architecture Based on SAN & NAS
• Network Protocol (FC-AL and SSA)
• Spatial Reuse
• Multiple Links and Switch Based Multiple FC-AL

[Diagram: Hosts with Internet connections, attached to SANs built from multiple FC-AL loops]

Previous Research on SAN

• Efficient Protocol Design for FC-AL and SSA

• Emphasis on performance for future disks

• Built detailed simulation models for both FC-AL and SSA

• Supported by Seagate and IBM Storage Systems Division

• Scalable Streaming Video Servers based on SAN

• Co-founded a streaming video server company, Streaming21

• Many publications on streaming video servers and streaming video delivery over Internet

Serial Storage Interfaces

• FC-AL

• SSA

• FC-TORN

• FC-AL3

• InfiniBand

Serial Storage Interfaces

• Fibre Channel
  • FC-AL
  • FC Switch
• Serial Storage Architecture (SSA)
  • Buffer Insertion Ring
  • Link-by-link flow control
  • Fairness Algorithm
  • Independent links: Spatial Reuse
  • Fault tolerance against link failure

FC-AL Features

• Bandwidth: 100 MB/s

• Connectivity: 126 devices

• Connection Distance: 30 m device to device (with copper) and 10 km (with fiber optics)

• Fault-Tolerance: CRC protected frames, dual port, hot plug connector

• Distributed switch logic

FC-AL Fairness Algorithm

• Based on an Access Window with a history variable ACCESS

• Default value of ACCESS is true

• When an L_Port wins the arbitration, set ACCESS to false

• Before opening a circuit, the winner sends out ARB(F0) to detect whether other L_Ports are also arbitrating

• If it receives ARBx back, other L_Ports are arbitrating

• When relinquishing the loop, the winner sends out:
  • ARB(F0) if other L_Ports are arbitrating, or
  • IDLE to trigger all L_Ports to reset ACCESS to true
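The access-window rules above can be sketched as a small simulation. This is a simplified model under stated assumptions, not the FC-AL state machine: the ARB(F0)/ARBx exchange is reduced to a check for other contenders, loop priority is modeled as list order, and `simulate` with its parameters is a hypothetical name introduced only for illustration.

```python
def simulate(priority_order, requests):
    """Return the order in which L_Ports win arbitration.

    priority_order: port names, highest priority first (assumed model).
    requests: dict mapping port name -> number of loop tenancies wanted.
    """
    access = {name: True for name in priority_order}  # ACCESS defaults to true
    remaining = dict(requests)
    winners = []
    while any(count > 0 for count in remaining.values()):
        # Only ports whose ACCESS is still true may arbitrate.
        contenders = [p for p in priority_order
                      if remaining.get(p, 0) > 0 and access[p]]
        winner = contenders[0]
        access[winner] = False          # the winner leaves the current window
        remaining[winner] -= 1
        winners.append(winner)
        # Stand-in for the ARB(F0) probe: is anyone else still arbitrating?
        others = any(remaining.get(p, 0) > 0 and access[p]
                     for p in priority_order if p != winner)
        if not others:
            # Winner sends IDLE: every L_Port resets ACCESS to true.
            access = {name: True for name in priority_order}
    return winners


print(simulate(["A", "B", "C"], {"A": 3, "C": 2}))
```

In this model the high-priority port A cannot take the loop twice in a row while C is still waiting inside the same window, which is the starvation the access window is meant to prevent.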

SSA Features

• 2-in and 2-out links per node (with 20 MB/s per link)

• Fairness Access

• Fault tolerance: a multiple-host configuration offers fault tolerance against host, link and adapter failures.

• Number of attachments: 126 for SSA

• Compact connectors: serial vs. parallel for SCSI

• Transmission distance: 25 m between devices with copper cables (2.5 km with fiber optic)

Spatial Reuse

• What is spatial reuse?
  • Concurrent non-overlapping transfers can each utilize the full link bandwidth

• Why is it important?
  • Throughput can scale up with more links and more non-overlapping transfers

• Achieved throughput could be as low as a single link's bandwidth

• Device/data sharing may reduce spatial reuse potential
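A rough numerical sketch of the two extremes above. The model is an assumption made for illustration: a unidirectional ring where link `i` runs from node `i` to node `i + 1`, transfers that share a link split its bandwidth equally, and each transfer runs at the rate of its most loaded link. The 20 MB/s figure is the per-link SSA rate quoted earlier; the function names are hypothetical.

```python
from collections import Counter


def links_used(src, dst, n_nodes):
    """Links crossed going from src to dst (src != dst) on a one-way ring."""
    links = []
    node = src
    while node != dst:
        links.append(node)              # link `node` goes to node + 1
        node = (node + 1) % n_nodes
    return links


def aggregate_throughput(transfers, n_nodes, link_bw):
    """Total rate of all transfers under the equal-share assumption."""
    paths = [links_used(s, d, n_nodes) for s, d in transfers]
    load = Counter()
    for path in paths:
        load.update(path)
    # Each transfer is limited by the busiest link on its path.
    return sum(link_bw / max(load[link] for link in path) for path in paths)


# Four non-overlapping transfers: every link carries at most one transfer.
print(aggregate_throughput([(0, 2), (2, 4), (4, 6), (6, 0)], 8, 20))
# Four transfers to the same device: they all share the link into node 4.
print(aggregate_throughput([(0, 4), (1, 4), (2, 4), (3, 4)], 8, 20))
```

The first case scales to four times the link bandwidth, while the second collapses to a single link's bandwidth because every transfer crosses the same final link.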

SSA SAT Fairness Algorithm

• Based on token passing and quota

• Forwarding frames have higher priority than originating frames

• Holding a token allows a node to switch the priority between the originating and forwarding traffic.

• Hold quota (a_quota): number of frames that can be originated when holding the SAT token.

• Idle quota (b_quota): number of frames that can be originated since a node passed the SAT token last time and the channel is idle. In general, b_quota =4*a_quota.

Fairness vs. Channel Utilization

• How to define fairness?

• How to improve channel utilization?

• Starvation possible?

• Fairness+throughput

FC-TORN

• B_RDY (credit) is used to control the number of frames that can potentially be sent to a destination (disk).

• SAT token based on one quota for each source (host or disk) to control the maximum number of frames sent by a source.

• B_RDY and B_RDY’ are used to produce fairness from sources to a destination.



SAN Components

[Diagram: Server Host, Adapter, Interconnects, Storage Array (Tapes, Hard Disks)]

Interconnects (the heart of a SAN)

• Cables: connect the components with each other
• Adapters: connect to devices and control the protocol
• Switches (fabric): interconnect devices, increase bandwidth, reduce congestion, provide aggregate throughput, and provide simple NameServer services
• Hubs (arbitrated loop): share bandwidth

SAN types by interconnect:

• Fibre Channel SAN or FC SAN
• IP Network SAN or IP SAN
• InfiniBand SAN or IB SAN

Note: The SNIA definition doesn't say that a SAN uses Fibre Channel or Ethernet or any other specific interconnect technology. A growing number of network technologies have architectural and physical properties that make them suitable for use in SANs. (Source: http://www.snia.org/education/storage_networking_primer/san/what_san)


Fibre Channel

SAN Components

Fibre Channel development started in 1988, with ANSI standard approval in 1994, to simplify the HIPPI (High Performance Parallel Interface) system.

FC is a high-speed network technology (commonly running at 2-, 4-, 8- and 16-gigabit per second rates) primarily used to connect computer data storage. (32-gigabit and 128-gigabit speeds expected in 2016.)

FC is designed to combine the best of the I/O channel and networking.

• Networking pays most attention to handling changes of configuration and load, as well as addressing data to the proper destination.
• An I/O channel focuses on performance, which means moving data with the least latency by using a rigorous and simple protocol.

FC maintains the speed and low overhead of a channel while adding the flexibility (through connectivity) and the longer distances that are characteristic of a network.

SAN Components

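The competition of high-end storage technology

|                      | FC                   | SSA                                                                                         |
|----------------------|----------------------|---------------------------------------------------------------------------------------------|
| Throughput           | 531.25 Mb/s          | 640 Mb/s                                                                                    |
| Device amount        | Unlimited            | Up to 192 hot-swappable hard disks per system; up to 32 separate RAID arrays per adaptor    |
| Distance             | 10 km                | 10 km (with arrays up to 25 m apart)                                                        |
| Upper layer protocol | ATM, IP, FICON, SCSI | SCSI-3                                                                                      |

Eventually the market chose FC over SSA (Serial Storage Architecture).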

Fibre Channel Topologies:

SAN Components

FC-P2P: Point to point

• The easiest configuration
• The easiest to administer
• High-speed interconnect between two nodes

Possible usage

• Between Central Processing Units
• From a workstation to a specialized graphics processor or simulation accelerator
• From a file server to a disk array
• ……

Fibre Channel Topologies:

SAN Components

FC-AL: Arbitrated Loop

1. First, arbitrate to win control of the loop.
2. Establish a point-to-point (virtual) connection.
3. The two nodes consume all of the loop's bandwidth until the data transfer operation is complete.

Advantages

• Lower-cost alternative

• Support of up to 126 devices is possible on a single loop.

• ……

However, by 2007, FC-AL had become rare in server-to-storage communication

Fibre Channel Topologies:

SAN Components

FC-SW: Switched Fabric

• Increased bandwidth
• Increased number of devices
• Scalable performance
• Maximum of 16 million devices

FC-SW topology is what we deploy in a SAN.

High cost: the switch is the most costly hardware device.
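The "16 million devices" figure follows from the size of the fabric address. A short worked sketch, assuming the commonly described layout that the slide itself does not spell out: a 24-bit port address split into three 8-bit fields (domain, area, port). The names below are hypothetical, and the result is an upper bound, since some addresses are reserved.

```python
ADDRESS_BITS = 24                      # assumed width of a fabric port address
MAX_ADDRESSES = 2 ** ADDRESS_BITS      # upper bound on addressable ports


def split_port_address(address):
    """Split a 24-bit address into (domain, area, port) bytes."""
    if not 0 <= address < MAX_ADDRESSES:
        raise ValueError("address must fit in 24 bits")
    domain = (address >> 16) & 0xFF    # typically identifies the switch
    area = (address >> 8) & 0xFF
    port = address & 0xFF
    return domain, area, port


print(MAX_ADDRESSES)                   # about 16.7 million
print(split_port_address(0x010203))
```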

Fibre Channel SAN

[Diagram: Server Host with FC Host Bus Adapter, fibre or copper cable, Fibre Channel Switches, fibre cable]

SAN Components

• FC Host Bus Adapter: has a unique World Wide Name (WWN)
• Cable
  • Copper: 15 m, 100 MB/s
  • Fibre: 10 km, 2000 MB/s
• Fibre Channel Switches
  • Directors: no single point of failure (high availability)
  • Switches: smaller, fixed-configuration, less redundant devices

Fibre Channel Protocol Layers

SAN Components

FC-4

FC-3

FC-2

FC-1

FC-0

Fibre Channel Layers

SAN Components

FC-0 Physical layer : describes the physical interface

• an analog interface to transmitter

• a digital interface to the FC-1 layer

• the requirements for infrastructures

Transport media

Receiver hardware

……

Example of options of FC-0 Plants

Fibre Channel Layers

SAN Components

FC-1 Encode/Decode Layer: describes the means of encoding/decoding user data

8b/10b encode/decode scheme

8b/10b encoding was proposed by Albert X. Widmer and Peter A. Franaszek of IBM Corporation in 1983.

• Minimizes errors by equalizing the number of 1's and 0's transmitted and not allowing more than 5 consecutive bits of the same value in a row.
• Allows for distinguishing special characters (such as K28.5) and simplifies byte and word alignment.
• The evening out of 1's and 0's allows for the design of relatively inexpensive transmitter/receiver circuitry.

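SAN Components

FC-1 Encode/Decode Layer: Encode Process (example: K28.5)

FC-2 byte notation: 0xBC (hexadecimal)

FC-2 bit notation: bits 7 6 5 4 3 2 1 0 = 1 0 1 1 1 1 0 0, control variable K

FC-1 un-encoded: bits H G F E D C B A = 1 0 1 1 1 1 0 0, control variable Z = K

FC-1 reordered for Zxx.y notation: Z = K, E D C B A = 1 1 1 0 0 (xx = 28), F G H = 1 0 1 (y = 5), giving K28.5

FC-1 encoded: bits A B C D E i F G H j = 0 0 1 1 1 1 1 0 1 0

The first six encoded bits come from the 5B/6B sub-block (negative running disparity) and the last four from the 3B/4B sub-block (using the previous running disparity).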
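The naming step of the encode process can be checked with a few lines of code. This sketch only derives the Zxx.y name from an unencoded byte and measures the disparity of the encoded K28.5 pattern shown above; it does not implement the 5B/6B and 3B/4B lookup tables. The function names are hypothetical.

```python
def code_name(byte, is_control=False):
    """Return the Zxx.y name of an unencoded byte (K = control, D = data)."""
    xx = byte & 0x1F        # low five bits E D C B A
    y = byte >> 5           # high three bits H G F
    prefix = "K" if is_control else "D"
    return f"{prefix}{xx}.{y}"


def disparity(encoded_bits):
    """Number of 1's minus number of 0's in an encoded bit string."""
    return encoded_bits.count("1") - encoded_bits.count("0")


print(code_name(0xBC, is_control=True))
print(disparity("0011111010"))
```

For 0xBC the low five bits are 11100 (28) and the high three bits are 101 (5), which gives K28.5. The encoded pattern carries six 1's and four 0's, so it ends with positive disparity, which the next character is expected to balance.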

Fibre Channel Layers

SAN Components

FC-2: Framing Protocol/Flow Control

• Transports data using Frames
• Flow control
• Classes of service

SAN Components

Frames are the basic package used to encapsulate and transport the data.

Two types of Frames:

• Data Frame
• Link Control Frame

A group of related Frames transmitted in one direction constitutes a Sequence.

Exchanges are groups of related Sequences.

Frame layout (from the diagram):

• Start of Frame (SoF): the "comma" and 3 bytes indicating the type of connection service
• Frame Header (FH)
• Optional headers: Expiration Security Header, Network Header, Association Header, Device Header
• Payload: user data (not used in Link Control Frames)
• CRC: verifies the data integrity of the FH and Payload
• End of Frame: designates the end of the Frame content and the validity of the Frame's content

SAN Components

FC-2 controls the flow of Frames between ports so that receiver buffers are not overrun.

A buffer credit is maintained by the Sequence Initiator (transmitter) and is used to throttle the transmission of Frames.

There are two basic types of flow control:

• End-to-end control, used in N_port to N_port communications: the receiver responds to all valid Frames it receives with an ACK Frame.
• Buffer-to-buffer control, used when an N_port talks to a Fabric or in an N_port to N_port connection in a point-to-point topology: each side is responsible for maintaining its own BB_Credit_Count.
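Buffer-to-buffer control can be sketched as a simple counter. This is a minimal model under assumptions, not the standard's state machine: the transmitter increments its count for every Frame sent, decrements it for every R_RDY received, and stops sending when the count reaches the credit advertised by the receiver. The class and method names are hypothetical.

```python
class BBCreditLink:
    """Transmitter-side view of buffer-to-buffer flow control (sketch)."""

    def __init__(self, bb_credit):
        self.bb_credit = bb_credit      # receive buffers advertised by the peer
        self.bb_credit_count = 0        # Frames sent and not yet released

    def can_send(self):
        return self.bb_credit_count < self.bb_credit

    def send_frame(self):
        if not self.can_send():
            raise RuntimeError("no credit left: receiver buffers may be full")
        self.bb_credit_count += 1

    def receive_r_rdy(self):
        # R_RDY signals that the peer has freed one receive buffer.
        if self.bb_credit_count > 0:
            self.bb_credit_count -= 1


link = BBCreditLink(bb_credit=2)
link.send_frame()
link.send_frame()
print(link.can_send())      # credit exhausted, transmitter must wait
link.receive_r_rdy()
print(link.can_send())      # one buffer freed, transmission may resume
```

Because the transmitter never has more Frames outstanding than the peer has buffers, the receiver cannot be overrun, at the cost of stalling on long links when the credit is too small.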

SAN Components

FC-2 provides up to 5 Classes of Service (CoS). The different CoS represent different levels of delivery guarantee, bandwidth and connectivity.

Class 1

• Dedicated connection
• Remains active until it is closed
• R_RDY on Connect Request only
• Suited to sustained, high-throughput transactions

Class 2

• Control on a Frame-by-Frame basis
• Allows interleaving of Sequences over a single connection from multiple N_ports
• An ACK for every Frame, plus R_RDY

Class 3

• Provides a connectionless service with no acknowledgment
• No ACK; only R_RDY for link maintenance

Fibre Channel Layers

SAN Components

FC-3: Common Services

The FC-3 level is not currently fully defined. The term "common services" means a service that would utilize multiple N_ports working together on a single node.

Fibre Channel Layers

SAN Components

FC-4: Upper Level Protocol Support

The FC-4 level supports the mapping of Upper Level Protocols (ULP) onto Fibre Channel data structures.

• SCSI (Small Computer Systems Interface)
• IPI-3 (Intelligent Peripheral Interface-3)
• HIPPI (High Performance Parallel Interface)
• IP (Internet Protocol) - IEEE 802.2 (TCP/IP) data
• ATM/AAL5 (ATM adaptation layer for computer data)
• SBCCS (Single Byte Command Code Set)

FC serves as a transport for ULPs by mapping the ULP messages (known as Information Units) into FC Sequences and/or Exchanges.

SAN Components: IP over FC

Two kinds of Information Units:

• IP datagram: moves between nodes on networks using the IP protocol stack
• ARP datagram: used during network configuration to map IP addresses to Media Access Control addresses (used for routing)

A dedicated ARP server must be set up at a "well known" address.

SAN Components: IP over FC

[Diagram: an IP packet with its Network Header is split across Frames. The first Frame carries a Frame Header, the Network Header (an optional header), and Payload; each additional Frame carries a Frame Header and Payload.]

SAN Components: SCSI over Fibre Channel (predominant in FC SANs)

Generally, FCP stands for Fibre Channel Protocol for SCSI.*

The transport is accomplished by wrapping SCSI command, response, status and data blocks.

[Diagram: read example between an Initiator FCP_Port and a Target FCP_Port; the target receives and handles the SCSI command]

*Norman, David. "Fibre Channel Technology for Storage Area Networks."


IP SAN

[Diagram: Server Host with iSCSI Host Bus Adapter, Ethernet, Network Switches]

SAN Components

• iSCSI HBAs: iSCSI Node Names
• Cable: Ethernet
• Network Switches

An IP SAN is a Storage Area Network that uses the iSCSI protocol to transfer block-level data over a network, generally Ethernet.


InfiniBand Network Architecture

IB SAN

InfiniBand is a network communications protocol that offers a switch-based fabric of point-to-point bi-directional serial links between processor nodes, as well as between processor nodes and input/output nodes, such as disks or storage.

• Higher throughput: 56 Gb/s per server and storage connection, and soon 100 Gb/s, compared to up to 40 Gb/s for Ethernet and Fibre Channel
• Lower latency: RDMA zero-copy networking reduces OS overhead so data can move through the network quickly
• Enhanced scalability: InfiniBand can accommodate theoretically unlimited-sized flat networks based on the same switch components, simply by adding additional switches
• Higher CPU efficiency: data movement is offloaded from the CPU

InfiniBand Architecture

• EndNodes: servers and devices
• Link: copper and optical fibre. A 1X fibre link has two optical fibres, one for each direction.
• Switches (IBA switches): establish a private, protected channel directly between the nodes.
• Adapters (Host Channel Adapters): perform data and message movement without CPU involvement, using RDMA and Send/Receive offloads. The adapters are connected on one end to the CPU over a PCI Express interface and on the other to the InfiniBand subnet through InfiniBand network ports.
• Subnet Manager: route definition and subnet discovery

InfiniBand Architecture

IB Storage Stack

InfiniBand Architecture

IB Communication Stack

• A Consumer is a process with a virtual address space.
• A Consumer can have more than one QP.
• A QP (Queue Pair) is a virtual interface.
• A QP includes a Send Queue and a Receive Queue.
• QPs are the endpoints of a Channel.
• A Channel Adapter has up to 2^24 QPs.
• QPs are independent of each other.

IB Message Transfer Semantics

• Send/Receive: simply send and receive.
• RDMA Read/Write: directly read and write to virtual memory.

InfiniBand Architecture

IB Message Transfer Semantics: Send/Receive

Steps:

1. The initiator puts the message in the SND (send queue).

2. The message is sent to the target.

3. The target receives the message.

4. The target puts the message in the RCV (receive queue).
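The four steps above can be mirrored by a toy model of two queue pairs. This is only an illustration of the queue hand-off: it ignores the channel adapter hardware, completion handling, and buffer posting, and `QueuePair` and `transfer` are hypothetical names rather than part of any InfiniBand API.

```python
from collections import deque


class QueuePair:
    """Toy queue pair with a send queue (SND) and a receive queue (RCV)."""

    def __init__(self):
        self.send_q = deque()
        self.recv_q = deque()


def transfer(initiator, target):
    """Move the oldest message from the initiator's SND to the target's RCV."""
    message = initiator.send_q.popleft()    # step 2: message leaves the initiator
    target.recv_q.append(message)           # steps 3-4: target receives and queues it
    return message


initiator, target = QueuePair(), QueuePair()
initiator.send_q.append(b"hello")           # step 1: put the message in SND
transfer(initiator, target)
print(target.recv_q[0])
```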

InfiniBand Architecture

IB Message Transfer Semantics: RDMA

Steps:

1. The application on the initiator registers a buffer and puts the send request in the SND.

2. The target receives the request and reads the data from the initiator's buffer directly.

3. The target returns a status to the initiator.

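Complete IBA Packet Format

| Field                     | Size         | Role                                   |
|---------------------------|--------------|----------------------------------------|
| Local Routing Header      | 8 bytes      | Intra-subnet                           |
| Global Routing Header     | 40 bytes     | Inter-subnet                           |
| Base Transport Header     | 12 bytes     | Tells endnodes what to do with packets |
| Extended Transport Header | 28 bytes     |                                        |
| Immediate Data            | 4 bytes      |                                        |
| Message Payload           | 0-4096 bytes |                                        |
| Invariant CRC             | 4 bytes      |                                        |
| Variant CRC               | 2 bytes      |                                        |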

Message

• InfiniBand Architecture is said to be message-oriented.
• A message can be any size up to 2^31 bytes.
• The InfiniBand hardware automatically segments the outbound message into a number of packets.
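The segmentation step can be illustrated with simple arithmetic. The sketch below assumes every packet except the last carries the maximum payload of 4096 bytes shown in the packet format; real links may negotiate a smaller payload size, and `segment_message` is a hypothetical helper, not an InfiniBand verb.

```python
MAX_MESSAGE = 2 ** 31       # largest message size, in bytes
MAX_PAYLOAD = 4096          # largest packet payload, in bytes


def segment_message(message_len, payload_size=MAX_PAYLOAD):
    """Return the payload size of each packet for a message of message_len bytes."""
    if not 1 <= message_len <= MAX_MESSAGE:
        raise ValueError("message length out of range for this sketch")
    sizes = []
    for offset in range(0, message_len, payload_size):
        # The final packet carries whatever is left over.
        sizes.append(min(payload_size, message_len - offset))
    return sizes


print(segment_message(10000))
print(len(segment_message(MAX_MESSAGE)))    # packets needed for a maximum-size message
```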

InfiniBand Architecture

IB Verbs

• InfiniBand architecture does not define APIs; it only provides the basis for specifying the APIs.
• A verb is a method by which an application requests an action from InfiniBand's message transport service.
• Other organizations, such as the OpenFabrics Alliance, provide a complete set of APIs and software that implements the verbs to work seamlessly with the InfiniBand hardware.

InfiniBand Architecture

IB Upper Layer Protocols

InfiniBand Architecture

Linux InfiniBand software architecture

The upper level protocols

IPoIB : IP over IB

SRP : SCSI RDMA Protocol

SDP : Sockets Direct Protocol

iSER : iSCSI Extensions for RDMA

SRP Protocol

InfiniBand Architecture

Linux InfiniBand SRP Protocol architecture

• SCSI RDMA Protocol (SRP) was defined by the ANSI T10 committee to provide block storage capabilities for the InfiniBand architecture.
• SRP is a protocol that tunnels SCSI request packets over InfiniBand hardware.

SAN Components (Summary)

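|           | FC SAN                                                | IP SAN                                        | IB SAN                 |
|-----------|-------------------------------------------------------|-----------------------------------------------|------------------------|
| Bandwidth | 100Mb (copper); 20Gb (fibre); 32Gb and 128Gb (coming) | 100Mb or 1Gb (Ethernet); 10Gb (10Gb Ethernet) | 120Gb (12X)            |
| Latency   | Dedicated to block I/O                                | Direct connection                             | Dedicated to block I/O |
| Distance  | 15m (copper); 20km (fibre)                            | Internet-based long-distance                  | 125m (12X); 10km (1X)  |
| Cost      | High                                                  | Cheap                                         | Medium                 |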


SAN, NAS or DAS?

• SAN: more efficient block-level data access
• NAS: convenient data sharing in a homogeneous file system
• DAS: easy to implement and low cost

Acknowledgement

Professor David Du taught me much of the basic knowledge on storage systems and provided this interesting topic.

During the preparation for the presentation, Dr. Fenggang Wu helped me review the slides and gave me significant references.

References

• www.snia.org
• Khattar, Ravi Kumar, et al. Introduction to Storage Area Network, SAN. IBM Corporation, International Technical Support Organization, 1999.
• https://en.wikipedia.org/wiki/Storage_area_network
• https://en.wikipedia.org/wiki/Fibre_Channel#Fibre_Channel_topologies
• http://www.networkworld.com/article/2174282/lan-wan/fibre-channel-will-come-with-32-gigabit--128-gigabit-speeds-in-2016.html
• https://www.pctechguide.com/interfaces/hard-disks-what-is-serial-storage-architecture
• https://en.wikipedia.org/wiki/Fibre_Channel_point-to-point
• Shanley, Tom, and Joe Winkles. InfiniBand Network Architecture. Addison-Wesley Professional, 2003.
• IP SAN Fundamentals: An Introduction to IP SANs and iSCSI
• Norman, David. "Fibre Channel Technology for Storage Area Networks."
• Grun, Paul. "Introduction to InfiniBand for End Users." White paper, InfiniBand Trade Association (2010).

Thank you