Page 1, 05 Jul 2000

TCP ULP Message Framing

iSCSI Framing

Randy Haagens, Allyn Romanow

December 15, 2000 Page 2 Randy Haagens / Allyn Romanow

Outline

•The Problem -- Conserving host memory bandwidth

•How to solve it -- Direct data placement

•The framing issue

•Solutions to framing -- pros and cons

–TCP unaware

–TCP aware

•TCP message boundary option

The Problem: Cost, Feasibility of Re-assembly

•Limited host memory and bus bandwidth

–PCI-X delivers approx. 8.5 Gbps

•Must deliver data directly to host memory buffers

–One use of bus and memory

–“Zero copy”

•Re-assembly buffer required on NIC to reorder TCP segments received out of order

–At 1 Gbps: possible, but expensive (more so than Fibre Channel)

–At 10 Gbps: does not look feasible

“Conventional” NIC Implementation

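[Figure: NIC block diagram with line interface, protocol controller and re-assembly buffer, attached over PCI-X, 1.064 GB/sec nominal (8.512 Gb/sec), to the memory controller and host memory. The re-assembly buffer is labeled 2X-3X BW. At 1 Gbps: possible, but more expensive than FC. At 10 Gbps: questionable feasibility.]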
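The bandwidth figures on this slide can be checked with a little arithmetic. The sketch below assumes the usual PCI-X parameters (a 64-bit bus clocked at 133 MHz), which reproduce the nominal 1.064 GB/sec but are not stated in the slides, and it reads the 2X-3X BW label as the memory bandwidth the re-assembly buffer needs relative to the line rate. The function names are illustrative.

```python
# Assumed PCI-X parameters (64-bit bus at 133 MHz); they reproduce the
# slide's nominal figure but are not stated in the slides themselves.
PCI_X_WIDTH_BITS = 64
PCI_X_CLOCK_HZ = 133_000_000


def pci_x_bits_per_second():
    """Nominal PCI-X throughput in bits per second."""
    return PCI_X_WIDTH_BITS * PCI_X_CLOCK_HZ


def buffer_bandwidth_range(line_rate_bps, low=2, high=3):
    """Memory bandwidth for a re-assembly buffer at 2X-3X the line rate."""
    return low * line_rate_bps, high * line_rate_bps


bus = pci_x_bits_per_second()
print(bus / 1e9, "Gb/sec on the bus;", bus / 8 / 1e9, "GB/sec")
for rate in (1_000_000_000, 10_000_000_000):
    lo, hi = buffer_bandwidth_range(rate)
    print(f"{rate // 10**9} Gbps line: buffer needs "
          f"{lo // 10**9}-{hi // 10**9} Gb/sec of memory bandwidth")
```

Under these assumptions a 10 Gbps line already exceeds what the bus can carry once, which is why each byte should cross the bus and touch host memory only one time.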

Desired NIC Implementation

[Figure: the same NIC block diagram (line interface, protocol controller, PCI-X at 1.064 GB/sec nominal (8.512 Gb/sec), memory controller, host memory), with the re-assembly buffer and its 2X-3X BW requirement crossed out (X).]

The Solution: Direct Data Placement

[Figure: payload placed into host memory (large) under the control of descriptors. Labels: Payload steering, Data steering, RDMA.]

•ULP (iSCSI) payload placed directly in host memory without intermediate buffer

•Requires decoding ULP (iSCSI) headers
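A minimal sketch of direct data placement, assuming an in-order byte stream. The 12-byte header used here (tag, buffer offset, payload length) is a hypothetical layout chosen for illustration and is not the iSCSI PDU format; `descriptors` stands in for the pre-posted host buffers.

```python
import struct

# Hypothetical ULP header for illustration only (not the iSCSI PDU layout):
# 4-byte buffer tag, 4-byte offset into that buffer, 4-byte payload length.
ULP_HEADER = struct.Struct("!III")


def place_stream(stream, descriptors):
    """Copy each PDU payload from an in-order byte stream into its host buffer.

    descriptors maps a tag to a pre-posted bytearray standing in for host memory.
    """
    pos = 0
    while pos < len(stream):
        tag, offset, length = ULP_HEADER.unpack_from(stream, pos)
        pos += ULP_HEADER.size
        payload = stream[pos:pos + length]
        if len(payload) != length:
            raise ValueError("stream ends inside a PDU")
        # memoryview assignment writes in place and never resizes the buffer
        memoryview(descriptors[tag])[offset:offset + length] = payload
        pos += length


host_buffer = bytearray(8)
stream = ULP_HEADER.pack(7, 4, 4) + b"WXYZ" + ULP_HEADER.pack(7, 0, 4) + b"abcd"
place_stream(stream, {7: host_buffer})
print(host_buffer)
```

The loop only works because every header is found by walking the lengths of the PDUs before it, which is the dependency the framing problem is about.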

Loss of ULP Synchronization

[Figure: a run of TCP segments (MSS), each preceded by network headers and a TCP header, carrying ULP PDUs (messages) that each begin with a ULP header. One segment is marked X; the ULP headers that follow are marked "recover ULP sync here".]

•If a segment containing a ULP header is dropped (or delayed), ULP sync is lost. Direct data placement cannot continue; data must be diverted to a re-assembly buffer

Recovery of ULP Framing Synchronization

•Goal is to recover ULP synchronization at the next ULP header

•Two categories of approaches

–TCP unaware, transparent to TCP

–TCP aware, not transparent to TCP

Approaches that are Transparent to TCP

•SCTP

–Framing with chunks, but the protocol still needs to mature

•Special characters: byte stuffing, recoding

–Onerous for software, which must process the stream byte by byte

•Fixed-length ULP messages

–Inefficient for short ULP messages

•Periodic Marker - Best in class

–Manageable software overhead to insert and remove markers; relatively easy to implement in hardware
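To make the byte-stuffing cost concrete, here is a minimal sketch of one common stuffing scheme. The flag and escape values and the XOR-based escaping are illustrative assumptions; the slides do not specify a particular encoding.

```python
FLAG, ESC = 0x7E, 0x7D   # illustrative delimiter and escape values


def stuff(message):
    """Frame one message: a leading FLAG, with FLAG/ESC bytes in the body escaped."""
    out = bytearray([FLAG])
    for byte in message:            # every byte has to be inspected
        if byte in (FLAG, ESC):
            out += bytes([ESC, byte ^ 0x20])
        else:
            out.append(byte)
    return bytes(out)


def unstuff(frame):
    """Inverse of stuff() for a single frame."""
    if not frame or frame[0] != FLAG:
        raise ValueError("frame must start with FLAG")
    out, escaped = bytearray(), False
    for byte in frame[1:]:
        if escaped:
            out.append(byte ^ 0x20)
            escaped = False
        elif byte == ESC:
            escaped = True
        else:
            out.append(byte)
    return bytes(out)


print(stuff(b"\x7e\x01\x7d").hex())
```

Both directions touch every byte of the stream, which is the software burden noted above; a periodic marker only requires work at fixed positions.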

Periodic Marker

[Figure: TCP byte stream carrying ULP PDUs (messages). A marker sits at SeqCnt = 0 and then every N bytes, wherever (SeqCnt mod N) = 0. The length field in each marker indicates the start of the next message.]

•Marker is a 4B field: the number of ULP bytes remaining in the current PDU. Markers are inserted and removed by the framing protocol, e.g. iSCSI

•After loss of synch, locate the next marker; use it to locate the next ULP PDU

•Markers are transmitted twice in a row. This ensures markers cannot be split by stream segmentation
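The marker scheme can be sketched as follows. This is an illustration under stated assumptions, not the iSCSI wire format: markers are placed at wire offsets that are multiples of N counted from the start of the stream, N counts marker bytes as well as ULP bytes, and a marker that falls exactly on a PDU boundary carries 0. All function names are hypothetical.

```python
import struct

MARKER = struct.Struct("!I")     # 4-byte count, network byte order
MARKER_LEN = 2 * MARKER.size     # each marker is sent twice in a row


def insert_markers(pdus, n):
    """Return (wire, starts): the ULP bytes with a doubled marker at every wire
    offset that is a multiple of n, and the ULP offsets where PDUs start."""
    ulp = b"".join(pdus)
    starts, pos = [], 0
    for pdu in pdus:
        starts.append(pos)
        pos += len(pdu)
    starts.append(pos)                       # where the next PDU would start
    chunk = n - MARKER_LEN                   # ULP bytes carried between markers
    wire = bytearray()
    for u in range(0, len(ulp), chunk):
        remaining = min(s for s in starts if s >= u) - u
        wire += MARKER.pack(remaining) * 2
        wire += ulp[u:u + chunk]
    return bytes(wire), starts


def strip_markers(wire, n):
    """Receiver side: drop the markers and recover the ULP byte stream."""
    return b"".join(wire[w + MARKER_LEN:w + n] for w in range(0, len(wire), n))


def next_pdu_ulp_offset(wire, k, n):
    """Resynchronize from the k-th marker: ULP offset of the next PDU start."""
    first = MARKER.unpack_from(wire, k * n)[0]
    second = MARKER.unpack_from(wire, k * n + MARKER.size)[0]
    if first != second:
        raise ValueError("marker copies disagree")
    return k * (n - MARKER_LEN) + first


pdus = [b"A" * 10, b"B" * 50, b"C" * 7, b"D" * 30]
wire, starts = insert_markers(pdus, 32)
print(len(wire), starts, next_pdu_ulp_offset(wire, 1, 32))
```

A receiver that has lost sync only needs the sequence count to find the next marker position; from there the count in the marker leads to the next ULP header without parsing any earlier PDU.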

Approaches that are not Transparent to TCP

•URGent pointer

–Not allowed -- IESG, not within spec

•PSH bit

–Use of PSH as a record marker is not allowed (RFC 1122)

•TCP option for finding ULP message boundary

Message Boundary Recognition at Transport Layer

•Procedure

•TCP options - background

•TCP message boundary option for finding ULP framing

Message Boundary Support at Transport Layer

•General solution at transport layer vs. individualized solutions for different applications

•Procedure for standardizing a TCP option

–TSV working group -- work item

–Meeting

–Proposal, mailing list

–Time frame

•IPS -- follow the TSV WG progress

TCP Options - Overview

•Extend TCP

•Up to 40 bytes, before TCP payload

•Current TCP options

–MSS (Maximum Segment Size)

–Window Scale Factor

–Timestamp

–SACK (Selective Acknowledgment)

•Reference - Stevens
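For background, TCP options are carried as kind/length/value entries, apart from the single-byte End of Option List (kind 0) and No-Operation (kind 1). A minimal parsing sketch follows; the function name is illustrative, and the sample bytes encode an MSS of 1460, a window scale of 7 and SACK-permitted.

```python
MAX_OPTION_BYTES = 60 - 20   # largest TCP header minus the fixed 20-byte part


def parse_tcp_options(data):
    """Walk the kind/length encoded options area of a TCP header."""
    if len(data) > MAX_OPTION_BYTES:
        raise ValueError("options area cannot exceed 40 bytes")
    options, i = [], 0
    while i < len(data):
        kind = data[i]
        if kind == 0:                # End of Option List
            break
        if kind == 1:                # No-Operation (padding), single byte
            i += 1
            continue
        if i + 1 >= len(data):
            raise ValueError("truncated option")
        length = data[i + 1]         # covers the kind, length and value bytes
        if length < 2 or i + length > len(data):
            raise ValueError("bad option length")
        options.append((kind, bytes(data[i + 2:i + length])))
        i += length
    return options


sample = bytes([2, 4, 0x05, 0xB4, 1, 3, 3, 7, 4, 2, 0])
print(parse_tcp_options(sample))
```

Because a receiver skips unknown kinds by their length byte, a new option can be added without breaking peers that do not understand it, provided it fits in the 40 bytes.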

TCP Options - Overview - Issues

•Built-in mechanism for extension of TCP

•Limited space for options (40 bytes), a scarce resource

•Tension between TCP evolution and the risk of changes; the tendency is to minimize changes

TCP Message Boundary Option

•Two options

–One for negotiating the option

–The second for communicating the message boundary information

•Message Boundary Permitted Option

–Sent with the TCP SYN packets

Kind = ?   Len = 2
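A sketch of how the two-byte permitted option might be built and detected. The slides leave the Kind value open, so 253 (one of the values set aside for experiments) is used purely as a placeholder. Treating the option as negotiated only when both SYNs carry it is an assumption made by analogy with SACK-permitted; the slides do not spell it out.

```python
KIND_MSG_BOUNDARY_PERMITTED = 253   # placeholder: the slides leave Kind as "?"


def message_boundary_permitted_option():
    """Two-byte option (Kind, Len = 2) offered in a SYN."""
    return bytes([KIND_MSG_BOUNDARY_PERMITTED, 2])


def permits_message_boundary(syn_options):
    """Scan a SYN's options area for the permitted option."""
    i = 0
    while i < len(syn_options):
        kind = syn_options[i]
        if kind == 0:                       # End of Option List
            break
        if kind == 1:                       # No-Operation
            i += 1
            continue
        if i + 1 >= len(syn_options) or syn_options[i + 1] < 2:
            return False                    # malformed options area
        if kind == KIND_MSG_BOUNDARY_PERMITTED and syn_options[i + 1] == 2:
            return True
        i += syn_options[i + 1]
    return False


def negotiated(local_syn_options, peer_syn_options):
    """Assumed rule: the option is in force only if both SYNs carried it."""
    return (permits_message_boundary(local_syn_options)
            and permits_message_boundary(peer_syn_options))


mss_option = bytes([2, 4, 0x05, 0xB4])
syn = mss_option + message_boundary_permitted_option()
print(negotiated(syn, syn), negotiated(syn, mss_option))
```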

Message Boundary Option

•Two approaches so far -- flag, offset

Flag Approach

Kind = ?   Len = 2

•Costa has written up a description of this alternative

•ULP header is aligned with first byte of TCP segment payload

•May cause segments smaller than an MSS
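The effect of the flag approach on segment sizes can be sketched with a toy segmenter. This is an illustration only: `segment_aligned` is a hypothetical name, and each returned pair is (flag set, payload) rather than a real TCP segment.

```python
def segment_aligned(pdus, mss):
    """Start every ULP PDU on a fresh segment and flag that segment."""
    segments = []
    for pdu in pdus:
        for off in range(0, len(pdu), mss):
            # The flag is set only on the segment that begins with the ULP header
            segments.append((off == 0, pdu[off:off + mss]))
    return segments


segments = segment_aligned([b"a" * 2500, b"b" * 300], mss=1000)
for flagged, payload in segments:
    print("first" if flagged else "     ", len(payload))
```

With a 2500-byte PDU followed by a 300-byte PDU and an MSS of 1000, the segmenter emits two short segments (500 and 300 bytes), which is the cost of keeping every ULP header at the start of a segment payload.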

TCP Framing Option (a) Flag

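[Figure: a ULP PDU (message) carried in TCP segments, each preceded by network headers and a TCP header. Segments that begin with a ULP header carry "Option: first". Segments are a full MSS, except where a PDU ends, which leaves a segment < MSS.]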

Message Boundary Option

Offset approach

Kind = ?   Len = 4   offset (2 bytes)

•2-byte field indicates offset into TCP payload of first ULP header in the segment

•Write-up forthcoming

[Figure: a ULP PDU (message) carried in TCP segments, each preceded by network headers and a TCP header. Every segment is a full MSS; the ULP header may fall anywhere within a segment payload.]
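A matching sketch for the offset approach. The Kind value 254 is a placeholder, since the slides leave it open, and omitting the option from segments that contain no ULP header is an assumption; the write-up is described as forthcoming, so the real rules may differ.

```python
import struct

KIND_MSG_BOUNDARY = 254                    # placeholder: the slides leave Kind as "?"
OFFSET_OPTION = struct.Struct("!BBH")      # Kind, Len = 4, 2-byte offset


def segment_with_offsets(pdus, mss):
    """Cut the ULP stream into MSS-sized segments; each segment that contains a
    ULP header gets an option giving the payload offset of the first one."""
    stream = b"".join(pdus)
    starts, pos = [], 0
    for pdu in pdus:
        starts.append(pos)
        pos += len(pdu)
    segments = []
    for seg_start in range(0, len(stream), mss):
        payload = stream[seg_start:seg_start + mss]
        first = next((s - seg_start for s in starts
                      if seg_start <= s < seg_start + len(payload)), None)
        option = b"" if first is None else OFFSET_OPTION.pack(
            KIND_MSG_BOUNDARY, OFFSET_OPTION.size, first)
        segments.append((option, payload))
    return segments


def header_offset(option):
    """Receiver side: payload offset of the first ULP header in the segment."""
    kind, length, offset = OFFSET_OPTION.unpack(option)
    if kind != KIND_MSG_BOUNDARY or length != OFFSET_OPTION.size:
        raise ValueError("not a message boundary option")
    return offset


offset_segments = segment_with_offsets([b"a" * 2500, b"b" * 300], mss=1000)
for option, payload in offset_segments:
    print(len(payload), header_offset(option) if option else None)
```

For the same two PDUs used in the flag example, all segments except the last are a full MSS, and a receiver holding only the third segment can still find the ULP header 500 bytes into its payload.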

Conclusion

•The right way to solve the problem is at the TCP layer

•We’re going to try it

•Let’s see how it plays

•DISCUSSION