Performance and Scalability of xrootd

Andrew Hanushevsky (SLAC), Wilko Kroeger (SLAC), Bill Weeks (SLAC),

Fabrizio Furano (INFN/Padova), Gerardo Ganis (CERN), Jean-Yves Nief (IN2P3), Peter Elmer (U Wisconsin),

Les Cottrell (SLAC), Yee Ting Li (SLAC)

Computing in High Energy Physics

13-17 February 2006

http://xrootd.slac.stanford.edu

xrootd is largely funded by the US Department of Energy, Contract DE-AC02-76SF00515 with Stanford University

Outline

Architecture Overview

Performance & Scalability
  Single Server Performance
    Speed, latency, and bandwidth
    Resource overhead
  Scalability
    Server and administrative

Conclusion


xrootd Plugin Architecture

Protocol Driver (XRD)
Protocol (1 of n) (xrootd)
File System (ofs, sfs, alice, etc)
authentication (gsi, krb5, etc)
authorization (name based)
lfn2pfn prefix encoding
Storage System (oss, drm/srm, etc)
Clustering (olbd)


Performance Aspects

Speed for large transfers
  MB/Sec
  Random vs Sequential
  Synchronous vs asynchronous
  Memory mapped (copy vs “no-copy”)

Latency for small transfers
  sec round trip time

Bandwidth for scalability
  “your favorite unit”/Sec vs increasing load


Raw Speed I (sequential)

Disk Limit

Sun V20z, 2 x 1.86GHz Opteron 244, 16GB RAM

Seagate ST373307LC, 73GB, 10K rpm SCSI

sendfile() anyone?
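
The "no-copy" idea behind sendfile() can be sketched in a few lines. This is only an illustration of the system call the slide alludes to, not xrootd's implementation; the helper name send_file_no_copy is hypothetical. Python's socket.sendfile() uses os.sendfile() where the platform supports it, so the kernel moves file data to the socket without copying it through a user-space buffer, and falls back to ordinary send() elsewhere.

```python
import socket
import tempfile


def send_file_no_copy(sock: socket.socket, path: str) -> int:
    """Send the whole file at `path` over `sock` and return the bytes sent.

    Hypothetical helper for illustration. socket.sendfile() delegates to
    os.sendfile() when available (kernel-side transfer, no user-space copy)
    and falls back to a read/send loop otherwise.
    """
    with open(path, "rb") as f:
        return sock.sendfile(f)


if __name__ == "__main__":
    # Write a small scratch file, then push it through a local socket pair.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"x" * 4096)
    sender, receiver = socket.socketpair()
    sent = send_file_no_copy(sender, tmp.name)
    sender.close()
    received = 0
    while chunk := receiver.recv(65536):
        received += len(chunk)
    receiver.close()
    print(f"sent={sent} received={received}")
```

The payload here is kept small so it fits in the socket buffer; a real server would send to a remote client that drains the socket concurrently.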


Raw Speed II (random I/O)

(file not preloaded)


Latency Per Request


Event Rate Bandwidth

NetApp FAS270: 1250 dual 650 MHz cpu, 1Gb NIC, 1GB cache, RAID 5 FC 140 GB 10k rpm

Apple Xserve: UltraSparc 3 dual 900MHz cpu, 1Gb NIC, RAID 5 FC 180 GB 7.2k rpm

Sun 280r, Solaris 8, Seagate ST118167FC

Cost factor: 1.45


Latency & Bandwidth

Latency & bandwidth are closely related
  Inversely proportional if linear scaling present

The smaller the overhead the greater the bandwidth

Underlying infrastructure is critical
  OS and devices
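
The inverse relation between latency and bandwidth can be made concrete with a small calculation. This is a minimal sketch that assumes ideal linear scaling, as the slide does; the function name requests_per_second and the latencies used are hypothetical, chosen only to show the shape of the relation, and are not measurements from the talk.

```python
def requests_per_second(latency_us: float, concurrent_clients: int = 1) -> float:
    """Request bandwidth when each request costs `latency_us` microseconds.

    Assumes ideal linear scaling: each client is served independently,
    so throughput grows in proportion to the number of clients.
    """
    if latency_us <= 0:
        raise ValueError("latency must be positive")
    return concurrent_clients * 1_000_000 / latency_us


if __name__ == "__main__":
    # Hypothetical per-request latencies; halving latency doubles bandwidth.
    for latency in (50.0, 100.0, 200.0):
        rate = requests_per_second(latency)
        print(f"{latency:6.1f} us/request -> {rate:8.0f} requests/s")
```

Under this model any overhead added to each request lowers the achievable rate in direct proportion, which is why the slide stresses small overhead.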


Server Scaling (Capacity vs Load)


I/O Bandwidth (wide area network)

• SC2005 BW Challenge
• Latency Bandwidth

• 8 xrootd Servers
• 4@SLAC & 4@Seattle
• Sun V20z w/ 10Gb NIC
• Dual 1.8/2.6GHz Opterons
• Linux 2.6.12

• 1,024 Parallel Clients
• 128 per server

• 35Gb/sec peak
• Higher speeds killed router
• 2 full duplex 10Gb/s links
• Provided 26.7% overall BW

• BW averaged 106Gb/sec
• 17 Monitored links total

[Plot labels: SLAC to Seattle; Seattle to SLAC; ESnet routed; ESnet SDN layer 2 via USN; BW Challenge]

http://www-iepm.slac.stanford.edu/monitoring/bulk/sc2005/hiperf.html


xrootd Server Scaling

Linear scaling relative to load
  Allows deterministic sizing of server
    Disk
    NIC
    CPU
    Memory

Performance tied directly to hardware cost
  Underlying hardware & software are critical
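
Deterministic sizing under linear scaling amounts to finding which resource runs out first. The sketch below is one way to express that; the function max_clients and every capacity and per-client figure in it are hypothetical placeholders, not values from the talk.

```python
def max_clients(capacity: dict[str, float], per_client: dict[str, float]) -> tuple[int, str]:
    """Return (supported client count, limiting resource).

    Assumes each client consumes a fixed amount of every resource
    (linear scaling), so the server saturates at the smallest
    capacity-to-demand ratio.
    """
    limits = {name: capacity[name] / per_client[name] for name in per_client}
    bottleneck = min(limits, key=limits.get)
    return int(limits[bottleneck]), bottleneck


if __name__ == "__main__":
    # Hypothetical server capacities and per-client demands.
    capacity = {"disk_MBps": 60.0, "nic_MBps": 120.0, "cpu_pct": 100.0, "memory_MB": 16000.0}
    per_client = {"disk_MBps": 0.5, "nic_MBps": 0.5, "cpu_pct": 0.2, "memory_MB": 10.0}
    clients, limit = max_clients(capacity, per_client)
    print(f"supports about {clients} clients, limited by {limit}")
```

With measured per-client costs in place of the placeholders, the same calculation indicates which component to upgrade to raise capacity.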


Overhead Distribution


OS Effects


Device & File System Effects

CPU limited

I/O limited

1 Event 2K

UFS good on small reads

VXFS good on big reads


NIC Effects


Super Scaling

xrootd Servers Can Be Clustered
  Support for over 256,000 servers per cluster
  Open overhead of 100us*log64(number servers)

Uniform deployment
  Same software and configuration file everywhere
  No inherent 3rd party software requirements

Linear administrative scaling

Effective load distribution
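
The open-overhead formula on this slide, 100us*log64(number servers), can be evaluated directly. The sketch below only restates that formula; the constant and function names are hypothetical. Since 64^3 = 262,144, a cluster at the quoted scale of over 256,000 servers costs roughly 300 us per open under this formula.

```python
import math

BASE_OVERHEAD_US = 100.0  # per-level cost quoted on the slide
FANOUT = 64               # logarithm base quoted on the slide


def open_overhead_us(num_servers: int) -> float:
    """Open overhead in microseconds: 100us * log64(number of servers)."""
    if num_servers < 1:
        raise ValueError("need at least one server")
    return BASE_OVERHEAD_US * math.log(num_servers, FANOUT)


if __name__ == "__main__":
    for n in (64, 64**2, 64**3):
        print(f"{n:>7} servers -> about {open_overhead_us(n):.0f} us per open")
```

The logarithmic growth is the point: multiplying the cluster size by 64 adds only one more 100 us step.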


Cluster Data Scattering (usage)


Cluster Data Scattering (utilization)


Low Latency Opportunities

New programming paradigm
  Ultra-fast access to small random blocks
  Accommodate object data

Memory I/O instead of CPU to optimize access
  Allows superior ad hoc object selection

Structured clustering to scale access to memory
  Multi-Terabyte memory systems at commodity prices
  PetaCache Project
  SCALLA: Structured Cluster Architecture for Low Latency Access

Increased data exploration opportunities


Memory Access Characteristics

[Plot: Block size effect on average overall latency per I/O (1 job - 100k I/O’s)]

[Plot: Scaling effect on average overall latency clients (5 - 40 jobs)]

[Legend: Disk I/O; Mem I/O]


Conclusion

System performs far better than we anticipated

Why?
  Excruciating attention to details
    Protocols, algorithms, and implementation
  Effective software collaboration
    INFN/Padova: Fabrizio Furano, Alvise Dorigo
    Root: Fons Rademakers, Gerri Ganis
    Alice: Derek Feichtinger, Guenter Kickinger
    Cornell: Gregory Sharp
    SLAC: Jacek Becla, Tofigh Azemoon, Wilko Kroeger, Bill Weeks
    BaBar: Pete Elmer
  Critical operational collaboration
    BNL, CNAF, FZK, INFN, IN2P3, RAL, SLAC

Commitment to “the science needs drive the technology”
