Performance and Scalability of xrootd
Andrew Hanushevsky (SLAC), Wilko Kroeger (SLAC), Bill Weeks (SLAC),
Fabrizio Furano (INFN/Padova), Gerardo Ganis (CERN), Jean-Yves Nief (IN2P3), Peter Elmer (U Wisconsin),
Les Cottrell (SLAC), Yee Ting Li (SLAC)
Computing in High Energy Physics
13-17 February 2006
http://xrootd.slac.stanford.edu
xrootd is largely funded by the US Department of Energy
Contract DE-AC02-76SF00515 with Stanford University
Outline
Architecture Overview
Performance & Scalability
  Single Server Performance
    Speed, latency, and bandwidth
    Resource overhead
  Scalability
    Server and administrative
Conclusion
xrootd Plugin Architecture
Protocol Driver (XRD)
Protocol (1 of n) (xrootd)
File System (ofs, sfs, alice, etc)
authentication (gsi, krb5, etc)
authorization (name based)
lfn2pfn prefix encoding
Storage System (oss, drm/srm, etc)
Clustering (olbd)
Performance Aspects
Speed for large transfers
  MB/Sec
  Random vs Sequential
  Synchronous vs asynchronous
  Memory mapped (copy vs “no-copy”)
Latency for small transfers
  sec round trip time
Bandwidth for scalability
  “your favorite unit”/Sec vs increasing load
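The "copy vs no-copy" distinction for memory-mapped access can be illustrated with a minimal Python sketch. This is not xrootd code: the function names, the sample file, and the block size are all hypothetical, and Python's `mmap` module only stands in for whatever mapping mechanism a server would use. The first function reads a block into a new buffer; the second works on a view of the mapped file without making an intermediate copy.

```python
import mmap
import os
import tempfile

def sum_block_copy(path, offset, length):
    """Read with an explicit copy: read() fills a new bytes object."""
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read(length)
    return sum(data)

def sum_block_mapped(path, offset, length):
    """Read through a mapping: the slice is a view of the mapped pages."""
    with open(path, "rb") as f, \
         mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
        with memoryview(m) as whole, whole[offset:offset + length] as view:
            return sum(view)

def demo():
    """Build a small throwaway file and compare both access paths."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(bytes(range(256)) * 16)  # 4096 bytes of sample data
    try:
        return (sum_block_copy(tmp.name, 1000, 2048),
                sum_block_mapped(tmp.name, 1000, 2048))
    finally:
        os.unlink(tmp.name)
```

Both paths return the same result; they differ only in whether the block is copied into a separate buffer first.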
Raw Speed I (sequential)
Disk Limit
Sun V20z, 2x1.86GHz Opteron 244, 16GB RAM
Seagate ST373307LC, 73GB 10K rpm SCSI
sendfile() anyone?
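For readers unfamiliar with the `sendfile()` remark, here is a minimal Python sketch of sending a file over a socket without copying it through a user-space buffer. It is an illustration only and not how xrootd is implemented: the helper names and the 4096-byte test payload are assumptions. Python's `socket.sendfile()` uses `os.sendfile()` where the platform supports it and falls back to ordinary `send()` calls otherwise.

```python
import os
import socket
import tempfile

def send_file(sock, path):
    """Send a whole file over a connected stream socket.

    Returns the number of bytes sent.
    """
    with open(path, "rb") as f:
        return sock.sendfile(f)

def demo(size=4096):
    """Send a small throwaway file across a local socket pair."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"x" * size)
    sender, receiver = socket.socketpair()
    try:
        sent = send_file(sender, tmp.name)
        sender.close()  # signal end of stream to the receiver
        received = bytearray()
        while True:
            chunk = receiver.recv(65536)
            if not chunk:
                break
            received += chunk
        return sent, bytes(received)
    finally:
        sender.close()
        receiver.close()
        os.unlink(tmp.name)
```

The payload is kept small so that it fits in the socket buffer and the single-threaded demo does not block.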
Raw Speed II (random I/O)
(file not preloaded)
Latency Per Request
Event Rate Bandwidth
NetApp FAS270: 1250 dual 650 MHz cpu, 1Gb NIC, 1GB cache, RAID 5 FC 140 GB 10k rpm
Apple Xserve: UltraSparc 3 dual 900MHz cpu, 1Gb NIC, RAID 5 FC 180 GB 7.2k rpm
Sun 280r, Solaris 8, Seagate ST118167FC
Cost factor: 1.45
Latency & Bandwidth
Latency & bandwidth are closely related
  Inversely proportional if linear scaling present
  The smaller the overhead the greater the bandwidth
Underlying infrastructure is critical
  OS and devices
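The inverse relation between per-request latency and bandwidth can be made concrete with a short sketch. It assumes an idealized server in which each request costs a fixed overhead and scaling is perfectly linear; the function names and the sample figures (100 microseconds, 2048-byte requests) are illustrative and are not measurements from the talk.

```python
def max_requests_per_second(overhead_us):
    """Ideal request rate when every request costs overhead_us microseconds."""
    return 1_000_000 / overhead_us

def max_bandwidth_mb_per_s(overhead_us, bytes_per_request):
    """Ideal data rate in MB/s (1 MB = 10**6 bytes) for a fixed request size."""
    return max_requests_per_second(overhead_us) * bytes_per_request / 1e6
```

Under these assumptions, halving the overhead from 100 to 50 microseconds doubles the achievable request rate from 10,000 to 20,000 requests per second.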
Server Scaling (Capacity vs Load)
I/O Bandwidth (wide area network)
•SC2005 BW Challenge
•Latency Bandwidth
•8 xrootd Servers
•4@SLAC & 4@Seattle
•Sun V20z w/ 10Gb NIC
•Dual 1.8/2.6GHz Opterons
•Linux 2.6.12
•1,024 Parallel Clients
•128 per server
•35Gb/sec peak
•Higher speeds killed router
•2 full duplex 10Gb/s links
•Provided 26.7% overall BW
•BW averaged 106Gb/sec
•17 Monitored links total
[Figure: BW Challenge traffic, SLAC to Seattle and Seattle to SLAC; paths: ESnet routed, ESnet SDN layer 2 via USN]
http://www-iepm.slac.stanford.edu/monitoring/bulk/sc2005/hiperf.html
xrootd Server Scaling
Linear scaling relative to load
  Allows deterministic sizing of server
    Disk
    NIC
    CPU
    Memory
Performance tied directly to hardware cost
  Underlying hardware & software are critical
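Deterministic sizing under linear scaling can be sketched as follows: if each client consumes a fixed amount of every resource, the server's capacity is set by whichever resource runs out first. The function name, resource names, and all numbers in the usage note are hypothetical and are not taken from the talk.

```python
def server_capacity(resource_limits, per_client_demand):
    """Return (max_clients, limiting_resource) assuming linear scaling.

    Both arguments are dicts keyed by resource name; the limit and the
    per-client demand for a resource must use the same unit.
    """
    best = None
    for name, limit in resource_limits.items():
        clients = limit / per_client_demand[name]
        if best is None or clients < best[0]:
            best = (clients, name)
    return int(best[0]), best[1]
```

For example, with made-up limits of 60 MB/s disk, 120 MB/s NIC, 100% CPU and 16,000 MB memory, and a per-client demand of 0.5 MB/s disk, 0.5 MB/s NIC, 0.25% CPU and 10 MB memory, the sketch reports 120 clients with the disk as the limiting resource.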
Overhead Distribution
OS Effects
Device & File System Effects
CPU limited
I/O limited
1 Event 2K
UFS good on small reads
VXFS good on big reads
NIC Effects
Super Scaling
xrootd Servers Can Be Clustered
  Support for over 256,000 servers per cluster
  Open overhead of 100us * log64(number of servers)
Uniform deployment
  Same software and configuration file everywhere
  No inherent 3rd party software requirements
Linear administrative scaling
Effective load distribution
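The open-overhead formula on this slide can be evaluated directly. The sketch below uses only the two constants stated on the slide (100 microseconds and a logarithm base of 64); the function name is hypothetical. Note that 64^3 = 262,144, which is consistent with the "over 256,000 servers" figure, although reading the base as the fan-out of a three-level tree is an assumption rather than something the slide states.

```python
import math

LOG_BASE = 64        # from the slide: log64(number of servers)
PER_LEVEL_US = 100   # from the slide: 100us multiplier

def open_overhead_us(num_servers):
    """Open overhead in microseconds: 100us * log64(num_servers)."""
    return PER_LEVEL_US * math.log2(num_servers) / math.log2(LOG_BASE)
```

With this formula, 64 servers cost 100 microseconds, 4,096 servers cost 200, and 262,144 servers cost 300.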
Cluster Data Scattering (usage)
Cluster Data Scattering (utilization)
Low Latency Opportunities
New programming paradigm
  Ultra-fast access to small random blocks
  Accommodate object data
  Memory I/O instead of CPU to optimize access
  Allows superior ad hoc object selection
Structured clustering to scale access to memory
  Multi-Terabyte memory systems at commodity prices
  PetaCache Project
  SCALLA: Structured Cluster Architecture for Low Latency Access
Increased data exploration opportunities
Memory Access Characteristics
Block size effect on average overall latency per I/O (1 job - 100k I/O’s)
Scaling effect on average overall latency clients (5 - 40 jobs)
Disk I/O
Mem I/O
Conclusion
System performs far better than we anticipated
Why?
  Excruciating attention to details
    Protocols, algorithms, and implementation
  Effective software collaboration
    INFN/Padova: Fabrizio Furano, Alvise Dorigo
    Root: Fons Rademakers, Gerri Ganis
    Alice: Derek Feichtinger, Guenter Kickinger
    Cornell: Gregory Sharp
    SLAC: Jacek Becla, Tofigh Azemoon, Wilko Kroeger, Bill Weeks
    BaBar: Pete Elmer
  Critical operational collaboration
    BNL, CNAF, FZK, INFN, IN2P3, RAL, SLAC
Commitment to “the science needs drive the technology”