ROBIN Project Update
Dr. Wenji Wu ([email protected]), Fermilab
LHCOPN-LHCONE meeting #46
Tuesday, March 23, 2021
Many people’s hard work
FNAL: Wenji Wu, Liang Zhang, Qiming Lu, Amy Jin, Phil DeMar
iCAIR/StarLight: Joe Mambretti, Se-young Yu, Fei Yeh, Jim-Hao Chen
ESnet: Inder Monga, Xi Yang, Tom Lehman, Chin Guok, John Macauley
Outline
• Objectives
• Evaluation Methodology
• Testbed features and configurations
• ROBIN Deployment and Configurations
• Rucio/FTS Deployment and Configurations
• Results
• Conclusion and Future Plans
Objectives: Evaluate and compare two data service platforms
• Rucio/BigData Express/SENSE (ROBIN): the next-generation data service platform
• Rucio/FTS: the existing data service platform
Evaluation Methodology - 1
• Deploy ROBIN and Rucio/FTS on a trans-Atlantic international testbed to evaluate and compare them
• A trans-Atlantic international testbed
  • Two administratively independent sites
    • The StarLight International/National Communication Exchange Facility in Chicago
    • The CERNLight Open Exchange in Switzerland
  • A dedicated layer-2 WAN circuit connects the two sites
Evaluation Methodology - 2
• Run data transfers between StarLight and CERN DTNs
• Performance metric
  • Data transfer throughput
• Five scenarios
  • Single large file: 20 GB
  • Group of small files: Linux source tree 4.4.9 (718 MB total, 53351 files, max. 2 MB)
  • Dataset-10%: a mix of large (10 GB), medium (10 MB), and small (1 MB) files; total size 200 GB, with small files making up 10% of it
  • Dataset-20%: a mix of large (10 GB), medium (10 MB), and small (1 MB) files; total size 200 GB, with small files making up 20% of it
  • Dataset-40%: a mix of large (10 GB), medium (10 MB), and small (1 MB) files; total size 200 GB, with small files making up 40% of it
Dataset        # of Small files   # of Medium files   # of Large files
Dataset-10%    20000              8000                10
Dataset-20%    40000              8000                8
Dataset-40%    80000              6000                6
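The file counts in the table can be cross-checked against the stated 200 GB total and the small-file percentages. The sketch below is only a sanity check, not part of the ROBIN tooling; it assumes decimal units (1 GB = 1000 MB) and the nominal file sizes from the scenario list.

```python
# Nominal file sizes in MB, taken from the scenario descriptions (1 GB = 1000 MB assumed).
SIZE_MB = {"small": 1, "medium": 10, "large": 10_000}

# File counts per dataset, copied from the table above.
COUNTS = {
    "Dataset-10%": {"small": 20_000, "medium": 8_000, "large": 10},
    "Dataset-20%": {"small": 40_000, "medium": 8_000, "large": 8},
    "Dataset-40%": {"small": 80_000, "medium": 6_000, "large": 6},
}


def total_mb(counts):
    """Total dataset volume in MB."""
    return sum(SIZE_MB[kind] * n for kind, n in counts.items())


def small_share_percent(counts):
    """Share of the total volume held in small files, in percent."""
    return 100 * SIZE_MB["small"] * counts["small"] / total_mb(counts)


if __name__ == "__main__":
    for name, counts in COUNTS.items():
        print(name, total_mb(counts) / 1000, "GB,",
              small_share_percent(counts), "% in small files")
```

Under these assumptions each dataset adds up to exactly 200 GB, and the percentage in the dataset name is the share of volume (not of file count) held in small files.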
Testbed Features and Configurations – DTN
StarLight:
Role   Host Name               Public Network                         Private Network                                    Storage
DTN    dtn110.sl.startap.net   165.124.33.142 (management), 1 Gb/s    10.250.38.53 (vlan2038@p4p1), 100 Gb/s             /dev/nvme0n1 (SSD)

CERN:
Role   Host Name               Public Network                         Private Network                                    Storage
DTN    dtn04.cern.ch           192.91.245.29 (enp1s0), 1 Gb/s         10.250.38.200 (enp4s0f0.2038@enp4s0f0), 40 Gb/s    /dev/sdc (SSD)
Run “dd” to benchmark DTN disk performance
Host                    Disk                        Throughput
dtn04.cern.ch           /dev/sdc (SSD)              259.9 MB/s = 2.079 Gb/s
dtn110.sl.startap.net   /dev/nvme0n1 (NVMe SSD)     1091 MB/s = 8.728 Gb/s
The SSD testing tool fio shows similar metrics.
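The slides do not preserve the exact dd command line, so the following is only a rough Python analogue of a dd-style sequential write test, not the authors' script. The function name, block size, and test size are illustrative assumptions; it writes zero-filled blocks to a temporary file, forces them to disk with fsync, and reports MB/s with 1 MB = 10^6 bytes.

```python
import os
import tempfile
import time


def sequential_write_mb_per_s(directory, total_bytes=64_000_000, block_bytes=1_000_000):
    """Write total_bytes of zeros in block_bytes chunks and return throughput in MB/s."""
    block = b"\0" * block_bytes
    fd, path = tempfile.mkstemp(dir=directory)
    written = 0
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            while written < total_bytes:
                chunk = block[: min(block_bytes, total_bytes - written)]
                f.write(chunk)
                written += len(chunk)
            f.flush()
            os.fsync(f.fileno())  # include the time needed to reach stable storage
        elapsed = time.perf_counter() - start
    finally:
        os.remove(path)  # do not leave the scratch file behind
    return written / 1e6 / elapsed


if __name__ == "__main__":
    rate = sequential_write_mb_per_s(tempfile.gettempdir())
    print(f"{rate:.1f} MB/s = {rate * 8 / 1000:.3f} Gb/s")
```

A 64 MB test is far too small for a meaningful disk benchmark because caches dominate; a real run would use a file much larger than RAM and point `directory` at the file system under test.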
Testbed Features and Configurations – DTN (cont.)
Run “iperf3” to benchmark network throughput between StarLight and CERN
Tool     Source                  Destination     Public – Public Network    Private – Private Network
iperf3   dtn110.sl.startap.net   dtn04.cern.ch   0.94 Gb/s = 120.3 MB/s     7.80 Gb/s = 998.4 MB/s
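The benchmark tables quote each rate in both Gb/s and MB/s. The sketch below reproduces those conversions; the helper names are illustrative. It makes the conversion factor explicit because the disk figures match 1 Gb/s = 1000 Mb/s, while the iperf3 figures match 1 Gb/s = 1024 Mb/s.

```python
def mb_per_s_to_gbps(mb_per_s, mega_per_giga=1000):
    """Convert megabytes per second to gigabits per second."""
    return mb_per_s * 8 / mega_per_giga


def gbps_to_mb_per_s(gbps, mega_per_giga=1000):
    """Convert gigabits per second to megabytes per second."""
    return gbps * mega_per_giga / 8


if __name__ == "__main__":
    # Disk benchmark figures (decimal factor).
    print(round(mb_per_s_to_gbps(259.9), 3), round(mb_per_s_to_gbps(1091), 3))
    # Network benchmark figures (the quoted MB/s values correspond to a factor of 1024).
    print(round(gbps_to_mb_per_s(0.94, 1024), 1), round(gbps_to_mb_per_s(7.80, 1024), 1))
    # The same network rates with the decimal factor, for comparison.
    print(round(gbps_to_mb_per_s(0.94), 1), round(gbps_to_mb_per_s(7.80), 1))
```

With the decimal factor the two network rates come to 117.5 MB/s and 975.0 MB/s, so the MB/s values from the two benchmarks should be compared with the unit convention in mind.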
Testbed Features and Configurations – Network
ROBIN Deployment and Configuration
[Deployment diagram]
StarLight:
• Rucio Server @ wwportal:8443 (Rucio Cores, Rucio Daemons, BigData Express Plugin; Postgres accessed via SQL)
• BDE Head @ wwportal:5000 (BDE Portal, BDE Server, BDE Launcher, BDE Rucio Extension, MDTMFTP client)
• BDE DTN @ dtn110.sl.startap.net (BDE Agent, MDTMFTP Server, Files)
CERN:
• BDE Head @ cixp-surfnet-dtn.cern.ch:5000 (BDE Portal, BDE Server, BDE Launcher, BDE Rucio Extension, MDTMFTP client)
• BDE DTN @ dtn04.cern.ch (BDE Agent, MDTMFTP Server, Files)
Home:
• Rucio Client @ mac-131933 (terminals and Web UI)
Connections shown: Rucio, BDE-Rucio, BDE/mdtmFTP, control plane, SENSE Service (data plane: VLAN 2038)
Other diagram labels: Web server; numbered steps 1 to 4

Location    Hostname
StarLight   wwportal
StarLight   dtn110.sl.startap.net
CERN        cixp-surfnet-dtn.cern.ch
CERN        dtn04.cern.ch
Home        mac-131933.local
Rucio/FTS Deployment and Configuration
[Deployment diagram]
StarLight:
• Rucio Server @ wwportal:8443 (Rucio Cores, Rucio Daemons, FTS Client; Postgres accessed via SQL)
• FTS Service @ wwportal:8446 (FTS Server, gfal2, XRootD client)
• XRootD @ DTN dtn110.sl.startap.net (XRootD Server, Files)
CERN:
• XRootD @ DTN dtn04.cern.ch (XRootD Server, Files)
Home:
• Rucio Client @ mac-131933
Connections shown: control plane, SENSE Service (data plane: VLAN 2038)
Other diagram labels: Web server; numbered steps 1 to 4

Location    Hostname
StarLight   wwportal
StarLight   dtn110.sl.startap.net
CERN        cixp-surfnet-dtn.cern.ch
CERN        dtn04.cern.ch
Home        mac-131933.local
ROBIN: Data Transfer Job Submission
• ROBIN submits transfer jobs through Rucio CLI commands
• Transfer jobs and related files are tracked and managed by Rucio services.
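The commands used on the testbed are not preserved in these slides. As an illustration only, a transfer in Rucio is typically requested by adding a replication rule, and the sketch below builds such a command line without running it. The scope, dataset name, and RSE name are hypothetical, and the `rucio add-rule DID COPIES RSE_EXPRESSION` form is the one used by Rucio clients of that period; newer releases may organize the subcommands differently, so check the installed client's help.

```python
import shlex


def add_rule_command(scope, name, copies, rse_expression):
    """Build (but do not execute) a Rucio CLI call that requests a replication rule."""
    did = f"{scope}:{name}"  # Rucio data identifiers are written as scope:name
    return ["rucio", "add-rule", did, str(copies), rse_expression]


if __name__ == "__main__":
    # Hypothetical scope, dataset, and destination RSE, for illustration only.
    cmd = add_rule_command("user.demo", "dataset-10pct", 1, "CERN_DTN")
    print(shlex.join(cmd))
```

Building the argument list separately from execution keeps the sketch safe to run on a machine with no Rucio client installed.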
ROBIN: Data Transfer Status
ROBIN provides a visual view of data transfer status through a web GUI.
Result 1: Transfer Speed
Rucio/FTS vs. ROBIN Transfer Speed (MB/s)
Scenario       Rucio/FTS   ROBIN
20GB           163.90      249.00
Dataset-10%    12.71       268.55
Dataset-20%    11.45       243.98
Dataset-40%    6.72        215.35
Linux 4.4.9    0.04        26.99
Result 2: Comparative Analysis
ROBIN outperforms Rucio/FTS significantly: about 1.5x on the single 20 GB file, 21x to 32x on the mixed datasets, and about 675x on the Linux source tree.

Rucio/FTS vs. ROBIN Transfer Speed, as a ratio (%) with Rucio/FTS = 100%
Scenario       Rucio/FTS   ROBIN
20GB           100%        152%
Dataset-10%    100%        2113%
Dataset-20%    100%        2131%
Dataset-40%    100%        3205%
Linux 4.4.9    100%        67475%
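The ratios in Result 2 follow directly from the transfer speeds in Result 1: each is the ROBIN speed divided by the Rucio/FTS speed for the same scenario, expressed as a percentage. A minimal sketch of that arithmetic, with variable and function names chosen here for illustration:

```python
# (Rucio/FTS, ROBIN) transfer speeds in MB/s, copied from Result 1.
SPEEDS_MB_S = {
    "20GB": (163.90, 249.00),
    "Dataset-10%": (12.71, 268.55),
    "Dataset-20%": (11.45, 243.98),
    "Dataset-40%": (6.72, 215.35),
    "Linux 4.4.9": (0.04, 26.99),
}


def ratio_percent(fts_speed, robin_speed):
    """ROBIN speed as a percentage of the Rucio/FTS speed, rounded to a whole percent."""
    return round(robin_speed / fts_speed * 100)


if __name__ == "__main__":
    for scenario, (fts, robin) in SPEEDS_MB_S.items():
        print(f"{scenario}: {ratio_percent(fts, robin)}%")
```

The gap widens as the share of small files grows, which is consistent with the largest ratio appearing for the Linux source tree scenario.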
Conclusion and Future Plans
• ROBIN outperforms Rucio/FTS significantly in all five scenarios, most of all on small-file workloads.
• Future plans
  • Continue to test/evaluate ROBIN
  • 100 Gbps international WAN paths
  • High-end DTNs
  • Multiple site deployment
  • Increased automation
  • Enhanced parameter analytics
Questions?
Additional Information
[1] Rucio: https://rucio.cern.ch/
[2] BigData Express: http://bigdataexpress.fnal.gov
[3] SENSE: http://sense.es.net