This was my final presentation of the summer to my research group at CERN. I gave this presentation with the other student I worked with, Martin Barisits, to the Distributed Data Management group of ATLAS at CERN.
MARTINWILLSIM GRID Simulator
Martin Barisits, Will Boyd
Supervised by Mario Lassnig and Vincent Garonne
August 13, 2009
Barisits, Boyd (Vienna UT, Georgia Tech) MARTINWILLSIM August 13, 2009 1 / 41
Content
1 Introduction
2 Approach
3 Topology Generator
4 Load Generator
5 Simulator
6 Simulation Results
7 Conclusion
8 Acknowledgements
9 References
Introduction
The Authors
Martin Barisits, Vienna UT, Austria
• BSc: Medical Computer Science
• MSc: Computational Intelligence
Will Boyd, Georgia Tech, USA
• BSc: Physics & Computer Science
The Problem
• Goal: Test data distribution strategies
• Need: Simulator
• Need: Ability to load the Simulator with the current GRID Topology
• Need: Inject the Simulator with realistic workloads
• Process Results
Approach
Design
• Two basic challenges
  • Write a tool to get a snapshot of the whole GRID environment (Topology, Usage) and to generate Loads
  • Write a Simulator which can execute this input
• Different Simulators for GRID analysis are available in the research community
• To save time, we decided to base our Simulator on an existing Simulator package
Package Evaluation
• Evaluation of GRID/cloud computing simulation packages
  • SimGrid [2]
    • Based on pure C
    • Pros: Fast execution time; low memory consumption; scalable
    • Cons: Lacking in some functionality; high level of abstraction
  • GridSim [3]
    • Java-based
    • Pros: Highly developed; internal logging of network traffic; easier to use; packet-based
    • Cons: Slow execution time; high memory consumption; not scalable
Package Performance
• Attempted to simulate one day on GRID (1.5 million file transfers)
• GridSim: exponential in CPU time with increasing transfers
• SimGrid: linear in CPU time with increasing transfers
Flow
Topology Generator
GRID Sites
GRID sites across the world
ATLAS Computing Model
• Hierarchical computing network
  • Tier-0
  • Tier-1
  • Tier-2
• Tier-0 (CERN) generates data
• Tier-1s store data
• Tier-2s process data
The Tier-0 and Tier-2 network configuration [1]
The Topology Generator
• TopologyGen.py
  • Script to construct GRID topology
  • Parses TiersOfATLASCache.py
  • Finds and associates Tier-1s and Tier-2s
  • Queries the DQ2 database
    • Total disk space capacity
    • Used disk space
  • Topology is written to two XML files
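The steps above can be sketched roughly as follows. This is a minimal illustration, not the actual TopologyGen.py: the `SITES` dictionary stands in for what would be read from TiersOfATLASCache.py and the DQ2 database, the helper names `build_platform` and `build_deployment` are hypothetical, and the `platform`, `deployment`, and `host` element names are assumptions (the slides only show `route` and `process` elements). The site names and disk figures are copied from the example shown later in the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the parsed site cache and the DQ2 query results.
SITES = {
    "INFN-T1_DATADISK": {
        "disk": (214576722, 75266283),  # (total, used) disk space
        "tier2s": ["INFN-MILANO-ATLASC_DATADISK", "INFN-NAPOLI-ATLAS_DATADISK"],
    },
}


def build_platform(sites):
    """First XML file: one node declaration per storage endpoint."""
    root = ET.Element("platform")
    for tier1, info in sites.items():
        ET.SubElement(root, "host", id=tier1)
        for tier2 in info["tier2s"]:
            ET.SubElement(root, "host", id=tier2)
    return root


def build_deployment(sites):
    """Second XML file: one process per Tier-1 with its disk figures."""
    root = ET.Element("deployment")
    for tier1, info in sites.items():
        total, used = info["disk"]
        proc = ET.SubElement(root, "process", function="Tier1Storage", host=tier1)
        ET.SubElement(proc, "argument", value=str(total))
        ET.SubElement(proc, "argument", value=str(used))
        # Associated Tier-2s as a semicolon-terminated list, as on the slides.
        ET.SubElement(proc, "argument", value=";".join(info["tier2s"]) + ";")
    return root


platform = build_platform(SITES)
deployment = build_deployment(SITES)
print(ET.tostring(platform, encoding="unicode"))
print(ET.tostring(deployment, encoding="unicode"))
```

Keeping the two trees separate mirrors the split between what the network looks like and what runs on each node.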
Platform and Deployment Files
• Platform file
  • Node declarations
  • Link declarations
  • Route declarations
• Deployment file
  • Logfiles for each node
  • Total and used disk space
  • Used disk space by datatype
  • Tier-0 loadfiles
  • Associated Tier-1s and Tier-2s
Route declaration from a Tier-0 to Tier-1 in the platform file
    <route src='CERN_52' dst='RAL-LCG2_MCDISK'>
      <link:ctn id='RAL-LCG2_MCDisk_InternalLink'/>
      <link:ctn id='RAL_OPNLinkInternal'/>
      <link:ctn id='CERN_52_InternalLink'/>
    </route>
Process definition in the deployment file
    <process function='Tier1Storage' host='INFN-T1_DATADISK'>
      <argument value='1'/>          <!-- logfile -->
      <argument value='214576722'/>  <!-- total disk space -->
      <argument value='75266283'/>   <!-- used disk space -->
      <argument value='2631309'/>    <!-- RAW -->
      <argument value='0'/>          <!-- SIM -->
      <argument value='0'/>          <!-- DRD -->
      <argument value='28882683'/>   <!-- ESD -->
      <argument value='21172405'/>   <!-- AOD -->
      <argument value='0'/>          <!-- DPD -->
      <argument value='244615'/>     <!-- TAG -->
      <argument value='INFN-MILANO-ATLASC_DATADISK;INFN-NAPOLI-ATLAS_DATADISK;'/>
    </process>
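To illustrate how the positional arguments of such a process element could be read back, here is a small sketch using Python's standard `xml.etree.ElementTree`. The field order follows the comments in the slide's XML. The function name `parse_process` and the returned dictionary layout are hypothetical, and the simulator itself (written in C) presumably receives these values through SimGrid rather than through code like this.

```python
import xml.etree.ElementTree as ET

PROCESS_XML = """
<process function='Tier1Storage' host='INFN-T1_DATADISK'>
  <argument value='1'/>
  <argument value='214576722'/>
  <argument value='75266283'/>
  <argument value='2631309'/>
  <argument value='0'/>
  <argument value='0'/>
  <argument value='28882683'/>
  <argument value='21172405'/>
  <argument value='0'/>
  <argument value='244615'/>
  <argument value='INFN-MILANO-ATLASC_DATADISK;INFN-NAPOLI-ATLAS_DATADISK;'/>
</process>
"""

# Positional meaning of the numeric arguments, as annotated on the slide.
FIELDS = ["logfile", "total", "used", "RAW", "SIM", "DRD", "ESD", "AOD", "DPD", "TAG"]
DATATYPES = FIELDS[3:]


def parse_process(xml_text):
    """Turn one process element into a dictionary keyed by field name."""
    proc = ET.fromstring(xml_text)
    values = [arg.get("value") for arg in proc.findall("argument")]
    record = {name: int(value) for name, value in zip(FIELDS, values)}
    record["host"] = proc.get("host")
    record["function"] = proc.get("function")
    # The last argument is a semicolon-terminated list of associated Tier-2s.
    record["tier2s"] = [s for s in values[len(FIELDS)].split(";") if s]
    return record


node = parse_process(PROCESS_XML)
by_type = sum(node[t] for t in DATATYPES)
print(node["host"], node["tier2s"])
print("used:", node["used"], "sum over listed datatypes:", by_type)
```

In this example the per-datatype figures add up to less than the used disk space, so the listed datatypes evidently do not account for everything stored on the node.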
MARTINWILLSIM GRID Topology
The topology that is generated for simulation
Load Generator
Generating a Load (LoadGen.py)
• Loadfile given to each Tier-0
• Loadfiles define dataset transfers
  • Unique dataset ID
  • Random (uniform) target Tier-1 storage node
  • Random (uniform) filesize (0.5-6 GB)
  • Random (weekly distribution) inter-submission time
  • Dataset datatype (e.g., RAW)
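A loadfile entry with the fields listed above might be generated along these lines. This is only a sketch, not the actual LoadGen.py: the record layout and the name `generate_load` are hypothetical, the datatype is drawn uniformly (the slides do not say how it is chosen), and the exponential inter-submission gap is a simple placeholder for the weekly distribution.

```python
import random

# Datatype names as they appear in the deployment example on the slides.
DATATYPES = ["RAW", "SIM", "DRD", "ESD", "AOD", "DPD", "TAG"]


def generate_load(tier1_nodes, n_datasets, mean_gap_s=10.0, seed=0):
    """Return a list of dataset-transfer records for one Tier-0 loadfile."""
    rng = random.Random(seed)
    now = 0.0
    records = []
    for dataset_id in range(n_datasets):        # unique, increasing dataset ID
        # Placeholder gap; the real tool samples a weekly distribution.
        now += rng.expovariate(1.0 / mean_gap_s)
        records.append({
            "id": dataset_id,
            "target": rng.choice(tier1_nodes),   # uniform target Tier-1
            "size_gb": rng.uniform(0.5, 6.0),    # uniform size, 0.5-6 GB
            "submit_time_s": now,
            "datatype": rng.choice(DATATYPES),   # assumption: uniform choice
        })
    return records


load = generate_load(["RAL-LCG2_MCDISK", "INFN-T1_DATADISK"], 5)
for record in load:
    print(record)
```

Seeding the generator makes a loadfile reproducible, which helps when comparing distribution strategies on identical input.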
Load Distribution
• Dataset distribution
  • Uniform background traffic
  • Wednesday/Friday peak traffic
  • Random "spikes" of traffic
• Each component is weighted
• Distribution can easily be adjusted
An example weekly dataset transfer distribution
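One way to picture such a weighted distribution is as a mixture from which submission times are sampled. The sketch below is an assumption-laden illustration rather than the authors' implementation: the Gaussian shape and width of the peaks and spikes, the weights, the spike positions, and the convention that the week starts on Monday at 00:00 are all made up for the example.

```python
import random

DAY = 24 * 3600
WEEK = 7 * DAY
# Midday Wednesday and Friday, counting from Monday 00:00 (an assumption).
PEAK_CENTERS = (2.5 * DAY, 4.5 * DAY)


def sample_submit_time(rng, weights, spike_centers):
    """Draw one submission time (seconds into the week) from the mixture."""
    component = rng.choices(("background", "peak", "spike"), weights=weights)[0]
    if component == "background":
        t = rng.uniform(0, WEEK)                              # uniform background
    elif component == "peak":
        t = rng.gauss(rng.choice(PEAK_CENTERS), 0.25 * DAY)   # broad peak
    else:
        t = rng.gauss(rng.choice(spike_centers), 3600)        # narrow spike
    return t % WEEK                                           # wrap into the week


rng = random.Random(1)
times = sorted(
    sample_submit_time(rng, weights=(0.5, 0.4, 0.1),
                       spike_centers=(0.3 * DAY, 5.7 * DAY))
    for _ in range(10_000)
)
gaps = [b - a for a, b in zip(times, times[1:])]   # inter-submission times
per_day = [0] * 7
for t in times:
    per_day[min(int(t // DAY), 6)] += 1
print(per_day)
```

Adjusting the distribution then amounts to changing the three weights or the lists of peak and spike centers.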
Simulator
Facts
• Based on SimGrid [2]
• Implemented in C
• Intent to implement an extensible Simulation-Framework rather than a strict Simulator
• Goals:
  • Fast
  • Scalable
  • Representative
Features
• Build network topology according to the injected Topology File
• Simulate the shipment and processing of DataSets
• Give the user a framework to implement or change their own behavior
• Provide functions to write simulation output
• Background Noise Generation (Traffic from other Experiments, ...)
Major Entities
• Nodes (Tier0, Tier1, Tier2)
• Tasks (data transfer)
• DataSets
• Links
Task
Task: Taskname, Size, (DataSet)
• Taskname
  • Command
• Size
  • Communication Size
  • Execution Size
• DataSet
  • DataSet ID
  • DataSet Size
  • DataSet Type
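The structure of a Task can be mirrored in a couple of small record types. The simulator itself is written in C, so the Python dataclasses below are only an illustration of the fields listed above. The field names, the optional DataSet, and the helper `make_push_task` (which assumes a pure transfer has a communication size equal to the DataSet size and no execution size) are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DataSet:
    dataset_id: int
    size: float        # DataSet size (unit not stated on the slides)
    datatype: str      # e.g. "RAW"


@dataclass
class Task:
    name: str                  # Taskname, which carries the command
    communication_size: float  # amount of data to move over the links
    execution_size: float      # amount of computation at the receiver
    dataset: Optional[DataSet] = None


def make_push_task(dataset):
    """Hypothetical helper: wrap a DataSet in a pure transfer task."""
    return Task(name="PUSH",
                communication_size=dataset.size,
                execution_size=0.0,
                dataset=dataset)


task = make_push_task(DataSet(dataset_id=42, size=3.2, datatype="RAW"))
print(task)
```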
Command Language
• (PUSH, DataSet)
• (PULL, DataSetTemplate)
• (DELETE, DataSetTemplate)
• (PROCESS, DataSet)
• (NOISE)
• (INITSHUTDOWN)
• (FINALIZE)
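As a compact summary of the list above, the commands and the kind of payload each one carries can be written down as a small table. This is an illustrative sketch only: the enum, the `PAYLOAD` mapping, and the `is_well_formed` check are hypothetical names, and the slides do not define what a DataSetTemplate contains, so it is treated here as an opaque label.

```python
from enum import Enum


class Command(Enum):
    PUSH = "PUSH"
    PULL = "PULL"
    DELETE = "DELETE"
    PROCESS = "PROCESS"
    NOISE = "NOISE"
    INITSHUTDOWN = "INITSHUTDOWN"
    FINALIZE = "FINALIZE"


# Payload kind per command, exactly as listed on the slide (None = no payload).
PAYLOAD = {
    Command.PUSH: "DataSet",
    Command.PULL: "DataSetTemplate",
    Command.DELETE: "DataSetTemplate",
    Command.PROCESS: "DataSet",
    Command.NOISE: None,
    Command.INITSHUTDOWN: None,
    Command.FINALIZE: None,
}


def is_well_formed(command, payload_kind=None):
    """True if the payload kind matches what the command expects."""
    return PAYLOAD[command] == payload_kind


print(is_well_formed(Command.PUSH, "DataSet"))
print(is_well_formed(Command.FINALIZE, "DataSet"))
```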
Nodes
• Different types of nodes with different functions
  • Producer (Tier 0)
  • Storages (Tier 1, Tier 2)
  • Hosts (Tier 1, Tier 2)
  • FinalizeNode
• All nodes understand the Command Language
• Different nodes execute commands differently
  • It's up to the user to define the semantics of a node
• Node Features
  • DataSet Store (Hashmap)
  • Ability to write simulation output
  • Queues
  • Simulation Functions (Execute, Sleep, ...)
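The idea that every node understands the same commands but reacts to them differently can be sketched with a base class and an override. This Python sketch is not the simulator's C code: the class layout, the `on_<command>` dispatch convention, the round-robin choice of the next Tier-2, and the list used as a log are all assumptions chosen to keep the example short.

```python
from collections import deque


class Node:
    """Base node: stores pushed DataSets and logs what it did."""

    def __init__(self, name):
        self.name = name
        self.store = {}        # DataSet store (hash map): dataset id -> size
        self.queue = deque()   # pending (command, payload) tasks
        self.log = []          # stands in for the simulation output file

    def receive(self, command, payload=None):
        self.queue.append((command, payload))

    def step(self):
        """Take one task from the queue and dispatch it by command name."""
        command, payload = self.queue.popleft()
        handler = getattr(self, "on_" + command.lower(), None)
        if handler is None:
            self.log.append(f"{self.name}: ignored {command}")
        else:
            handler(payload)

    def on_push(self, payload):
        dataset_id, size = payload
        self.store[dataset_id] = size
        self.log.append(f"{self.name}: stored {dataset_id}")


class Tier1Storage(Node):
    """Stores a DataSet, then forwards it to one associated Tier-2."""

    def __init__(self, name, tier2s):
        super().__init__(name)
        self.tier2s = tier2s
        self.next_index = 0

    def on_push(self, payload):
        super().on_push(payload)
        # Assumption: associated Tier-2s are served in round-robin order.
        target = self.tier2s[self.next_index % len(self.tier2s)]
        self.next_index += 1
        target.receive("PUSH", payload)


milano = Node("INFN-MILANO-ATLASC_DATADISK")
napoli = Node("INFN-NAPOLI-ATLAS_DATADISK")
tier1 = Tier1Storage("INFN-T1_DATADISK", [milano, napoli])

tier1.receive("PUSH", ("ds-1", 3.2))
tier1.receive("PUSH", ("ds-2", 1.5))
tier1.receive("NOISE")
while tier1.queue:
    tier1.step()
for node in (milano, napoli):
    while node.queue:
        node.step()
print(tier1.log + milano.log + napoli.log)
```

Changing the semantics of a node type then means overriding one handler, which is the kind of extension point the framework is meant to offer.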
Pseudocode of a Tier-1 Storage
    while (1) {
        task = receiveTask();
        switch (task) {
            case <PUSH, dataSet>:
                store(dataSet);              // store in the file system
                tier2 = getNextTier2();
                send(tier2, PUSH, dataSet);  // send the task to a Tier-2
                writeStatistics();
                break;                       // do not fall through into DELETE
            case <DELETE, dataSet>:
                ...
            case <FINALIZE, dataSet>:
                writeStatistics();
                break;
            ...
        }
    }
The way of a DataSet (figure sequence, steps 1/4 to 4/4)
Simulation Results
Overloading the GRID
Tier-0 dataset submission distribution
Disk space evolution with increasing daily dataset transfers
Disk Space Evolution
Tier-1 storage node
An associated Tier-2 storage node
Data Storage by Datatype
Uniform dataset transfer distribution
Simulated dataset transfer distribution
Scalability of MARTINWILLSIM
• MartinWillSim run to simulate increasing number of days
• 250,000 tasks/day (800 TB/day)
• Simulated one month of dataset transfers in 40 mins.
• CPU time linear with number of simulated tasks
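The figures above imply a rough throughput, worked out below. Two assumptions are made that the slides do not state: "one month" is taken as 30 days, and 1 TB is taken as 1000 GB. The mean size per task that falls out is close to the 3.25 GB mean of a uniform 0.5-6 GB distribution, which is at least consistent with the load described earlier.

```python
TASKS_PER_DAY = 250_000
TB_PER_DAY = 800
DAYS = 30                    # assumption: one month = 30 days
RUN_SECONDS = 40 * 60        # 40 minutes of run time

total_tasks = TASKS_PER_DAY * DAYS
tasks_per_second = total_tasks / RUN_SECONDS
mean_size_gb = TB_PER_DAY * 1000 / TASKS_PER_DAY   # assumption: 1 TB = 1000 GB
uniform_mean_gb = (0.5 + 6.0) / 2                  # mean of uniform 0.5-6 GB

print(total_tasks)        # 7500000
print(tasks_per_second)   # 3125.0
print(mean_size_gb)       # 3.2
print(uniform_mean_gb)    # 3.25
```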
Conclusion
Recap
• Evaluation of different Simulator Packages [2 Weeks]
• Design and Implementation of:
  • Topology & Load Generator [6 Weeks]
  • MartinWillSim Simulator [6 Weeks]
• Result Analysis
• Documentation
Future Work
• Add LISP hooks for Decision Making
• Add more detail of the ATLAS Computing model to the simulator
• Add functions for random errors (node failures, link failures)
• Add recording of other statistics for result analysis
  • Link Throughput
  • Usage of Processing Capacities
  • Replication Factors
  • User behavior
• For higher detail: Add Files as smallest Entity to the Simulator
• Further Validation
Acknowledgements
Thank you!
• Mario Lassnig
• Vincent Garonne
• Angelos Molfetas
• Ingrid Schmid
• All those who helped make the 2009 Summer Student Programme possible!
References
[1] https://twiki.cern.ch/twiki/pub/LHCOPN/ImplementationDetails/map-lhcopn.png
[2] Casanova, Legrand, Quinson: SimGrid: a Generic Framework for Large-Scale Distributed Experiments, 2008
[3] Buyya, Murshed: GridSim: A Toolkit for the Modeling and Simulation of Distributed Resource Management and Scheduling for Grid Computing, 2002