
1

Kittikul Kovitanggoon*, Burin Asavapibhop, Narumon Suwonjandee, Gurpreet Singh

Chulalongkorn University, Thailand

July 23, 2015

Workshop on e-Science and High Performance Computing (eHPC2015)

Big data management at CMS collaboration with worldwide LHC computing grid

2

Outline

3

Introduction

CMS has adopted a distributed computing model, motivated by several factors:

• The large quantity of data and the computing requirements encouraged distributed resources from a facility-infrastructure point of view.

• Ability to leverage resources at laboratories and universities: hardware, expertise, and infrastructure.

• Benefits of providing local control of some resources.

• Ability to secure local funding sources.

Roughly 20% of the resources are located at CERN, 40% at Tier-1 sites, and 40% at Tier-2 sites.

• The model relies on tools that provide transparent access to the resources and on efficient distributed computing services.

• It can only be successful with sufficient networking between facilities: the availability of high-performance networks has made the distributed model feasible.

4

Large Hadron Collider (LHC)

• 27 km in circumference

• Collides counter-rotating beams of protons or heavy ions

• Maximum energy of proton-proton collisions at √s = 14 TeV with a luminosity of 4 × 10^34 cm^-2 s^-1 (the event rate such luminosities imply is estimated in the sketch below)

• In 2011, collisions at √s = 7 TeV and 4 × 10^33 cm^-2 s^-1

• In 2012, collisions at √s = 8 TeV and 7.7 × 10^33 cm^-2 s^-1

• In 2015, collisions expected at √s = 13 TeV and 22.8 × 10^33 cm^-2 s^-1

[Figure: the LHC ring with the four main experiments, CMS, ALICE, ATLAS and LHCb]
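The event rate that follows from an instantaneous luminosity is simply R = L × σ. The following is a minimal sketch in Python; the inelastic pp cross section of roughly 80 mb is an assumption for illustration, not a number from these slides.

# Minimal sketch: pp interaction rate R = L * sigma.
# Assumptions (not from the slides): inelastic pp cross section ~80 mb at
# 13-14 TeV; luminosity taken as the design value 1e34 cm^-2 s^-1.

MB_TO_CM2 = 1e-27  # 1 millibarn = 1e-27 cm^2

def pp_event_rate(luminosity_cm2_s: float, sigma_mb: float) -> float:
    """Return the pp interaction rate in Hz for a given instantaneous
    luminosity (cm^-2 s^-1) and cross section (mb)."""
    return luminosity_cm2_s * sigma_mb * MB_TO_CM2

if __name__ == "__main__":
    rate_hz = pp_event_rate(1e34, 80.0)
    # ~8e8 Hz, i.e. close to the ~1 GHz input rate quoted on the event-rates slide
    print(f"pp interaction rate ~ {rate_hz:.1e} Hz")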

5

Compact Muon Solenoid (CMS)

6

CMS Collisions

7

CMS Collision Data

● The CMS detectors act as gigantic digital cameras that can identify the various elementary particles produced in the millions of collisions per second.

● The decay particles from each collision are:

   recorded as they pass through the various sub-detectors, producing a series of electronic signals;

   sent as data to the CERN Data Centre (DC) for digital reconstruction;

   reconstructed from the digitized summary into a 'collision event'.

● Data from the CMS experiment are distributed around the globe by the Worldwide LHC Computing Grid (WLCG), a project built and maintained to provide data storage and analysis infrastructure for the entire CMS community.

● Thailand is involved in the WLCG through the Tier-2/Tier-3 computing centres of CMS Thailand [T2_TH_CUNSTDA and T3_TH_CHULA].

8

CMS Physics and Event Rates

Design luminosity (L) = 10^34 cm^-2 s^-1

• ~23 pp events per 25 ns bunch crossing

   • ~1 GHz input rate

   • "Good" events contain ~20 background events

• 1 kHz of W events

• 10 Hz of top events

• < 10^4 detectable Higgs decays/year

• Can store ~300 Hz of events

• Selection happens in stages (see the sketch below):

   • Level-1 triggers: 1 GHz to 100 kHz

   • High Level Triggers: 100 kHz to 300 Hz
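The staged selection quoted above can be turned into a back-of-the-envelope bandwidth estimate. The sketch below is illustrative only; the ~1 MB RAW event size is an assumption, not a number from these slides.

# Sketch of the staged rate reduction quoted on this slide, plus the storage
# bandwidth it implies. The ~1 MB/event RAW size is an assumption.

STAGES = [
    ("collisions", 1e9),            # ~1 GHz input rate
    ("Level-1 trigger", 1e5),       # 100 kHz
    ("High Level Trigger", 300.0),  # 300 Hz written to storage
]

EVENT_SIZE_MB = 1.0  # assumed RAW event size

def print_reduction(stages, event_size_mb):
    prev_rate = None
    for name, rate_hz in stages:
        if prev_rate is None:
            print(f"{name}: {rate_hz:.3g} Hz")
        else:
            print(f"{name}: {rate_hz:.3g} Hz (reduction x{prev_rate / rate_hz:.0f})")
        prev_rate = rate_hz
    bandwidth = stages[-1][1] * event_size_mb
    print(f"storage bandwidth ~ {bandwidth:.0f} MB/s at {event_size_mb} MB/event")

print_reduction(STAGES, EVENT_SIZE_MB)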

9

CMS Physics and Event Rates

10

CMS Data Flow

11

Worldwide LHC Computing Grid

12

Tier-0

The first tier in the CMS model, for which there is only one site (CERN), is known as Tier-0 (T0). The T0 performs several functions; the standard workflow is as follows (a toy sketch of this pipeline appears after the list):

1. accepts RAW data from the CMS Online Data Acquisition and Trigger System (TriDAS)

2. repacks the RAW data received from the DAQ into primary datasets based on trigger information

3. archives the repacked RAW data to tape

4. distributes the RAW datasets among the next-tier resources (Tier-1) so that two copies are saved

5. performs PromptCalibration in order to obtain the calibration constants needed to run the reconstruction

6. feeds the RAW datasets to reconstruction

7. performs prompt first-pass reconstruction, which writes the RECO data and extracts the Analysis Object Data (AOD)

8. distributes the RECO datasets among Tier-1 centers, such that the RAW and RECO match up at each Tier-1

9. distributes the full AOD to all Tier-1 centers

The T0 does not provide analysis resources and only operates scheduled activities.
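As an illustration of the workflow above, the toy Python sketch below walks RAW data through repacking, archiving, prompt reconstruction, and distribution. All function, dataset, and site names are hypothetical; the real Tier-0 runs on dedicated CMS workflow-management software, not on code like this.

# Toy sketch of the Tier-0 workflow described above (illustrative only).

from dataclasses import dataclass, field

@dataclass
class Dataset:
    name: str
    tier: str                                      # "RAW", "RECO" or "AOD"
    replicas: list = field(default_factory=list)   # sites holding a copy

def repack(raw_stream):
    """Step 2: split the DAQ stream into primary datasets by trigger path."""
    return [Dataset(name=f"/{path}/Run2015/RAW", tier="RAW") for path in raw_stream]

def archive_to_tape(ds):
    """Step 3: custodial copy at CERN."""
    ds.replicas.append("T0_CH_CERN_tape")

def distribute(ds, sites):
    """Steps 4, 8, 9: send copies to Tier-1 sites."""
    ds.replicas.extend(sites)

def prompt_reco(raw_ds):
    """Steps 5-7: calibrate, reconstruct, and extract AOD (details elided)."""
    reco = Dataset(raw_ds.name.replace("RAW", "RECO"), tier="RECO")
    aod = Dataset(raw_ds.name.replace("RAW", "AOD"), tier="AOD")
    return reco, aod

# Step 1: trigger paths arriving from TriDAS (names are made up)
for raw in repack(["SingleMuon", "DoubleElectron"]):
    archive_to_tape(raw)
    distribute(raw, ["T1_US_FNAL"])               # second custodial copy
    reco, aod = prompt_reco(raw)
    distribute(reco, ["T1_US_FNAL"])              # RECO matches RAW at the same T1
    distribute(aod, ["T1_US_FNAL", "T1_UK_RAL"])  # full AOD to all T1s (two shown)
    print(raw.name, "->", reco.name, ",", aod.name)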

13

Tier-1

• There is a set of thirteen Tier-1 (T1) sites, which are large centers in CMS collaborating countries (large national labs, e.g., FNAL and RAL).

• Tier-1 sites will in general be used for large-scale, centrally organized activities and can provide data to, and receive data from, all Tier-2 sites. Each T1 center:

1. receives a subset of the data from the T0, related to the size of the resources pledged in the WLCG MoU (a toy proportional split is sketched after this list)

2. provides a tape archive of part of the RAW data (the secure second copy), which it receives as a subset of the datasets from the T0

3. provides substantial CPU power for scheduled activities: re-reconstruction, skimming, calibration, and AOD extraction

4. stores an entire copy of the AOD

5. distributes RECO datasets, skims, and AOD to the other T1 centers and CERN, as well as to its associated group of T2 centers

6. provides secure storage and redistribution for MC events generated by the T2s
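Step 1 above says each T1 receives a share of the data related to its pledged resources. One simple way to picture this is a proportional split, sketched below with made-up pledge numbers; the real shares come from the WLCG MoU pledges.

# Toy sketch: splitting the RAW data exported from the T0 among T1 sites in
# proportion to their pledged resources. Pledge numbers are hypothetical.

T1_PLEDGES_TB = {
    "T1_US_FNAL":    20000,
    "T1_UK_RAL":      8000,
    "T1_DE_KIT":     10000,
    "T1_FR_CCIN2P3":  7000,
}

def raw_share(pledges: dict, total_raw_tb: float) -> dict:
    """Assign each T1 a RAW-data share proportional to its pledge."""
    total_pledge = sum(pledges.values())
    return {site: total_raw_tb * p / total_pledge for site, p in pledges.items()}

for site, share in raw_share(T1_PLEDGES_TB, total_raw_tb=5000).items():
    print(f"{site}: {share:.0f} TB of RAW")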

14

Tier-2

A more numerous set of smaller Tier-2 (T2) centers, but with substantial CPU resources. T2s provide:

1. services for local communities

2. grid-based analysis for the whole experiment (Tier-2 resources are available to the whole experiment through the grid; a sketch of an analysis-job configuration follows this list)

3. Monte Carlo simulation for the whole experiment

• T2 centers rely upon T1s for access to large datasets and for secure storage of the new data (generally Monte Carlo) produced at the T2.

• The MC production at Tier-2s will in general be centrally organized, with generated MC samples being sent to an associated Tier-1 site for distribution among the CMS community.

• All other Tier-2 activities will be user driven, with data placed to match resources and needs: tape, disk, manpower, and the needs of local communities.

• The Tier-2 activities will be organized by the Tier-2 authorities in collaboration with physics groups, regional associations, and local communities.
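For item 2 above, grid-based analysis at a T2 is typically submitted with the CMS CRAB tool. The fragment below is a minimal sketch of a CRAB3-style configuration, shown only to illustrate how a dataset and a storage site (here T3_TH_CHULA, one of the sites named in these slides) enter the picture; the dataset and request names are hypothetical, and parameter names follow common CRAB3 conventions that may differ between versions.

# Minimal sketch of a CRAB3-style analysis configuration (illustrative only).
from CRABClient.UserUtilities import config

config = config()

config.General.requestName = 'dimuon_analysis_2015'   # hypothetical
config.JobType.pluginName  = 'Analysis'
config.JobType.psetName    = 'dimuon_cfg.py'          # user's CMSSW config (hypothetical)

config.Data.inputDataset   = '/SingleMuon/Run2015B-PromptReco-v1/AOD'  # illustrative
config.Data.splitting      = 'FileBased'
config.Data.unitsPerJob    = 10

# Output is written back to a site where the user has storage rights,
# e.g. one of the Thai sites mentioned in these slides.
config.Site.storageSite    = 'T3_TH_CHULA'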

15

Thailand at CERN

WLCG
• Tier-2/Tier-3 computing centres of CMS Thailand [T2_TH_CUNSTDA and T3_TH_CHULA]

CMS
• Chulalongkorn University, Bangkok

ALICE
• King Mongkut's University of Technology Thonburi (KMUTT), Bangkok
• Thai Microelectronics Center (TMEC), Muang Chachoengsao
• Suranaree University of Technology, Nakhon Ratchasima

CERN and Thailand:
• http://international-relations.web.cern.ch/international-relations/nms/thailand.html

16

Tier-2/Tier-3 Monitoring

Monitoring the Tier-2/Tier-3 centres of CMS Thailand that are part of the Worldwide LHC Computing Grid (WLCG).

Duties for T2_TH_CUNSTDA and T3_TH_CHULA:

• Periodic checks of CMS site readiness, availability, and reliability status (a sketch of how availability and reliability can be computed follows this list).

• Maintaining adequate network bandwidth and high PhEDEx (CMS data transfer) rates.

• Helping to preserve storage capacity for analysis operations, data operations, and local activities.

• WLCG Squid monitoring of server traffic volume, HTTP hits/requests, and cached objects.
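For the site readiness duty above, availability and reliability can be tracked with a simplified version of the usual WLCG-style definitions. The sketch below uses made-up numbers and is not the official WLCG algorithm.

# Sketch of availability/reliability bookkeeping for a site:
#   availability = time the site passed its functional tests / total time
#   reliability  = passed time / (total time - scheduled downtime)
# The numbers below are made up for illustration.

def availability(ok_hours: float, total_hours: float) -> float:
    return ok_hours / total_hours

def reliability(ok_hours: float, total_hours: float, scheduled_downtime: float) -> float:
    return ok_hours / (total_hours - scheduled_downtime)

# Example month for a hypothetical site: 720 hours, 24 h of scheduled
# maintenance, tests green for 690 hours.
total, scheduled, ok = 720.0, 24.0, 690.0
print(f"availability: {availability(ok, total):.1%}")           # ~95.8%
print(f"reliability:  {reliability(ok, total, scheduled):.1%}")  # ~99.1%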

17

Tier-2/Tier-3 Monitoring

18

Tier-2/Tier-3 Monitoring

19

Tier-3 Maintenance

• Working with IBM System x3755 M3, x3550 M4, and BladeCenter H 8852 hardware.

• Ensuring suitable working condition of Tier-3 system, user machines and web-server hosts, etc.

• Management of user accounts and corresponding data.

• Serving e-Science and related activities.

20

Conclusions

21

Acknowledgments

• This research is supported by Rachadapisek Sompote Fund for Postdoctoral Fellowship, Chulalongkorn University.

• Department of Physics, Faculty of Science, Chulalongkorn University for financial support.

• The CMS collaboration.

• All eHPC 2015 staff for organizing this event.