Page 1: Foundations for an LHC Data Grid

Foundations for an LHC Data Grid

Stu Loken

Berkeley Lab

Page 2: Foundations for an LHC Data Grid

The Message

• Large-scale Distributed Computing (known as Grids) is a major thrust of the U.S. Computing community

• Annual investment in Grid R&D and infrastructure is ~$100M

• This investment can and should be leveraged to provide the Regional computing model for LHC

Page 3: Foundations for an LHC Data Grid

The Vision for the Grid

• Persistent, universal, and ubiquitous access to networked resources

• Common tools and infrastructure for building 21st-century applications

• Integration of HPC, data-intensive computing, remote visualization, and advanced collaboration technologies

Page 4: Foundations for an LHC Data Grid

The Grid from a Services View

Layered architecture, from applications down to resources:

• Applications: Chemistry, Biology, Cosmology, High Energy Physics, Environment

• Application Toolkits: Distributed Computing, Data-Intensive, Collaborative, Remote Visualization, Problem Solving, and Remote Instrumentation applications toolkits

• Grid Services (Middleware): resource-independent and application-independent services, e.g., authentication, authorization, resource location, resource allocation, events, accounting, remote data access, information, policy, fault detection

• Grid Fabric (Resources): resource-specific implementations of basic services, e.g., transport protocols, name servers, differentiated services, CPU schedulers, public key infrastructure, site accounting, directory service, OS bypass
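To make the layering concrete, here is a minimal sketch of how an application toolkit might sit on resource-independent middleware that in turn wraps resource-specific fabric. All class and function names are hypothetical illustrations, not the interfaces of Globus or any other middleware named in these slides.

# Minimal sketch of the layered services view above. Names are hypothetical,
# not the API of Globus or any real middleware package.

class GridFabric:
    """Resource-specific layer: wraps one site's local mechanisms."""
    def submit_local_job(self, executable: str) -> str:
        # A real fabric implementation would call the site's CPU scheduler.
        return f"local-job-id-for-{executable}"

class GridServices:
    """Resource-independent middleware layer built on the fabric."""
    def __init__(self, fabric: GridFabric):
        self.fabric = fabric
    def authenticate(self, user: str) -> bool:
        # Stand-in for a public-key-infrastructure check.
        return bool(user)
    def allocate_and_run(self, user: str, executable: str) -> str:
        if not self.authenticate(user):
            raise PermissionError("authentication failed")
        return self.fabric.submit_local_job(executable)

class AnalysisToolkit:
    """Application-toolkit layer: what a physics application would call."""
    def __init__(self, services: GridServices):
        self.services = services
    def run_analysis(self, user: str) -> str:
        return self.services.allocate_and_run(user, "higgs-analysis")

# Usage: the application never touches the fabric directly.
job = AnalysisToolkit(GridServices(GridFabric())).run_analysis("alice")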

Page 5: Foundations for an LHC Data Grid

Grid-based Computing Projects

• China Clipper

• Particle Physics Data Grid

• NASA Information Power Grid: Distributed Problem Solving

• Access Grid: The Future of Distributed Collaboration

Page 6: Foundations for an LHC Data Grid

Clipper Project

• ANL-SLAC-Berkeley

• Push the limits of very high-speed data transmission

• Builds on Globus middleware and high-performance distributed storage

• Demonstrated data rates up to 50 Mbytes/sec
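The slides do not spell out the transfer mechanism, but distributed-storage transfers at such rates are typically achieved by striping a file across several servers and reading the pieces over parallel streams. The sketch below illustrates that general idea only; fetch_block and the server names are hypothetical stand-ins, not the Clipper or DPSS code.

# Sketch: read a striped file by issuing block reads to several servers at once.
from concurrent.futures import ThreadPoolExecutor

def fetch_block(server: str, offset: int, size: int) -> bytes:
    # Placeholder for a byte-range read from one storage server.
    return b"\x00" * size

def parallel_read(servers: list[str], file_size: int, block: int = 8 << 20) -> bytes:
    """Aggregate throughput comes from overlapping many block reads."""
    offsets = range(0, file_size, block)
    with ThreadPoolExecutor(max_workers=len(servers)) as pool:
        futures = [
            pool.submit(fetch_block, servers[i % len(servers)], off,
                        min(block, file_size - off))
            for i, off in enumerate(offsets)
        ]
        return b"".join(f.result() for f in futures)

data = parallel_read(["dpss1", "dpss2", "dpss3"], file_size=64 << 20)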

Page 7: Foundations for an LHC Data Grid

China Clipper Tasks

High-Speed Testbed
– Computing and networking infrastructure

Differentiated Network Services
– Traffic shaping on ESnet

Monitoring Architecture
– Traffic analysis to support traffic shaping and CPU scheduling

Data Architecture
– Transparent management of data

Application Demonstration
– Standard Analysis Framework (STAF)

Page 8: Foundations for an LHC Data Grid

China Clipper Testbed

Page 9: Foundations for an LHC Data Grid

Clipper Architecture

Page 10: Foundations for an LHC Data Grid

Monitoring

End-to-end monitoring of the assets in a computational grid is necessary both for resolving network throughput problems and for dynamically scheduling resources.

China Clipper adds precision-timed event monitor agents to:
– ATM switches
– DPSS servers
– Testbed computational resources

• Produce trend analysis modules for monitor agents
• Make results available to applications
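As one way to picture the agents described above, the sketch below shows a toy monitor agent that records precision-timestamped events and exposes a simple windowed trend to applications. The event fields and the trend calculation are illustrative assumptions, not the Clipper monitoring design.

# Sketch of a timestamped monitoring agent with a simple trend estimate.
import time
from collections import deque
from dataclasses import dataclass

@dataclass
class MonitorEvent:
    timestamp: float   # seconds since the epoch
    source: str        # e.g. an ATM switch, DPSS server, or compute host
    metric: str        # e.g. "throughput_mbytes_per_s"
    value: float

class MonitorAgent:
    def __init__(self, source: str, window: int = 10):
        self.source = source
        self.events: deque = deque(maxlen=window)

    def record(self, metric: str, value: float) -> MonitorEvent:
        event = MonitorEvent(time.time(), self.source, metric, value)
        self.events.append(event)
        return event

    def trend(self) -> float:
        """Trend stand-in: mean of the most recent window of events."""
        if not self.events:
            return 0.0
        return sum(e.value for e in self.events) / len(self.events)

agent = MonitorAgent("dpss-server-1")
agent.record("throughput_mbytes_per_s", 42.0)
agent.record("throughput_mbytes_per_s", 48.0)
print(agent.trend())   # an application could poll this to steer scheduling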

Page 11: Foundations for an LHC Data Grid

Monitoring

Page 12: Foundations for an LHC Data Grid

Particle Physics Data Grid

• HENP Labs and Universities (Caltech-SLAC lead)

• Extend GRID concept to large-scale distributed data analysis

• Uses NGI testbeds as well as production networks

• Funded by DOE-NGI program

Page 13: Foundations for an LHC Data Grid

NGI: “Particle Physics Data Grid”
ANL (CS/HEP), BNL, Caltech, FNAL, JLAB, LBNL (CS/HEP), SDSC, SLAC, U. Wisconsin

High-Speed Site-to-Site File Replication Service

FIRST YEAR:

• SLAC-LBNL at least;

• Goal intentionally requires > OC12;

• Use existing hardware and networks (NTON);

• Explore “Diffserv”, instrumentation, reservation/allocation (see the sketch below).
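As a concrete illustration of the Diffserv exploration mentioned above, the sketch below marks a flow's packets with a DSCP code point via the standard IP_TOS socket option (available on POSIX platforms). The Expedited Forwarding value used here is only an example; the actual ESnet marking policy is not stated in these slides.

# Sketch: mark a flow for differentiated service by setting its DSCP code point.
import socket

EF_DSCP = 46              # Expedited Forwarding, chosen purely as an example
tos = EF_DSCP << 2        # DSCP occupies the upper 6 bits of the TOS byte

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, tos)
# sock.connect(("replica.example.org", 5000))   # hypothetical replica endpoint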

Bulk Transfer Service: 100 Mbytes/s, 100 Tbytes/year

[Diagram: Primary Site (data acquisition, CPU, disk, tape robot) replicating to a partial Replica Site (CPU, disk, tape robot)]
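A quick back-of-the-envelope check, using only the numbers above, shows what the bulk transfer target implies in sustained-transfer time:

# 100 Tbytes/year at a sustained 100 Mbytes/s.
rate_mb_per_s = 100
volume_tb_per_year = 100

seconds = volume_tb_per_year * 1_000_000 / rate_mb_per_s   # 1 TB = 1e6 MB
days = seconds / 86_400
duty_cycle = seconds / (365 * 86_400)

print(f"{seconds:.0f} s = {days:.1f} days/year = {duty_cycle:.1%} duty cycle")
# ~1,000,000 s, about 11.6 days of transfer, roughly a 3% duty cycle at full rate.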

Page 14: Foundations for an LHC Data Grid

NGI: “Particle Physics Data Grid”

Deployment of Multi-Site Cached File Access

[Diagram: a Primary Site (data acquisition, CPU, disk, tape robot) serving several Satellite Sites (CPU, disk, tape robot), which in turn serve University sites (CPU, disk) and their users]

FIRST YEAR:

• Read access only;

• Optimized for 1-10 GB files;

• File-level interface to ODBMSs;

• Maximal use of Globus, MCAT, SAM, OOFS, Condor, Grand Challenge etc.;

• Focus on getting users.
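To make the first-year goals above concrete, here is a minimal sketch of read-only, cache-on-first-access file access at a satellite or university site. fetch_from_primary, the cache directory, and the file name are hypothetical stand-ins, not the interfaces of Globus, MCAT, SAM, or OOFS.

# Sketch: serve a file from the local cache if present, else pull and cache it.
from pathlib import Path

CACHE_DIR = Path("cache")

def fetch_from_primary(logical_name: str) -> bytes:
    # Placeholder for a wide-area transfer from the primary site.
    return b"event data"

def open_cached(logical_name: str) -> bytes:
    """Read-only access: local cache hit if possible, otherwise fetch and cache."""
    local = CACHE_DIR / logical_name
    if local.exists():
        return local.read_bytes()
    data = fetch_from_primary(logical_name)
    local.parent.mkdir(parents=True, exist_ok=True)
    local.write_bytes(data)
    return data

payload = open_cached("run1234/events.root")   # hypothetical 1-10 GB file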

Page 15: Foundations for an LHC Data Grid

Information Power Grid

Distributed High-Performance Computing, Large-Scale Data Management, and Collaboration Environments for Science and Engineering

Building Problem-Solving Environments

William E. Johnston, Dennis Gannon, William Nitzberg

Page 16: Foundations for an LHC Data Grid

IPG Problem Environment

Page 17: Foundations for an LHC Data Grid

IPG Requirements

• Multiple datasets

• Complex workflow scenarios

• Data-streams from instrument systems

• Sub-component simulations coupled simultaneously

• Appropriate levels of abstraction

• Search, interpret and fuse multiple data archives

• Share all aspects of work processes

• Bursty resource availability and scheduling

• Sufficient available resources

• VR and immersive techniques

• Software agents to assist in routine/repetitive tasks

• All this will be supported by the Grid. PSEs are the primary scientific/engineering user interface to the Grid.

Page 18: Foundations for an LHC Data Grid

The Future of Distributed Collaboration Technology:

The Access Grid

Ian Foster, Rick Stevens

Argonne National Laboratory

Page 19: Foundations for an LHC Data Grid

Beyond Teleconferencing:

• Physical spaces to support distributed groupwork

• Virtual collaborative venues

• Agenda driven scenarios and work sessions

• Integration with Integrated GRID services

Page 20: Foundations for an LHC Data Grid

Access Grid Project Goals

• Enable Group-to-Group Interactions at a Distance

• Provide a Sense of Presence

• Use Quality but Affordable Digital IP-Based Audio/Video (Open Source)

• Enable Complex Multi-site Visual and Collaborative Experiences

• Build on Integrated Grid Services Architecture

Page 21: Foundations for an LHC Data Grid

The Docking Concept for Access Grid

Private Workspaces - Docked into the Group Workspace

Page 22: Foundations for an LHC Data Grid

[Node layout diagram: ambient mic (tabletop), presenter mic, presenter camera, audience camera]

Access Grid Nodes

• Access Grid Nodes Under Development:
  – Library, Workshop
  – ActiveMural Room
  – Office
  – Auditorium

Page 23: Foundations for an LHC Data Grid

Conclusion

A set of closely coordinated projects is laying the foundation for a high-performance distributed computing environment.

There appear to be good prospects for a significant long-term investment to deploy the necessary infrastructure to support Particle Physics Data Analysis.