www.ci.anl.gov www.ci.uchicago.edu Accelerating data-driven discovery by outsourcing the mundane Ian Foster

Mexico talk foster march 2012


Keynote talk at the 3rd International Conference on Supercomputing in Mexico (www.isum.mx). A great group of people! A substantially revised version of a talk with the same title given on previous occasions.


Page 1: Mexico talk foster march 2012

www.ci.anl.gov www.ci.uchicago.edu

Accelerating data-driven discovery by outsourcing the mundane

Ian Foster

Page 2: Mexico talk foster march 2012


The data deluge

Page 3: Mexico talk foster march 2012


The data deluge in biology

x10^5 in 6 years

x10 in 6 years

Page 4: Mexico talk foster march 2012


Number of sequencing machines

http://omicsmaps.com/

Page 5: Mexico talk foster march 2012


18 orders of magnitude in 5 decades!

12 orders of magnitude in 6 decades

Moore’s Law for X-ray sources

Credit: Linda Young

Page 6: Mexico talk foster march 2012


Exploding data volumes in astronomy

100,000 TB

MACHO et al.: 1 TB

Palomar: 3 TB

2MASS: 10 TB

GALEX: 30 TB

Sloan: 40 TB

Pan-STARRS: 40,000 TB

Page 7: Mexico talk foster march 2012


Exploding data volumes in climate science

Climate Model Intercomparison Project (CMIP) of the IPCC

2004: 36 TB

2012: 2,300 TB
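
The two CMIP figures imply a compound growth rate, which the short sketch below works out. It assumes smooth exponential growth between 2004 and 2012, which the slide does not claim; the function names are illustrative only.

```python
import math


def annual_growth_factor(start_tb: float, end_tb: float, years: float) -> float:
    """Constant yearly multiplier that turns start_tb into end_tb over `years`."""
    return (end_tb / start_tb) ** (1.0 / years)


def doubling_time_years(factor: float) -> float:
    """Years needed to double at a constant yearly multiplier."""
    return math.log(2) / math.log(factor)


# Figures from the slide: 36 TB in 2004, 2,300 TB in 2012.
factor = annual_growth_factor(36, 2300, 2012 - 2004)
doubling = doubling_time_years(factor)

print(f"growth per year: x{factor:.2f}")
print(f"doubling time: {doubling:.2f} years")
```

Under that assumption the archive grows by roughly 68% per year, that is, it doubles about every 16 months.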

Page 8: Mexico talk foster march 2012


Big science has been successful

All build on NSF OCI (& DOE)-supported Globus Toolkit software

LIGO: 1 PB data in last science run, distributed worldwide

ESG: 1.2 PB climate data delivered to 23,000 users; 600+ pubs

OSG: 1.4M CPU-hours/day, >90 sites, >3000 users, >260 pubs in 2010

• Robust production solutions
• Substantial teams and expense
• Sustained, multi-year effort
• Application-specific solutions, built on common technology

Page 9: Mexico talk foster march 2012


Small science is struggling

• More data, more complex data
• Ad-hoc solutions
• Inadequate software, hardware
• Data plan mandates

Page 10: Mexico talk foster march 2012


Dark data in the long tail of science

[Chart: awarded amount, 2007 ($0 to $7,000,000), for awards numbered 1 to 8,776]

NSF grant awards, 2007 (Bryan Heidorn)

Page 11: Mexico talk foster march 2012


The challenge of staying competitive

"Well, in our country," said Alice … "you'd generally get to somewhere else — if you run very fast for a long time, as we've been doing.”

"A slow sort of country!" said the Queen. "Now, here, you see, it takes all the running you can do, to keep in the same place. If you want to get somewhere else, you must run at least twice as fast as that!"

Page 12: Mexico talk foster march 2012


A crisis that demands new approaches

• We have exceptional infrastructure for the 1% (e.g., supercomputers, Large Hadron Collider, …)

• But not for the 99% (e.g., the vast majority of the 1.8M publicly funded researchers in the EU)

We need new approaches to providing research cyberinfrastructure that:
– Reduce barriers to entry
– Are cheaper
– Are sustainable

Page 13: Mexico talk foster march 2012


You can run a company from a coffee shop

Page 14: Mexico talk foster march 2012


Because businesses outsource their IT

Software as a Service (SaaS)
• Web presence
• Email (hosted Exchange)
• Calendar
• Telephony (hosted VOIP)
• Human resources and payroll
• Accounting
• Customer relationship mgmt

Page 15: Mexico talk foster march 2012


And often their large-scale computing too

Software as a Service (SaaS)
• Web presence
• Email (hosted Exchange)
• Calendar
• Telephony (hosted VOIP)
• Human resources and payroll
• Accounting
• Customer relationship mgmt

Infrastructure as a Service (IaaS)
• Data analytics
• Content distribution

Page 16: Mexico talk foster march 2012


Let’s rethink how we provide research IT

Accelerate discovery and innovation worldwide by providing research IT as a service

Leverage the cloud to
• provide millions of researchers with unprecedented access to powerful tools;
• enable a massive shortening of cycle times in time-consuming research processes; and
• reduce research IT costs dramatically via economies of scale

Page 17: Mexico talk foster march 2012

grail.cs.washington.edu

Page 18: Mexico talk foster march 2012


Cloud layers


Software as a Service: SaaS

Platform as a Service: PaaS

Infrastructure as a Service: IaaS

Page 19: Mexico talk foster march 2012


Common research data management steps

• Dark Energy Survey
• Galaxy genomics
• LIGO observatory
• SBGrid structural biology consortium
• NCAR climate data applications
• Land use change; economics


Page 21: Mexico talk foster march 2012


Scientific data delivery, 2012

• “[A] majority of users at BES facilities … physically transport data to a home institution using portable media … data volumes are going to increase significantly in the next few years (to 70 TB/day or more) – data must be transferred over the network”

• “the effectiveness of data transfer middleware [is] not just on the transfer speed, but also the time and interruption to other work required to supervise and check on the success of large data transfers”

• “It took two weeks and email traffic between network specialists at NERSC and ORNL, sys-admins at NERSC, … and combustion staff at ORNL and SNL to move 10 TB from NERSC to ORNL”

[ESnet Network Requirements Workshops, 2007-2010]

Major usability, productivity, performance problems
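
The quoted figures can be turned into sustained network rates, which makes the gap concrete. This is a back-of-the-envelope sketch that assumes decimal terabytes (10^12 bytes) and perfectly steady traffic; the function name is illustrative only.

```python
SECONDS_PER_DAY = 86_400


def sustained_mbps(terabytes: float, days: float) -> float:
    """Average rate in megabits/sec needed to move `terabytes` in `days`."""
    bits = terabytes * 1e12 * 8  # decimal terabytes, 8 bits per byte
    return bits / (days * SECONDS_PER_DAY) / 1e6


# Rate needed to ship 70 TB every day over the network.
needed = sustained_mbps(70, 1)

# Rate actually achieved when 10 TB took two weeks to move.
achieved = sustained_mbps(10, 14)

print(f"needed:   {needed:,.0f} Mb/s")
print(f"achieved: {achieved:,.0f} Mb/s")
```

On these assumptions, 70 TB/day requires about 6.5 Gb/s sustained, while the two-week transfer averaged about 66 Mb/s, roughly a factor of 100 lower.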

1980

Page 22: Mexico talk foster march 2012


The challenge: Moving big data easily

What should be trivial …

“I need my data over there – at my _____” (supercomputing center, campus server, etc.)

Data Source → Data Destination

… can be painfully tedious and time-consuming

Data Source → Data Destination
! Config issues
! Firewall issues
! Unexpected failure = manual retry

“GAAAH! %&@#&”

Page 23: Mexico talk foster march 2012

GO PICTURE

Page 24: Mexico talk foster march 2012


Globus Online: Data transfer as SaaS

• Reliable file transfer
– Easy “fire-and-forget” transfers
– Automatic fault recovery
– High performance
– Across multiple security domains
• No IT required
– Software as a Service (SaaS)
o No client software installation
o New features automatically available
– Consolidated support & troubleshooting
– Works with existing GridFTP servers
– Globus Connect solves “last mile problem”
• >4000 registered users, >3 petabytes moved

Recommended by XSEDE, NERSC, Blue Waters, and many campuses
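
To make "fire-and-forget" with automatic fault recovery concrete, here is a minimal sketch of the general pattern: retry a failed copy with exponential backoff instead of asking the user to resubmit. It is not Globus Online's implementation; `transfer_with_retry`, `make_flaky_copy`, and their parameters are hypothetical names used for illustration.

```python
import time
from typing import Callable, List


def transfer_with_retry(
    copy_once: Callable[[], None],
    max_attempts: int = 5,
    base_delay_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Run copy_once until it succeeds; return the number of attempts used.

    Hypothetical helper. After each failure it waits base_delay_s, doubling
    the wait every time (exponential backoff), then tries again.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            copy_once()
            return attempt
        except OSError:
            if attempt == max_attempts:
                raise  # out of retries: manual intervention is needed
            sleep(base_delay_s * 2 ** (attempt - 1))
    raise ValueError("max_attempts must be at least 1")


def make_flaky_copy(failures: int) -> Callable[[], None]:
    """Build a stand-in transfer that fails `failures` times, then succeeds."""
    remaining = [failures]

    def copy_once() -> None:
        if remaining[0] > 0:
            remaining[0] -= 1
            raise OSError("simulated network drop")

    return copy_once


# Record the waits instead of sleeping so the demo finishes instantly.
delays: List[float] = []
attempts_used = transfer_with_retry(make_flaky_copy(2), sleep=delays.append)
print(attempts_used, delays)
```

The `sleep` argument is injected so the waiting can be replaced in a demo or test. A hosted service adds much more on top of this idea, such as checkpointing and credential handling, which the sketch leaves out.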

Page 25: Mexico talk foster march 2012


Dark Energy Survey use of Globus Online

• Dark Energy Survey receives 100,000 files each night in Illinois

• They transmit files to Texas for analysis … then move results back to Illinois

• Process must be reliable, routine, and efficient

• They outsource this task to Globus Online

Image credit: Roger Smith/NOAO/AURA/NSF

Blanco 4m on Cerro Tololo


Page 28: Mexico talk foster march 2012


Integration with Earth System Grid

• High-speed transfers
• Automated retries
• Works behind firewalls
• Credential management
• Transfer monitoring


Page 31: Mexico talk foster march 2012


Globus Online under the covers

User Hub manages user identities and profiles
Group Hub manages groups and policies
Resource Hub for resource definitions

Monitoring and control
• Auto-tuning of transfer parameters
• Detection & attempted correction of errors
• Manual intervention when required

Reliable cloud-based infrastructure
• EC2 for transfer management
• S3 for system state
• SimpleDB for lock management
• Replication across availability zones


Page 33: Mexico talk foster march 2012


Towards “research IT as a service”

• Dark Energy Survey
• Galaxy genomics
• LIGO observatory
• SBGrid structural biology consortium
• NCAR climate data applications
• Land use change; economics

Page 34: Mexico talk foster march 2012


Towards “research IT as a service”

Page 35: Mexico talk foster march 2012


Globus Storage: For when you want to …

• Place your data where you want
• Access it from anywhere via different protocols
• Update it, version it, and take snapshots
• Share versions with who you want
• Synchronize among locations

[Diagram labels: Globus Storage volume; GridFTP, HTTP, WebDAV; commercial storage service provider; national research center; campus computing center]

Page 36: Mexico talk foster march 2012


Globus Collaborate: For when you want to

Join with a few or many people to:
• Share documents
• Track tasks
• Send email
• Share data
• Do whatever

With:
• Common groups
• Delegated mgmt

Page 37: Mexico talk foster march 2012


Globus Integrate: For when you want to

Write programs that access/manage user identities, profiles, groups, resources—and data …

… via REST APIs and command line programs

Globus Integrate
• Transfer API available
• User profile, group APIs in alpha
• APIs for Storage, Collaborate planned after app release

Globus Connect Multi User

Globus Connect

Globus Transfer
• In production use
• Service and Web UI enhancements continue

Globus Storage
• Early release available in March
• Generally available in Q3

Globus Collaborate
• Initial projects starting in March
• Early release sometime in Q3

Page 38: Mexico talk foster march 2012


Other innovative science SaaS projects


Page 42: Mexico talk foster march 2012


Realizing the benefits of cloud services

• Understand what services researchers really need

• Acquire and sustain the expertise required to create and operate useful services

• Incentivize those who produce services that are widely adopted

• Provide excellent network connectivity

Page 43: Mexico talk foster march 2012


On the importance of networks

“80 percent of success is showing up”


Page 47: Mexico talk foster march 2012


Time required to move 10 Terabytes

[Chart: hours to transfer 10 terabytes (log scale, 0.01 to 10,000) against network speed in megabits/sec (log scale, 10 to 1,000,000)]

Annotations on the chart:
• 2 hours: US R1 universities
• 10 mins: upgrade
• 1 month: Cinvestav Langebio
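
The annotated times on this chart follow from simple arithmetic, sketched below. The link speeds (10 Gb/s, 100 Gb/s, 30 Mb/s) are not stated on the slide; they are assumed values chosen because they roughly reproduce the annotated times. The sketch also assumes decimal terabytes and a fully utilized link.

```python
def hours_to_transfer(terabytes: float, megabits_per_sec: float) -> float:
    """Ideal transfer time in hours, ignoring protocol overhead and contention."""
    bits = terabytes * 1e12 * 8           # decimal terabytes to bits
    seconds = bits / (megabits_per_sec * 1e6)
    return seconds / 3600


# Assumed link speeds in megabits/sec (not given on the slide).
scenarios = [
    ("US R1 universities", 10_000),
    ("Upgrade", 100_000),
    ("Cinvestav Langebio", 30),
]

for label, mbps in scenarios:
    hours = hours_to_transfer(10, mbps)
    print(f"{label:20s} {mbps:>8,} Mb/s  {hours:10.2f} h  ({hours / 24:.2f} days)")
```

With these assumed speeds, 10 TB takes about 2.2 hours at 10 Gb/s, about 13 minutes at 100 Gb/s, and about 31 days at 30 Mb/s. Real transfers are slower, since links are rarely fully utilized end to end.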

Page 48: Mexico talk foster march 2012


A 21st C research cyberinfrastructure

[Diagram: many small and medium laboratories (L) and projects (P) drawing on shared “aaS” offerings]

Research data management
Collaboration, computation
Research administration

• To provide more capability for more people at less cost …
• Create cloud-based services
– Robust and universal
– Economies of scale
– Positive returns to scale
• Via the creative use of
– Aggregation (“cloud”)
– Federation (“grid”)
• Powered by networks

Page 49: Mexico talk foster march 2012


Questions for you

• How much “dark data” exists in your field? How important is that data?

• Can you quantify the scale, in your field, of
– Wasted resources due to duplicated effort
– Delays in research progress due to inadequate infrastructure?

• If you could do one thing to accelerate adoption of advanced computing within your field, what would it be?

Page 50: Mexico talk foster march 2012


Acknowledgments

Colleagues at UChicago and Argonne: Steve Tuecke, Ravi Madduri, Kyle Chard, Tanu Malik, Rachana Ananthakrishnan, Raj Kettimuthu, and others listed at www.globusonline.org/about/goteam/

NSF Office of Cyberinfrastructure
DOE Office of Advanced Scientific Computing Res.
National Institutes of Health

Page 51: Mexico talk foster march 2012


For more information

Attend GlobusWorld in Chicago, April 10-12, 2012

• www.globusonline.org
• Twitter: @globusonline, Globus Online on Facebook
• Foster, I. Globus Online: Accelerating and democratizing science through cloud-based services. IEEE Internet Computing (May/June):70-73, 2011.
• Allen, B., Bresnahan, J., Childers, L., Foster, I., Kandaswamy, G., Kettimuthu, R., Kordas, J., Link, M., Martin, S., Pickett, K. and Tuecke, S. Software as a Service for Data Scientists. Communications of the ACM, Feb, 2012.

Page 52: Mexico talk foster march 2012


Thank you!

[email protected]@anl.gov

www.globusonline.org
Twitter: @globusonline, @ianfoster