16
1 NYU Scientific Reproducibility Workshop, May 30, 2013 Olive One-click Execution of Internet-Archived Software Mahadev Satyanarayanan School of Computer Science Carnegie Mellon University Supported by:

Mahadev Satyanarayanan School of Computer Science Carnegie ... · NYU Scientific Reproducibility Workshop, May 30, 2013 8 VMs Can Relieve Pain VM library of pre-tested software combinations

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Mahadev Satyanarayanan School of Computer Science Carnegie ... · NYU Scientific Reproducibility Workshop, May 30, 2013 8 VMs Can Relieve Pain VM library of pre-tested software combinations

1NYU Scientific Reproducibility Workshop, May 30, 2013

OliveOne-click Execution of Internet-Archived Software

Mahadev SatyanarayananSchool of Computer ScienceCarnegie Mellon University

Supported by:

Page 2: Mahadev Satyanarayanan School of Computer Science Carnegie ... · NYU Scientific Reproducibility Workshop, May 30, 2013 8 VMs Can Relieve Pain VM library of pre-tested software combinations

2NYU Scientific Reproducibility Workshop, May 30, 2013

April 2011 Workshop at IBMComputer Scientists and Digital Library Community

Benjamin Gilbert, CMU

Catherine Zhang, IBM

Chris Shillum, Elsevier

Claire Cocco, OCLC

Clifford Lynch, CNI

Doug Freimuth, IBM

Erika Linke, CMU

Giovanni Pacifici, IBM

Glenn Ammons, IBM

Gloriana St Clair, CMU

Jeff Brody, IBM

Joan Smith, Emory

Kevin Guthrie, ITHAKA

M Satyanarayanan (Satya), CMU

Maria Ebling, IBM

Michael Witt, Purdue

Mike Handy, Library of Congress

Mike Lesk, Rutgers

Mladen Vouk, NCSU

Naomi Nelson, Duke

Patrick Drehrer, RENCI

Paul Jones, Ibiblio.org

Peng Ning, NCSU

Steve Griffin, Library of Congress

Tamar Eilam, IBM

Timothy Dilauro, Johns Hopkins

Vasanth Bala (Vas), IBM

Vijay Kumar, MIT

Page 3: Mahadev Satyanarayanan School of Computer Science Carnegie ... · NYU Scientific Reproducibility Workshop, May 30, 2013 8 VMs Can Relieve Pain VM library of pre-tested software combinations

3NYU Scientific Reproducibility Workshop, May 30, 20133

Archiving Static Content

Page 4: Mahadev Satyanarayanan School of Computer Science Carnegie ... · NYU Scientific Reproducibility Workshop, May 30, 2013 8 VMs Can Relieve Pain VM library of pre-tested software combinations

4NYU Scientific Reproducibility Workshop, May 30, 2013

What About Executable Content?

Application-specific contentGames

WordPerfect 1.0 docCan you read it today?100 years from now?

Original Wang docCan you read it today?100 years from now?

Simulation modelCan you re-run old

model with new data?

Page 5: Mahadev Satyanarayanan School of Computer Science Carnegie ... · NYU Scientific Reproducibility Workshop, May 30, 2013 8 VMs Can Relieve Pain VM library of pre-tested software combinations

5NYU Scientific Reproducibility Workshop, May 30, 2013

Execution Fidelity

Ability to precisely reproduce execution• many moving parts

hardware, operating system, dynamically linked libraries, configuration parameters,language settings, timezone settings,

• very difficult to achieve execution fidelity

Problem faced by every software vendor• large market configuration flexibility complexity• results in high time to value for customer

Exactly this problem is faced in archiving executable content• no help desk for 100-year old software!• how do we fake the existence of now-obsolete environments?

Page 6: Mahadev Satyanarayanan School of Computer Science Carnegie ... · NYU Scientific Reproducibility Workshop, May 30, 2013 8 VMs Can Relieve Pain VM library of pre-tested software combinations

6NYU Scientific Reproducibility Workshop, May 30, 2013

Computer Museum

For software: ??????

For hardware

30-year functional lifeHow to test patches?

Page 7: Mahadev Satyanarayanan School of Computer Science Carnegie ... · NYU Scientific Reproducibility Workshop, May 30, 2013 8 VMs Can Relieve Pain VM library of pre-tested software combinations

7NYU Scientific Reproducibility Workshop, May 30, 2013

Solution: Take It With YouTransform the problem into a scaling problem

• pack up and carry the entire environment with you(including the OS)

• transitive closure of everything you need

Central idea of a (hardware) virtual machine (VM)

Page 8: Mahadev Satyanarayanan School of Computer Science Carnegie ... · NYU Scientific Reproducibility Workshop, May 30, 2013 8 VMs Can Relieve Pain VM library of pre-tested software combinations

8NYU Scientific Reproducibility Workshop, May 30, 2013

VMs Can Relieve PainVM library of pre-tested software combinations

• use VMs as the basis of software distribution• outsource configuration pain to third parties• value to both software vendor and customer

User- and enterprise-specific data outside VM (e.g. dist. file sys.)

Broader value: archive executable content• reproducibility for scientific results• executable embodiment of archival results (e.g. SIGGRAPH)• escrow SaaS (software as a service) code as insurance• many others

VMs can do for software what VLSI did for hardware(hide complexity and increase robustness)

Page 9: Mahadev Satyanarayanan School of Computer Science Carnegie ... · NYU Scientific Reproducibility Workshop, May 30, 2013 8 VMs Can Relieve Pain VM library of pre-tested software combinations

9NYU Scientific Reproducibility Workshop, May 30, 2013

But VMs are Huge!

10 GB VM

• @ 100 Mbps at least 800 seconds (13 minutes) download

• @ 10 Mbps at least 8000 seconds (over two hours) download

No one will wait that long to look at something briefly!

How do we achieve quick launch?

Page 10: Mahadev Satyanarayanan School of Computer Science Carnegie ... · NYU Scientific Reproducibility Workshop, May 30, 2013 8 VMs Can Relieve Pain VM library of pre-tested software combinations

10NYU Scientific Reproducibility Workshop, May 30, 2013

Internet

Video Streaming

Page 11: Mahadev Satyanarayanan School of Computer Science Carnegie ... · NYU Scientific Reproducibility Workshop, May 30, 2013 8 VMs Can Relieve Pain VM library of pre-tested software combinations

11NYU Scientific Reproducibility Workshop, May 30, 2013

VM Streaming Not So Easy

Access to VM image is not linear

Reference pattern depends on many runtime factors

• data dependencies

• human interaction

• spatial and temporal locality (program behavior)

Borrow an old idea from operating systems• demand paging• intercept missing VM pieces and fetch over Internet• prefetching may speed up performance if predictions good

Page 12: Mahadev Satyanarayanan School of Computer Science Carnegie ... · NYU Scientific Reproducibility Workshop, May 30, 2013 8 VMs Can Relieve Pain VM library of pre-tested software combinations

12NYU Scientific Reproducibility Workshop, May 30, 2013

Transient PC(e.g. ISR)

VM-based Cloud Computing Today

better safety

VM Storage

better availability

cris

per i

nter

actio

nm

ore

com

ute

pow

er

VM E

xecu

tion

Classic PC model(e.g. laptops)

Remote execution(e.g. Grid Computing)

Managed execution(e.g. Amazon EC2)

cloud-cloud

cloud-locallocal-local

local-cloud

Local Cloud

Loca

lC

loud

Page 13: Mahadev Satyanarayanan School of Computer Science Carnegie ... · NYU Scientific Reproducibility Workshop, May 30, 2013 8 VMs Can Relieve Pain VM library of pre-tested software combinations

13NYU Scientific Reproducibility Workshop, May 30, 2013

Suspend

VM changes

Transient PC Model( AFS++)

Transient PC #1Transient PC #2

Resume

Transient PC #3

Resume

“Cyber Dishwasher”

CloudCloudCloud

Suspend

Page 14: Mahadev Satyanarayanan School of Computer Science Carnegie ... · NYU Scientific Reproducibility Workshop, May 30, 2013 8 VMs Can Relieve Pain VM library of pre-tested software combinations

14NYU Scientific Reproducibility Workshop, May 30, 2013

Olive VisionOpen archive of VM images accessible Internet-wide

Workshop clarified Olive concepts and value proposition

• clean separation of locations for VM image archiving instance creation

• ability to compose VMs (e.g. for licensing reasons)

Many Olive concepts explored and fleshed out

Page 15: Mahadev Satyanarayanan School of Computer Science Carnegie ... · NYU Scientific Reproducibility Workshop, May 30, 2013 8 VMs Can Relieve Pain VM library of pre-tested software combinations

15NYU Scientific Reproducibility Workshop, May 30, 2013

CloudOlive

EdgeOlive

GlobalOlive

Few instances1 to 10 worldwide

Librarian viewpoint• ~10 to 100 year timescale• no execution capability• archiving & preservation

concerns foremost• provenance tracking critical• rigorous licensing scrutiny

and enforcement• heavyweight publication process• not in mission-critical path

Numerous instances10 to 1000 worldwide

One or few perFortune 500 enterprise

Cloud operator viewpoint• ~1 to 5 year timescale• reliability and availability

concerns foremost• usage metering important• revocation of licenses possible• lightweight publication process

(within cloud and from edge to cloud)

few to manyrelationship

one to manyrelationship

Myriad instances102 to 106 per cloud

End user viewpoint• crisp interaction foremost• availability important• disconnected operation possible• “disposable” computing hardware

Page 16: Mahadev Satyanarayanan School of Computer Science Carnegie ... · NYU Scientific Reproducibility Workshop, May 30, 2013 8 VMs Can Relieve Pain VM library of pre-tested software combinations

16NYU Scientific Reproducibility Workshop, May 30, 2013

Olive Demo(work in progress)