Evolving inversion methods in Geophysics with Cloud Computing – a case study of an eScience collaboration

Mudge, Chandrasekhar, Heinson, Thiel

Prof J Craig Mudge FTSE
University of Adelaide, Australia
School of Computer Science / School of Earth Sciences

7th IEEE eScience Conference, Stockholm, December 2011



Two South Australian successes in geology

1. Hot rocks for geothermal energy: 95% of investment is in South Australia

2. Olympic Dam (BHP Billiton): world's fourth largest copper deposit, fifth largest gold deposit and the largest uranium deposit.


Outline

1. Cloud computing
2. Collaborative Cloud Computing Lab (C3L)
3. Inversion in magnetotelluric processing
4. Geothermal – EGS in South Australia
5. Results and Lessons learned
6. Future work


Cloud service provider owns and operates the infrastructure and innovates to keep technology leading edge, handle software upgrades, and steadily reduce energy costs

Google, The Dalles, Oregon; Microsoft Azure, Chicago

Air flow

Massive scale of data centres delivers 4 – 7X cost reduction and energy efficiency


A no-machines Lab

eScience enabled by cloud computing

Seed funding from
-- Department of Mines www.pir.sa.gov.au
-- MSFT Research Jim Gray Seed Grant

Started June 2010

Our three cloud service providers

1. Amazon Web Services
2. Microsoft Azure

Now adding government-funded eResearch clouds, which will run OpenStack (NASA and Rackspace)


Magnetotelluric (MT) imaging

1. Using the magnetic and electric fields of the earth, MT imaging determines the resistivity structure of a sub-surface area of interest.

2. It goes deeper (a hundred or so km) than seismic (<2 km) but does not have the same resolution.

3. Applications
   1. mineral exploration
   2. water management in mining
   3. geothermal exploration
   4. carbon storage
   5. aquifer research and management
   6. earthquake and volcano studies

CO2 in depleted gas field

(Heinson and Mudge, 2010)


Electrical resistivity

Electromagnetic methods


Data logging by University of Adelaide Geophysics, on a geothermal site – Paralana, SA, Australia


MT Processing steps


Inversion


[Flowchart of the inversion loop. Steps: start; compute model’s MT response; compare model response to observed data; compute sensitivity matrix; locally improve model misfit; locally improve model smoothness; finish. Yes/no decisions: required misfit? can locally improve misfit? smooth enough? can locally improve smoothness? > max iterations?]

Inversion iterations: compute model response, compare with observed data
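The loop in the flowchart can be illustrated with a minimal sketch. It is not the authors' MT inversion code: the forward operator (`forward`), the step size, the smoothness weight and the zero starting model are all assumptions chosen so that the example is small and runnable. It keeps only the overall shape of the loop: compute the model response, compare it with the observed data, use the sensitivities to improve misfit and smoothness, and stop at the required misfit or the iteration limit.

```python
import math


def forward(model):
    """Toy forward operator (an assumption): each datum averages two neighbouring cells."""
    return [0.5 * (model[i] + model[i + 1]) for i in range(len(model) - 1)]


def rms_misfit(model, observed):
    """Root-mean-square difference between the model response and the observed data."""
    response = forward(model)
    return math.sqrt(sum((r - o) ** 2 for r, o in zip(response, observed)) / len(observed))


def invert(observed, required_misfit=0.05, max_iterations=500, step=0.5, smooth_weight=0.01):
    """Update a model until the misfit target or the iteration cap is reached."""
    n_cells = len(observed) + 1
    model = [0.0] * n_cells  # hypothetical starting model
    iterations = 0
    while iterations < max_iterations:
        if rms_misfit(model, observed) <= required_misfit:
            break
        residual = [r - o for r, o in zip(forward(model), observed)]
        gradient = [0.0] * n_cells
        # Misfit term: transposed sensitivity matrix applied to the residual.
        # For this toy operator each datum has sensitivity 0.5 to its two cells.
        for i, r in enumerate(residual):
            gradient[i] += 0.5 * r
            gradient[i + 1] += 0.5 * r
        # Smoothness term: gradient of smooth_weight * sum of squared neighbour differences.
        for i in range(n_cells - 1):
            diff = model[i + 1] - model[i]
            gradient[i] -= 2.0 * smooth_weight * diff
            gradient[i + 1] += 2.0 * smooth_weight * diff
        model = [m - step * g for m, g in zip(model, gradient)]
        iterations += 1
    return model, rms_misfit(model, observed), iterations
```

Calling `invert(forward([1.0, 1.0, 2.0, 3.0, 3.0]))` returns the recovered model, its final misfit and the number of iterations used. A real MT inversion replaces `forward` with an electromagnetic solver, which is where most of the compute time goes.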

Searching the solution space


Setting up a new inversion – part 1


Setting up a new inversion – part 2


Dashboard


Results and Lessons learned


Speedup

[Figure: sequential vs. parallel]


Performance analysis beyond speedup

[Figure: sequential vs. parallel]

Examples of recent performance analysis

1. The effect of the FORTRAN compiler with different optimisations has been worth exploring: a factor of 3X speedup from the Intel Visual Fortran Composer XE 2011 for Windows.

2. “Steal time”: time lost due to the hypervisor’s management of a virtual machine. Netflix have analysed their Amazon experience extensively.
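As an illustration of how steal time can be measured, the sketch below computes the stolen fraction of CPU time between two samples of the aggregate `cpu` line in `/proc/stat`. It assumes a Linux guest, which the slides do not state, and the function name `steal_fraction` is hypothetical. On Linux the eighth counter on that line is steal time.

```python
def steal_fraction(cpu_line_before, cpu_line_after):
    """Fraction of CPU time stolen by the hypervisor between two /proc/stat samples."""

    def counters(line):
        parts = line.split()
        if not parts or parts[0] != "cpu":
            raise ValueError("expected the aggregate 'cpu' line from /proc/stat")
        return [int(value) for value in parts[1:]]

    before = counters(cpu_line_before)
    after = counters(cpu_line_after)
    deltas = [a - b for a, b in zip(after, before)]
    # Counter order: user nice system idle iowait irq softirq steal [guest guest_nice].
    # Guest time is already counted inside user and nice, so only the first
    # eight counters are summed to get the total elapsed CPU time.
    total = sum(deltas[:8])
    return deltas[7] / total if total else 0.0
```

On a Linux virtual machine the two input lines would come from reading the first line of `/proc/stat` at the start and end of a run. A persistently high fraction suggests the job is competing with other tenants for the physical CPU.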


Results and learnings

1. “No-machines” works
2. Speedup has led to 100% adoption in MT research
3. First results of monitoring fluid injection in EGS reservoirs using magnetotellurics (MT): promising, since seismic does not indicate fluid flow and MT is low cost
4. Taking chunks of FORTRAN is achievable in a timely manner
5. Capability building: a true eScience partnership
6. Our Web Services user interactions took the same amount of programming effort as parallelising



(Mudge, 2002)

(Mudge, 2002)


eScience in the cloud: observations of a veteran of the computer industry (but not my co-authors in this eScience paper)

1. Web Services (giving interoperability between disparate services of historic proportion) could have been adopted faster in eScience

2. Cloud computing will speed up the use of web services, because cloud makes it natural to interact using web services (service orientation, discovery, interoperability)


Lessons learned – HPC programming

1. MapReduce (Hadoop) is the programming model that best matches the data centre as the computer. However, because it requires a rewrite of existing programs, the first wave of benefits comes from simpler parallelism: parameter sweeps, Monte Carlo simulation, job-level parallelism, etc.

2. The second wave of benefits will be new algorithms and rewrites using MapReduce.

3. Nevertheless, the first wave in cloud-based bioinformatics (matching short reads against a reference genome) did use MapReduce.
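The "simpler parallelism" of a parameter sweep can be sketched in a few lines. This is an assumption-laden illustration rather than the C3L implementation: `run_job` is a hypothetical stand-in for one independent inversion run and returns a made-up score, and a thread pool stands in for what would be separate processes or virtual machines in the cloud.

```python
from concurrent.futures import ThreadPoolExecutor


def run_job(smoothing):
    """Hypothetical stand-in for one independent run; returns a made-up score."""
    return (smoothing - 0.3) ** 2


def parameter_sweep(values, max_workers=4):
    """Run one job per parameter value and collect the results keyed by value."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves input order, so results line up with values.
        results = list(pool.map(run_job, values))
    return dict(zip(values, results))


def best_parameter(sweep_results):
    """Return the parameter value with the lowest score."""
    return min(sweep_results, key=sweep_results.get)
```

Because the jobs share no state, the existing program does not need to be rewritten; only the code that launches it changes. That is why this style of parallelism is available before any MapReduce rewrite.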


Lessons learned - Azure

1. Why was Azure much harder to migrate to than predicted? Answer:
- We came from a non-.NET environment
- Azure is younger than Amazon (2 years)
- Virtual Machine in Beta
- Deployment times of 20 minutes vs 20 seconds slow debugging
- Azure designed for long-running applications, e.g., ecommerce, more than for scientific

2. However, we persist.
- Warehouse-sized data centre: operating system is robust and rich, e.g., hot swap of patches
- Benefits of ...


Future work


Future work 1 of 2

1. Inversion on demand, available to colleagues and explorers world-wide, wrapped in workflow (persistence, provenance, partial runs, ...)

2. National/international collaboration building on a national Geophysics Virtual Lab

- access to disparate data (seismic, borehole images, gravity, magnetic, ...) built by AuScope using results of the GeoSciML Interoperability Working Group


Integrated Virtual Labs

[Diagram. Societal need: Sustainable Energy Policy. Integrated virtual laboratories: Energy Exploration Integrated Virtual Laboratory, Environment Virtual Laboratory. Virtual laboratories: Virtual Geophysical Laboratory, National Borehole Laboratory, Virtual Geodesy Laboratory, Virtual Earth Observation Laboratory, Virtual Oceans Laboratory. Virtual libraries: Geophysics, Borehole, Geodesy, Land cover, Marine, each with processing services, data and middleware. Modelling & analytic tools.]

Dr Robert Woodcock and Dr Lesley ...


Future work 2 of 2

3. Explore statistical machine learning to detect interesting patterns

4. Exploring solution space using Evolutionary Algorithms implemented on thousands of processors in the cloud (Brad Alexander)

5. Promulgate security best practices

6. Following the success of speedup, model size has become the limiter for our geophysicists
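To make item 4 concrete, here is a minimal sketch of an evolutionary search over a solution space. It is a generic elitist evolution strategy, not the algorithm planned by the authors. The population size, mutation width `sigma`, search bounds and seed are arbitrary assumptions, and `fitness` is any function that scores a candidate model (lower is better).

```python
import random


def evolve(fitness, n_params, generations=200, population_size=20, sigma=0.3, seed=0):
    """Minimise `fitness` with a simple elitist (mu + lambda) evolution strategy."""
    rng = random.Random(seed)
    population = [
        [rng.uniform(-5.0, 5.0) for _ in range(n_params)]
        for _ in range(population_size)
    ]
    for _ in range(generations):
        # Each parent produces one child by Gaussian mutation of every parameter.
        children = [[gene + rng.gauss(0.0, sigma) for gene in parent] for parent in population]
        # Keep the best individuals from parents and children combined, so the
        # best score never gets worse from one generation to the next.
        population = sorted(population + children, key=fitness)[:population_size]
    best = population[0]
    return best, fitness(best)
```

Each fitness evaluation is independent of the others, which is what makes this kind of search a natural fit for thousands of cloud processors: in an inversion setting every evaluation would be one forward-model run.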


Acknowledgements

Brad Alexander
Gordon Bell
Pinaki Chandrasekhar
Dennis Gannon
Graham Heinson
Tony Hey
Ed Lazowska
Stephan Thiel

Summary

1. Cloud computing
2. Collaborative Cloud Computing Lab (C3L)
3. Inversion in magnetotelluric processing
4. Geothermal – EGS in South Australia
5. Lessons learned
6. Future work


Thanks and questions


www.cloudinnovation.com.au

+61 417 679 266
+1 650 224 2111


Security best practices

1. Certifications
2. Physical security
3. Secure services
4. Data privacy via encryption
5. Backups
6. Constant monitoring
7. External review
8. Compare yours with Google, Amazon, Azure