
4DVAR Optimization & Use-cases for Deep Learning in Earth Sciences

BoM R&D Workshop, 9th December 2016

Dr. Phil Brown, Earth Sciences Segment Leader

Topics

● Optimization of UM 4DVAR (courtesy of Lucian Anton)

● Use-Cases for Deep Learning in Earth Sciences


Topics

● Future Technology Trends

● Challenges & Opportunities for Data Assimilation

● Use-Cases for Deep Learning in Earth Sciences


Historical Performance Trends


[Chart: WRF relative performance across Intel Xeon generations (Nehalem-EP, Westmere-EP, Sandy Bridge-EP, Ivy Bridge-EP, Haswell-EP, Broadwell-EP); y-axis: relative performance, 0 to 4. WRF data from SPEC-FP-2006-rate: https://www.spec.org/cgi-bin/osgresults?conf=rfp2006]

FLOPs aren’t the bottleneck!


[Chart: WRF Performance vs. Peak FLOPS (relative) across the same Xeon generations, Nehalem-EP through Broadwell-EP; y-axis: relative performance, 0 to 18. WRF data from SPEC-FP-2006-rate: https://www.spec.org/cgi-bin/osgresults?conf=rfp2006]

Memory Bandwidth & Serial Performance


[Chart: WRF Performance vs. Memory Bandwidth vs. Serial Performance (relative) across the same Xeon generations; y-axis: relative performance, 0 to 4.5. WRF data from SPEC-FP-2006-rate: https://www.spec.org/cgi-bin/osgresults?conf=rfp2006]
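A back-of-envelope way to see why bandwidth rather than peak FLOPS governs codes like WRF is to compare a kernel's arithmetic intensity with the machine balance. The sketch below uses purely illustrative, assumed numbers (flop and byte counts per grid point, peak rates), not measurements of WRF or any specific CPU.

# Roofline-style estimate with illustrative, assumed numbers.
flops_per_point = 30            # assumed floating-point ops per stencil update
bytes_per_point = 8 * 10        # assumed ~10 double-precision loads/stores per point
arithmetic_intensity = flops_per_point / bytes_per_point    # FLOPs per byte

peak_flops = 1.0e12             # assumed 1 TFLOP/s peak per socket
mem_bandwidth = 75.0e9          # assumed 75 GB/s sustained memory bandwidth

# Attainable performance is capped by whichever roof is lower.
attainable = min(peak_flops, arithmetic_intensity * mem_bandwidth)
print("Arithmetic intensity: %.2f FLOP/byte" % arithmetic_intensity)
print("Attainable: %.0f of %.0f GFLOP/s peak" % (attainable / 1e9, peak_flops / 1e9))

With these assumed numbers the kernel can only use a few percent of peak FLOPS, consistent with delivered WRF performance tracking memory bandwidth rather than the rapidly growing peak-FLOPS curve.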

Exascale Computing Memory Trends


[Diagram: memory/storage hierarchy, today vs. future]

● Today: CPU and Memory (DRAM) on node; Storage (HDD) off node

● Future: CPU, Near Memory (HBM/HMC) and Far Memory (DRAM/NVDIMM) on node; Near Storage (SSD) and Far Storage (HDD) off node


Solid-state “Near” Storage

● SSDs enable very high-bandwidth storage close to compute: 1 TB/s per PB of capacity

● Configured as a "consumable" resource

● Use-cases: shared scratch, workflows, and checkpoint-restart (a minimal checkpoint sketch follows below)
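As a minimal sketch of the checkpoint-restart use-case: periodically dump model state to an SSD-backed scratch path and reload it on restart. The directory name and state layout below are purely hypothetical; a real workflow would target the site's burst-buffer or scratch mount.

import os
import numpy as np

# Hypothetical directory standing in for an SSD-backed scratch tier.
SCRATCH = "ssd_scratch/run_001"
os.makedirs(SCRATCH, exist_ok=True)

def checkpoint(step, state):
    """Write the model state for this step to near storage."""
    np.save(os.path.join(SCRATCH, "ckpt_%06d.npy" % step), state)

def restart(step):
    """Reload a previously written checkpoint."""
    return np.load(os.path.join(SCRATCH, "ckpt_%06d.npy" % step))

state = np.zeros((256, 256, 70))          # illustrative 3-D model state
checkpoint(100, state)                    # periodic dump during the run
state = restart(100)                      # resume from the last checkpoint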



Non-volatile Memory

● New non-volatile memories are on the horizon: 3D XPoint, RRAM, etc.

● Block- and/or byte-addressable; somewhat slower than DRAM

● Opportunity for multi-TB node memory? A really compelling use-case is needed to justify it on every node

● Software layers/interfaces are still unclear/in development:

● User-controlled, either as memory or as storage (a memory-mapping sketch follows below)

● Memory expansion (fronted by a RAM cache)

● Distributed resilient SAN/filesystems?
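One plausible user-controlled pattern, if the non-volatile memory is exposed through a filesystem, is to memory-map large, read-mostly arrays onto it rather than holding them in DRAM. A minimal sketch, with a hypothetical directory standing in for an NVM-backed mount:

import os
import numpy as np

# Hypothetical path standing in for a file on an NVM/NVDIMM-backed filesystem.
os.makedirs("nvm_pool", exist_ok=True)
NVM_PATH = "nvm_pool/background_state.dat"

# Create a large array whose pages live on the (assumed) NVM device, not DRAM.
shape = (256, 256, 70)                    # illustrative grid dimensions
state = np.memmap(NVM_PATH, dtype=np.float64, mode="w+", shape=shape)

state[:] = 0.0        # touch/initialise the data
state.flush()         # persist changes explicitly

# A later process can re-open the same data read-only without reloading it.
reread = np.memmap(NVM_PATH, dtype=np.float64, mode="r", shape=shape)
print(reread[0, 0, :5])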



Next-Generation Memory Technologies


● Benefits:

● Higher Memory Bandwidth

● Lower Power Consumption per GB/s

● Higher density?

● Downsides:

● Lower Primary Memory Capacity

● More Complicated Memory Hierarchy?

Sources:
http://www.amd.com/en-us/innovations/software-technologies/hbm
https://software.intel.com/en-us/articles/what-disclosures-has-intel-made-about-knights-landing
https://devblogs.nvidia.com/parallelforall/nvlink-pascal-stacked-memory-feeding-appetite-big-data/

Implications for Data Assimilation

● Parallelism is here to stay

● Bad news for classic 4DVAR?

● Low-resolution linear/adjoint models offer limited parallelism (cost function sketched below)

● EnVAR should help

● Use-cases for large non-volatile memories?

● Primary memory may get smaller but much faster
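For reference, the cost function behind the "classic 4DVAR" bullets, written in the standard incremental form (textbook notation, not a UM- or BoM-specific formulation):

J(\delta\mathbf{x}_0) = \frac{1}{2}\,\delta\mathbf{x}_0^{\mathrm{T}}\mathbf{B}^{-1}\delta\mathbf{x}_0
  + \frac{1}{2}\sum_{i=0}^{N}\left(\mathbf{H}_i\mathbf{M}_{0\to i}\,\delta\mathbf{x}_0-\mathbf{d}_i\right)^{\mathrm{T}}
    \mathbf{R}_i^{-1}\left(\mathbf{H}_i\mathbf{M}_{0\to i}\,\delta\mathbf{x}_0-\mathbf{d}_i\right),
\qquad \mathbf{d}_i = \mathbf{y}_i - \mathcal{H}_i\!\left(\mathcal{M}_{0\to i}(\mathbf{x}^{b}_0)\right)

Here M_{0->i} and H_i are the low-resolution tangent-linear model and linearised observation operators, B and R_i are background- and observation-error covariances, and d_i are the innovations. The inner-loop minimisation repeatedly integrates M and its adjoint sequentially in time, which limits the parallelism available; ensemble/hybrid (EnVAR) approaches replace the linear model with pre-computed ensemble trajectories, which maps more naturally onto highly parallel machines.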


What is Machine/Deep Learning?

● Deep Learning is used to describe a family of algorithms based on multi-level neural networks (a toy sketch follows this list):

● Deep Neural Networks

● Convolutional Neural Networks

● Recurrent Neural Networks

● Lots more!

● Key enabler has been access to compute resources

● DL is predominantly FLOP-bound

● Large scale problems rapidly becoming “HPC”-class

● Delivering “state of the art” results in computer vision, speech recognition, natural language processing etc.
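A toy illustration of what a multi-level ("deep") network is, and why it is FLOP-bound: each layer is a dense matrix multiplication followed by a non-linearity. This is a random-weight NumPy sketch, not a trained model and not any particular framework's API.

import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

# Toy fully-connected network: 100 inputs -> 64 -> 64 -> 10 outputs.
layer_sizes = [100, 64, 64, 10]
weights = [rng.standard_normal((m, n)) * 0.1
           for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [np.zeros(n) for n in layer_sizes[1:]]

def forward(x):
    """One forward pass: repeated matmul + bias + non-linearity."""
    for W, b in zip(weights[:-1], biases[:-1]):
        x = relu(x @ W + b)
    return x @ weights[-1] + biases[-1]      # linear output layer

batch = rng.standard_normal((32, 100))       # a batch of 32 example inputs
print(forward(batch).shape)                  # -> (32, 10)

Almost all of the arithmetic sits in the dense matrix multiplications, which is why training and inference map so well onto FLOP-rich accelerators.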


Opportunities for Machine/Deep Learning in Weather/Climate

● Almost the opposite of a physics/dynamics-based model

● Arduous to train, but comparatively quick to run

● Data producer vs data consumer

● Use-cases will be complementary?

● Some ideas (one sketched below):

● Rapid classifiers for radar/observations

● Optimal observation selection

● Alternative approaches for parameterization

● Pattern recognition in model outputs

● Infilling/smoothing model outputs
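As one hedged illustration of the "alternative approaches for parameterization" idea: train a cheap regressor offline on input/output pairs from an existing (expensive) scheme, then use the fit as a fast emulator. Everything below is illustrative, with synthetic data standing in for real scheme inputs/outputs and scikit-learn chosen only for brevity.

import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(42)

# Synthetic stand-in for a parameterization: inputs are column profiles
# (e.g. temperature/humidity), outputs are tendencies. Real training data
# would come from running the existing scheme offline.
n_samples, n_inputs, n_outputs = 5000, 20, 5
X = rng.standard_normal((n_samples, n_inputs))
Y = np.tanh(X[:, :n_outputs]) + 0.05 * rng.standard_normal((n_samples, n_outputs))

# Fit the emulator on most of the data, hold some back to check it.
split = 4000
emulator = RandomForestRegressor(n_estimators=100, random_state=0)
emulator.fit(X[:split], Y[:split])

# Cheap to evaluate at run time compared with the original scheme.
pred = emulator.predict(X[split:])
rmse = np.sqrt(np.mean((pred - Y[split:]) ** 2))
print("Hold-out RMSE: %.3f" % rmse)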


Thank you for your attention


Legal Disclaimer

Information in this document is provided in connection with Cray Inc. products. No license, express or implied, to any intellectual property rights is granted by this document.

Cray Inc. may make changes to specifications and product descriptions at any time, without notice.

All products, dates and figures specified are preliminary based on current expectations, and are subject to change without notice.

Cray hardware and software products may contain design defects or errors known as errata, which may cause the product to deviate from published specifications. Current characterized errata are available on request.

Cray uses codenames internally to identify products that are in development and not yet publicly announced for release. Customers and other third parties are not authorized by Cray Inc. to use codenames in advertising, promotion or marketing and any use of Cray Inc. internal codenames is at the sole risk of the user.

Performance tests and ratings are measured using specific systems and/or components and reflect the approximate performance of Cray Inc. products as measured by those tests. Any difference in system hardware or software design or configuration may affect actual performance.

The following are trademarks of Cray Inc. and are registered in the United States and other countries: CRAY and design, SONEXION, URIKA and YARCDATA. The following are trademarks of Cray Inc.: ACE, APPRENTICE2, CHAPEL, CLUSTER CONNECT, CRAYPAT, CRAYPORT, ECOPHLEX, LIBSCI, NODEKARE, THREADSTORM. The following system family marks are also trademarks of Cray Inc.: CS, CX, XC, XE, XK, XMT and XT. The registered trademark LINUX is used pursuant to a sublicense from LMI, the exclusive licensee of Linus Torvalds, owner of the mark on a worldwide basis.

Other names and brands may be claimed as the property of others. Other product and service names mentioned herein are the trademarks of their respective owners.

Copyright 2016 Cray Inc.
