Introduction The Pipeline PyWPS Conclusion
Orchestrating a Climate Modeling Data Pipelineusing Python and RAM Disk
A Brief Overview of the WRF Tools Package
Andre R. Erler
PyCon Canada
November 7th, 2015
Andre R. Erler ([email protected]) Orchestrating a Climate Modeling Pipeline
Introduction The Pipeline PyWPS Conclusion
Outline
Introduction: Climate ModelingRegional Models
The Pre-processing PipelineThe WRF Tools Package
Using Python to Drive the PipelineThe Tool ChainThe Class Structure
Concluding Remarks
Andre R. Erler ([email protected]) Orchestrating a Climate Modeling Pipeline
Introduction The Pipeline PyWPS Conclusion Global Models Regional Models
Global Climate Models
IPCC AR4 (2007) projectionsfor global surface temperatureunder different scenarios.
Global Climate Modelsare the main tool to pre-dict climate change.
Climate models compute energy, mass, and momentumfluxes on a relatively coarse computational grid.
Schematic of a Global Climate Model (GCM):
Center for Multiscale Modeling of Atmospheric Processes, CSU
Andre R. Erler ([email protected]) Orchestrating a Climate Modeling Pipeline
Introduction The Pipeline PyWPS Conclusion Global Models Regional Models
Regional Climate ModelsGCM resolution is coarse and manyregional details are not resolved (e.g.the Rocky Mountains and the GreatLakes).
Giorgi (2006)
Regional ImpactsRegional impacts of ClimateChange are modeled withhigh-resolution regional climatemodels (RCM).
Regional models simulate a smallarea at much higher resolution(×10).
Andre R. Erler ([email protected]) Orchestrating a Climate Modeling Pipeline
Introduction The Pipeline PyWPS Conclusion Global Models Regional Models
Regional Climate ModelsGCM resolution is coarse and manyregional details are not resolved (e.g.the Rocky Mountains and the GreatLakes).
Giorgi (2006)
Regional ImpactsRegional impacts of ClimateChange are modeled withhigh-resolution regional climatemodels (RCM).
Regional models simulate a smallarea at much higher resolution(×10).
Andre R. Erler ([email protected]) Orchestrating a Climate Modeling Pipeline
Introduction The Pipeline PyWPS Conclusion Global Models Regional Models
Global Model: CESM
The Community Earth SystemModel is used as driving model.
Average Surface Temperature andOutline of the WRF Domain
Topography of Western Canada.Left: CESM at ∼80 kmRight: WRF at 10 km
Regional Model: WRF
The Weather Research and Forecastmodel is our regional model.
Andre R. Erler ([email protected]) Orchestrating a Climate Modeling Pipeline
Introduction The Pipeline PyWPS Conclusion Global Models Regional Models
WRF is actuallypronounced “Worf”,like Lt. Worf inStar Trek: The NextGeneration (left)
The WRF model is alimited-area numericalweather prediction modeldeveloped by theNational Center forAtmospheric Research
Andre R. Erler ([email protected]) Orchestrating a Climate Modeling Pipeline
Introduction The Pipeline PyWPS Conclusion The WRF Tools Package
Running the RegionalClimate Model (WRF)I The coupling process
between GCM and RCM is“off-line” (asynchronous)
I A pre-processing systemconverts GCM output intoRCM (wrf-)input files
I A RCM simulation is splitinto ∼ 200 separate jobs
I The RCM runs continuously,each job submitting the next
The WRF Tools Package
PythonI Run pre-processing
tool chain (WPS)I Initialize WRF jobsI Run post-processing
Shell ScriptI Submit pre-processingI Run the WRF job,
submit next jobI Archiving to tape
WRF Tools enables continuousand autonomous operation
Andre R. Erler ([email protected]) Orchestrating a Climate Modeling Pipeline
Introduction The Pipeline PyWPS Conclusion The WRF Tools Package
Running the RegionalClimate Model (WRF)I The coupling process
between GCM and RCM is“off-line” (asynchronous)
I A pre-processing systemconverts GCM output intoRCM (wrf-)input files
I A RCM simulation is splitinto ∼ 200 separate jobs
I The RCM runs continuously,each job submitting the next
The WRF Tools Package
PythonI Run pre-processing
tool chain (WPS)I Initialize WRF jobsI Run post-processing
Shell ScriptI Submit pre-processingI Run the WRF job,
submit next jobI Archiving to tape
WRF Tools enables continuousand autonomous operation
Andre R. Erler ([email protected]) Orchestrating a Climate Modeling Pipeline
Introduction The Pipeline PyWPS Conclusion The WRF Tools Package
Running the RegionalClimate Model (WRF)I The coupling process
between GCM and RCM is“off-line” (asynchronous)
I A pre-processing systemconverts GCM output intoRCM (wrf-)input files
I A RCM simulation is splitinto ∼ 200 separate jobs
I The RCM runs continuously,each job submitting the next
The WRF Tools Package
PythonI Run pre-processing
tool chain (WPS)I Initialize WRF jobsI Run post-processing
Shell ScriptI Submit pre-processingI Run the WRF job,
submit next jobI Archiving to tape
WRF Tools enables continuousand autonomous operation
Andre R. Erler ([email protected]) Orchestrating a Climate Modeling Pipeline
Introduction The Pipeline PyWPS Conclusion The WRF Tools Package
Running the RegionalClimate Model (WRF)I The coupling process
between GCM and RCM is“off-line” (asynchronous)
I A pre-processing systemconverts GCM output intoRCM (wrf-)input files
I A RCM simulation is splitinto ∼ 200 separate jobs
I The RCM runs continuously,each job submitting the next
The WRF Tools Package
PythonI Run pre-processing
tool chain (WPS)I Initialize WRF jobsI Run post-processing
Shell ScriptI Submit pre-processingI Run the WRF job,
submit next jobI Archiving to tape
WRF Tools enables continuousand autonomous operation
Andre R. Erler ([email protected]) Orchestrating a Climate Modeling Pipeline
Introduction The Pipeline PyWPS Conclusion The WRF Tools Package
The Data Pipeline
Global Model (CESM)asynchronous/archived output
Long-termArchive,Post-processing,Analysis
WPS
offline
WRFwrfinput wrfout
WPS WRFwrfinput
wrfoutlaunch
WPS WRFwrfinput
wrfout
launch
geogrid
staticdata
Andre R. Erler ([email protected]) Orchestrating a Climate Modeling Pipeline
Introduction The Pipeline PyWPS Conclusion The WRF Tools Package
The Data Pipeline
Global Model (CESM)asynchronous/archived output
Long-termArchive,Post-processing,Analysis
WPS
offline
WRFwrfinput wrfout
WPS WRFwrfinput
wrfoutlaunch
WPS WRFwrfinput
wrfout
launch
geogrid
staticdata
Andre R. Erler ([email protected]) Orchestrating a Climate Modeling Pipeline
Introduction The Pipeline PyWPS Conclusion The WRF Tools Package
The Data Pipeline
Global Model (CESM)asynchronous/archived output
Long-termArchive,Post-processing,Analysis
WPS
offline
WRFwrfinput wrfout
WPS WRFwrfinput
wrfoutlaunch
WPS WRFwrfinput
wrfout
launch
geogrid
staticdata
Andre R. Erler ([email protected]) Orchestrating a Climate Modeling Pipeline
Introduction The Pipeline PyWPS Conclusion The WRF Tools Package
The Data Pipeline
Global Model (CESM)asynchronous/archived output
Long-termArchive,Post-processing,Analysis
WPS
offline
WRFwrfinput wrfout
WPS WRFwrfinput
wrfoutlaunch
WPS WRFwrfinput
wrfout
launch
geogrid
staticdata
Andre R. Erler ([email protected]) Orchestrating a Climate Modeling Pipeline
Introduction The Pipeline PyWPS Conclusion The WRF Tools Package
WPS: A Collection of Fortran Legacy Tools
WPS Components
1. geogrid.exestatic / geographic data
2. ungrib.exe /unccsm.execonvert driving data toWRF IM Format
3. metgrid.exeinterpolate to WRF grid
4. real.exegenerate boundarycondition files
Fortran legacy tools readfrom and write to temporaryfiles:
I Strongly I/O limited in aHPC cluster environment
The Solution (on Linux)
Run on RAM-disk!I speedup ∼ ×10I requires 64 GB RAM
Using Python driver script
Andre R. Erler ([email protected]) Orchestrating a Climate Modeling Pipeline
Introduction The Pipeline PyWPS Conclusion The WRF Tools Package
WPS: A Collection of Fortran Legacy Tools
WPS Components
1. geogrid.exestatic / geographic data
2. ungrib.exe /unccsm.execonvert driving data toWRF IM Format
3. metgrid.exeinterpolate to WRF grid
4. real.exegenerate boundarycondition files
Fortran legacy tools readfrom and write to temporaryfiles:
I Strongly I/O limited in aHPC cluster environment
The Solution (on Linux)
Run on RAM-disk!I speedup ∼ ×10I requires 64 GB RAM
Using Python driver script
Andre R. Erler ([email protected]) Orchestrating a Climate Modeling Pipeline
Introduction The Pipeline PyWPS Conclusion Overview The Tool Chain The Class Structure
PyWPS: A Driver Module for WPS
I Collect required input data fromGCM archive
I Run applicable pre-processing toolson RAM disk
I Assemble WRF input files
Why Python?I Easier with complex logicI Classes for different
datasets/GCMs
PyWPS Imports
I multiprocessing forparallelization
I re to find input filesI fileinput, sys to edit
configurations filesI subprocess to launch
Fortran toolsI shutils, os to handle
temporary files
Andre R. Erler ([email protected]) Orchestrating a Climate Modeling Pipeline
Introduction The Pipeline PyWPS Conclusion Overview The Tool Chain The Class Structure
PyWPS: A Driver Module for WPS
I Collect required input data fromGCM archive
I Run applicable pre-processing toolson RAM disk
I Assemble WRF input files
Why Python?I Easier with complex logicI Classes for different
datasets/GCMs
PyWPS Imports
I multiprocessing forparallelization
I re to find input filesI fileinput, sys to edit
configurations filesI subprocess to launch
Fortran toolsI shutils, os to handle
temporary files
Andre R. Erler ([email protected]) Orchestrating a Climate Modeling Pipeline
Introduction The Pipeline PyWPS Conclusion Overview The Tool Chain The Class Structure
PyWPS: The Program Flow & Parallelization
All CESMOutput files:...
Selected CESMOutput files:only current job
core 0
core 1
core 2
core 3
Select Files
WRF inputfiles
core 0
core 1
core 2
core 3
WPS Tool Chain
The WPS Tool Chain:
CESMoutput
WRF inter-mediate file
ungrib.exe metgridfile
metgrid.exewrfinput
real.exe
Andre R. Erler ([email protected]) Orchestrating a Climate Modeling Pipeline
Introduction The Pipeline PyWPS Conclusion Overview The Tool Chain The Class Structure
PyWPS: The Program Flow & Parallelization
All CESMOutput files:...
Selected CESMOutput files:only current job
core 0
core 1
core 2
core 3
Select Files
WRF inputfiles
core 0
core 1
core 2
core 3
WPS Tool Chain
The WPS Tool Chain:
CESMoutput
WRF inter-mediate file
ungrib.exe metgridfile
metgrid.exewrfinput
real.exe
Andre R. Erler ([email protected]) Orchestrating a Climate Modeling Pipeline
Introduction The Pipeline PyWPS Conclusion Overview The Tool Chain The Class Structure
PyWPS: The Program Flow & Parallelization
All CESMOutput files:...
Selected CESMOutput files:only current job
core 0
core 1
core 2
core 3
Select Files
WRF inputfiles
core 0
core 1
core 2
core 3
WPS Tool Chain
The WPS Tool Chain:
CESMoutput
WRF inter-mediate file
ungrib.exe metgridfile
metgrid.exewrfinput
real.exe
Andre R. Erler ([email protected]) Orchestrating a Climate Modeling Pipeline
Introduction The Pipeline PyWPS Conclusion Overview The Tool Chain The Class Structure
The Class Structure
Dataset/GCM specificparameters:
I Input file types/namesI Interpolation tables/gridI Variables / frequency
Multiple DatasetsI Inheritance for common
proceduresI Polymorphism for
different procedures
class Dataset(object):prefix = ’’ # file prefixvtable = ’Vtable’gribname = ’GRIBFILE’ # inputungrib_exe = ’ungrib.exe’ungrib_log = ’ungrib.exe.log’...def __init__(self, ...):
# type checking...
def setup(self, src, ...):...
def cleanup(self, tgt):...
def extractDate(self, fname):# match valid filenames...
def ungrib(self, date, mytag):# generate file for metgrid...
Andre R. Erler ([email protected]) Orchestrating a Climate Modeling Pipeline
Introduction The Pipeline PyWPS Conclusion Overview The Tool Chain The Class Structure
The Class Structure
Dataset/GCM specificparameters:
I Input file types/namesI Interpolation tables/gridI Variables / frequency
Multiple DatasetsI Inheritance for common
proceduresI Polymorphism for
different procedures
class Dataset(object):prefix = ’’ # file prefixvtable = ’Vtable’gribname = ’GRIBFILE’ # inputungrib_exe = ’ungrib.exe’ungrib_log = ’ungrib.exe.log’...def __init__(self, ...):
# type checking...
def setup(self, src, ...):...
def cleanup(self, tgt):...
def extractDate(self, fname):# match valid filenames...
def ungrib(self, date, mytag):# generate file for metgrid...
Andre R. Erler ([email protected]) Orchestrating a Climate Modeling Pipeline
Introduction The Pipeline PyWPS Conclusion
Summary & Conclusion
PythonI Use Python for flow control (manage legacy tools)
I Parallelization relatively easy (within one node)
I Class structure is versatile and makes maintenance easier
RAM-diskI Scientific Programming: dealing with legacy tools
Often in Fortran, often relying on disk I/O
I Use RAM-disk to avoid unnecessary disk I/O
Andre R. Erler ([email protected]) Orchestrating a Climate Modeling Pipeline
Introduction The Pipeline PyWPS Conclusion
Thank You! ∼ Questions?
Andre R. Erler ([email protected]) Orchestrating a Climate Modeling Pipeline
Introduction The Pipeline PyWPS Conclusion
List of Publications using WRF Tools
I Erler, Andre R., W. Richard Peltier, Marc d’Orgeville (under review), ProjectedChanges in Hydro-Climatic Extremes for Western Canada, Journal of Climate.
I Marc d’Orgeville, W. Richard Peltier, Andre R. Erler (accepted), Uncertainty inFuture Summer Precipitation on the Great Lakes Basin due to Drought in theSouth-Western US, Journal of Geophysical Research.
I Erler, Andre R., W. Richard Peltier, Marc d’Orgeville, 2015, DynamicallyDownscaled High Resolution Hydro-Climate Projections for Western Canada,Journal of Climate.
I Marc d’Orgeville, W. Richard Peltier, Andre R. Erler, Jonathan Gula, 2014,Climate change impacts on Great Lakes Basin precipitation extremes, Journal ofGeophysical Research.
Andre R. Erler ([email protected]) Orchestrating a Climate Modeling Pipeline
Experimental Setup Summary of Results
Regional Climate Projections
Experimental SetupI GCM & RCM run for 15 years
(model time)
I Historical (1979 - 1994)I Mid-21st -Century (2045-2060)I End-21st -Century (2085-2100)
I GCM & RCM use RCP 8.5 GHGconcentration scenarios
I RCM runs with different physicalparameterizations
I Both models run in an initial conditionensemble with 4 members each
IPCC AR4 climate projectionsbased on different scenarios; theRCP 8.5 is very similar to theolder A2 scenario
Andre R. Erler ([email protected]) Orchestrating a Climate Modeling Pipeline
Experimental Setup Summary of Results
Andre R. Erler ([email protected]) Orchestrating a Climate Modeling Pipeline
Experimental Setup Summary of Results
Summary of ResultsI Significant increase in winter
precipitation (extremes, ∼30%)
I Small increase in summer, butmore increase in evaporation
Hydrological Impacts
I Climate change impacts inARB/Alberta likely benign
I 50% reduction in peak snowmeltand spring runoff in FRB/BC...
I ... but increased flood risk due toprecipitation extremes in fall
End-century soil moisturechanges in late summer
I Late summer dryingwest of ContinentalDivide, but not east
Andre R. Erler ([email protected]) Orchestrating a Climate Modeling Pipeline