Met Office Unified Model I/O Server

Paul Selwood

© Crown copyright Met Office


I/O Server motivation


Some History…

• I/O has always been a problem for NWP, more recently for climate

• ~2003 – application level output buffering

• ~2008 – very simple, single-threaded I/O servers added for benchmarking

• Intercepted low-level “open/write/close”

• Single threaded

• Some benefit, but limited

• Did not address scaling issues – number of messages


Old UM I/O – Restart Files


Old UM I/O – Diagnostics


Why I/O Server approach?

• Full parallel I/O difficult with our packing

• “Free” CPUs available

• “Spare” memory available

• Chance to re-work old infrastructure

• Our file format is neither GRIB nor netCDF


Diagnostic flexibility

• Variables (primary and derived)

• Output times

• Temporal processing (e.g. accumulations, extrema, means) – illustrated in the sketch after this list

• Spatial processing (sub-domains, spatial means)

• Variable to unit mapping

• Basic output resolution is a 2D field
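
As a purely illustrative sketch of the temporal processing mentioned above (accumulations and means), the C fragment below accumulates a small 2D field over a number of timesteps and derives a time mean; the field contents, dimensions and variable names are invented for the example and are not UM code.

```c
/* Illustrative only: temporal processing of a 2D diagnostic field,
 * here a running accumulation turned into a time mean at the end.
 * Sizes, values and names are invented; this is not UM code. */
#include <stdio.h>

#define NX 4
#define NY 3

int main(void)
{
    double accum[NY][NX] = {{0.0}};
    const int nsteps = 10;

    for (int t = 0; t < nsteps; t++) {
        double field[NY][NX];

        /* stand-in for the model producing a 2D field each timestep */
        for (int j = 0; j < NY; j++)
            for (int i = 0; i < NX; i++)
                field[j][i] = t + 0.1 * (j * NX + i);

        /* accumulation: add this timestep's field into the running total */
        for (int j = 0; j < NY; j++)
            for (int i = 0; i < NX; i++)
                accum[j][i] += field[j][i];
    }

    /* time mean: divide the accumulation by the number of timesteps */
    printf("time mean at (0,0): %.2f\n", accum[0][0] / nsteps);
    return 0;
}
```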


Key design decisions

• Parallelism over output streams

• Output streams distributed over servers

• Server is threaded

• “Listener” receives data & puts in queue

• “Writer” processes queue including packing

• Ensures asynchronous behaviour

• Shared FIFO queue (see the sketch after this list)

• Preserves instruction order

• Metadata/Data split

• Data initially stored on compute processes

• Data of same type combined into large messages
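
The listener/writer split and the shared FIFO queue above can be pictured with the minimal C/pthreads sketch below: one thread enqueues incoming work while a second drains it strictly in arrival order. The UM itself is Fortran; everything here, including the names, illustrates the pattern rather than the server's actual implementation.

```c
/* Minimal illustration (not UM code) of the listener/writer split:
 * one thread receives work and enqueues it on a shared FIFO queue,
 * the other dequeues in arrival order and "packs and writes" it. */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct item {
    int field_id;               /* stand-in for a queued instruction */
    struct item *next;
} item;

static item *head = NULL, *tail = NULL;   /* shared FIFO queue */
static int done = 0;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  nonempty = PTHREAD_COND_INITIALIZER;

/* Listener role: put a received instruction at the back of the queue. */
static void enqueue(int id)
{
    item *it = malloc(sizeof *it);
    it->field_id = id;
    it->next = NULL;
    pthread_mutex_lock(&lock);
    if (tail) tail->next = it; else head = it;
    tail = it;
    pthread_cond_signal(&nonempty);
    pthread_mutex_unlock(&lock);
}

/* Writer role: drain the queue in FIFO order, preserving instruction order. */
static void *writer(void *arg)
{
    (void)arg;
    for (;;) {
        pthread_mutex_lock(&lock);
        while (head == NULL && !done)
            pthread_cond_wait(&nonempty, &lock);
        if (head == NULL && done) {
            pthread_mutex_unlock(&lock);
            break;
        }
        item *it = head;
        head = it->next;
        if (head == NULL) tail = NULL;
        pthread_mutex_unlock(&lock);

        printf("packing and writing field %d\n", it->field_id);
        free(it);
    }
    return NULL;
}

int main(void)
{
    pthread_t w;
    pthread_create(&w, NULL, writer, NULL);

    for (int id = 0; id < 5; id++)   /* listener receiving five fields */
        enqueue(id);

    pthread_mutex_lock(&lock);       /* tell the writer no more work is coming */
    done = 1;
    pthread_cond_signal(&nonempty);
    pthread_mutex_unlock(&lock);

    pthread_join(w, NULL);
    return 0;
}
```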


Parallelism in I/O Servers

• Multiple I/O streams in a typical job

• I/O servers spread among nodes (one possible mapping is sketched after this list)

• Can utilise more memory

• Will improve bandwidth to disk
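
One way to picture "streams distributed over servers spread among nodes" is a simple round-robin mapping from output stream to I/O server rank, as sketched below; the function, its parameters and the numbers are hypothetical, not the UM's actual placement or tuning.

```c
/* Hypothetical round-robin placement of output streams onto I/O
 * server ranks spread through the machine. Parameters and numbers
 * are invented for illustration; this is not the UM's mapping. */
#include <stdio.h>

/* Assume n_servers server ranks, one every 'spacing' ranks, starting
 * at rank 'first' (all hypothetical). */
static int server_rank_for_stream(int unit, int first, int spacing, int n_servers)
{
    return first + (unit % n_servers) * spacing;
}

int main(void)
{
    /* e.g. 4 servers, one per 32-rank node, starting at rank 0 */
    for (int unit = 150; unit < 158; unit++)
        printf("output stream (unit %d) -> I/O server rank %d\n",
               unit, server_rank_for_stream(unit, 0, 32, 4));
    return 0;
}
```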


Automatic post-processing

• Model can trigger automatic post-processing

• Requests dealt with by I/O Server

• FIFO queue ensures integrity of data


How data gets output

[Diagram: compute processes send data to the I/O server, where the Listener (Thread 0) receives it into the queue and the Writer (Thread 1) drains it to disk.]


I/O Server development

• Initial version – Synchronous data transmission

• Asynchronous diagnostic data

• Asynchronous restart data

• Amalgamated data

• Asynchronous metadata

• Load balancing

• Priority messages within the I/O Server


Lots of diagnostic output

• Which processes are I/O servers

• “Stall” messages

• Memory log

• Timing log

• Full log of metadata / queue

All really useful for tuning!


Lots of tuneable parameters…

• Number and spacing of I/O servers

• Memory for I/O servers

• Number of local data copies

• Number of fields to amalgamate

• Load balancing options

• Timing tunings

• Plus standard I/O tunings (write block size, etc.); the main knobs are gathered in the sketch below
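
To see those knobs side by side, here is a hypothetical configuration record; every field name and value below is invented for illustration and does not correspond to a real UM namelist item or environment variable.

```c
/* Hypothetical collection of the tunables above into one record.
 * Field names and values are invented; they are not UM namelist
 * items or environment variables. */
#include <stdio.h>

struct io_server_config {
    int    n_servers;          /* number of I/O server ranks          */
    int    server_spacing;     /* gap (in ranks) between servers      */
    size_t server_memory_mb;   /* memory reserved per server          */
    int    max_local_copies;   /* client-side buffered data copies    */
    int    fields_per_message; /* fields amalgamated into one message */
    int    load_balance_mode;  /* which balancing strategy to use     */
    size_t write_block_size;   /* standard I/O tuning                 */
};

int main(void)
{
    struct io_server_config cfg = { 4, 32, 2048, 8, 16, 1, 1u << 22 };
    printf("%d servers, %zu MB each, %d fields per message\n",
           cfg.n_servers, cfg.server_memory_mb, cfg.fields_per_message);
    return 0;
}
```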


Overloaded servers


I/O Servers keeping up!


MPI considerations

• Differing levels of MPI threading support (see the sketch after this list)

• Best with MPI_THREAD_MULTIPLE

• OK with MPI_THREAD_FUNNELED

• MPI tuning

• Want metadata to go as quickly as possible

• Want data transfer to be truly asynchronous

• Don’t want to interfere with model comms (e.g. halo exchange)

• Currently use 19 environment variables!
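
The threading-level point above is usually handled at start-up. Below is a sketch (not the UM's code) of requesting MPI_THREAD_MULTIPLE and falling back gracefully when only MPI_THREAD_FUNNELED is provided.

```c
/* Sketch (not UM code) of checking the MPI threading level that a
 * threaded I/O server actually gets at start-up. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int provided;

    /* Ask for full thread support so both server threads may call MPI;
     * MPI_THREAD_FUNNELED is a workable fallback in which all MPI
     * calls are routed through one thread. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);

    if (provided >= MPI_THREAD_MULTIPLE)
        printf("MPI_THREAD_MULTIPLE: listener and writer may both call MPI\n");
    else if (provided >= MPI_THREAD_FUNNELED)
        printf("MPI_THREAD_FUNNELED: funnel all MPI calls through one thread\n");
    else
        printf("insufficient threading support for an asynchronous server\n");

    MPI_Finalize();
    return 0;
}
```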


Deployment

• July 2011 – Operational global forecasts

• January 2012 – Operational LAM forecasts

• February 2012 – High resolution climate work

• Not currently used in

• Operational ensembles

• Low resolution climate work

• Most research work


Global Forecast Improvement

Run     QG 00/12   QG 06/18   QU
Time    777s       559s       257s
%age    19%        28%        27%

Total saving: over 21 node-hours per day


Impact on High Resolution Climate

• N512 resolution AMIP

• 59 GB restart dumps

• Modest diagnostics

• Cray XE6 with up to 9K cores

• All “in-run” output hidden

• Only waits for the final restart dump

• Most data buffered on client side


Current and Future Developments

• MPI-parallel I/O servers

• Multiple I/O servers per stream

• Gives more memory per stream on server

• Reduced messaging rate per node

• Parallel packing (sketched after this list)

• Potential for parallel I/O

• Read ahead

• Potential for boundary conditions / forcings

• Some possibilities for initial conditions
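
As a rough sketch of the "parallel packing" idea, the loop below packs the fields queued for one stream concurrently; pack_field() is a hypothetical stand-in for the real packing routine, and OpenMP is used only to make the concurrency concrete.

```c
/* Rough sketch of packing several queued 2D fields concurrently.
 * pack_field() is a hypothetical stand-in for the real packing
 * routine; OpenMP is used only to make the concurrency concrete. */
#include <omp.h>
#include <stdio.h>

#define NFIELDS 8

static void pack_field(int field_id)
{
    /* placeholder for the real packing work on one 2D field */
    printf("thread %d packing field %d\n", omp_get_thread_num(), field_id);
}

int main(void)
{
    /* fields queued for one output stream, packed concurrently */
    #pragma omp parallel for schedule(dynamic)
    for (int f = 0; f < NFIELDS; f++)
        pack_field(f);
    return 0;
}
```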


Parallel I/O server improvement

[Charts: I/O server performance before and after the introduction of MPI-parallel I/O servers]


Questions and answers