Upload
others
View
3
Download
0
Embed Size (px)
Citation preview
Shortened presentation titleShortened presentation titleShortened presentation title
Evaluating the Performance of Large Scale Data Assimilation on Modern Geophysical Models
Julia PiscioniereCollege of Charleston
SIParCS Intern
Mentors:Brian Dobbins and John Dennis
August 3, 2018
Computational & Information Systems Lab
Shortened presentation titleShortened presentation titleShortened presentation title
Overview
2
Plots of Results
OptimizationMotivationsDART
Shortened presentation titleShortened presentation titleShortened presentation title
Introduction
• Data Assimilation Research TestbedDART?
• When information from new physical observations are used to update a numerical model in order to produce more accurate predictions
Data Assimilation?
3
https://www.image.ucar.edu/DAReS/DART/
Shortened presentation titleShortened presentation titleShortened presentation title
Motivation
Overall Goal: Improve the performance of data assimilation
Optimization Approach: Reduce communication in DART
4
• Observation density is increasing
• CESM ensembles: hundreds or thousands of nodes
• Enable DART to scale to larger node counts
Shortened presentation titleShortened presentation titleShortened presentation title
How DART Currently Works
5
1. Statistics of observations are calculated
•Processes one observation at a time
3. Update State Phase
•Updates the local model states based on the observation
4. Update Observations Phase
•Updates the local observations with the statistics
2. Localization Phase
•Finds state variables that are close to the current observation being processed
Number of Iterations = The Number of Observations
Shortened presentation titleShortened presentation titleShortened presentation title
How the Optimized DART Works
6
1. Statistics of observations are calculated
•Processes groups of observations at a time (graph code)
3. Update State Phase
•Updates the local model states based on the observation
4. Update Observations Phase
•Updates the local observations with the statistics
2. Localization Phase
•Finds state variables that are close to the current observation being processed
Number of Iterations = The Number of Groups
Number of Communication Counts = 1-2% of the Number of Observations
Shortened presentation titleShortened presentation titleShortened presentation title
Cutoff Distance
7
Shortened presentation titleShortened presentation titleShortened presentation title
Cutoff Distance
8
Shortened presentation titleShortened presentation titleShortened presentation title
What is a Graph Coloring Algorithm?
9
Shortened presentation titleShortened presentation titleShortened presentation title
What is a Graph Coloring Algorithm?
10
Shortened presentation titleShortened presentation titleShortened presentation title
What is a Graph Coloring Algorithm?
11
Shortened presentation titleShortened presentation titleShortened presentation title
What is a Graph Coloring Algorithm?
12
Shortened presentation titleShortened presentation titleShortened presentation title
What is a Graph Coloring Algorithm?
13
Shortened presentation titleShortened presentation titleShortened presentation title
What is a Graph Coloring Algorithm?
14
Shortened presentation titleShortened presentation titleShortened presentation title
What is a Graph Coloring Algorithm?
15
Color 4 = RedColor 5 = Blue
Shortened presentation titleShortened presentation titleShortened presentation title
Running Comparisons
Chunk Size
• The highest number of observations that will be in a chunk
• Used chunk size of 256
Ensemble Size
• Number of ensemble members DART will produce
• Used ensemble size 100
Number of Nodes
• Range from 4 - 384
Cutoff Distance
• Used cutoffs 0.1 and 0.2
16
• All runs were 1–degree CAM (Community Atmospheric Model) cases
Shortened presentation titleShortened presentation titleShortened presentation title
Choosing Chunk Size
17Evaluating Performance of New DART
Chunk sizes of 128 and 256 were both viable options
Shortened presentation titleShortened presentation titleShortened presentation title
Assimilation Time Plots
18Evaluating Performance of New DART
At 64 nodes, the graph code is 60% faster than the original code
Shortened presentation titleShortened presentation titleShortened presentation title
Assimilation Time Plots
19Evaluating Performance of New DART
At 128 nodes, the graph code is 51% faster than the original
Shortened presentation titleShortened presentation titleShortened presentation title
Communication Time Plot
20Evaluating Performance of New DART
The communication time is decreased by an average of 90% across nodes from 4-384
Shortened presentation titleShortened presentation titleShortened presentation title
Cheyenne Over Gigabit Ethernet
21Evaluating Performance of New DART
The graph code may run faster on a slower network, meaning people with slower networks can run DART
more efficiently
Shortened presentation titleShortened presentation titleShortened presentation title
Conclusions
•The graph code communication time is decreased by an average of 90%
The assimilation time is decreased by a minimum of 10% and -as the number of nodes increases - a maximum of 85%
The graph code may be a better option for people who want to run DART on systems with slower network links
22
Shortened presentation titleShortened presentation titleShortened presentation title
Thank you! [email protected]
AcknowledgementsMentors:
– Brian Dobbins– John Dennis
CODE Group:– AJ Lauer– Elliot Foust– Jenna Preston
DAReS Group:– Nancy Collins– Jeff Anderson – Tim Hoar
– Elizabeth Faircloth, admin– Rich Loft– Valerie Sloan– National Center for Atmospheric
Research– National Science Foundation– John Piscioniere– My fellow interns :)
23Evaluating Performance of New DART