Computation of High-Resolution Global Ocean Model using Earth Simulator
By Norikazu Nakashiki (CRIEPI), Yoshikatsu Yoshida (CRIEPI), Takaki Tsubono (CRIEPI), Dong-Hoon Kim (CRIEPI), Frank O. Bryan (NCAR), Richard D. Smith (LANL), Mathew E. Maltrud (LANL), Julie L. McClean (NPS)
Parallel Ocean Program (POP)
1. Designed for massively parallel computers -> shared memory, massively parallel computing
2. Free-surface boundary condition -> no island problem -> unsmoothed bottom topography -> prognostic sea-surface height
3. General orthogonal coordinates -> displaced-pole grid (singularity-free Arctic Ocean)
4. Vertical mixing parameterization
   1) simple constant mixing
   2) Richardson-number dependent mixing
   3) KPP mixing parameterization
5. Convective adjustment
   1) convective adjustment
   2) large mixing coefficient
6. Horizontal mixing
   1) Laplacian
   2) bi-harmonic
   3) Gent-McWilliams isopycnal tracer diffusion
   4) anisotropic viscosity
7. Equation of state
   1) UNESCO eq. (based on potential temperature)
   2) full UNESCO eq. (polynomial fit)
   3) linear eos
8. Topographic stress
   1) Holloway’s topographic stress parameterization
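The "linear eos" option in item 7 can be illustrated with a short sketch. This is the textbook linear form, rho = rho0 * (1 - alpha*(T - T0) + beta*(S - S0)), not POP's own routine; the function name and every coefficient below are illustrative assumptions, not values taken from the slides.

```python
# Minimal sketch of a linear equation of state (textbook form).
# All default coefficients are illustrative assumptions, not POP settings.

def linear_eos_density(temp_c, salt_psu,
                       rho0=1025.0,   # reference density, kg/m^3 (assumed)
                       t0=10.0,       # reference temperature, deg C (assumed)
                       s0=35.0,       # reference salinity, psu (assumed)
                       alpha=2.0e-4,  # thermal expansion coeff., 1/K (assumed)
                       beta=8.0e-4):  # haline contraction coeff., 1/psu (assumed)
    """Return density in kg/m^3 from temperature and salinity."""
    return rho0 * (1.0 - alpha * (temp_c - t0) + beta * (salt_psu - s0))


if __name__ == "__main__":
    # Warmer water is lighter, saltier water is denser.
    print(linear_eos_density(10.0, 35.0))
    print(linear_eos_density(20.0, 35.0))
    print(linear_eos_density(10.0, 36.0))
```

A linear form like this is cheap and useful for idealized tests; the UNESCO options listed in the same item are the ones to use when density accuracy matters.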
POP (Parallel Ocean Program)
1) High-resolution global ocean model
   Resolution: 0.1x0.1x40L (3600x2400x40) (pole on North America)
   Horizontal: bi-harmonic mixing for momentum & tracer
   Vertical: KPP mixing
   Time step: 220/day (≈ 6 min.)
2) Global model for CCSM2
   Resolution: 1x1x40L (320x384x40) (pole on Greenland)
   Horizontal: anisotropic mixing for momentum, GM mixing for tracer
   Vertical: KPP mixing
   Time step: 23/day (≈ 60 min.)
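The two configurations can be compared with a little arithmetic. The sketch below converts the stated steps per day into a time step in minutes and counts grid points. The dictionary layout and function names are my own, and only the numbers come from the slide.

```python
# Sketch: time step and grid size implied by the two POP configurations.
# Numbers are from the slide; names and structure are illustrative.

MINUTES_PER_DAY = 24 * 60

CONFIGS = {
    "0.1 degree": {"grid": (3600, 2400, 40), "steps_per_day": 220},
    "1 degree": {"grid": (320, 384, 40), "steps_per_day": 23},
}


def timestep_minutes(steps_per_day):
    """Length of one full step in minutes."""
    return MINUTES_PER_DAY / steps_per_day


def grid_points(grid):
    """Total number of grid points (nx * ny * nz)."""
    nx, ny, nz = grid
    return nx * ny * nz


if __name__ == "__main__":
    for name, cfg in CONFIGS.items():
        print(name,
              round(timestep_minutes(cfg["steps_per_day"]), 2), "min/step,",
              grid_points(cfg["grid"]), "grid points")
    ratio = (grid_points(CONFIGS["0.1 degree"]["grid"])
             / grid_points(CONFIGS["1 degree"]["grid"]))
    print("grid-point ratio:", ratio)
```

The 0.1 degree model has about 70 times as many grid points and takes roughly ten times as many steps per simulated day, which gives a sense of why the timing sections treat the two cases separately.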
Computational Grid of POP x0.1
[Figure: horizontal mesh and vertical mesh; the vertical-mesh panel plots depth (0 to 6000) against z-index (0 to 40)]
POP timing measurement on ES
• 1 degree model
  – 320 x 384 x 40 grid division, 23 full-step/day
  – KPP vertical mixing scheme
  – GM horizontal mixing for tracer
  – Anisotropic viscosity parameterization
  – 3rd-order upwind tracer advection
• 0.1 degree model
  – 3600 x 2400 x 40 grid division, 220 full-step/day
  – KPP vertical mixing scheme
  – Bi-harmonic horizontal mixing for tracer and momentum
• No history output. No forcing data input.
Cost distribution in POP
resolution: x0.1 deg
[Figure: wallclock seconds per simulated day, split into baroclinic and barotropic parts, w/o and w/ optimization, for (a) 20 PEs, (b) 160 PEs, (c) 960 PEs]
Scalability in baroclinic/barotropic mode
[Figure: wallclock seconds per simulated day vs. # of processors (# of nodes), baroclinic and barotropic modes, w/o and w/ optimization; (a) 1 degree, (b) 0.1 degree]
Significant improvement in barotropic mode
Scalability wall around 2 nodes (1 deg) and 80 nodes (0.1 deg)
Slight speedup in baroclinic mode
POP performance on ES
1.64 day/century, 70.2 Gflop/s, at 1 degree (4 nodes)
27.1 day/century, 1.60 Tflop/s, at 0.1 degree (120 nodes)
[Figure: (a) wallclock days per simulated century vs. # of PEs (# of nodes); (b) parallel efficiency vs. # of PEs (# of nodes); 1 degree and 0.1 degree, w/o and w/ optimization]
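The two metrics used in these timing slides are related by simple conversions, sketched below. The function names are hypothetical. A 365-day model year is an assumption (the slides do not state the calendar), as is the relative-efficiency definition. The sample timings in the efficiency call are made up for illustration and are not measurements from the deck.

```python
# Sketch: converting between "wallclock seconds per simulated day" and
# "wallclock days per simulated century", plus a relative parallel efficiency.
# Assumptions: 365-day model year; efficiency measured against a reference run.

SECONDS_PER_DAY = 86400.0
PES_PER_NODE = 8  # consistent with the slides: 40 nodes = 320 PEs


def days_per_century(seconds_per_sim_day, days_per_year=365):
    """Wallclock days needed to simulate 100 years."""
    return seconds_per_sim_day * days_per_year * 100 / SECONDS_PER_DAY


def seconds_per_sim_day(wallclock_days_per_century, days_per_year=365):
    """Inverse of days_per_century."""
    return wallclock_days_per_century * SECONDS_PER_DAY / (days_per_year * 100)


def parallel_efficiency(t_ref, pes_ref, t, pes):
    """Efficiency of a run on `pes` PEs relative to a run on `pes_ref` PEs."""
    return (t_ref * pes_ref) / (t * pes)


if __name__ == "__main__":
    # Reported figures: 27.1 day/century on 120 nodes, 1.64 on 4 nodes.
    print(120 * PES_PER_NODE, "PEs:", round(seconds_per_sim_day(27.1), 2), "s/day")
    print(4 * PES_PER_NODE, "PEs:", round(seconds_per_sim_day(1.64), 2), "s/day")
    # Hypothetical timings: 100 s on 8 PEs, 20 s on 64 PEs.
    print("efficiency:", parallel_efficiency(100.0, 8, 20.0, 64))
```

Under the 365-day assumption, the reported 27.1 day/century corresponds to roughly 64 wallclock seconds per simulated day at 0.1 degree.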
Parallel Efficiency of POP x1 (Relation with Vertical Resolution)
Parallel efficiency ≥ 50% (10 nodes) on ES Center
Vertical resolution -> 40L, 80L, 160L, 200L
[Figure: POPx1 (Layer Num. - CPU time): wallclock time (sec) for 2-day integration vs. number of PEs, for 40L, 80L, 160L, 200L]
[Figure: POPx1 Layer - Parallel Efficiency: parallel efficiency (%) vs. number of PEs (0 to 128), for 40L, 80L, 160L, 200L]
Further optimizations for POP code
• POP version 1
  – Distributed parallel I/O w/ horizontal data decomposition (J. Ueno, will be completed in March)
  – Tests of NEC’s new MPI library (incl. all-reduce)
  – Merge CRIEPI version and CRAY version into one
• POP version 2
  – POP2 beta2 code ported to ES
  – Vector optimization
  – Timing measurement in progress (H. Komatsu, J. Ueno)
Some problems w/ OpenMP
• NEC’s compiler supports OpenMP 1.1, not OpenMP 2
• Some features of f90 cannot be used w/ OpenMP 1.1
10-Year Spin-up of POP (x0.1)
10-year computation (10 years * 1 cycle)
Initial data: from LANL/NPS
Platform: Earth Simulator, 40 nodes (320 PEs)
Surface boundary condition for POP x0.1: atmospheric boundary conditions from NCEP, etc. (1990-2000)
(1) Wind: daily
(2) Surface heat flux: daily
(3) Surface fresh water flux: monthly
Global Diagnostics
Kinetic Energy at Surface
Global Mean KE
Global Mean PT
Global Mean SAL
Annual Mean Sea Surface Temp.
Annual Mean Sea Surface Sal. Levitus
POP x1 (2000)
POP x0.1 (2000)
Kuroshio
CCSM2 (for climate simulation): x1 deg. (100 km × 100 km)
High-resolution model: x0.1 deg. (10 km × 10 km)
Equatorial Current
Sea Surface Temp.
Global 1990-2000 Monthly SST
1990-1991 Daily SST
Kuroshio
Gulf Stream
1990-2000 Monthly Vel.
SSH & Volume Transport Section
[Figure: sea surface height maps with volume-transport sections marked: Kuroshio, Ohsumi, Tsushima, Tsugaru, Soya, Tokara]
Volume Transport: Kuroshio
[Figure: volume transport through each Kuroshio section (Sv, axis 0 to 60) vs. year, 1990-2000; series: Kuroshio (Tokara), Kuroshio (Izu)]
Izu: 30-60 Sv
Tokara: 13 Sv
Volume Transport: Japan Sea
[Figure: volume transport through each section in the Japan Sea (Sv, axis -1 to 5) vs. year, 1990-2000; series: Tsushima, Tsugaru, Soya]
Soya: 0.7 Sv
Tsugaru: 1.5 Sv
Tsushima: 2.2 Sv
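The transports quoted here are in sverdrups (1 Sv = 10^6 m^3/s). As a minimal sketch of how such a number is obtained, the code below integrates the velocity normal to a section over the section's area. The function name and the uniform test section are hypothetical, and a real diagnostic would also handle land masking and partial cells.

```python
# Sketch: volume transport through a vertical section, in sverdrups.
# Assumption: velocity is given at cell centres as normal_velocity[k][i],
# with k the layer index and i the along-section index.

M3_PER_S_PER_SV = 1.0e6  # 1 Sv = 1e6 m^3/s


def volume_transport_sv(normal_velocity, cell_width, layer_thickness):
    """Sum v * dx * dz over all cells of the section and convert to Sv.

    normal_velocity : list of rows, m/s, one row per layer
    cell_width      : along-section cell widths, m
    layer_thickness : layer thicknesses, m
    """
    total = 0.0
    for k, dz in enumerate(layer_thickness):
        for i, dx in enumerate(cell_width):
            total += normal_velocity[k][i] * dx * dz
    return total / M3_PER_S_PER_SV


if __name__ == "__main__":
    # Hypothetical section: 100 km wide, 500 m deep, uniform 0.5 m/s flow.
    widths = [10_000.0] * 10
    layers = [100.0] * 5
    velocity = [[0.5] * len(widths) for _ in layers]
    print(volume_transport_sv(velocity, widths, layers), "Sv")
```

Negative velocities contribute negatively, so the same routine gives net transport for sections with return flow.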
Sensitivity Analysis of POP x0.1
To improve Gulf Stream & Kuroshio, etc.
→ Change Strength of Horizontal Mixing
Viscosity & Diffusivity of Bi-harmonic Mixing
case 01a: am = -2.7e18, ah = -9.0e17 (basic: same horizontal mixing)
case 01b: x1/2
case 01c: x1/3
Surface Forcing : Monthly Climatology
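The coefficients implied by each case can be tabulated with a short sketch. It assumes that `am` is the bi-harmonic viscosity and `ah` the bi-harmonic diffusivity (following the order in the slide) and that the x1/2 and x1/3 factors apply to both. Units are not given in the slides and are left unstated here.

```python
# Sketch: bi-harmonic coefficients for the three sensitivity cases.
# Assumptions: am = viscosity, ah = diffusivity; each factor scales both.

BASE = {"am": -2.7e18, "ah": -9.0e17}  # case 01a (basic), from the slide
SCALE = {"01a": 1.0, "01b": 1.0 / 2.0, "01c": 1.0 / 3.0}


def case_coefficients(case):
    """Return the scaled {am, ah} pair for a case label such as '01b'."""
    factor = SCALE[case]
    return {name: value * factor for name, value in BASE.items()}


if __name__ == "__main__":
    for case in SCALE:
        coeffs = case_coefficients(case)
        print(f"case {case}: am = {coeffs['am']:.2e}, ah = {coeffs['ah']:.2e}")
```

With these assumptions, case 01c uses a viscosity equal in magnitude to the basic diffusivity (-9.0e17).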
Global Diagnostics
Global Mean KE
Global Mean PT
Global Mean SAL
case 01a Same (basic)
case 01b x1/2
case 01c x1/3
SSH case 01a (basic) 9th, 10th year
9th 10th
SSH case 01c x1/3 9th, 10th year
9th 10th
Viscosity – SSH (Kuroshio)
Global Mean KE
- 01a basic
- 01b x1/2
- 01c x1/3
x1/2 x1/3
basic
Low
High
Volume Transport through the Tsushima, Tsugaru and Soya sections (Sv)
Section            Es_01a   Es_01b   Es_01c   Obs.
Tsushima (Korea)   2.6      2.74     3.0      2.7
Tsugaru            2.22     2.40     2.61     1.6
Soya               0.38     0.34     0.38     1.1
[Figure: maps for Es_01a, Es_01b and Es_01c with the Tsushima, Tsugaru and Soya sections marked; annotated values 0.14, 0.16, 0.18, 0.21]
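A quick way to read the transport comparison is to compute each case's bias against the observed value. The sketch below does so, assuming the table is read as rows Tsushima, Tsugaru, Soya and columns Es_01a, Es_01b, Es_01c, Obs. That reading is my reconstruction of the slide layout, and the names in the code are illustrative.

```python
# Sketch: model-minus-observation bias of section transports (Sv).
# Values follow my reading of the slide's table; treat them as provisional.

TRANSPORT_SV = {
    "Tsushima": {"Es_01a": 2.6, "Es_01b": 2.74, "Es_01c": 3.0, "Obs.": 2.7},
    "Tsugaru": {"Es_01a": 2.22, "Es_01b": 2.40, "Es_01c": 2.61, "Obs.": 1.6},
    "Soya": {"Es_01a": 0.38, "Es_01b": 0.34, "Es_01c": 0.38, "Obs.": 1.1},
}


def transport_bias(section, case):
    """Model transport minus observed transport for one section, in Sv."""
    row = TRANSPORT_SV[section]
    return row[case] - row["Obs."]


if __name__ == "__main__":
    for section in TRANSPORT_SV:
        biases = {case: round(transport_bias(section, case), 2)
                  for case in ("Es_01a", "Es_01b", "Es_01c")}
        print(section, biases)
```

On this reading, all three cases overestimate the Tsugaru transport and underestimate the Soya transport, while Tsushima is close to the observed value.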
Tokara
Kuroshio
case_01c, year 12: Jan., Mar., and May
Jan
Mar May
New sections are planned for checking volume transport
case 01a Same (basic)
case 01b x1/2
case 01c x1/3
Volume Transport (Kuroshio)
Global Mean KE
- 01a basic
- 01b x1/2
- 01c x1/3
Viscosity – SSH (Gulf Stream)
x1/2 x1/3
basic
Sensitivity Analysis of POP x0.1
To improve Gulf Stream, etc.
→ Change Restoring condition, Topography
Viscosity & Diffusivity of Bi-harmonic Mixing
case 01e: viscosity & diffusivity x1/3 + w/o restoring
case 01f: viscosity & diffusivity x1/3 + w/ topography change
case 01f w/ topography change
case 01f x1/3 w/ topo. change
case 01e x1/3 w/o restoring
SSH at 6th year
Global Diagnostics
Global Mean KE
Global Mean PT
Global Mean SAL
case 01a Same (basic)
case 01b x1/2
case 01c x1/3
case 01d GM scheme
case 01e x1/3 w/o restoring
case 01f x1/3 w/ topo. change
Global Diagnostics
Global Mean KE
Global Mean PT
Global Mean SAL
case 20d (basic)
case 00a GM mixing
With NCEP daily forcing
Future Research Plan
1) POP x0.1 deg.
   * Improvement of the model
     Sensitivity analysis: horizontal & vertical mixing, etc.
     Vertical resolution: 40 layers -> 106 layers
     Active ice model
2) POP x1 deg.
   * Tuning for CCSM2 computation
Research Plan in FY2003
1) POP x0.1 deg.
   * Improvement of the model
     Sensitivity analysis: horizontal & vertical mixing, etc.
     Vertical resolution: 40 layers -> 106 layers
     Active ice model?
   * Analyze the results and write a paper
2) POP x1 deg.
   * Tuning for CCSM2 computation
3) Regional nesting model
   Porting to ES Center
   Nesting to POP x1 deg.