Accelerating a climate physics model with OpenCL CMSC 601 Spring 11 – Research Skills Dibyajyoti Ghosh


Page 1: Accelerating a climate physics model with OpenCL CMSC 601 Spring 11 – Research Skills Dibyajyoti Ghosh

Accelerating a climate physics model with OpenCL

CMSC 601 Spring 11 – Research Skills

Dibyajyoti Ghosh

Page 2

What is a climate physics model?

• Global weather is controlled by many interconnected events, including changes in the atmosphere and oceans and the ebb and flow of sea ice.

• The world’s most powerful supercomputers can simulate these events.

• The CCSM-2 model simulates Earth’s climate patterns in considerable detail, using 700 billion calculations to recreate a single day of the world’s climate.

• Scientists use these data to understand ocean currents, predict weather patterns, and study the ozone (O3) layer, among other things.

http://www.ucar.edu/communications/CCSM/overview.html

Page 3

Background

• The solar radiation component of NASA’s GEOS-v5 takes ~20% of model computation time.

• NASA is interested in analyzing the performance and cost benefit of non-traditional computing systems.

• GEOS-v5 is 20+ years old, written mostly in Fortran, and still evolving.

• It cannot be entirely rewritten due to production constraints.

http://www.ucar.edu/communications/CCSM/overview.html
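The ~20% figure bounds how much the whole model can gain from accelerating the solar radiation component alone. A minimal sketch of that arithmetic using Amdahl's law follows; the function and parameter names are illustrative and do not come from the slides.

```c
#include <assert.h>

/* Amdahl's law: overall speedup when a fraction of the run time
   (accelerated_fraction) is sped up by section_speedup and the
   remainder is left unchanged. Names are hypothetical. */
double overall_speedup(double accelerated_fraction, double section_speedup)
{
    assert(accelerated_fraction >= 0.0 && accelerated_fraction <= 1.0);
    assert(section_speedup > 0.0);
    return 1.0 / ((1.0 - accelerated_fraction)
                  + accelerated_fraction / section_speedup);
}
```

With accelerated_fraction = 0.20, the result can never exceed 1 / 0.8 = 1.25, however fast the accelerated section becomes.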

Page 4

Related Work

• Accelerating Climate Models with the IBM Cell Processor – Shujia Zhou et al, 2008

• GPU Computing for Atmospheric Modeling – Rory Kelly, NCAR, Boulder, July-Aug. 2010

• Accelerating Atmospheric Modeling Through Emerging Multi-core Technologies – John Christian Linford, Virginia Tech, 2010

• Exploiting Array Syntax in Fortran for Accelerator Programming - Matthew J. Sottile, Craig E Rasmussen, Wayne N. Weseloh, Robert W. Robey, Los Alamos National Laboratory

Page 5

Motivation

http://www.cc.gatech.edu/~bader/AFRL-GT-Workshop2009/AFRL-GT-Bader.pdf

No data on how OpenCL fares against GCC in vectorization.

OpenCL was created with the goal of unifying hybrid systems.

No literature on OpenCL portability among architectures.

Page 6

What is vectorization?

Data in memory: a b c d e f g h i j k l m n o p

Scalar operations: OP(a), OP(b), OP(c), OP(d)

Vector operation: VOP(a, b, c, d) on vector register VR1, which holds a, b, c, d in lanes 0 to 3

Vector registers: VR1, VR2, VR3, VR4, VR5

Data elements are packed into vectors; the vector length gives the vectorization factor (VF). Here VF = 4.

Original serial loop:

for (i=0; i<N; i++) { a[i] = a[i] + b[i]; }

Loop in vector notation (after vectorization):

for (i=0; i<N; i+=VF) { a[i:i+VF-1] = a[i:i+VF-1] + b[i:i+VF-1]; }

Thanks to Dorit Nuzman, IBM www.hipeac.net/system/files/4_Nuzman.ppt for this wonderful slide
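The vector-notation loop on this slide is not valid C. A minimal sketch of the same idea in plain C follows, assuming VF = 4 as on the slide. The function names are hypothetical, and the remainder loop is an addition for the case where N is not a multiple of VF.

```c
#include <assert.h>
#include <stddef.h>

#define VF 4  /* vectorization factor, as on the slide */

/* Serial reference: one element per iteration. */
void add_serial(float *a, const float *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        a[i] = a[i] + b[i];
}

/* Strip-mined form: each outer iteration covers VF consecutive elements,
   the chunk a vector unit would process with one vector operation. */
void add_strip_mined(float *a, const float *b, size_t n)
{
    assert(a != NULL && b != NULL);
    size_t i = 0;
    for (; i + VF <= n; i += VF)
        for (size_t lane = 0; lane < VF; lane++)
            a[i + lane] = a[i + lane] + b[i + lane];
    for (; i < n; i++)  /* remainder when n is not a multiple of VF */
        a[i] = a[i] + b[i];
}
```

Both functions produce the same result. The inner lane loop only marks the work that a single vector instruction would perform.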

Page 7

OpenCL trivia

• A framework for heterogeneous computing resources, initially developed by Apple Inc. and now supported by all major vendors.

• Based on a subset of the C language, with additional features to facilitate parallel processing.

http://www.khronos.org/opencl/

Page 8

How does data parallelism work in OpenCL?

• A kernel is the code for a work item, executed on a device (a CPU, GPU, or other processor).

• Imagine an NxN grid with one kernel invocation (work item) per grid point.
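The grid picture can be sketched in plain C as a host-side emulation. This is not the OpenCL API: the function names are hypothetical, and the doubling operation is only a placeholder for real kernel work. The indices gx and gy play the role of get_global_id(0) and get_global_id(1) in an OpenCL kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a kernel body: handles exactly one grid point. */
static void scale_point(size_t gx, size_t gy, size_t n,
                        const float *in, float *out)
{
    out[gy * n + gx] = 2.0f * in[gy * n + gx];
}

/* Emulates launching the kernel over an n x n grid, one invocation per
   point. An OpenCL runtime may execute these invocations in parallel;
   this sketch runs them one after another. */
void run_grid(size_t n, const float *in, float *out)
{
    assert(in != NULL && out != NULL);
    for (size_t gy = 0; gy < n; gy++)
        for (size_t gx = 0; gx < n; gx++)
            scale_point(gx, gy, n, in, out);
}
```

Because each invocation touches only its own grid point, the order of execution does not affect the result, which is what makes the work data-parallel.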

Page 9

Our Approach

• Used code from the production version of the NASA GEOS-v5 climate model.

• Step #1 – Identify computation-intensive sections of the weather model.

• Step #2 – Port these sections to OpenCL on IBM Cell B.E. and then to Mac OS X to test on an Intel CPU.

• Step #3 – Analyze the performance and explain the results.
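Steps #1 and #3 both come down to timing code sections and comparing the results. A minimal sketch follows, assuming the standard C clock() function is adequate for the serial sections. The helper names are hypothetical and are not taken from the GEOS-v5 code. Since clock() reports processor time, a wall-clock timer would be the better choice for parallel runs.

```c
#include <assert.h>
#include <time.h>

/* Processor time, in seconds, spent running fn(arg) once. */
double time_seconds(void (*fn)(void *), void *arg)
{
    clock_t start = clock();
    fn(arg);
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC;
}

/* Speedup of the ported section relative to the serial one. */
double speedup(double serial_seconds, double parallel_seconds)
{
    assert(serial_seconds >= 0.0 && parallel_seconds > 0.0);
    return serial_seconds / parallel_seconds;
}
```

For example, a section that takes 40.0 s serially and 1.0 s after porting gives speedup(40.0, 1.0) = 40.0.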

Page 10

Findings - I

[Charts: speedup on IBM Cell B.E. with OpenCL; speedup on Mac OS X with OpenCL]

[Chart: serial vs. parallel speedup of a code section analyzed on Mac OS X]

Page 11

Findings - II

1. Achieved a speedup of ~40x on both IBM and Intel CPUs.

2. Code is NOT portable among architectures: sections of the code do not function because of the incomplete OpenCL implementation on the Intel-based Mac OS X architecture.

3. GCC vectorization fails in certain cases where OpenCL succeeds. We compiled the serial code with gcc -O2 -ftree-vectorize.
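The slides do not say which loops GCC failed to vectorize. As a general illustration only, not the author's code, the sketch below contrasts a loop with independent iterations against one with a loop-carried dependence, a common reason an auto-vectorizer gives up. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Independent iterations: `restrict` promises that a and b do not
   overlap, so a compiler is free to process several elements at once. */
void add_arrays(float *restrict a, const float *restrict b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        a[i] = a[i] + b[i];
}

/* Loop-carried dependence: iteration i reads the value written by
   iteration i-1, so the iterations cannot simply run side by side. */
void running_sum(float *a, const float *b, size_t n)
{
    assert(a != NULL && b != NULL);
    for (size_t i = 1; i < n; i++)
        a[i] = a[i - 1] + b[i];
}
```

In recent GCC versions, adding -fopt-info-vec-missed to gcc -O2 -ftree-vectorize reports the loops that were not vectorized and the reason.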

Page 12

Road Ahead

• Make appropriate changes to the solar radiation code for the Intel CPU based Mac OS X architecture. Recall that some parts of the code base are non-functional on Intel CPUs.

• Modify the OpenCL code to run on GPUs and determine whether performance, in addition to code, is portable.

Page 13

Summary

• OpenCL’s goal of portability in high-performance computing still has a long road ahead.

• GCC auto-vectorization fails in certain cases where OpenCL succeeds.

Page 14

Acknowledgements

• Dr. Shujia Zhou, MC2 Lab

• Fahad Zafar, MC2 Lab

• Center for Hybrid Multicore Productivity Research, UMBC

• CMSC 601 folks

Page 15

http://www.asianjobportal.com/wp-content/uploads/2010/11/25_questions_interview.jpg

Page 16

Vectorization Analysis - I

A part of the serial code with gcc vectorization error output

Page 17

Vectorization Analysis - II

A part of the OpenCL code with the vectorized instruction set for the loop construct on the previous slide