Programming on IBM Cell Triblade. Jagan Jayaraj, Pei-Hung Lin, Mike Knox and Paul Woodward, University of Minnesota, April 1, 2009


Page 1: Programming on IBM Cell Triblade

Programming on IBM Cell Triblade
Jagan Jayaraj, Pei-Hung Lin, Mike Knox and Paul Woodward
University of Minnesota
April 1, 2009

Page 2: Programming on IBM Cell Triblade

Rayleigh–Taylor instability

•An instability of the interface between two fluids of different densities, which occurs when the lighter fluid pushes the heavier fluid.

•The R-T instability simulation is implemented with the multifluid Piecewise-Parabolic Method (PPM).

•The program is written in Fortran.

Page 3: Programming on IBM Cell Triblade

TriBlade

▫Two QS22 blades, each with 2 PowerXCell 8i CPUs

▫LS21 blade with two dual-core AMD Opterons

▫16GB memory for LS21 and 8GB memory for QS22

Page 4: Programming on IBM Cell Triblade
Page 5: Programming on IBM Cell Triblade

LCSE Cell Cluster

•6 Triblades

•4 QS22 Cell blades

•2 QS20 Cell blades

•4 AMD Quadcore Systems

Page 6: Programming on IBM Cell Triblade
Page 7: Programming on IBM Cell Triblade

Login instructions

•Account credentials should be in your email.

•Guest account: lcse / lcse$ncsa!

•Login steps:
  ▫SSH to frodo.lcse.umn.edu
  ▫Once logged in to frodo, SSH to an assigned Cell Processor host
    AMD – rra001a ~ rra006a
    Cell – rra001b / rra001c ~ rra006b / rra006c

Page 8: Programming on IBM Cell Triblade

Software available

•Cell SDK 3.1
•OpenMPI 1.3
•DaCS Fortran bindings
•Compilers
  ▫AMD: gfortran, gcc 4.1.2
  ▫PPU: ppuxlf, ppu-gcc
  ▫SPU: spuxlf, spu-gcc

•Example code is available on /mnt/scratch/NCSA_Example

Page 9: Programming on IBM Cell Triblade

Compilation and Execution

•On AMD node:
  ▫make ppm4f-x86

•On Cell node:
  ▫make ppm4f-ppu

•On AMD node:
  ▫./ppm4f-x86

Page 10: Programming on IBM Cell Triblade

Triblade programming paradigm

•Three levels of parallelism: within-Cell, within-node, node-to-node

•Compute-communication overlap: DMA, DaCS, MPI
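Compute-communication overlap is usually organized as double buffering: while one buffer is being computed on, the next chunk is fetched into the other. The C sketch below shows only the buffer rotation. It is an illustration under assumptions, not the code from the talk: `fetch_chunk`, `compute_chunk`, `sum_double_buffered` and `CHUNK` are hypothetical names, and `memcpy` stands in for an asynchronous DMA, so no real overlap happens here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CHUNK 4  /* elements per transfer; hypothetical size */

/* Stand-in for an asynchronous DMA "get". A real SPU code would start the
   transfer and return immediately; memcpy completes before returning. */
static void fetch_chunk(float *local, const float *remote, size_t n)
{
    memcpy(local, remote, n * sizeof(float));
}

/* Stand-in for the compute kernel: sums one chunk. */
static float compute_chunk(const float *local, size_t n)
{
    float s = 0.0f;
    for (size_t i = 0; i < n; i++)
        s += local[i];
    return s;
}

/* Sum `total` floats from `remote`, CHUNK at a time, using two local
   buffers: buffer `cur` is computed on while buffer `nxt` is filled. */
float sum_double_buffered(const float *remote, size_t total)
{
    float buf[2][CHUNK];
    float sum = 0.0f;
    size_t nchunks = total / CHUNK;
    int cur = 0;

    assert(total % CHUNK == 0);   /* sketch handles whole chunks only */
    if (nchunks == 0)
        return 0.0f;

    fetch_chunk(buf[cur], remote, CHUNK);              /* prefetch chunk 0 */
    for (size_t c = 0; c < nchunks; c++) {
        int nxt = 1 - cur;
        if (c + 1 < nchunks)                           /* start next fetch */
            fetch_chunk(buf[nxt], remote + (c + 1) * CHUNK, CHUNK);
        /* a real code would wait here for the transfer of chunk c */
        sum += compute_chunk(buf[cur], CHUNK);
        cur = nxt;                                     /* rotate buffers */
    }
    return sum;
}
```

The same rotation pattern could apply at each of the three levels, with DMA, DaCS or MPI transfers in place of `fetch_chunk`.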

Page 11: Programming on IBM Cell Triblade

Programming for IBM Cell Tri-blade

•Single code for Roadrunner and non-RR systems
  ◦Using lots of #ifdef, #if, #endif…
  ◦Using the preprocessor to generate three codes

•Minimize the manual translation for SPU code
  ◦Using a Fortran to Cell C translator, tedious portions of the SPU code can be translated.

•Fortran codes for PPU and AMD
  ◦Fortran binding programs for C intrinsic libraries

•Keep memory footprint small
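The single-source idea can be sketched with conditional compilation. The slides apply the preprocessor to Fortran source; the example below shows the same mechanism in C. The macros `BUILD_AMD`, `BUILD_PPU` and `BUILD_SPU` and both function names are hypothetical and are not taken from the talk.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical build macros: exactly one of BUILD_AMD, BUILD_PPU or
   BUILD_SPU would be passed with -D when compiling for that target. */
#if !defined(BUILD_AMD) && !defined(BUILD_PPU) && !defined(BUILD_SPU)
#define BUILD_AMD 1   /* default so the sketch compiles with a plain cc */
#endif

/* One function in the source, three different bodies after preprocessing. */
const char *target_role(void)
{
#if defined(BUILD_SPU)
    return "SPU: just compute";
#elif defined(BUILD_PPU)
    return "PPU: transfer data, manage SPUs";
#else
    return "AMD: I/O, MPI, relay data to Cell";
#endif
}

/* Report which variant of the single source was built. */
void print_role(FILE *out)
{
    assert(out != NULL);
    fprintf(out, "built for %s\n", target_role());
}
```

Compiling the same file three times, each with a different `-D` flag and the matching compiler, would yield the three codes from one source.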

Page 12: Programming on IBM Cell Triblade

•Single Source Code → Preprocessor → PPU Fortran code, SPU Fortran code, AMD Fortran code

•SPU Fortran code → Translation → SPU C code → SPU C Compiler → SPU Executable

•PPU Fortran code → PPU Fortran Compiler → PPU Executable (SPU Executable Embedded)

•AMD Fortran code → GNU Fortran Compiler → AMD Executable

•Fortran Binding Programs

Page 13: Programming on IBM Cell Triblade

Division of labor

▫Define jobs for AMD, PPU and SPU clearly

AMD: I/O, MPI, relay data to Cell…

PPU: Transfer data, manage SPUs

SPU: Just compute

Page 14: Programming on IBM Cell Triblade

Items to take care of

▫Three codes for three different ISAs

▫Different endianness between PPU and AMD
   Need to do byte-swapping

▫64-bit/32-bit conversion
   SPU supports 32-bit addresses only, but DaCS requires 64-bit address mode
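Two of these items can be illustrated in C. This is a minimal sketch under assumptions, not the conversion code used in the talk: `swap32`, `swap_floats` and `widen_address` are hypothetical helper names, and the address widening is shown as a plain zero-extension.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Reverse the byte order of one 32-bit word. */
uint32_t swap32(uint32_t x)
{
    return (x >> 24) | ((x >> 8) & 0x0000FF00u) |
           ((x << 8) & 0x00FF0000u) | (x << 24);
}

/* Byte-swap an array of 4-byte floats in place, e.g. after receiving
   data produced on a processor of the opposite endianness. */
void swap_floats(float *a, size_t n)
{
    assert(a != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        uint32_t w;
        memcpy(&w, &a[i], sizeof w);   /* copy bits without aliasing issues */
        w = swap32(w);
        memcpy(&a[i], &w, sizeof w);
    }
}

/* Zero-extend a 32-bit address to a 64-bit value (illustrative only). */
uint64_t widen_address(uint32_t addr32)
{
    return (uint64_t)addr32;
}
```

Swapping twice restores the original values, which makes a convenient sanity check when data makes a round trip between the AMD and Cell sides.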

Page 15: Programming on IBM Cell Triblade

Translator

•Fortran to C with Cell extensions

•Needs directives

•Built with ANTLR

•Handles:
  ▫Vector and scalar loops
  ▫DMAs (including list DMAs)
  ▫Variable declarations
  ▫Conditional vector moves

Page 16: Programming on IBM Cell Triblade

References

• Woodward, P. R., J. Jayaraj, P.-H. Lin, and P.-C. Yew, “Moving Scientific Codes to Multicore Microprocessor CPUs,” Computing in Science & Engineering, special issue on novel architectures, Nov. 2008, pp. 16-25. Also available at www.lcse.umn.edu/CiSE.

• Woodward, P. R., J. Jayaraj, P.-H. Lin, and D. Porter, “Programming Techniques for Moving Scientific Simulation Codes to Roadrunner,” tutorial given 3/12/08 at Los Alamos, link available at www.lanl.gov/roadrunner/rrtechnicalseminars2008.

• Woodward, P. R., J. Jayaraj, P.-H. Lin, and W. Dai, “First Experience of Compressible Gas Dynamics Simulation on the Los Alamos Roadrunner Machine,” submitted to Concurrency and Computation: Practice and Experience, preprint available at www.lcse.umn.edu/RR-docs.

• http://www.lcse.umn.edu/NCSA_Workshop/