Programming on IBM Cell Triblade
Jagan Jayaraj, Pei-Hung Lin, Mike Knox and Paul Woodward
University of Minnesota
April 1, 2009


Rayleigh–Taylor instability

•An instability of the interface between two fluids of different densities, which occurs when the lighter fluid pushes against the heavier fluid.

•The R-T instability simulation is implemented with the multifluid Piecewise-Parabolic Method (PPM).

•The program is written in Fortran.


TriBlade

▫Two QS22 blades, each with 2 PowerXCell 8i CPUs

▫LS21 blade with two dual-core AMD Opterons

▫16GB memory for LS21 and 8GB memory for QS22


LCSE Cell Cluster

•6 Triblades

•4 QS22 Cell blades

•2 QS20 Cell blades

•4 AMD Quadcore Systems


Login instructions

•Account credentials should be in your email.
•Guest account: lcse / lcse$ncsa!
•Login steps:
▫SSH to frodo.lcse.umn.edu
▫Once logged in to frodo, SSH to an assigned Cell Processor host
 AMD – rra001a ~ rra006a
 Cell – rra001b / rra001c ~ rra006b / rra006c


Software available

•Cell SDK 3.1
•OpenMPI 1.3
•DaCS Fortran bindings
•Compilers
▫AMD: gfortran, gcc 4.1.2
▫PPU: ppuxlf, ppu-gcc
▫SPU: spuxlf, spu-gcc
•Example code is available in /mnt/scratch/NCSA_Example


Compilation and Execution

•On AMD node:
▫make ppm4f-x86
•On Cell node:
▫make ppm4f-ppu
•On AMD node:
▫./ppm4f-x86


Triblade programming paradigm

Three levels of parallelism:
▫within-Cell
▫within-node
▫node-to-node

Compute-communication overlap:
▫DMA
▫DaCS
▫MPI
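One common way to overlap computation with communication is double buffering: while one buffer is being processed, the next chunk is already being fetched into the other. The sketch below only illustrates that pattern in plain C and is not the authors' code. The names `fetch_chunk`, `compute_chunk`, `process_double_buffered` and the size `CHUNK` are hypothetical, and `memcpy` stands in for an asynchronous transfer (a DMA, DaCS, or MPI call), so nothing actually overlaps here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CHUNK 4   /* elements per transfer (hypothetical size) */

/* Stand-in for starting an asynchronous transfer (DMA, DaCS, or MPI).
   memcpy is synchronous, so this only models the data movement. */
static void fetch_chunk(double *dst, const double *src, size_t n)
{
    memcpy(dst, src, n * sizeof *dst);
}

/* Stand-in for the numerical kernel: sum the elements of one chunk. */
static double compute_chunk(const double *buf, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += buf[i];
    return s;
}

/* Process nchunks chunks of CHUNK elements with two alternating buffers. */
double process_double_buffered(const double *data, size_t nchunks)
{
    double buf[2][CHUNK];
    double total = 0.0;

    assert(data != NULL || nchunks == 0);
    if (nchunks == 0)
        return 0.0;

    fetch_chunk(buf[0], data, CHUNK);            /* prefetch the first chunk */
    for (size_t c = 0; c < nchunks; c++) {
        size_t cur = c % 2;
        size_t nxt = 1 - cur;

        if (c + 1 < nchunks)                     /* start fetching the next chunk */
            fetch_chunk(buf[nxt], data + (c + 1) * CHUNK, CHUNK);

        /* A real code would wait here for the transfer into buf[cur]. */
        total += compute_chunk(buf[cur], CHUNK); /* compute on the current chunk */
    }
    return total;
}
```

With a truly asynchronous transfer, the fetch into `buf[nxt]` proceeds while `compute_chunk` works on `buf[cur]`, which is where the overlap comes from.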


Programming for IBM Cell Tri-blade

Single code for Roadrunner and non-RR systems
◦Using lots of #ifdef, #if, #endif…
◦Using the preprocessor to generate three codes

Minimize the manual translation for SPU code
◦Using a Fortran to Cell C translator, tedious portions of the SPU code can be translated.

Fortran codes for PPU and AMD
◦Fortran binding programs for C intrinsic libraries

Keep memory footprint small
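The single-source idea can be sketched with the C preprocessor, which supports the same #ifdef / #if / #endif directives. This is a minimal illustration, not the authors' Fortran source. The macros `TARGET_SPU` and `TARGET_PPU` and the functions `target_name`, `target_role` and `target_does_mpi` are hypothetical names, and the role strings are taken from the division-of-labor slide.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical build macros: compile with -DTARGET_SPU or -DTARGET_PPU.
   With neither defined, the AMD (host) variant is selected. */
#if defined(TARGET_SPU) && defined(TARGET_PPU)
#  error "Select at most one target"
#endif

#if defined(TARGET_SPU)
#  define TARGET_NAME "SPU"
#  define TARGET_ROLE "Just compute"
#elif defined(TARGET_PPU)
#  define TARGET_NAME "PPU"
#  define TARGET_ROLE "Transfer data, manage SPUs"
#else
#  define TARGET_NAME "AMD"
#  define TARGET_ROLE "I/O, MPI, relay data to Cell"
#endif

/* Name of the processor this build was generated for. */
const char *target_name(void)
{
    return TARGET_NAME;
}

/* Job assigned to this processor. */
const char *target_role(void)
{
    return TARGET_ROLE;
}

/* Returns 1 if this build handles MPI traffic (only the AMD build here). */
int target_does_mpi(void)
{
    const char *name = target_name();
    assert(name != NULL);
    return strcmp(name, "AMD") == 0;
}
```

Building the same file three times with different macro definitions yields three variants from one source, which is the effect the preprocessor step is meant to achieve.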


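[Figure: build flow]

Single Source Code → Preprocessor → PPU Fortran code, SPU Fortran code, AMD Fortran code

▫SPU Fortran code → Translation → SPU C code → SPU C Compiler → SPU Executable (embedded)
▫PPU Fortran code → PPU Fortran Compiler → PPU Executable
▫AMD Fortran code → GNU Fortran Compiler → AMD Executable
▫Fortran Binding Programs (shown alongside the SPU C code)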


Division of labor

▫Define jobs for AMD, PPU and SPU clearly

AMD: I/O, MPI, relay data to Cell…

PPU: Transfer data, manage SPUs

SPU: Just compute


Items to take care of

▫Three codes for three different ISAs

▫Different endianness between PPU and AMD: need to do byte-swapping

▫64-bit/32-bit conversion: SPU supports 32-bit addresses only, but DaCS requires 64-bit address mode
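The byte-swapping and address-width items can be illustrated in plain C. This is a sketch under assumptions, not the authors' routines: the function names `swap32`, `swap64`, `swap_floats` and `widen_address` are hypothetical, and the slides do not say which conversion the code actually performs for DaCS. Zero-extension is shown only as the simplest way to carry a 32-bit address in a 64-bit field.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reverse the byte order of a 32-bit word. */
uint32_t swap32(uint32_t v)
{
    return ((v & 0x000000FFu) << 24) |
           ((v & 0x0000FF00u) <<  8) |
           ((v & 0x00FF0000u) >>  8) |
           ((v & 0xFF000000u) >> 24);
}

/* Reverse the byte order of a 64-bit word by swapping its two halves. */
uint64_t swap64(uint64_t v)
{
    return ((uint64_t)swap32((uint32_t)(v & 0xFFFFFFFFu)) << 32) |
            (uint64_t)swap32((uint32_t)(v >> 32));
}

/* Byte-swap an array of 4-byte reals in place, e.g. a buffer relayed
   between the little-endian Opteron and the big-endian PPU. */
void swap_floats(float *buf, size_t n)
{
    assert(sizeof(float) == sizeof(uint32_t));
    for (size_t i = 0; i < n; i++) {
        uint32_t bits;
        memcpy(&bits, &buf[i], sizeof bits);   /* read the raw bit pattern */
        bits = swap32(bits);
        memcpy(&buf[i], &bits, sizeof bits);   /* write it back swapped */
    }
}

/* Zero-extend a 32-bit address into a 64-bit field (assumed conversion). */
uint64_t widen_address(uint32_t addr32)
{
    return (uint64_t)addr32;
}
```

Only one side of a transfer needs to swap, and swapping twice restores the original data, so it helps to fix by convention which processor does it.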


Translator

•Fortran to C with Cell extensions

•Needs directives

•Built with ANTLR

•Handles:
▫Vector and scalar loops
▫DMAs (including list DMAs)
▫Variable declarations
▫Conditional vector moves
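To make "conditional vector moves" concrete: a loop such as "if x(i) > t then dst(i) = src(i)" can be rewritten without a branch by building a mask and selecting bits. The sketch below shows that rewrite in scalar C and is an assumption about the general technique, not the translator's actual output. On the SPU the same pattern is usually written with a vector compare followed by the `spu_sel` intrinsic. The names `select_bits` and `conditional_move` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise select: take bits of b where mask is 1, bits of a elsewhere. */
static uint32_t select_bits(uint32_t a, uint32_t b, uint32_t mask)
{
    return (a & ~mask) | (b & mask);
}

/* Branch-free form of:
     for each i: if (x[i] > threshold) dst[i] = src[i];            */
void conditional_move(uint32_t *dst, const uint32_t *src,
                      const int32_t *x, int32_t threshold, size_t n)
{
    assert(dst != NULL && src != NULL && x != NULL);
    for (size_t i = 0; i < n; i++) {
        /* All-ones mask where the condition holds, all-zeros otherwise. */
        uint32_t mask = (x[i] > threshold) ? 0xFFFFFFFFu : 0u;
        dst[i] = select_bits(dst[i], src[i], mask);
    }
}
```

Because every element follows the same instruction sequence, the loop body can be applied to whole vectors at once, which suits a SIMD processor with costly branches.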


References

• Woodward, P. R., J. Jayaraj, P.-H. Lin, and P.-C. Yew, “Moving Scientific Codes to Multicore Microprocessor CPUs,” Computing in Science & Engineering, special issue on novel architectures, Nov. 2008, pp. 16-25. Also available at www.lcse.umn.edu/CiSE.

• Woodward, P. R., J. Jayaraj, P.-H. Lin, and D. Porter, “Programming Techniques for Moving Scientific Simulation Codes to Roadrunner,” tutorial given 3/12/08 at Los Alamos, link available at www.lanl.gov/roadrunner/rrtechnicalseminars2008.

• Woodward, P. R., J. Jayaraj, P.-H. Lin, and W. Dai, “First Experience of Compressible Gas Dynamics Simulation on the Los Alamos Roadrunner Machine,” submitted to Concurrency and Computation: Practice and Experience, preprint available at www.lcse.umn.edu/RR-docs.

• http://www.lcse.umn.edu/NCSA_Workshop/