Transcript
Page 1: Essential Overview

© 2005 IBM

Essential Overview

Louisiana Tech University, Ruston, Louisiana

Charles Grassl, IBM

January, 2006

Page 2: Essential Overview

Agenda

• Hardware
• Software
• Documentation

Page 3: Essential Overview

Hardware Overview

• Processors:

• Nodes:

• Clusters:

Page 4: Essential Overview

Product Naming

New Name   Old Name       Market       Processor
iSeries    AS/400         Commercial   RS64
pSeries    RS/6000        Technical    POWER3, POWER4, POWER5
SP         SP2            Technical    POWER3, POWER4, POWER5
xSeries    IA-32, IA-64   Server       Xeon, AMD
zSeries    ES/9000        Mainframe    RS64

Page 5: Essential Overview

Processor Progression

Processor   Years         Clock Rate      Feature
POWER2      1990 – 1994   20 – 60 MHz     RISC
P2SC        1994 – 1998   60 – 150 MHz    Bandwidth
POWER3      1998 – 2002   200 – 450 MHz   Single Chip
POWER4      2001 – 2005   1 – 1.9 GHz     Dual Core
POWER5      2004 –        1.5 – 1.9 GHz   Multi-Thread

Page 6: Essential Overview

POWER5 Systems

• POWER5 processors
  • Single and Dual processor chips
• Modules
  • Dual Chip Modules (DCM)
  • Multi Chip Modules (MCM)
• Nodes
  • Multiple modules
    • p5-575
    • p5-595
• Cluster
  • Multiple nodes
  • Connected with High Performance Switch (HPS)

Page 7: Essential Overview

Systems (“Nodes”)

Model    Processors   Clock Rate (GHz)   Memory (x 2^30 byte)
p5-595   16-64        1.65, 1.9          2000
p5-590   8-32         1.65, 1.9          1000
p5-575   8            1.5, 1.9           256
p5-570   2-16         1.65, 1.9          512
p5-550   2-4          1.65               64
p5-520   2            1.65               32
p5-510   1, 2         1.65               1 - 32

Page 8: Essential Overview

POWER5 Processor Systems

[Figure: POWER5 system hierarchy. Labels: Processor, Chip, DCM (p5-575), MCM (p5-595), Cluster]

Page 9: Essential Overview

Cluster 1600

[Figure: Cluster 1600, physical view and logical view. Labels: Multi Processor Nodes, Network, Disk System]

Page 10: Essential Overview

Local System Name

• IBM p5-575 nodes
  • 1.9 GHz POWER5 processors
  • Single processor chips
  • 8 processors per node
  • HPS interconnect
• “575” distinction:
  • Dual Chip Module (DCM)
  • 8 DCMs
  • One or two processors per chip
    • Single Core (SC)
    • Dual Core (DC)
• “595” distinction:
  • Multi Chip Module (MCM) construction
  • 8 MCMs

Page 11: Essential Overview

POWER5 Processors

• Multi-processor chip
• High clock rate: Multiple GHz
• Three cache levels
  • Bandwidth
  • Latency hiding
• Shared Memory
  • Large memory size

Page 12: Essential Overview

POWER5 Features

• Private L1 cache
• Shared L2 cache
• Shared L3 cache
• Interleaved memory
• Hardware Prefetch
• Multiple Page Size support

Page 13: Essential Overview

Processor Characteristics

• High frequency clocks
  • Deep pipelines
  • High asymptotic rates
• Superscalar
• Speculative out-of-order instructions
• Up to 8 outstanding cache line misses
• Large number of instructions in flight
• Branch prediction
• Hardware Prefetching

Page 14: Essential Overview

Processor Features

Feature              POWER4                POWER5
Clock                1.0 – 1.9 GHz         1.5 – 1.9 - … GHz
Caches               Three levels          Three levels
L3 Speed             1/3 clock frequency   ½ clock frequency
Virtualization       Up to 32 partitions   Up to 254 partitions
Partitions           Unit processor        Fractional
Power Management     Static                Dynamic
Thread Execution     Single Thread         Multi Threading
Memory Store         Single Buffer         Double Buffer
Renaming Registers   GP: 72, FP: 80        GP: 120, FP: 120

Page 15: Essential Overview

Caches and Memory

                   POWER4                                  POWER5
L1 Cache           Data: 32 kbyte, Instruction: 64 kbyte   Data: 32 kbyte, Instruction: 64 kbyte
                   2-way Assoc., FIFO                      4-way Assoc., LRU
L2 Cache           1.5 Mbyte, 8-way Assoc., FIFO           1.9 Mbyte, 10-way Assoc., LRU
L3 Cache           32 Mbyte, 8-way Assoc., LRU             36 Mbyte, 12-way Assoc., LRU
                   120 Cycles                              ~80 Cycles
Memory Bandwidth   4 Gbyte/s / Chip                        16 Gbyte/s / Chip

Page 16: Essential Overview

POWER4 – POWER5 Comparison

                                     POWER4+   POWER5
Frequency (GHz)                      1.7       1.9
L2 Latency (Cycles)                  12        12
L3 Latency (Cycles)                  120       80
Memory Latency (Cycles)              351       220
Copy Bandwidth, 4 proc. (Gbyte/s)    8         18
Linpack Rate, N=1000 (Gflop/s)       3.9       5.6
SPECint_base2000                     1077      1398
SPECfp_base2000                      1598      2576
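
The latencies in the comparison are given in clock cycles, and the two processors run at different frequencies, so it can help to convert them to wall-clock time. The following is a minimal sketch of that arithmetic using only the figures from the comparison table; the function name `cycles_to_ns` is made up for this illustration.

```python
# Convert cache and memory latencies from clock cycles to nanoseconds.
# At a clock of f GHz there are f cycles per nanosecond, so
# time_ns = cycles / f.

def cycles_to_ns(cycles: float, freq_ghz: float) -> float:
    """Return the duration of `cycles` clock cycles at `freq_ghz`, in ns."""
    return cycles / freq_ghz

# (frequency in GHz, latency in cycles per level), from the comparison table
systems = {
    "POWER4+": (1.7, {"L2": 12, "L3": 120, "Memory": 351}),
    "POWER5": (1.9, {"L2": 12, "L3": 80, "Memory": 220}),
}

for name, (freq_ghz, latencies) in systems.items():
    for level, cycles in latencies.items():
        ns = cycles_to_ns(cycles, freq_ghz)
        print(f"{name:8s} {level:6s} {cycles:4d} cycles = {ns:6.1f} ns")
```

In wall-clock terms the memory latency drops from roughly 206 ns to roughly 116 ns, a larger improvement than the cycle counts alone suggest, because the POWER5 clock is also faster.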

Page 17: Essential Overview

POWER5 Design: Summary

• More gates
  • 170 million -> 260 million
• Enhancements
  • Increased cache associativity
  • Increased number of rename registers
  • Reduced L3 cache and memory latency
• New features
  • Simultaneous Multi Threading
  • Dynamic power management

Page 18: Essential Overview

Processor Systems (Nodes)

• Multiple processors
• Multiple modules
• Various construction formats
  • Multi Chip Modules
  • Dual Chip Modules
• Shared memory

Page 19: Essential Overview

Multi Chip and Dual Chip Modules

• Multi Chip Module (MCM): p5-590, p5-595
• Dual Chip Module (DCM): p5-570, p5-575

[Figure: module diagrams. Labels: Chip, POWER5 Processor]

Page 20: Essential Overview

Dual Chip Module

• Each Module:
  • 1 processor chip
  • 1 L3 cache
  • 1 Memory card
• Each Processor Chip
  • 2 processors
    • L1 caches
    • Registers
    • Functional units
  • 1 L2 cache
  • 1 path to memory

[Figure: Dual Chip Module diagram. Labels: 36 Mbyte L3, Memory]

Page 21: Essential Overview

Multi Chip Module

• Each Module:
  • 4 processor chips
  • 4 L3 cache chips
  • 2 Memory cards
• Each Processor Chip
  • 2 processors
    • L1 caches
    • Registers
    • Functional units
  • 1 L2 cache
  • 1 path to memory

[Figure: Multi Chip Module diagram. Labels: Memory (four blocks)]

Page 22: Essential Overview

POWER5 Multi Chip Module

• Four POWER5 chips
• Four L3 cache chips
• 95 mm x 95 mm
• 4,491 signal I/Os
• 89 layers of metal

Page 23: Essential Overview

POWER5 Dual Chip Module

• One POWER5 chip
  • Single or Dual Core
• One L3 cache chip

Page 24: Essential Overview

Modifications to POWER4 System Structure

[Figure: POWER4 and POWER5 system structure diagrams. Labels: P (processors), L2, L3, Fab Ctl, Mem Ctl, Memory]

Page 25: Essential Overview

Switch Technology

• Internal network
  • In lieu of GigEthernet, Myrinet, Quadrics, etc.
• Fourth generation
  • HPS Switch (POWER2 generation)
  • SP Switch (POWER2 -> POWER3)
  • SP Switch 2 (POWER3 -> POWER4)
  • HPS (POWER4 -> POWER5)
• Multiple links per node
  • Match number of links to number of processors

Page 26: Essential Overview

High Performance Switch (HPS)

• Also known as “Federation”
• Follow on to SP Switch2
  • Also known as “Colony”
• Specifications:
  • 2 Gbyte/s (bidirectional)
  • 5 microsecond latency
• Configuration:
  • Up to four adaptors per node
    • 2 links per adaptor
    • 16 Gbyte/s per node
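
The per-node figure follows from multiplying out the configuration limits listed above. A small sketch of that arithmetic, assuming the 2 Gbyte/s specification applies to each link (the function name is made up for this illustration):

```python
# Aggregate switch bandwidth per node = adaptors x links per adaptor x
# bandwidth per link. Assumes the 2 Gbyte/s bidirectional figure is per link.

def node_bandwidth_gbyte_s(adaptors: int, links_per_adaptor: int,
                           link_gbyte_s: float) -> float:
    """Return the aggregate switch bandwidth of one node in Gbyte/s."""
    return adaptors * links_per_adaptor * link_gbyte_s

# Maximum configuration from the slide: 4 adaptors, 2 links each, 2 Gbyte/s
print(node_bandwidth_gbyte_s(4, 2, 2.0))  # prints 16.0
```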

Page 27: Essential Overview

HPS Specifications

              Latency       Bandwidth, single   Bandwidth, multiple
              [microsec.]   [Mbyte/s]           [Mbyte/s]
SP Switch 2   15            350                 550
HPS           5             1800                1930
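
One common way to read latency and bandwidth figures like these is the simple linear model: transfer time = latency + message size / bandwidth. The sketch below applies that model to the single-stream numbers in the table. The model itself, the function names, and the assumption that 1 Mbyte means 10^6 bytes are not from the slides; real transfer times also depend on protocol and contention.

```python
# Linear cost model for a point-to-point message:
#   time = latency + size / bandwidth
# With bandwidth in Mbyte/s (taken as 10**6 bytes/s) and size in bytes,
# size / bandwidth comes out directly in microseconds.

def transfer_time_us(message_bytes: float, latency_us: float,
                     bandwidth_mbyte_s: float) -> float:
    """Estimated time in microseconds to send one message."""
    return latency_us + message_bytes / bandwidth_mbyte_s

def half_performance_bytes(latency_us: float, bandwidth_mbyte_s: float) -> float:
    """Message size at which latency and transmission time are equal."""
    return latency_us * bandwidth_mbyte_s

# (latency in microseconds, single-stream bandwidth in Mbyte/s) from the table
switches = {"SP Switch 2": (15.0, 350.0), "HPS": (5.0, 1800.0)}

for name, (latency, bandwidth) in switches.items():
    for size in (1_000, 1_000_000):
        t = transfer_time_us(size, latency, bandwidth)
        print(f"{name:12s} {size:9d} bytes: {t:8.1f} microsec.")
    n_half = half_performance_bytes(latency, bandwidth)
    print(f"{name:12s} half-performance size: {n_half:.0f} bytes")
```

Under this model, messages much smaller than the half-performance size are dominated by latency, so the lower HPS latency matters most for short messages and the higher bandwidth for long ones.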

Page 28: Essential Overview

Software Overview

• Operating System
  • AIX
• Compilers
  • C
  • C++
  • Fortran
• Batch Queue
  • LoadLeveler (IBM)
  • LSF (Platform)
  • PBS
  • Gridware

Page 29: Essential Overview

AIX

• Current Version: AIX 5.3
• Processors:
  • POWER3
  • POWER4
  • POWER5
• Linux Affinity
• Logical PARtitions (LPAR) Nodes
  • Operating system
  • Memory
  • Network connections
• Kernel Address Size:
  • 64-bit
  • 32-bit

Page 30: Essential Overview

Linux on POWER

• Native Linux, SuSE7, SuSE8
• RPMs and package managers
• Cluster Systems Manager
• 64-bit kernel
• 32/64-bit applications support (SuSE8)

Compiler   User Name
C          xlc
C++        xlC
Fortran    xlf

Page 31: Essential Overview

Compilers

C and C++
• VisualAge C and C++ Professional for AIX
  • Versions 6, 7, 8
  • ANSI C
  • C++
• Compiler names:
  • xlc
  • xlC

Fortran
• XL Fortran for AIX
  • Versions 8, 9, 10
  • Fortran 77
  • Fortran 90
• Compiler names:
  • xlf77
  • xlf90

Page 32: Essential Overview

Compiler Names

Compiler      User Name
Fortran 77    xlf77
Fortran 90    xlf90
C             xlc
C++           xlC
MPI compile   mpxlf, mpcc
Reentrant     xlf_r, xlc_r

AIX uses different compiler names to perform some tasks that are handled by compiler flags on most other systems.

Page 33: Essential Overview

Compiler Usage

Language      Command   Feature       Extension
ANSI C        xlc       ANSI          .c
              xlc_r     Thread safe   .c
Extended C    cc        Pre-ANSI      .c
MPI, C        mpxlc     MPI           .c
C++           xlC                     .C .cc .cpp
              xlC_r     Thread safe   .C .cc .cpp
Fortran 77    xlf                     .f
              xlf_r     Thread safe   .f
Fortran 90    xlf90                   .f
              xlf90_r   Thread safe   .f
MPI Fortran   mpxlf     MPI           .f
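
The naming pattern in the table, a base command per language plus an `_r` suffix for the thread-safe variant, can be captured in a small lookup. This is only an illustrative sketch for build scripts: the dictionary keys and the function name `compiler_command` are made up here, and the command names are copied from the table rather than checked against an installed compiler.

```python
# Map a language to the compiler command listed in the Compiler Usage table.
# The "_r" suffix selects the thread-safe (reentrant) variant.

BASE_COMMANDS = {
    "c": "xlc",            # ANSI C
    "c++": "xlC",          # C++
    "fortran77": "xlf",    # Fortran 77
    "fortran90": "xlf90",  # Fortran 90
}

def compiler_command(language: str, thread_safe: bool = False) -> str:
    """Return the compiler command name for `language`.

    Raises KeyError for a language that is not in the table.
    """
    base = BASE_COMMANDS[language.lower()]
    return (base + "_r") if thread_safe else base

print(compiler_command("C"))                # xlc
print(compiler_command("C++", True))        # xlC_r
print(compiler_command("fortran90", True))  # xlf90_r
```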

Page 34: Essential Overview

User Limits

• Set by the system administrator
• Ulimit:
  • C or K shell built-in
  • Sets or reports resource limits
  • Limits are defined in /etc/security/limits
    • Sizes are in 512 byte blocks
    • Times are in seconds
  • $ ulimit -a

Page 35: Essential Overview

Ulimit Defaults

Limit       Definition              Default Value    Typical Value
fsize       File Size               2097151          Unlimited (-1)
core        Core File Size          2097151          Unlimited (-1)
cpu         Per Process limit       -1 (unlimited)   Unlimited (-1)
data        Data Segment Size       262144           Unlimited (-1)
stack       Stack Segment Size      65536            *Unlimited (-1)
No. files   File Descriptor Limit   2000             2000

* 64-bit address mode
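
Because the size limits are expressed in 512 byte blocks, the default values are easier to judge once converted to bytes. A minimal sketch of the conversion, using the default values from the table; the function name is made up, and -1 is treated as "unlimited" as in the table.

```python
# Convert size limits from 512-byte blocks to bytes.
BLOCK_BYTES = 512

def blocks_to_bytes(blocks: int):
    """Return the limit in bytes, or None when the limit is unlimited (-1)."""
    if blocks == -1:
        return None
    return blocks * BLOCK_BYTES

# Default size limits from the table, in 512-byte blocks
defaults = {"fsize": 2097151, "core": 2097151, "data": 262144, "stack": 65536}

for name, blocks in defaults.items():
    size = blocks_to_bytes(blocks)
    print(f"{name:6s} {blocks:8d} blocks = {size / 2**20:7.1f} x 2^20 byte")
```

The defaults therefore correspond to just under 2^30 bytes for files and core dumps, 128 x 2^20 bytes for the data segment, and 32 x 2^20 bytes for the stack, which is why larger jobs typically need the limits raised.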

Page 36: Essential Overview

Other Defaults

• Thread control
  • /etc/environment
    • AIXTHREAD_SCOPE=S
    • AIXTHREAD_MNRATIO=1:1
    • AIXTHREAD_COND_DEBUG=OFF
    • AIXTHREAD_GUARDPAGES=4
    • AIXTHREAD_MUTEX_DEBUG=OFF
    • AIXTHREAD_RWLOCK_DEBUG=OFF

Page 37: Essential Overview

Batch Queuing

• Compile on any AIX node
  • Use -qarch=pwr5
• Submit job with available batch utility
• Use appropriate queue name
• Available queuing systems:
  • LoadLeveler
  • PBS
  • Gridware
  • LSF

Page 38: Essential Overview

Cluster Layout

[Figure: cluster layout. Labels: Compile And Submit Node, Node 0, Node 1, Node 2, Network]

Page 39: Essential Overview

Documentation

• Software:
  • www.software.ibm.com
    • Products A-Z
    • X -> xl C, xl C/C++, xl Fortran
  • www.servers.ibm.com/aix
• Compilers
  • /usr/vac/doc
  • /usr/vacpp/doc
  • /usr/lpp/xlf/doc
• Redbooks:
  • www.redbooks.ibm.com/
    • IBM eServer p5 590 and 595 System Handbook

Page 40: Essential Overview

Documentation

• AIX Commands Reference
  • AIX command:
    • /usr/sbin/infocenter
    • /opt/ibm_help/help_start.sh
  • http://www.unet.univie.ac.at/aix/aixgen/wbinfnav/aixcmdsrefbooks.htm
• Google search: “AIX Commands Reference”

Page 41: Essential Overview

Documentation Library

Google Search: AIX 5L documentation Library
http://publibn.boulder.ibm.com/cgi-bin/ds_rslt

Page 42: Essential Overview

Summary: Architecture

• System architecture
  • Processors
  • Nodes
  • Cluster
• Processors
  • POWER5
  • Three levels of cache
• Nodes:
  • Eight processor p5-575
• Cluster:
  • 14 p5-575 nodes
  • HPS interconnect
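
The cluster size in the summary can be turned into a processor count and a rough peak floating-point rate. The sketch below uses the node count, processors per node, and 1.9 GHz clock from the slides. The figure of 4 floating-point operations per cycle per processor (two fused multiply-add units) is an assumption that does not appear in the slides, so the peak rate is an estimate only.

```python
# Back-of-the-envelope totals for the cluster described in the summary.
NODES = 14                 # p5-575 nodes (from the summary)
PROCESSORS_PER_NODE = 8    # from the summary
CLOCK_GHZ = 1.9            # from the local system description
FLOPS_PER_CYCLE = 4        # ASSUMPTION: 2 FMA units x 2 flops each

total_processors = NODES * PROCESSORS_PER_NODE
peak_per_processor = CLOCK_GHZ * FLOPS_PER_CYCLE      # Gflop/s
peak_cluster = total_processors * peak_per_processor  # Gflop/s

print(f"Processors in cluster:  {total_processors}")
print(f"Peak per processor:     {peak_per_processor:.1f} Gflop/s")
print(f"Peak for whole cluster: {peak_cluster:.1f} Gflop/s")
```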

