Dezső Sima
Fall semester 2007
(Ver. 2.1) Dezső Sima, 2007
Multicore Processors (5)
10.3 IBM’s MC processors
10.3.1 POWER line
• POWER4 180 nm 10/2001
• POWER4+ 130 nm 11/2002
• POWER5 130 nm 5/2004
• POWER5+ 90 nm 10/2005
• POWER6 65 nm 2007
Figure: The evolution of IBM’s major RISC lines (time axis: 1988 to 2007)
Commercial computing (OS/400): AS/400 line, later e-Server iSeries
Technical computing (AIX): RS/6000 line, later e-Server pSeries
Architectures shown: IMPI/48 (scalar CISC), POWER/32, PowerPC/32, PowerPC/64, PowerPC AS/64, PowerPC/64 ext., PowerPC AS/64 ext.
Processors shown: A10, A30, A50, Pulsar, Northstar, SStar, 601, 604, 604e, POWER, POWER2, P2SC, PSC, Power3, Power3-II, POWER4, POWER4+, POWER5, POWER5+, POWER6 (annotated as 1st to 3rd generation superscalars)
Legend: upwards binary compatible extension; transition; derived from
10.3.1 Evolution of IBM’s major RISC lines
Figure: POWER4 chip logical view [3.6]
10.3.1 POWER4 (1)
Figure labels: Built-In Self-Test, Service Processor, Power-On Reset, Core Interface Unit (crossbar), Non-Cacheable Unit, Multi-Chip Module
Figure: Logical view of the L3 controller [3.5]
10.3.1 POWER4 (2)
Figure: The memory controller of the POWER4 [3.5]
10.3.1 POWER4 (3)
Figure: I/O controller of the POWER4 [3.5]
Fabric Controller
10.3.1 POWER4 (4)
Figure: POWER4 chip [3.11]
10.3.1 POWER4 (5)
10.3.1 POWER4 (6)
Table: Main features of IBM’s dual-core POWER line
POWER line | POWER4
Dual/Quad-Core | DC
Introduced | 10/2001
Technology | 180 nm
Die size | 412 mm²
Nr. of transistors | 174 million
fc [GHz] | 1.3
TDP [W] | 115/125
Packaging | SCM(1)/MCM(2)
Dual threaded |
Power management |
L2 size/allocation | 1.44 MB/shared
L2 implementation | On-chip
L3 size | 32 MB
L3 impl. | Tags on-chip, data off-chip
Mem. contr. | Off-chip
(1) SCM: Single Chip Module (2) MCM: Multi Chip Module (3) DCM: Dual Chip Module
(4) DCM: Dual Core Module (5) QCM: Quad Core Module (6) DPM: Dynamic Power Management
10.3.1 POWER4+ (1)
Figure: New features of the POWER4+ [3.3]
10.3.1 POWER4+ (2)
Table: Main features of IBM’s dual-core POWER line
POWER line | POWER4 | POWER4+
Dual/Quad-Core | DC | DC
Introduced | 10/2001 | 11/2002
Technology | 180 nm | 130 nm
Die size | 412 mm² | 380 mm²
Nr. of transistors | 174 million | 184 million
fc [GHz] | 1.3 | 1.7
TDP [W] | 115/125 | 70
Packaging | SCM(1)/MCM(2) | SCM(1)/MCM(2)
Dual threaded | |
Power management | |
L2 size/allocation | 1.44 MB/shared | 1.5 MB/shared
L2 implementation | On-chip | On-chip
L3 size | 32 MB | 32 MB
L3 impl. | Tags on-chip, data off-chip |
Mem. contr. | Off-chip | On-chip
(1) SCM: Single Chip Module (2) MCM: Multi Chip Module (3) DCM: Dual Chip Module
(4) DCM: Dual Core Module (5) QCM: Quad Core Module (6) DPM: Dynamic Power Management
Figure 5.14: Contrasting POWER4 and POWER5 system structures [3.1]
10.3.1 POWER5 (1)
Figure: Block diagram of the POWER5 (1) [3.1]
10.3.1 POWER5 (2)
Figure: Block diagram of the POWER5 (2) [3.12]
10.3.1 POWER5 (3)
10.3.1 POWER5 (4)
Figure: Floorplan of the POWER5 [3.13]
POWER4: 180 nm, 412 mm²
POWER5: 130 nm, 389 mm² (~3 % enlarged)
10.3.1 POWER5 (6)
Figure: Contrasting the floor plans of the POWER4 and POWER5 dies [3.11], [3.13]
Figure: Packaging alternatives of the POWER4/5 processors
Source: Partridge R. and Ghatpande S., "IBM Introduces POWER5+ and Quad-Core Modules in System p5," Tech Trends Monthly, Nov./Dec. 2005
POWER5+ Dual-Core Module
10.3.1 POWER5 (7)
POWER4 MCM Photo 32-way System Showing 4 MCMs and L3 Cache
Figure: Quad–Chip POWER4 module (MCM) and a 32-way POWER4 system [3.7]
10.3.1 POWER5 (8)
Figure: Interpretation of Dual-Chip Modules (DCMs) and Multi-Chip Modules (MCMs) of the POWER5 [3.7]
10.3.1 POWER5 (9)
Figure: Photos of Dual-Chip Modules (DCMs) and Multi-Chip Modules (MCM) of the POWER5 [3.7]
10.3.1 POWER5 (10)
Figure: The Multi-chip module of the POWER5 [3.10]
10.3.1 POWER5 (11)
10.3.1 POWER5 (12)
Table: Main features of IBM’s dual-core POWER line
POWER line | POWER4 | POWER4+ | POWER5
Dual/Quad-Core | DC | DC | DC
Introduced | 10/2001 | 11/2002 | 5/2004
Technology | 180 nm | 130 nm | 130 nm
Die size | 412 mm² | 380 mm² | 389 mm²
Nr. of transistors | 174 million | 184 million | 276 million
fc [GHz] | 1.3 | 1.7 | 1.65/1.9
TDP [W] | 115/125 | 70 | 80 (est)
Packaging | SCM(1)/MCM(2) | SCM(1)/MCM(2) | DCM(3)/MCM(2)
Dual threaded | | |
Power management | | | DPM(6)
L2 size/allocation | 1.44 MB/shared | 1.5 MB/shared | 1.9 MB/shared
L2 implementation | On-chip | On-chip | On-chip
L3 size | 32 MB | 32 MB | 36 MB
L3 impl. | Tags on-chip, data off-chip | | Tags on-chip
Mem. contr. | Off-chip | On-chip | On-chip
(1) SCM: Single Chip Module (2) MCM: Multi Chip Module (3) DCM: Dual Chip Module
(4) DCM: Dual Core Module (5) QCM: Quad Core Module (6) DPM: Dynamic Power Management
Source: Vetter S. et al., "IBM System p5 Quad-Core Module Based on POWER5+ Technology," Redbooks paper, IBM Corp. 2006, http://www.redbooks.ibm.com/redpapers/pdfs/redp4150.pdf
Figure: Block diagram of the POWER5+
10.3.1 POWER5+ (1)
Figure: Dual-Core Modules (DCMs) and Quad-Core Modules (QCM) of the POWER5+ [3.14]
10.3.1 POWER5+ (2)
10.3.1 POWER5+ (3)
Table: Main features of IBM’s dual-core POWER line
POWER line | POWER4 | POWER4+ | POWER5 | POWER5+
Dual/Quad-Core | DC | DC | DC | DC
Introduced | 10/2001 | 11/2002 | 5/2004 | 10/2005
Technology | 180 nm | 130 nm | 130 nm | 90 nm
Die size | 412 mm² | 380 mm² | 389 mm² | 230 mm²
Nr. of transistors | 174 million | 184 million | 276 million | 276 million
fc [GHz] | 1.3 | 1.7 | 1.65/1.9 | 1.92
TDP [W] | 115/125 | 70 | 80 (est) | 70
Packaging | SCM(1)/MCM(2) | SCM(1)/MCM(2) | DCM(3)/MCM(2) | DCM(4)/QCM(5)
Dual threaded | | | |
Power management | | | DPM(6) | DPM(6)
L2 size/allocation | 1.44 MB/shared | 1.5 MB/shared | 1.9 MB/shared | 1.9 MB/shared
L2 implementation | On-chip | On-chip | On-chip | On-chip
L3 size | 32 MB | 32 MB | 36 MB | 36 MB
L3 impl. | Tags on-chip, data off-chip | | Tags on-chip | Tags on-chip
Mem. contr. | Off-chip | On-chip | On-chip | On-chip
(1) SCM: Single Chip Module (2) MCM: Multi Chip Module (3) DCM: Dual Chip Module
(4) DCM: Dual Core Module (5) QCM: Quad Core Module (6) DPM: Dynamic Power Management
POWER6 POWER5+
Figure: Contrasting the block diagrams of the POWER5 and POWER6 processors [3.15]
Hardware support of decimal arithmetic
10.3.1 POWER6 (1)
10.3.1 POWER6 (2)
Table: Main features of IBM’s dual-core POWER line
POWER line | POWER4 | POWER4+ | POWER5 | POWER5+ | POWER6
Dual/Quad-Core | DC | DC | DC | DC | DC
Introduced | 10/2001 | 11/2002 | 5/2004 | 10/2005 | 2007
Technology | 180 nm | 130 nm | 130 nm | 90 nm | 65 nm
Die size | 412 mm² | 380 mm² | 389 mm² | 230 mm² | 341 mm²
Nr. of transistors | 174 million | 184 million | 276 million | 276 million | 750 million
fc [GHz] | 1.3 | 1.7 | 1.65/1.9 | 1.92 | 4-5
TDP [W] | 115/125 | 70 | 80 (est) | 70 | ~100
Packaging | SCM(1)/MCM(2) | SCM(1)/MCM(2) | DCM(3)/MCM(2) | DCM(4)/QCM(5) | n.a.
Dual threaded | | | | |
Power management | | | DPM(6) | DPM(6) | n.a.
L2 size/allocation | 1.44 MB/shared | 1.5 MB/shared | 1.9 MB/shared | 1.9 MB/shared | 2*4 MB/private
L2 implementation | On-chip | On-chip | On-chip | On-chip | On-chip
L3 size | 32 MB | 32 MB | 36 MB | 36 MB | 32 MB (64 MB?)
L3 impl. | Tags on-chip, data off-chip | | Tags on-chip | Tags on-chip | Tags on-chip
Mem. contr. | Off-chip | On-chip | On-chip | On-chip |
(1) SCM: Single Chip Module (2) MCM: Multi Chip Module (3) DCM: Dual Chip Module
(4) DCM: Dual Core Module (5) QCM: Quad Core Module (6) DPM: Dynamic Power Management
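The transistor counts and die sizes in the table allow a rough comparison of transistor density across the POWER generations. The following is a minimal Python sketch of that arithmetic, assuming the table values as given; the names POWER_LINE, density and densities are illustrative and not part of the original slides.

```python
# Transistor counts (millions) and die sizes (mm^2), copied from the table.
# The dictionary layout and all names here are illustrative assumptions.
POWER_LINE = {
    "POWER4": (174, 412),
    "POWER4+": (184, 380),
    "POWER5": (276, 389),
    "POWER5+": (276, 230),
    "POWER6": (750, 341),
}


def density(mtrs: float, area_mm2: float) -> float:
    """Return transistor density in million transistors per mm^2."""
    return mtrs / area_mm2


def densities(chips: dict) -> dict:
    """Map each processor name to its transistor density."""
    return {name: density(mtrs, area) for name, (mtrs, area) in chips.items()}


if __name__ == "__main__":
    for name, d in densities(POWER_LINE).items():
        print(f"{name:8s} {d:.2f} million transistors per mm^2")
```

Under these figures the density grows with every process shrink, and the POWER5+ reaches a higher density than the POWER5 with the same transistor count purely through the move from 130 nm to 90 nm.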
10.3 IBM’s MC processors
• Cell BE 90 nm 2/2006
10.3.2 Cell BE
Figure: The history and development cost of the Cell BE [3.17], [3.22]
10.3.2 Cell BE (1)
AUC: Atomic Update Cache
BIC: Bus Interface Contr.
EIB: Element Interface Bus
LS: Local Store of 256 KB
MFC: Memory Flow Controller
MIC: Memory Interface Contr.
PPE: Power Processing Element
PXU: POWER Execution Unit
SMF: Synergistic Memory Flow Unit
SPU: Synergistic Processor Unit
SXU: Synergistic Execution Unit
XDR: Rambus DRAM
Figure: Block diagram of the Cell BE [3.19]
10.3.2 Cell BE (2)
Design parameters of the Cell BE:
• PPE: dual-threaded
• > 200 GFLOPS (SP)
• > 20 GFLOPS (DP)
• > 25 GB/s memory BW
• > 75 GB/s I/O BW
• > 300 GB/s EIB BW
• fc > 4 GHz (lab)
Figure: Main design parameters of the Cell BE [3.28]
10.3.2 Cell BE (3)
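The single-precision figure can be cross-checked with simple arithmetic. The sketch below is a minimal Python illustration under stated assumptions that are not on the slide itself: 8 SPEs, four 32-bit lanes per 128-bit SIMD operation, one fused multiply-add (counted as 2 FLOPs) per lane per cycle, and a clock of 3.2 GHz. The PPE contribution is ignored, and the function name peak_gflops is hypothetical.

```python
def peak_gflops(units: int, simd_lanes: int, flops_per_lane: int, clock_ghz: float) -> float:
    """Peak throughput in GFLOPS = units * lanes * FLOPs per lane per cycle * GHz."""
    return units * simd_lanes * flops_per_lane * clock_ghz


# Assumed parameters (see the text above); none of these names come from the slides.
SPE_COUNT = 8          # SPEs per Cell BE
SP_LANES = 128 // 32   # 32-bit lanes in a 128-bit SIMD register
FMA_FLOPS = 2          # a fused multiply-add counts as two FLOPs
CLOCK_GHZ = 3.2        # assumed core clock

sp_peak = peak_gflops(SPE_COUNT, SP_LANES, FMA_FLOPS, CLOCK_GHZ)

if __name__ == "__main__":
    print(f"Estimated SPE single-precision peak: {sp_peak:.1f} GFLOPS")
```

With these assumptions the estimate lands slightly above 200 GFLOPS, which is consistent with the "> 200 GFLOPS (SP)" entry in the list.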
Figure: Cell SPE architecture [3.16]
10.3.2 Cell BE (4)
Figure: Block diagram of the SPE [3.19]
10.3.2 Cell BE (5)
Figure: Pipeline stages of the Cell BE [3.19]
10.3.2 Cell BE (6)
Figure: Floor plan of a single SPE [3.19]
10.3.2 Cell BE (7)
Principle of operation of the Element Interface Bus (EIB) [3.23]
10.3.2 Cell BE (8)
Figure: The Element Interface Bus (EIB) [3.19]
10.3.2 Cell BE (9)
Figure: The Synergistic Memory Flow unit (SMF) [3.19]
10.3.2 Cell BE (10)
Figure: PPE block diagram [3.28]
Figure: Floor plan of the Cell BE processor [3.19]
235 mm2
241 mtrs
10.3.2 Cell BE (11)
10.3.2 Cell BE (12)
Table: Main features of IBM’s Cell BE
Series | Cell BE
Implementation | Heterogeneous, 1xPPE, 8*SPE
Architecture | PowerPC 2.02
Cores | PPE: 64-bit RISC; SPE: dual-issue 32-bit SIMD with 128-bit capability
Introduction | 9/2006 (in the QS20 BladeCenter)
Technology | 90 nm
Die size | 221 mm²
Nr. of transistors | 234 million
fc [GHz] | 3.0/3.2
L2 | PPE: 512 KB; SPE: 256 KB Local Store (128*128 bit)
Memory bandwidth | 25 GB/s
TDP [W] | 95 W @ 3 GHz
Multithreading | PPE: 2-way; SPE: -
I/O bandwidth | Up to 75 GB/s
Interconnection network | Ring based
Memory controller | On-chip
L3 | -
Source: Brochard L., "A Cell History," Cell Workshop, April 2006, http://www.irisa.fr/orap/Constructeurs/Cell/Cell%20Short%20Intro%20Luigi.pdf
Figure: Cell BE Blade Roadmap
10.3.2 Cell BE (13)
Source: Hofstee H. P., "Real-time Supercomputing and Technology for Games and Entertainment," 2006, http://www.cercs.gatech.edu/docs/SC06_Cell_111606.pdf
Figure: Roadmap of the Cell BE
10.3.2 Cell BE (14)
10.3 Literature (1)
POWER4, POWER4+
[3.1] Barney B., "IBM POWER Systems Overview," Livermore Computing, 2006, http://www.llnl.gov/computing/tutorials/ibm_sp/
[3.2] DeMone P., "Sizing Up the Super Heavyweights," Real World Technologies, Sept. 2004, http://h21007.www2.hp.com/dspp/files/unprotected/Itanium/sizingsuperheavys.pdf
[3.3] Grassl C., "New IBM Components for HPCx," Dec. 2003, http://www.hpcx.ac.uk/about/events/annual2003/Grassl.pdf
[3.4] Krewell K., "IBM’s POWER4 Unveiling Continues," Microprocessor Report, Nov. 20, 2000, pp. 1-4
[3.5] Tendler J. M., Dodson S., Fields S., Le H., Sinharoy B., "Power4 System Microarchitecture," IBM Server, Technical White Paper, Oct. 2001, http://www-03.ibm.coom/servers/eserver/pseries/hardware/whitepapers/power4.pdf
POWER5, POWER5+
[3.6] Tendler J. M., Dodson S., Fields S., Le H., Sinharoy B., "Power4 System Microarchitecture," IBM J. Res. & Dev., Vol. 46, No. 1, Jan. 2002, pp. 5-25, http://www.research.ibm.com/journal/rd/461/tendler.pdf
[3.7] Barney B., "IBM POWER Systems Overview," Livermore Computing, 2006, http://www.llnl.gov/computing/tutorials/ibm_sp/
[3.8] DeMone P., "Sizing Up the Super Heavyweights," Real World Technologies, Sept. 2004, http://h21007.www2.hp.com/dspp/files/unprotected/Itanium/sizingsuperheavys.pdf
[3.9] Grassl C., "New IBM Components for HPCx," Dec. 2003, http://www.hpcx.ac.uk/about/events/annual2003/Grassl.pdf
[3.10] Kalla R., "IBM’s POWER5 Microprocessor Design and Methodology," 2003, www-csl.csres.utexas.edu/users/billmark/teach/cs352-05-spring/lectures/Lecture22-RonKallaIBM.pdf
[3.11] Kalla R., Sinharoy B., Tendler J., "Simultaneous Multi-threading Implementation in Power5 – IBM’s Next Generation POWER Microprocessor," 2003, http://www.hotchips.org/archives/hc15/3_Tue/11.ibm.pdf
[3.12] Krewell K., "POWER5 Tops on Bandwidth," Microprocessor Report, Dec. 2003, http://studies.ac.upc.edu/ETSETB/SEGPAR/microprocessors/power5%20(2)%20(mpr).pdf
POWER5, POWER5+ (cont.)
[3.13] Sinharoy B., Kalla R. N., Tendler J. M., Eickenmeyer R. J., Joyner J. B., "POWER5 system microarchitecture," IBM J. R&D, Vol. 49, No. 4/5, 2005, pp. 505-521
[3.14] Vetter S. et al., "IBM System p5 Quad-Core Module Based on POWER5+ Technology," Redbooks paper, IBM Corp. 2006, http://www.redbooks.ibm.com/redpapers/pdfs/redp4150.pdf
POWER6
[3.15] Kanter D., "IBM Previews the Power6," Oct. 2006
10.3 Literature (2)
Cell BE
[3.16] Blachford N., "Cell Architecture Explained Version 2," http://www.blachford.info/computer/Cell/Cell1_v2.html
[3.17] Brochard L., "A Cell History," Cell Workshop, April 2006, http://www.irisa.fr/orap/Constructeurs/Cell/Cell%20Short%20Intro%20Luigi.pdf
[3.18] Day M. and Hofstee P., "Hardware and Software Architectures for the Cell Broadband Engine processor," CODES, Sept. 2006, http://www.casesconference.org/cases2005/pdf/Cell-tutorial.pdf
[3.19] Gschwind M., "Chip Multiprocessing and the Cell BE," ACM Computing Frontiers, 2006, http://beatys1.mscd.edu/compfront//2006/cf06-gschwind.pdf
10.3 Literature (3)
Cell BE (cont.)
[3.20] Gschwind M., Hofstee H. P., Flachs B. K., Hopkins M., Watanabe Y., Yamazaki T., "Synergistic Processing in Cell's Multicore Architecture," IEEE Micro, Vol. 26, No. 2, 2006, pp. 10-24
[3.21] Hofstee H. P., "Real-time Supercomputing and Technology for Games and Entertainment," 2006, http://www.cercs.gatech.edu/docs/SC06_Cell_111606.pdf
[3.22] Hofstee H. P., "Cell today and tomorrow," 2005, http://www.stanford.edu/class/ee380/Abstracts/Cell_060222.pdf
[3.23] Keable C., "And we also have hardware...," 17th Machine Evaluation Workshop, Dec. 2006, http://www.cse.clrc.ac.uk/disco/mew17/talks/Keable_IBM_MEW17.pdf
[3.24] Krolak D., "Unleashing the Cell Broadband Engine Processor," MPR Fall Proc. Forum, Nov. 2005, http://www-128.ibm.com/developerworks/power/library/pa-fpfeib/?ca=dgr-lnxwCellConnects
[3.25] Krewell K., "Cell Moves Into The Limelight," Microprocessor Report, Feb. 14, 2005, pp. 1-9
[3.26] Solie D., "Technology Trends Presentation," Power Symposium, Aug. 2006, http://www-03.ibm.com/procurement/proweb.nsf/objectdocswebview/ file14+-+darryl+solie+-+ibm+power+symposium+presentation/$file/ 14+-+darryl+solie-ibm-power+symposium+presentation+v2.pdf
[3.27] -, "Cell Broadband Engine processor – based systems," White Paper, IBM Corp., 2006
[3.28] -, "Cell Architecture," Course Code L1T1H1-10, 2006, http://www.power.org/resources/devcorner/cellcorner/CellTraining_Track1/CourseCode_L1T1H1-10_ CellArchitecture.pdf