
  • High Performance Processor Architecture

    Neeraj Goel

    2004csz8035

    Embedded System Group

    Dept. of Computer Science and Engineering

    Indian Institute of Technology Delhi

    http://embedded.cse.iitd.ernet.in/

    HU810 Seminar

  • Outline

    Introduction

    History and Future prediction

    Pentium 4 features

    Pipelining

    Superscalar features

    Hyper-Threading

    Conclusion and future


  • Moore’s Law

    Intel Microprocessors (source: www.intel.com)


  • Intel’s Processors: Past and Current

    Processor                        Year of Introduction   Transistors
    8008                             1972                   2,500
    8080                             1974                   5,000
    8086                             1978                   29,000
    286                              1982                   120,000
    Intel386 processor               1985                   275,000
    Intel486 processor               1989                   1,180,000
    Intel Pentium processor          1993                   3,100,000
    Intel Pentium II processor       1997                   7,500,000
    Intel Pentium III processor      1999                   24,000,000
    Intel Pentium 4 processor        2000                   42,000,000
    Intel Itanium processor          2002                   220,000,000
    Intel Itanium 2 processor        2003                   410,000,000
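
    The transistor counts listed above can be turned into an average doubling time, which is how Moore's Law is usually stated. The sketch below is only a rough check that uses the first and last entries of the list; the function name `doubling_time_years` is made up for this illustration.

```python
import math

# (year, transistor count) pairs taken from the list above
first = (1972, 2_500)          # 8008
last = (2003, 410_000_000)     # Itanium 2

def doubling_time_years(start, end):
    """Average number of years per doubling between two (year, count) points."""
    (y0, n0), (y1, n1) = start, end
    doublings = math.log2(n1 / n0)   # how many times the count doubled
    return (y1 - y0) / doublings

print(round(doubling_time_years(first, last), 2))
```

    For these two data points the result comes out at roughly 1.8 years per doubling.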

  • How to increase performance

    Pipelining
      Breaking a large system into a number of stages

    Instruction-level parallelism
      Software code is written serially
      Independent instructions can be executed in parallel
      A large number of functional units is required

    Thread-level parallelism
      Applications are written with threads
      The operating system can have threads
      Different applications run on different threads


  • How Pentium achieves high performance

    Rapid execution, more pipeline stages

    Out-of-order execution

    Speculative execution

    Hyper-Threading

    Trace cache

    Store to load forwarding enhancements

  • Pipelining

    The concept of splitting a job into sub-processes in which the output of one sub-process feeds into the next.

    A mechanical example of a pipeline is a washer/dryer system for clothing.

    More stages mean higher throughput but also higher latency

    Issue: All stages should have nearly equal delay; otherwise the slowest stage determines the clock cycle

    Example stages: Fetch -> Decode -> Execute -> Write-back
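
    The throughput and latency trade-off, and the effect of one slow stage, can be illustrated with a small calculation. This is a minimal sketch of an ideal pipeline (no stalls, no latch overhead); the stage delays and the function name `pipeline_timing` are assumptions chosen for the example, not measurements.

```python
def pipeline_timing(stage_delays_ns, n_instructions):
    """Compare unpipelined and pipelined execution of n instructions."""
    k = len(stage_delays_ns)
    clock_ns = max(stage_delays_ns)            # slowest stage sets the clock
    unpipelined_ns = sum(stage_delays_ns) * n_instructions
    # The first instruction needs k cycles; each later one finishes 1 cycle after
    pipelined_ns = (k + n_instructions - 1) * clock_ns
    latency_ns = k * clock_ns                  # time for a single instruction
    return unpipelined_ns, pipelined_ns, latency_ns

# Hypothetical 4-stage pipelines with the same total logic delay (8 ns)
balanced = pipeline_timing([2, 2, 2, 2], 1000)
unbalanced = pipeline_timing([1, 1, 5, 1], 1000)
print(balanced)     # (8000, 2006, 8)
print(unbalanced)   # (8000, 5015, 20)
```

    With balanced stages the pipelined run is close to four times faster; with one 5 ns stage the clock slows to 5 ns, so most of the gain is lost and the latency of a single instruction grows from 8 ns to 20 ns.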

  • Superscalar Architecture

    We can have a large number of functional units, but the program is serial

    Will fetching multiple instructions solve the problem?

    Issues
      Dependencies
      Branches
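
    Why dependencies limit multiple issue can be seen in a small sketch. It assumes a simplified in-order machine that issues up to `width` instructions per cycle and starts a new group whenever an instruction depends on one already in the current group. The `Instr` type and the register names are hypothetical and only serve the illustration.

```python
from typing import NamedTuple

class Instr(NamedTuple):
    dest: str       # register written
    srcs: tuple     # registers read

def depends_on(later: Instr, earlier: Instr) -> bool:
    raw = earlier.dest in later.srcs      # read after write
    waw = earlier.dest == later.dest      # write after write
    war = later.dest in earlier.srcs      # write after read
    return raw or waw or war

def issue_groups(program, width=2):
    """Greedily pack consecutive independent instructions into issue groups."""
    groups, current = [], []
    for instr in program:
        conflict = any(depends_on(instr, prev) for prev in current)
        if conflict or len(current) == width:
            groups.append(current)
            current = []
        current.append(instr)
    if current:
        groups.append(current)
    return groups

program = [
    Instr("r1", ("r2", "r3")),   # r1 = r2 + r3
    Instr("r4", ("r5", "r6")),   # independent of the first instruction
    Instr("r7", ("r1", "r4")),   # needs r1 and r4
    Instr("r8", ("r7", "r2")),   # needs r7
]
groups = issue_groups(program)
print([len(g) for g in groups])  # [2, 1, 1]
```

    Only the first two instructions can issue together; the last two form a dependency chain, so the second functional unit sits idle for them no matter how many instructions are fetched.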

  • Speculative Execution

    Situation: A 20-stage pipeline in which all stages are waiting for a branch to be resolved

    Effect: Do the benefits of pipelining and superscalar execution vanish at branches?

    Solution: Execute both the if and the else instructions simultaneously

    Discard the wrong path once the result of the branch is known
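
    The "execute both, discard one" idea can be mimicked in software. The sketch below is only an analogy under simplifying assumptions: both paths run on private copies of the state, and one result is committed once the condition is evaluated. The names `run_both_paths`, `then_path` and `else_path` are hypothetical. Processors that rely on branch prediction follow only the predicted path instead.

```python
def run_both_paths(state, condition, then_path, else_path):
    # Work on private copies so neither path changes the committed state
    then_result = then_path(dict(state))
    else_result = else_path(dict(state))
    # Branch resolves: keep one result and discard the other
    return then_result if condition(state) else else_result

def then_path(s):
    s["y"] = s["x"] * 2
    return s

def else_path(s):
    s["y"] = s["x"] + 100
    return s

state = {"x": 5}
committed = run_both_paths(state, lambda s: s["x"] > 3, then_path, else_path)
print(committed)  # {'x': 5, 'y': 10}
```

    The price of this approach is that the work done on the discarded path is wasted.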


  • Thread level parallelism

    Multi-processors
      Supercomputers

    Chip multi-processing
      Dual-core chips like Intel’s Xeon

    Simultaneous multi-threading
      One processor and multiple threads
      Different from multi-programming and multi-tasking


  • Hyper-threading

    Makes a single processor appear as multiple logical processors

    Each logical processor keeps its own copy of the architecture state

    The OS views the logical processors as physical processors

    Logical processors share a single set of physical resources
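
    The split between private architecture state and shared physical resources can be modelled in a few lines. This is a toy sketch, not a description of the actual hardware: the classes `LogicalProcessor` and `SharedAdder` are invented for the illustration, and the strict round-robin sharing is a simplifying assumption.

```python
class LogicalProcessor:
    """Per-thread architecture state: program counter and one register."""
    def __init__(self, name, program):
        self.name = name
        self.pc = 0
        self.acc = 0
        self.program = program   # list of integers to add up

    def done(self):
        return self.pc >= len(self.program)

class SharedAdder:
    """A single physical execution unit shared by all logical processors."""
    def __init__(self):
        self.ops = 0

    def add(self, a, b):
        self.ops += 1
        return a + b

def run(logical_cpus, adder):
    # Logical processors take turns on the one shared execution unit
    while not all(cpu.done() for cpu in logical_cpus):
        for cpu in logical_cpus:
            if not cpu.done():
                cpu.acc = adder.add(cpu.acc, cpu.program[cpu.pc])
                cpu.pc += 1

lp0 = LogicalProcessor("LP0", [1, 2, 3])
lp1 = LogicalProcessor("LP1", [10, 20])
adder = SharedAdder()
run([lp0, lp1], adder)
print(lp0.acc, lp1.acc, adder.ops)  # 6 30 5
```

    Each logical processor ends up with its own result, while all five additions went through the same adder.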

  • Conclusion and Future

    Future processors will need more performance, i.e. higher clock speeds

    This is not possible by shrinking device dimensions alone

    Architectural solutions are needed

    SMP and CMP will be the solution

    More instruction-level parallelism can be exploited using compiler techniques


  • Thank You


  • Backup


  • Source Files

    http://www.cse.iitd.ernet.in/ neeraj/doc


  • Some Definitions

    Cache
      On-chip memory with a very short access time
      It costs more, so frequently needed data is placed there

    Clock speed
      Given in MHz or GHz
      MHz: million clock cycles per second

    Buses
      Data, address and control
      Bus width -> number of bits that can be transferred in parallel
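
    The clock speed and bus width definitions translate directly into simple arithmetic. The sketch below uses made-up example values (a 2 GHz clock, a 64-bit bus clocked at 100 MHz); the function names are hypothetical helpers for this illustration.

```python
def cycle_time_ns(clock_hz):
    """Duration of one clock cycle in nanoseconds."""
    return 1e9 / clock_hz

def peak_bus_bandwidth_bytes_per_s(width_bits, bus_clock_hz, transfers_per_cycle=1):
    """Upper bound on bus throughput: bytes per transfer times transfers per second."""
    return width_bits / 8 * bus_clock_hz * transfers_per_cycle

print(cycle_time_ns(2e9))                          # 0.5
print(peak_bus_bandwidth_bytes_per_s(64, 100e6))   # 800000000.0
```

    A 2 GHz clock therefore has a 0.5 ns cycle, and a 64-bit bus at 100 MHz can move at most 800 MB per second.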


  • Block Diagram of Pentium 4
