  • High Performance Processor Architecture

    Neeraj Goel

    2004csz8035

    Embedded System Group

    Dept. of Computer Science and Engineering

    Indian Institute of Technology Delhi

    http://embedded.cse.iitd.ernet.in/

    HU810 Seminar

  • Outline

    Introduction

    History and Future prediction

    Pentium 4 features

    Pipelining

    Superscalar features

    Hyper-Threading

    Conclusion and future

  • Moore’s Law

    Intel Microprocessors (source: www.intel.com)

  • Intel’s Processors: Past and Current

    Processor                      Year of Introduction    Transistors
    8008                           1972                          2,500
    8080                           1974                          5,000
    8086                           1978                         29,000
    286                            1982                        120,000
    Intel386 processor             1985                        275,000
    Intel486 processor             1989                      1,180,000
    Intel Pentium processor        1993                      3,100,000
    Intel Pentium II processor     1997                      7,500,000
    Intel Pentium III processor    1999                     24,000,000
    Intel Pentium 4 processor      2000                     42,000,000
    Intel Itanium processor        2002                    220,000,000
    Intel Itanium 2 processor      2003                    410,000,000
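
    As a rough check of Moore's Law against the figures on this slide, the average transistor doubling time can be estimated from the first and last rows. This is only an illustrative sketch: the function name `doubling_time_years` is an assumption introduced here, and the counts are copied from the slide.

```python
import math

# (processor, year of introduction, transistor count), copied from the slide
PROCESSORS = [
    ("8008", 1972, 2_500),
    ("8080", 1974, 5_000),
    ("8086", 1978, 29_000),
    ("286", 1982, 120_000),
    ("Intel386", 1985, 275_000),
    ("Intel486", 1989, 1_180_000),
    ("Pentium", 1993, 3_100_000),
    ("Pentium II", 1997, 7_500_000),
    ("Pentium III", 1999, 24_000_000),
    ("Pentium 4", 2000, 42_000_000),
    ("Itanium", 2002, 220_000_000),
    ("Itanium 2", 2003, 410_000_000),
]


def doubling_time_years(first, last):
    """Average number of years per doubling of transistor count between two rows."""
    _, year0, count0 = first
    _, year1, count1 = last
    doublings = math.log2(count1 / count0)  # how many times the count doubled
    return (year1 - year0) / doublings


if __name__ == "__main__":
    years = doubling_time_years(PROCESSORS[0], PROCESSORS[-1])
    print(f"Average doubling time, 1972 to 2003: {years:.2f} years")
```

    With these figures the estimate comes out a little under two years per doubling.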

  • How to increase performance

    Pipelining: Breaking a large system into a number of stages

    Instruction level parallelism: Software code is written serially; independent instructions can be executed in parallel; a large number of functional units is required

    Thread level parallelism: Applications are written with threads; the operating system can have threads; different applications run on different threads

  • How the Pentium achieves high performance

    Rapid execution, more pipeline stages

    Out-of-order execution

    Speculative execution

    Hyper-threading

    Trace cache

    Store to load forwarding enhancements

  • Pipelining

    The concept of splitting a job into sub-processes in which the output of one sub-process feeds into the next.

    A mechanical example of a pipeline is a washer/dryer system for clothing.

    More stages mean more throughput, but also more latency

    Issue: All stages should have almost equal delay; otherwise the slowest stage determines the clock cycle

    Fetch -> Decode -> Execute -> Write-back
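
    The effect of stage balance on the clock cycle can be sketched with a small timing model. This assumes an ideal pipeline with no stalls or hazards; the function name `pipeline_timing` and the stage delays are hypothetical values chosen for illustration.

```python
def pipeline_timing(stage_delays_ns, num_instructions):
    """Timing of an ideal pipeline (no stalls) versus unpipelined execution."""
    stages = len(stage_delays_ns)
    clock_ns = max(stage_delays_ns)  # the slowest stage sets the clock period
    # Without pipelining, every instruction pays the sum of all stage delays.
    unpipelined_ns = sum(stage_delays_ns) * num_instructions
    # With pipelining, the first instruction takes `stages` cycles and each
    # later instruction completes one cycle after the previous one.
    pipelined_ns = (stages + num_instructions - 1) * clock_ns
    return {
        "clock_ns": clock_ns,
        "latency_ns": stages * clock_ns,  # time for one instruction to pass through
        "unpipelined_ns": unpipelined_ns,
        "pipelined_ns": pipelined_ns,
    }


if __name__ == "__main__":
    # Two hypothetical four-stage designs with the same total logic delay (8 ns).
    for name, delays in [("balanced", [2, 2, 2, 2]), ("unbalanced", [1, 2, 4, 1])]:
        print(name, pipeline_timing(delays, num_instructions=1000))
```

    Both designs contain the same total delay, but the unbalanced one must run at the period of its 4 ns stage, so it takes about twice as long for the same 1000 instructions.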

  • Superscalar Architecture

    We can have a large number of functional units, but the program is serial

    Will fetching multiple instructions solve the problem?

    Issues: Dependencies, Branches
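
    Why dependencies limit the benefit of extra functional units can be illustrated with a toy issue scheduler. This is a simplified sketch, not a model of any real processor: it assumes single-cycle instructions, tracks only read-after-write dependencies (as if registers were renamed), and ignores branches. The instruction format and the name `schedule` are assumptions made for this example.

```python
def schedule(instructions, issue_width):
    """Give each instruction the earliest cycle allowed by its inputs and the issue width.

    Each instruction is a tuple (dest, src1, src2, ...). Registers that are never
    written by an earlier instruction are treated as available in cycle 0.
    """
    ready_cycle = {}  # register -> first cycle in which its value can be read
    used_slots = {}   # cycle -> number of instructions already issued
    cycles = []
    for dest, *sources in instructions:
        # Wait until every source operand has been produced.
        cycle = max((ready_cycle.get(src, 0) for src in sources), default=0)
        # Then find a cycle with a free issue slot.
        while used_slots.get(cycle, 0) >= issue_width:
            cycle += 1
        used_slots[cycle] = used_slots.get(cycle, 0) + 1
        ready_cycle[dest] = cycle + 1  # single-cycle latency assumed
        cycles.append(cycle)
    return cycles


PROGRAM = [
    ("r1", "a", "b"),    # r1 = a + b
    ("r2", "c", "d"),    # r2 = c + d   (independent of r1)
    ("r3", "r1", "r2"),  # r3 = r1 + r2 (needs both results)
    ("r4", "r3", "e"),   # r4 = r3 + e  (needs r3)
]

if __name__ == "__main__":
    print("issue width 1:", schedule(PROGRAM, issue_width=1))
    print("issue width 4:", schedule(PROGRAM, issue_width=4))
```

    Even with four issue slots per cycle, the chain r1/r2 -> r3 -> r4 still needs three cycles, compared with four cycles on a single-issue machine.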

  • Speculative Execution

    Situation: There is a pipeline of 20 stages, and all of them are waiting for a branch to be resolved

    Effect: Will the benefits of pipelining and superscalar execution vanish at branches?

    Solution: Execute both the if and the else instructions simultaneously

    Discard the wrong one when the result of the branch arrives
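
    The both-paths scheme described on this slide can be sketched in software. This is a conceptual model only: each path works on a private copy of the state, and one copy is committed once the branch condition is known. The helper name `run_both_paths` and the example state are assumptions. Hardware such as the Pentium 4 typically predicts one direction and speculates along that path only; that variant is not modeled here.

```python
import copy


def run_both_paths(state, condition, then_path, else_path):
    """Run both branch paths on private copies of the state, then keep one."""
    then_state = copy.deepcopy(state)
    else_state = copy.deepcopy(state)
    # Speculative work: neither result is visible to the rest of the program yet.
    then_path(then_state)
    else_path(else_state)
    # The branch resolves; the wrong copy is simply dropped.
    taken = condition(state)
    return then_state if taken else else_state


def double_x(s):
    s["y"] = s["x"] * 2  # body of the "if" path


def decrement_x(s):
    s["y"] = s["x"] - 1  # body of the "else" path


if __name__ == "__main__":
    start = {"x": 5, "y": 0}
    result = run_both_paths(start, lambda s: s["x"] > 3, double_x, decrement_x)
    print(result)
```

    The original state is never modified by the discarded path, which corresponds to the requirement that speculative results must not become visible until the branch is resolved.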

  • Thread level parallelism

    Multi-processors: Supercomputers

    Chip Multi-Processing: Dual-core chips like Intel’s Xeon

    Simultaneous Multi-threading: One processor and multiple threads; different from multi-programming and multi-tasking

  • Hyper-threading

    Makes a single processor appear as multiple logical processors

    Each logical processor keeps its own copy of the architecture state

    The OS views the logical processors as physical processors

    Logical processors share a single set of physical resources
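
    The split between private architecture state and shared physical resources can be modeled with a toy simulation. This is a deliberately simplified sketch: real simultaneous multi-threading shares execution resources within a cycle, whereas this model just alternates between logical processors. All class and function names here are assumptions introduced for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class LogicalProcessor:
    """Private architecture state: a program counter and a register file."""
    name: str
    program: list  # list of (register, increment) pairs
    pc: int = 0
    registers: dict = field(default_factory=dict)

    def done(self):
        return self.pc >= len(self.program)


class SharedExecutionUnit:
    """The single physical resource that all logical processors use."""

    def __init__(self):
        self.ops_executed = 0

    def add(self, value, increment):
        self.ops_executed += 1
        return value + increment


def run(logical_processors, unit):
    """Round-robin model: in each cycle one logical processor uses the shared unit."""
    cycle = 0
    trace = []
    while any(not lp.done() for lp in logical_processors):
        for lp in logical_processors:
            if lp.done():
                continue
            reg, inc = lp.program[lp.pc]
            lp.registers[reg] = unit.add(lp.registers.get(reg, 0), inc)
            lp.pc += 1
            trace.append((cycle, lp.name))
            cycle += 1
    return trace


if __name__ == "__main__":
    lp0 = LogicalProcessor("LP0", [("eax", 1), ("eax", 2)])
    lp1 = LogicalProcessor("LP1", [("eax", 10)])
    unit = SharedExecutionUnit()
    print(run([lp0, lp1], unit))
    print(lp0.registers, lp1.registers, unit.ops_executed)
```

    Both logical processors use a register called `eax`, yet their values never interfere, because each keeps its own register file; only the execution unit is shared.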

  • Conclusion and Future

    Future processors will need more performance: higher clock speed

    Not possible with shrinking device dimensions

    Need architectural solutions

    SMP and CMP will be the solution

    More instruction level parallelism can be exploited using compiler techniques

  • Thank You

  • Backup

  • Source Files

    http://www.cse.iitd.ernet.in/ neeraj/doc

  • Some Definitions

    Cache: An on-chip memory with very low access time; costs more; frequently needed data can be placed there

    Clock speed: Expressed in MHz and GHz; 1 MHz = one million clock cycles per second

    Buses: Data, Address and Control; bus width -> number of bits that can be accessed in parallel
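
    The clock speed and bus width definitions translate directly into simple arithmetic. The sketch below is illustrative only: the function names and the example figures (a 2 GHz clock, a 64-bit bus at 100 MHz) are hypothetical and not taken from the slides.

```python
def cycle_time_ns(clock_hz):
    """Duration of one clock cycle in nanoseconds."""
    return 1e9 / clock_hz


def peak_bus_bandwidth_bytes_per_s(bus_width_bits, bus_clock_hz, transfers_per_cycle=1):
    """Upper bound on bus throughput: bytes per transfer times transfers per second."""
    return bus_width_bits / 8 * bus_clock_hz * transfers_per_cycle


if __name__ == "__main__":
    # Hypothetical figures chosen only to make the arithmetic easy to follow.
    print("2 GHz clock, cycle time (ns):", cycle_time_ns(2e9))
    print("64-bit bus at 100 MHz, peak bytes/s:", peak_bus_bandwidth_bytes_per_s(64, 100e6))
```

    Note that clock speed counts cycles, not instructions: the number of instructions completed per second also depends on how many instructions finish per cycle.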

  • Block Diagram of Pentium 4
