CS 7960-4 Lecture 24: Exceeding the Dataflow Limit via Value Prediction. M.H. Lipasti, J.P. Shen. Proceedings of MICRO-29, December 1996.

Page 1: CS 7960-4    Lecture 24

CS 7960-4 Lecture 24

Exceeding the Dataflow Limit via Value Prediction

M.H. Lipasti, J.P. Shen

Proceedings of MICRO-29

December 1996

Page 2: CS 7960-4    Lecture 24

Dependences

Page 3: CS 7960-4    Lecture 24

Value Locality

Avg. accuracy 49%

Avg. accuracy 61%

Page 4: CS 7960-4    Lecture 24

Value Predictor

Page 5: CS 7960-4    Lecture 24

CT Design

Page 6: CS 7960-4    Lecture 24

VP Microarchitecture

• Value prediction happens at dispatch

• Predicted results are immediately bypassed to dependents, but predicted instructions also go through the pipeline

• Dependents remain in the issue queue until verification

• Both predicted and verified results have to be broadcast to the issue queue
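The predict-at-dispatch, update-after-execution flow in the bullets above can be sketched as a last-value predictor guarded by a confidence counter. This is a minimal illustration, not the paper's exact design: the class and method names, the table size, the 2-bit counter, and the threshold are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    last_value: int = 0   # most recent result seen for this instruction
    confidence: int = 0   # saturating counter in the range 0..3 (assumed width)


class LastValuePredictor:
    """Direct-mapped, untagged table indexed by instruction address."""

    def __init__(self, entries=1024, threshold=2):
        # entries and threshold are hypothetical parameters
        self.table = [Entry() for _ in range(entries)]
        self.threshold = threshold

    def _entry(self, pc):
        # Drop the low two bits (assumes 4-byte instructions), then wrap.
        return self.table[(pc >> 2) % len(self.table)]

    def predict(self, pc):
        """At dispatch: return a predicted value, or None if confidence is low."""
        e = self._entry(pc)
        return e.last_value if e.confidence >= self.threshold else None

    def update(self, pc, actual):
        """After execution: train the entry with the real result."""
        e = self._entry(pc)
        if e.last_value == actual:
            e.confidence = min(e.confidence + 1, 3)
        else:
            e.confidence = max(e.confidence - 1, 0)
            e.last_value = actual
```

Because the table is untagged, two instructions that map to the same entry share its state; a real design would have to weigh that aliasing against the cost of tags.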

Page 7: CS 7960-4    Lecture 24

Verifier

• Similarities with pre-execution: a speculative thread and a verifier thread

• Dependent instructions can produce results instantaneously, but the verifier executes in sequence

• Verification takes a cycle, which can slow the verifier thread and the squashing process

• Verification increases contention for resources and issue queue occupancy
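The verification step described above amounts to comparing the predicted value with the computed one and then either releasing or recovering the dependents. The sketch below is only an illustration: the `Dependent` record, the state names, and the choice to model recovery as re-execution of the dependents are assumptions, not details taken from the slides.

```python
from dataclasses import dataclass


@dataclass
class Dependent:
    name: str
    # Executed with a predicted operand and still held in the issue queue.
    state: str = "speculative"


def verify(predicted, actual, dependents):
    """Compare predicted and computed values; return the dependents to re-execute."""
    if predicted == actual:
        for d in dependents:
            d.state = "verified"   # may now leave the issue queue
        return []
    for d in dependents:
        d.state = "reissue"        # must execute again with the correct operand
    return list(dependents)
```

Holding dependents until `verify` runs is what raises issue queue occupancy: every speculative consumer occupies an entry for at least the extra verification cycle.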

Page 8: CS 7960-4    Lecture 24

Dependent Instructions

[Figure: completion and verification times of dependent instructions for three cases: No prediction, Correct prediction, Incorrect prediction. Node labels, in slide order:]

Completed: t+1; Verified: t+9

Completed: t+4; Verified: t+10

Completed: t+1; Verified: t+10

Completed: t+8

Completed: t+9; Completed: t+9

Completed: t+1; V-completed: t+8; Verified: t+9

Completed: t+4; Verified: t+10; V-completed: t+13

Completed: t+1; Verified: t+10

Page 9: CS 7960-4    Lecture 24

Configurations

Page 10: CS 7960-4    Lecture 24

Results


Page 11: CS 7960-4    Lecture 24

Infinite Processor Model


Limitations: branch prediction, fetch, store bandwidth, verifier thread

Page 12: CS 7960-4    Lecture 24

Efficient Use of Transistors


Page 13: CS 7960-4    Lecture 24

Future Work

• Better predictions, hit rates, strides

• Value prediction for critical instructions/high confidence predictions

• Speculation along multiple paths in the value space

• Value prediction for stores
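The "strides" item in the first bullet above refers to predicting values that change by a constant amount, such as loop counters or array addresses. A minimal sketch of such a stride predictor follows; the class name, the dictionary-based table, and the update policy are assumptions for illustration only.

```python
class StridePredictor:
    """Predicts last_value + stride for each instruction address."""

    def __init__(self):
        self.table = {}   # pc -> (last_value, stride)

    def predict(self, pc):
        """Return the predicted next value, or None for an unseen instruction."""
        if pc not in self.table:
            return None
        last, stride = self.table[pc]
        return last + stride

    def update(self, pc, actual):
        """Record the real result and recompute the stride from the last value."""
        if pc in self.table:
            last, _ = self.table[pc]
            self.table[pc] = (actual, actual - last)
        else:
            self.table[pc] = (actual, 0)   # first sighting: behave like last-value
```

With a stride of zero this degenerates to last-value prediction, so it covers constant values as well as regularly increasing ones.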

Page 14: CS 7960-4    Lecture 24

Power Implications

• Increased activity → increased power consumption

• Higher performance → potentially lower energy (reduced clock distribution energy)
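The two bullets above pull in opposite directions, which a rough energy decomposition makes explicit. The split into a clock term and an activity term, and the symbols used, are assumptions for illustration rather than a model from the slides.

```latex
E \;=\; \bigl(P_{\text{clock}} + P_{\text{activity}}\bigr)\cdot T
```

Here $T$ is execution time. Value prediction raises $P_{\text{activity}}$ through extra table accesses, verification, and re-execution, but if it shortens $T$, the clock distribution component $P_{\text{clock}} \cdot T$ shrinks, so total energy may fall even though power rises.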

Page 15: CS 7960-4    Lecture 24

Next Class’ Paper

• “Energy Efficient Co-Adaptive Instruction Fetch and Issue”, A. Buyuktosunoglu, T. Karkhanis, D. H. Albonesi, P. Bose, Proceedings of ISCA-30, June, 2003
