A Proof of Correctness of a Processor Implementing Tomasulo’s Algorithm without a Reorder Buffer

Ravi Hosabettu (Univ. of Utah)

Ganesh Gopalakrishnan (Univ. of Utah)

Mandayam Srivas (SRI International)

2

Talk Organization

• Motivation

• Completion Functions Approach

• Key contribution of the talk

• Detailed illustration

• Conclusions

3

Motivation

• Pipelined processor verification

– Increasingly complex designs

– Need for formal verification

• Theorem provers

– Focus on the relevant aspects only

• To verify large, complex designs:

– Automation

– Decomposition

4

Problem Definition

• Need a verification methodology that

– Is amenable to decomposition

– Uses decision procedures

• Solution: Completion Functions Approach

5

What are Completion Functions?

• Desired effect of retiring an unfinished instruction in an atomic fashion

[Figure: unfinished instructions a, b, c; completion function C_b acting on the register file RF]

6

Abstraction Function

• Need to define an abstraction function

• Flushing the pipeline

• Our idea: Define abstraction function as a Composition of Completion Functions

[Figure: diagram relating Impl. Machine Step and Spec. Machine Step]

7

Main Features

• Decomposition into verification conditions

[Figure: register file RF; instructions a, b, c with completion functions C_a, C_b, C_c; label L_ab]

Abs. fn = C_a o C_b o C_c

One VC is: C_a == L_ab o C_b
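
The composition above can be illustrated with a toy model. This is a minimal Python sketch, not the PVS development: each completion function is modelled as a pure function from register file to register file, `make_completion` and `compose` are hypothetical helpers, every instruction is an add, and c is assumed to be the oldest instruction so that C_c is applied first. The verification condition involving L_ab is not modelled.

```python
from typing import Callable, Dict

RF = Dict[str, int]
Completion = Callable[[RF], RF]


def make_completion(dest: str, src1: str, src2: str) -> Completion:
    """Hypothetical completion function of a pending add: dest := src1 + src2."""
    def complete(rf: RF) -> RF:
        out = dict(rf)            # do not mutate the caller's register file
        out[dest] = rf[src1] + rf[src2]
        return out
    return complete


def compose(*fns: Completion) -> Completion:
    """compose(f, g, h)(x) == f(g(h(x))), matching C_a o C_b o C_c."""
    def composed(rf: RF) -> RF:
        for fn in reversed(fns):  # rightmost function is applied first
            rf = fn(rf)
        return rf
    return composed


# Assumption: c is the oldest unfinished instruction, a the youngest.
C_c = make_completion("r1", "r0", "r0")   # r1 := r0 + r0
C_b = make_completion("r2", "r1", "r0")   # r2 := r1 + r0
C_a = make_completion("r3", "r2", "r1")   # r3 := r2 + r1

abstraction = compose(C_a, C_b, C_c)      # Abs. fn = C_a o C_b o C_c
```

Applying `abstraction` to a register file gives the state the specification machine would reach after retiring c, b and a in program order. No intermediate pipeline state is exposed.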

8

Main Features Continued

• VCs are generated systematically and are often discharged automatically

• Incremental verification

• No explicit intermediate abstraction

• Methodology implemented in PVS

9

Examples Verified

• Three examples (CAV98)

– DLX

– Dual issue DLX

– Example with limited out-of-order execution

• Example with a reorder buffer & alu instructions only (CAV99)

10

In-order vs Out-of-order Retirement

[Figure: instruction states I, D, E, W shown in the current state and the next state, annotated "Same effect on RF"]

11

Contributions of this Paper

• Extend completion functions approach to handle out-of-order retirement

– hard to support exceptions & speculation

– specialized processors

• Verified an implementation of Tomasulo’s algorithm without a reorder buffer

12

Details of the Proof

• The implementation model

• Correctness criterion

• Key ideas

• Proof of correctness

• Liveness proof

13

Processor Model

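[Figure: processor model with execution units EU1 ... EUm, reservation stations RS, register file RF, and register translation table RTT]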

14

Correctness Criterion

[Figure: commutative diagram; labels: impl_st, I_step, A_step/, Abstraction (twice)]

15

Completion Functions Approach for Out-of-order Retirement

• Completion function returns the value computed by an instruction

– Recursively completes the instructions it depends on

• Abstraction function updates all registers

– Uses the latest pending instruction to write each register

16

The Completion Function

[Figure: processor model (EU1 ... EUm, RS, RF, RTT) annotated with the completion function cases Value_issued, Value_dispatched, Value_executed]

17

Value_issued Definition

[Figure: Reservation Station Entry with fields opcode, src1_rse, src1_val, ...]

src1_rse = 0: op1 := src1_val

src1_rse /= 0: op1 := Complete(src1_rse)

Value_issued := alu(opcode,op1,op2)
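
The Value_issued case can be read as a recursive function over reservation station entries. This is a minimal Python sketch under several assumptions: entries are indexed from 1 (0 means the operand is already available), the second operand follows the same rule as the first (the slide shows src1 only), and only the issued case is modelled. `RSEntry` and the three-opcode `alu` are hypothetical stand-ins, not the PVS definitions.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class RSEntry:
    opcode: str
    src1_rse: int   # 0: src1_val is valid; otherwise index of the producing entry
    src1_val: int
    src2_rse: int   # assumed to mirror src1_rse
    src2_val: int


def alu(opcode: str, a: int, b: int) -> int:
    """Hypothetical ALU with three opcodes."""
    if opcode == "add":
        return a + b
    if opcode == "sub":
        return a - b
    if opcode == "mul":
        return a * b
    raise ValueError(f"unknown opcode: {opcode}")


def complete(rs: Dict[int, RSEntry], idx: int) -> int:
    """Value the instruction in entry idx will eventually compute (issued case only)."""
    e = rs[idx]
    op1 = e.src1_val if e.src1_rse == 0 else complete(rs, e.src1_rse)
    op2 = e.src2_val if e.src2_rse == 0 else complete(rs, e.src2_rse)
    return alu(e.opcode, op1, op2)


rs = {
    1: RSEntry("add", 0, 2, 0, 3),   # 2 + 3
    2: RSEntry("mul", 1, 0, 0, 4),   # (result of entry 1) * 4
}
result = complete(rs, 2)
```

The recursion terminates only if the producer chain is acyclic, which is what the measure function invariant later in the talk provides.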

18

Abstraction Function

[Figure: RF and RTT; a register whose RTT entry is rsi takes the value Complete(rsi); a register whose RTT entry is 0 is unchanged]
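
Read this way, the abstraction function is a per-register case split on the RTT entry. This is a minimal Python sketch, assuming the RTT maps each register to a reservation station index with 0 meaning "no pending writer". The `complete` argument is a hypothetical callable standing in for the completion function.

```python
from typing import Callable, Dict


def abstraction(
    rf: Dict[str, int],
    rtt: Dict[str, int],
    complete: Callable[[int], int],
) -> Dict[str, int]:
    """Map an implementation state (RF, RTT) to a specification register file."""
    spec_rf = {}
    for reg, value in rf.items():
        rsi = rtt.get(reg, 0)
        # RTT entry 0: no pending writer, register unchanged.
        # RTT entry rsi: use the value the pending instruction will produce.
        spec_rf[reg] = value if rsi == 0 else complete(rsi)
    return spec_rf


rf = {"r1": 7, "r2": 1}
rtt = {"r1": 0, "r2": 3}          # r2 is waiting on reservation station entry 3
pending_values = {3: 42}          # stand-in for Complete(3)
spec_rf = abstraction(rf, rtt, pending_values.__getitem__)
```

Only the latest pending writer of each register matters here, because the RTT records just that entry.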

19

Main Verification Condition

[Figure: Complete(rsi) gives the same value in the current state (instruction states D D I I E) and in the next state (D I E E E)]

20

Instruction-state Transitions

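[Figure: state-transition diagram over instruction states I, D, E; transitions guarded by Disp?, Exec?, Wback?, with Not Disp?, Not Exec?, Not Wback? leaving the state unchanged]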
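
The transition diagram can be written down as a small state machine. This is a minimal Python sketch, reading I, D, E as issued, dispatched and executed. The `FREE` state (entry released after writeback) and the `next_state` signature are assumptions made for illustration.

```python
from enum import Enum


class InstrState(Enum):
    ISSUED = "I"
    DISPATCHED = "D"
    EXECUTED = "E"
    FREE = "free"   # hypothetical: entry released after writeback


def next_state(state: InstrState, disp: bool, exec_: bool, wback: bool) -> InstrState:
    """One implementation step for a single reservation station entry."""
    if state is InstrState.ISSUED:
        return InstrState.DISPATCHED if disp else InstrState.ISSUED
    if state is InstrState.DISPATCHED:
        return InstrState.EXECUTED if exec_ else InstrState.DISPATCHED
    if state is InstrState.EXECUTED:
        return InstrState.FREE if wback else InstrState.EXECUTED
    return InstrState.FREE
```

Each proof obligation can then be split by the state of the instruction and by whether its guard fires in the current step.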

21

Establishing the Main Verification Condition

[Figure: current state (D D I I E) and next state (D I E E E); labels: Value_dispatched, Value_executed, Same]

22

Another Scenario

[Figure: another scenario across the current state and the next state, with instruction-state rows D I E E I and D D I I E I; labels: Induction Hypothesis, Feedback Logic Correctness]

23

Correctness Criterion Repeated

[Figure: commutative diagram; labels: impl_st, I_step, A_step/, Abstraction (twice)]

24

Proving Commutative Diagram

[Figure: Spec. path vs Impl. path; labels: RTT in current state (entries rsi, 0), RTT in next state (entries rsi, new, 0), Complete(rsi), Complete(new), ISA spec. value, Unchanged]

25

Invariants Needed

• Measure function related

– Measure of the instruction producing a source value is less than that of the given instruction
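
The measure invariant is what makes the recursive completion function well founded. This is a minimal Python sketch of checking it on a concrete state. The slides do not define the measure, so an integer per reservation station entry (for example, issue order) is assumed, and `producers` is a hypothetical map from each entry to the entries that produce its source values.

```python
from typing import Dict, List


def measure_invariant_holds(
    measure: Dict[int, int],
    producers: Dict[int, List[int]],
) -> bool:
    """True if every producer has a strictly smaller measure than its consumer."""
    return all(
        measure[p] < measure[consumer]
        for consumer, ps in producers.items()
        for p in ps
    )


measure = {1: 0, 2: 1, 3: 2}            # assumed: issue order
producers = {1: [], 2: [1], 3: [1, 2]}  # entry 3 waits on entries 1 and 2
ok = measure_invariant_holds(measure, producers)
```

A cyclic dependency such as `{1: [2], 2: [1]}` cannot satisfy the check for any choice of measure, so the invariant rules out non-terminating recursion in the completion function.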

• Instruction validity invariants

26

PVS Proof Statistics

• Proof strategies

– Induction obligations: Very similar strategy

– Operand correctness & commutative diagram: Very similar strategy

– Invariants: No uniform strategy

• Manual effort

– 7 person days of “first time” effort

• 500 seconds on a 167 MHz UltraSparc

27

Liveness Properties

• Two liveness properties

– Eventually the processor gets flushed

– Eventually a new instruction is executed

• Again based on Instruction-state transitions

28

Liveness Proof

[Figure: instruction-state transition diagram (states I, D, E; guards Disp?/Not Disp?, Exec?/Not Exec?, Wback?/Not Wback?) together with the Scheduler]

29

Related Work

• Theorem proving:

– Arons & Pnueli

• Model checking:

– McMillan

– Berezin et al.

– Henzinger et al.

• MAETT, Incremental flushing

30

Work in Progress

• Mechanizing the liveness proofs

• Bringing the methodology closer to practice

– More automated decision procedures

– Automatic discovery of invariants

– Integration into the design process

31

Conclusions

• Well suited for verifying processors with out-of-order retirement

• Completion Functions Approach has been applied to a wide variety of examples

32

Recent Verification Effort

• Also applied to verify a processor with:

– a reorder buffer

– alu, memory & branch instructions

– store buffer & load value forwarding

– exceptions

– speculative execution

– user & supervisory modes of operation

33

Manual Effort on all Examples

• DLX/Dual issue DLX: 2 months

– Initial experiments

• Simple out-of-order execution: 14 days

• In-order retirement, reorder buffer example: 12 days

• Out-of-order retirement: 7 days

• Significantly complex example: 35 days

34

Conclusions

• Reasonable manual effort

• Scope for further increasing the automation

• Completion Functions Approach is a promising and viable approach to verifying complex pipelined processors
