Scaling Formal Methods Toward Hierarchical Protocols
in Shared Memory Processors
Presenters: Ganesh Gopalakrishnan and Xiaofang Chen
School of Computing, University of Utah, Salt Lake City, UT 84112
{ganesh, xiachen}@cs.utah.edu
http://www.cs.utah.edu/formal_verification
GRC CADTS Review, Berkeley, March 18, 2008
Supported by SRC Contract TJ-1318 (Intel Customization)
2
Multicores are the future! Their caches are visibly central…
(Photo courtesy of Intel Corporation.)
> 80% of chips shipped will be multi-core
3
Hierarchical Cache Coherence Protocols will play a major role in multi-core processors
[Figure: hierarchy of chip-level, inter-cluster, and intra-cluster protocols, with directory (dir) and memory (mem) units.]
State Space grows multiplicatively across the hierarchy!
Verification will become harder
4
Protocol design happens in “the thick of things” (many interfaces, constraints of performance, power, testability).
From “High-throughput coherence control and hardware messaging in Everest,” by Nanda et al., IBM J. R&D 45(2), 2001.
5
Future Coherence Protocols
Cache coherence protocols that are tuned for the contexts in which they operate can significantly increase performance and reduce power consumption [Liqun Cheng]
  Producer-consumer sharing-pattern-aware protocol [Cheng et al., HPCA07]: 21% speedup and 15% reduction in network traffic
  Interconnect-aware coherence protocols [Cheng et al., ISCA06]: heterogeneous interconnect; improves performance AND reduces power; 11% speedup and 22% wire power savings
Bottom line: protocols are going to get more complex!
6
Main Result #1: Hierarchical
[Figure: a Home Cluster and two Remote Clusters, each with two L1 caches, an L2 cache + local directory, and a RAC, connected to a global directory and main memory. The intra-cluster and inter-cluster protocol levels are marked.]
Developed a way to reduce the verification complexity of hierarchical (CMP) protocols using assume-guarantee (A/G) reasoning
7
Main Result #2 : Refinement
Developed a way to verify a proposed refinement of ONE unit into its low-level (RTL) implementation
Murphi
HMurphi
11
Differences in Modeling: Specs vs. Impls
One step in high-level (home, remote): an atomic guarded command
Multiple steps in low-level (home, router, buf, remote)
12
Our Refinement Check
[Figure: a Spec transition from Spec(I) to Spec(I’), alongside a multi-step Impl transaction from I to I’.]
I is a reachable Impl state
Guard for the Spec transition must hold
Observable vars changed by either must match
Workflow of Our Refinement Check
[Figure: workflow. A Murphi Spec model and a Hardware Murphi Impl model are combined into a product model in Hardware Murphi; Muv translates it into a product model in VHDL; a property check then checks that the implementation meets the specification.]
14
Anticipated Future Result
Develop a way to verify a proposed refinement of the ENTIRE hierarchy
15
Anticipated Future Result
Deal with pipelining
Sequential Interaction
Pipelined Interaction
16
Anticipated Future Result
Develop ways to “tease apart” protocols that are “blended in”, e.g., for power-down or post-silicon observability enhancement
More protocols… do they interfere?
17
Basics
PI: Ganesh Gopalakrishnan
Industrial Liaisons: Ching Tsun Chou (Intel), Steven M. German (IBM), John W. O’Leary (Intel), Jayanta Bhadra (Freescale), Alper Sen (Freescale), Aseem Maheshwari (TI)
Primary Student: Xiaofang Chen
Graduation Date: writing PhD dissertation; on the job market
Other Students: Yu Yang (PhD), Guodong Li (PhD), Michael DeLisi (BS/MS)
Anticipated Results:
  Hierarchical: Methodology for Hierarchical (Cache Coherence) Protocol Verification, with Emphasis on Complexity Reduction (was in original SRC proposal)
  Refinement: Methodology for Expressing and Verifying Refinement of Higher Level Protocol Descriptions (not in original SRC proposal)
18
Basics
Deliverables (Papers, Software, Xiaofang’s Dissertation)
Hierarchical:
  Methodology for Applying A/G Reasoning for Complexity Reduction
  Verified Protocol Benchmarks: Inclusive, Non-Inclusive, Snoopy (Large Benchmarks)
  Automatic Abstraction Tool in support of A/G Reasoning
Refinement:
  Muv Language Design (for expressing Designs)
  Refinement Checking Theory and Methodology
  Complete Muv tool implementation
19
What’s Going On
Accomplishments during the past year
Hierarchical:
  Finishing Non-inclusive Hierarchical Protocol Verification
  Developing and Verifying a Hierarchical Protocol with a Snoopy First Level
20
Insert Table of Hier + Snoopy Here
21
What’s Going On
Accomplishments during the past year (contd.)
Refinement:
  HMurphi was fleshed out in great detail
  Most of Muv was implemented (a large portion during an IBM T.J. Watson internship); joint work with Steven German and Geert Janssen
22
What’s Going On
Future directions
Hierarchical + Refinement
  Develop ways to verify hierarchies of interacting HMurphi modules
  Pipelining
Teasing out protocols supporting non-functional aspects
  Power-down protocols
  Protocols to enhance post-silicon observability
Architectural Characterization
  How do we describe the “ISA” of future multi-core machines?
  How do we make sure that this ISA has no hidden inconsistencies?
23
What’s Going On
Technology Transfer & Industrial Interactions
  With Liaisons
Publications
  FMCAD 06, 07, HLDVT 07, TECHCON 07 (best session paper award), journal paper (under prep), dissertation (under prep)
A request to IBM for open-sourcing Muv has been placed
24
Overview of “Hierarchical”
1. Given a protocol to verify, create a verification model that models a small number of clusters acting on a single cache line
[Figure: verification model with Home, Remote, and Global directory; invariant Inv P.]
25
2. Exploit Symmetries
Model “home” and the two “remote”s (one remote, in case of symmetry)
[Figure: verification model; invariant Inv P.]
26
4. Initial abstraction will be extreme; slowly back-off from this extreme…
[Flowchart: abstract models with invariants Inv P1, Inv P2, Inv P3.]
P1 fails: diagnose the failure
  Bug: report to user
  False alarm: diagnose where the guard is overly weak, add a strengthening guard, and introduce a lemma to ensure soundness of the strengthening
27
Overview of Theory Involved
Original model:
  rule g1 ==> a1;
  rule g2 ==> a2;
  invariant P;
Strengthened model A:
  rule g1 ==> a1;
  rule g2 /\ cond2 ==> a2;
  invariant P /\ (g1 => cond1);
Strengthened model B:
  rule g1 /\ cond1 ==> a1;
  rule g2 ==> a2;
  invariant P /\ (g2 => cond2);
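The proof obligations on this slide can be exercised on a toy example. The sketch below is a minimal explicit-state check in Python, not the authors' tool: the two-rule handshake, the state layout `(req, valid)`, and the particular choices of `cond1` and `cond2` are assumptions made up for illustration.

```python
from collections import deque

def check(init, rules, invariant):
    """Breadth-first reachability; True if the invariant holds in every state."""
    seen, queue = {init}, deque([init])
    while queue:
        s = queue.popleft()
        if not invariant(s):
            return False
        for guard, action in rules:
            if guard(s):
                t = action(s)
                if t not in seen:
                    seen.add(t)
                    queue.append(t)
    return True

# Hypothetical toy protocol; a state is the pair (req, valid).
def g1(s): return not s[0]          # rule 1 fires when no request is pending
def a1(s): return (True, True)      # issue a request with valid data
def g2(s): return s[0]              # rule 2 fires when a request is pending
def a2(s): return (False, False)    # consume the request
def P(s): return s[0] == s[1]       # invariant of interest
def cond1(s): return not s[1]       # strengthening for g1 (assumed)
def cond2(s): return s[1]           # strengthening for g2 (assumed)
def implies(a, b): return (not a) or b

init = (False, False)

# Original model: rule g1 ==> a1; rule g2 ==> a2; invariant P
original_ok = check(init, [(g1, a1), (g2, a2)], P)

# Model with g2 strengthened; it must also justify the lemma g1 => cond1.
model_a_ok = check(
    init,
    [(g1, a1), (lambda s: g2(s) and cond2(s), a2)],
    lambda s: P(s) and implies(g1(s), cond1(s)),
)

# Model with g1 strengthened; it must also justify the lemma g2 => cond2.
model_b_ok = check(
    init,
    [(lambda s: g1(s) and cond1(s), a1), (g2, a2)],
    lambda s: P(s) and implies(g2(s), cond2(s)),
)

print(original_ok, model_a_ok, model_b_ok)
```

Each strengthened model assumes one lemma in a guard and proves the other as part of its invariant, which is the circular assume-guarantee pattern. In this toy case the strengthening does not shrink the state space; the benefit shows up only when the models are abstracted.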
28
3. Create Abstract Models (three models in this example)
[Figure: the verification model (Inv P) and three abstract models (Inv P1, Inv P2, Inv P3).]
29
Step 1 of Refinement
Inv P1, Inv P2, Inv P3
Inv P1, Inv P2, Inv P3’
30
Step 2 of Refinement
Inv P1, Inv P2, Inv P3
Inv P1, Inv P2, Inv P3’
Inv P1, Inv P2’, Inv P3’
31
Final Step of Refinement
Inv P1, Inv P2, Inv P3
Inv P1, Inv P2, Inv P3’
Inv P1’, Inv P2’, Inv P3’
Inv P1, Inv P2’, Inv P3’’
32
Detailed Presentation of Refinement
Note: Three examples have been presented in full detail at
http://www.cs.utah.edu/formal_verification/muv
33
Here, arrange the rest of the slides + the new ones you are making as you feel best. Most of the remaining slides are quite good, so your work need not include any “clean-up” but just delete those already covered…
34
Project Summary: Year 2
Verification of hierarchical cache coherence protocols
  Non-inclusive multicore benchmark
  Compositional approach, one level at a time
  Can reduce the explicit state space by >95%
Refinement check: protocol RTL Impls vs. Specs
  Refinement theory and methodology
  Compositional approach theory
Publications
  FMCAD 2007, HLDVT 2007, TECHCON 2007 (best session paper award)
35
Yearly Summary: 2007 - 2008
Refinement check: protocol RTL Impls vs. Specs
  A comprehensive tool path
  Can find bugs in RTL protocols with realistic features
  A simple pipelined stack example
Verification of hierarchical cache coherence protocols
  A snoop multicore protocol benchmark
36
A Simple Snoop Multicore Protocol
Motivation: snoop protocols are commonly used in the first level of caches
We have applied our approach to directory protocols
How about snoop protocols?
[Figure: Cluster 1 and Cluster 2, each with L1 caches, an L2 cache, and a RAC, connected to a global directory and main memory.]
37
Applying Our Approach
[Figure: the original two-cluster model (L1 caches, L2 caches, RACs, global directory, main memory) and the abstracted protocols derived from it.]
Model checking results

Approach     Model            # of states   Model check time (sec)   Mem used (GB)   Model check passed
Monolithic   Original model   552,375       86                       1.8             Yes
Ours         Abs. intra       474           6                        1.8             Yes
Ours         Abs. inter       15,371        7                        1.8             Yes
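As a quick arithmetic check on the state counts reported on this slide, the snippet below compares the original model with the two abstracted models. Summing the states of the abstracted models as the cost of the compositional approach is an assumption made here, not a metric stated on the slide.

```python
# State counts taken from the model checking results on this slide.
states = {"original": 552_375, "abs_intra": 474, "abs_inter": 15_371}

# Assumption: the compositional cost is the sum over the abstracted models.
compositional = states["abs_intra"] + states["abs_inter"]
reduction = 1 - compositional / states["original"]

print(f"{compositional} states explored, reduction {reduction:.1%}")
```

Under that assumption the reduction is about 97%, in line with the ">95%" figure quoted for the earlier benchmark.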
38
Refinement Check Spec vs. Impl
[Figure: axes “Abstraction level” and “Model size”; labeled points “Specification” and “Cycle accurate RTL level”.]
39
Differences in Execution: Specs vs. Impls
Interleaving in high-level (HL)
Concurrency in low-level (LL)
40
Our Approach of Refinement Check
Modeling
Specification: Murphi
Implementation: Hardware Murphi
Use transactions in Impl to relate to Spec
Verification
Muv: Hardware Murphi → synthesizable VHDL
Tool: IBM SixthSense and RuleBase
41
What Are Transactions?
Group a multi-step execution in implementations
Spec
Impl
42
Outline
Project background
Extensions to the tool path
Experimental results
Future work
43
Tool Path
Initial efforts from IBM (by German and Janssen)
  Hardware Murphi language
  Muv: Hardware Murphi → synthesizable VHDL
Our extensions (enable refinement check)
  Language extensions
  Muv extensions
44
Basic Features of Hardware Murphi vs Murphi
…
signal s1, s2 …
s1 <= …
chooserule rules; end; …
firstrule rules; end; …
transaction
rule-1; rule-2; …
end; …
45
Language Extensions to Hardware Murphi (I)
Directives:
  --include spec.m
Joint variables correspondence:
  correspondence
    u1[0..7] :: v1[1..8]; u1 :: v2;
  end;
46
Language Extensions to Hardware Murphi (II)
Transactionset:
  transactionset p1:T1; p2:T2 do
    transaction …
  end;
Rules with IDs:
  rule:id guard ==> action;
  ruleset p1:T1; p2:T2 do
    rule:id …
  end;
47
Language Extensions to Hardware Murphi (III)
Execute a rule by ID:
  << id.guard() >>;
  << id.action() >>;
  << id[v1][v2].guard() >>; …
Fine-grained assignments for write-write conflicts:
  var[i] <:= data;
48
How to Annotate an Impl Model with Spec?
impl.m:
  …
  transaction
    rule-1 g1 a1;
    rule-2 g2 a2;
  end;
  …
spec.m:
  …
  rule:id g a;
  …
49
How to Annotate an Impl Model with Spec?
impl.m:
  --include spec.m
  correspondence
    u1 :: v1; …
  end; …
  transaction
    rule-1 g1 a1;
    << id.guard() >>;
    << id.action() >>;
    rule-2 g2 a2;
  end; …
spec.m:
  …
  rule:id g a;
  …
50
The Framework of Muv
[Figure: Muv pipeline. Hardware Murphi model → parser → AST → pre-processor → AST’ → refinement check analysis → AST’’ → translator → VHDL model.]
Processing noted on the slide: constant propagation; rule:id, <<id.guard()>>; ruleset, transactionset …
51
Our Extensions to Muv
Language extension support
Refinement check assertions generation
  Ensure exclusive write to a variable
  Serializability for Spec rules
  Enableness for Spec rules
  Joint variables equivalence when inactive
Mostly done with static analysis
52
Refinement Extensions to Muv (I)
No write-write conflicts
An assignment
  v := d;
is translated into
  for i: s1..s2 do
    assert (update_bits[i] = false);
  end;
  v := d;
  for i: s1..s2 do
    update_bits[i] := true;
  end;
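The update-bit scheme on this slide can be mimicked in Python as follows. This is a hedged sketch rather than Muv's generated VHDL: the `TrackedVar` class, the `WriteConflict` exception, and the assumption that the flags are cleared at the end of each clock cycle are all introduced here for illustration.

```python
class WriteConflict(Exception):
    """Raised when two writes touch the same bit before the flags are cleared."""

class TrackedVar:
    """A bit-vector variable with one 'updated' flag per bit (hypothetical)."""

    def __init__(self, width):
        self.bits = [0] * width
        self.update_bits = [False] * width

    def write(self, lo, hi, data_bits):
        if len(data_bits) != hi - lo + 1:
            raise ValueError("data width does not match the bit range")
        # Check generated before the write: no bit in lo..hi already written.
        for i in range(lo, hi + 1):
            if self.update_bits[i]:
                raise WriteConflict(f"write-write conflict on bit {i}")
        # The write itself, followed by marking the bits as updated.
        self.bits[lo:hi + 1] = data_bits
        for i in range(lo, hi + 1):
            self.update_bits[i] = True

    def end_cycle(self):
        # Assumption: flags are reset at the end of every clock cycle.
        self.update_bits = [False] * len(self.bits)

v = TrackedVar(8)
v.write(0, 3, [1, 1, 1, 1])      # disjoint ranges in the same cycle are fine
v.write(4, 7, [0, 0, 0, 1])
try:
    v.write(2, 5, [0, 0, 0, 0])  # overlaps both earlier writes
    conflict = False
except WriteConflict:
    conflict = True
print(conflict)
```

Tracking individual bits rather than whole variables lets two rules write disjoint slices of the same variable in one cycle without raising a false conflict.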
53
Refinement Extensions to Muv (II)
Serializability for specification rules
[Figure: transactions t1, t2, t3 taking S0 to S1 concurrently, and an equivalent serial execution S0 → S’1 → S’2 → S1 via t1, t2, t3.]
Obtain read and write sets of variables of each rule
Analyze read-write dependency
Check for cycles
54
Check for Dependency Cycles
[Figure: transactions t1, t2, t3 between S0 and S1, with their read-write dependencies.]
t1: r(v1) w(v3)
t2: read v2, write v1
t3: read v1, write v2
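The three steps (collect read and write sets, analyze read-write dependencies, check for cycles) can be sketched as below. This is an illustrative Python version, not Muv's static analysis. Two things are assumptions: the dependency relation (an edge a → b whenever a writes a variable that b reads) and the attribution of r(v1) w(v3) to t1.

```python
def dependency_graph(rw_sets):
    """Edge a -> b when a writes a variable that b reads (assumed relation)."""
    graph = {t: set() for t in rw_sets}
    for a, (_, writes_a) in rw_sets.items():
        for b, (reads_b, _) in rw_sets.items():
            if a != b and writes_a & reads_b:
                graph[a].add(b)
    return graph

def has_cycle(graph):
    """Depth-first search with three colors; a grey-to-grey edge is a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = {n: WHITE for n in graph}

    def visit(n):
        color[n] = GREY
        for m in graph[n]:
            if color[m] == GREY:
                return True
            if color[m] == WHITE and visit(m):
                return True
        color[n] = BLACK
        return False

    return any(color[n] == WHITE and visit(n) for n in graph)

# (read set, write set) per transaction, following the labels on this slide.
rw_sets = {
    "t1": ({"v1"}, {"v3"}),
    "t2": ({"v2"}, {"v1"}),
    "t3": ({"v1"}, {"v2"}),
}
graph = dependency_graph(rw_sets)
print(has_cycle(graph))
```

Here t2 and t3 each read what the other writes, so the graph contains the cycle t2 → t3 → t2 and no serial order of the three transactions respects all dependencies.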
55
Refinement Extensions to Muv (III)
Enableness of specification rules
A Spec rule
  rule:id
    guard action;
is translated into
  bool function id_guard() {…}
  void procedure id_action(…) {…}
and an invocation in the Impl
  << id.guard() >>;
  << id.action() >>;
becomes
  assert id_guard();
  id_action();
56
Refinement Extensions to Muv (IV)
Joint variables equivalence when inactive
For each joint variable v: when all transactions that write to v are inactive, v must be equivalent in Impl and Spec
  …
  Transaction T1 …
  Transaction T2 …
  …
  Assert inactive(T1) & inactive(T2) => v = v’;
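The assertion for joint variables can be written out as a small check. The Python below is only a sketch: the dictionaries for Impl values, Spec values, writer sets, and activity flags, as well as the function name, are hypothetical stand-ins for whatever Muv derives from the model.

```python
def joint_var_violations(impl, spec, writers, active):
    """Return joint variables whose Impl and Spec values differ although
    every transaction that writes them is inactive."""
    violations = []
    for v, txns in writers.items():
        all_inactive = not any(active[t] for t in txns)
        if all_inactive and impl[v] != spec[v]:
            violations.append(v)
    return violations

writers = {"v": ["T1", "T2"]}      # transactions that write to v
impl = {"v": 1}                    # value of v in the Impl
spec = {"v": 2}                    # value of the corresponding Spec variable

# While T1 is still running, a mismatch is tolerated.
mid_transaction = joint_var_violations(
    impl, spec, writers, {"T1": True, "T2": False})

# Once both writers are inactive, the same mismatch is a violation.
at_rest = joint_var_violations(
    impl, spec, writers, {"T1": False, "T2": False})

print(mid_transaction, at_rest)
```

The check is deliberately silent while any writer is active, because an Impl transaction may pass through intermediate values that have no Spec counterpart.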
58
A Driving Protocol Benchmark
S. German and G. Janssen, IBM Research Tech Report 2006
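[Figure: the benchmark at two levels. One diagram shows Local, Home, and Remote units (Dir, Cache, Mem) communicating through Buf units and a Router; the other shows Local, Home, and Remote units (Dir, Cache, Mem) without the buffers and router.]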
59
More Detail of the Cache Example
Hardware Murphi model
  ~2500 LOC
  15 transactionsets
Generated VHDL
  ~1800 assertions, of which ~1600 are write-write conflict check assertions
  Took ~16 min with SixthSense for all assertions
  Took ~13 min without the write-write conflict checks
60
Bugs Found with Refinement Check
Benchmark already satisfies cache coherence
Bugs still found
  Bug 1: router unit loses messages
  Bug 2: home unit replies twice for one request
  Bug 3: cache unit gets updated twice from one reply
Refinement check is an automatic way of constructing such checks
61
Model Checking Approaches
Monolithic: straightforward property check
Compositional: divide and conquer
62
Compositional Refinement Check
Reduce the verification complexity
Basic techniques
  Abstraction: removing details to make verification easier
  Assume-guarantee: a simple form of induction which introduces assumptions and justifies them
63
Experimental Results
[Chart: verification time vs. datapath width (1-bit, 10-bit) for the monolithic and compositional approaches; the time axis is marked at 30 min and 1 day.]
Configuration: 2 nodes, 2 addresses, SixthSense
64
A Simple 2-Stage Pipelined Stack
pipelined pushes pipelined pops
overlapped pop & push
Push: push data + increase counter
Pop: decrease counter + pop data
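To make the two-step push and pop concrete, the sketch below compares a hypothetical two-stage stack against an atomic list-based Spec after every completed operation. It only models sequential execution; the overlapped pop and push case from this slide is not covered, and all names are illustrative.

```python
def spec_push(stack, d):
    # Spec: push is one atomic step.
    return stack + [d]

def spec_pop(stack):
    # Spec: pop is one atomic step; returns (new stack, popped value).
    return stack[:-1], stack[-1]

class TwoStageStack:
    """Hypothetical Impl in which every operation takes two steps."""

    def __init__(self, depth):
        self.mem = [None] * depth
        self.counter = 0

    def push(self, d):
        self.mem[self.counter] = d       # step 1: push data
        self.counter += 1                # step 2: increase counter

    def pop(self):
        self.counter -= 1                # step 1: decrease counter
        return self.mem[self.counter]    # step 2: pop data

    def abstract(self):
        # Observable contents: the live entries below the counter.
        return self.mem[:self.counter]

def check_trace(ops, depth=4):
    """Compare Impl and Spec after each completed operation."""
    spec, impl = [], TwoStageStack(depth)
    for op, arg in ops:
        if op == "push":
            spec = spec_push(spec, arg)
            impl.push(arg)
        else:
            spec, expected = spec_pop(spec)
            if impl.pop() != expected:
                return False
        if impl.abstract() != spec:
            return False
    return True

trace = [("push", 1), ("push", 2), ("pop", None),
         ("push", 3), ("pop", None), ("pop", None)]
print(check_trace(trace))
```

Between the two steps of a push, the data is already in memory but the counter has not moved, which is why the comparison is made only once the operation has finished.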
66
Future Work
Muv-like refinement check for interacting modules
  RTL modules interacting via communication protocols
  Interfaces involving buffers and pipelining
Refinement of initial RTL protocols
  Power-down issues
  Post-silicon validation support
  Runtime verification support
  Safe augmentation of verified protocols
  Cheap re-verification