The 3rd milestone in our project about Cryptol.
The Cryptol Epilogue: Swift and Bulletproof VHDL
Pedro Pereira, Ulisses Costa
Formal Methods in Software Engineering
June 18, 2009
Pedro Pereira, Ulisses Costa The Cryptol Epilogue: Swift and Bulletproof VHDL
Last milestone’s recap!
We had to
Generate an efficient and equivalent C implementation
We showed you
The first part of the user’s guide to the toolset
Cryptol → C conversion
An introduction to the Formal Methods’ subset
This time
We had to
Generate an efficient and equivalent VHDL implementation
We will show you
The last part of the user’s guide to the toolset ⇒ remaining interpreter modes
Cryptol → VHDL conversion
Hardware performance analysis
Real application of the Formal Methods’ suite
Intermediate Representation
IR is what Cryptol generates after parsing + type-checking
Format between the Abstract Syntax Tree and all the other backends
Explicitly annotated with types ⇒ allows for type-directed evaluation/translation in backends
Can be viewed using the :def command
Relevant Interpreter modes for Hardware design
Symbolic
Performs symbolic interpretation on the IR
LLSPIR
Compiles to LLSPIR, optimizing the circuit, and also provides rough profiling information of the final circuit
VHDL
Compiles to LLSPIR and then translates to VHDL, useful for generating VHDL that is manually integrated into another design
FPGA
Compiles to LLSPIR, translates to VHDL and uses external tools to synthesize the VHDL to an architecture-dependent netlist
Cryptol → VHDL conversion
Step 1
Remove constructs from specialized Cryptol implementation which are unsupported in the FPGA compiler
Step 2
Convert top-level function to stream model for performance analysis
Step 3
Adjust implementation according to space and time requirements
Step 4
Use reg pragma to pipeline the implementation
Step 1: FPGA backend limitations
The following are not supported
Division by anything other than powers of 2 (hardware limitation)
Recursive functions (recursive streams are fine)
Higher-order functions (partially: functions may be passed as parameters but cannot be returned)
These limitations are rarely a problem; in fact, only the second one applied to our specification and was easily resolved.
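The second limitation is usually worked around by turning a recursive function into an indexed recursive stream. The sketch below models that rewrite in Python rather than Cryptol; the names apply_n_recursive, apply_n_stream and double8 are hypothetical and exist only for illustration.

```python
from itertools import islice


def apply_n_recursive(f, x, n):
    """Recursive form: f applied n times to x (the shape the backend rejects)."""
    return x if n == 0 else f(apply_n_recursive(f, x, n - 1))


def apply_n_stream(f, x, n):
    """Stream form: build x, f(x), f(f(x)), ... and pick element n."""
    def stream():
        cur = x
        while True:
            yield cur
            cur = f(cur)
    return next(islice(stream(), n, None))


def double8(v):
    # Hypothetical 8-bit step function, used only for the demonstration.
    return (2 * v) & 0xFF
```

Both forms compute the same value; only the stream form has the shape of a circuit with feedback, which is what the stream model expects.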
Formal Methods to the rescue!
Let’s continue, but first...
Is our implementation
Safe ?
Correct ?
Equivalent ?
Safety Checking
:safe command
No evil zeroes (division by zero)
No illegal index accesses
And more, but these are sufficient for our purposes
snow3g v0.95> :set sbv
snow3g v0.95> :safe encrypt
“encrypt” is safe; no safety violations exist.
Theorem Proving
:prove command
Theorems are boolean functions
Proves that the theorem is equivalent to the function that always returns True regardless of its inputs
plaintext ⇔ decrypt . encrypt
theorem EncDec: {pt k i}. pt == decrypt(encrypt(pt, k, i), k, i);
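To make the shape of the EncDec property concrete, here is a small Python model. It is only a sketch: toy_keystream is a hypothetical stand-in and not SNOW 3G, and key and iv are single 32-bit words rather than the [4][32] values of the real cipher. Random testing samples inputs, whereas :prove covers all of them.

```python
import random


def toy_keystream(key, iv, n):
    """Hypothetical keystream: NOT SNOW 3G, just a deterministic stand-in."""
    state = (key ^ iv) & 0xFFFFFFFF
    out = []
    for _ in range(n):
        # 32-bit linear congruential step (constants are arbitrary).
        state = (1664525 * state + 1013904223) & 0xFFFFFFFF
        out.append(state)
    return out


def encrypt(pt, key, iv):
    """XOR each 32-bit plaintext word with the matching keystream word."""
    return [k ^ p for k, p in zip(toy_keystream(key, iv, len(pt)), pt)]


# For an XOR stream cipher, decryption is the same operation.
decrypt = encrypt


def enc_dec_holds(pt, key, iv):
    """The EncDec property evaluated on one concrete input."""
    return pt == decrypt(encrypt(pt, key, iv), key, iv)


def random_check(trials=1000, seed=0):
    """Sample the property on random inputs (a test, not a proof)."""
    rng = random.Random(seed)
    return all(
        enc_dec_holds([rng.getrandbits(32) for _ in range(4)],
                      rng.getrandbits(32), rng.getrandbits(32))
        for _ in range(trials)
    )
```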
Theorem Proving
JAIG
snow3g v0.95> :prove EncDec
Generating formal model of EncDec
Generating formal model of f where f : ([4][32],[4][32],[4][32]) -> Bit; f x = True;
37.519% (01:19:16 ETA)
JAIG eventually froze and crashed.
Theorem Proving
ABC
snow3g v0.95> :set abc
snow3g v0.95> :set symbolic +v
snow3g v0.95> :prove EncDec
Generating formal model of EncDec
Generating formal model of f where f : ([4][32],[4][32],[4][32]) -> Bit; f x = True;
Q.E.D.
ABC took 2 minutes to finish the proof.
Equivalence Checking
:eq command
Works with an incremental development model: successive versions of an algorithm can be proven equivalent to a previous specification ⇒ stepwise-refinement approach
Checks whether Cryptol’s translation to another language remains formally equivalent ⇒ Cryptol → VHDL for instance
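The stepwise-refinement idea can be illustrated with a Python sketch that compares a reference function against a refined one on every 8-bit input. Exhaustive enumeration is only a stand-in for the formal equivalence check that :eq performs; parity_spec, parity_refined and equivalent are hypothetical names chosen for the example.

```python
def parity_spec(x):
    """Reference version: XOR the 8 bits of x one at a time."""
    p = 0
    for i in range(8):
        p ^= (x >> i) & 1
    return p


def parity_refined(x):
    """Refined version: fold the byte onto itself in three steps."""
    x ^= x >> 4
    x ^= x >> 2
    x ^= x >> 1
    return x & 1


def equivalent(f, g, width=8):
    """Exhaustively compare f and g on every input of the given bit width."""
    return all(f(x) == g(x) for x in range(1 << width))
```

Enumeration stops being practical long before the 3 × 128-bit input of encrypt, which is why the formal tools are needed there.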
Equivalence Checking
Step 1 - :fm command
snow3g v0.95> :set abc
snow3g v0.95> :set symbolic +v
snow3g v0.95> :fm encrypt "./enc.aig"
Generating formal model of encrypt: ./enc.aig
Step 2 - :eq command
snow3g v0.95> :set LLSPIR
snow3g v0.95> :eq encrypt "./enc.aig"
True
Took less than 5 minutes to finish.
Checkpoint
Our implementation is
Safe ✓
Correct ✓
Equivalent ✓
What about efficiency?
Technical Jargon
Clockrate
Number of clock cycles per second on the FPGA, measured in Hz
Latency or Propagation delay
Time between feeding inputs to the circuit and receiving the corresponding outputs; latency is measured in clock cycles, propagation delay in seconds
Output rate
How often new input can be fed into the circuit to produce output, measured in inverse clock cycles (elements per cycle)
Throughput
Amount of information output by the circuit per unit of time, measured in bits/second
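These quantities are related by simple arithmetic, sketched below in Python. The formula (throughput = clockrate × output rate × bits per element) follows from the units above; the function names and the 100 MHz clockrate in the usage note are hypothetical.

```python
def throughput_bits_per_second(clockrate_hz, elements_per_cycle, bits_per_element):
    """Throughput = clockrate x output rate x width of one output element."""
    return clockrate_hz * elements_per_cycle * bits_per_element


def latency_seconds(latency_cycles, clockrate_hz):
    """Convert a latency given in clock cycles to seconds."""
    return latency_cycles / clockrate_hz
```

For example, a circuit producing one 128-bit element per cycle at an assumed 100 MHz would deliver 12.8 Gbit/s.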
Circuit representations: Combinatorial vs Sequential
Combinatorial circuit
Output is a pure function of present input and has no state
Unclocked
Sequential circuit
Output depends on past inputs or state
Clocked or Unclocked
Practical computer circuits contain a mixture of combinational and sequential logic
Circuit representations: Combinatorial vs Sequential
Combinatorial circuit
adderC : ([8] ,[8]) -> [255][8];
adderC (a, b) = [| (a + b) || (a, b) <- [0..254] |];
Sequential circuit
adderS : [8] -> [255][8];
adderS b = take (255, outs)
where outs = [ b ] # [| (a + b) || a <- outs |];
Cryptol’s generated circuits must be clocked; otherwise it is not possible to use clock constraints to produce useful timing analysis
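The difference between the two styles can be modelled in Python. This is a sketch, not a translation of the Cryptol above: adder_c models a single 8-bit addition, and adder_s models the feedback in adderS, where each output becomes the next value of a.

```python
from itertools import islice


def adder_c(a, b):
    """Combinational view: the sum depends only on the present inputs."""
    return (a + b) & 0xFF


def adder_s(b, n=255):
    """Sequential view: each output feeds back as the next 'a' (the state)."""
    def outs():
        a = b & 0xFF
        while True:
            yield a
            a = adder_c(a, b)
    return list(islice(outs(), n))
```

The stream produced for b is b, 2b, 3b, ... with 8-bit wraparound, so every element after the first depends on the previous output.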
Modelling Sequential Circuits
Step Model
Models circuits that are later lifted into stream model
Unclocked
Variation of type: (input, state) → (output, state)
Stream Model
Model uses infinite sequences over time
Each element in the input or output corresponds to some number of clock cycles ⇒ latency of the circuit
Variation of type: [inf]input → [inf]output
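Lifting a step function into the stream model can be sketched in Python with generators standing in for infinite sequences. The helper lift and the example accumulate_step are hypothetical; they only illustrate how (input, state) → (output, state) becomes [inf]input → [inf]output.

```python
from itertools import islice


def lift(step, initial_state):
    """Lift a step function (input, state) -> (output, state) to streams."""
    def run(inputs):
        state = initial_state
        for x in inputs:
            out, state = step(x, state)
            yield out
    return run


def accumulate_step(x, total):
    """Hypothetical step function: 8-bit running sum."""
    total = (total + x) & 0xFF
    return total, total


def naturals():
    """Infinite input stream 0, 1, 2, ..."""
    n = 0
    while True:
        yield n
        n += 1
```

Calling lift(accumulate_step, 0)(naturals()) yields an unbounded output stream; islice can be used to inspect a finite prefix.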
Performance Analysis
LLSPIR and FPGA modes report estimates of circuit latency, clockrate, space utilization and the longest path in a circuit
Guides towards a more efficient (faster and/or smaller) implementation
Cryptol expects top-level function to be defined in the stream model and will forcibly lift it otherwise
Performance Analysis: LLSPIR
Underestimates clockrate and provides rough estimate of space utilization
Users are encouraged to refine an implementation as much as possible in this mode before beginning synthesis in FPGA mode
Translation from LLSPIR to VHDL is trivial and takes less time than synthesis ⇒ if implementation is correct in LLSPIR, its correctness is highly probable in VHDL
Use :translate to compile a function to LLSPIR, producing a .dot file, and :set +v to print the performance information
Performance Analysis: FPGA
FSIM mode reports space utilization accurately but reported clockrate is overestimated (theoretical maximum)
TSIM mode reports the exact obtainable clockrate for a particular place-and-route attempt
fpga_clockrate and fpga_optlevel settings can significantly influence the place-and-route tool ⇒ experimentation is advised to obtain maximum possible clockrate
External profiling tools
Step 2: Lift top level functions
encrypt
encrypt : ([4][wsize], [4][wsize], [4][wsize])
-> [4][wsize];
encrypt(pt, key, iv)
= [| k ^ p || k <- GenKS(key, iv) || p <- pt |];
enc_lifted
enc_lifted : [inf]([4][wsize], [4][wsize], [4][wsize])
-> [inf][4][wsize];
enc_lifted ins = [| encrypt in || in <- ins |];
Performance Analysis: LLSPIR
enc_lifted
snow3g v0.94> :set LLSPIR +v
snow3g v0.94> :translate enc_lifted
Sorry, not implemented: timing dependencies are too complicated.
LLSPIR is not in canonical form.
Some serious optimization is required!
Step 3: Space/Time Tradeoffs
Block RAM
par and seq pragmas
Space/Time Tradeoffs: Block RAM
FPGA implementation of constant sequences such as S-Boxes
Simplifies design effort and reduces computational logic
The compiler tries the conversion by default
Doesn’t work if there are dynamic elements
It’s really fast!
Space/Time Tradeoffs: Block RAM
MULxPOW
MULxPOW : ([8], [8], [8]) -> [8];
MULxPOW(v, i, c) = res @ i
where res = [ v ] # [| MULx(e, c) || e <- res |];
The latency of this implementation is 2^8, because Cryptol implements synchronous circuits whose latency must be known statically ⇒ latency of this circuit is equal to the worst-case latency
We can be more efficient by implementing it as 8 static 256-element lookup tables ⇒ Block RAMs
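The table idea can be sketched in Python. The deck does not show MULx, so the definition below is an assumption based on the usual SNOW 3G description (shift left, XOR with c when the top bit was set); the constants 23 and 0xA9 are illustrative. With i and c fixed, MULxPOW depends only on the 8-bit v, so it fits in one 256-entry table.

```python
def mulx(v, c):
    """Assumed 8-bit MULx: shift left, XOR with c when the top bit was set."""
    shifted = (v << 1) & 0xFF
    return shifted ^ c if v & 0x80 else shifted


def mulxpow_stream(v, i, c):
    """Mirror of the stream definition: element i of v, MULx(v, c), ..."""
    e = v
    for _ in range(i):
        e = mulx(e, c)
    return e


def make_table(i, c):
    """Precompute MULxPOW(v, i, c) for all 256 values of v (fixed i and c)."""
    return [mulxpow_stream(v, i, c) for v in range(256)]


# Illustrative constants; the table is static once i and c are fixed.
TABLE = make_table(23, 0xA9)


def mulxpow_table(v):
    """A single lookup replaces up to i sequential MULx steps."""
    return TABLE[v]
```

The stream version needs as many steps as the index, while the table version needs one read, which is the trade made when the compiler maps constant sequences to Block RAM.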
Space/Time Tradeoffs: Block RAM
MULa before static tables
=== Circuit Timing ===
circuit latency: 246 cycles (245 cycles plus propagation delay)
circuit rate: N/A
output length: one element
total time: 246 cycles (245 cycles plus propagation delay)
MULa after static tables
=== Circuit Timing ===
circuit latency: 3 cycles (2 cycles plus propagation delay)
circuit rate: N/A
output length: one element
total time: 3 cycles (2 cycles plus propagation delay)
Space/Time Tradeoffs: par and seq
par
Forces parallelization
Replicates circuitry
Faster but consumes more space
seq
Forces sequentialization
Reuses circuitry over multiple clock cycles
Slower but consumes less space
par pragma
Example
map : {a b} (a -> b, [4]a) -> [4]b;
map(f, xs) = [| (f x) || x <- xs |];
There’s no need to use par explicitly, because parallelization is the compiler’s default in order to improve overall performance
seq pragma
Example
map : {a b} (a -> b, [4]a) -> [4]b;
map(f, xs) = seq [| (f x) || x <- xs |];
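The trade-off between the two pragmas can be summarized with a deliberately idealized cost model in Python. The functions par_cost and seq_cost are hypothetical, and the model ignores control logic, routing and register overhead; it only captures "replicate the circuit" versus "reuse the circuit".

```python
def par_cost(n_elements, unit_area, unit_cycles):
    """par: one copy of the circuit per element, all running at once."""
    return {"area": n_elements * unit_area, "cycles": unit_cycles}


def seq_cost(n_elements, unit_area, unit_cycles):
    """seq: a single copy of the circuit reused for each element in turn."""
    return {"area": unit_area, "cycles": n_elements * unit_cycles}
```

For the 4-element map above, this model gives par four times the area and seq four times the cycle count; real figures should come from the LLSPIR and FPGA reports.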
Step 4: Pipelining
reg pragma
Sequential circuits in the stream model can be pipelined
Separation of a function into several smaller computational units
Each unit is a stage in the pipeline consuming output from previous stage and producing output to the next
Typically increases area and latency of circuit but can dramatically increase clockrate and throughput
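The behaviour of a pipeline can be seen in a small cycle-by-cycle Python simulation. This is a sketch of pipelining in general, with one register after each stage, and not a model of what the reg pragma generates; simulate_pipeline is a hypothetical helper and None marks an empty register.

```python
def simulate_pipeline(stages, inputs):
    """Cycle-by-cycle model with one register after each stage.

    Returns a list of (cycle, output) pairs, one per input.
    """
    regs = [None] * len(stages)                 # register after each stage
    feed = list(inputs) + [None] * len(stages)  # extra cycles to drain
    results = []
    for cycle, x in enumerate(feed, start=1):
        # Update back to front so each stage reads last cycle's value.
        for k in range(len(stages) - 1, 0, -1):
            regs[k] = None if regs[k - 1] is None else stages[k](regs[k - 1])
        regs[0] = None if x is None else stages[0](x)
        if regs[-1] is not None:
            results.append((cycle, regs[-1]))
    return results
```

With three stages, the first result appears in cycle 3 and one further result follows in every cycle after that: latency grows with the number of stages while the output rate stays at one element per cycle.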
Performance Analysis - LLSPIR
enc_lifted
snow3g v0.95> :translate enc_lifted
=== Circuit Timing ===
circuit latency: 25 cycles (24 cycles plus propagation delay)
circuit rate: one element per cycle
output length: unbounded
total time: unbounded
Conclusions
Language
Combination of arithmetic and sequence manipulation ⇒ compact syntax and easy to learn
Infinite sequences
Size and shape polymorphism
Really captures the elegance and abstract mathematical essence of ciphers’ specifications
Conclusions
Formal Methods’ tools
Possible to check if implementations are safe to execute, correct and formally identical to their specifications
They work in real scenarios
Push button package ⇒ avoids specific annotations and effort to learn external languages
Conclusions
FPGA synthesis
Performance analysis
Compiler pragmas are provided to make simple and effective space/time tradeoffs
Can generate implementations that are more efficient than hand-made ones ⇒ saving loads of time
Questions?