The Cryptol Epilogue: Swift and Bulletproof VHDL Pedro Pereira Ulisses Costa Formal Methods in Software Engineering June 18, 2009 Pedro Pereira, Ulisses Costa The Cryptol Epilogue: Swift and Bulletproof VHDL

The Cryptol Epilogue: Swift and Bulletproof VHDL


The 3rd milestone in our project about Cryptol.


The Cryptol Epilogue: Swift and Bulletproof VHDL

Pedro Pereira Ulisses Costa

Formal Methods in Software Engineering

June 18, 2009

Pedro Pereira, Ulisses Costa The Cryptol Epilogue: Swift and Bulletproof VHDL


Last milestone’s recap!

We had to

Generate an efficient and equivalent C implementation

We showed you

The first part of the user’s guide to the toolset

Cryptol → C conversion

An introduction to the Formal Methods’ subset


This time

We had to

Generate an efficient and equivalent VHDL implementation

We will show you

The last part of the user’s guide to the toolset ⇒ remaining interpreter modes

Cryptol → VHDL conversion

Hardware performance analysis

Real application of the Formal Methods’ suite


Intermediate Representation

IR is what Cryptol generates after parsing + type-checking

Format between the Abstract Syntax Tree and all the other backends

Explicitly annotated with types ⇒ allows for type-directed evaluation/translation in backends

Can be viewed using the :def command


Relevant Interpreter modes for Hardware design

Symbolic

Performs symbolic interpretation on the IR

LLSPIR

Compiles to LLSPIR, optimizing the circuit, and also provides rough profiling information of the final circuit

VHDL

Compiles to LLSPIR and then translates to VHDL, useful for generating VHDL that is manually integrated into another design

FPGA

Compiles to LLSPIR, translates to VHDL and uses external tools to synthesize the VHDL to an architecture-dependent netlist


Cryptol → VHDL conversion

Step 1

Remove constructs from the specialized Cryptol implementation which are unsupported in the FPGA compiler

Step 2

Convert top-level function to stream model for performance analysis

Step 3

Adjust implementation according to space and time requirements

Step 4

Use reg pragma to pipeline the implementation


Step 1: FPGA backend limitations

The following are not supported

Division by anything other than powers of 2 (hardware limitation)

Recursive functions (recursive streams are fine)

Higher-order functions (partially: functions may be passed as parameters but cannot be returned)

These limitations are rarely a problem; in fact, only the second one applied to our specification and was easily resolved.


Formal Methods to the rescue!

Let’s continue, but first...

Is our implementation

Safe ?

Correct ?

Equivalent ?


Safety Checking

:safe command

No evil zeroes

No illegal index accesses

And more but these are sufficient

snow3g v0.95> :set sbv
snow3g v0.95> :safe encrypt
"encrypt" is safe; no safety violations exist.


Theorem Proving

:prove command

Theorems are boolean functions

Proves that the theorem is equivalent to the function that always returns true regardless of its inputs

plaintext ⇔ decrypt . encrypt

theorem EncDec: {pt k i}. pt == decrypt(encrypt(pt, k, i), k, i);
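The round-trip property can be illustrated outside Cryptol with a toy XOR stream cipher. This is only a sketch under stated assumptions: `toy_keystream` is a hypothetical stand-in for GenKS (it is not SNOW 3G), and random testing gives far weaker assurance than the proof that :prove performs over all inputs.

```python
import random

MASK32 = 0xFFFFFFFF


def toy_keystream(key, iv, n):
    """Hypothetical stand-in for GenKS: n 32-bit words derived from key and iv."""
    state = 0
    for k, v in zip(key, iv):
        state = (state * 0x9E3779B1 + (k ^ v)) & MASK32
    words = []
    for _ in range(n):
        state = (state * 1664525 + 1013904223) & MASK32  # simple LCG step
        words.append(state)
    return words


def encrypt(pt, key, iv):
    """XOR each plaintext word with the matching keystream word."""
    return [p ^ k for p, k in zip(pt, toy_keystream(key, iv, len(pt)))]


# XOR with the same keystream undoes the encryption.
decrypt = encrypt


def enc_dec(pt, key, iv):
    """Boolean function mirroring the EncDec theorem."""
    return pt == decrypt(encrypt(pt, key, iv), key, iv)


def random_block(rng):
    """Four random 32-bit words, matching the [4][32] shape."""
    return [rng.getrandbits(32) for _ in range(4)]


rng = random.Random(0)
all_hold = all(
    enc_dec(random_block(rng), random_block(rng), random_block(rng))
    for _ in range(1000)
)
print(all_hold)
```

The theorem is an ordinary boolean function here as well; the difference is that the sketch samples 1000 inputs while a prover covers the full 384-bit input space.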


Theorem Proving

JAIG

snow3g v0.95> :prove EncDec
Generating formal model of EncDec
Generating formal model of f where f : ([4][32],[4][32],[4][32]) -> Bit; f x = True;
37.519% (01:19:16 ETA)

JAIG eventually froze and crashed.


Theorem Proving

ABC

snow3g v0.95> :set abc
snow3g v0.95> :set symbolic +v
snow3g v0.95> :prove EncDec
Generating formal model of EncDec
Generating formal model of f where f : ([4][32],[4][32],[4][32]) -> Bit; f x = True;
Q.E.D.

ABC took 2 minutes to finish the proof.


Equivalence Checking

:eq command

Works with an incremental development model: successive versions of an algorithm can be proven equivalent to a previous specification ⇒ stepwise-refinement approach

Checks whether Cryptol’s translation to another language remains formally equivalent ⇒ Cryptol → VHDL for instance
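The stepwise-refinement idea can be sketched in Python by comparing a reference function against a refined version over every input. The definition of MULx used here is an assumption based on our reading of the SNOW 3G specification, and `mulx_refined` is a hypothetical branch-free rewrite. Exhaustive enumeration is only feasible because the function has 16 input bits; the :eq flow works on formal models instead.

```python
def mulx_spec(v, c):
    """Reference version: shift left, XOR with c if the top bit was set."""
    if v & 0x80:
        return ((v << 1) ^ c) & 0xFF
    return (v << 1) & 0xFF


def mulx_refined(v, c):
    """Branch-free refinement: build an all-ones mask from the top bit."""
    mask = -(v >> 7) & 0xFF  # 0xFF when the top bit is set, 0x00 otherwise
    return ((v << 1) & 0xFF) ^ (c & mask)


def equivalent(f, g):
    """Compare f and g on every pair of 8-bit inputs."""
    return all(f(v, c) == g(v, c) for v in range(256) for c in range(256))


print(equivalent(mulx_spec, mulx_refined))
```

Each refinement step is checked against the previous version, so a mistake is caught at the step that introduced it.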


Equivalence Checking

Step 1 - :fm command

snow3g v0.95> :set abc
snow3g v0.95> :set symbolic +v
snow3g v0.95> :fm encrypt "./enc.aig"
Generating formal model of encrypt: ./enc.aig

Step 2 - :eq command

snow3g v0.95> :set LLSPIR
snow3g v0.95> :eq encrypt "./enc.aig"
True

Took less than 5 minutes to finish.


Checkpoint

Our implementation is

Safe ✓

Correct ✓

Equivalent ✓

What about efficiency?


Technical Jargon

Clockrate

Rate of clock cycles per second on the FPGA measured in Hz

Latency or Propagation delay

Amount of time between inputs being fed to the circuit and the corresponding outputs, measured in number of clock cycles or in seconds respectively

Output rate

Indicates how long one must wait before feeding input into the circuit to produce output and is measured in inverse clock cycles

Throughput

Amount of information that is output from the circuit per unit of time, measured in bits/second
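These quantities relate to each other by simple arithmetic, sketched below in Python. The 100 MHz clock rate is a hypothetical value chosen for illustration; the 25-cycle latency, one-element-per-cycle rate and 128-bit output width are used as example inputs only.

```python
def latency_seconds(latency_cycles, clockrate_hz):
    """Convert a latency in clock cycles into a propagation delay in seconds."""
    return latency_cycles / clockrate_hz


def throughput_bits_per_second(clockrate_hz, outputs_per_cycle, bits_per_output):
    """Throughput = clock rate x output rate x width of each output."""
    return clockrate_hz * outputs_per_cycle * bits_per_output


CLOCKRATE_HZ = 100_000_000  # hypothetical 100 MHz clock

delay = latency_seconds(25, CLOCKRATE_HZ)
rate = throughput_bits_per_second(CLOCKRATE_HZ, 1, 128)

print(delay)  # propagation delay in seconds
print(rate)   # throughput in bits per second
```

Under these assumed numbers the circuit would take 250 ns to produce its first output and would then deliver 12.8 Gbit/s.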


Circuit representations: Combinatorial vs Sequential

Combinatorial circuit

Output is a pure function of present input and has no state

Unclocked

Sequential circuit

Output depends on past inputs or state

Clocked or Unclocked

Practical computer circuits contain a mixture of combinational and sequential logic


Circuit representations: Combinatorial vs Sequential

Combinatorial circuit

adderC : ([8], [8]) -> [255][8];

adderC (a, b) = [| (a + b) || (a, b) <- [0..254] |];

Sequential circuit

adderS : [8] -> [255][8];

adderS b = take (255, outs)

where outs = [ b ] # [| (a + b) || a <- outs |];

Cryptol’s generated circuits must be clocked, otherwise it’s not possible to make use of clock constraints to produce useful timing analysis
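The feedback in adderS can be modelled in Python to make the role of state concrete. This is a sketch, not generated hardware: each generator step stands for one clock cycle, the variable `a` plays the part of the register, and `adder_closed_form` is a hypothetical stateless calculation added only for comparison.

```python
from itertools import islice


def adder_s(b, n=255):
    """Model of adderS: outs = [b] # [a + b for a in outs], first n elements."""
    def outs():
        a = b & 0xFF              # initial register contents
        while True:
            yield a
            a = (a + b) & 0xFF    # feedback: next state depends on the last output
    return list(islice(outs(), n))


def adder_closed_form(b, n=255):
    """Stateless view: element i is (i + 1) * b modulo 2^8."""
    return [((i + 1) * b) & 0xFF for i in range(n)]


print(adder_s(3)[:5])
```

Both functions produce the same sequence, but only the first one needs to remember a value between steps, which is what makes it sequential.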


Modelling Sequential Circuits

Step Model

Models circuits that are later lifted into stream model

Unclocked

Variation of type: (input, state) → (output, state)

Stream Model

Model uses infinite sequences over time

Each element in the input or output corresponds to some number of clock cycles ⇒ latency of the circuit

Variation of type: [inf]input → [inf]output
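Lifting a step-model function into the stream model can be sketched with Python generators. The names `lift` and `accumulate_step` are hypothetical; the only assumption taken from the slide is the shape of the two types, (input, state) → (output, state) and [inf]input → [inf]output.

```python
from itertools import count, islice


def lift(step, initial_state):
    """Turn a step function into a function over (possibly infinite) streams."""
    def stream_fn(inputs):
        state = initial_state
        for x in inputs:
            out, state = step(x, state)  # one step per input element
            yield out
    return stream_fn


def accumulate_step(x, total):
    """Example step function: 8-bit running sum, returned as output and new state."""
    total = (total + x) & 0xFF
    return total, total


running_sum = lift(accumulate_step, 0)

# count(1) is an infinite input stream 1, 2, 3, ...; only five outputs are forced.
first_five = list(islice(running_sum(count(1)), 5))
print(first_five)
```

The step function itself stays free of any notion of time; the lifted version is the one that threads state from one stream element to the next.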


Performance Analysis

LLSPIR and FPGA modes report estimates of circuit latency, clockrate, space utilization and the longest path in a circuit

Guides towards a more efficient (faster and/or smaller) implementation

Cryptol expects the top-level function to be defined in the stream model and will forcibly lift it otherwise


Performance Analysis: LLSPIR

Underestimates clockrate and provides a rough estimate of space utilization

Users are encouraged to refine an implementation as much as possible in this mode before beginning synthesis in FPGA mode

Translation from LLSPIR to VHDL is trivial and takes less time than synthesis ⇒ if the implementation is correct in LLSPIR, its correctness is highly probable in VHDL

Use :translate to compile a function to LLSPIR, producing a .dot file, and :set +v to print the performance information


Performance Analysis: FPGA

FSIM mode reports space utilization accurately but the reported clockrate is overestimated (theoretical maximum)

TSIM mode reports the exact obtainable clockrate for a particular place-and-route attempt

fpga clockrate and fpga optlevel settings can significantly influence the place-and-route tool ⇒ experimentation is advised to obtain the maximum possible clockrate

External profiling tools


Step 2: Lift top level functions

encrypt

encrypt : ([4][wsize], [4][wsize], [4][wsize]) -> [4][wsize];

encrypt (pt, key, iv)
  = [| k ^ p || k <- GenKS(key, iv) || p <- pt |];

enc_lifted

enc_lifted : [inf]([4][wsize], [4][wsize], [4][wsize]) -> [inf][4][wsize];

enc_lifted ins = [| encrypt in || in <- ins |];


Performance Analysis: LLSPIR

enc_lifted

snow3g v0.94> :set LLSPIR +v
snow3g v0.94> :translate enc_lifted
Sorry, not implemented: timing dependencies are too complicated.
LLSPIR is not in canonical form.

Some serious optimization is required!


Step 3: Space/Time Tradeoffs

Block RAM

par and seq pragmas


Space/Time Tradeoffs: Block RAM

FPGA implementation of constant sequences such as S-Boxes

Simplifies design effort and reduces computational logic

The compiler tries the conversion by default

Doesn’t work if there are dynamic elements

It’s really fast!


Space/Time Tradeoffs: Block RAM

MULxPOW

MULxPOW : ([8], [8], [8]) -> [8];

MULxPOW(v, i, c) = res @ i

where res = [ v ] # [| MULx(e, c) || e <- res |];

The latency of this implementation is 2^8, because Cryptol implements synchronous circuits whose latency must be known statically ⇒ the latency of this circuit is equal to the worst-case latency over all values of the 8-bit index i

We can be more efficient by implementing it as 8 static 256-element lookup tables ⇒ Block RAMs
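The table idea can be sketched in Python: once i and c are fixed, MULxPOW depends only on the byte v, so all 256 results can be precomputed. The MULx definition is an assumption based on our reading of the SNOW 3G specification, and the exponent 245 and constant 0xA9 are illustrative choices rather than values stated on this slide.

```python
def mulx(v, c):
    """Assumed MULx: shift left, XOR with c if the top bit was set (8-bit)."""
    if v & 0x80:
        return ((v << 1) ^ c) & 0xFF
    return (v << 1) & 0xFF


def mulxpow(v, i, c):
    """Iterative form mirroring res @ i: apply MULx i times, starting from v."""
    for _ in range(i):
        v = mulx(v, c)
    return v


def build_table(i, c):
    """Precompute mulxpow(v, i, c) for every byte v: one 256-element table."""
    return [mulxpow(v, i, c) for v in range(256)]


TABLE_245 = build_table(245, 0xA9)


def mulxpow_245(v):
    """Table-driven version: a single lookup instead of 245 iterations."""
    return TABLE_245[v]


print(len(TABLE_245), mulxpow_245(0x01) == mulxpow(0x01, 245, 0xA9))
```

The iterative version does work proportional to i, while the lookup costs the same for every input, which is the property a Block RAM implementation exploits.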


Space/Time Tradeoffs: Block RAM

MULa before static tables

=== Circuit Timing ===
circuit latency: 246 cycles (245 cycles plus propagation delay)
circuit rate: N/A
output length: one element
total time: 246 cycles (245 cycles plus propagation delay)

MULa after static tables

=== Circuit Timing ===
circuit latency: 3 cycles (2 cycles plus propagation delay)
circuit rate: N/A
output length: one element
total time: 3 cycles (2 cycles plus propagation delay)


Space/Time Tradeoffs: par and seq

par

Forces parallelization

Replicates circuitry

Faster but consumes more space

seq

Forces sequentialization

Reuses circuitry over multiple clock cycles

Slower but consumes less space
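The tradeoff can be made concrete with a toy cost model in Python. It is a deliberate simplification and an assumption on our part: it counts only replicated units and cycles, ignores control logic and routing, and says nothing about how the compiler actually schedules a comprehension.

```python
import math


def schedule(n_elements, units, cycles_per_element=1):
    """Cycles needed to process n_elements when `units` copies of the circuit exist."""
    rounds = math.ceil(n_elements / units)
    return {"units": units, "cycles": rounds * cycles_per_element}


# Four-element comprehension, as in a [4]a -> [4]b map.
par_cost = schedule(4, units=4)  # one copy of f per element
seq_cost = schedule(4, units=1)  # a single copy of f reused over time

print(par_cost, seq_cost)
```

In this model par spends four units to finish in one cycle, while seq spends one unit and four cycles.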


par pragma

Example

map : {a b} (a -> b, [4]a) -> [4]b;

map(f, xs) = [| (f x) || x <- xs |];

There’s no need to use par because it’s the compiler’s default action in order to improve overall performance


seq pragma

Example

map : {a b} (a -> b, [4]a) -> [4]b;

map(f, xs) = seq [| (f x) || x <- xs |];


Step 4: Pipelining

reg pragma

Sequential circuits in the stream model can be pipelined

Separation of a function into several smaller computational units

Each unit is a stage in the pipeline consuming output from the previous stage and producing output to the next

Typically increases area and latency of the circuit but can dramatically increase clockrate and throughput
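The effect of pipelining on clock period and latency can be sketched with a small Python model. The stage delays are hypothetical numbers chosen for illustration, and the model assumes ideal registers unless `register_overhead_ns` is set.

```python
def unpipelined(stage_delays_ns):
    """Without registers the clock period must cover the whole combinational path."""
    period = sum(stage_delays_ns)
    return {"period_ns": period, "latency_cycles": 1, "latency_ns": period}


def pipelined(stage_delays_ns, register_overhead_ns=0):
    """With a register after each stage, the slowest stage sets the clock period."""
    period = max(stage_delays_ns) + register_overhead_ns
    stages = len(stage_delays_ns)
    return {
        "period_ns": period,
        "latency_cycles": stages,
        "latency_ns": stages * period,
    }


delays = [4, 6, 5]  # hypothetical per-stage delays in nanoseconds

before = unpipelined(delays)
after = pipelined(delays)

print(before)
print(after)
```

With these assumed delays the clock period drops from 15 ns to 6 ns, so one result can be produced every 6 ns instead of every 15 ns, while the time for a single input to travel through the circuit grows from 15 ns to 18 ns.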


Performance Analysis - LLSPIR

enc_lifted

snow3g v0.95> :translate enc_lifted
=== Circuit Timing ===
circuit latency: 25 cycles (24 cycles plus propagation delay)
circuit rate: one element per cycle
output length: unbounded
total time: unbounded


Conclusions

Language

Combination of arithmetic and sequence manipulation ⇒ compact syntax that is easy to learn

Infinite sequences

Size and shape polymorphism

Really captures the elegance and abstract mathematical essence of ciphers’ specifications


Conclusions

Formal Methods’ tools

Possible to check if implementations are safe to execute, correct and formally identical to their specifications

They work in real scenarios

Push-button package ⇒ avoids specific annotations and the effort of learning external languages


Conclusions

FPGA synthesis

Performance analysis

Compiler pragmas are provided to make simple and effective space/time tradeoffs

Can generate implementations that are more efficient than hand-made ones ⇒ saving loads of time


Questions

?
