Advanced Encryption Standard Hardware

  • 7/29/2019 Advanced Encryption Standard Hardware

    FPGA Implementation of the Advanced Encryption Standard

    Srihari Sridharan

    October 22nd 2007

    Efficient Implementation of Rijndael Encryption in Reconfigurable Hardware: Improvements and Design Tradeoffs

    Francois-Xavier Standaert, Gael Rouvroy, Jean-Jacques Quisquater, and Jean-Didier Legat

    CHES, Springer-Verlag Berlin Heidelberg, 2003

    OUTLINE

    Performance evaluation of the AES algorithm

    Effective FPGA implementation

    Heuristics to evaluate hardware efficiency

    Derive the optimum throughput/area efficiency

    Optimum throughput = 18.5 Gbps, area = 542 slices, 10 RAM blocks

    Hardware Description

    Hardware Description

    Device: Xilinx Virtex-E

    32448 slices

    64896 LUTs and flip-flops (two of each per slice)

    208 RAM blocks

    Synthesis: Synopsys

    Circuit modeling: VHDL

    Hardware Description

    2 slices per CLB

    Slice: 2 logic cells (LCs)

    LC: one 4-input LUT + storage element + additional logic

    Storage element: latch or edge-triggered D flip-flop

    Additional logic: F5 and F6 multiplexers

    Arithmetic logic: carry (CY) logic + XOR + AND

    Evaluation Parameters

    Two types of performance evaluation parameters

    In terms of performance

      Throughput: bits processed per second

      Area: slices

      Their ratio (throughput/area) is the evaluation parameter

    In terms of resources

      Number of LUTs

      Number of registers

      Their ratio is the evaluation parameter
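
The throughput/area ratio can be illustrated with a small calculation. The sketch below assumes the ratio is expressed in Mbps per slice and uses the two figures quoted on the Conclusion slide (1563 Mbps and 0.69); the function names are hypothetical and only serve this example.

```python
def throughput_per_area(throughput_mbps: float, slices: int) -> float:
    """Throughput/area efficiency, assumed here to be in Mbps per slice."""
    if slices <= 0:
        raise ValueError("slices must be positive")
    return throughput_mbps / slices


def implied_slices(throughput_mbps: float, efficiency: float) -> float:
    """Area in slices implied by a throughput and an efficiency figure."""
    if efficiency <= 0:
        raise ValueError("efficiency must be positive")
    return throughput_mbps / efficiency


# Figures quoted on the Conclusion slide: 1563 Mbps at a ratio of 0.69.
area = implied_slices(1563, 0.69)
print(f"implied area: about {area:.0f} slices")
print(f"ratio check: {throughput_per_area(1563, round(area)):.2f}")
```

Under that unit assumption, the two quoted figures imply a design of roughly 2265 slices.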

    Encryption Block

    Plain Text - Block Ciphers

    Input: 128-bit blocks

    State transformed

    S[r,c] = in[r + 4c]

    out[r + 4c] = S[r,c]

    for 0 <= r < 4 and 0 <= c < 4
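
The column-major mapping between a 128-bit block and the 4x4 state, S[r,c] = in[r + 4c] and out[r + 4c] = S[r,c], can be sketched as follows. The function names are hypothetical; the state is modelled as a list of four rows.

```python
def bytes_to_state(block: bytes) -> list[list[int]]:
    """Load a 16-byte block into a 4x4 state: state[r][c] = block[r + 4*c]."""
    if len(block) != 16:
        raise ValueError("AES blocks are 16 bytes (128 bits)")
    return [[block[r + 4 * c] for c in range(4)] for r in range(4)]


def state_to_bytes(state: list[list[int]]) -> bytes:
    """Unload the state: out[r + 4*c] = state[r][c]."""
    # Column index is the outer loop so bytes come out in order r + 4*c.
    return bytes(state[r][c] for c in range(4) for r in range(4))


block = bytes(range(16))
state = bytes_to_state(block)
print(state[0])                        # first row holds bytes 0, 4, 8, 12
print(state_to_bytes(state) == block)  # round trip restores the block
```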

    Implementation

    Two types of optimization

    Algorithmic

      SBox: multiplexer model, RAM based, composite field

      MixColumns: MixColumns transform, Mixadd transform

    Architectural

      Loop unrolling, pipelining, sub-pipelining

    SBox - Mux Model

    [Figure: SBox table]

    Mux Model - Background

    An N-input Boolean function G(x) is represented by

    In AES

    which is the bit representation

    Implemented as

    Mux Model

    Mux Model - Realization on FPGA

    LUT based

    4-input, 4-output lookup: four 4-input, 1-output LUTs

    Coupled with a 4:1 mux

    4:1 mux realized with three 2:1 muxes

    [Figure: 4:1 mux with data inputs a, b, c, d and select lines s0, s1]
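
Building a 4:1 mux from three 2:1 muxes can be modelled in a few lines. This is a behavioural sketch, not a hardware description; the names `mux2` and `mux4` are hypothetical, and the assignment of s0 to the first stage and s1 to the second is an assumption.

```python
def mux2(a: int, b: int, sel: int) -> int:
    """2:1 multiplexer: returns a when sel is 0, b when sel is 1."""
    return b if sel else a


def mux4(a: int, b: int, c: int, d: int, s0: int, s1: int) -> int:
    """4:1 multiplexer built from three 2:1 multiplexers."""
    low = mux2(a, b, s0)        # first stage picks within the pair (a, b)
    high = mux2(c, d, s0)       # first stage picks within the pair (c, d)
    return mux2(low, high, s1)  # second stage picks between the two pairs


# Select index (s1 s0) chooses a, b, c, d in that order.
for index in range(4):
    print(index, mux4("a", "b", "c", "d", s0=index & 1, s1=index >> 1))
```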

    Mux Model - Implementation

    Mux Model - Analysis

    1-bit output

    Repeated 16 times and looped 16 times

    Critical path: LUT4 + MUXF5 + MUXF6

    2-level pipelining

    12 clock pulses
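
The idea of producing one SBox output bit from 4-input LUTs followed by a multiplexer can be checked in software. The sketch below is a behavioural model under stated assumptions: the SBox is generated from its FIPS 197 definition (inverse in GF(2^8) followed by the affine step), the low nibble addresses sixteen 4-input LUTs, and the high nibble drives a 16:1 selection. Which nibble feeds the LUTs and all function names are assumptions made for illustration.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    result = 0
    for _ in range(8):
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
    return result


def sbox_value(x: int) -> int:
    """AES SBox entry: multiplicative inverse, then the affine step."""
    inv = next((c for c in range(1, 256) if gf_mul(x, c) == 1), 0)
    rotl = lambda v, k: ((v << k) | (v >> (8 - k))) & 0xFF
    return inv ^ rotl(inv, 1) ^ rotl(inv, 2) ^ rotl(inv, 3) ^ rotl(inv, 4) ^ 0x63


SBOX = [sbox_value(x) for x in range(256)]


def build_lut4s(bit: int) -> list[list[int]]:
    """Sixteen 4-input LUTs for one output bit, one LUT per high nibble."""
    return [[(SBOX[(h << 4) | l] >> bit) & 1 for l in range(16)]
            for h in range(16)]


def sbox_bit(x: int, luts: list[list[int]]) -> int:
    """Evaluate all 16 LUTs on the low nibble, then select by the high nibble."""
    candidates = [lut[x & 0xF] for lut in luts]  # 16 LUT4 outputs in parallel
    return candidates[x >> 4]                    # 16:1 multiplexer


def sbox_via_mux(x: int) -> int:
    """Reassemble the byte from its eight independently computed bits."""
    return sum(sbox_bit(x, build_lut4s(bit)) << bit for bit in range(8))
```

In this model each output bit needs 16 LUTs, so a full 8-bit SBox uses 128 of them before the multiplexers are counted.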

    SBox - RAM Based

    Lookup type

    BRAM: two single-port 256x8-bit memories

    Write enable of the RAM held low

    Input held low

    Implemented as a ROM

    1-clock design

    SBox = 16 x 16 x 8 = 2048 bits = 2 Kbits

    16 SBoxes for each state

    1 BRAM = two 2-Kbit RAMs

    Hence 8 BRAMs required
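
The block RAM count follows from simple arithmetic, reproduced here as a short sketch. The constant and function names are hypothetical; the figures are the ones stated on this slide.

```python
import math

SBOX_ENTRIES = 16 * 16    # 256 table entries
BITS_PER_ENTRY = 8        # each entry is one byte
SBOX_BITS = SBOX_ENTRIES * BITS_PER_ENTRY  # 2048 bits = 2 Kbits

SBOXES_PER_STATE = 16     # one lookup per state byte
SBOXES_PER_BRAM = 2       # one BRAM holds two 2-Kbit tables


def brams_needed(sboxes: int = SBOXES_PER_STATE,
                 per_bram: int = SBOXES_PER_BRAM) -> int:
    """Number of block RAMs needed to hold the given number of SBox tables."""
    return math.ceil(sboxes / per_bram)


print(SBOX_BITS, "bits per SBox")
print(brams_needed(), "BRAMs for 16 SBoxes")
```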

    Composite field - Math Basics

    Byte representation in the Galois field GF(2^8)

    E.g. 01100011 is x^6 + x^5 + x + 1

    Addition: modulo-2 arithmetic (no separate subtraction)

    Multiplication: polynomial multiplication modulo an irreducible polynomial of degree 8, m(x) = x^8 + x^4 + x^3 + x + 1

    Multiplicative inverse: b(x)a(x) + m(x)c(x) = 1

    b^-1(x) = a(x) mod m(x), because a(x)b(x) mod m(x) = 1

    E.g. 3m ≡ 1 (mod 11), so 3^-1 ≡ m (mod 11), giving m = 4
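
The field operations above can be sketched in software. This is a minimal model, not a hardware description: addition is XOR, multiplication is shift-and-add with reduction by m(x) = x^8 + x^4 + x^3 + x + 1 (0x11B), and the inverse is computed as a^254, which is one of several possible methods. The function names are hypothetical.

```python
AES_POLY = 0x11B  # m(x) = x^8 + x^4 + x^3 + x + 1


def gf_add(a: int, b: int) -> int:
    """Addition in GF(2^8) is bitwise XOR."""
    return a ^ b


def gf_mul(a: int, b: int) -> int:
    """Polynomial multiplication modulo m(x)."""
    result = 0
    for _ in range(8):
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:       # degree reached 8: reduce by m(x)
            a ^= AES_POLY
    return result


def gf_inv(a: int) -> int:
    """Multiplicative inverse as a^254 (square-and-multiply); maps 0 to 0."""
    result, base, exp = 1, a, 254
    while exp:
        if exp & 1:
            result = gf_mul(result, base)
        base = gf_mul(base, base)
        exp >>= 1
    return result


print(hex(0b01100011))            # the byte for x^6 + x^5 + x + 1
print(hex(gf_mul(0x57, 0x83)))    # product example from FIPS 197
print(pow(3, -1, 11))             # integer analogue: inverse of 3 modulo 11
```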

    Composite model equations - Multiplicative Inverse

    GF(2^8) = GF((2^4)^2)

    Element of GF(2^8) written as a1 x + a0, with a1, a0 in GF(2^4)

    x is a root of x^2 + x + λ = 0

    Inverse b1 x + b0 given by

    b0 = (a0 + a1) Δ^-1

    b1 = a1 Δ^-1

    Δ = a0 (a0 + a1) + λ a1^2
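
The composite-field inverse can be checked exhaustively in software. With x a root of x^2 + x + λ = 0, the inverse of a1 x + a0 is b1 x + b0 with b1 = a1 Δ^-1, b0 = (a0 + a1) Δ^-1 and Δ = a0 (a0 + a1) + λ a1^2. The sketch below makes two assumptions that the slides do not state: GF(2^4) is built with y^4 + y + 1, and λ is taken as the first value for which x^2 + x + λ has no root in GF(2^4). All names are hypothetical.

```python
def gf16_mul(a: int, b: int, poly: int = 0b10011) -> int:
    """Multiply in GF(2^4) modulo y^4 + y + 1 (assumed polynomial)."""
    result = 0
    for _ in range(4):
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0x10:
            a ^= poly
    return result


def gf16_inv(a: int) -> int:
    """Brute-force inverse in GF(2^4); maps 0 to 0."""
    return next((c for c in range(1, 16) if gf16_mul(a, c) == 1), 0)


# Assumed choice of lambda: x^2 + x + lambda must have no root in GF(2^4).
LAMBDA = next(l for l in range(1, 16)
              if all(gf16_mul(t, t) ^ t ^ l != 0 for t in range(16)))


def composite_mul(a: tuple[int, int], b: tuple[int, int]) -> tuple[int, int]:
    """Multiply (a1, a0) by (b1, b0), reducing with x^2 = x + lambda."""
    a1, a0 = a
    b1, b0 = b
    top = gf16_mul(a1, b1)
    hi = top ^ gf16_mul(a1, b0) ^ gf16_mul(a0, b1)
    lo = gf16_mul(top, LAMBDA) ^ gf16_mul(a0, b0)
    return hi, lo


def composite_inv(a: tuple[int, int]) -> tuple[int, int]:
    """Inverse of a1 x + a0 using the Delta formula."""
    a1, a0 = a
    delta = gf16_mul(a0, a0 ^ a1) ^ gf16_mul(LAMBDA, gf16_mul(a1, a1))
    d_inv = gf16_inv(delta)
    return gf16_mul(a1, d_inv), gf16_mul(a0 ^ a1, d_inv)
```

The attraction for hardware is that the only inversion left is in GF(2^4), which is a 4-input function and fits a single LUT per output bit.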

    Composite field - Affine Transformation

    Affine transformation = linear transformation + translation

    Linear transformation: rotations, scaling, shear

    Translation: shift

    In AES
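
For reference, the affine step of the AES SBox as defined in FIPS 197 can be sketched as a linear part (XOR of the byte with four of its rotations) plus a translation (XOR with the constant 0x63). The function names are hypothetical.

```python
def rotl8(b: int, k: int) -> int:
    """Rotate an 8-bit value left by k positions."""
    return ((b << k) | (b >> (8 - k))) & 0xFF


def aes_affine(b: int) -> int:
    """Linear part (XOR of rotations) followed by translation by 0x63."""
    linear = b ^ rotl8(b, 1) ^ rotl8(b, 2) ^ rotl8(b, 3) ^ rotl8(b, 4)
    return linear ^ 0x63


# Applied to the field inverse of a byte, this yields the SBox output.
# The inverse of 0x53 in GF(2^8) is 0xCA, and the SBox maps 0x53 to 0xED.
print(hex(aes_affine(0xCA)))
```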

    Composite field - implementation

    Mixcolumns transform - Background

    Four-term polynomials

    Coefficients are bytes

    Reduction polynomial M(x) = x^4 + 1

    Product defined as a(x) ⊗ b(x) = d(x)

    Mixcolumns transform - Equations

    Solution

    Multiplication of a GF(2^8) polynomial by x = multiplication by 02 = left shift plus conditional XOR (based on the MSB)
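
Multiplication by 02 as a left shift with a conditional XOR can be written directly. The name `xtime` follows FIPS 197; the reduction constant 0x1B is the low byte of m(x) = x^8 + x^4 + x^3 + x + 1.

```python
def xtime(a: int) -> int:
    """Multiply a GF(2^8) element by x (i.e. by 02)."""
    shifted = (a << 1) & 0xFF   # left shift, dropping the bit that overflows
    if a & 0x80:                # MSB was set: reduce modulo m(x)
        shifted ^= 0x1B
    return shifted


# Repeated doubling of 0x57, as in the FIPS 197 worked example.
value = 0x57
for _ in range(3):
    value = xtime(value)
    print(hex(value))
```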

    Mixcolumns transform - Implementation

    Mixcolumns transform - Implementation

    To implement

    03·a1 = (02 + 01)·a1 = 02·a1 + a1

    Hence we have

    2 multiplications by x (of a0 and a1)

    XOR addition of 5 terms: the two products above + a1 + a2 + a3

    2-level pipelined
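
The decomposition above, two multiplications by x plus a five-term XOR per output byte, can be sketched for one state column. The other three output bytes follow by rotating the roles of a0..a3. This models the arithmetic only, not the two pipeline stages; the function names are hypothetical.

```python
def xtime(a: int) -> int:
    """Multiply a GF(2^8) element by 02: left shift plus conditional XOR."""
    shifted = (a << 1) & 0xFF
    return shifted ^ 0x1B if a & 0x80 else shifted


def mix_column(col: list[int]) -> list[int]:
    """MixColumns on one column, each byte as X(ai) ^ X(aj) ^ aj ^ ak ^ al."""
    a0, a1, a2, a3 = col
    return [
        xtime(a0) ^ xtime(a1) ^ a1 ^ a2 ^ a3,  # 02·a0 + 03·a1 + a2 + a3
        xtime(a1) ^ xtime(a2) ^ a2 ^ a3 ^ a0,  # a0 + 02·a1 + 03·a2 + a3
        xtime(a2) ^ xtime(a3) ^ a3 ^ a0 ^ a1,  # a0 + a1 + 02·a2 + 03·a3
        xtime(a3) ^ xtime(a0) ^ a0 ^ a1 ^ a2,  # 03·a0 + a1 + a2 + 02·a3
    ]


# A commonly used MixColumns test column.
print([hex(b) for b in mix_column([0xDB, 0x13, 0x53, 0x45])])
```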

    Mixcolumns transform - Implementation

    Mixadd transform - Principle

    Inside X(a0) or X(a1), the operation is mostly a shift

    In both bytes the XOR is applied to only 3 bits

    So these three bits are added separately

    Now pipelined

    Combined with key addition
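
One way to read the three-bit observation: multiplication by x is a left shift in which the old MSB is wired into bit 0, and only bits 1, 3 and 4 need an actual XOR with that MSB. The sketch below separates the two parts and checks the split against the usual definition; the function names are hypothetical, and how the deferred XORs are merged with key addition in hardware is not modelled.

```python
def xtime(a: int) -> int:
    """Reference: multiply by x in GF(2^8) with reduction constant 0x1B."""
    shifted = (a << 1) & 0xFF
    return shifted ^ 0x1B if a & 0x80 else shifted


def xtime_split(a: int) -> tuple[int, int]:
    """Return (wiring_only, xor_bits) such that wiring_only ^ xor_bits == xtime(a)."""
    msb = a >> 7
    wiring_only = ((a << 1) & 0xFF) | msb              # shift; bit 0 takes the MSB
    xor_bits = (msb << 1) | (msb << 3) | (msb << 4)    # the three XORed positions
    return wiring_only, xor_bits


wiring, deferred = xtime_split(0xAE)
print(hex(wiring), hex(deferred), hex(wiring ^ deferred))
```

Because the deferred part is a plain XOR, it can be folded into a later XOR stage such as the round-key addition.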

    Mixadd transform - Implementation

    Unrolled Architecture

    10 AES rounds unrolled

    Lots of hardware

    Area is increased

    Throughput is increased

    Pipelined Architecture - I

    Only one round at a time

    Hardware reduced

    Throughput reduced

    Area reduced

    Pipelined Architecture - II

    All 10 rounds taken inside the loop

    Loss of the mixadd combination

    Additional mux

    Good choice in ASIC

    Heuristic optimization

    Results

    Pipelined - I architecture

    Unrolled Architecture

    Results (contd.)

    Comparison

    RAM/unrolled

    RAM/pipelined

    Mux/pipelined

    composite/pipelined

    Summary

    http://www.cs.bc.edu/~straubin/cs381-05/blockciphers/rijndael_ingles2004.swf

    Conclusion

    Algorithmic and architectural design tradeoffs were evaluated

    Optimum design principle found through heuristics

    Throughput = 1563 Mbps

    Performance (throughput/area) = 0.69

    Phase 2 preview

    Implement SBox: RAM based

    Implement MixColumns: MixColumns transform

    Implement AddKey: direct XOR

    Implement ShiftRow: simple cyclic shift
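
The two simplest Phase 2 steps, the cyclic row shift and the key XOR, can be sketched as a software reference model against which a hardware version could be compared. The state is assumed to be a list of four rows of four bytes, and the function names are hypothetical.

```python
def shift_rows(state: list[list[int]]) -> list[list[int]]:
    """Rotate row r of the 4x4 state left by r byte positions."""
    return [row[r:] + row[:r] for r, row in enumerate(state)]


def add_round_key(state: list[list[int]],
                  round_key: list[list[int]]) -> list[list[int]]:
    """XOR each state byte with the matching round-key byte."""
    return [[s ^ k for s, k in zip(state_row, key_row)]
            for state_row, key_row in zip(state, round_key)]


state = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
print(shift_rows(state))
print(add_round_key(state, state))  # XOR with itself gives all zeros
```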