Comparison of Hardware Implementations of S-box and T-box architectures of AES Bhupathi Kakarlapudi and Nitin Alabur ECE 746 : Secure Telecommunication systems Instructor: Dr. Kris Gaj

AES T-Box Slides


Agenda

Introduction

Motivation

Overview of architectures

Implementations

Key Scheduling

Test vectors and tools used

Results

Conclusion

Introduction to AES

In 1997, NIST initiated a contest known as AES to develop a Federal Information Processing Standard.

The standard should be capable of protecting sensitive government information well into the next century.

After nearly 5 years of extensive analysis, Rijndael was chosen as the winner of the contest and became an official standard in Nov. 2001.

AES is expected to be used by the U.S. Government and, on a voluntary basis, by the private sector.

Motivation

AES T-box implementations of decryption units and of combined encryption/decryption units in software showed better throughput than S-box implementations in software.

This performance improvement was shown in hardware on Altera Flex devices by Viktor Fischer and Milos Drutarovsky.

Our idea is to show the same performance improvement of the T-box architecture in hardware on the Xilinx FPGA families Virtex 5 and Spartan 3E.

S-box vs T-box

The S-box architecture uses 8 x 8 look-up tables plus the remaining round operations for encryption/decryption.

The T-box architecture uses 8 x 32 look-up tables plus the remaining XOR operations for encryption/decryption.

The T-box architecture uses 4 times more memory than the S-box architecture.

(S-box: 16 tables of 8 x 8; T-box: 16 tables of 8 x 32)

S-box Architecture Overview

The structure of this architecture is the same as the general architecture proposed for AES.

Encryption starts with AddRoundKey and then performs the round operations:

SubBytes (uses 8 x 8 look-up tables), ShiftRows, MixColumns and AddRoundKey.

The last round does not include the MixColumns operation.
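The round sequence on this slide can be sketched in software as follows. This is a minimal Python model for AES-128 only, not the authors' hardware description. The helper names (`gmul`, `make_sbox`, `expand_key_128`, `encrypt_block`) are illustrative assumptions, and the S-box is generated at start-up rather than stored as the 8 x 8 look-up table a hardware design would use.

```python
def gmul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    p = 0
    for _ in range(8):
        if b & 1:
            p ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return p

def make_sbox():
    """Build the 8 x 8 S-box: field inverse followed by the affine map."""
    sbox = []
    for x in range(256):
        inv = next((y for y in range(1, 256) if gmul(x, y) == 1), 0)
        s = inv
        for shift in range(1, 5):
            s ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        sbox.append(s ^ 0x63)
    return sbox

SBOX = make_sbox()

# The state is a list of 16 bytes in column-major order: state[4*c + r].
def sub_bytes(s):
    return [SBOX[b] for b in s]

def shift_rows(s):
    return [s[4 * ((c + r) % 4) + r] for c in range(4) for r in range(4)]

def mix_columns(s):
    out = []
    for c in range(4):
        a = s[4 * c:4 * c + 4]
        out += [gmul(a[r], 2) ^ gmul(a[(r + 1) % 4], 3)
                ^ a[(r + 2) % 4] ^ a[(r + 3) % 4] for r in range(4)]
    return out

def add_round_key(s, k):
    return [a ^ b for a, b in zip(s, k)]

def expand_key_128(key):
    """Return the 11 round keys (16 bytes each) for a 128-bit key."""
    w = [list(key[4 * i:4 * i + 4]) for i in range(4)]
    rcon = 1
    for i in range(4, 44):
        t = list(w[i - 1])
        if i % 4 == 0:
            t = [SBOX[b] for b in t[1:] + t[:1]]  # RotWord, then SubWord
            t[0] ^= rcon
            rcon = gmul(rcon, 2)
        w.append([a ^ b for a, b in zip(w[i - 4], t)])
    return [sum(w[4 * r:4 * r + 4], []) for r in range(11)]

def encrypt_block(plaintext, key):
    nr = 10  # number of rounds for a 128-bit key
    rk = expand_key_128(key)
    state = add_round_key(list(plaintext), rk[0])  # initial AddRoundKey
    for i in range(1, nr + 1):
        state = shift_rows(sub_bytes(state))
        if i < nr:  # the last round skips MixColumns
            state = mix_columns(state)
        state = add_round_key(state, rk[i])
    return bytes(state)
```

The model can be checked against the AES-128 example vector in FIPS 197, the same publication the test vectors in this deck come from.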

S-box Enc/Dec

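[Figure: S-box encryption and decryption data paths.
a) Encryption (Plaintext to Ciphertext): blocks SubBytes, ShiftRows and MixColumn; round keys K0 and Ki; branch labels i < Nr and i = Nr.
b) Decryption (Ciphertext to Plaintext): blocks InvMixColumn, InvShiftRows and InvSubBytes; round keys KNr and Ki; branch labels i >= 0 and i = Nr.]

Nr: total number of rounds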

T-box architecture overview

This architecture allows an entire round to be computed using only look-up tables and XOR operations.

Pre-computed look-up tables represent the combined SubBytes and MixColumns transformations.

T-box tables are of size 8 x 32 bits (8-bit address, 32-bit output).

Memory of T-box tables

One T-box table: 256 x 32 bits (4 B) = 1 KB

Four T-box tables = 4 KB (fast implementations)
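The memory figures above follow from simple arithmetic, sketched below in Python. The function name `table_bytes` is an illustrative assumption; the count of 16 table copies is taken from the "S-box vs T-box" slide.

```python
def table_bytes(address_bits, word_bits):
    """Size in bytes of a look-up table with 2^address_bits entries."""
    return (2 ** address_bits) * word_bits // 8

sbox_table = table_bytes(8, 8)    # one 8 x 8 S-box table
tbox_table = table_bytes(8, 32)   # one 8 x 32 T-box table

print("one S-box table :", sbox_table, "bytes")
print("one T-box table :", tbox_table, "bytes")
print("four T-box tables:", 4 * tbox_table, "bytes")
print("16 S-box copies :", 16 * sbox_table, "bytes")
print("16 T-box copies :", 16 * tbox_table, "bytes")
```

Since each T-box entry is four times wider than an S-box entry, the 4x memory ratio holds for any equal number of table copies.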

Description of T-box Tables

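State (128 bit):

$$
\begin{bmatrix}
S_0 & S_4 & S_8 & S_{12} \\
S_1 & S_5 & S_9 & S_{13} \\
S_2 & S_6 & S_{10} & S_{14} \\
S_3 & S_7 & S_{11} & S_{15}
\end{bmatrix}
$$

First row elements: s0, s4, s8, s12

Second row elements: s1, s5, s9, s13

Mix Column Operation in AES:

$$
\begin{bmatrix}
02 & 03 & 01 & 01 \\
01 & 02 & 03 & 01 \\
01 & 01 & 02 & 03 \\
03 & 01 & 01 & 02
\end{bmatrix}
\cdot
\begin{bmatrix} S_0 \\ S_1 \\ S_2 \\ S_3 \end{bmatrix}
=
\begin{bmatrix}
02 \cdot S_0 \oplus 03 \cdot S_1 \oplus 01 \cdot S_2 \oplus 01 \cdot S_3 \\
01 \cdot S_0 \oplus 02 \cdot S_1 \oplus 03 \cdot S_2 \oplus 01 \cdot S_3 \\
01 \cdot S_0 \oplus 01 \cdot S_1 \oplus 02 \cdot S_2 \oplus 03 \cdot S_3 \\
03 \cdot S_0 \oplus 01 \cdot S_1 \oplus 01 \cdot S_2 \oplus 02 \cdot S_3
\end{bmatrix}
$$

The four columns of terms on the right-hand side (the multiples of S0, S1, S2 and S3) correspond to the tables T0, T1, T2 and T3.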

T-Box Tables

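Encryption tables:

$$
T_0[a] = \begin{bmatrix} 02 \cdot S[a] \\ S[a] \\ S[a] \\ 03 \cdot S[a] \end{bmatrix},\quad
T_1[a] = \begin{bmatrix} 03 \cdot S[a] \\ 02 \cdot S[a] \\ S[a] \\ S[a] \end{bmatrix},\quad
T_2[a] = \begin{bmatrix} S[a] \\ 03 \cdot S[a] \\ 02 \cdot S[a] \\ S[a] \end{bmatrix},\quad
T_3[a] = \begin{bmatrix} S[a] \\ S[a] \\ 03 \cdot S[a] \\ 02 \cdot S[a] \end{bmatrix}
$$

Decryption tables, where $S^{-1}$ denotes the inverse S-box (InvSubBytes):

$$
T_0^{-1}[a] = \begin{bmatrix} 0E \cdot S^{-1}[a] \\ 09 \cdot S^{-1}[a] \\ 0D \cdot S^{-1}[a] \\ 0B \cdot S^{-1}[a] \end{bmatrix},\quad
T_1^{-1}[a] = \begin{bmatrix} 0B \cdot S^{-1}[a] \\ 0E \cdot S^{-1}[a] \\ 09 \cdot S^{-1}[a] \\ 0D \cdot S^{-1}[a] \end{bmatrix},\quad
T_2^{-1}[a] = \begin{bmatrix} 0D \cdot S^{-1}[a] \\ 0B \cdot S^{-1}[a] \\ 0E \cdot S^{-1}[a] \\ 09 \cdot S^{-1}[a] \end{bmatrix},\quad
T_3^{-1}[a] = \begin{bmatrix} 09 \cdot S^{-1}[a] \\ 0D \cdot S^{-1}[a] \\ 0B \cdot S^{-1}[a] \\ 0E \cdot S^{-1}[a] \end{bmatrix}
$$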
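The table definitions on this slide can be turned into a small generator. The sketch below is an illustration in Python, not the authors' code. It assumes each 32-bit entry packs the top element of the column into the most significant byte, and it assumes the decryption tables are built from the inverse S-box. The names `gmul`, `make_sbox` and `build_tables` are hypothetical.

```python
def gmul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    p = 0
    for _ in range(8):
        if b & 1:
            p ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return p

def make_sbox():
    """Build the AES S-box: field inverse followed by the affine map."""
    sbox = []
    for x in range(256):
        inv = next((y for y in range(1, 256) if gmul(x, y) == 1), 0)
        s = inv
        for shift in range(1, 5):
            s ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        sbox.append(s ^ 0x63)
    return sbox

# Column coefficients, top to bottom, as listed on the slide.
ENC_COEF = [(2, 1, 1, 3), (3, 2, 1, 1), (1, 3, 2, 1), (1, 1, 3, 2)]
DEC_COEF = [(0x0E, 0x09, 0x0D, 0x0B), (0x0B, 0x0E, 0x09, 0x0D),
            (0x0D, 0x0B, 0x0E, 0x09), (0x09, 0x0D, 0x0B, 0x0E)]

def build_tables(sbox):
    """Return (T, T_inv): two lists of four 256-entry tables of 32-bit words."""
    inv_sbox = [0] * 256
    for x, s in enumerate(sbox):
        inv_sbox[s] = x

    def table(box, coef):
        return [int.from_bytes(bytes(gmul(c, box[a]) for c in coef), "big")
                for a in range(256)]

    enc = [table(sbox, coef) for coef in ENC_COEF]
    dec = [table(inv_sbox, coef) for coef in DEC_COEF]
    return enc, dec

T, T_INV = build_tables(make_sbox())
print(hex(T[0][0]), hex(T_INV[0][0]))
```

Each table is a byte rotation of the previous one, which is why a design can store only T0 and rotate its output at the cost of extra logic.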

Round Operation Computation

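$$
\begin{bmatrix} e_{0,j} \\ e_{1,j} \\ e_{2,j} \\ e_{3,j} \end{bmatrix}
= T_0[a_{0,j}] \oplus T_1[a_{1,j+c_1}] \oplus T_2[a_{2,j+c_2}] \oplus T_3[a_{3,j+c_3}] \oplus
\begin{bmatrix} k_{0,j} \\ k_{1,j} \\ k_{2,j} \\ k_{3,j} \end{bmatrix}
$$

$$
\begin{bmatrix} e_{0,j} \\ e_{1,j} \\ e_{2,j} \\ e_{3,j} \end{bmatrix}
= T_0[a_{0,j}] \oplus \mathrm{Rotbyte}\big(T_0[a_{1,j+c_1}] \oplus \mathrm{Rotbyte}\big(T_0[a_{2,j+c_2}] \oplus \mathrm{Rotbyte}(T_0[a_{3,j+c_3}])\big)\big) \oplus K_j
$$

Here j indicates the key word (column), $K_j = (k_{0,j}, k_{1,j}, k_{2,j}, k_{3,j})^T$, and the column indices $j + c_i$ are taken mod 4.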
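To make the round computation concrete, the sketch below evaluates both forms of the formula in Python and compares them on the round-1 example from FIPS 197. It is an illustration under stated assumptions: the shift offsets are c1, c2, c3 = 1, 2, 3 (128-bit block), a 32-bit word holds row 0 in its most significant byte, and Rotbyte is modelled as a right rotation by 8 bits. All function names are hypothetical.

```python
def gmul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    p = 0
    for _ in range(8):
        if b & 1:
            p ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return p

def make_sbox():
    """Build the AES S-box: field inverse followed by the affine map."""
    sbox = []
    for x in range(256):
        inv = next((y for y in range(1, 256) if gmul(x, y) == 1), 0)
        s = inv
        for shift in range(1, 5):
            s ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        sbox.append(s ^ 0x63)
    return sbox

def rotbyte(w):
    """Rotate a 32-bit word by one byte, moving the bottom row to the top."""
    return ((w >> 8) | (w << 24)) & 0xFFFFFFFF

# T0[a] = (02.S[a], S[a], S[a], 03.S[a]); T1..T3 are rotations of T0.
T0 = [int.from_bytes(bytes([gmul(2, s), s, s, gmul(3, s)]), "big")
      for s in make_sbox()]
T1 = [rotbyte(w) for w in T0]
T2 = [rotbyte(w) for w in T1]
T3 = [rotbyte(w) for w in T2]

def t_round(state, round_key, nested=False):
    """One full round on a 16-byte column-major state using table look-ups."""
    out = bytearray()
    for j in range(4):
        # a[r] is the byte in row r, column (j + r) mod 4 (ShiftRows folded in).
        a = [state[4 * ((j + r) % 4) + r] for r in range(4)]
        if nested:  # single-table form with Rotbyte
            e = T0[a[0]] ^ rotbyte(T0[a[1]] ^ rotbyte(T0[a[2]] ^ rotbyte(T0[a[3]])))
        else:       # four-table form
            e = T0[a[0]] ^ T1[a[1]] ^ T2[a[2]] ^ T3[a[3]]
        k = int.from_bytes(bytes(round_key[4 * j:4 * j + 4]), "big")
        out += (e ^ k).to_bytes(4, "big")
    return bytes(out)

state = bytes.fromhex("193de3bea0f4e22b9ac68d2ae9f84808")  # start of round 1
rkey = bytes.fromhex("a0fafe1788542cb123a339392a6c7605")   # round key 1
print(t_round(state, rkey).hex())
```

The four-table form trades memory for speed, while the single-table form needs only T0 plus rotations, which matches the 1 KB versus 4 KB figures given for the T-box tables.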

T-box Architecture

[Figure: T-box architecture.
a) Encryption: 128-bit Plaintext; ShiftRows; T tables (derived SubBytes) with 8-bit inputs and 32-bit outputs; Enc XOR network; round keys K[0], Ki and KNr; 128-bit Ciphertext.
b) Decryption: 128-bit Ciphertext; InvShiftRows; T-1 tables (derived InvSubBytes) with 8-bit inputs and 32-bit outputs; Dec XOR network; round keys K[Nr], Inv Ki and K0; 128-bit Plaintext.]

Modified Decryption in T-box

a) Standard decryption round (starting from KNr): InvShiftRows, InvSubBytes, AddRoundKey, InvMixColumns

b) Modified decryption round (starting from KNr): InvSubBytes, InvShiftRows, InvMixColumns, Inv AddRoundKey
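The reordering in the modified round relies on InvMixColumns being linear over XOR, so the key addition can move after InvMixColumns provided the round key is itself passed through InvMixColumns (an "inverse round key"). The Python sketch below checks this property on one column. The function names and sample values are illustrative assumptions, not taken from the slides.

```python
def gmul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    p = 0
    for _ in range(8):
        if b & 1:
            p ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return p

def inv_mix_column(col):
    """Apply InvMixColumns (first matrix row 0E 0B 0D 09) to one 4-byte column."""
    return [gmul(0x0E, col[r]) ^ gmul(0x0B, col[(r + 1) % 4])
            ^ gmul(0x0D, col[(r + 2) % 4]) ^ gmul(0x09, col[(r + 3) % 4])
            for r in range(4)]

def xor(a, b):
    return [x ^ y for x, y in zip(a, b)]

column = [0x04, 0x66, 0x81, 0xE5]     # sample state column
round_key = [0xA0, 0xFA, 0xFE, 0x17]  # sample round key column

# Standard order: add the key, then InvMixColumns.
standard = inv_mix_column(xor(column, round_key))
# Modified order: InvMixColumns, then add the inverse round key.
inverse_round_key = inv_mix_column(round_key)
modified = xor(inv_mix_column(column), inverse_round_key)

print(standard == modified)
```

Because InvSubBytes works byte by byte and InvShiftRows only moves bytes, those two steps can also swap places, which is what lets the decryption round be folded into table look-ups.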

S-box Basic Iterative Architecture

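[Figure: S-box basic iterative architecture, split into an encryption circuit and a decryption circuit between Data input and Data output. Blocks shown: SubBytes & InvSubBytes, ShiftRows, MixColumns, ShiftRows, InvMixColumns, register R, and round key inputs.]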

Ref: Dr Gaj and Chodowiec Publication

S-box Basic Iterative Architecture(1)

This architecture can encrypt only one block of data at a time, and the number of clock cycles necessary to encrypt/decrypt a block is determined by the total number of cipher rounds.

The critical path is located in the decryption circuit and includes InvShiftRows, AddRoundKey, InvMixColumns, a 3-to-1 multiplexer and InvSubBytes.

This architecture takes 11, 13 and 15 clock cycles to process a block for key sizes of 128, 192 and 256 bits (Nr = 10, 12 and 14 rounds, plus one cycle).
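For an iterative architecture that handles one block at a time, throughput follows from the block size, the clock frequency and the cycle count. A minimal Python sketch is given below. The cycle counts come from this slide; the 100 MHz clock is a hypothetical value chosen only for illustration, since the slides do not state clock frequencies.

```python
BLOCK_BITS = 128

# Clock cycles per block for each key size, as stated on the slide.
CYCLES_PER_BLOCK = {128: 11, 192: 13, 256: 15}

def throughput_gbps(clock_hz, key_bits):
    """Throughput of a one-block-at-a-time iterative design, in Gbit/s."""
    return BLOCK_BITS * clock_hz / CYCLES_PER_BLOCK[key_bits] / 1e9

clock_hz = 100e6  # hypothetical clock frequency
for key_bits in (128, 192, 256):
    print(key_bits, round(throughput_gbps(clock_hz, key_bits), 3), "Gbps")
```

At a fixed clock frequency, throughput falls with key size in the ratio 1/11 : 1/13 : 1/15; a shorter critical path raises it through a higher clock frequency.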

T-box Iterative architecture

[Figure: T-box iterative architecture with an Enc unit and a Dec unit between Data input and Data output. Blocks shown: SubBytes and InvSubBytes, ShiftRows and InvShiftRows, Enc round and Dec round, with Round Key inputs and an Inv Round Key input.]

Ref: Dr Gaj and Chodowiec Publication

Key Scheduling

The key scheduling unit supports all three key sizes, i.e. 128, 192 and 256 bits.

It requires a key setup phase, during which the round keys are computed and stored in internal memory.

This unit produces 64 bits of round key per clock cycle, independent of the size of the main key.
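A rough estimate of the key setup time follows from the 64-bit-per-cycle rate. The Python sketch below assumes the expanded key has 4 x (Nr + 1) 32-bit words, as defined in FIPS 197, and that every word, including the words of the main key, passes through the unit at two words per cycle. The slides do not give the actual setup latency, so the result is an estimate and the names are hypothetical.

```python
ROUNDS = {128: 10, 192: 12, 256: 14}  # Nr for each key size
WORDS_PER_CYCLE = 2                   # 64 bits per clock cycle = two 32-bit words

def round_key_words(key_bits):
    """Number of 32-bit words in the expanded key: 4 * (Nr + 1)."""
    return 4 * (ROUNDS[key_bits] + 1)

def setup_cycles(key_bits):
    """Estimated cycles to produce all round key words, rounded up."""
    return -(-round_key_words(key_bits) // WORDS_PER_CYCLE)

for key_bits in (128, 192, 256):
    print(key_bits, round_key_words(key_bits), "words,",
          setup_cycles(key_bits), "cycles")
```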

Key: Block Diagram

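[Figure: key scheduling block diagram. 64-bit input and 64-bit output (Ki, Ki+1) over 32-bit internal paths; a register chain holding the word pairs Ki-8/Ki-7, Ki-6/Ki-5, Ki-4/Ki-3 and Ki-2/Ki-1; selected words Ki-Nk and Ki+1-Nk; Rot, Sub and Rcon blocks; a register.]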

Ref: Dr Gaj and Chodowiec Publication

Interface

Interface - Virtex

[Figure: interface of the AES ENC/DEC unit. Signals shown: CLK, RESET, DATA_IN, DATA_IN_WRITE, DATA_IN_READY, KEY_IN, KEY_IN_WRITE, KEY_IN_READY, ENC/DEC, DATA_OUT, FULL, WRITE. Three buses are marked as 128 bits wide.]

Interface - Spartan

Test Vectors

Test vectors are provided by NIST in the FIPS 197 publication.

They contain intermediate state values.

Test vectors for encryption and decryption are available for the different key sizes.

Separate decryption test vectors are available for decryption schemes using normal keys and inverse keys.

Design tools used

Aldec Active HDL 7.2 was used for functional simulation.

Xilinx ISE Design Suite 10.1 was used for synthesis and implementation.

Results

Throughput (Gbps)

Key Size    S-box Virtex    S-box Spartan    T-box Virtex    T-box Spartan

128         1.53            0.426            1.18            0.376

192         1.35            0.403            1.02            0.338

256         1.01            0.355            0.907           0.319
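The relative throughput advantage of the S-box implementations can be computed directly from the table. The Python sketch below uses the values as transcribed from this slide; the dictionary layout and the function name are illustrative assumptions.

```python
# Throughput in Gbps, keyed by (family, key size): (S-box, T-box).
THROUGHPUT = {
    ("Virtex", 128): (1.53, 1.18),
    ("Virtex", 192): (1.35, 1.02),
    ("Virtex", 256): (1.01, 0.907),
    ("Spartan", 128): (0.426, 0.376),
    ("Spartan", 192): (0.403, 0.338),
    ("Spartan", 256): (0.355, 0.319),
}

def sbox_gain_percent(family, key_bits):
    """Percentage by which S-box throughput exceeds T-box throughput."""
    sbox, tbox = THROUGHPUT[(family, key_bits)]
    return (sbox / tbox - 1.0) * 100.0

for (family, key_bits) in THROUGHPUT:
    print(family, key_bits, round(sbox_gain_percent(family, key_bits), 1), "%")
```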

Throughput

[Figure: "Comparison: Throughput". Bar chart of throughput (Gbps) for the S-box and T-box architectures across the implementations 128_Virtex, 192_Virtex, 256_Virtex, 128_Spartan, 192_Spartan and 256_Spartan.]

Area (CLB slices)

Key Size    S-box Virtex    S-box Spartan    T-box Virtex    T-box Spartan

128         633             3019             1696            11,687

192         641             2913             1693            11,687

256         622             3019             1686            11,687

Area

[Figure: "Comparison: Area". Bar chart of area (CLB slices) for the S-box and T-box architectures across the implementations 128_Virtex, 192_Virtex, 256_Virtex, 128_Spartan, 192_Spartan and 256_Spartan.]

Throughput/Area

Key Size    S-box Virtex    S-box Spartan    T-box Virtex    T-box Spartan

128         2415.910        376.96           693.113         32.15

192         2104.060        354.65           602.721         28.90

256         1618.846        317.82           538.038         27.27

Throughput/Area

[Figure: "Comparison: Throughput/Area". Bar chart of the throughput/area ratio for the S-box and T-box architectures across the implementations 128_Virtex, 192_Virtex, 256_Virtex, 128_Spartan, 192_Spartan and 256_Spartan.]

Problems encountered

We were unable to map the T tables to the BRAMs.

By default, the tool implemented the tables as logic instead of BRAMs.

The T-box architectures may have higher latency due to on-the-fly calculation of the inverse round keys.

Conclusion

Our S-box implementations perform better than the T-box implementations.

The area of the T-box implementations is nearly four times that of the S-box implementations.

Conclusion (2)

On Virtex 5, the throughputs of the S-box implementations are about 29%, 31% and 11% higher than those of the corresponding T-box implementations for key sizes of 128 bits, 192 bits and 256 bits, respectively.

The throughput per unit area (CLB slices) of the S-box implementations is at least 10x that of the corresponding T-box implementations.

Scope for future work

Implement the T-box architectures such that BRAMs are used to store the T table values.

Partial or complete loop unrolling can be implemented for the S-box architectures to further increase the throughput.

For the T-box implementations, the inverse round keys can be precomputed and stored in memory, which may reduce the minimum clock period.

Questions?