Deriving cryptographic keys via power consumption Roman...

Preview:

Citation preview

Deriving cryptographic keys via power consumption

Roman Korkikian

Power trace plotting

Запустите виртуальную машину, вы должны увидеть терминал с тремя вкладками и редактор (тоже с несколькими вкладками).

Перейдите в редактор и откройте вкладку Task1 она нам потребуется в дальнейшем (если у вас нет распечатки).

Все слайды есть на флешках. Откройте, если вам не видно на проекторе.

What this workshop about?

Рома

Ты

Questions during/after and instead of workshop.

Who am I?

Roman Korkikian

What this workshop about?

The workshop agenda:1. Hands-on theory (you will execute couple of commands). During

this part I will not come to help you. I will help you after theory.2. Theory (1-2 hours).3. Practice (1-2 hours) so you may leave if you don’t want it to do.

What this workshop about?

The workshop goal:

Atract you interest with side-channel attacks.

What this workshop about?

The workshop is about side-channel attacks:

how to extract the cryptographic key using physical data

What this workshop about?

The workshop is about side-channel attacks:

how to extract the cryptographic key using physical data

Sounds crazy?

PART 0. Power trace acquisition

Power traces acquisition

Each electronic device require power to perform computations.

Typical power supplies are 0.95 –1.8V depending on the technology.

During the work electronic device drains current to the ground, transforms energy and etc, i.e. device consume power.

Pin connected to power supply

Power traces acquisition

https://github.com/kokke/tiny-AES128-C/blob/master/aes.c

Power trace plotting

AES

algorithm

Power trace explanation

Power trace plotting

Start Virtual Machine and open folder:

Desktop/DPA_Workshop/AES_STM8_Example

There should be file Task1 – open the file and follow all the steps (7-13) to plot power curves.

Power traces acquisition

Power trace explanation

• AES is a particular algorithm:• 11 Key mixing operations;

• 10 Sbox and Shift rows operations;

• 9 MixColumn operations;

• Hence from power trace we can understand where each operation took place.

Power trace explanation

Power trace explanation

• AES engine acquisition:• 10.000 plaintext/ciphertext pairs and corresponing power trace during

encryption are available in folder data.

• We clearly see that AES algorithm reveals its operation via power consumption.

• What we will do with all of that preasious information?! EXTRACT THE KEY!

Power trace explanation

• Key extraction:• We build simple power models that depend on the key byte and know

information and then we try all the bytes and among them we will choose thecorrect one.

• Lets come back to Virtual Machine and continue Task 1 (items 14 to17).

Power trace explanation

Power trace explanation

Power trace explanation

Power trace explanation

• Steps 14 to 17 computesrelationship between power traces and the first byte of thelast round state:��� = �����[�����]

• ��� is used twice in a program –• first time ��� is obtained after

AddRndKey

• second time ��� is read from thememory to compute

��[���]

Power trace explanation

• We were correlating first byte of the key. Now I want you to rewrite two numbers in the code so you will correlate another key byte.

• Do the Task1 from lines 18 till 25.

• Check the position of the pike – it shall move.

Power trace explanation

Power trace explanation

Power trace explanation

• So in this workshop I will try to explain:• Why and how power consumption leak binary information.

• How to build simple power consumption models based on binary data.

• How to search for a key.

PART 1. The truth is out there…*

* This part is boring, don’t not fall asleep

Intuition lies

- R. Pacalet

Examples of side-channels

Examples of side-channel techniques that

YOU KNOW FROM SCHOOL

Examples of side channels: model of the atom

Examples of side channels: volume definition

Examples of side-channels

• Side channel techniques – are the methods that allow characterizeobject’s properties by observing reaction and feedback of otherobjects.

• For cryptography – side channel attacks are the attacks that use physical information to extract binary data.

How binary and physical data are linked?

Algorithms are just physical signals

Algorithm

Computing machinery

Computing is physical process

To perform computations energy has to be spent irreversably

Side channel leakages

• CMOS (electronic) devices consume electrical power, but also transformenergy to other observable types of side-channel.

Side channel leakages

Power consumption

Power consumption

• Hardware works from frequencies from several MHz till several GHz.

• Power consumption shall be measured accordingly (sampling rate may be smaller than the actual frequency).

• Use fast digital oscilloscopes.

• In our case the bandwidth was 250 MHz with microcontroller running at 16 MHz.

Power consumption

• Home setup.

• Bandwidth up to 250 MHz, sampling10 GHz – can attack up to 500 MHz devices.

• 1.8K $

• Industrial setup.

• All type of devices

• >30K $

Power consumption

• Roughly speaking power consumption is a measure of current/voltage needed for the correct device work – at different moments of time current that flows through the device will be different.

• We consider CMOS systems – they consume power only on transaction.

Power consumption

Lets keep it simple:

each hardware is based on registers (and logic), registersconsume a lot of power on

transaction when 0 overwrites 1 or 1 overwrites 0.

Power consumption: D flip flop

• D flip flop consumes power on clock’s edge.

• Lets see small gif

Power consumption: D flip flop

• Flip flop power consumption on D input transaction

0

0.2

0.4

0.6

0.8

1

1.2

0 to 1 1 to 0 1 to 1 0 to 0

Power consumption in microcontroller

Power consumption

Different instructions – different impedance/path/power consumption

Power consumption

Power consumption

• Well… registers consume power but the rest of the circuit also consumes a lot (even more than one register).

• What to do?

Mathematics behind side channel attacks

Mathematics

• Consider a small example:• Two asset values 2 and 1.

• Each time asset value is returned a significant noise is added:

Asset values

Noise values

+

Mathematics

Mathematics

Guess 1: sequence of two asset values Guess 2: sequence of two asset valuesRed is 2

Blue is 1

We know asset values, but we don’t know its appearance

Mathematics

• Take all, average and extract:• Mean(red bars) – mean(blue bars) approximately 1

• For guess 1 this difference is equal to 1.042

• For guess 2 this difference is equal to -0.297

• Why?

Mathematics

• Law of large numbers (check wiki): the average of the results obtained from a large number of trials should be close to the expected value, and will tend to become closer as more trials are performed.

Mathematics

• Asset values 0 and 1

• Noise is Gaussian with mean

128 and variance of 78

• Average for asset value 1 will

converge to 128 + 1, and for

asset value 0 the average

converges to 128 + 0

Mathematics

• Asset values 0 and 1

• Noise is Gaussian with mean

128 and variance of 78

• Average for asset value 1 will

converge to 128 + 1, and for

asset value 0 the average

converges to 128 + 0

Mathematics

• Law of large numbers for our case:

• Correct guess:

mean(n21+2, n22+2, n23+2 … n2i+2) = mean(noise) + 2

mean(n11+1, n12+1, n13+1 … n1i+1) = mean(noise) + 1

• Wrong guess:

mean(n21+2, n22+1, n23+2 … n2i+1) = mean(noise) + 1.5

mean(n11+1, n12+2, n13+1 … n1i+1) = mean(noise) + 1.5

Mathematics

Guess 1: sequence of two asset values Guess 2: sequence of two asset values

Difference = 1.042 Difference = -0.297

Mathematics

• That was for an arbitrary data, but what happens in reality for an arbitrary instruction:

R = f(x,y)

Mathematics

R = func(x,y)

HW = 0

HW = 1

HW = 2

HW = 3

HW = 4

HW = 5

HW = 6

HW = 7

HW = 8

Ave

rag

ed

po

we

r co

nsu

mp

tio

n

Number of averages

Mathematics

• Mathematics is simple – average will show the difference in power consumption thus revealing Hamming weight only for the correctmodel.

• Due to noise we need a lot of acquisitions.

• Lets finally go to side channel attacks.

PART 2. Side channel attacks

Side channel preliminaries

• Power consumption does depend on number of bits set to 1

• An 8bit operation (Sbox[C xor K]) happened during the measurement

• You know C (10.000 values) but you don’t know K.

• You know power consumption trace (lets assume for simplicity thatyou know time of the operation).

Side channel preliminaries

1. Create power model for each key value: S = Sbox[C xor K]

Power consumption: D flip flop

• Flip flop power consumption on D input transaction

0

0.2

0.4

0.6

0.8

1

1.2

0 to 1 1 to 0 1 to 1 0 to 0

Mathematics

R = func(x,y)

HW = 0

HW = 1

HW = 2

HW = 3

HW = 4

HW = 5

HW = 6

HW = 7

HW = 8

Ave

rag

ed

po

we

r co

nsu

mp

tio

n

Number of averages

Side channel preliminaries

ШифротекстПервый

байт

Ключ 0х00 Ключ 0х01...

Ключ 0хFF

S HW(S) S HW(S) S HW(S)

9cb4cad1a9b4031c678e5afc997dd75a 9C 1C 3 75 5 ... 0 0

505d4db951930c707c8ef4bbe52f25ea 50 6C 4 70 3 ... 1B 3

a3a1e523d7c2d1b17c0da136d0be1559 A3 71 4 1A 3 ... A7 5

0f8164f091690deff2e3f326c9877c1f 0F FB 7 D7 6 ... 17 4

9c4df2500df0f92cd6f6baaff1f765b5 9C 1C 3 75 5 ... 0 0

... ... ... ... ... ... ... ... ...

ad38d78c2590807ac3a8b992056b6c3a AD 18 2 AA 4 ... 48 2

c766e6776c7d80e1acd68dd825591595 C7 31 3 C7 5 ... 76 5

fd3d63130367d08e31fe96dd2b5d49ca FD 21 2 55 4 ... 6A 4

368d5ad4525dce11c9a7c63afe4bf2b2 36 24 2 B2 4 ... 12 3

S = Sbox[C xor K]

Side channel preliminaries

1. Create power model for each key value: S = Sbox[C xor K]

2. Distribute power measurements according to the model.

Side channel preliminaries

S = Sbox[C xor K]

Side channel preliminaries

3

4

4

7

3

...

2

3

2

2S = Sbox[C xor K]

Side channel preliminaries

3

4

4

7

3

...

2

3

2

2S = Sbox[C xor K]

Side channel preliminaries

3

4

4

7

3

...

2

3

2

2S = Sbox[C xor K]

Side channel preliminaries

S = Sbox[C xor K]

5

3

3

6

5

...

4

5

4

4

Side channel preliminaries

1. Create power model for each key value: S = Sbox[C xor K]

2. Distribute power measurements according to the model.

3. Among all distributions get the key with the best dependency.

Side channel preliminaries

Side channel preliminaries

1. By eyes this is not comfortable to observe these results, instead usecoefficients of mutual dependence as correlation coefficient.

Side channel preliminaries

Lets do a small demo and then you will try to develop your own firstside-channel attack

Do steps 26-27 of the Task1

Side channel preliminaries

Power trace explanation

• Steps 14 to 17 computesrelationship between power traces and the first byte of thelast round state:��� = �����[�����]

• ��� is used twice in a program –• first time ��� is obtained after

AddRndKey

• second time ��� is read from thememory to compute

��[���]

Side channel preliminaries

PART 2. Practice 1-2 hours

Practice

If you did not finish Task1 and still you want to do it – I will help youafter small explanation

Practice

• AES algorithm can be also implemented in hardware (faster, morereliable, protected against timing attacks etc.).

• You will develop and attack the algorithm in two ways.

• Firstly short explanation how algorithm is done.

Practice

128 bits register

Shift Rows / Sbox MixColumn Add round key

Practice

• Assume that the only thing that consumes power is 128 bit register so you need to correlate its values.

• AES is executed in 11 clock cycles.

• You need to identify register value that you can compute andcorrelate. Follow description of the Task2.

Misc

• What to read:• Scholar.google.com (side channel attacks, power analysis, correlation power

analysis and etc.)

• COSADE, CARDIS, CHES and other academic conferences.

• Журнал Хакер – я пишу туда серию статей по аппаратным атакам (не только по второстепенным каналам). Пока раз в два месяца.

Practice

• This is just the beginning of Side Channel Attacks. Nowadays there aremethods tha allow:

• Extract the key without plaintexts and ciphertexts information.

• Attack protected algorithms.

• Reverse the code using side channel techniques.

• Reverse the circuit (approximate IP location).

• Use side-channels to communicate.

• Possible other applications.

• Definitely side-channels are not dead and there are enourmous placeto improve.

Roman KorkikianRoman.korkikian@yandex.com

Recommended