ECC optimization on Sandy BridgeThe cost of cofactor h = 1
Daan [email protected]
Radboud University Nijmegen
1 April 2019
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 1 / 30
Outline
Introduction
Preliminaries
Cofactor security
ECC implementation
Results
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 2 / 30
Outline
Introduction
Preliminaries
Cofactor security
ECC implementation
Results
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 2 / 30
Elliptic curves
E : y2 = x3 + ax + b
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 3 / 30
Elliptic curves
E : y2 = x3 + ax + b
−4 −2 0 2 4
x
−4
−2
0
2
4
y
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 3 / 30
Elliptic curves: addition
E : y2 = x3 + ax + b
−4 −2 0 2 4
x
−4
−2
0
2
4
yP
Q
−R
R
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 3 / 30
Elliptic curves: doubling
E : y2 = x3 + ax + b
−4 −2 0 2 4
x
−4
−2
0
2
4
yP
−R
R
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 3 / 30
Elliptic curves
I Coordinates include the point at infinity OI Define P +O = P
I Curve equation: E : y2 = x3 + ax + b
I Coordinates are defined over a field Fq
I I.e. integers modulo q
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 4 / 30
Elliptic curves
I Coordinates include the point at infinity OI Define P +O = P
I Curve equation: E : y2 = x3 + ax + b
I Coordinates are defined over a field Fq
I I.e. integers modulo q
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 4 / 30
Elliptic curves
I Coordinates include the point at infinity OI Define P +O = P
I Curve equation: E : y2 = x3 + ax + b
I Coordinates are defined over a field Fq
I I.e. integers modulo q
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 4 / 30
Elliptic curves: actually
E : y2 = x3 − 3x + 1 defined over F11
0 1 2 3 4 5 6 7 8 9 10 11
x
−5
−4
−3
−2
−1
0
1
2
3
4
5
y
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 5 / 30
Elliptic curves: actual addition
E : y2 = x3 − 3x + 1 defined over F11
0 1 2 3 4 5 6 7 8 9 10 11
x
−5
−4
−3
−2
−1
0
1
2
3
4
5
y
P
Q
−R
R
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 5 / 30
Group arithmetic
I We can do arithmetic with these rules! :)
I Addition: P + Q
I Subtraction: P − Q
I Neutral element: O, i.e. “zero”
I Scalar multiplication: [k]P = P + P + ... + P︸ ︷︷ ︸k times
I Discrete log problem:given P,Q where [k]P = Q, hard to find k
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 6 / 30
Group arithmetic
I We can do arithmetic with these rules! :)
I Addition: P + Q
I Subtraction: P − Q
I Neutral element: O, i.e. “zero”
I Scalar multiplication: [k]P = P + P + ... + P︸ ︷︷ ︸k times
I Discrete log problem:given P,Q where [k]P = Q, hard to find k
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 6 / 30
Group arithmetic
I We can do arithmetic with these rules! :)
I Addition: P + Q
I Subtraction: P − Q
I Neutral element: O, i.e. “zero”
I Scalar multiplication: [k]P = P + P + ... + P︸ ︷︷ ︸k times
I Discrete log problem:given P,Q where [k]P = Q, hard to find k
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 6 / 30
Elliptic curves are cyclic
I Points form a cycle: O +P−−→ P+P−−→ [2]P
+P−−→ [3]P+P−−→ ...
+P−−→ [n − 1]P+P−−→ O
I The order n should contain a large prime factor
I Only one cycle if n is prime
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 7 / 30
Elliptic curves are cyclic
I Points form a cycle: O +P−−→ P+P−−→ [2]P
+P−−→ [3]P+P−−→ ...
+P−−→ [n − 1]P+P−−→ O︸ ︷︷ ︸
n steps
I The order n should contain a large prime factor
I Only one cycle if n is prime
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 7 / 30
Cofactors
I If n is not a primeThen n = h · `
I I.e. small loops are possible:
E.g. if 4|n, then there is a point T4: O +T4−−→ T4+T4−−→ [2]T4
+T4−−→ [3]T4+T4−−→ O︸ ︷︷ ︸
only 4 steps!
I h is called the cofactor
I This property is often harmless
I I.e. sometimes it’s the opposite of harmless
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 8 / 30
Cofactors
I If n is not a primeThen n = h · `
I I.e. small loops are possible:
E.g. if 4|n, then there is a point T4: O +T4−−→ T4+T4−−→ [2]T4
+T4−−→ [3]T4+T4−−→ O︸ ︷︷ ︸
only 4 steps!
I h is called the cofactor
I This property is often harmless
I I.e. sometimes it’s the opposite of harmless
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 8 / 30
Cofactors
I If n is not a primeThen n = h · `
I I.e. small loops are possible:
E.g. if 4|n, then there is a point T4: O +T4−−→ T4+T4−−→ [2]T4
+T4−−→ [3]T4+T4−−→ O︸ ︷︷ ︸
only 4 steps!
I h is called the cofactor
I This property is often harmless
I I.e. sometimes it’s the opposite of harmless
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 8 / 30
Cofactors
I If n is not a primeThen n = h · `
I I.e. small loops are possible:
E.g. if 4|n, then there is a point T4: O +T4−−→ T4+T4−−→ [2]T4
+T4−−→ [3]T4+T4−−→ O︸ ︷︷ ︸
only 4 steps!
I h is called the cofactor
I This property is often harmless
I I.e. sometimes it’s the opposite of harmless
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 8 / 30
A brief history...
I 1999: elliptic curves popularized
I 2006: Curve25519 published by Bernstein
I “Safe” for implementors
I Super fast
I Has cofactor h = 8
I 2014: Monero cryptocurrency
I Uses Curve25519
I 2017: vulnerability in Monero found
I Allowed anyone to create coins out of thin air
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 9 / 30
A brief history...
I 1999: elliptic curves popularized
I 2006: Curve25519 published by Bernstein
I “Safe” for implementors
I Super fast
I Has cofactor h = 8
I 2014: Monero cryptocurrency
I Uses Curve25519
I 2017: vulnerability in Monero found
I Allowed anyone to create coins out of thin air
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 9 / 30
A brief history...
I 1999: elliptic curves popularized
I 2006: Curve25519 published by Bernstein
I “Safe” for implementors
I Super fast
I Has cofactor h = 8
I 2014: Monero cryptocurrency
I Uses Curve25519
I 2017: vulnerability in Monero found
I Allowed anyone to create coins out of thin air
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 9 / 30
A brief history...
I 1999: elliptic curves popularized
I 2006: Curve25519 published by Bernstein
I “Safe” for implementors
I Super fast
I Has cofactor h = 8
I 2014: Monero cryptocurrency
I Uses Curve25519
I 2017: vulnerability in Monero found
I Allowed anyone to create coins out of thin air
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 9 / 30
A brief history...
I 1999: elliptic curves popularized
I 2006: Curve25519 published by Bernstein
I “Safe” for implementors
I Super fast
I Has cofactor h = 8
I 2014: Monero cryptocurrency
I Uses Curve25519
I 2017: vulnerability in Monero found
I Allowed anyone to create coins out of thin air
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 9 / 30
A brief history...
I 1999: elliptic curves popularized
I 2006: Curve25519 published by Bernstein
I “Safe” for implementors
I Super fast
I Has cofactor h = 8
I 2014: Monero cryptocurrency
I Uses Curve25519
I 2017: vulnerability in Monero found
I Allowed anyone to create coins out of thin air
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 9 / 30
The Monero vulnerability
I Transaction involves a ring signature
I Double-spending is prevented by a key image I
I I binds the transaction to signer’s public key P
I Binding is in zero-knowledge
I Key image I should be unique
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 10 / 30
The Monero vulnerability
I Transaction involves a ring signature
I Double-spending is prevented by a key image I
I I binds the transaction to signer’s public key P
I Binding is in zero-knowledge
I Key image I should be unique
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 10 / 30
The Monero vulnerability
I Transaction involves a ring signature
I Double-spending is prevented by a key image I
I I binds the transaction to signer’s public key P
I Binding is in zero-knowledge
I Key image I should be unique
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 10 / 30
The Monero vulnerability
I Transaction involves a ring signature
I Double-spending is prevented by a key image I
I I binds the transaction to signer’s public key P
I Binding is in zero-knowledge
I Key image I should be unique
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 10 / 30
The Monero vulnerability
I Transaction involves a ring signature
I Double-spending is prevented by a key image I
I I binds the transaction to signer’s public key P
I Binding is in zero-knowledge
I Key image I should be unique
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 10 / 30
Monero transactions
I Have generators G1,G2; private key x ; public key P; key image I .
I signx(m)
I Sign m with private key x
I Choose commitment u ∈R hZ`
I Compute a2 = [u]G2; c = H(m, a1, a2); r = u + cx
I Output signature s = (a1, a2, r)
I verifyP,I (m, s)
I [r ]G1?= a1 + [c]P
I [r ]G2?= a2 + [c]I
I I unique?
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 11 / 30
Monero transactions
I Have generators G1,G2; private key x ; public key P; key image I .
I signx(m)
I Sign m with private key x
I Choose commitment u ∈R hZ`
I Compute a2 = [u]G2; c = H(m, a1, a2); r = u + cx
I Output signature s = (a1, a2, r)
I verifyP,I (m, s)
I [r ]G1?= a1 + [c]P
I [r ]G2?= a2 + [c]I
I I unique?
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 11 / 30
Attacking Monero signatures
I Challenge. Find some signature+keypair a2, c , r , and I , s.t.
[r ]G2 = a2 + [c]I = a2 + [c]I ′,
where I 6= I ′.
I Solution. Choose I ′ = I + Tα, where α|c and [α]Tα = O.
I Correctness.
a2 + [c]I ′ = a2 + [c](I + Tα)
= a2 + [c]I +[ cα
][α]Tα
= a2 + [c]I +[ cα
]O
= a2 + [c]I
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 12 / 30
Attacking Monero signatures
I Challenge. Find some signature+keypair a2, c , r , and I , s.t.
[r ]G2 = a2 + [c]I = a2 + [c]I ′,
where I 6= I ′.
I Solution. Choose I ′ = I + Tα, where α|c and [α]Tα = O.
I Correctness.
a2 + [c]I ′ = a2 + [c](I + Tα)
= a2 + [c]I +[ cα
][α]Tα
= a2 + [c]I +[ cα
]O
= a2 + [c]I
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 12 / 30
Attacking Monero signatures
I Challenge. Find some signature+keypair a2, c , r , and I , s.t.
[r ]G2 = a2 + [c]I = a2 + [c]I ′,
where I 6= I ′.
I Solution. Choose I ′ = I + Tα, where α|c and [α]Tα = O.
I Correctness.
a2 + [c]I ′ = a2 + [c](I + Tα)
= a2 + [c]I +[ cα
][α]Tα
= a2 + [c]I +[ cα
]O
= a2 + [c]I
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 12 / 30
Attacking Monero signatures
I Challenge. Find some signature+keypair a2, c , r , and I , s.t.
[r ]G2 = a2 + [c]I = a2 + [c]I ′,
where I 6= I ′.
I Solution. Choose I ′ = I + Tα, where α|c and [α]Tα = O.
I Correctness.
a2 + [c]I ′ = a2 + [c](I + Tα)
= a2 + [c]I +[ cα
][α]Tα
= a2 + [c]I +[ cα
]O
= a2 + [c]I
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 12 / 30
Attacking Monero signatures
I Challenge. Find some signature+keypair a2, c , r , and I , s.t.
[r ]G2 = a2 + [c]I = a2 + [c]I ′,
where I 6= I ′.
I Solution. Choose I ′ = I + Tα, where α|c and [α]Tα = O.
I Correctness.
a2 + [c]I ′ = a2 + [c](I + Tα)
= a2 + [c]I +[ cα
][α]Tα
= a2 + [c]I +[ cα
]O
= a2 + [c]I
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 12 / 30
Attacking Monero signatures
I Challenge. Find some signature+keypair a2, c , r , and I , s.t.
[r ]G2 = a2 + [c]I = a2 + [c]I ′,
where I 6= I ′.
I Solution. Choose I ′ = I + Tα, where α|c and [α]Tα = O.
I Correctness.
a2 + [c]I ′ = a2 + [c](I + Tα)
= a2 + [c]I +[ cα
][α]Tα
= a2 + [c]I +�
���
[ cα
]O
= a2 + [c]I
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 12 / 30
Attacking Monero signatures
I Challenge. Find some signature+keypair a2, c , r , and I , s.t.
[r ]G2 = a2 + [c]I = a2 + [c]I ′,
where I 6= I ′.
I Solution. Choose I ′ = I + Tα, where α|c and [α]Tα = O.
I Correctness.
a2 + [c]I ′ = a2 + [c](I + Tα)
= a2 + [c]I +[ cα
][α]Tα
= a2 + [c]I +�
���
[ cα
]O
= a2 + [c]I
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 12 / 30
Surely this could have been prevented?
Easy fix:
I Protocol assumed [r ]G2 = a2 + [c]I , only for a single I
I Fix: check if the order of I is `
I i.e. check [`]I?= O
I Fun fact: this check makes the verification 2× slower
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 13 / 30
Surely this could have been prevented?
Easy fix:
I Protocol assumed [r ]G2 = a2 + [c]I , only for a single I
I Fix: check if the order of I is `
I i.e. check [`]I?= O
I Fun fact: this check makes the verification 2× slower
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 13 / 30
Surely this could have been prevented?
Easy fix:
I Protocol assumed [r ]G2 = a2 + [c]I , only for a single I
I Fix: check if the order of I is `
I i.e. check [`]I?= O
I Fun fact: this check makes the verification 2× slower
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 13 / 30
Surely this could have been prevented?
Easy fix:
I Protocol assumed [r ]G2 = a2 + [c]I , only for a single I
I Fix: check if the order of I is `
I i.e. check [`]I?= O
I Fun fact: this check makes the verification 2× slower
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 13 / 30
Why didn’t they validate points?
Look at the docs:
(highlight added by me)
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 14 / 30
Why didn’t they validate points?
Look at the docs:
(highlight added by me)
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 14 / 30
Surely this could have been prevented?
Easy fix:
I Protocol assumed [r ]G2 = a2 + [c]I , only for a single I
I Fix: check if the order of I is `
I i.e. check [`]I?= O
I Better fix: use a prime order curve
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 15 / 30
Outline
Introduction
Preliminaries
Cofactor security
ECC implementation
Results
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 16 / 30
Goal of this thesis
What is the actual performance benefit of Curve25519over traditional (Weierstrass) curves?
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 17 / 30
Our contribution
Our research:
I Implement variable base-point scalar multiplication
I That is the algorithm for computing [k]P,
I for a prime-order curve,
I that looks similar to Curve25519,
I on Sandy Bridge microarchitecture
I Compare performance with Curve25519 (Sandy2x)
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 18 / 30
Our contribution
Our research:
I Implement variable base-point scalar multiplication
I That is the algorithm for computing [k]P,
I for a prime-order curve,
I that looks similar to Curve25519,
I on Sandy Bridge microarchitecture
I Compare performance with Curve25519 (Sandy2x)
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 18 / 30
Selecting a curve
I I.e. E : y2 = x3 − 3x + 13318, defined over F2255−19.
I Prime order curve; same field as Curve25519
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 19 / 30
Selecting a curve
I I.e. E : y2 = x3 − 3x + 13318, defined over F2255−19.
I Prime order curve; same field as Curve25519
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 19 / 30
Scalar multiplication overview
field arithmetic
fe add fe sub fe mul fe carry
addition formulas
ge double ge add
scalar multiplication
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 20 / 30
Field element representation
I Use double-precision floating points
I Allows 4× vectorized operations using SIMD instructions
I Radix-221.25 redundant representation
I Use 12 limbs to represent 255-bit numbers
I I.e. f = f0 + f1 + ... + f11
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 21 / 30
Field element representation
I Use double-precision floating points
I Allows 4× vectorized operations using SIMD instructions
I Radix-221.25 redundant representation
I Use 12 limbs to represent 255-bit numbers
I I.e. f = f0 + f1 + ... + f11
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 21 / 30
Field element representation
I Use double-precision floating points
I Allows 4× vectorized operations using SIMD instructions
I Radix-221.25 redundant representation
I Use 12 limbs to represent 255-bit numbers
I I.e. f = f0 + f1 + ... + f11
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 21 / 30
Field element representation
I Use double-precision floating points
I Allows 4× vectorized operations using SIMD instructions
I Radix-221.25 redundant representation
I Use 12 limbs to represent 255-bit numbers
I I.e. f = f0 + f1 + ... + f11
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 21 / 30
Field element representation
I Use double-precision floating points
I Allows 4× vectorized operations using SIMD instructions
I Radix-221.25 redundant representation
I Use 12 limbs to represent 255-bit numbers
I I.e. f = f0 + f1 + ... + f11
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 21 / 30
Field arithmetic
I Carry
I top(fi ): force loss of precision
I Then, move “high” bits to next limb
I Addition
I (f + g)i = fi + gi
I (f − g)i = fi − gi
I Multiplication
I (f · g)k =∑
i+j=k figi +∑
i+j=k+12
(2−255 · 19
)figi
I Optimized using Karatsuba’s multiplication
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 22 / 30
Field arithmetic
I Carry
I top(fi ): force loss of precision
I Then, move “high” bits to next limb
I Addition
I (f + g)i = fi + gi
I (f − g)i = fi − gi
I Multiplication
I (f · g)k =∑
i+j=k figi +∑
i+j=k+12
(2−255 · 19
)figi
I Optimized using Karatsuba’s multiplication
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 22 / 30
Field arithmetic
I Carry
I top(fi ): force loss of precision
I Then, move “high” bits to next limb
I Addition
I (f + g)i = fi + gi
I (f − g)i = fi − gi
I Multiplication
I (f · g)k =∑
i+j=k figi +∑
i+j=k+12
(2−255 · 19
)figi
I Optimized using Karatsuba’s multiplication
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 22 / 30
Addition formulas
I Use Renes-Costello-Batina formulas
I Rewrite using graphs into vectorized operations
I Implement using field arithmetic functions
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 23 / 30
Point doublingdbl_generic
x yz
x3
31
y3
27
z3
34
1
2
3 4
5
6
78
9
10
11
1213
14 15
16
17 18
19
20
21
22
23
24
25
26
28
29
30
32
33
dbl_4x (3M + 4c)
extra carry operation
xy z
x3
31
y3
27
z3
3214
1312
15
5
2
34
8⟦-b/2⟧
3
1716⟦-3⟧
18⟦2b⟧
6
2423⟦3⟧
128
2630
9= -a₉/2 19 25
22 2529a
4
11107
⟦-6⟧
3433
29b⟦8⟧
11
2221⟦-3⟧
20= -a₂₀
Legend
add
subtract
triple
multiply by small constant
multiply
square
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 24 / 30
Point doubling
dbl_generic
x yz
x3
31
y3
27
z3
34
1
2
3 4
5
6
78
9
10
11
1213
14 15
16
17 18
19
20
21
22
23
24
25
26
28
29
30
32
33
dbl_4x (3M + 4c)
extra carry operation
xy z
x3
31
y3
27
z3
3214
1312
15
5
2
34
8⟦-b/2⟧
3
1716⟦-3⟧
18⟦2b⟧
6
2423⟦3⟧
128
2630
9= -a₉/2 19 25
22 2529a
4
11107
⟦-6⟧
3433
29b⟦8⟧
11
2221⟦-3⟧
20= -a₂₀
Legend
add
subtract
triple
multiply by small constant
multiply
square
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 24 / 30
Point additionadd_generic
x1 y1z1 x2 y2z2
x3
40
y3
38
z3
43
1 23 4 5
67
8
9 10
1112
13
14 15
1617
18 19
20
21
22
23 24
25
26
27
28
29
30
31
32
33
34
3536
37 3941
42
add_4x (3M and 4c)
extra carry after operation
x1 y1z1x2y2z2
x3
40
y3
38
z3
43
1 23 16
1415
192518
6
45
11
910
36
3332
27b26b⟦3⟧
3130⟦3⟧
37
2324
35
13
39
8
4142
34 29
2221⟦3⟧
20
28
27a26a⟦3⟧
7 1217
Legend
add
subtract
triple
multiply by small constant
multiply
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 25 / 30
Point addition
add_generic
x1 y1z1 x2 y2z2
x3
40
y3
38
z3
43
1 23 4 5
67
8
9 10
1112
13
14 15
1617
18 19
20
21
22
23 24
25
26
27
28
29
30
31
32
33
34
3536
37 3941
42
add_4x (3M and 4c)
extra carry after operation
x1 y1z1x2y2z2
x3
40
y3
38
z3
43
1 23 16
1415
192518
6
45
11
910
36
3332
27b26b⟦3⟧
3130⟦3⟧
37
2324
35
13
39
8
4142
34 29
2221⟦3⟧
20
28
27a26a⟦3⟧
7 1217
Legend
add
subtract
triple
multiply by small constant
multiply
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 25 / 30
Scalar multiplication
I Use left-to-right double-and-add
I Optimization: use signed window method (w = 5)
I Uses 263 · double + 59 · add operations
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 26 / 30
Scalar multiplication
I Use left-to-right double-and-add
I Optimization: use signed window method (w = 5)
I Uses 263 · double + 59 · add operations
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 26 / 30
Scalar multiplication
I Use left-to-right double-and-add
I Optimization: use signed window method (w = 5)
I Uses 263 · double + 59 · add operations
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 26 / 30
Outline
Introduction
Preliminaries
Cofactor security
ECC implementation
Results
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 27 / 30
Compared to Curve25519
Table: Cycle counts for Sandy2x and this work.
Implementation Sandy Bridge Ivy Bridge Haswell
Curve25519 (Sandy2x) 159kcc 157kcc –
this work 390kcc 383kcc 340kcc
Conclusion: about 2.5× slower
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 28 / 30
Compared to Curve25519
Table: Cycle counts for Sandy2x and this work.
Implementation Sandy Bridge Ivy Bridge Haswell
Curve25519 (Sandy2x) 159kcc 157kcc –
this work 390kcc 383kcc 340kcc
Conclusion: about 2.5× slower
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 28 / 30
Thank you! I
Acknowledgements <3:
I Peter, (+the department, Marrit, Judith, Gerdriaan)
I The LLVM project (especially for llvm-mca)
I Olivier (from SNT; for lending their Sandy Bridge machine)
Stuff I left out:
I Ristretto
I Politics
I Many implementation details
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 29 / 30
Thank you! I
Acknowledgements <3:
I Peter, (+the department, Marrit, Judith, Gerdriaan)
I The LLVM project (especially for llvm-mca)
I Olivier (from SNT; for lending their Sandy Bridge machine)
Stuff I left out:
I Ristretto
I Politics
I Many implementation details
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 29 / 30
Thank you! II
The code is at https://github.com/dsprenkels/curve13318
Extra reading:
I My thesis: https://dsprenkels.com/files/thesis-20190311.pdf
I Monero vulnerability (1): https://nickler.ninja/blog/2017/05/23/exploiting-low-order-
generators-in-one-time-ring-signatures/
I Monero vulnerability (2): https://moderncrypto.org/mail-archive/curves/2017/000898.html
Find me through:
I Email: [email protected]
I PGP key: 951D 6F6E C19E 5D87 1A61 A7F4 1445 C075 FFD5 68CD
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 30 / 30
References I
Barreto, P.S.L.M.: on Twitter (May 2017), https://twitter.com/pbarreto/status/869103226276134912
Bernstein, D.J.: Curve25519: New Diffie-Hellman speed records. pp. 207–228 (2006),https://cr.yp.to/ecdh/curve25519-20060209.pdf
Bernstein, D.J., Duif, N., Lange, T., Schwabe, P., Yang, B.Y.: High-speed high-security signatures. pp. 124–142(2011), https://ed25519.cr.yp.to/ed25519-20110926.pdf
Chou, T.: Sandy2x: New Curve25519 speed records. pp. 145–160 (2016),https://www.win.tue.nl/~tchou/papers/sandy2x.pdf
Genkin, D., Valenta, L., Yarom, Y.: May the Fourth Be With You: A microarchitectural side channel attack onseveral real-world applications of Curve25519. pp. 845–858 (2017), https://eprint.iacr.org/2017/806.pdf
Karatsuba, A., Ofman, Y.: Multiplication of many-digital numbers by automatic computers. Dokl. Akad. NaukSSSR 145(2), 293–294 (1962),http://www.mathnet.ru/php/getFT.phtml?jrnid=dan&paperid=26729&what=fullt&option_lang=eng
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 31 / 30
References II
Kaufmann, T., Pelletier, H., Vaudenay, S., Villegas, K.: When constant-time source yields variable-time binary:Exploiting Curve25519-donna built with MSVC 2015. pp. 573–582 (2016),https://infoscience.epfl.ch/record/223794/files/32_1.pdf
Koblitz, N.: Elliptic curve cryptosystems. Math. Comp. 48, 209–209 (1987), https://www.ams.org/journals/mcom/1987-48-177/S0025-5718-1987-0866109-5/S0025-5718-1987-0866109-5.pdf
luigi1111, ”fluffypony” Spagni, R.: Disclosure of a major bug in cryptonote based currencies (May 2017), https://src.getmonero.org/2017/05/17/disclosure-of-a-major-bug-in-cryptonote-based-currencies.html
Miller, V.S.: Use of elliptic curves in cryptography. pp. 417–426 (1986),https://www.researchgate.net/profile/Victor_Miller/publication/227128293_Use_of_Elliptic_Curves_in_Cryptography/links/0c96052e065c94b47c000000/Use-of-Elliptic-Curves-in-Cryptography.pdf
Perrin, T.: Subject: [curves] CryptoNote and equivalent points (May 2017),https://moderncrypto.org/mail-archive/curves/2017/000898.html
Renes, J., Costello, C., Batina, L.: Complete addition formulas for prime order elliptic curves. pp. 403–428 (2016),http://eprint.iacr.org/2015/1060
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 32 / 30
References III
Schnorr, C.P.: Efficient signature generation by smart cards 4(3), 161–174 (Jan 1991), https://www.researchgate.net/profile/Claus_Schnorr/publication/227088517_Efficient_signature_generation_by_smart_cards/links/0046353849579ce09c000000/Efficient-signature-generation-by-smart-cards.pdf
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 33 / 30
Double-and-add algorithm
function DoubleAndAdd(k ,P) . Compute [k]PR ← Ofor i from n − 1 down to 0 do
R ← [2]R . Doublingif ki = 1 then
R ← R + P . Additionelse
R ← R +O . Additionend if
end forreturn R
end function
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 34 / 30
Fixed-window double-and-add
function FixedWindow(k ,P) . Compute [k]Pk ′ ←Windowsw (k)Precompute ([2]P, ... , [2w − 1]P)R ← Ofor i from n
w − 1 down to 0 dofor j from 0 to w − 1 do
R ← [2]R . w doublingsend forif k ′i 6= 0 then
R ← R + [k ′i ]P . Additionelse
R ← R +O . Additionend if
end forreturn R
end function
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 35 / 30
Signed double-and-addfunction SignedFixedWindow(k ,P) . Compute [k]P
k ′ ← RecodeSigned(Windowsw (k))Precompute ([2]P, ... , [2w−1]P)R ← Ofor i from n
w − 1 down to 0 dofor j from 0 to w − 1 do
R ← [2]R . w doublingsend forif k ′i > 0 then
R ← R + [k ′i ]P . Additionelse if k ′i < 0 then
R ← R − [−k ′i ]P . Additionelse
R ← R +O . Additionend if
end forreturn R
end functionDaan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 36 / 30
Implemented signed double-and-add
function ScalarMultiplication(k ,P) . Compute [k]PT← (O,P, ... , [16]P) . Precompute ([2]P, ... , [16]P)k ′ ← RecodeSigned(Windows5(k))R ← Ofor i from 50 down to 0 do
for j from 0 to 4 doR ← [2]R . 5 doublings
end forif k ′i < 0 then
R ← R − T−k′i
. Additionelse
R ← R + Tk′i
. Additionend if
end forreturn R . R = (XR : YR : ZR)
end function
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 37 / 30
sign exponent mantissa
63 52 0
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 38 / 30
Depiction of top(f )
253bi+1 253bi bi+1 bi
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?fi :
+ 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
+ci :
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0+ 1 ? 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?z ′:
+ 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
−ci :
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0result:
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 39 / 30
Signed windows
k ′3 k ′2 k ′1 k ′0
1011 0010 0110 1110k =
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 40 / 30
Signed window recoding
k ′′4 k ′′3 k ′′2 k ′′1 k ′′0
1011 0010 0110 1110
1 −101 010 111 −010
k =
Daan Sprenkels ECC optimization on Sandy Bridge 1 April 2019 41 / 30