Key Recovery Using Noised Secret Sharing with Discounts Over Large Clouds

1

Key Recovery Using Noised Secret Sharing with Discounts

Over Large Clouds Sushil Jajodia1, W. Litwin2 & Th. Schwarz3

1George Mason University, Fairfax, VA {[email protected]}2Université Paris Dauphine, Lamsade {[email protected]}3Thomas Schwarz, UCU, Montevideo {[email protected]}

mailto:[email protected]



2

What ?• New schemes for an encryption key

recovery• Lost key is always recoverable in practical

time– From a specifically encrypted backup• Using a Noised Secret Sharing Scheme

(NS)–We proposed such schemes in 2012

3

What ?

• The schemes guaranteed1. Recoverability in practical time of

any key backed up using an NS-scheme• E.g. in 10 min max • But only provided a large cloud

1K – 1M nodes

4

What ?• Such a cloud is a scarce & usually

external resource• Possibly expensive & hard to use

illegallyo Money trail, traces in many logs…

• The constraint should make backup disclosure attempt rare• But not impossible

5

What ?• After all, there is also no guarantee that

someone does not pass a red light right on you• Never mind, you still move around, don’t

you ?2. The schemes guaranteed also the absence of any other scheme recovering the key faster than the NS-scheme on the same cloud

6

What ?• We now enhance the proposed schemes:• Key owner escrowing the key backup

may retain a secret discount (code)• Discounted recovery may be orders of

magnitude less complex than the full-cost (brute force) recovery– The only possibility for 2012 schemes

• E.g. 256 times less complex for 1B code

7

What ?• Less complex means:– Less computations all together– Faster recovery on the same cloud• E.g. 256 times for 1B code

– Smaller cloud for the same time• E.g. 256 times for 1B code

• In every case, cheaper in cloud cost– E.g. up to 256 times for 1B code

8

What ?• We present schemes for discounted

recovery• These naturally allow also the full-

cost recovery• But even if this case, these are not

the same schemes as our earlier ones

9

Why• Currently, it’s better to encrypt the

outsourced data• Using client-side encryption– See news

• Especially Big Data– If you count on server-side encryption:

good luck– Also see news

10

Why• You may also enjoy songs about

privacy– By renowned CS Prof Th. Imielinski

(Rutgers U)– http://www.system-crash.com/mp3/09-System%20Crash%20_%

20Privacy.mp3 – http

://www.system-crash.com/mp3/HackersRevenge/05_SystemCrash_HackersRevenge_WeWillForgiveYourFuture.mp3

http://www.system-crash.com/mp3/09-System%20Crash%20_%20Privacy.mp3



http://www.system-crash.com/mp3/HackersRevenge/05_SystemCrash_HackersRevenge_WeWillForgiveYourFuture.mp3



11

Why• Key loss danger is Achilles’ heel of

modern crypto–Makes many folks refraining of

any encryption• With occasionally known disclosure consequences

–Other loose many tears if unthinkable happens

12

Why• Frequent causes of key loss– Corrupted or lost memory & no

backup– Sad loss of the key owner with all

her/his secrets– Disgruntled employee departed

with the key & backups

13

Why• Frequent causes of key disclosure– Hacking of one among many

(simple) backup copies at owner’s site – Sad loss of the key owner with all

her/his secrets– Disgruntled employee departed

with the key & backups

14

Why• Frequent causes of key disclosure– Dishonest super-user / system

administrator• Mandatory escrow of any key copy in many companies

– Insider attack within an escrow service

15

Why• Frequent consequences of key loss– Valuable company data• Possibly years of work by many

people–Unpublished valuable manuscripts• Personally known case

– Possibly critical personal health data• Time-series of CAT Scans, ECGs, Blood

analysis, med. treatments…

16

Why• Frequent consequences of key

disclosure– Listen to news

How ?Backup Encryption

• Client (Key owner) X backs up encryption key K• X estimates 1-node inhibitive recovery

time D–Say 70 days

• D measures trust to Escrow–Lesser trust ? • Choose 700 days

17

18

Client Side Encryption• D determines minimal cloud size N for

future recovery in any acceptable calculus time R –Chosen by recovery requestor• E.g. 10 min

–X expects N > D / R but also N D / R • E.g. N 10K for D = 70 days– N 100K for D = 700 days

19

Client Side Encryption• X creates a classical shared secret for K– Sees K as a large integer, • E.g., 256b long for AES

–Basically, X creates a 2-share secret –Share s0 is a random integer

– Share s1 is s1 = s0 XOR K

• Common knowledge:– K = s0

XOR s1

20

Client Side Encryption• X transforms the shared secret into a noised one– X makes s0 a noised share :• Chooses a 1-way hash H– E.g. SHA 256

• Computes the hint h = H (s0)– Chooses the noise space

I = 0,1…,m,…M-1– For some large M determined as we explain

soon

21

Client-Side Encryption

– Each noise m and s0 define a noise share s• In a way we show soon as well

– There are M different pseudo random noise shares• All but one are different from s0

• But it is not known which one is s0 – The only way to find for any s whether

s = s0 is to attempt the match

H (s) ?= h

22

Noised Secret Sharing

23

Client Side Encryption

• X estimates the 1-node throughput T – # of match attempts H (s) ?= h per time unit• 1 Sec by default

• X sets M to M = Int (DT).– M may be expected 240 ÷ 250 in practice

• M determines the worst-case and the average complexity of the recovery

• Respectively O(M) and O(M / 2)• As we will show

24

Client Side Encryption

• X forms base noise share f = p | 0 • Hides the noised share as s0

n = (f, M, h) in its noise share space [f, f + M[ • Sends the noised shared secret

S’ = (s0n, s1) to Escrow as the key backup

25

Client Side Encryption : Discount Code

• X may define some discount code d• Basically, d is the m – bit suffix of s0

• In practice, we expect m = 8, 16. • Represented in ASCII or Unicode d is then

simply one or two digits, e.g., ‘j’. • X retains d secretly for future recovery

request(or)

26

Discount Code Actiono 2m times reduced noise space I ; M’ = M / 2m

o 2m times reduced noise share space S’

27

Escrow-Side Recovery (Backup Decryption)

• Escrow E receives legitimate request of S recovery in time R at most

• E sends data S” = (s0n, R) to some cloud

node C with request for processing accordingly–Keeps s1 out of the cloud

• C will choose between static or scalable recovery schemes

28

Recovery Processing Parameters

• Node load Ln : # of noises among M assigned to node n for match attempts

• Throughput Tn : # of match attempts node n can process / sec

• Bucket (node) capacity Bn : # of match attempts node n can process / time R–Bn = R Tn

• Load factor n = Ln / Bn

29

Node LoadL

B

T

1

α

LB

T

LB

T

Overload Normal load Optimal load

1

R

t

30

Recovery Processing Parameters

• Notice the data storage oriented vocabulary• Node n respects R iff n ≤ 1–Assuming T constant during the processing

• The cloud respects R if for every n we have n ≤ 1• This is our goal –For both static and scalable schemes we

now present

31

Static Scheme

• Intended for a homogenous Cloud– All nodes provide the same throughput

32

Static Scheme Full-Cost Recovery Init Phase

• Node C that got S” from E becomes coordinator

• Calculates a (M) = M / B (C)

–Usually (M) >> 1• Defines N as a (M) –Implicitly considers the cloud as

homogenous• E.g., N = 10K or N = 100K in our ex.

33

Static Scheme Full-Cost Recovery Map Phase

• C asks for allocation of N-1 nodes • Associates logical address n = 1, 2…N-1

with each new node & 0 with itself• Sends out to every node n data (n, a0, P)

–a0 is its own physical address, e.g., IP–P specifies Reduce phase

34

Static Scheme Full-Cost Recovery Reduce Phase

• P requests node n to attempt matches for every noise share s = (f + m) such that n = m mod N• In practice, e.g., while m < M: –Node 0 loops over noise m = 0, N, 2N…• So over the noise shares f, f + N, f + 2N…

–Node 1 loops over noise m = 1, N+1, 2N+1…–…..–Node N – 1 loops over m = (your guess here)

35

Static Scheme : Node Load for Full Cost Recovery

T

1

α

0 1 2 N - 1

1

R, B

t, L

……..f + 2Nf + N f

……..f + 2N + 1f + N + 1 f + 1

……..f + 2N + 2f + N + 2 f + 2

……..f + 3N - 1f + 2N - 1 f + N - 1

M

36

Static Scheme : Termination• Node n that gets the successful match

sends s to C• Otherwise node n locally terminate• C asks every node to terminate– Details depend on actual cloud

• C forwards s as s0 to E

37

Static Scheme : Termination• E discloses the secret S and sends S to

Requestor– Bill included (we guess)

• E.g., up to 400$ on CloudLayer for –D = 70 days–R = 10 min– Both implied N = 10K with private

option

38

Static Scheme

• Observe that N ≥ D / R and N D / R – If the initial estimate of T by S owner holds

• Observe also that for every node n, we have(n) ≤ 1– Provided the initial estimate of T by C

holds in time

39

Static Scheme

• Under above assumptions maximal recovery time is indeed R–See the papers for details

• Average recovery time is R / 2 –Every noise share is indeed equally likely to

be the lucky one

40

Static Scheme : Discounted Recovery• m < M’ = M / 2m : m is bit length of d

M’ M’ M’

41

Static Scheme : Discounted Recovery

• If full-cost recovery needs 10K nodes for R = 10 min max. recovery calculus time and m = 8 then:

• The 10 min max. recovery requires only 40 – node wide cloud

• The 10K – node wide cloud suffices for R = 2.3 sec

• 1K cloud suffices for the recovery time up to 25 sec

• Etc

42

Static Scheme

• See papers for –Details, –Numerical examples – Proof of correctness•The scheme really partitions I•Whatever is N and s0, one and only one node finds s0

43

Static Scheme• Safety–No disclosure method can in practice

offer the (1-node) calculus faster than the scheme–Dictionary attack, inverted file of

hints…• Other properties

44

Scalable Scheme

• Heterogeneous cloud– Throughput may differ among the nodes

45

Scalable Scheme

• Intended for heterogenous clouds– Different node throughputs– Basically only locally known

• E.g. –Private or hybrid cloud–Public cloud without so-called private

node option

46

Scalable Scheme

• Each node n : n = 0,1… recursively splits its load through hash partitioning• Until its load factor n decreases to n ≤ 1• See our previous papers for details

47

Scalable Scheme• Init phase similar up to (M) for full cost or

(M’) for discounted recovery calculus – Basically (M) and (M’) >> 1 – Also we note it now 0

• If > 1 we say that node overflows• Node 0 sets then its level j to j = 0 and splits – Requests node 2j = 1– Sets j to j = 1 – Sends to node 1, (S”, j, a0)

48

Scalable Scheme• As result –There are N = 2 nodes– Both have j = 1–Node 0 and node 1 should each process M / 2

match attempts• We show precisely how on next slides

– Iff both 0 and 1 are no more than 1

• Usually it should not be the case• The splitting should continue as follows

49

Scalable Scheme• Recursive rule– Each node n splits until n ≤ 1

– Each split increases node level jn to jn + 1 – Each split creates new node n’ = n + 2jn – Each node n’ gets jn’ = jn initially

• Node 0 splits thus perhaps into nodes 1,2,4… • Until 0 ≤ 1

• Node 1 starts with j= 1 and splits into nodes 3,5,9…• Until 1 ≤ 1

50

Scalable Scheme

• Node 2 starts with j = 2 and splits into 6,10,18… • Until 2 ≤ 1

• Your general rule here • Node with smaller T splits more times

and vice versa

51

Scalable Scheme : Splitting

5

B

α = 0.5

3

0 1 2 4 8 16

1 2 4

α = 0.8α = 0.7

5

Node 0 split 5 times. Other nodes did not split.

j

T

T T

52

Scalable Scheme• If cloud is homogenous, the address

space is contiguous• Otherwise, it is not– No problem– Unlike for a extensible or linear hash

data structure

53

Scalable Scheme : Reduce phase

• Every node n attempts matches for every noise k [0, M-1] such that n = k mod 2jn. • If node 0 splits three times, in Reduce

phase it attempts to match noised shares (f + k) with k = 0, 8, 16…• If node 1 splits three times, it attempts to

match noised shares (f + k) with k = 1, 17, 33…• Etc.

54

Scalable Scheme : Reduce Phase

3

B

α = 0.5

0 1

4j

T

….f + 16f + 8f

….f + 33f + 17f + 1

L

55

Scalable Scheme• N ≥ D / R– If S owner initial estimate holds

• For homogeneous cloud, N is 30% greater on the average and twice as big at worst / static scheme• Cloud cost may still be cheaper– No need for private option

• Versatility may still make it preferable besides

56

Scalable Scheme• Max recovery time is up to R– Depends on homogeneity of the cloud

• Average recovery time is up to R /2• See again the papers for – Examples – Correctness– Safety– …–Detailed perf. analysis remains future work

57

Multi-share noised secret sharing• For max recovery time R arbitrarily close to the average one : R =

Rave * (k + 1) / k for M = D T / k ; M’ = M / 2m

58

Multi-share noised secret sharing• We now have – R = Rave * (k + 1) / k–Welcome capability for some

• Drawback: discount code becomes randomly valued d = (d0…dk-1) E.g. d = (j,…,a) at the figure

• Each di is m – bit wide• Perhaps harder to retain for k > 4 • See the paper for paths to solutions• See previous papers for scheme analysis

59

Related Work

• RE scheme for outsourced LH* files • CSCP scheme for outsourced LH* records

sharing• Crypto puzzles• One way (hash) functions with trapdoors• 30-year old excitement around Clipper chip• Botnets• Snowden’s Leaks (so-called)

Conclusion• NS-schemes with discounts effectively

relieve the fear of encryption key loss/disclosure – Achilles’ heel of cryptography

• Key is always recoverable from its backup• Especially easily through a discounted

recovery 60

Conclusion• The backup disclosure is hard at will• So can be also arbitrarily unlikely• But remains always a possibility• Like a driver going through a red light

right on us • Despite all precautions• But we still drive our cars, don’t we ?

61

62

Conclusion• To our best knowledge, our technique is

the only with the discussed properties at present• It appears of interest to many apps • It should especially help potential BigData

users• Deterred up to now for many • By potentially catastrophic consequences

of outsourced data loss / disclosure

63

Conclusion• NS sharing is a generalization of the well-

known secret sharing• An arbitrary additional cost C still protects the

secret, even if one gets all the shares – Lowered at will by the discounts

• The “classical” scheme amounts to C = 0• NS sharing appears of general interest for the

cryptography–Beyond our current use

64

Conclusion• Consider a company secret split into

shares among some insiders• With NS sharing, the secret is still

arbitrarily protected even if a collusion / hacking among the insiders happens – So that someone discloses all the shares

• Not with the classical algorithm

65

Conclusion• The scalable partitioning scheme

appears also a general purpose tool• Could be called Scalable Map-Reduce• Apparently failed dream of Amazon’s cloud

service– Under the name of Elastic Map-Reduce

66

Future work• Deeper formal analysis• Proof of concept implementation– Started using the popular open source

Vagrant VM for the scalable scheme• Experiments show 2.7s on the

average for a node to (re)start Vagrant on another node so it is ready to process data sent in.

67

Future work• One should also test the static scheme

on some easily available Map-Reduce implementation•E.g. Amazon EMR in commercial world

68

Future work• Nevertheless, all things considered,

the static scheme with discounts appears ready for use• All ‘bricks’ are already out there• Ready to be assembled

69

Thanksfor

Your Attention

Witold LITWIN & al

Documents

Key Recovery Using Noised Secret Sharing with Discounts Over Large Clouds