
Efficient Private Approximation Protocols

Piotr Indyk, David Woodruff

Work in progress

Outline

1. Private approximation of L2 distance

2. Private near neighbor

3. Private approximate near neighbor

1. Private approximation of L2 distance

Secure communication

Alice has $a \in \{0,1\}^n$; Bob has $b \in \{0,1\}^n$.

• Want to compute some function $F(a,b)$
• Security: the protocol does not reveal anything except for the value $F(a,b)$
  – Semi-honest: both parties follow the protocol
  – Malicious: parties are adversarial
• Efficiency: want to exchange few bits

Secure Function Evaluation (SFE)

• [Yao, GMW]: If $F$ is computed by a circuit $C$, then $F$ can be computed securely with $\tilde{O}(|C|)$ bits of communication
• [GMW] + … + [NN]: can assume parties are semi-honest
  – A semi-honest protocol can be compiled to give security against malicious parties
• Problem: circuit size is at least linear in $n$

* $\tilde{O}(\cdot)$ hides factors poly(k, log n)

Secure and Efficient Function Evaluation

• Can we achieve sublinear communication?

• Ideally: secure computation with communication comparable to insecure case

• With sublinear communication, many interesting problems can be solved only approximately.

• What does it mean to have a private approximation?

Private Approximation

• [FIMNSW’01]: A protocol computing an approximation G(a,b) of F(a,b) is private, if each party can simulate its view of the protocol given the exact value F(a,b)

• Note: not sufficient to simulate non-private G(a,b) using SFE

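• Example:
  – Define $G(a,b)$ bit by bit, where $\Delta(a,b)$ is the Hamming distance and bit 0 is the least significant bit:
    • $\mathrm{bin}(G(a,b))_i = \mathrm{bin}(\Delta(a,b))_i$ if $i > 0$
    • $\mathrm{bin}(G(a,b))_0 = a_0$
  – $G(a,b)$ is a $\pm 1$-approximation of $\Delta(a,b)$, but not private: its low bit reveals $a_0$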
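The example can be checked concretely. The following is a minimal sketch, assuming bit 0 means the least significant bit and that inputs are Python lists of 0/1 values; the function names are hypothetical.

```python
def hamming(a, b):
    """Exact Hamming distance between two equal-length bit lists."""
    return sum(x != y for x, y in zip(a, b))

def leaky_approx(a, b):
    """G(a,b): the Hamming distance with its lowest bit overwritten by a[0]."""
    d = hamming(a, b)
    return (d & ~1) | a[0]

a = [1, 0, 1, 1]
b = [0, 0, 1, 0]
exact = hamming(a, b)        # positions 0 and 3 differ
approx = leaky_approx(a, b)  # off by at most 1, yet its parity equals a[0]
print(exact, approx, approx & 1 == a[0])
```

Whoever sees the output learns Alice's first bit, which does not follow from the exact distance alone. That is why a good approximation is not automatically a private one.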

Concrete Pitfall: Dimension Reduction

• A basic problem: Hamming distance $\Delta(a,b)$
• Approximate decision version: with prob. $1-\delta$,
  – If $\Delta(a,b) \le r$, answer NO
  – If $\Delta(a,b) \ge r(1+\epsilon)$, answer YES
• [Kushilevitz-Ostrovsky-Rabani’98]:
  – Create an $m \times n$ binary matrix $D$, where $\Pr[D_{ij}=1] = 1/(2r)$, for $m = \tilde{O}(\log(1/\delta)/\epsilon^2)$
  – Exchange $Da$, $Db$ (mod 2)
  – Answer YES if $\mathrm{wt}[D(a-b)] > r'$, where $r'$ is a function of $r$ and $\epsilon$

NOTE: This protocol was not designed to be private
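A minimal sketch of the sketching step in the clear, with no privacy at all. The sizes, the seed and the helper names are hypothetical; only the entry probability 1/(2r) and the mod-2 products come from the slide.

```python
import random

def kor_matrix(m, n, r, rng):
    """m x n binary matrix, each entry 1 independently with probability 1/(2r)."""
    p = 1.0 / (2 * r)
    return [[1 if rng.random() < p else 0 for _ in range(n)] for _ in range(m)]

def sketch(D, v):
    """Compute Dv with every inner product reduced mod 2."""
    return [sum(d & x for d, x in zip(row, v)) % 2 for row in D]

def sketch_weight(sa, sb):
    """wt[D(a-b)]: number of positions where the two sketches differ."""
    return sum(s != t for s, t in zip(sa, sb))

rng = random.Random(0)              # fixed seed so the demo is repeatable
n, r, m = 256, 8, 2000              # hypothetical sizes
D = kor_matrix(m, n, r, rng)
a = [0] * n
b_near = [1 if j < r else 0 for j in range(n)]       # distance r from a
b_far = [1 if j < 4 * r else 0 for j in range(n)]    # distance 4r from a
w_near = sketch_weight(sketch(D, a), sketch(D, b_near))
w_far = sketch_weight(sketch(D, a), sketch(D, b_far))
print(w_near, w_far)
```

The far pair should give a visibly larger sketch weight than the near pair, which is what a threshold r' between the two exploits.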

Non-Privacy of KOR

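• Let $x = a - b$. If $\mathrm{wt}(x) = r$ and $r \log n \approx m$, then one can recover $x$ from $D$, $Dx$ in $O(mn)$ time!

• Algorithm: for $j = 1 \ldots n$, estimate (with $d_i$ the $i$-th row of $D$)

  $\Pr[\langle d_i, x\rangle = 1 \mid d_{ij} = 1] = \Pr[\langle d_i, x\rangle = 1 \wedge d_{ij} = 1] / \Pr[d_{ij} = 1]$

  – If $x_j = 1$ then $\Pr[\langle d_i, x\rangle = 1 \mid d_{ij} = 1]$ is high
  – If $x_j = 0$ then $\Pr[\langle d_i, x\rangle = 1 \mid d_{ij} = 1]$ is low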
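The recovery idea can be sketched as follows. The cut-off of 1/2 between "high" and "low" is an assumption made for this illustration, as are the sizes and the name `recover_x`; the slide does not fix a threshold.

```python
import random

def recover_x(D, y):
    """Guess x from D and y = Dx (mod 2), one column at a time, in O(mn) time."""
    m, n = len(D), len(D[0])
    x_hat = []
    for j in range(n):
        rows = [i for i in range(m) if D[i][j] == 1]
        ones = sum(y[i] for i in rows)
        # Empirical Pr[<d_i, x> = 1 | d_ij = 1]; above 1/2 is read as x_j = 1.
        x_hat.append(1 if rows and 2 * ones > len(rows) else 0)
    return x_hat

rng = random.Random(1)
n, r, m = 64, 4, 4000               # hypothetical sizes, m well above r log n
x = [0] * n
for j in rng.sample(range(n), r):   # a secret difference vector of weight r
    x[j] = 1
D = [[1 if rng.random() < 1 / (2 * r) else 0 for _ in range(n)] for _ in range(m)]
y = [sum(d & xj for d, xj in zip(row, x)) % 2 for row in D]
x_hat = recover_x(D, y)
print(x_hat == x)
```

With wt(x) = r and entry probability 1/(2r), conditioning on d_ij = 1 flips the parity exactly when x_j = 1, so the two conditional probabilities sit on opposite sides of 1/2.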

Approximating Hamming Distance

• [FIMNSW’01]: A private protocol with complexity $\tilde{O}(n^{1/2}/\epsilon)$
  – wt(x) small: compute wt(x) using $\tilde{O}(\mathrm{wt}(x))$ bits
  – wt(x) high: sample $\tilde{O}(n/\mathrm{wt}(x))$ coordinates $x_i$, estimate wt(x)

• Our result:
  – Complexity: $\tilde{O}(1/\epsilon^2)$ bits
  – Works even for the L2 norm, i.e., estimates $\|x\|_2$ for $a, b \in \{1 \ldots M\}^n$

* $\tilde{O}(\cdot)$ hides factors poly(k, log n, log M, log 1/δ)

Crypto Tools

• SFE of circuits [Yao’86]: $\tilde{O}(|\mathrm{circuit}|)$ communication
• Efficient SPIR or $\mathrm{OT}^n_1$:
  – Alice has $A[1] \ldots A[n] \in \{0,1\}^m$, Bob has $i \in [n]$
  – Goal: Bob privately learns $A[i]$ and that’s it
  – Can be done using $\tilde{O}(m)$ communication [CMS99, NP99]
• Circuits with ROM [Naor, Nissim’01]:
  – Standard AND/OR/NOT gates
  – Lookup gates:
    • In: $i$
    • Out: $M_{\mathrm{gate}}[i]$
  – Takes care of the security of computation:
    • begin secure … end secure
  – Can just focus on privacy of the output

Communication at most $\tilde{O}(m|C|)$

High-dimensional tools

• Random projection:
  – Take a random orthonormal $n \times n$ matrix $D$, that is, $\|Dx\| = \|x\|$ for all $x$.
  – There exists $c > 0$ s.t. for any $x \in R^n$, $i = 1 \ldots n$:

    $\Pr[(Dx)_i^2 > k \cdot \|Dx\|^2/n] < e^{-ck}$
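A small numerical illustration of both properties. It assumes the random orthonormal matrix is drawn by running Gram-Schmidt on Gaussian vectors, which is one common way to do it and is not stated on the slide; the dimension and seed are hypothetical.

```python
import math
import random

def random_orthonormal(n, rng):
    """n x n matrix with orthonormal rows, via Gram-Schmidt on Gaussian vectors."""
    rows = []
    while len(rows) < n:
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        for u in rows:                       # remove components along earlier rows
            dot = sum(vi * ui for vi, ui in zip(v, u))
            v = [vi - dot * ui for vi, ui in zip(v, u)]
        norm = math.sqrt(sum(vi * vi for vi in v))
        if norm > 1e-9:                      # skip a (very unlikely) degenerate draw
            rows.append([vi / norm for vi in v])
    return rows

def apply(D, x):
    """Matrix-vector product Dx."""
    return [sum(dij * xj for dij, xj in zip(row, x)) for row in D]

rng = random.Random(2)
n = 32
D = random_orthonormal(n, rng)
x = [0.0] * n
x[0] = 10.0                                  # all mass on a single coordinate
y = apply(D, x)
norm_x2 = sum(v * v for v in x)
norm_y2 = sum(v * v for v in y)
# Largest squared coordinate of Dx, in units of the average ||Dx||^2 / n.
spread = max(v * v for v in y) / (norm_y2 / n)
print(norm_x2, norm_y2, spread)
```

The norm is preserved, while the mass that sat on one coordinate of x is spread out so that no coordinate of Dx is far above the average.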

Approximating $\|a-b\|_2$

• Recall:
  – Alice has $a \in [M]^n$, Bob has $b \in [M]^n$
  – Goal: estimate $\|x\|^2$, $x = a - b$

Algorithm
• Alice and Bob create a random orthonormal matrix $D$ such that, for each $i = 1 \ldots n$, $(Dx)_i^2 < k\|x\|^2/n$
• $T = M^2 n + 1$
• Repeat
  – {Assertion: $\|x\|^2 \le T$}
  – Invoke PRIVATESAMPLE to get $L = \tilde{O}(1/\epsilon^2)$ independent bits $z_i$ such that $\Pr[z_i = 1] = \|Dx\|^2/(Tk)$
  – $T = T/2$
• Until $\sum_i z_i \ge L/(4k)$
• Output $E = (\sum_i z_i / L) \cdot 2Tk$ as an estimate of $\|x\|^2$

Correctness:
  – Unbiased estimator
  – High probability from Chernoff bound

SECURE!

PRIVATESAMPLE

Generate independent bits $z_i$ with $E[z_i] = \|Dx\|^2/(Tk)$

• $P = Tk/n$
• Pick random $t \in [n]$
• Retrieve $(Da)_t$, $(Db)_t$
• Compute $(Dx)_t = (Da)_t - (Db)_t$
• Define $v = [(Dx)_t]^2$
• If $v \le P$ then generate $z$ s.t. $\Pr[z = 1] = v/P$, else output fail
• Output $z$

Correct as long as $(Dx)_i^2 < Tk/n$ for each $i = 1 \ldots n$

SECURE!
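The sampling steps can be written out in the clear to check the bias of z. This sketch performs no oblivious retrieval and no secure computation; it only reproduces the arithmetic, and the function name and toy vectors are hypothetical.

```python
import random

def private_sample_in_clear(Da, Db, T, k, rng):
    """One draw of z with E[z] = ||Dx||^2 / (T k); returns None for 'fail'."""
    n = len(Da)
    P = T * k / n
    t = rng.randrange(n)                 # random coordinate t
    v = (Da[t] - Db[t]) ** 2             # v = (Dx)_t^2
    if v > P:
        return None                      # the slide's 'fail' branch
    return 1 if rng.random() < v / P else 0

rng = random.Random(3)
Da = [3.0, 2.0, 1.0, 0.0]                # toy values, so that (Dx)_t = 1 for every t
Db = [2.0, 1.0, 0.0, -1.0]
T, k = 8.0, 2.0                          # then ||Dx||^2 / (T k) = 4 / 16
draws = [private_sample_in_clear(Da, Db, T, k, rng) for _ in range(20000)]
mean = sum(draws) / len(draws)
print(mean)
```

Averaging over the random coordinate t gives (1/n) Σ_t (Dx)_t² / P = ||Dx||²/(Tk), which is the stated expectation.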

Algorithm, again
• Alice and Bob create a random* orthonormal** matrix $D$ such that, for each $i = 1 \ldots n$, $(Dx)_i^2 < k\|x\|^2/n$
• $T = M^2 n + 1$
• Repeat
  – {Assertion: $\|x\|^2 \le T$}
  – Invoke PRIVATESAMPLE to get $L = \tilde{O}(1/\epsilon^2)$ independent bits $z_i$ such that $\Pr[z_i = 1] = \|Dx\|^2/(Tk)$
    {Works as long as $(Dx)_i^2 < Tk/n$ for each $i = 1 \ldots n$}
  – $T = T/2$
• Until $\sum_i z_i \ge L/(4k)$
• Output $E = (\sum_i z_i / L) \cdot 2Tk$ as an estimate of $\|x\|^2$

If the Assertion is not true, then $\Pr[z_i = 1] > 1/(2k)$ $\Rightarrow$ $E[\sum_i z_i] > L/(2k) \gg L/(4k)$

Simulation

SIMULATION
• Repeat
  – Choose $L$ independent bits $z_i$ such that $\Pr[z_i = 1] = \|x\|^2/(Tk)$
  – $T = T/2$
• Until $\sum_i z_i \ge L/(4k)$
• Output $E = (\sum_i z_i / L) \cdot 2Tk$ as an estimate of $\|x\|^2$

ALGORITHM
• Repeat
  – {Assertion: $\|x\|^2 \le T$}
  – Invoke PRIVATESAMPLE to get $L$ independent bits $z_i$ such that $\Pr[z_i = 1] = \|Dx\|^2/(Tk)$
  – $T = T/2$
• Until $\sum_i z_i \ge L/(4k)$
• Output $E = (\sum_i z_i / L) \cdot 2Tk$ as an estimate of $\|x\|^2$

Recall:
• $\|Dx\| = \|x\|$, so the bits in the two procedures have the same distribution

Communication: $\tilde{O}(1/\epsilon^2)$
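A minimal sketch of the halving loop as the simulator would run it, driven only by the value ||x||². The clipping of the probability to 1 and the early return for a zero norm are additions made so that the sketch always terminates; they are assumptions of this illustration and are not on the slide. The parameter values are hypothetical.

```python
import random

def estimate_norm_sq(norm_sq, T0, k, L, rng):
    """Halving loop with bits of bias norm_sq / (T k); returns the estimate E."""
    if norm_sq <= 0:
        return 0.0                           # assumption: nothing to estimate
    T = T0
    while True:
        p = min(1.0, norm_sq / (T * k))      # assumption: clip to a valid probability
        hits = sum(1 for _ in range(L) if rng.random() < p)
        T = T / 2
        if hits >= L / (4 * k):
            # T was halved after sampling, so 2*T is the value used for the bits.
            return hits / L * 2 * T * k

rng = random.Random(5)
M, n, k, L = 10, 100, 4, 20000               # hypothetical parameters
true_norm_sq = 300.0
E = estimate_norm_sq(true_norm_sq, M * M * n + 1, k, L, rng)
print(E)
```

Because the function never looks at x itself, anything it outputs could have been produced from the exact value ||x||² alone, which is the point of the simulation argument.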

2. Private near neighbor

Private Near Neighbor

Alice has $P = p_1, p_2, \ldots, p_n \in \{1, 2, \ldots, U\}^d = [U]^d$; Bob has $q \in [U]^d$

Distance function: $f(x,y)$

Correctness: Bob learns $\min_i f(q, p_i)$

Privacy: Alice learns nothing, Bob learns nothing else

Goal: Minimize communication


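| Distance function | Previous [DA] | Our Results |
|---|---|---|
| $f(a,b) = \sum_i f_i(a_i, b_i)$ | $\tilde{O}(ndU)$ | $\tilde{O}(dU+n)$ |
| L2 | $\tilde{O}(nd)$ | $\tilde{O}(n+d)$ |
| Generalized Hamming | $\tilde{O}(ndU)$ | $\tilde{O}(d^2+n)$ |
| Set Difference | $\tilde{O}(ndU)$ | $\tilde{O}(n+d)$ |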

[DA] needs 3rd party, we don’t

Approach: homomorphic encryption + secure function evaluation (SFE)

n points, dimension d, universe [U]

“Coordinate-wise” distance functions

$f(a,b) = \sum_i f_i(a_i, b_i)$

Alice has $P = p_1, p_2, \ldots, p_n \in [U]^d$; Bob has $q \in [U]^d$

Bob:
1. For each coordinate $j$, create a degree-$(U-1)$ polynomial $g_j(x) = \sum_i a_{i,j} x^i$ such that $g_j(u) = f_j(q_j, u)$ for all $u \in [U]$
2. Generate (SK, PK) for the Paillier encryption scheme. Send PK and $E_{PK}(a_{i,j})$ for all $i, j$

Alice:
1. For all $i$, compute $E(\sum_j g_j(p_{i,j})) = E(f(q, p_i))$, using the homomorphic properties:
   – $E(x), E(y) \to E(x + y)$
   – $E(x), c \to E(cx)$

SFE:
Inputs: Alice has $E(f(q, p_i))$; Bob has SK
1. Bob gets $\min_i D_{SK}(E(f(q, p_i)))$
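The polynomial trick can be traced end to end with a toy Paillier instance. This is a sketch under loud assumptions: the primes are tiny and insecure, the generator is fixed to N + 1, the distance f_j(a,b) = (a-b)² is just an example, all helper names are hypothetical, and the final decryption stands in for the SFE step (in the protocol Bob only learns the minimum, never a single distance). It needs Python 3.8+ for `pow(x, -1, N)`.

```python
import math
import random

def keygen(p=1009, q=1013):
    """Toy Paillier keys from tiny primes (insecure, illustration only)."""
    N = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)
    return N, (lam, pow(lam, -1, N))          # mu = lam^-1 works for g = N + 1

def enc(N, msg, rng):
    r = rng.randrange(1, N)
    while math.gcd(r, N) != 1:
        r = rng.randrange(1, N)
    return (1 + (msg % N) * N) * pow(r, N, N * N) % (N * N)

def dec(N, sk, c):
    lam, mu = sk
    return (pow(c, lam, N * N) - 1) // N * mu % N

def interpolate_mod(points, N):
    """Coefficients (lowest degree first) of the polynomial through points, mod N."""
    coeffs = [0] * len(points)
    for i, (xi, yi) in enumerate(points):
        basis, denom = [1], 1
        for j, (xj, _) in enumerate(points):
            if j != i:
                # Multiply the running basis polynomial by (x - xj).
                basis = [(prev - xj * cur) % N
                         for prev, cur in zip([0] + basis, basis + [0])]
                denom = denom * (xi - xj) % N
        scale = yi * pow(denom, -1, N) % N
        coeffs = [(c + scale * b) % N for c, b in zip(coeffs, basis)]
    return coeffs

def encrypted_distance(N, enc_coeffs, point):
    """Alice: E(sum_j g_j(point_j)) from E(a_ij), using only E(x)E(y) and E(x)^c."""
    c = 1                                     # trivial encryption of 0
    for j, coeffs_j in enumerate(enc_coeffs):
        for i, e in enumerate(coeffs_j):
            c = c * pow(e, pow(point[j], i, N), N * N) % (N * N)
    return c

rng = random.Random(6)
N, sk = keygen()
U, q, p1 = 5, [1, 4, 2], [3, 3, 5]            # hypothetical query and data point
f = lambda a, b: (a - b) ** 2                 # example coordinate-wise distance
enc_coeffs = [[enc(N, a, rng)
               for a in interpolate_mod([(u, f(qj, u)) for u in range(1, U + 1)], N)]
              for qj in q]                    # Bob's message
c = encrypted_distance(N, enc_coeffs, p1)     # Alice's computation
result = dec(N, sk, c)                        # stand-in for the SFE step
print(result)
```

Alice touches only ciphertexts, and Bob sends d·U encrypted coefficients in total. A real run would also have Alice re-randomize her ciphertexts before handing them to the SFE.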

Generic distance functions

Security:
1. Replace SFE with an oracle
2. Alice’s view is indistinguishable from PK, E(0), E(0), …, E(0), since E is semantically secure
3. Bob’s view is just the output

Efficiency:
1. Sending the polynomials: $\tilde{O}(dU)$
2. SFE: $\tilde{O}(n)$ (simple circuit)


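| Distance function | Previous [DA] | Our Results |
|---|---|---|
| “Pointwise” distance | $\tilde{O}(ndU)$ | $\tilde{O}(dU+n)$ |
| L2 | $\tilde{O}(nd)$ | $\tilde{O}(n+d)$ |
| Generalized Hamming | $\tilde{O}(ndU)$ | $\tilde{O}(d^2+n)$ |
| Set Difference | $\tilde{O}(ndU)$ | $\tilde{O}(n+d)$ |

n points, dimension d, universe [U]

(homomorphic tricks)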

• Alice has $x_1, \ldots, x_n \in \{0,1\}^d$, Bob has $y_1, \ldots, y_n \in \{0,1\}^d$, threshold $t$

• Bob gets all $x_i$ s.t. $\Delta(x_i, y_j) < t$ for some $j$

• Communication: $\tilde{O}(n^2 + nd^2)$. Resolves open question of [FNP04]:

• [FNP04] achieve O~((d choose t)nt), which may be superpolynomial in n

3. Private Approximate Near Neighbor


• Drawback: Protocols depend linearly on # points n

• Necessary? Not if algebraically homomorphic E exists

• Our approach: solve the approximate problem

Private c-Approximate Near Neighbor

Alice has $P = \{p_1, \ldots, p_n\} \subseteq \{0,1\}^d$, Bob has $q \in \{0,1\}^d$

Notation: $P_r = P \cap B(q, r)$

Correctness: $P_r$ nonempty $\Rightarrow$ Bob learns some element of $P_{cr}$

Privacy: Bob’s view is simulatable given $q$ and $P_{cr}$

Private Approximate Near Neighbor

Definition Remarks:

Privacy: Don’t care what Bob gets as long as it follows from $P_{cr}$. Simulator gets $P_{cr}$

Correctness: Don’t specify anything if $P_r$ is empty, but the view is still simulatable

Our results:

- $\tilde{O}(n^{1/2} + d)$

- If Bob just wants some coordinate of an element of $P_{cr}$, then improve to $\tilde{O}(n^{1/2} + \mathrm{polylog}(d))$

Private Approximate Near Neighbor

Two approaches:

1. Dimensionality Reduction in Hamming Cube [KOR98]

2. Locality Sensitive Hashing [IM98]

This talk: protocol using #1

Dimensionality Reduction

• [KOR]: Let $A$ be a random $m \times d$ binary matrix, $m = O(\log d/\epsilon^2)$

• Then there is a separator $r'$ s.t. with probability $1 - 1/n^2$, for any $p, q \in \{0,1\}^d$:
  1. $\Delta(p,q) > cr \Rightarrow \Delta(Ap, Aq) > r'$
  2. $\Delta(p,q) \le r \Rightarrow \Delta(Ap, Aq) < r'$

Idea: Alice
1. Applies $A$ to $P$, so the dimension becomes small
2. Enumerates all $w \in \{0,1\}^m$, forms the array $B[w] = \{p \in P$ s.t. $\Delta(Ap, w) < r'\}$
3. Uses Oblivious ROM
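Alice's array can be sketched in the clear as a dictionary keyed by the projected query. The entry probability 1/(2r) for A is borrowed from the KOR construction earlier in the talk and is an assumption here, as are the parameter values and helper names; the oblivious lookup itself is not modeled.

```python
import itertools
import random

def hamming(u, v):
    return sum(x != y for x, y in zip(u, v))

def project(A, p):
    """Ap with inner products mod 2, as a tuple so it can index the array."""
    return tuple(sum(a & x for a, x in zip(row, p)) % 2 for row in A)

def build_buckets(A, P, r_prime):
    """B[w] = all p in P with Hamming(Ap, w) < r_prime, for every w in {0,1}^m."""
    projected = [(project(A, p), p) for p in P]
    return {
        w: [p for Ap, p in projected if hamming(Ap, w) < r_prime]
        for w in itertools.product((0, 1), repeat=len(A))
    }

rng = random.Random(7)
d, m, r, r_prime = 16, 6, 2, 2               # hypothetical parameters
A = [[1 if rng.random() < 1 / (2 * r) else 0 for _ in range(d)] for _ in range(m)]
P = [tuple(rng.randrange(2) for _ in range(d)) for _ in range(20)]
q = P[0]                                     # a query that coincides with a data point
B = build_buckets(A, P, r_prime)
candidates = B[project(A, q)]                # the entry Bob would read obliviously
print(len(B), len(candidates))
```

The array has 2^m entries, which stays manageable only because m is logarithmic. Bob's lookup index Aq is what the oblivious ROM has to hide from Alice.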

Dimensionality reduction protocol

Protocol:

1. Randomly sample $\tilde{O}(n^{1/2})$ points $P_1$. If $|P_{cr}| > n^{1/2}$, then $P_1 \cap P_{cr} \ne \emptyset$, w.h.p.

2. Agree on $k$ matrices $A_1, \ldots, A_k$

3. Create array $B_i$ based on $A_i$

4. $B_i[p]$ contains any $n^{1/2}$ points $p' \in P$ s.t. $\Delta(A_i p', p) < r'$

5. Alice sets ROM to be the $B_i$’s

6. If $P_1 \cap P_{cr} \ne \emptyset$, SFE outputs a random element of $P_1 \cap P_{cr}$. Otherwise, SFE uses $\bigcup_i B_i[A_i q]$ to output a random element of $P_r$

Dimensionality Reduction Analysis

Properties:

1. If $|P_{cr}| > n^{1/2}$, we output a random element of $P_{cr}$, w.h.p.

2. If $|P_{cr}| < n^{1/2}$, by properties of $A$,

   $\Pr_A[\forall p \in P_r,\ \Delta(Ap, Aq) < r'$ and $\forall p \in P \setminus P_{cr},\ \Delta(Ap, Aq) > r'] > 1 - 1/n$

3. Since the bucket size is $n^{1/2}$ and $|P_{cr}| < n^{1/2}$, every $p \in P_r$ lies in some $B_i[A_i q]$, i.e. $P_r \subseteq \bigcup_i B_i[A_i q]$

Correctness:

If $|P_{cr}| > n^{1/2}$, output an element from $P_{cr}$

Else output an element from $P_r$
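The "w.h.p." in the sampling case is a short calculation. The sketch below assumes sampling with replacement and a sample size of 3·n^{1/2}·ln n, both hypothetical choices standing in for the O~(n^{1/2}) on the slides.

```python
import math

def miss_probability(n, heavy, samples):
    """Pr[no sample lands in a fixed subset of size heavy], with replacement."""
    return (1 - heavy / n) ** samples

n = 10_000
root = math.isqrt(n)                               # n^(1/2) = 100
samples = math.ceil(3 * root * math.log(n))        # hypothetical sample size
p_miss = miss_probability(n, root + 1, samples)    # |P_cr| just above n^(1/2)
print(samples, p_miss)
```

Since (1 - h/n)^s ≤ exp(-s·h/n), a set of more than n^{1/2} points is missed with probability at most n^{-3} under these choices.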


• Communication:

1. Sampling $\tilde{O}(n^{1/2})$ elements to ensure $|P_{cr}| < n^{1/2}$

2. OT on $\tilde{O}(1)$ buckets of size $n^{1/2}$

Thus, steps 1 & 2 are balanced $\Rightarrow$ $\tilde{O}(d n^{1/2})$ total communication

• Simulatability: Output either a random element of $P_{cr}$, or a random element of $P_r$


• Dependence on d:

1. Homomorphic encryption: $\tilde{O}(d + n^{1/2})$

   1. Bob sends $E(q_1), \ldots, E(q_d)$

   2. Alice computes $E(\Delta(p_i, q))$ and uses these for sampling and bucketing

2. Reduce to $\tilde{O}(\mathrm{polylog}(d) + n^{1/2})$ if Bob just wants a coordinate of a point in $P_{cr}$: use approximations
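One way Alice could obtain E(Δ(p_i, q)) from the encrypted bits is to note that, for a fixed p, the Hamming distance is an affine function of q. The slides do not spell out this step, so the identity below is offered as an assumption about how it might be done; the names are hypothetical and no encryption is performed.

```python
import itertools

def hamming(p, q):
    return sum(x != y for x, y in zip(p, q))

def affine_form(p):
    """(constant, weights) with Hamming(p, q) = constant + sum_j weights[j] * q[j]."""
    return sum(p), [1 - 2 * pj for pj in p]

def evaluate(form, q):
    constant, weights = form
    return constant + sum(w * qj for w, qj in zip(weights, q))

# Exhaustive check of the identity on all pairs of 4-bit vectors.
cube = list(itertools.product((0, 1), repeat=4))
ok = all(evaluate(affine_form(p), q) == hamming(p, q) for p in cube for q in cube)
print(ok)
```

With an additively homomorphic scheme, the constant becomes a multiplication by E(Σ_j p_j) and each weight of +1 or -1 becomes an exponent applied to E(q_j), so Alice never needs q in the clear.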

Conclusions

• Extensions: Can achieve $O(n^{1/3} + d)$ communication if you allow the protocol to “leak” k bits of information

• Open problems:

1. Polylogarithmic Private Approximation of other distances

2. More efficient protocols for exact near neighbor. Tricks for PIR may be useful

3. Polylogarithmic c-approx NN protocol