Optimal Bounds for Johnson-Lindenstrauss Transforms and Streaming Problems with Sub-Constant Error
T.S. Jayram David Woodruff
IBM Almaden
Data Stream Model
• Have a stream of m updates to an n-dimensional vector v
  – "add x to coordinate i"
  – Insertion model: all updates x are positive
  – Turnstile model: x can be positive or negative
  – Stream length and updates are < poly(n)
• Estimate statistics of v
  – # of distinct elements F0
  – Lp-norm |v|_p = (Σ_i |v_i|^p)^{1/p}
  – Entropy
  – and so on
• Goal: output a (1+ε)-approximation with limited memory
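As a point of reference, here is a minimal sketch of the model itself (not any of the paper's algorithms): applying a turnstile stream of updates and computing F0 and the Lp-norms exactly, which takes Θ(n) memory; the streaming algorithms below aim to (1+ε)-approximate these quantities in far less space.

```python
from collections import defaultdict

def apply_stream(updates):
    """Apply turnstile updates ("add x to coordinate i") to a vector v."""
    v = defaultdict(int)
    for i, x in updates:          # turnstile model: x may be negative
        v[i] += x
    return v

def F0(v):
    """Number of distinct (i.e., non-zero) coordinates of v."""
    return sum(1 for x in v.values() if x != 0)

def Lp(v, p):
    """L_p norm |v|_p = (sum_i |v_i|^p)^(1/p)."""
    return sum(abs(x) ** p for x in v.values()) ** (1.0 / p)

# Toy stream: coordinate 2 is incremented then decremented, so it does
# not count toward F0.
v = apply_stream([(0, 3), (2, 1), (2, -1), (5, 4)])
```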
Lots of "Optimal" Papers
• Lots of "optimal" results
  – "An optimal algorithm for the distinct elements problem" [KNW]
  – "Fast moment estimation in optimal space" [KNPW]
  – "A near-optimal algorithm for estimating entropy of a stream" [CCM]
  – "Optimal approximations of the frequency moments of data streams" [IW]
  – "A near-optimal algorithm for L1-difference" [NW]
  – "Optimal space lower bounds for all frequency moments" [W]
• This paper
  – Optimal Bounds for Johnson-Lindenstrauss Transforms and Streaming Problems with Sub-Constant Error
What Is Optimal?
• F0 = # of non-zero entries in v
• "For a stream of indices in {1, …, n}, our algorithm computes a (1+ε)-approximation using an optimal O(ε⁻² + log n) bits of space with 2/3 success probability… This probability can be amplified by independent repetition."
• If we want high success probability, say 1 − 1/n, independent repetition increases the space by a multiplicative log n factor
• So "optimal" algorithms are only optimal among algorithms with constant success probability
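The amplification step referred to above is the standard median trick; a minimal sketch (with a made-up toy estimator) of why repetition costs a multiplicative log 1/δ factor:

```python
import math
import random

def amplify(estimator, delta, c=48):
    """Median of Theta(log 1/delta) independent runs.

    If each run is correct w.p. 2/3, a Chernoff bound says the median of
    c*log(1/delta) runs is correct w.p. >= 1 - delta.  The space cost of
    a streaming algorithm grows by the same log(1/delta) factor, since
    all repetitions run in parallel.
    """
    reps = max(1, int(c * math.log(1.0 / delta)))
    samples = sorted(estimator() for _ in range(reps))
    return samples[len(samples) // 2]

# Toy estimator: returns the true value 100 w.p. 2/3, garbage otherwise.
def noisy():
    return 100 if random.random() < 2 / 3 else random.choice([0, 10**6])

random.seed(0)
est = amplify(noisy, delta=0.01)
```

The paper's lower bounds show this blowup is not an artifact of the median trick: for the problems above, any algorithm with error δ must pay the log 1/δ factor.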
Can We Improve the Lower Bounds?
x ∈ {0,1}^{ε⁻²}    y ∈ {0,1}^{ε⁻²}
Gap-Hamming: either Δ(x,y) > ½ + ε or Δ(x,y) < ½ − ε (Δ = fraction of differing coordinates)
Lower bound of Ω(ε⁻²) bits with 1/3 error probability
But an upper bound of O(ε⁻²) bits with 0 error probability
Our Results
Streaming Results
• Independent repetition is optimal!
• Estimating Lp-norm in turnstile model up to 1+ε w.p. 1−δ
  – Ω(ε⁻² log n log 1/δ) bits for any p
  – [KNW] get O(ε⁻² log n log 1/δ) for 0 ≤ p ≤ 2
• Estimating F0 in insertion model up to 1+ε w.p. 1−δ
  – Ω(ε⁻² log 1/δ + log n) bits
  – [KNW] get O(ε⁻² log 1/δ) for ε⁻² > log n
• Estimating entropy in turnstile model up to 1+ε w.p. 1−δ
  – Ω(ε⁻² log n log 1/δ) bits
  – Improves the Ω(ε⁻² log n) bound of [KNW]
Johnson-Lindenstrauss Transforms
• Let A be a random matrix so that for any fixed q ∈ R^d, with probability 1−δ,
|Aq|₂ = (1 ± ε) |q|₂
• [JL] A can be an (ε⁻² log 1/δ) × d matrix
  – Gaussians or sign variables work
• [Alon] A needs to have Ω((ε⁻² log 1/δ) / log 1/ε) rows
• Our result: A needs to have Ω(ε⁻² log 1/δ) rows
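A minimal pure-Python sketch of the sign-variable construction from [JL] referenced above (the dimensions and seed are illustrative, not from the paper): A has k rows of i.i.d. ±1/√k entries, and |Aq|₂ concentrates around |q|₂.

```python
import math
import random

def jl_sign_matrix(k, d, rng):
    """k x d matrix of i.i.d. +-1/sqrt(k) entries (sign variables)."""
    s = 1.0 / math.sqrt(k)
    return [[s if rng.random() < 0.5 else -s for _ in range(d)]
            for _ in range(k)]

def matvec(A, q):
    return [sum(a * x for a, x in zip(row, q)) for row in A]

def l2(v):
    return math.sqrt(sum(x * x for x in v))

rng = random.Random(1)
d = 2000                 # ambient dimension (illustrative)
k = 512                  # ~ eps^-2 * log(1/delta) rows for some eps, delta
q = [rng.gauss(0, 1) for _ in range(d)]

# |Aq|_2 / |q|_2 should be close to 1
ratio = l2(matvec(jl_sign_matrix(k, d, rng), q)) / l2(q)
```

The paper's result says the row count of this construction is tight: no distribution over matrices with o(ε⁻² log 1/δ) rows can achieve the (1 ± ε, δ)-guarantee.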
Communication Complexity Separation
Alice has x, Bob has y; goal: compute f(x,y) ∈ {0,1}
D_{ρ,1/3}(f) = communication of the best 1-way deterministic protocol that errs w.p. 1/3 on distribution ρ
[KNR]: R||_{1/3}(f) = max over product distributions μ × λ of D_{μ×λ,1/3}(f)
Communication Complexity Separation
Alice has x, Bob has y; goal: compute f(x,y) ∈ {0,1}
VC-dimension: maximum number r of columns for which all 2^r rows occur in the communication matrix restricted to these columns
[KNR]: R||_{1/3}(f) = Θ(VC-dimension(f))
Our result: there exist f and g, each with VC-dimension k, but:
R||_δ(f) = Θ(k log 1/δ) while R||_δ(g) = Θ(k)
Our Techniques
Lopsided Set Intersection (LSI)
Universe [U], U = ε⁻² · δ⁻¹
Alice has S ⊆ [U] with |S| = ε⁻²; Bob has T ⊆ [U] with |T| = 1/δ
Is S ∩ T = ∅?
- Alice cannot describe S with o(ε⁻² log U) bits
- If S and T are uniform, then with constant probability S ∩ T = ∅
- R||_{1/3}(LSI) ≥ D_{uniform,1/3}(LSI) = Ω(ε⁻² log 1/δ)
Lopsided Set Intersection (LSI2)
Universe [U], U = ε⁻² · δ⁻¹
Alice has S ⊆ [U] with |S| = ε⁻²; Bob has T ⊆ [U] with |T| = 1
Is S ∩ T = ∅?
- R||_{δ/3}(LSI2) ≥ R||_{1/3}(LSI) = Ω(ε⁻² log 1/δ)
- Union bound over the elements of Bob's set in an LSI instance
Low Error Inner Product
Universe [U], U = ε⁻² · δ⁻¹
Alice has x ∈ {0, ε}^U with |x|₂ = 1; Bob has y ∈ {0, 1}^U with |y|₂ = 1
Does ⟨x,y⟩ = 0?
- Estimating ⟨x, y⟩ up to ε w.p. 1−δ solves LSI2 w.p. 1−δ
- R||_δ(inner productε) = Ω(ε⁻² log 1/δ)
L2-estimationε
Universe [U], U = ε⁻² · δ⁻¹
Alice has x ∈ {0, ε}^U with |x|₂ = 1; Bob has y ∈ {0, 1}^U with |y|₂ = 1
What is |x−y|₂?
- |x−y|₂² = |x|₂² + |y|₂² − 2⟨x, y⟩ = 2 − 2⟨x,y⟩
- Estimating |x−y|₂ up to a (1+Θ(ε))-factor solves inner-productε
- So R||_δ(L2-estimationε) = Ω(ε⁻² log 1/δ)
- The log 1/δ factor is new, but we want an Ω(ε⁻² log n log 1/δ) lower bound
- Can use a known trick to get an extra log n factor
Augmented Lopsided Set Intersection (ALSI2)
Universe [U] = [ε⁻² · δ⁻¹]
Alice has S1, …, Sr ⊆ [U] with |Si| = ε⁻² for all i
Bob has j ∈ [U], an index i* ∈ {1, 2, …, r}, and the suffix S_{i*+1}, …, Sr
Is j ∈ S_{i*}?
R||_{1/3}(ALSI2) = Ω(r ε⁻² log 1/δ)
Reduction of ALSI2 to L2-estimationε
- Alice encodes each Si as a vector xi (as in L2-estimationε) and forms x = Σ_i 10^i · xi
- Bob forms y = 10^{i*} · y_{i*} + Σ_{i>i*} 10^i · xi, using his known suffix S_{i*+1}, …, Sr
- Then y − x = 10^{i*} y_{i*} − Σ_{i≤i*} 10^i · xi
- |y−x|₂ is dominated by 10^{i*} |y_{i*} − x_{i*}|₂
- Set r = Θ(log n)
- R||_δ(L2-estimationε) = Ω(ε⁻² log n log 1/δ)
- Streaming Space ≥ R||_δ(L2-estimationε)
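A toy numeric check of the scale-10 embedding in the reduction (block vectors and sizes are made up for illustration): blocks below i* are scaled by at most 10^{i*−1}, so they contribute only a small fraction of |y−x|₂², and the i*-th block dominates.

```python
r, m = 5, 4                  # r blocks, each of dimension m (toy sizes)
i_star = 3                   # Bob's queried index (0-based here)
xs = [[1.0] * m for _ in range(r)]   # stand-ins for Alice's block vectors x_i
y_top = [0.0] * m                    # Bob's candidate y_{i*} (here y_{i*} != x_{i*})

# Alice's vector: block i scaled by 10^i
x = [10**i * c for i in range(r) for c in xs[i]]
# Bob's vector: zeros below i*, his candidate at scale 10^{i*},
# then Alice's suffix blocks (which Bob knows) at their scales
y = ([0.0] * (m * i_star)
     + [10**i_star * c for c in y_top]
     + [10**i * c for i in range(i_star + 1, r) for c in xs[i]])

diff_sq = sum((a - b) ** 2 for a, b in zip(y, x))
# Contribution of the i*-th block alone
top_sq = sum((10**i_star * (a - b)) ** 2 for a, b in zip(y_top, xs[i_star]))
```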
Lower Bounds for Johnson-Lindenstrauss
Use public randomness to agree on a JL matrix A
Alice has x ∈ {−n^{O(1)}, …, n^{O(1)}}^t; Bob has y ∈ {−n^{O(1)}, …, n^{O(1)}}^t
Alice sends Ax; Bob computes |Ax − Ay|₂ = |A(x−y)|₂
- Can estimate |x−y|₂ up to 1+ε w.p. 1−δ
- #rows(A) = Ω(r ε⁻² log 1/δ / log n)
- Set r = Θ(log n)
Low-Error Hamming Distance
Universe = [n]; Δ(x,y) = Hamming distance between x and y
Alice has x ∈ {0,1}^n; Bob has y ∈ {0,1}^n
• R||_δ(Δ(x,y)ε) = Ω(ε⁻² log 1/δ log n)
• Reduction to ALSI2
• Gap-Hamming to LSI2 reductions with low error
• Implies our lower bounds for estimating
  – Any Lp-norm
  – Distinct elements
  – Entropy
Conclusions
• Prove the first streaming space lower bounds that depend on the error probability δ
  – Optimal for Lp-norms, distinct elements
  – Improves the lower bound for entropy
  – Optimal dimensionality bound for JL transforms
• Adds several twists to augmented indexing proofs
  – Augmented indexing with a small set in a large domain
  – Proof builds upon lopsided set disjointness lower bounds
  – Uses multiple Gap-Hamming to Indexing reductions that handle low error
ALSI2 to Hamming Distance
S1, …, Sr ⊆ [U] = [ε⁻² · δ⁻¹] with |Si| = ε⁻² for all i
Bob has j ∈ [U], i* ∈ {1, 2, …, r}, and S_{i*+1}, …, Sr
- Let t = ε⁻² log 1/δ
- Use the public coin to generate t random strings b1, …, bt ∈ {0,1}^U
- Alice sets xi = majority_{k ∈ Si} b_{i,k}
- Bob sets yi = b_{i,j}
- Embed multiple copies by duplicating coordinates at different scales
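A toy simulation of the agreement gap behind this majority encoding (set size, trial count, and seed are illustrative): with |S| = ε⁻², the majority bit of the set's random bits agrees with a member's bit w.p. ½ + Θ(ε), but with a non-member's bit w.p. exactly ½, so the Hamming distance between Alice's and Bob's strings gaps by Θ(εt) according to whether j ∈ S.

```python
import random

random.seed(2)
eps = 0.1
size = int(1 / eps**2)        # |S| = eps^-2 = 100
t = 10000                     # number of public random strings (trials)
S = set(range(size))          # Alice's set
j_in, j_out = 0, size + 1     # a member and a non-member of S

def majority(bits):
    return 1 if 2 * sum(bits) > len(bits) else 0

agree_in = agree_out = 0
for _ in range(t):
    b = [random.randint(0, 1) for _ in range(size + 2)]  # one public string
    x_bit = majority([b[k] for k in S])   # Alice's bit: majority over her set
    agree_in += (x_bit == b[j_in])        # agrees w.p. 1/2 + Theta(eps)
    agree_out += (x_bit == b[j_out])      # agrees w.p. exactly 1/2
```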