Doubling Dimension: a short survey
Anupam Gupta, Carnegie Mellon University
Barriers in Computational Complexity II, CCI, Princeton
Metric space M = (V, d)
(finite) set V of points
symmetric non-negative distances d(x,y)
triangle inequality d(x,y) ≤ d(x,z) + d(z,y)
[Figure: triangle on points x, y, z]
doubling dimension
Dimension dim_D(M) is the smallest k such that
every set S with diameter D_S
can be covered by 2^k sets of diameter ½ D_S
λ = 2^{dim_D} = doubling constant
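To get a feel for the definition, here is a minimal brute-force sketch for a finite metric. It makes two assumptions that are not on the slide: it works with balls and radii rather than arbitrary sets and diameters (a common variant that changes the dimension only by a constant factor), and it covers greedily, so the count it returns is only an upper estimate of the doubling constant for that variant. The function names are hypothetical.

```python
from itertools import combinations

def greedy_half_cover(ball, r, dist):
    """Greedily cover `ball` by balls of radius r/2 centred at its own points."""
    uncovered = list(ball)
    count = 0
    while uncovered:
        center = uncovered[0]
        # Everything within r/2 of the chosen centre is now covered.
        uncovered = [p for p in uncovered if dist(center, p) > r / 2]
        count += 1
    return count

def doubling_constant_estimate(points, dist):
    """Largest number of half-radius balls the greedy cover needs for any ball."""
    radii = sorted({dist(p, q) for p, q in combinations(points, 2)})
    worst = 1
    for x in points:
        for r in radii:
            ball = [p for p in points if dist(x, p) <= r]
            worst = max(worst, greedy_half_cover(ball, r, dist))
    return worst
```

On the equidistant metric on 8 points the estimate is 8, i.e. dimension log2 8 = 3, while on 16 evenly spaced points on a line it stays at 4 regardless of how many points are added.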
doubling generalizes geometric dimension
Take k-dim Euclidean space R^k
Claim: dim_D(R^k) = Θ(k)
Easy to see for boxes
Argument for spheres a bit more involved.
[Figure: 2^3 boxes to cover larger box in R^3]
facts about doubling
The notion of doubling dimension behaves smoothly under metric distortion
definition closed under taking submetrics
jargon: “doubling” = family of metrics with doubling dimension bounded by some absolute constant c independent of n.
Suppose a metric (X,d) has doubling dimension k.
If any subset S ⊆ X of points has all inter-point distances lying between δ and Δ
then |S| ≤ (2Δ/δ)^k
useful property of doubling
Proof: recursively apply the definition…
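The recursion can be spelled out as follows. This is a sketch of the standard argument, with t (the number of times the definition is applied) introduced here for illustration.

```latex
% S has diameter at most \Delta. Applying the definition t times covers S
% by 2^{kt} sets, each of diameter at most \Delta / 2^{t}.
% Choose t so that each covering set is too small to hold two points of S:
t = \lfloor \log_2(\Delta/\delta) \rfloor + 1
\quad\Longrightarrow\quad
\frac{\Delta}{2^{t}} < \delta .
% Points of S are at least \delta apart, so each set holds at most one of them:
|S| \;\le\; 2^{kt} \;\le\; 2^{k\,(\log_2(\Delta/\delta) + 1)} \;=\; \left(\frac{2\Delta}{\delta}\right)^{k}.
```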
[Figure: this 2-dim set has O(Δ/δ)^2 points]
what is not a doubling metric?
The equidistant metric U_n on n points has dimension Ω(log n)
Hence low doubling dimension captures the fact that the metric does not contain large (near-)equidistant submetrics.
the picture thus far…
[Figure labels: "Doubling dimension k"; "Euclidean dimension Θ(k)"; "Metrics with >> 2^k nearly-equidistant points"]
btw, just to check
Natural Q: Do all doubling metrics embed into ℓ2 with distortion O(1)?
No.
The Laakso fractals require Ω(√log n) distortion to embed into ℓ2 with any number of dimensions. [GKL’03]
In fact, the right behavior is Θ(√(dim_D · log n)) [KLMN’04, ABN’05, JLM’09]
Many geometric algorithms can be extended to doubling spaces…
Near neighbor search
Compact routing
Distance labeling
Network triangulation
Sensor placements
Small-world networks
Traveling Salesman
Sparse Spanners
Approx. inference
Network Design
Clustering problems
Well-separated pair decomposition
Data structures
Learnability
a substantial(?) generalization
[Figure labels: "Doubling dimension k"; "Euclidean dimension Θ(k)"]
example application
Assign labels L(x) to each host x in a metric space
Looking just at L(x) and L(y), can infer distance d(x,y)
Results
labels with (O(1)/ε)^dim × log n bits
estimates within (1 + ε) factor
Contrast with lower bound of n-bit labels in general for any factor < 2
[Figure: hosts x and y with labels 010001 and 110001; f(110001, 010001) ≈ d(x,y)]
[Arora 95] showed that TSP on R^k was (1+ε)-approximable in time
[Talwar 04] extended the first result to metrics with doubling dimension k
another example
Can we get the PTAS as well?
example in action: sparse spanners for doubling metrics
spanners
Given a metric M = (V, d), a graph G = (V, E) is an (m, ε)-spanner if
1) number of edges in G is m
2) d(x,y) ≤ d_G(x,y) ≤ (1 + ε) d(x,y)
A reasonable goal: ε = 0.1, m = O(n)
Fact: For the equidistant metric U_n, if ε < 1 then G = K_n
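Condition 2) is easy to check by brute force on a small example. The sketch below is an illustration only: it assumes vertices are numbered 0..n-1 and that each edge carries the metric distance as its weight, and `max_stretch` is a hypothetical helper name.

```python
from itertools import product

def max_stretch(n, metric, edges):
    """Largest ratio d_G(x,y) / d(x,y) over all pairs (inf if G is disconnected)."""
    inf = float("inf")
    dg = [[0.0 if i == j else inf for j in range(n)] for i in range(n)]
    for u, v in edges:
        dg[u][v] = dg[v][u] = metric(u, v)   # edge weight = metric distance
    # Floyd-Warshall: product() varies its first index slowest, so k is outermost.
    for k, i, j in product(range(n), repeat=3):
        if dg[i][k] + dg[k][j] < dg[i][j]:
            dg[i][j] = dg[i][k] + dg[k][j]
    return max(dg[i][j] / metric(i, j) for i in range(n) for j in range(i + 1, n))
```

For the equidistant metric on 4 points, the complete graph has stretch 1, and dropping any single edge forces a two-hop detour of length 2. That is the Fact above: no ε < 1 can be achieved unless G = K_n.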
spanners for doubling metrics
Theorem: Given any metric M, and any ε < ½, we can efficiently find a spanner G with stretch (1 + ε)
and number of edges m = n (1 + 1/ε)^{O(dim_D(M))}
Hence, for doubling metrics, linear-sized spanners!
Generalizes a similar theorem for Euclidean metrics.
standard tool: nets
Nets: A set of points N is an r-net of a set S if
– d(u,v) ≥ r for any u,v ∈ N
– For every w ∈ S \ N, there is a u ∈ N with d(u,w) < r
Fact: If a metric has doubling dim k and N is an r-net
⇒ B(x,2r) ∩ N has O(1)^k points.
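An r-net can be built by a single greedy pass. This is a minimal sketch of that idea rather than anything taken from the slides; the names `greedy_net` and `is_r_net` are hypothetical.

```python
def greedy_net(points, r, dist):
    """Keep a point only if it is at least r away from every point kept so far."""
    net = []
    for p in points:
        if all(dist(p, q) >= r for q in net):
            net.append(p)
    return net

def is_r_net(net, points, r, dist):
    """Check both net conditions: separation >= r and covering radius < r."""
    separated = all(dist(u, v) >= r for i, u in enumerate(net) for v in net[i + 1:])
    covered = all(any(dist(u, w) < r for u in net) for w in points if w not in net)
    return separated and covered
```

Every skipped point was skipped because some kept point lies within distance less than r of it, which is exactly the covering condition. The separation condition holds by construction.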
recursive nets
Suppose all the points were at least unit distance apart
so you take a 2-net N_1 of these points
Now you can take a 4-net N_2 of this net
And so on…
[Figure: nets at scales 2, 4, 8, 16]
recursive nets
N_0 = V
N_t is a 2^t-net of the set N_{t-1}
[Figure: nested nets N_1, N_2, N_3, N_4]
N_t is a 2^{t+1}-net of the set V (almost)
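The "(almost)" can be made precise with a short calculation. In this sketch, v_0, …, v_t is a chain introduced for illustration: v_0 = v, and v_i is a point of N_i within distance 2^i of v_{i-1} (take v_i = v_{i-1} if that point is already in N_i).

```latex
% Covering: every v \in V is close to some point of N_t.
d(v, v_t) \;\le\; \sum_{i=1}^{t} d(v_{i-1}, v_i) \;<\; \sum_{i=1}^{t} 2^{i} \;=\; 2^{t+1} - 2 \;<\; 2^{t+1}.
% Separation: points of N_t are only guaranteed to be 2^{t} apart, not 2^{t+1}:
d(u, u') \;\ge\; 2^{t} \quad \text{for all distinct } u, u' \in N_t .
```

So N_t has the covering radius of a 2^{t+1}-net of V but only the separation of a 2^t-net.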
the spanner construction
Connect each net point in N_t
to other net points at distance
at most O(1/ε) · 2^t
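The construction can be sketched end to end as follows. The constant `c = 18` standing in for the O(1) in O(1/ε) is an assumption chosen so that a simple triangle-inequality argument gives stretch 1 + ε for ε ≤ ½. The slides do not fix a constant, the nets are built greedily, and all function names are hypothetical.

```python
from itertools import combinations, product

def greedy_net(points, r, dist):
    """Greedy r-net: kept points are pairwise >= r apart, the rest are < r from one."""
    net = []
    for p in points:
        if all(dist(p, q) >= r for q in net):
            net.append(p)
    return net

def build_spanner(points, dist, eps, c=18.0):
    """Edges of a net-based spanner; `c` is an assumed constant, needs len(points) >= 2."""
    diameter = max(dist(p, q) for p, q in combinations(points, 2))
    edges = set()
    level, t = list(points), 0                    # N_0 = V
    while True:
        reach = (c / eps) * 2 ** t                # connect net points within (c/eps) * 2^t
        for u, v in combinations(level, 2):
            if dist(u, v) <= reach:
                edges.add((u, v))
        if 2 ** t > diameter:                     # the next net would be a single point
            break
        t += 1
        level = greedy_net(level, 2 ** t, dist)   # N_t is a 2^t-net of N_{t-1}
    return edges

def spanner_stretch(points, dist, edges):
    """Max of d_G(x,y) / d(x,y) over all pairs, computed with Floyd-Warshall."""
    idx = {p: i for i, p in enumerate(points)}
    n = len(points)
    dg = [[0.0 if i == j else float("inf") for j in range(n)] for i in range(n)]
    for u, v in edges:
        dg[idx[u]][idx[v]] = dg[idx[v]][idx[u]] = dist(u, v)
    for k, i, j in product(range(n), repeat=3):   # k is the outermost loop
        dg[i][j] = min(dg[i][j], dg[i][k] + dg[k][j])
    return max(dg[idx[p]][idx[q]] / dist(p, q) for p, q in combinations(points, 2))
```

With a constant this large the graph is only sparse when n is big compared with (c/ε)^k, so small test inputs mainly serve to confirm the stretch bound.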
the number of edges
Number of points in N_t within O(1/ε) · 2^t of some net point
at most O(1/ε)^k
Number of levels = O(log diameter)
Number of nodes in net at each level ≤ n
Hence, number of edges ≤ n × log diameter × O(1/ε)^k
Can be improved to n × O(1/ε)^k
the stretch factor
spanners for doubling metrics
Theorem: Given any metric M, and any ε < ½, we can efficiently find an (m, ε)-spanner G with
number of edges m = n (1 + 1/ε)^{O(dim_D(M))}
Hence, for doubling metrics, linear-sized spanners!
example in action: TSP for doubling metrics
plan of attack
We have PTASs for TSP for points in constant-dimensional ℓ2.
If we could embed doubling metrics into constant-dimensional ℓ2
in a way that maintains distances to within (1+ε) (in expectation)
we’d be done.
completely ridiculous strategy, but maybe we’ll get somewhere.
embedding doubling trees into ℓ2
Recall: embedding doubling metrics into ℓ2 requires Ω(√log n) distortion, regardless of dimension.
however…
Theorem: if a doubling metric is also a tree metric, it embeds into ℓ2 with distortion poly(λ) = O(1) and dimension poly(λ) = O(1)
embedding doubling metrics into doubling trees
Bad news: 2-d grids require Ω(log n) distortion to embed into distributions over trees
Good news: All doubling metrics embed into distributions over doubling trees with distortion O(log n).
plan of attack (revised)
We have PTASs for TSP for points in constant-dimensional ℓ2.
If we could embed doubling metrics into constant-dimensional ℓ2
in a way that maintains distances to within (1+ε) (in expectation)
we’d be done.
Arora’s simpler TSP idea
Given any TSP tour of length L in d-dim space
find B = (log n/δ)^d portals in each cluster
and show there exists a portal-respecting tour which increases length by ≤ δL
Now dynamic program to find best portal-respecting tour
⇒ runtime ~ (n log n) B^B
define portals, choosing δ = ε/O(log n)
OPT tour of length L* in original doubling metric
embeds into O(1)-dim space with length L = O(log n) L*
increase in length = δL = εL*
and now find the best portal-respecting tour in original doubling metric!
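The choice of δ can be checked in one line. This is a sketch that assumes the same hidden constant c in both O(log n) terms; the symbol c is introduced here only for illustration.

```latex
L \le c \log n \cdot L^{*}, \qquad \delta = \frac{\varepsilon}{c \log n}
\quad\Longrightarrow\quad
\delta L \;\le\; \varepsilon L^{*},
\qquad
B = \left(\frac{\log n}{\delta}\right)^{d} = \left(\frac{c \log^{2} n}{\varepsilon}\right)^{d}.
```

So the price of the O(log n) embedding loss is a second log n factor inside the portal count B.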
recap for TSP
embedded doubling metric randomly into doubling trees
embedded those into constant-dimensional ℓ2
use that to find clusters/portals and claim existence of a (1+ε)·OPT tour
find best tour in original metric using dynamic programming.
Talwar’s algorithm does it better: dependence on dim_D, not on λ
low dimensional embeddings (and dimensionality reduction)
dimensionality reduction
If a Euclidean metric embeds into R^k for some dimension k with distortion O(1)
the Euclidean metric has doubling dimension O(k)
we want to efficiently find a Euclidean embedding into R^{O(k)}
with distortion O(1)
We just saw: embed any metric with doubling dimension k into distribution over 2^{O(k)}-dimensional ℓ1 spaces
with distortion O(log n) · 2^{O(k)}.
Better: embed into O(k)-dimensional ℓ2 space with distortion O*(log n).
a more general bound
Example Theorem: Any metric with doubling dimension dim_D embeds into Euclidean space with T dimensions with distortion O(log n · √(dim_D / T))
(where T ∈ [dim_D log log n, log n])
All these techniques are ultimately limited by the fact that they embed all doubling metrics, and not just Euclidean ones.
special cases of interest
[Table labels: "Distortion on using O(dim_D(M)) Euclidean dimensions"; "Distortion on using O(log n) Euclidean dimensions"; "General metrics"; "Euclidean"]
This generalizes the result we talked about in Lecture #2: any metric embeds into Euclidean space with O(log n) distortion.
This is just the Johnson-Lindenstrauss lemma.
If the metric is doubling, this quantity is √(log n).
In general, this is never more than O(log n).
Again generalizes the previous result.
weaken requirements?
Low-dimensional projection preserving near-neighbors
O(log dim_D · poly(ε^{-1}))-dimension random projection [IN05?]
(random projections also work for points on smooth manifolds)
Give low-dim set of points approximating d(x,y)^{0.99}
Again, can get similar dimensionality… [GK10, BRS10]
one more useful tool..
Given a metric M, want to partition it randomly
into pieces of “small” diameter
such that “nearby” vertices lie in different pieces
only with “small” probability.
“random metric decompositions”
“padded” decompositions
A metric (V,d) admits β-padded decompositions if, for every Δ, we can output a random partition
V = V_1 ⊎ V_2 ⊎ … ⊎ V_k such that
1. each V_j has diameter ≤ Δ
2. Pr[ B(x,ρ) split ] ≤ β · ρ/Δ
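One well-known way to produce such random partitions is ball carving with a random radius and a random order of centres, in the style of Calinescu, Karloff and Rabani. The slides do not say which construction is meant, so treat the sketch below as an assumption. It guarantees property 1 by construction and makes no attempt to verify the padding bound in property 2. The function name is hypothetical.

```python
import random

def ball_carving_partition(points, dist, delta, rng=random):
    """Random partition of `points` into clusters of diameter at most `delta`."""
    radius = rng.uniform(delta / 4, delta / 2)   # one random radius shared by all centres
    order = list(points)
    rng.shuffle(order)                           # random priority order of centres
    clusters = {}
    for p in points:
        # p joins the first centre in the random order that is within `radius`;
        # p itself always qualifies, so the search never fails.
        center = next(c for c in order if dist(c, p) <= radius)
        clusters.setdefault(center, []).append(p)
    return list(clusters.values())
```

Every point of a cluster lies within radius ≤ Δ/2 of the cluster's centre, so any two points of the cluster are at distance at most Δ.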
the facts
Thm: Doubling metrics admit O(dim_D)-padded decompositions
Useful wherever padded decompositions are useful
E.g.: can prove that all doubling metrics embed into ℓ2 with distortion
last slide: some questions
For specific metric space problems, can we match the performance for their geometric counterparts?
Which problems admit algorithms whose performance can be parameterized using such a notion of dimension?
Other notions of dimension that are algorithmically significant?
thank you!