Doubling Dimension: a short survey Anupam Gupta Carnegie Mellon University Barriers in Computational Complexity II, CCI, Princeton

Page 1

Doubling Dimension: a short survey

Anupam Gupta
Carnegie Mellon University

Barriers in Computational Complexity II, CCI, Princeton

Page 2

Metric space M = (V, d)

(finite) set V of points

symmetric non-negative distances d(x,y)

triangle inequality: d(x,y) ≤ d(x,z) + d(z,y)

[Figure: three points x, y, z illustrating the triangle inequality]

Page 3

doubling dimension

Dimension $\dim_D(M)$ is the smallest $k$ such that

every set $S$ with diameter $D_S$

can be covered by $2^k$ sets of diameter $\tfrac12 D_S$

$\lambda = 2^{\dim_D}$ = doubling constant
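To get a feel for the definition, here is a minimal Python sketch that estimates a doubling constant for a small finite metric. It rests on two assumptions that are not on the slide: it uses the common ball-based variant (cover a ball of radius r by balls of radius r/2, which changes the dimension only by a constant factor), and it uses a greedy cover, so the result is a heuristic estimate rather than the exact constant. The function names are hypothetical.

```python
import math
from itertools import combinations

def greedy_half_cover_size(points, dist, center, radius):
    """Number of radius/2 balls a greedy pass uses to cover B(center, radius)."""
    ball = [p for p in points if dist(center, p) <= radius]
    centers = []
    for p in ball:
        # p opens a new half-radius ball only if no chosen center covers it yet
        if all(dist(p, c) > radius / 2 for c in centers):
            centers.append(p)
    return len(centers)

def estimate_doubling_constant(points, dist):
    """Largest greedy cover size over all centers and all pairwise distances as radii."""
    radii = {dist(p, q) for p, q in combinations(points, 2)}
    return max(greedy_half_cover_size(points, dist, x, r)
               for x in points for r in radii)

# 32 evenly spaced points on a line versus the equidistant metric on 16 points
lam_line = estimate_doubling_constant(list(range(32)), lambda a, b: abs(a - b))
lam_uniform = estimate_doubling_constant(list(range(16)), lambda a, b: 0 if a == b else 1)
print(lam_line, math.log2(lam_line))
print(lam_uniform, math.log2(lam_uniform))
```

On the line the estimate stays bounded by a small constant regardless of the number of points, while on the equidistant metric it equals the number of points, so its logarithm grows like log n.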

Page 4

doubling generalizes geometric dimension

Take k-dim Euclidean space $\mathbb{R}^k$

Claim: $\dim_D(\mathbb{R}^k) = \Theta(k)$

Easy to see for boxes: $2^3$ boxes to cover larger box in $\mathbb{R}^3$

Argument for spheres a bit more involved.

Page 5

facts about doubling

The notion of doubling dimension behaves smoothly under metric distortion

definition closed under taking submetrics

jargon: “doubling” = family of metrics with doubling dimension bounded by some absolute constant c independent of n.

Page 6

useful property of doubling

Suppose a metric (X,d) has doubling dimension k.

If any subset $S \subseteq X$ of points has all inter-point distances lying between $\delta$ and $\Delta$

then $|S| \le (2\Delta/\delta)^k$

Proof: recursively apply the definition…
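The recursion can be spelled out as follows. This is a sketch of one way to fill in the proof, assuming the diameter-based definition from Page 3; the round counter $t$ is a symbol introduced here for illustration.

```latex
\begin{align*}
&\text{$S$ has diameter at most $\Delta$; after $t$ rounds of halving, $S$ is covered by }
  (2^k)^t = 2^{kt} \text{ sets of diameter } \le \Delta/2^t. \\
&\text{Choose } t = \lfloor \log_2(\Delta/\delta) \rfloor + 1,
  \text{ so that } \Delta/\delta < 2^t \le 2\Delta/\delta. \\
&\text{Then } \Delta/2^t < \delta, \text{ so each covering set contains at most one point of } S, \text{ and} \\
&|S| \;\le\; 2^{kt} \;=\; (2^t)^k \;\le\; \left(2\Delta/\delta\right)^k .
\end{align*}
```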

Page 7

[Figure: a 2-dim point set with inter-point distances between $\delta$ and $\Delta$]

this 2-dim set has $O(\Delta/\delta)^2$ points

Page 8

what is not a doubling metric?

The equidistant metric $U_n$ on n points has dimension $\Omega(\log n)$

Hence low doubling dimension captures the fact that the metric does not contain large (near-)equidistant submetrics.

Page 9

the picture thus far…

[Figure labels: Doubling dimension $k$; Euclidean dimension $\Theta(k)$; Metrics with $\gg 2^k$ nearly-equidistant points]

Page 10

btw, just to check

Natural Q: Do all doubling metrics embed into ℓ2 with distortion O(1)?

No.

The Laakso fractals require $\Omega(\sqrt{\log n})$ distortion to embed into ℓ2 with any number of dimensions. [GKL’03]

In fact, the right behavior is $\Theta(\sqrt{\dim_D \log n})$ [KLMN’04, ABN’05, JLM’09]

Page 11

a substantial(?) generalization

Many geometric algorithms can be extended to doubling spaces…

Near neighbor search
Compact routing
Distance labeling
Network triangulation
Sensor placements
Small-world networks
Traveling Salesman
Sparse Spanners
Approx. inference
Network Design
Clustering problems
Well-separated pair decomposition
Data structures
Learnability

[Figure labels: Doubling dimension $k$; Euclidean dimension $\Theta(k)$]

Page 12

example application

Assign labels L(x) to each host x in a metric space
Looking just at L(x) and L(y), can infer distance d(x,y)

Results

labels with $(O(1)/\varepsilon)^{\dim} \times \log n$ bits
estimates within $(1 + \varepsilon)$ factor

Contrast with lower bound of n-bit labels in general for any factor < 2

[Figure: hosts x and y carry bit-string labels such as 010001 and 110001; f(110001, 010001) ≈ d(x,y)]

Page 13

another example

[Arora 95] showed that TSP on $\mathbb{R}^k$ was $(1+\varepsilon)$-approximable in time

[Talwar 04] extended the first result to metrics with doubling dimension k

Can we get the PTAS as well?

Page 14

example in action: sparse spanners for doubling metrics

Page 15

spanners

Given a metric M = (V, d), a graph G = (V, E) is an $(m, \varepsilon)$-spanner if
1) number of edges in G is m
2) $d(x,y) \le d_G(x,y) \le (1 + \varepsilon)\, d(x,y)$

A reasonable goal: $\varepsilon = 0.1$, m = O(n)

Fact: For the equidistant metric $U_n$, if $\varepsilon < 1$ then $G = K_n$
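The two spanner conditions are easy to check by brute force on a small instance. The sketch below is illustrative only: the helper names are hypothetical, edges are weighted by their metric distance (so the lower bound on graph distances holds automatically), and m is read as an upper bound on the number of edges.

```python
import math
from itertools import combinations

def graph_distances(n, d, edges):
    """All-pairs shortest paths (Floyd-Warshall); edge (u, v) has weight d(u, v)."""
    dg = [[0.0 if i == j else math.inf for j in range(n)] for i in range(n)]
    for u, v in edges:
        dg[u][v] = dg[v][u] = d(u, v)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dg[i][k] + dg[k][j] < dg[i][j]:
                    dg[i][j] = dg[i][k] + dg[k][j]
    return dg

def stretch(n, d, edges):
    """Worst ratio d_G(x, y) / d(x, y) over all pairs (inf if G is disconnected)."""
    dg = graph_distances(n, d, edges)
    return max(dg[x][y] / d(x, y) for x, y in combinations(range(n), 2))

def is_spanner(n, d, edges, m, eps):
    """Check both conditions, reading m as an upper bound on the edge count."""
    return len(edges) <= m and stretch(n, d, edges) <= 1 + eps

# Equidistant metric on 5 points: dropping one edge of K_5 already forces stretch 2.
n = 5
uniform = lambda x, y: 0 if x == y else 1
complete = list(combinations(range(n), 2))
print(stretch(n, uniform, complete))        # 1.0
print(stretch(n, uniform, complete[1:]))    # 2.0
```

The second call illustrates the Fact: any missing edge of the equidistant metric is replaced by a two-hop path, so every stretch below 2 requires the complete graph.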

Page 16

spanners for doubling metrics

Theorem: Given any metric M, and any $\varepsilon < 1/2$, we can efficiently find a spanner G with stretch $1 + \varepsilon$

and number of edges $m = n\,(1 + 1/\varepsilon)^{\dim_D(M)}$

Hence, for doubling metrics, linear-sized spanners!

Generalizes a similar theorem for Euclidean metrics.

Page 17

standard tool: nets

Nets: A set of points N is an r-net of a set S if
– $d(u,v) \ge r$ for any distinct $u, v \in N$
– For every $w \in S \setminus N$, there is a $u \in N$ with $d(u,w) < r$

Page 18

standard tool: nets

Fact: If a metric has doubling dim k and N is an r-net

$\Rightarrow\ B(x,2r) \cap N$ has $O(1)^k$ points.
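An r-net can be built with a single greedy pass. The following is a minimal sketch, not necessarily the construction the talk has in mind; `greedy_net` is a hypothetical name, and the example metric is simply 30 points on a line.

```python
def greedy_net(points, dist, r):
    """Greedy r-net: keep a point only if it is at distance >= r from all kept points."""
    net = []
    for p in points:
        if all(dist(p, u) >= r for u in net):
            net.append(p)
    return net

line = list(range(30))
gap = lambda a, b: abs(a - b)
r = 3
net = greedy_net(line, gap, r)
print(net)  # [0, 3, 6, 9, 12, 15, 18, 21, 24, 27]

# Packing check in the spirit of the Fact: net points inside the ball B(15, 2r)
print([u for u in net if gap(15, u) <= 2 * r])  # [9, 12, 15, 18, 21]
```

Both net properties hold by construction: kept points are pairwise at distance at least r, and a point is skipped only when some kept point lies within distance less than r of it.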

Page 19

recursive nets

Suppose all the points were at least unit distance apart

so you take a 2-net $N_1$ of these points
Now you can take a 4-net $N_2$ of this net

And so on…

[Figure: nested nets at scales 2, 4, 8, 16]

Page 20

recursive nets

$N_0 = V$

$N_t$ is a $2^t$-net of the set $N_{t-1}$

$N_t$ is a $2^{t+1}$-net of the set V (almost)

[Figure: net levels $N_1$, $N_2$, $N_3$, $N_4$]

Page 21

the spanner construction

Connect each net point in $N_t$ to other net points at distance at most $O(1/\varepsilon) \cdot 2^t$
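The construction can be sketched in a few lines of Python. Several things here are assumptions made for illustration: the concrete constant `gamma = 4 + 32/eps` standing in for the unspecified O(1/ε) factor, the greedy net routine, the function names, and the brute-force stretch check. With nested nets and this constant, a standard ancestor-chain argument gives stretch at most 1 + ε, but the constant is large, so the graph only becomes sparse once the ratio of largest to smallest distance is big.

```python
import math
from itertools import combinations

def greedy_net(points, dist, r):
    net = []
    for p in points:
        if all(dist(p, u) >= r for u in net):
            net.append(p)
    return net

def net_spanner(n, dist, eps):
    """Edges of a recursive-net spanner on points 0..n-1 (min distance assumed >= 1)."""
    gamma = 4 + 32 / eps          # assumed concrete value for the O(1/eps) factor
    net = list(range(n))          # N_0 = V
    edges = set()
    t = 0
    while True:
        # connect net points of N_t that are within gamma * 2^t of each other
        for u, v in combinations(net, 2):
            if dist(u, v) <= gamma * 2 ** t:
                edges.add((u, v))
        if len(net) == 1:
            break
        t += 1
        net = greedy_net(net, dist, 2 ** t)   # N_t is a 2^t-net of N_{t-1}
    return edges

def max_stretch(n, dist, edges):
    """Worst ratio of graph distance to metric distance (Floyd-Warshall)."""
    dg = [[0.0 if i == j else math.inf for j in range(n)] for i in range(n)]
    for u, v in edges:
        dg[u][v] = dg[v][u] = dist(u, v)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dg[i][k] + dg[k][j] < dg[i][j]:
                    dg[i][j] = dg[i][k] + dg[k][j]
    return max(dg[i][j] / dist(i, j) for i, j in combinations(range(n), 2))

# Points on a line at positions 1, 2, 4, ..., 2^23 (huge spread, tiny dimension)
positions = [2 ** i for i in range(24)]
dist = lambda u, v: abs(positions[u] - positions[v])
eps = 0.25
edges = net_spanner(len(positions), dist, eps)
print(len(edges), "edges out of", 24 * 23 // 2)
print(max_stretch(len(positions), dist, edges) <= 1 + eps)
```

The loop stops once a net shrinks to a single point, which happens as soon as the scale exceeds the diameter, so the number of levels is logarithmic in the diameter.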

Page 22

the number of edges

Number of points in $N_t$ within $O(1/\varepsilon) \cdot 2^t$ of some net point

at most $O(1/\varepsilon)^k$

Number of levels = O(log diameter)

Number of nodes in net at each level ≤ n

Hence, number of edges ≤ $n \times \log(\text{diameter}) \times O(1/\varepsilon)^k$

Can be improved to $n \times O(1/\varepsilon)^k$

Page 23

the stretch factor

Page 24

spanners for doubling metrics

Theorem: Given any metric M, and any $\varepsilon < 1/2$, we can efficiently find an $(m, \varepsilon)$-spanner G with

number of edges $m = n\,(1 + 1/\varepsilon)^{\dim_D(M)}$

Hence, for doubling metrics, linear-sized spanners!

Page 25

example in action: TSP for doubling metrics

Page 26

plan of attack

We have PTASs for TSP for points in constant-dimensional ℓ2.

If we could embed doubling metrics into constant-dimensional ℓ2

in a way that maintains distances to within $(1+\varepsilon)$ (in expectation)

we’d be done.

completely ridiculous strategy, but maybe we’ll get somewhere.

Page 27

embedding doubling trees into ℓ2

Recall: embedding doubling metrics into ℓ2 requires $\Omega(\sqrt{\log n})$ distortion, regardless of dim’n.

however…

Theorem: if a doubling metric is also a tree metric, it embeds into ℓ2 with distortion O(1) and dimension O(1)

(both bounds are $\mathrm{poly}(\lambda)$)

Page 28

embedding doubling metrics into doubling trees

Bad news: 2-d grids require $\Omega(\log n)$ distortion to embed into distributions over trees

Good news: All doubling metrics embed into distributions over doubling trees with distortion O(log n).

Page 29

plan of attack (revised)

We have PTASs for TSP for points in constant-dimensional ℓ2.

If we could embed doubling metrics into constant-dimensional ℓ2

in a way that maintains distances to within $(1+\varepsilon)$ (in expectation)

we’d be done.

Page 30

Arora’s simpler TSP idea

Given any TSP tour of length L in d-dim space
find $B = (\log n/\delta)^d$ portals in each cluster

and show there exists a portal-respecting tour which increases length by ≤ $\delta L$

Now dynamic program to find best portal-resp tour

$\Rightarrow$ runtime ~ $(n \log n) \cdot B^B$

Page 31

Arora’s simpler TSP idea

define portals, choosing $\delta = \varepsilon/O(\log n)$

OPT tour of length $L^*$ in original doubling metric

embeds into O(1)-dim space with length $L = O(\log n)\,L^*$

increase in length = $\delta L = \varepsilon L^*$

and now find the best portal-respecting tour in original doubling metric!

Page 32

recap for TSP

embedded doubling metric randomly into doubling trees

embedded those into constant-dimensional ℓ2

use that to find clusters/portals
and claim existence of $(1+\varepsilon)$ OPT tour

find best tour in original metric using dynamic programming.

Talwar’s algorithm does it better, dependence on $\dim_D$, not on $\lambda$

Page 33

low dimensional embeddings (and dimensionality reduction)

Page 34

dimensionality reduction

If a Euclidean metric embeds into $\mathbb{R}^k$ for some dimension k with distortion O(1)

the Euclidean metric has doubling dimension O(k)

we want to efficiently find a Euclidean embedding into $\mathbb{R}^{O(k)}$

with distortion O(1)

We just saw: embed any metric with doubling dimension k into distribution over $2^{O(k)}$-dimensional ℓ1 spaces

with distortion $O(\log n)\, 2^{O(k)}$.


Page 36

dimensionality reduction

Better: O(k)-dimensional ℓ2 space, with distortion $O^*(\log n)$

Page 37

a more general bound

Example Theorem: Any metric with doubling dimension $\dim_D$ embeds into Euclidean space with T dimensions with distortion

$O\!\left(\log n \sqrt{\dim_D / T}\right)$

(where $T \in [\dim_D \log\log n, \log n]$)

All these techniques are ultimately limited by the fact that they embed all doubling metrics, and not just Euclidean ones.

Page 38

special cases of interest

Distortion on using $O(\dim_D(M))$ Euclidean dimensions

Distortion on using O(log n) Euclidean dimensions

General metrics

Euclidean

This generalizes the result we talked about in Lecture #2: any metric embeds into Euclidean space with O(log n) distortion.

This is just the Johnson-Lindenstrauss lemma.

If the metric is doubling, this quantity is $\sqrt{\log n}$.

In general, this is never more than O(log n).

Again generalizes the previous result.

Page 39

weaken requirements?

Low-dimensional projection preserving near-neighbors
$O(\log \dim_D \, \mathrm{poly}\, \varepsilon^{-1})$ dimension random projection [IN05?]

(random projections also work for points on smooth manifolds)

Give low-dim set of points approximating $d(x,y)^{0.99}$

Again, can get similar dimensionality… [GK10, BRS10]
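For readers who have not seen a random projection, here is a minimal pure-Python sketch of the standard Gaussian version used in proofs of the Johnson-Lindenstrauss lemma. It is not claimed to be the specific construction of [IN05?]; the function names, the seed parameter, and the sample sizes are all illustrative assumptions.

```python
import math
import random

def random_projection(points, target_dim, seed=0):
    """Project points with a Gaussian matrix scaled by 1/sqrt(target_dim)."""
    rng = random.Random(seed)
    source_dim = len(points[0])
    matrix = [[rng.gauss(0.0, 1.0) / math.sqrt(target_dim) for _ in range(source_dim)]
              for _ in range(target_dim)]
    return [[sum(row[j] * p[j] for j in range(source_dim)) for row in matrix]
            for p in points]

def distortion(points, images):
    """Ratio between the largest and smallest factor by which a distance changes."""
    ratios = [math.dist(images[i], images[j]) / math.dist(points[i], points[j])
              for i in range(len(points)) for j in range(i + 1, len(points))]
    return max(ratios) / min(ratios)

rng = random.Random(1)
points = [[rng.random() for _ in range(200)] for _ in range(30)]
images = random_projection(points, target_dim=40)
print(len(images[0]), distortion(points, images))
```

The scaling by 1/sqrt(target_dim) keeps squared lengths unchanged in expectation; how small the target dimension can be made for a given guarantee is exactly what the results cited on this slide address.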

Page 41

one more useful tool..

Given a metric M, want to partition it randomly

into pieces of “small” diameter
such that “nearby” vertices lie in different pieces

only with “small” probability.

“random metric decompositions”

Page 42

“padded” decompositions

A metric (V,d) admits $\beta$-padded decompositions, if for every $\Delta$, we can output a random partition

$V = V_1 \uplus V_2 \uplus \dots \uplus V_k$

1. each $V_j$ has diameter ≤ $\Delta$

2. $\Pr[\, B(x,\rho) \text{ split} \,] \le \beta \cdot \rho/\Delta$
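One common way to produce such random partitions is ball carving with a random radius and a random order of centers. The sketch below is an assumption-laden illustration, not the construction behind any theorem in this talk: the radius range, the function name, and the grid example are choices made here. It guarantees the diameter condition by construction and makes no claim about the padding parameter it achieves.

```python
import math
import random

def ball_carving(points, dist, delta, seed=0):
    """Random partition into clusters of diameter <= delta via ball carving."""
    rng = random.Random(seed)
    radius = rng.uniform(delta / 4, delta / 2)   # one random radius for all balls
    order = points[:]
    rng.shuffle(order)                           # random priority among centers
    clusters = {}
    for p in points:
        # p joins the first center in the random order whose ball contains it;
        # p itself appears in the order, so a center is always found
        center = next(c for c in order if dist(c, p) <= radius)
        clusters.setdefault(center, []).append(p)
    return list(clusters.values())

grid = [(x, y) for x in range(6) for y in range(6)]
delta = 3.0
pieces = ball_carving(grid, math.dist, delta)
print(len(pieces), "pieces")
print(max(math.dist(p, q) for piece in pieces for p in piece for q in piece))
```

Every cluster sits inside a ball of radius at most Δ/2 around its center, so its diameter is at most Δ; the randomness in the radius and in the order is what keeps nearby points together with good probability.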

Page 43

the facts

Thm: Doubling metrics admit $O(\dim_D)$-padded decompositions

Useful wherever padded decompositions are useful

E.g.: can prove that all doubling metrics embed into ℓ2 with distortion $O(\sqrt{\dim_D \log n})$

Page 44

last slide: some questions

For specific metric space problems, can we match the performance for their geometric counterparts?

Which problems admit algorithms whose performance can be parameterized using such a notion of dimension?

Other notions of dimension that are algorithmically significant?

Page 45

thank you!