Testing Metric Properties Michal Parnas and Dana Ron


Property Testing (Informal Definition)

For a fixed property P and any object O, determine whether O has property P, or whether O is far from having property P (i.e., far from any other object having P).

Task should be performed by querying the object (in as few places as possible).


Property Testing - Background

• Initially defined by Rubinfeld and Sudan in the context of Program Testing (of algebraic functions).

• Goldreich, Goldwasser and Ron initiated the study of testing properties of (undirected) graphs.

• Growing body of work deals with properties of functions, graphs, strings, sets of points ... Many algorithms with complexity that is sub-linear in (or even independent of) size of object.

Motivation

• Computational: Design testing algorithms that are (much) more efficient than exact decision algorithms for properties.

• Combinatorial: Gain new understanding about tested property.

Testing Metric Properties

P - Metric property; M - n x n rational-valued matrix;

ε - Distance/approximation parameter;

M is said to be ε-far from property P if more than an ε-fraction of its n^2 entries must be modified so that M obtains P. Otherwise M is said to be ε-close to P.

Testing algorithm can query M on entries M[i,j]. If M has property P, should accept;

If M is ε-far from property P, should reject w.p. at least 2/3.
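The distance measure in this definition can be illustrated with a minimal Python sketch. The helper name `fraction_modified` and the two small matrices are hypothetical and not from the slides. The function measures the distance from M to one specific matrix only; M is ε-far from P when every matrix having P lies at fractional distance more than ε, so one comparison only gives an upper bound on the distance to P.

```python
def fraction_modified(M, M_prime):
    """Fraction of the n^2 entries on which two n x n matrices differ."""
    n = len(M)
    differing = sum(
        1 for i in range(n) for j in range(n) if M[i][j] != M_prime[i][j]
    )
    return differing / (n * n)


# Hypothetical example: M violates the triangle inequality (5 > 2 + 2);
# changing the two symmetric entries equal to 5 repairs it.
M = [[0, 2, 2], [2, 0, 5], [2, 5, 0]]
M_fixed = [[0, 2, 2], [2, 0, 2], [2, 2, 0]]
distance = fraction_modified(M, M_fixed)  # 2 of the 9 entries differ
```

Here M is at fractional distance at most 2/9 from being a metric, since modifying two entries suffices.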

Tree Metrics and Ultametrics

An n x n matrix M is a tree metric (additive metric) if there exists a tree T with positive weights on edges, such that:

• There exists a mapping φ from [n] into nodes of T;

• For every i,j ∈ [n]={1,…,n}, d_T(φ(i),φ(j))=M[i,j], where d_T is the weighted path length in T;

• All nodes to which no i ∈ [n] is mapped have degree greater than 2.

If T is rooted, φ maps only to leaves of T, and the distance of all leaves to the root is the same, then M is an ultrametric.

[Figure: Tree Metric. A weighted tree for which M[1,2]=8; M[1,3]=12; M[1,4]=10; M[1,5]=15; . . .]

[Figure: Ultrametric. A rooted weighted tree on leaves 1,…,6 for which M[1,2]=M[1,3]=M[2,3]=8; M[1,4]=M[1,5]=M[1,6]=12; M[4,5]=M[4,6]=6; M[5,6]=2; . . .]
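For comparison with the testing setting, an exact decision procedure can be sketched in Python. It uses the standard three-point characterization of ultrametrics (in every triple of points, the two largest pairwise distances are equal), which is not stated in the slides and is assumed here to be equivalent to the tree definition for symmetric matrices with zero diagonal and positive off-diagonal entries. The function name and both example matrices are hypothetical.

```python
from itertools import combinations


def is_ultrametric(M):
    """Exact check via the three-point condition; reads the whole matrix."""
    n = len(M)
    if any(M[i][i] != 0 for i in range(n)):
        return False
    for i, j in combinations(range(n), 2):
        if M[i][j] != M[j][i] or M[i][j] <= 0:
            return False
    for i, j, k in combinations(range(n), 3):
        a, b, c = sorted((M[i][j], M[i][k], M[j][k]))
        if b != c:  # the two largest distances must coincide
            return False
    return True


# Hypothetical examples: two tight clusters {0,1} and {2,3}, and three
# points on a line (distances 1, 1, 2), which is not an ultrametric.
M_good = [[0, 2, 6, 6], [2, 0, 6, 6], [6, 6, 0, 4], [6, 6, 4, 0]]
M_line = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
```

This check inspects all triples, so its cost grows with n, whereas the testing algorithms query only entries among a small sample.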

Our Results

Our algorithms all work by taking a uniformly selected sample S ⊆ [n] and querying M[i,j] for i,j ∈ S. Size of sample is always poly(1/ε) and independent of n. Specifically:

• Can test ultrametrics with |S|=O(log(1/ε)/ε^3).

• Can test general tree metrics with |S|=O(log(1/ε)/ε^k) for a fixed constant k.

• Can extend result for ultrametrics to approximate ultrametrics.

• Can test d-dimensional Euclidean metrics with |S|=O(d log d/ε).

Our Results (continued)

Testing algorithms can be used to solve relaxed versions of corresponding search problems in time linear in n (and polynomial in 1/ε). That is, can construct a tree that agrees with M on all but at most an ε-fraction of entries.

(Note that running time is sub-linear in size of matrix M.)

Constructing an Ultrametric Tree

Suppose M is an ultrametric. We can construct an ultrametric tree that agrees with M on given subset {1,…,s} in following manner:

• Initialization: Position points 1 and 2 at equal distance M[1,2]/2 from root node.

• Iterations: For each point j = 3,…,s add point j to current tree by adding a new branch that emanates from j's unique point of departure from the tree. This point is determined by the closest point in the tree.

Example: M[1,2]=8; M[1,3]=M[1,4]=M[1,5]=10; M[2,3]=M[2,4]=M[2,5]=10; M[3,4]=2; M[3,5]=6; M[4,5]=6;

[Figure: the ultrametric tree constructed for this example on points 1,…,5]
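The construction procedure can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the names `Node`, `tree_distance`, `build_ultrametric_tree` and `symmetric_matrix` are hypothetical, and the sketch assumes M is symmetric with zero diagonal and positive off-diagonal entries. It also assumes that the point of departure of j lies on the path from j's closest point c to the root, at height M[c,j]/2, which is one reading of "determined by the closest point".

```python
class Node:
    """Tree node; `height` is its distance to the leaves below it."""
    def __init__(self, height, label=None):
        self.height = height
        self.label = label
        self.parent = None


def tree_distance(a, b):
    """Leaf-to-leaf distance: twice the height of the lowest common ancestor."""
    ancestors = set()
    node = a
    while node is not None:
        ancestors.add(id(node))
        node = node.parent
    node = b
    while id(node) not in ancestors:
        node = node.parent
    return 2 * node.height


def build_ultrametric_tree(M, points):
    """Return {point: leaf} for a tree agreeing with M on `points`, or None."""
    first, second = points[0], points[1]
    root = Node(M[first][second] / 2)
    leaves = {}
    for p in (first, second):
        leaves[p] = Node(0, p)
        leaves[p].parent = root
    for j in points[2:]:
        closest = min(leaves, key=lambda i: M[i][j])
        h = M[closest][j] / 2           # height of j's point of departure
        child = leaves[closest]
        while child.parent is not None and child.parent.height < h:
            child = child.parent
        parent = child.parent
        if parent is not None and parent.height == h:
            departure = parent          # j departs from an existing node
        else:
            departure = Node(h)         # split an edge, or add a new root
            departure.parent = parent
            child.parent = departure
        leaf = Node(0, j)
        leaf.parent = departure
        # j is inconsistent if the enlarged tree disagrees with M somewhere.
        if any(tree_distance(leaves[i], leaf) != M[i][j] for i in leaves):
            return None
        leaves[j] = leaf
    return leaves


def symmetric_matrix(n, pairs):
    """Build a 1-indexed symmetric matrix (row and column 0 are unused)."""
    M = [[0] * (n + 1) for _ in range(n + 1)]
    for (i, j), d in pairs.items():
        M[i][j] = M[j][i] = d
    return M


# The five-point example from the slide.
pairs = {(1, 2): 8, (1, 3): 10, (1, 4): 10, (1, 5): 10, (2, 3): 10,
         (2, 4): 10, (2, 5): 10, (3, 4): 2, (3, 5): 6, (4, 5): 6}
M = symmetric_matrix(5, pairs)
leaves = build_ultrametric_tree(M, [1, 2, 3, 4, 5])
```

On the slide's example the procedure succeeds and the resulting tree reproduces every given distance. If an entry is changed so that some point no longer fits (for instance M[4,5]=7), the function returns None.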

Consistency of points with tree

For U ⊆ [n], let T_U denote the tree with leaf-set U that agrees with M on U (if such a tree exists, it is unique).

Def: Say that j ∈ [n] \ U is consistent with T_U if adding j to T_U as described in the construction procedure results in a tree that agrees with M on U+j.

Denote the set of points consistent with T_U by Γ_U.

The “Scaffold Partition”

For U ⊆ [n], let T_U denote the tree with leaf-set U that agrees with M on U. We refer to this tree as the scaffold.

Def: Let P_U be the following partition, induced by T_U, of the set of points consistent with T_U: points i and j are in the same class iff they have the same point of departure from T_U.

[Figure: The scaffold partition, with classes C1, C2, C3, C4]

Violating Pairs

If M is an ultrametric, then for every subset U, and for every two points i,j that belong to different classes in P_U, the value of M[i,j] is exactly determined by the corresponding (different) departure points in T_U.

Def: Say that two points i,j that belong to different classes in P_U are a violating pair w.r.t. T_U if the distance between them according to the scaffold T_U differs from M[i,j].

[Figure: scaffold T_U with classes C1, C2, C3, C4 and two points i, j in different classes. If M is ultrametric, must have M[i,j]=8.]

Two types of “witnesses”

Suppose we have a scaffold tree T_U that agrees with M on U. (If can't construct such a tree, clearly M is not an ultrametric.)

It follows that:

• If obtain a point j that is inconsistent with T_U, then have a witness that M is not an ultrametric.

• If obtain a pair of points i,j that are violating w.r.t. T_U, then have a witness that M is not an ultrametric.

Testing Algorithm for Ultrametrics

1. Uniformly select s=O(log(1/ε)/ε^3) points from [n]. Denote set by U.

2. Construct tree T_U that agrees with M on U. If fail, reject.

3. Uniformly select m=O(1/ε) pairs of points from [n].

4. If any of these 2m points is inconsistent with T_U, or any of the m pairs is violating w.r.t. T_U, then reject.

5. If no step caused rejection, then accept.
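The five steps can be sketched in Python under several assumptions that are not in the slides. The scaffold is not built explicitly: the sketch assumes a tree agreeing with M on a set exists exactly when the three-point condition (the two largest distances in every triple are equal) holds on that set, and it identifies a point's departure point by its closest point in U and the distance to it. The names `ultrametric_tester`, `is_consistent`, `is_violating`, and the parameter `const` (standing in for the unspecified constants hidden in the O-notation) are hypothetical. M is assumed symmetric with zero diagonal and positive off-diagonal entries, and 0 < eps < 1.

```python
import math
import random
from itertools import combinations


def two_largest_equal(x, y, z):
    """Three-point condition for one triple of pairwise distances."""
    a, b, c = sorted((x, y, z))
    return b == c


def is_consistent(M, U, j):
    """j fits the scaffold iff every triple (a, b, j) with a, b in U is fine."""
    return all(two_largest_equal(M[a][b], M[a][j], M[b][j])
               for a, b in combinations(U, 2))


def departure(M, U, j):
    """Describe j's point of departure by (closest point in U, distance)."""
    c = min(U, key=lambda u: M[u][j])
    return c, M[c][j]


def is_violating(M, U, i, j):
    """True if i, j are in different classes and the scaffold contradicts M."""
    ci, di = departure(M, U, i)
    cj, dj = departure(M, U, j)
    if di == dj and M[ci][cj] <= di:
        return False  # same class: the scaffold does not fix M[i][j]
    return M[i][j] != max(di, dj, M[ci][cj])


def ultrametric_tester(M, eps, rng, const=1.0):
    """Return True to accept, False to reject."""
    n = len(M)
    s = min(n, max(2, math.ceil(const * math.log(1 / eps) / eps ** 3)))
    m = math.ceil(const / eps)
    U = rng.sample(range(n), s)                       # step 1
    for a, b, c in combinations(U, 3):                # step 2
        if not two_largest_equal(M[a][b], M[a][c], M[b][c]):
            return False
    for _ in range(m):                                # steps 3 and 4
        i, j = rng.randrange(n), rng.randrange(n)
        if not (is_consistent(M, U, i) and is_consistent(M, U, j)):
            return False
        if is_violating(M, U, i, j):
            return False
    return True                                       # step 5


# Hypothetical inputs: a dyadic ultrametric and a line metric on 32 points.
n = 32
M_ultra = [[(i ^ j).bit_length() for j in range(n)] for i in range(n)]
M_line = [[abs(i - j) for j in range(n)] for i in range(n)]
accepted = ultrametric_tester(M_ultra, 0.5, random.Random(0))
rejected = not ultrametric_tester(M_line, 0.5, random.Random(0))
```

In this sketch an ultrametric input is never rejected, matching the one-sided error of the algorithm. The line metric is rejected already in step 2, because any three distinct points on a line violate the three-point condition.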

Analysis of Algorithm

If M is ultrametric -- Algorithm always accepts. (No inconsistent points and no violating pairs.)

From now on assume M is ε-far from ultrametric. Will show that algorithm rejects w.h.p.

Specifically: Either can't construct T_U that agrees with M; or many inconsistent points w.r.t. T_U; or many violating pairs w.r.t. T_U.

Special Case (for M ε-far from ultrametric)

Suppose T_U agrees with M, and all but at most (ε/3)n^2 pairs of points consistent with T_U belong to different classes in P_U (are separated). (In particular, this is the case if all classes are of size O(εn).)

Claim: Either have > (ε/3)n inconsistent points w.r.t. T_U, or have > (ε/3)n^2 violating pairs w.r.t. T_U.

Subject to claim, if M is ε-far from ultrametric, then rejected w.h.p. as required.

Proof of Claim for special case

Assume, contrary to claim, that have ≤ (ε/3)n inconsistent points, and ≤ (ε/3)n^2 violating pairs. Will show that there exists an ultrametric tree T that agrees with M on all but at most εn^2 entries, in contradiction to assumption on M.

Tree T builds on scaffold T_U:

For every class C in P_U create a star-shaped sub-tree with leaf set C that is rooted at the point of departure of C from T_U. Inconsistent points are added arbitrarily.

By premise of the special case and (counter) assumptions, num of disagreements ≤ (ε/3)n·n (incon. pts) + (ε/3)n^2 (viol. pairs) + (ε/3)n^2 (unsep. pairs) = εn^2.

[Figure: scaffold T_U with classes C1, C2, C3, C4]

General Case

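By special case: Gain from separating points into different classes.

Def: Say that a point k (not in U) is an effective separator w.r.t. T_U if adding k to U causes at least (εn/12)^2 pairs of points to be separated into different classes.

[Figure: adding point k to the scaffold splits class C1 into C1,1 and C1,2; classes C2, C3, C4 are unchanged]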


General Case (continued)

In analysis, view sample U as being selected in phases.

In each phase, if many effective separators then one selected w.h.p.

After sufficient num of phases, either have special case (few non-separated pairs), or U s.t. have few effective separators w.r.t. T_U.

In latter case can show that for every class C in P_U there exists a tree T_C s.t. for almost all pairs i,j ∈ C, M[i,j] equals the distance between i and j in T_C. (Tree is star-shaped/broom-shaped.)

General Case (continued)

Claim: Either have > (ε/4)n inconsistent points w.r.t. T_U, or have > (ε/4)n^2 violating pairs w.r.t. T_U.

Subject to claim, if M is ε-far from ultrametric, then rejected w.h.p. as required.

Proof of Claim is similar to that in special case: Assume few inconsistent points and violating pairs, show that there is a tree close to M (contradicting M being ε-far from ultrametric).

[Figure: scaffold T_U with classes C1, C2, C3, C4]

Solving Relaxed Version of Search Problem

Analysis implies that testing algorithm can be used to solve relaxed version of corresponding search problem.

That is, if M is ultrametric then, w.h.p., can construct a tree that agrees with M on all but at most an ε-fraction of entries in time linear in n and polynomial in 1/ε:

• Construct scaffold T_U on uniformly selected sample U;

• Partition all points in [n]\U into classes of P_U according to distances to points in U;

• For each class C construct star/broom-shaped tree T_C.
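The partition step can be sketched in Python as follows; the scaffold construction and the per-class star/broom trees are not covered. The sketch assumes M is an ultrametric (as in the statement above), and it identifies a departure point by its height together with the set of sample points lying below it, which is a representation chosen here for illustration rather than taken from the slides. The names `departure_key` and `scaffold_partition` are hypothetical.

```python
from collections import defaultdict


def departure_key(M, U, j):
    """Canonical description of j's point of departure from the scaffold.

    With c the closest point to j in U and d = M[c][j], the departure
    point sits at distance d/2 above the leaves, and the sample points
    below it are exactly those within distance d of c.
    """
    c = min(U, key=lambda u: M[u][j])
    d = M[c][j]
    below = frozenset(u for u in U if M[u][c] <= d)
    return d, below


def scaffold_partition(M, U, points):
    """Group `points` by departure point; each key costs O(|U|) queries."""
    classes = defaultdict(list)
    for j in points:
        classes[departure_key(M, U, j)].append(j)
    return list(classes.values())


# Hypothetical example: a dyadic ultrametric on 8 points, sample U = [0, 4].
n = 8
M = [[(i ^ j).bit_length() for j in range(n)] for i in range(n)]
U = [0, 4]
classes = scaffold_partition(M, U, [1, 2, 3, 5, 6, 7])
# classes == [[1], [2, 3], [5], [6, 7]]
```

Each point is classified using only its distances to the sample, so the total work is proportional to n times |U|, in line with the linear dependence on n stated above.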

Testing Approximate Ultrametrics

Def: For a given approximation parameter δ, we say that matrix M is a δ-approximate ultrametric if there exists an ultrametric M' s.t. for every i,j ∈ [n], |M[i,j]-M'[i,j]| ≤ δ.

We describe an algorithm such that for every ε and δ, if M is a δ-approximate ultrametric then the algorithm accepts M, and if M is ε-far from being a cδ-approximate ultrametric then the algorithm rejects M w.h.p. (c is a fixed constant).

Conclusions and Further Research

• Presented algorithm for testing whether matrix is an ultrametric or far from being an ultrametric. Analysis implies fast solution for relaxed search problem.

• Mentioned similar results for approximate ultrametrics, general tree metrics and Euclidean metrics.

• We suspect that results can be improved in terms of dependence on 1/ε.

• We conjecture that can extend result for general tree metrics to approximate variant.

• Testing other natural metric properties?
