ON THE EFFICIENCY OF THE HAMMING C-CENTERSTRING PROBLEMS Amihood Amir Liam Roditty Jessica Ficler Oren Sar Shalom



Motivation – the Conference Location Problem


Consensus String Problem

Input: points in space.

Output: a point whose maximum distance from all the input points is smallest.


Hamming Distance

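• The Hamming distance between two strings of equal length is the number of positions at which they differ.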
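The distance named in this slide's title counts the positions where two equal-length strings differ. A minimal sketch follows; the function name `hamming_distance` is an illustrative choice, not taken from the talk.

```python
def hamming_distance(s: str, t: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    if len(s) != len(t):
        raise ValueError("Hamming distance needs strings of equal length")
    return sum(a != b for a, b in zip(s, t))


# Two of the bit strings used later in the talk's reduction example.
print(hamming_distance("1001000", "0101000"))
```

The two strings differ only in their first two positions, so the call prints 2.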


Consensus String Problem (1-HRC)

• Input: k strings s_1, ..., s_k of length l over an alphabet Σ, and a radius d.

• Output: a string s of length l whose Hamming distance from every s_i is at most d, if such a string exists.
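For very small instances, a consensus string can be found by trying every candidate center. The sketch below is only an illustration of the problem statement: it is exponential in the string length and is not one of the algorithms cited in the talk. The names `hamming` and `brute_force_center` are hypothetical.

```python
from itertools import product


def hamming(s, t):
    """Number of positions where equal-length strings s and t differ."""
    return sum(a != b for a, b in zip(s, t))


def brute_force_center(strings, alphabet):
    """Return (center, radius) minimizing the maximum Hamming distance.

    Tries all len(alphabet) ** l candidates, so it is usable only for tiny l.
    """
    length = len(strings[0])
    best_center, best_radius = None, length + 1
    for letters in product(alphabet, repeat=length):
        candidate = "".join(letters)
        radius = max(hamming(candidate, s) for s in strings)
        if radius < best_radius:
            best_center, best_radius = candidate, radius
    return best_center, best_radius


center, radius = brute_force_center(["0110000", "1010000", "0010100"], "01")
print(center, radius)
```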

History:

Frances and Litman [1997]: Problem is NP-complete even for binary alphabets

Therefore: 3 directions.

1. Solution for small k.
2. Fixed parameter tractability.
3. Approximation algorithms.


History:

Solution for small k:

Gramm, Niedermeier, and Rossmanith [2001] (k = 3)

Boucher, Brown, and Durocher [2008] (k = 4, binary alphabet)

A., Landau, Na, Park, Park, and Sim [2009] (k = 3, radius and distance-sum optimization)

A., Paryenty, and Roditty [2012] (k = 5, binary alphabet: l^2; for all k: l^k)


History:

Fixed Parameter Tractability for all Parameters:

Fixed l: Ben-Dor, Lancia, Perone, and Ravi [1997]

Fixed k: Gramm, Niedermeier, and Rossmanith [2003]

Fixed d:
Stojanovic, Berman, Gumucio, Hardison, and Miller [1997]
Lanctot, Li, Ma, Wang, and Zhang [1999]
Sze, Lu, and Chen [2004]


History: Approximations:

PTAS: Li, Ma, and Wang [2002] – not practical.

Rounded LP: Ben-Dor, Lancia, Perone, and Ravi [1997]
  large number of variables: |Σ|l

Chimani, Woste, and Bocker [2011]:
  can be reduced to: |Σ|(l-1)

A., Paryenty, and Roditty [2011]:
  |T(S)| |Σ| (T(S) = set of column types)


Another Motivation – Clustering.

The C-CenterStrings problem

Input:
1. Points in space
2. Number c
3. Objective function f.

Output: Divide the points into c sets such that, for the c consensus strings c1, c2, …, cc of these sets, f(c1, c2, …, cc) is maximum/minimum.


Three Types of Objective functions:

• Let HRC (Hamming Radius Clustering) be the consensus string problem defined before.

• 1. c-HRC: partition into c sets, each of which has a center with radius d.

• 2. c-HRLC: partition into c sets, each of which has a center with radius d, but the center is part of the input set.

• 3. c-HRSC: partition into c sets, each of which has a center, and the sum of the radii does not exceed d.
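The three objectives differ only in how a proposed clustering is checked. The following sketch verifies a given partition and list of centers against each definition; it does not search for a clustering, and the function names (`is_hrc`, `is_hrlc`, `is_hrsc`) are hypothetical. For c-HRLC it assumes "part of the input set" means any input string may serve as a center.

```python
def hamming(s, t):
    """Number of positions where equal-length strings s and t differ."""
    return sum(a != b for a, b in zip(s, t))


def radius(center, cluster):
    """Largest distance from the center to any string in a non-empty cluster."""
    return max(hamming(center, s) for s in cluster)


def is_hrc(clusters, centers, d):
    """c-HRC check: every cluster lies within radius d of its center."""
    return all(radius(c, cl) <= d for c, cl in zip(centers, clusters))


def is_hrlc(clusters, centers, d):
    """c-HRLC check: as c-HRC, and every center is itself an input string."""
    inputs = {s for cl in clusters for s in cl}
    return is_hrc(clusters, centers, d) and all(c in inputs for c in centers)


def is_hrsc(clusters, centers, d):
    """c-HRSC check: the radii of all clusters sum to at most d."""
    return sum(radius(c, cl) for c, cl in zip(centers, clusters)) <= d


clusters = [["0110000", "1010000"], ["0000101", "0000011"]]
centers = ["0010000", "0000001"]
print(is_hrc(clusters, centers, 1), is_hrlc(clusters, centers, 1), is_hrsc(clusters, centers, 2))
```

Here both clusters have radius 1 around centers that are not input strings, so the same clustering passes the c-HRC check for d=1, fails the c-HRLC check, and passes the c-HRSC check only once d reaches 2.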


The Hamming radius c-clustering problem (c-HRC)

Example:

For the following strings and d=1, we show it belongs to 2-HRC.


The Hamming radius local c-clustering problem (c-HRLC)

Example:

For the following strings and d=2, we show it belongs to 2-HRLC.

Does it belong to 2-HRLC when d=1?


The Hamming radius c-clustering sum problem (c-HRSC)

Example:

For the following strings and d=2, we show it belongs to 2-HRSC.


In this Paper:

We consider:

1. Parameterized Complexity, and

2. Approximations

Small k is not too meaningful in the context of clustering.


C-CenterString Parameterized Complexity

        c fixed           k fixed           d fixed (d=1)   d/l and c fixed   l fixed (l=2)
HRC     NPC               polynomial time   NPC             polynomial time   ?
HRLC    polynomial time   polynomial time   ?               polynomial time   NPC
HRSC    NPC               polynomial time   ?               polynomial time   ?


Theorem: HRC, HRLC and HRSC can be solved in polynomial time for fixed k.

• If k ≤ c, each input string can serve as its own center, so d = 0.

• Otherwise c < k. There are c^k < k^k options for partitioning k strings into c sets.
  - For each set, find the consensus center in polynomial time.
  - The partition that gives the best result is the optimal solution.
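The enumeration argument can be sketched as a loop over all c^k label assignments. To keep the per-set step trivially polynomial, this sketch swaps in the local variant: centers are restricted to input strings, as in HRLC. The consensus-center step for HRC relies on fixed-k algorithms that are not reproduced here. The function name `best_local_clustering` is hypothetical.

```python
from itertools import product


def hamming(s, t):
    """Number of positions where equal-length strings s and t differ."""
    return sum(a != b for a, b in zip(s, t))


def best_local_clustering(strings, c):
    """Try all c ** k assignments; return (max_radius, labels) of the best one.

    Each non-empty set gets the input string that minimizes its radius.
    """
    k = len(strings)
    best = None
    for labels in product(range(c), repeat=k):
        worst = 0
        for j in range(c):
            members = [s for s, lab in zip(strings, labels) if lab == j]
            if not members:
                continue  # unused label: contributes no radius
            r = min(max(hamming(cand, m) for m in members) for cand in strings)
            worst = max(worst, r)
        if best is None or worst < best[0]:
            best = (worst, labels)
    return best


strings = ["12", "14", "32", "34", "54"]
print(best_local_clustering(strings, 2))
```

With k fixed, the outer loop has a constant number of iterations, which is the point of the theorem; the sketch makes no attempt to prune symmetric assignments.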


Theorem: HRC is NP-complete even if the radius is fixed to d = 1.

• d = 1 and the alphabet is binary
• By reduction from Vertex Cover for Triangle-Free Graphs

• Our input:
  • G - Triangle-Free Graph
  • t – size of vertex-cover set


• The construction:

The c parameter is t.

The distance parameter d is 1.

Encode edges as bit strings of length |V|. In each string, set the bits of the two vertices at the ends of the edge.

[Figure: example triangle-free graph on vertices 1-7.]

1 2 3 4 5 6 7
1 0 0 1 0 0 0
0 1 1 0 0 0 0
0 1 0 1 0 0 0
1 0 1 0 0 0 0
0 0 1 0 1 0 0
0 0 0 1 0 1 0
0 0 0 0 1 0 1
0 0 0 0 0 1 1
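The edge encoding can be sketched in a few lines. The edge list below is read off the rows of the slide's table (each row has exactly two set bits); the function name `encode_edges` and the 1-based vertex numbering are illustrative assumptions.

```python
def encode_edges(num_vertices, edges):
    """Encode each edge (u, v) as a bit string with bits u and v set (1-based)."""
    rows = []
    for u, v in edges:
        bits = ["0"] * num_vertices
        bits[u - 1] = "1"
        bits[v - 1] = "1"
        rows.append("".join(bits))
    return rows


# Edges of the example graph, in the order of the slide's rows.
EDGES = [(1, 4), (2, 3), (2, 4), (1, 3), (3, 5), (4, 6), (5, 7), (6, 7)]

for row in encode_edges(7, EDGES):
    print(" ".join(row))
```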

[Figure: the example graph on vertices 1-7.]

Center 0 0 1 0 0 0 0:
0 1 1 0 0 0 0
1 0 1 0 0 0 0
0 0 1 0 1 0 0

Center 0 0 0 1 0 0 0:
1 0 0 1 0 0 0
0 1 0 1 0 0 0
0 0 0 1 0 1 0

Center 0 0 0 0 0 0 1:
0 0 0 0 1 0 1
0 0 0 0 0 1 1
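One direction of the reduction can be checked mechanically: given a vertex cover, the string with a single 1 at a cover vertex is within distance 1 of every edge string touching that vertex. The sketch below assumes the cover {3, 4, 7}, which is read off the weight-one strings on the slide; `unit_string` and `clusters_from_cover` are hypothetical names.

```python
def hamming(s, t):
    """Number of positions where equal-length strings s and t differ."""
    return sum(a != b for a, b in zip(s, t))


def unit_string(n, v):
    """Bit string of length n with only bit v set (1-based)."""
    return "0" * (v - 1) + "1" + "0" * (n - v)


def clusters_from_cover(n, edge_strings, cover):
    """Assign each edge string to the first cover vertex within distance 1."""
    clusters = {v: [] for v in cover}
    for s in edge_strings:
        for v in cover:
            if hamming(s, unit_string(n, v)) <= 1:
                clusters[v].append(s)
                break
        else:
            raise ValueError(f"{s} is not covered, so this is not a vertex cover")
    return clusters


EDGE_STRINGS = ["1001000", "0110000", "0101000", "1010000",
                "0010100", "0001010", "0000101", "0000011"]

for vertex, members in clusters_from_cover(7, EDGE_STRINGS, [3, 4, 7]).items():
    print(unit_string(7, vertex), members)
```

An edge string has two set bits, so its distance to a unit string is 1 when the vertex is an endpoint of the edge and 3 otherwise; that is why the distance test doubles as a cover test.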


0 1 1 0 0 0 0

1 0 1 0 0 0 0

0 0 1 0 1 0 0

0 0 1 0 0 0 0

The strings 0 0 0 0 1 0 1 and 0 0 0 1 0 1 0 differ in four positions, so no single center can be within distance 1 of both.

0 0 0 ? ? ? ?

0 0 0 0 1 0 1

0 0 0 1 0 1 0

[Figure: a triangle on vertices 1, 2, 3.]

1 2 3 4 5 6 7
0 1 1 0 0 0 0
1 0 1 0 0 0 0
1 1 0 0 0 0 0


Theorem: HRLC is NP-complete even if the length is fixed to l = 2.

• We prove by reduction from Minimum Maximal Matching for Bipartite graphs.

• Our input:
  • G – Bipartite Graph
  • t – size of a minimum maximal matching

[Figure: two panels, "Maximal Matching" and "Minimum Maximal Matching".]


• The construction:

The c parameter is t.

The distance parameter d is 1.

[Figure: example bipartite graph on vertices 1-5.]

Edge strings:

1 2
1 4
3 2
3 4
5 4
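With each edge written as a length-2 string, two edges share an endpoint exactly when their strings agree in some position, that is, when their Hamming distance is at most 1. This assumes the first symbol always names a vertex from one side of the bipartite graph and the second a vertex from the other, as in the slide's strings. Under that assumption, a maximal matching supplies centers taken from the input with radius 1. The helper names below are hypothetical.

```python
def hamming(s, t):
    """Number of positions where equal-length strings s and t differ."""
    return sum(a != b for a, b in zip(s, t))


def is_matching(edge_strings):
    """True if no two edges share an endpoint (all pairwise distances are 2)."""
    return all(
        hamming(a, b) == 2
        for i, a in enumerate(edge_strings)
        for b in edge_strings[i + 1:]
    )


def covers_all(centers, strings, d=1):
    """True if every string is within distance d of at least one center."""
    return all(any(hamming(c, s) <= d for c in centers) for s in strings)


EDGES = ["12", "14", "32", "34", "54"]  # edge strings from the slide
centers = ["32", "54"]                  # candidate maximal matching of size 2

print(is_matching(centers), covers_all(centers, EDGES))
```

A matching is maximal exactly when `covers_all` holds for it, so a matching of size t that passes both checks is a valid t-HRLC solution for d=1.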

[Figure: the example bipartite graph with the strings 1 2, 3 2, 3 4, 3 2, 1 4, 5 4, 5 4.]


3 2 5 4

[Figure: worked example on the length-2 strings 1 2, 1 3, 1 4, 5 2, 5 3, 6 2, 6 7, illustrating the steps below.]

Move strings [6,2] and [5,2] if there are centers that begin with 5 or 6.

Change the center to one of the remaining strings.

We keep going until no two centers share a common symbol.


Approximation Algorithms

• 1. A linear-time 4-Approximation for the 2-HRSC problem.

• 2. A polynomial time 3-Approximation for the 2-HRSC problem.

• 3. Special case PTAS – by computing the clusters and doing 1-HRC approximation on each cluster.

Lemma

[Figure: three clusters; the distances between them are labeled >2d.]

Proof

[Figure: a cluster and its center.]

• If we had a representative from each cluster, we could associate the rest of the strings with the appropriate group.

• Now use a known approximation algorithm for 1-HRC to find the consensus string of each cluster.

[Figure: three clusters; the distances between them are labeled >2d.]
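The association step can be sketched as follows. It assumes, as the ">2d" labels in the figure suggest, that strings in the same cluster are within 2d of each other while strings in different clusters are more than 2d apart; that separation condition is an assumption of this sketch, since the lemma's statement did not survive in the slide text. The function name is hypothetical.

```python
def hamming(s, t):
    """Number of positions where equal-length strings s and t differ."""
    return sum(a != b for a, b in zip(s, t))


def assign_to_representatives(strings, representatives, d):
    """Group each string with the representative at distance at most 2d."""
    groups = {rep: [] for rep in representatives}
    for s in strings:
        for rep in representatives:
            if hamming(s, rep) <= 2 * d:
                groups[rep].append(s)
                break
        else:
            raise ValueError(f"{s} is farther than 2d from every representative")
    return groups


strings = ["0000", "0001", "1110", "1111"]
print(assign_to_representatives(strings, ["0000", "1111"], d=1))
```

Each resulting group can then be handed to any 1-HRC approximation routine; that second step is not shown here.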

 

Lemma

[Figure: two clusters, each labeled with its c-center, separated by a distance labeled >4d.]

Proof

[Figure: distances labeled ≤d.]


0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0

1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

1 1 1 0 0 0 1 1 1 1 1 1 1 0 0 0 0 0 0


Polynomial time approximation algorithm for 2-HRSC problem

Future work

1. We presented a heuristic algorithm that did very well in practice – what is its approximation ratio?

2. There are some gaps in the parameterized complexity table:
   a. What happens in the HRLC/HRSC cases for fixed d?
   b. What happens in the HRC/HRSC cases for fixed l?

3. Is there a PTAS for c-HRC?

4. Can we approximate c-HRC using LP? SDP?