Constraint Systems Laboratory. Oct 21, 2004. Guddeti: MS thesis defense. An Improved Restart Strategy for Randomized Backtrack Search. Venkata P. Guddeti

Oct 21, 2004 Guddeti: MS thesis defense 1

Constraint Systems Laboratory

An Improved Restart Strategy for Randomized Backtrack Search

Venkata P. Guddeti

Constraint Systems Laboratory
University of Nebraska-Lincoln

Under the supervision of Dr. Berthe Y. Choueiry



Outline

• Summary of contributions

• Background

• Randomized BT search with restarts

• Empirical evaluations

• Conclusions & future research directions



Summary of contributions

• An improved restart strategy for randomized backtrack search (RDGR)

• Evaluation & characterization
– Comparison with BT, LS, ERA, RGR
– Criterion: solution quality distribution
– Problem types: GTAAP & random CSPs

• As a result, we have identified
– Regimes where a given technique dominates
– Building blocks for designing cooperative, hybrid search



Outline

• Summary of contributions

• Background
– Constraint satisfaction problem (CSP)
– Graduate Teaching Assistants Assignment Problem (GTAAP)
– Search strategies: BT, LS, ERA

• Randomized BT search with restarts

• Empirical evaluations

• Conclusions & future research directions



CSP: Definition

• Given P = (V, D, C):
– V a set of variables
– D a set of variable domains (values that a variable can take)
– C a set of constraints

• Objective: assign a value to each variable such that all constraints are satisfied

In general, a CSP is NP-complete
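The definition P = (V, D, C) can be made concrete with a small sketch. The instance below is hypothetical (three variables and two "not equal" constraints chosen for illustration, not taken from the slides), and the brute-force enumeration is only meant to show what "all constraints are satisfied" means, not how a solver would proceed.

```python
from itertools import product

# Hypothetical toy instance P = (V, D, C); not taken from the slides.
variables = ["X", "Y", "Z"]
domains = {"X": {1, 2}, "Y": {1, 2}, "Z": {2, 3}}
# Each binary constraint is (scope, predicate over the two values).
constraints = [
    (("X", "Y"), lambda x, y: x != y),
    (("Y", "Z"), lambda y, z: y != z),
]

def is_consistent(assignment):
    """True if no constraint whose scope is fully assigned is violated."""
    for (u, v), pred in constraints:
        if u in assignment and v in assignment:
            if not pred(assignment[u], assignment[v]):
                return False
    return True

def is_solution(assignment):
    """A solution gives every variable a value from its domain and is consistent."""
    complete = all(v in assignment and assignment[v] in domains[v] for v in variables)
    return complete and is_consistent(assignment)

def all_solutions():
    """Brute-force enumeration; exponential in |V|, for illustration only."""
    found = []
    for values in product(*(sorted(domains[v]) for v in variables)):
        candidate = dict(zip(variables, values))
        if is_consistent(candidate):
            found.append(candidate)
    return found
```

A partial assignment such as `{"X": 1}` is consistent but is not a solution, which is the distinction backtrack search relies on when it extends assignments one variable at a time.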



CSP: Representation

• Variable → node

• Domain → node label

• Constraint → edge between nodes

[Figure: constraint graph with four nodes V1, V2, V3, V4, labeled with the domains {d}, {c, d, e, f}, {a, b, c}, {a, b, d}; the edges are ≠ constraints]



Context: GTAAP [Glaubius 01]

Hiring & managing GTAs as instructors + graders

• Given
– A set of courses
– A set of GTAs
– A set of constraints that specify allowable assignments

• Find a consistent & satisfactory assignment
– Consistent: assignment breaks no (hard) constraints
– Satisfactory: assignment maximizes
  1. number of courses covered
  2. happiness of the GTAs



Constraint-based model

• Variables (typically 70 courses)
– Grading, conducting lectures, labs & recitations

• Values (30 GTAs)
– Hired GTAs (+ preference for each value in domain)

• Constraints
– Unary, binary, global (e.g., capacity)

• Objective
– longest consistent solution (primary criterion)
– maximize geometric mean of preferences (secondary criterion)
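The two-level objective can be read as a lexicographic comparison: a longer consistent assignment always wins, and the geometric mean of preferences only breaks ties. The sketch below assumes preferences are positive numbers stored per (course, GTA) pair; the names `quality`, `better`, and the data layout are illustrative choices, not the thesis implementation.

```python
import math

def quality(assignment, preference):
    """Return (length, geometric mean of preferences) of a partial assignment.

    assignment: dict course -> GTA (only covered courses appear)
    preference: dict (course, GTA) -> positive preference value (assumed scale)
    """
    if not assignment:
        return (0, 0.0)
    prefs = [preference[(course, gta)] for course, gta in assignment.items()]
    # Geometric mean computed in log space to avoid overflow on long products.
    gmean = math.exp(sum(math.log(p) for p in prefs) / len(prefs))
    return (len(assignment), gmean)

def better(a, b, preference):
    """True if a beats b: more courses covered first, then higher geometric mean."""
    return quality(a, preference) > quality(b, preference)
```

Python compares tuples lexicographically, so an assignment covering two courses with low preferences still beats one covering a single course with a high preference.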



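Backtrack search (BT)

Start with an empty assignment & expand it by instantiating one variable at a time

[Figure: constraint graph with four nodes V1, V2, V3, V4, labeled with the domains {d}, {c, d, e, f}, {a, b, c}, {a, b, d}; the edges are ≠ constraints]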
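A minimal sketch of chronological backtracking, assuming a static variable ordering and a user-supplied `conflicts` check. The four-variable instance is hypothetical (its domains and edges are not those of the figure), and the sketch searches for a complete solution, whereas the GTAAP objective is the longest consistent partial solution.

```python
def backtrack(variables, domains, conflicts, assignment=None):
    """Chronological backtracking. Returns a full assignment or None.

    conflicts(var, val, assignment) -> True if var=val violates a constraint
    against the values already present in assignment.
    """
    if assignment is None:
        assignment = {}
    if len(assignment) == len(variables):
        return dict(assignment)
    var = next(v for v in variables if v not in assignment)  # static ordering
    for val in domains[var]:
        if not conflicts(var, val, assignment):
            assignment[var] = val
            result = backtrack(variables, domains, conflicts, assignment)
            if result is not None:
                return result
            del assignment[var]  # undo and try the next value
    return None  # every value failed: backtrack to the previous variable

# Hypothetical instance: four variables with "not equal" constraints on the edges.
variables = ["V1", "V2", "V3", "V4"]
domains = {"V1": ["a", "b"], "V2": ["a", "b"], "V3": ["a"], "V4": ["a", "b", "c"]}
edges = {("V1", "V2"), ("V1", "V3"), ("V2", "V4"), ("V3", "V4")}

def conflicts(var, val, assignment):
    """A value conflicts if a neighbouring variable already holds the same value."""
    for other, other_val in assignment.items():
        if ((var, other) in edges or (other, var) in edges) and val == other_val:
            return True
    return False

solution = backtrack(variables, domains, conflicts)
```

On this instance the first choice V1 = a leads to a dead end at V3, so the search has to undo V2 and then V1 before it succeeds; on large instances the same mechanism is what produces thrashing.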



BT (cont’d)

• In theory, complete. In practice... forget it
– Huge branching factor causes thrashing: backtrack never reaches early variables

• Tested 12 ordering heuristics (Chap 3)
– No significant difference

Use randomization & restarts [Gomes et al. 98]



Iterative-improvement search

• Start with a complete assignment (= state), move to states that improve the current one
• Not complete
• Tested: LS and ERA [Hui Zou, MS 2003]
– Advantages:
  • Explores relatively wide portions of solution space
  • ERA solves tight instances, never solved before or since
– Disadvantages:
  • LS: local optimum & plateau cause stagnation
  • ERA: deadlock in over-constrained cases causes instability



BT: Randomization & restarts

In systematic backtrack search:

• Ordering of variables/values determines which parts of the solution space are explored
– Randomization allows us to explore a wider portion of the search tree

• Thrashing causes stagnation of BT search
– Interrupt search, then restart
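The two ideas combine as follows: each run picks variables and values at random, and a run is interrupted once it has used up its backtrack budget (the cutoff). This is a minimal sketch with a fixed cutoff; counting dead ends as backtracks, the `Cutoff` exception, and the function names are assumptions made for illustration.

```python
import random

class Cutoff(Exception):
    """Raised when the backtrack budget of the current run is exhausted."""

def randomized_bt(variables, domains, conflicts, cutoff, rng):
    """One run of BT with random variable/value ordering and a backtrack cutoff."""
    budget = [cutoff]

    def extend(assignment):
        if len(assignment) == len(variables):
            return dict(assignment)
        var = rng.choice([v for v in variables if v not in assignment])
        values = list(domains[var])
        rng.shuffle(values)
        for val in values:
            if not conflicts(var, val, assignment):
                assignment[var] = val
                result = extend(assignment)
                if result is not None:
                    return result
                del assignment[var]
        budget[0] -= 1  # all values failed here: one backtrack
        if budget[0] < 0:
            raise Cutoff
        return None

    return extend({})

def search_with_restarts(variables, domains, conflicts, cutoff, max_restarts, seed=0):
    """Restart randomized BT until a run finishes within its cutoff.

    Returns (result, restarts_used). result is a solution, or None either because
    a run exhausted the search space or because max_restarts was reached.
    """
    rng = random.Random(seed)
    for restart in range(max_restarts):
        try:
            result = randomized_bt(variables, domains, conflicts, cutoff, rng)
        except Cutoff:
            continue  # interrupt this run and start over with a new random order
        return result, restart
    return None, max_restarts
```

With a fixed cutoff the loop is not complete: an instance whose shortest proof needs more backtracks than the cutoff is never decided, which is the motivation for strategies that grow the cutoff over time.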



Restart strategies

• Fixed-cutoff & universal strategy [Luby et al., 93]

• Randomization & Rapid restarts (RRR) [Gomes et al., 98]
– Fixed optimal cutoff value
– A priori knowledge of cost distribution required

• Randomization & geometric restarts (RGR) [Walsh 99]

• Bayesian approach [Kautz et al., 02]
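For comparison with the geometric schedules discussed next, here is a sketch of the cutoff sequence usually associated with the universal strategy of Luby et al.: 1, 1, 2, 1, 1, 2, 4, and so on. The slides do not spell the sequence out, so the recursive definition below is the commonly cited one rather than something taken from this talk; in practice each term is multiplied by a base cutoff.

```python
def luby(i):
    """i-th term (1-indexed) of the universal restart sequence of Luby et al."""
    k = 1
    while (1 << k) - 1 < i:  # smallest k with i <= 2^k - 1
        k += 1
    if i == (1 << k) - 1:
        return 1 << (k - 1)  # the term is 2^(k-1) when i = 2^k - 1
    return luby(i - (1 << (k - 1)) + 1)  # otherwise repeat the earlier prefix

def fixed_cutoff(i, c):
    """Fixed-cutoff strategy: every run gets the same budget c."""
    return c
```

Unlike a fixed cutoff, this sequence needs no prior knowledge of the cost distribution, at the price of repeating many short runs.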



RGR [Walsh 99]

• Static restart strategy

• As the cutoff value increases, RGR degenerates into randomized BT
– Ensures completeness (utopian in our setting)
– But… restart is obstructed
– … and thrashing reappears, diminishing the probability of finding a solution

$C_{i+1} = r \cdot C_i$, with $C_0 = n$
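A short sketch of the geometric schedule, assuming the recurrence reads C_{i+1} = r · C_i with initial cutoff C_0 = n. Treating n as a plain number supplied by the caller is an assumption of this example; the slide does not define it.

```python
def rgr_cutoffs(n, r, restarts):
    """Cutoffs C_0 .. C_{restarts-1} with C_0 = n and C_{i+1} = r * C_i."""
    cutoffs = []
    c = float(n)
    for _ in range(restarts):
        cutoffs.append(c)
        c *= r  # the cutoff grows at every restart, regardless of progress
    return cutoffs
```

With r = 2 the cutoff after ten restarts is already 1024 times the initial one, which illustrates why later runs behave like plain randomized BT and restarts become rare.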



RDGR

• Randomization & Dynamic Geometric Restarts

• Cutoff value
– Depends on the progress of search
– Never decreases, may stagnate
– Increases at a much slower rate than RGR

• Feature: restart is ‘less’ obstructed

$$C_{i+1} = \begin{cases} r \cdot C_i & \text{when the solution has improved at the } i^{\text{th}} \text{ restart} \\ C_i & \text{otherwise} \end{cases}$$
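The dynamic rule can be sketched as a schedule driven by a record of which restarts improved the best solution: the cutoff is multiplied by r only after an improving restart and is left unchanged otherwise. The list of flags and the initial cutoff passed as `n` are assumptions of this example; in an actual search the flags would come from comparing each run's result with the best solution so far.

```python
def rdgr_cutoffs(n, r, improved_flags):
    """Cutoff used at each restart under the dynamic geometric rule.

    improved_flags[i] is True if restart i improved the best solution found so far.
    """
    cutoffs = []
    c = float(n)
    for improved in improved_flags:
        cutoffs.append(c)
        if improved:
            c *= r  # grow only after an improving restart
        # otherwise the cutoff stays the same (it never decreases)
    return cutoffs
```

For five restarts with n = 10 and r = 2 where only the first and fourth improve, the cutoffs are 10, 20, 20, 20, 40, whereas a purely geometric schedule would reach 160 by the fifth restart.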



Three main experiments

1. Effect of run time on RGR & RDGR

2. Choice of r in RGR & RDGR

3. Relative performance of RDGR versus
– Backtrack search (BT) [Glaubius 01]
– Local Search (LS) [Zou 03]
– Multi-Agent Search (ERA) [Liu et al. 02, Zou 03]
– RGR

All implementations use the same platform and were executed to the best of our abilities (internal competition)



Evaluation criteria

• Solution Quality Distribution (SQD)
– cumulative distributions of solution quality
– measured as percentage deviation from best known solution

• Descriptive statistics
– Mean, median, mode, std dev, max, min

• 95% confidence interval using
– Mann-Whitney U-test
– Wilcoxon matched pairs signed-rank test
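An SQD curve can be computed from the per-run solution qualities in two steps: convert each run to its percentage deviation from the best known solution, then report, for each deviation threshold, the percentage of runs at or below it. The sketch assumes a quality measure where higher is better and a positive best-known value; both are assumptions of this example.

```python
def percent_deviation(quality, best_known):
    """Deviation of one run from the best known solution, in percent."""
    return 100.0 * (best_known - quality) / best_known

def sqd(deviations, thresholds):
    """For each threshold, the percentage of runs with deviation <= threshold."""
    n = len(deviations)
    return [100.0 * sum(d <= t for d in deviations) / n for t in thresholds]
```

A curve that rises faster and sits higher means more runs end close to the best known solution, which is how the orderings such as RDGR > RGR in the plots are read.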



Data sets

• 8 real-world data sets (GTAAP)
– 5 solvable, 3 over-constrained
– Experiment repeated 500 times

• 4 sets of randomly generated problems
– Model B, 100 instances, each instance runs for 3 minutes

[Figure: the four random sets placed along the order parameter axis. At the critical value of the order parameter: Solvable <25,15,0.5,0.36> and Unsolvable <25,15,0.5,0.36>. On either side of it: <40,20,0.5,0.2> and <40,20,0.5,0.5>]
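A sketch of a Model B generator, assuming the tuples above follow the usual <n, a, p1, p2> convention: n variables, domain size a, constraint density p1, and constraint tightness p2. In Model B the number of constraints and the number of forbidden value pairs per constraint are fixed exactly rather than drawn independently; the rounding and the dictionary representation below are choices made for this example, not the generator used in the experiments.

```python
import itertools
import random

def model_b(n, a, p1, p2, seed=0):
    """Generate a random binary CSP under Model B.

    Returns a dict mapping each constrained pair of variables (i, j), i < j,
    to the set of forbidden value pairs (vi, vj).
    """
    rng = random.Random(seed)
    pairs = list(itertools.combinations(range(n), 2))
    n_constraints = round(p1 * len(pairs))  # exactly p1 * n(n-1)/2 constraints
    n_nogoods = round(p2 * a * a)           # exactly p2 * a^2 forbidden pairs each
    value_pairs = list(itertools.product(range(a), repeat=2))
    csp = {}
    for scope in rng.sample(pairs, n_constraints):
        csp[scope] = set(rng.sample(value_pairs, n_nogoods))
    return csp
```

For <25,15,0.5,0.36> this yields 150 constraints with 81 forbidden pairs each; whether a particular instance is solvable still has to be determined by search.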



1. Effect of varying run time

• RDGR consistently outperforms RGR
• Running time does not affect the relative dominance

[Figure: two SQD plots, "Solvable problem" and "Over-constrained problem". x-axis: Deviation from best known solution [%] (0 to 14); y-axis: Percentage of test runs (0 to 100). Curves: RDGR and RGR at 5 min, 10 min, and 20 min.]



2. Choice of r in RGR

[Figure: two plots of Percentage of test runs (y-axis) against Ratio (x-axis, 1 to 4). Panel "RGR on data sets 1 & 5" (curves: Data set 1, Data set 5). Panel "RGR on Random CSPs" (curves: Under-constrained; Over-constrained; Phase transition, solvable; Phase transition, unsolvable).]

r = 1.1 for RGR for GTAAP & random CSPs



2. Choice of r in RDGR

[Figure: two plots of Percentage of test runs (y-axis, 0 to 100) against Ratio (x-axis, 1 to 4). Panel "RDGR on Random CSPs" (curves: Under-constrained; Over-constrained; Phase transition, solvable; Phase transition, unsolvable). Panel "RDGR on data sets 1 & 5" (curves: Data set 1, Data set 5).]

r = 1.1 for GTAAP, r = 2 for random CSPs



3. Performance: SQDs

• Under-constrained: ERA > RDGR > RGR > BT > LS

• Over-constrained: RDGR > RGR > BT > LS > ERA

[Figure: two SQD plots; y-axis: Percentage of test runs (0 to 100); x-axis: Deviation from best known solution [%]. Panel "Under-constrained" (x from 0 to 14; curves: ERA, RDGR, RGR, BT, LS). Panel "Over-constrained" (x from 0 to 25; curves: RDGR, RGR, BT).]



3. SQDs at phase transition

• Solvable: ERA still wins for smallest deviations
• Unsolvable: RDGR > RGR > BT > ERA > LS

[Figure: two SQD plots, "Phase transition, solvable" and "Phase transition, unsolvable"; y-axis: Percentage of test runs (0 to 100); curves: RDGR, RGR, BT, ERA, LS.]



3. Performance: RDGR vs. RGR

• RDGR allows more restarts than RGR

• RDGR is more stable than RGR

Average restarts
Data sets   1      2      3      4      5      6
RGR         16.7   17.4   22.5   14.7   22.4   19.5
RDGR        74.5   59.9   167.4  39.1   39.1   46.2

Standard deviation
Data sets   1      2      3      4      5      6
RGR         2.8    1.1    0.7    1.0    1.0    1.2
RDGR        0.7    0.8    0.6    0.9    0.7    1.1



Summary: algorithm dominance

On GTAAP & randomly generated CSPs

• Solvable instances: ERA > RDGR > RGR > BT > LS

• Over-constrained instances: RDGR > RGR > BT > LS > ERA

• At phase transition (statistically): RDGR > RGR > BT > ERA > LS
(although ERA gives best results on solvable instances)



Future research

• Design ‘progress-aware’ restart strategies
– Where cutoff value is changed during search

• Design new search strategies
– Hybrids: a solution from a given technique is fed to another
– Cooperative: strategies applied where most appropriate within a given problem instance



Thank you for your attention

I welcome your questions.
