Examples of CSP Problems

N-Queens Problem: Given an N x N chessboard, place N queens on the board so that no two queens are on the same row, column, or diagonal.

Graph Coloring: Given a graph of nodes and edges, assign a color to each node so that no two adjacent nodes are assigned the same color.

Boolean Satisfiability: Given a propositional logic formula in conjunctive normal form, find an assignment of true or false to each variable that makes the entire expression evaluate to true.
Properties of CSP Problems

CSP problems differ from single-agent path-finding problems and two-player games: in a CSP we are generally not interested in the series of moves made to reach a solution, but simply in finding a problem state that satisfies all the constraints.

In CSPs the goal state is always given implicitly, since an explicit description of the goal would itself be a solution of the problem.

While constraint-satisfaction problems appear somewhat different from path-finding problems and two-player games, there is a strong similarity among the algorithms employed.
Representation

A CSP can be represented as a set of variables, with a set of values for each variable, and a set of constraints on the values that the variables can be assigned:
- unary constraint: specifies a subset of all possible values that can be assigned to a single variable.
- binary constraint: specifies which combinations of assignments to a pair of variables satisfy the constraint between them.
- ternary constraint: limits the set of values that can be simultaneously assigned to three different variables.
- etc.
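As a concrete sketch of this representation - the variable names, domains, and constraint predicates below are illustrative, not taken from any particular problem - a CSP can be stored as plain data plus predicates over variable tuples:

```python
# A minimal CSP representation sketch: variables, a domain per variable,
# and constraints as (scope, predicate) pairs. Unary, binary, and ternary
# constraints all fit this shape; only the scope length differs.

def make_csp():
    variables = ["x", "y", "z"]
    domains = {"x": {1, 2, 3}, "y": {1, 2, 3}, "z": {1, 2, 3}}
    constraints = [
        (("x",), lambda x: x != 2),                          # unary
        (("x", "y"), lambda x, y: x < y),                    # binary
        (("x", "y", "z"), lambda x, y, z: x + y + z <= 7),   # ternary
    ]
    return variables, domains, constraints

def satisfies(assignment, constraints):
    """True if the (complete) assignment violates no constraint."""
    return all(pred(*(assignment[v] for v in scope))
               for scope, pred in constraints)
```

A unary, binary, or ternary constraint differs only in the length of its scope tuple.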
Dual Representations - Example 1

Example: a three-by-three crossword puzzle with no clues (any valid words are legal).

Representation #1:
- six variables: one for each row and column.
- legal values: all possible three-letter words.
- nine binary constraints, one between each row and column: the words chosen must assign the same letter to their common square.

Representation #2:
- nine variables: one for each square of the puzzle.
- legal values: the 26 letters of the alphabet.
- six ternary constraints: the letters in each row and each column must constitute a legal three-letter word.

These two different representations of the problem are duals of each other.
Every CSP formulated as a set of variables, values, and constraints has a dual representation:
- Each variable of the original representation becomes a constraint in the dual representation.
- Each constraint of the original representation becomes a variable in the dual representation.
Dual Representations

(Figure: the variables of the original representation map to the constraints of the dual representation, and the constraints of the original map to the variables of the dual.)
Dual Representations - Example 2

Consider the dual representation of a graph coloring problem:
- variables: the edges of the original graph.
- values: ordered pairs of colors.
- constraints: whenever more than one edge is connected to the same node, they must all assign the same color to that node.
The dual of the dual representation of a CSP is the original representation.

The choice of representation can affect the efficiency of solving the problem - it may be easier to solve the dual than the original problem. For example, the crossword puzzle is likely to be easier to solve using the first representation.
Brute-Force Search

Brute-force approach: try all possible assignments of values to the variables, checking each against the constraints and rejecting those that violate any of them.

If there are n variables and k different values, there are k^n possible assignments.
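The brute-force approach can be sketched in a few lines. The constraint format - a scope plus a predicate - is a hypothetical choice for this sketch, not from the notes:

```python
# Brute-force CSP search: enumerate every assignment and keep those that
# satisfy all constraints. With n variables of k values each, the product
# loop below enumerates exactly k^n candidate assignments.
from itertools import product

def brute_force(variables, domains, constraints):
    solutions = []
    for values in product(*(sorted(domains[v]) for v in variables)):
        assignment = dict(zip(variables, values))
        if all(pred(*(assignment[v] for v in scope))
               for scope, pred in constraints):
            solutions.append(assignment)
    return solutions
```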
Chronological Backtracking

A much better approach to CSPs is chronological backtracking:
- Select an order for the variables and the values.
- Assign values to the variables one at a time.
- Make each assignment so that all constraints involving any of the variables already assigned are satisfied.

This allows large parts of the search tree to be pruned. If a variable has no legal assignment, the last variable that was assigned is reassigned to its next legal value.

The algorithm continues until either a complete, consistent assignment is found (success), or all possible assignments are shown to violate some constraint (failure). The search can be continued after a success in order to find all possible assignments.
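A minimal sketch of chronological backtracking for N-queens, with one variable per row whose value is the queen's column (the helper names are my own):

```python
# Chronological backtracking for N-queens. A value (column) is accepted only
# if it is consistent with all earlier assignments; when no column is legal,
# returning from the recursive call backtracks to the previous row.

def n_queens(n):
    solutions = []
    cols = []  # cols[r] = column of the queen in row r

    def consistent(row, col):
        # no shared column, no shared diagonal with any earlier row
        return all(c != col and abs(c - col) != row - r
                   for r, c in enumerate(cols))

    def extend(row):
        if row == n:
            solutions.append(list(cols))   # complete, consistent assignment
            return
        for col in range(n):
            if consistent(row, col):
                cols.append(col)
                extend(row + 1)
                cols.pop()                 # undo and try the next value

    extend(0)
    return solutions
```

Continuing past the first success, as above, enumerates all solutions.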
Chronological Backtracking

(Figure: the tree generated to solve the Four-Queens problem.)
Let us look at the tree generated by chronological backtracking to find all solutions to the Four-Queens problem. There are 4 variables and 4 possible values.

Chronological backtracking generates only 16 nodes, in contrast to the brute-force algorithm, which would generate 256 nodes.

The maximum depth of the tree is the number of variables - 4 in the example. The branching factor of a node is the number of legal values for the given variable - at most 4 in the example.

Since these trees have a fixed depth, and all the solutions are at the maximum depth, simple depth-first search can be used.
Intelligent Backtracking

We can improve the performance of chronological backtracking using a number of techniques, such as:
- variable ordering
- backjumping
- forward checking
Variable Ordering

The idea of variable ordering is to order the variables from most constrained to least constrained. In general, the variables should be instantiated in increasing order of the size of their remaining domains.

This can be done statically, at the beginning of the search, or dynamically, reordering the remaining variables each time a variable is assigned.
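Both variants can be sketched as one-liners over the remaining domains (the helper names are hypothetical):

```python
# Variable ordering: instantiate variables in increasing order of the size
# of their remaining domains ("most constrained first").

def static_order(variables, domains):
    """Static ordering: computed once, before the search starts."""
    return sorted(variables, key=lambda v: len(domains[v]))

def pick_next(unassigned, domains):
    """Dynamic ordering: pick the unassigned variable with the smallest
    remaining domain each time a variable is to be assigned."""
    return min(unassigned, key=lambda v: len(domains[v]))
```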
Value Ordering

The order of the values determines the order in which the tree is searched. It does not affect the tree size, and makes no difference if all the solutions are to be found; but if only one solution is required, it can decrease the time needed to find it.

We should order the values from least constraining to most constraining, in order to minimize the time required to find a first solution.
Backjumping

The idea: when a dead end is reached, instead of simply undoing the last decision made, modify the decision that actually caused the failure.
Forward Checking

The idea: when a variable is assigned a value, check each remaining uninstantiated variable to make sure there is at least one assignment for it that is consistent with all the currently instantiated variables. If not, assign the current variable its next value.
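A sketch of forward checking restricted to binary constraints; the `(pair) -> predicate` constraint format is an assumption of this sketch:

```python
# Forward checking: after tentatively assigning `var = value`, prune each
# neighbour's domain to the values consistent with that assignment. An
# emptied domain signals that `value` must be rejected.

def forward_check(var, value, domains, binary_constraints):
    """binary_constraints: dict (u, v) -> predicate(u_val, v_val).
    Returns a pruned copy of domains, or None if some domain empties."""
    new_domains = {v: set(d) for v, d in domains.items()}
    new_domains[var] = {value}
    for (u, v), pred in binary_constraints.items():
        other = None
        if u == var:
            new_domains[v] = {w for w in new_domains[v] if pred(value, w)}
            other = v
        elif v == var:
            new_domains[u] = {w for w in new_domains[u] if pred(w, value)}
            other = u
        if other is not None and not new_domains[other]:
            return None   # no consistent assignment left for `other`
    return new_domains
```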
Constraint Recording

In CSP problems some constraints are explicitly specified, and others are implied by the explicit constraints. Implicit constraints may be discovered during backtracking, or in advance in a preprocessing phase.

The idea: once these implicit constraints are discovered, save them explicitly so that they do not have to be rediscovered.
Constraint Recording - Example

The constraint graph in the figure consists of:
- 3 variables (x, y, z);
- a unary constraint on each variable: x can only be assigned the values a and b, y only c and d, and z only e and f;
- a binary constraint between each pair of variables: x and y can only be assigned (b,c) or (b,d), x and z only (b,e) or (a,f), and y and z only (c,e) or (d,f).
Arc Consistency

For each pair of variables X and Y related by a binary constraint, we remove from the domain of X any value that does not have at least one consistent assignment to Y, and vice versa. Several iterations are often required to achieve complete arc consistency; the algorithm terminates when no additional values are removed in an iteration over all the arcs.

Arc Consistency - Example

Because of the constraint between variables x and y, if variable x is assigned a, then there is no consistent value that can be assigned to variable y. Thus value a can be deleted from the domain of variable x, since no solution to the problem can assign a to x.
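The remove-until-fixpoint procedure described above can be sketched as follows; binary constraints are given as predicates over value pairs, a format of my own choosing:

```python
# Arc consistency: repeatedly delete from each domain any value with no
# consistent partner across a binary constraint, until a full pass over
# all the arcs removes nothing (the termination condition above).

def arc_consistency(domains, binary_constraints):
    domains = {v: set(d) for v, d in domains.items()}
    changed = True
    while changed:
        changed = False
        for (u, v), pred in binary_constraints.items():
            for a in set(domains[u]):       # iterate over a copy
                if not any(pred(a, b) for b in domains[v]):
                    domains[u].discard(a)
                    changed = True
            for b in set(domains[v]):
                if not any(pred(a, b) for a in domains[u]):
                    domains[v].discard(b)
                    changed = True
    return domains
```

On the x-y constraint from the example above, the value a is deleted from the domain of x because it has no consistent partner in the domain of y.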
Path Consistency

Path consistency is a generalization of arc consistency: instead of considering pairs of variables, we examine triples of constrained variables and the paths between them.

Path Consistency - Example

Without performing arc consistency first, consider the combinations of assignments to variables x and y that are allowed by the constraint between them. In particular, if x=b and y=d, then there is no possible assignment to variable z that is consistent with these two assignments. Thus we can remove the ordered pair (b,d) from the set of pairs allowed by the constraint between x and y.
Arc and Path Consistency

By performing arc or path consistency, the resulting search space can be dramatically reduced; in some cases this eliminates the need for search entirely.

The technique can be generalized to larger groupings of variables, called k-consistency. The complexity of the consistency checks is polynomial in k. At some point, higher-order consistency checks become less effective than backtracking search. In practice, arc consistency is almost always worth doing, and path consistency is often worth performing prior to backtracking search.
Heuristic Repair

The idea: search a space of inconsistent but complete assignments to the variables, until a consistent complete assignment is found.

Example: in the N-queens problem, this amounts to placing all N queens on the board at once, and then moving queens one at a time until a solution is found. The heuristic, called min-conflicts, is to take a queen that is in conflict with the most other queens and move it to a position where it conflicts with the fewest other queens.
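A sketch of min-conflicts for N-queens. One deviation from the description above: this sketch repairs a randomly chosen conflicted queen (a common variant) rather than the most-conflicted one, and all helper names are my own:

```python
# Min-conflicts heuristic repair for N-queens: start from a complete random
# placement (one queen per row), then repeatedly move a conflicted queen to
# the column in its row with the fewest conflicts.
import random

def conflicts(cols, row, col):
    """Number of queens attacking a queen placed at (row, col)."""
    return sum(1 for r, c in enumerate(cols)
               if r != row and (c == col or abs(c - col) == abs(r - row)))

def min_conflicts(n, max_steps=10000, seed=0):
    rng = random.Random(seed)
    cols = [rng.randrange(n) for _ in range(n)]   # complete, maybe inconsistent
    for _ in range(max_steps):
        conflicted = [r for r in range(n) if conflicts(cols, r, cols[r]) > 0]
        if not conflicted:
            return cols                           # consistent assignment found
        row = rng.choice(conflicted)
        cols[row] = min(range(n), key=lambda c: conflicts(cols, row, c))
    return None                                   # gave up (incomplete method)
```

Returning None after `max_steps` reflects the incompleteness discussed below: the method is not guaranteed to terminate with a solution.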
Advantages of Heuristic Repair

While backtracking techniques can solve problems on the order of 100 queens, heuristic repair can solve million-queen problems, often with only about 50 individual queen moves.

Note: this strategy has been extensively explored in the context of boolean satisfiability, where it is referred to as GSAT.

Disadvantages of Heuristic Repair

The main drawback of this approach is that it is not complete, in the sense that it is not guaranteed to find a solution in a finite amount of time even if one exists. If there is no solution, these algorithms run forever, whereas backtracking will eventually discover that the problem is not solvable.
CSP - Conclusion

While CSP problems appear somewhat different from single-agent path-finding and two-player games, there is a strong similarity among the algorithms employed. For example, backtracking can be viewed as a form of branch-and-bound, and heuristic repair can be viewed as a heuristic search with the same evaluation function and goal state.
Parallel Search Algorithms

Search is a very computation-intensive process. As a result, there is motivation to apply the most powerful computers available to the task, and in terms of raw computing cycles, the most powerful machines are parallel processors.

Thus we turn our attention to parallel search algorithms, of three general classes:
- parallel node generation and evaluation
- parallel window search
- tree-splitting algorithms.

Space complexity is addressed by iterative and local search algorithms; time complexity is addressed by distributing the work across many processors.
Parallel Node Generation

In this approach we parallelize the generation and evaluation of each node. For example, the Deep Blue chess machine uses parallel custom VLSI hardware to generate the next legal moves and to apply the heuristic evaluation function to each node.
Parallel Node Generation - Example

Algorithm: BFS. Number of processors: 5. Problem: get from A to F.

(Figure: the BFS tree over nodes A, B, C, D, E, F; at each level, the node generations are distributed among processors P.1, P.2, and P.3.)
Parallel Node Generation - Limitations

The technique is inherently domain-specific, and the total amount of parallelism is limited by the domain: there will always be a maximum amount of parallelism that can be extracted, and beyond this amount additional processors will not speed up the search any further. (In the example, although there were 5 processors, we eventually used only 3 of them.)
Parallel Window Search

In this approach we give different processors different guesses of the goal, cutting off many iterations of wrong guesses. This approach is mainly used in alpha-beta and IDA* algorithms.

Parallel Window Search in Alpha-Beta

The main idea: the closer the alpha and beta values are to the final result, the more branches are cut off in the search.

The implementation: given multiple processors, we can divide the range from -infinity to +infinity into a series of contiguous but smaller alpha-beta windows, and give each processor the entire tree to search from the same root, but with different initial values of alpha and beta. One of them will return the actual minimax value, and faster than usual.
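A toy sketch of the scheme: a fail-hard alpha-beta over a nested-list game tree, with each worker given a different window. The tree, the window split, and the "strictly inside its window" test are illustrative choices, not from the notes:

```python
# Parallel window search: each worker runs the same alpha-beta search from
# the same root with a different initial (alpha, beta) window. The worker
# whose window straddles the true minimax value returns it exactly; the
# others fail high or low and return a bound instead.
from concurrent.futures import ThreadPoolExecutor

def alphabeta(node, alpha, beta, maximizing):
    if not isinstance(node, list):          # leaf: a plain number
        return node
    if maximizing:
        value = alpha
        for child in node:
            value = max(value, alphabeta(child, value, beta, False))
            if value >= beta:
                break                        # beta cutoff
        return value
    value = beta
    for child in node:
        value = min(value, alphabeta(child, alpha, value, True))
        if value <= alpha:
            break                            # alpha cutoff
    return value

def parallel_window(tree, windows):
    with ThreadPoolExecutor(len(windows)) as pool:
        results = list(pool.map(
            lambda w: (w, alphabeta(tree, w[0], w[1], True)), windows))
    for (lo, hi), v in results:
        if lo < v < hi:                      # true value lies in this window
            return v
```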
Parallel Window Search in Alpha-Beta - Example 1

(Figure: a two-level game tree with leaf values 3, 12, 8, 2, 4, 6, 14, 5, 2, searched with the full initial window alpha = -infinity, beta = +infinity.)

With the original alpha-beta window we cut off only 2 branches.
Parallel Window Search in Alpha-Beta - Example 2

(Figure: the same tree, searched by the 3rd processor with its narrowed window.)

If we divide the range into 4 slices - (-inf, 0), (0, 3), (3, 20), (20, inf) - then the 3rd processor reaches the result, cutting off 4 branches (2 more than before).
Parallel Window Search in Alpha-Beta - Limitations

The limit of this form of parallelism: even with the tightest possible window around the actual minimax value, i.e. alpha = beta = the final result, the complexity is still O(b^(d/2)), where b is the branching factor and d is the search depth.
Parallel Window Search in IDA*

The main idea: the closer the cutoff threshold is to the final result, the fewer nodes are expanded.

The implementation: given multiple processors, we initialize each processor with a different cutoff threshold, while each of them searches the entire tree. As soon as a processor completes its assigned iteration, it begins the iteration with the next cutoff threshold not yet assigned to a processor.
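The threshold-queue idea can be sketched generically; `bounded_search` stands in for a cost-limited DFS and is a hypothetical callback, not an implementation from the notes:

```python
# Parallel window IDA* driver: workers take successive cutoff thresholds;
# with more thresholds than workers, pool.map naturally hands each finishing
# worker the next unassigned threshold. Returns the smallest successful
# threshold together with whatever its bounded search found.
from concurrent.futures import ThreadPoolExecutor

def parallel_ida(bounded_search, thresholds, workers=4):
    """bounded_search(threshold) -> solution or None."""
    with ThreadPoolExecutor(workers) as pool:
        for threshold, found in zip(
                thresholds, pool.map(bounded_search, thresholds)):
            if found is not None:
                return threshold, found
    return None
```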
Parallel Window Search in IDA* - Example

Heuristic function: Manhattan distance.

Start state:    Goal state:
 7 5 3           7 6 5
 8 2 4           8 . 4
 1 6 .           1 2 3

(The "." marks the blank; the Manhattan distance of the start state from the goal is 6.)
Parallel Window Search in IDA* - Example

Cutoff threshold = 6: only the start state (f = 0+6 = 6) is expanded; its two children both have f = 1+7 = 8 and are cut off.
Parallel Window Search in IDA* - Example

Cutoff threshold = 8: the states with f = 1+7 = 8, 2+6 = 8, and 3+5 = 8 are expanded, while the states with f = 2+8 = 10, 3+7 = 10, and 4+6 = 10 are cut off.
Parallel Window Search in IDA* - Example

If we continue with threshold = 12 (also skipping threshold = 10), that iteration generates only 22 nodes, and the process with threshold = 12 finds the solution at very nearly the same time as the process with threshold = 8 completes its iteration.
Parallel Window Search in IDA* - Example

Cutoff threshold = 12 (threshold 10 skipped):

(Figure: the threshold-12 iteration; branches with f-values up to 12 are expanded, branches with f = 14 are cut off, and the goal is reached at depth 12 with f = 12+0 = 12.)
Parallel Window Search in IDA* - Example Summary

IDA* with one processor would build the iterations for thresholds 6, 8, 10, and 12 - about 20-30 extra nodes.

IDA* with a parallel window and 4 processors would start the iterations (6, 8, 10, 12) simultaneously, and the threshold-12 processor would return the result - the time of 3 iterations would be saved.
Parallel Window Search in IDA* - Limitations

The parallel window mainly saves the time of building the last iteration before the correct cutoff threshold. In the case of no solution, parallel window search only slightly speeds up the building of all the iterations.
Tree-Splitting Search

The idea is to parallelize a heuristic search by having different processors search different parts of the search tree. Two main algorithms:
- Distributed Tree Search (DTS)
- Branch-and-Bound Pruning
Distributed Tree Search - Properties

- It is designed for an unlimited number of processors.
- It does not rely on central control or shared memory.
- It can be used for load balancing on irregular trees.
Distributed Tree Search - The Algorithm

Down the tree:
- All the processors are assigned to the root. One of the processors expands the root node, generating each of its children and spawning a separate process for each child.
- It then allocates processors to the child processes according to an allocation strategy.
- All the child processes that were assigned at least one processor start executing, each on one of its assigned processors.
- The root process goes to sleep, awaiting a message from one of its child processes.
Up the tree:
- When one of the processes has obtained the result of its subtree, it sends a message to the process associated with its parent node.
- If it is the last child, the parent process computes its result and returns it to its own parent. Otherwise, the parent saves the child's result and assigns the child's processors to another unassigned child.
Distributed Tree Search - Example

Problem: get from A to F. Algorithm: BFS with DTS.

(Figure: the search tree over nodes A, B, C, D, E, F, G, I.)
(Figure: processor allocation down the tree - the root A is associated with p.1 and assigned p.1-p.5; B is associated with p.1 and assigned p.1 and p.2; C is associated with p.3 and assigned p.3, p.4, and p.5; and so on down to single-processor leaf nodes.)
DTS - Limitations

- It can only be used when the results of one subtree do not affect the search of other subtrees.
- A poor allocation strategy can destroy the parallelism.
Branch-and-Bound Pruning

The difficulty: in such algorithms, whether a branch of the tree is searched at all depends on results from other parts of the tree.

The algorithm: as in DTS, we use a breadth-first allocation of the p processors down to the point of one processor per node. Then each processor proceeds with a parallel depth-first search, using a phase that combines the current results.
Branch-and-Bound Pruning - Example

(Figure: a two-level game tree with leaf values 3, 12, 8, 2, 4, 6, 14, 5, 2.)

Find the max move with the alpha-beta algorithm. Note: with only 2 levels, only the beta value is important.
Branch-and-Bound Pruning - Example

(Figure: the same tree, leaf values 3, 12, 8, 2, 4, 6, 14, 5, 2.)

We have already seen that the original serial alpha-beta algorithm cuts off only 2 branches here.
Branch-and-Bound Pruning - Example

(Figure: the same tree searched by 3 processors, one per subtree, each maintaining its own beta bound and then combining results.)

By using 3 processors we cut off 4 branches.
Branch-and-Bound Pruning - Limitations

The number of processors will usually be much smaller than the number of leaf nodes in the tree. We cannot parallelize an algorithm with more than one bound; in the example, this approach would not help if we needed to search more levels of alpha-beta.
Analysis of Parallel Branch-and-Bound

Let b be the branching factor, d the search depth, and p the number of processors. Let b_x denote the effective branching factor of the serial algorithm, so that the overall serial time is b_x^d.
The search is divided into 3 steps:
1. Distribute the p processors using breadth-first allocation, until there is one processor per node. This occurs at level log_b(p).
2. Each processor then searches its own subtree. The depth of each subtree is d - log_b(p), so this step takes time b_x^(d - log_b(p)).
3. Propagate the results back up from all the processors, again over log_b(p) levels.

Total time: log_b(p) + b_x^(d - log_b(p)).
The speedup of parallel branch-and-bound:

Total serial time: b_x^d
Total parallel time: b_x^(d - log_b(p)) + 2 log_b(p) ~ b_x^(d - log_b(p))
Speedup: b_x^d / b_x^(d - log_b(p)) = b_x^(log_b(p)) = p^x, where b_x = b^x.
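A quick numeric check of these formulas; the concrete values b = 2, b_x = 2, d = 20, p = 16 are illustrative:

```python
# Numeric check of the parallel branch-and-bound analysis:
# total parallel time ~ log_b(p) + b_x ** (d - log_b(p)),
# versus serial time b_x ** d.
import math

def parallel_time(b, b_x, d, p):
    split = math.log(p, b)            # level at which processors run out
    return split + b_x ** (d - split)

b, b_x, d, p = 2, 2.0, 20, 16
serial = b_x ** d
parallel = parallel_time(b, b_x, d, p)
# With b_x == b (x = 1), the predicted speedup p^x = p; the distribution
# overhead makes the measured ratio come out just under p.
speedup = serial / parallel
```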
Summary

Parallelism can save a lot of time. Three techniques were presented; each can be used, depending on the problem and the search algorithm. There are still some search algorithms that we cannot parallelize today.