A Space-Efficient Randomized DNA Algorithm for k-SAT Kevin Chen and Vijay Ramachandran Princeton University

A Space-Efficient Randomized DNA Algorithm for k-SAT

Kevin Chen and Vijay Ramachandran

Princeton University

DNA Computing

• DNA can store a lot of information!
  – A DNA sequence is a bitstring
      5’-A-T-T-G-C-A-T-G-C-A-3’
      3’-A-C-G-T-A-C-G-T-T-G-5’
  – A test tube can hold approximately 2^50 strands of DNA
• Important chemical properties
  – Watson-Crick complementary hybridization
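A minimal sketch of the two ideas on this slide: a strand read as a bitstring, and Watson-Crick pairing. The 2-bit encoding and the helper names (`to_bits`, `reverse_complement`, `hybridizes`) are assumptions made for illustration and do not come from the slides.

```python
# Watson-Crick pairing: A binds T, C binds G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

# Arbitrary 2-bit encoding per base (an assumption; any fixed mapping would do).
BITS = {"A": "00", "C": "01", "G": "10", "T": "11"}


def to_bits(strand: str) -> str:
    """Read a strand (written 5'->3') as a bitstring, two bits per base."""
    return "".join(BITS[base] for base in strand)


def reverse_complement(strand: str) -> str:
    """Return the partner strand, also written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(strand))


def hybridizes(a: str, b: str) -> bool:
    """True if b is the exact Watson-Crick partner of a."""
    return b == reverse_complement(a)
```

For example, `to_bits("ACGT")` gives `"00011011"`, so a strand of length L carries 2L bits under this encoding.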

DNA Computing

• Massive parallelism
  – Each strand is a “processor”
  – Operations can be performed on all strands in a test tube with one laboratory step
• Power of large database searching
  – Use DNA to store lots of information, use chemical properties to pull out what we’re looking for

Examples of DNA Algorithms

• Hamiltonian Path, Adleman, 1994
  – DNA sequences represent paths through a graph that self-assemble
• SAT, Lipton, 1995
  – Strands encode candidate solutions; generate all of them and filter to keep the ones that work
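The generate-and-filter idea behind Lipton's SAT approach can be imitated in software. This is only a sketch under assumptions: clauses are lists of signed integers (3 means x_3, -3 means its negation), the "tube" is a Python list, and `lipton_filter` is a hypothetical name. It shows why the space cost is 2^n: every candidate exists before any filtering happens.

```python
from itertools import product


def lipton_filter(n, clauses):
    """Generate all 2^n assignments, then keep those satisfying every clause."""
    # The "tube": one tuple of truth values per candidate assignment.
    tube = list(product([False, True], repeat=n))
    for clause in clauses:
        # One filtering pass per clause, applied to the whole tube at once.
        tube = [
            assignment
            for assignment in tube
            if any(assignment[abs(lit) - 1] == (lit > 0) for lit in clause)
        ]
    return tube


# (x1 or x2) and (not x1 or x3) and (not x2 or not x3)
solutions = lipton_filter(3, [[1, 2], [-1, 3], [-2, -3]])
```

Here `solutions` holds the two satisfying assignments, (F, T, F) and (T, F, T).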

Randomized Algorithms

Types Of Algorithms

• Monte Carlo: Answer can be incorrect (e.g., primality testing)

• Las Vegas: Answer always correct but running time is probabilistic (e.g., Quicksort)

For more information: Motwani & Raghavan, Randomized Algorithms. Cambridge UP, 1995.

Examples of Randomized Algorithms

Algorithm          | Best known non-randomized running time | Best known randomized running time
Quicksort          | O(n^2)                                 | O(n log n)
Primality Testing  | (log n)^{O(log log log n)}             | O(log^3 n)
3-SAT              | O(2^{0.582n})                          | O(2^{0.446n})

Paturi’s k-SAT Algorithm (1997)

1. While there are unassigned variables left:
   – Choose a random variable x_i
   – If x_i is forced, set x_i as required, else assign x_i to T or F at random

2. Test if the assignment is a solution

3. Repeat steps 1 and 2, I times, where the error probability is e^{-I · 2^{-(1 - 1/k)n}}
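The three steps above can be sketched in Python. This is an illustrative reading, not the authors' code: clauses are lists of signed integers, a variable counts as "forced" when some clause containing it has all its other literals already assigned false, and the function names are hypothetical.

```python
import random


def forced_value(var, clauses, assignment):
    """Return the value var must take, or None if no clause forces it."""
    for clause in clauses:
        for lit in clause:
            if abs(lit) != var:
                continue
            others = [l for l in clause if abs(l) != var]
            # Forced if every other literal is already assigned and false.
            if all(abs(l) in assignment and assignment[abs(l)] != (l > 0)
                   for l in others):
                return lit > 0
    return None


def satisfies(clauses, assignment):
    """Step 2: check whether a full assignment satisfies every clause."""
    return all(any(assignment[abs(l)] == (l > 0) for l in clause)
               for clause in clauses)


def paturi_trial(n, clauses, rng):
    """Step 1: assign variables in random order, obeying forced values."""
    assignment = {}
    order = list(range(1, n + 1))
    rng.shuffle(order)
    for var in order:
        value = forced_value(var, clauses, assignment)
        if value is None:
            value = rng.random() < 0.5  # unforced: fair coin flip
        assignment[var] = value
    return assignment


def paturi_ksat(n, clauses, iterations, seed=0):
    """Step 3: repeat up to `iterations` times; None means 'Unsatisfiable'."""
    rng = random.Random(seed)
    for _ in range(iterations):
        assignment = paturi_trial(n, clauses, rng)
        if satisfies(clauses, assignment):
            return assignment
    return None
```

A `None` result is only probably correct: a satisfiable formula can still be missed if `iterations` is too small.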

Paturi’s Algorithm (continued)

• Randomized version of Davis-Putnam (1960)
• Running time is O(n^2 m · 2^{n - n/k}) for a k-SAT instance with n variables and m clauses

• Monte Carlo algorithm: may output ‘Unsatisfiable’ when a solution exists

Objective

• Adapt Paturi’s algorithm to Adleman and Lipton’s DNA extract model

• Reduce space complexity, since not all 2^n solutions are generated

Related Work

• Lipton (1995)
  – First DNA algorithm for formula-SAT
  – O(n) time and O(2^n) space

• Ogihara and Ray (1997)
  – Implemented Monien-Speckenmeyer algorithm for 3-SAT
  – O(nm^2 + n^2) time and O(2^{0.6942n}) space

• Diaz, Esteban and Ogihara (2000)

Model of Computation

• Extended version of the extract model with the APPEND primitive

• 2n+3 well-behaved words, needed to form assignment sequences to build solution strands

[Figure: assignment sequence for “x_2 = False”, with separator and primer sequences S and P. Segment labels: P, S, x(2)=F on one strand; S’, P’, S’, x(2)=F’ on the complementary strand.]

Allowable Operations

• COMBINE
• DETECT
• EXTRACT
• APPEND*†
• POUR*
• TO-SINGLE-STRANDED
• TO-DOUBLE-STRANDED

* Sources of randomness
† Generalization of Boneh’s APPEND

Implementation of APPEND

[Figure: APPEND. A partial assignment containing x_5 = T (segments H, P, S, x(5)=T, P, S over complements H’, P’, S’, x(5)=T’, P’) has an exposed sticky end. Two assignment sequences are added, one for x_2 = T and one for x_2 = F, each with its own P and S segments.]

• One sequence will anneal at this exposed sticky end
• Each sequence has a 50% chance of appending to our partial assignment
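The random outcome of APPEND can be mimicked in software. This sketch models only the result, not the chemistry: a strand is a tuple of (variable, value) pairs, and `append` is a hypothetical helper that attaches one of the offered assignment sequences to every strand, chosen uniformly.

```python
import random


def append(tube, var, values, rng):
    """Attach one assignment for `var` to each strand in the tube.

    `values` lists the assignment sequences poured in: [True, False]
    models the 50/50 case, [True] or [False] a forced assignment.
    """
    return [strand + ((var, rng.choice(values)),) for strand in tube]


rng = random.Random(0)
tube = [((5, True),)] * 1000            # partial assignments containing x5 = T
tube = append(tube, 2, [True, False], rng)
true_count = sum(1 for strand in tube if strand[-1] == (2, True))
```

With both sequences present, roughly half of the strands end in x2 = T, which is the source of randomness the algorithm relies on.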

Implementation of TO-DOUBLE-STRANDED

[Figure: original single strand with segments H’, P’, S’, x(5)=T’, P’, and the added primer complements P’.]

• Original strand
• Add complement of primer sequence, P’
• Add DNA polymerase, bases (A, T, C, G), and DNA ligase
• Polymerase will construct the complementary strand starting with the primer sequence

DNA Algorithm

• Start with a tube T containing I empty solution strands.

• While there are still unassigned variables, perform the following loop:

[Figure: tube T containing I strands, each with segments H, P, S.]

DNA Algorithm (continued)

• Pour T into n tubes
• Remove possible repeat assignments
• Check for forced variables
• Append appropriate assignment sequences

Pour out the contents of T into n tubes and associate a variable with each tube via a random permutation.

[Figure: T poured into tubes T_1 (x_6), T_2 (x_2), T_3 (x_4), …, T_n (x_5).]
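The pour step can be sketched as follows, assuming the tube is a Python list and that pouring splits the strands as evenly as possible. The name `pour` and the (variable, strands) pair layout are illustrative choices, not part of the slides.

```python
import random


def pour(tube, n, rng):
    """Split a tube into n tubes and label each with a distinct variable."""
    strands = list(tube)
    rng.shuffle(strands)              # which strand lands where is random
    variables = list(range(1, n + 1))
    rng.shuffle(variables)            # random permutation of x_1 .. x_n
    # Tube i receives every n-th strand starting at position i.
    return [(variables[i], strands[i::n]) for i in range(n)]


tubes = pour(range(10), 4, random.Random(1))
```

Each returned pair is (variable index, strands in that tube); no strand is lost or duplicated.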

If a strand already has an assignment for the variable associated with its tube, move it to the next tube and repeat.

[Figure: from tube T_1 (x_6), extract strands with “x_6 = T” or “x_6 = F”, add them to the next tube T_2 (x_2), and repeat.]
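A software sketch of removing repeat assignments, under stated assumptions: each strand is a dict from variable index to value, tubes are (variable, strands) pairs, and strands leaving the last tube wrap around to the first. The wrap-around and the name `remove_repeats` are assumptions; the slides only say "move it to the next tube and repeat".

```python
def remove_repeats(tubes):
    """Move strands that already assign their tube's variable onward."""
    n = len(tubes)
    variables = [var for var, _ in tubes]
    contents = [list(strands) for _, strands in tubes]
    for _ in range(n):                      # bounded number of passes
        moved = False
        for i in range(n):
            keep, move = [], []
            for strand in contents[i]:
                # EXTRACT: does this strand already carry x_i = T or x_i = F?
                (move if variables[i] in strand else keep).append(strand)
            if move:
                moved = True
                contents[i] = keep
                contents[(i + 1) % n].extend(move)
        if not moved:
            break
    return list(zip(variables, contents))


tubes = [(6, [{6: True}, {}]), (2, [{2: False}]), (4, [])]
cleaned = remove_repeats(tubes)
```

After the call, no strand sits in a tube whose variable it has already assigned, provided the strand has at least one unassigned variable.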

For each tube, partition the strands into three parts: forced to T, forced to F, not forced. Consider clause C = x_1 ∨ ¬x_2 ∨ x_3 ∨ …

[Figure: in tube T_i (x_3), extract strands with x_1 = F, then from those the strands with x_2 = T, etc.; strands that pass every extraction are forced to x_3 = T, and all others go to the “else” branch. Repeat for other clauses containing x_3.]
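The three-way partition can be sketched in software, assuming strands are dicts from variable index to value and clauses are lists of signed integers. `partition_forced` is a hypothetical name; the chain of extractions in the figure corresponds to the `all(...)` test over the other literals of a clause.

```python
def partition_forced(var, strands, clauses):
    """Split strands into (forced to T, forced to F, not forced) for `var`."""
    forced_true, forced_false, not_forced = [], [], []
    for strand in strands:
        value = None
        for clause in clauses:
            own = [lit for lit in clause if abs(lit) == var]
            if not own:
                continue  # clause does not mention var
            # A literal is false when its variable is assigned the opposite
            # value; unassigned variables (None) never count as false.
            others_false = all(
                strand.get(abs(lit)) == (lit < 0)
                for lit in clause if abs(lit) != var
            )
            if others_false:
                value = own[0] > 0
                break
        if value is True:
            forced_true.append(strand)
        elif value is False:
            forced_false.append(strand)
        else:
            not_forced.append(strand)
    return forced_true, forced_false, not_forced


clauses = [[1, -2, 3], [2, -3]]
strands = [{1: False, 2: True}, {1: True, 2: True}, {2: False}, {1: False}]
to_true, to_false, free = partition_forced(3, strands, clauses)
```

In this example the first strand is forced to x3 = T by the first clause, the third strand is forced to x3 = F by the second clause, and the other two are not forced.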

APPEND the correct assignment sequence for forced variables and assign a random value otherwise.

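[Figure: in each tube, strands forced to T are assigned True, strands forced to F are assigned False, and strands that are not forced receive a random assignment; then combine all.]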


• Combine all tubes and repeat while loop.

• After n iterations (all variables have been assigned on each strand), check if any of the assignment strands are solutions to the formula using Lipton’s algorithm.

[Figure: tube T passes through Lipton’s algorithm to yield Solutions.]

Analysis of Algorithm

On an n-variable, m-clause instance of k-SAT, if the error probability is e^{-α}, our algorithm has space complexity:

O(2^{(1 - 1/k)n + log α})

The analysis is the same as the time complexity analysis in Paturi (1997, 1998).
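A short derivation sketch of how the strand count relates to the error probability. It assumes that each strand acts as one independent trial of Paturi's algorithm, that a single trial succeeds with probability p ≥ 2^{-(1-1/k)n} on a satisfiable instance, and that log is taken base 2; I denotes the number of strands.

```latex
\begin{align*}
\Pr[\text{all } I \text{ strands fail}]
  &\le (1 - p)^{I} \le e^{-pI},
  \qquad p \ge 2^{-(1 - 1/k)n}, \\
I = \alpha \cdot 2^{(1 - 1/k)n} = 2^{(1 - 1/k)n + \log \alpha}
  &\implies pI \ge \alpha
  \implies \Pr[\text{all } I \text{ strands fail}] \le e^{-\alpha}.
\end{align*}
```

So 2^{(1 - 1/k)n + log α} strands are enough to push the error probability down to e^{-α}, which is far fewer than the 2^n strands needed when every assignment is generated.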

Comparison of DNA k-SAT Algorithms

Algorithm     | Type          | Space                      | Time             | max(n), k = 3
Lipton 1995   | k-SAT         | O(2^n)                     | O(km)            | 50
Ogihara 1997  | 3-SAT only    | O(2^{0.6942n})             | O(n^2 + nm^2)    | 72
CR 1999       | Random k-SAT  | O(2^{(1 - 1/k)n + log α})  | O(n^2 + k^2 nm)  | 75
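The max(n) column can be checked with a few lines of arithmetic. The sketch assumes a tube holds about 2^50 strands, ignores constant factors, and drops the log α term (that is, α = 1); `max_variables` is a hypothetical helper. Exact fractions avoid floating-point rounding at the boundary.

```python
from fractions import Fraction
from math import floor

TUBE_EXPONENT = 50  # assumption: a tube holds about 2^50 strands


def max_variables(exponent_per_variable, tube_exponent=TUBE_EXPONENT):
    """Largest n with exponent_per_variable * n <= tube_exponent."""
    return floor(Fraction(tube_exponent) / exponent_per_variable)


k = 3
limits = {
    "Lipton 1995": max_variables(Fraction(1)),              # space 2^n
    "Ogihara 1997": max_variables(Fraction(6942, 10000)),   # space 2^{0.6942n}
    "CR 1999": max_variables(1 - Fraction(1, k)),           # space 2^{(1-1/k)n}
}
```

The resulting limits are 50, 72 and 75 variables, matching the table.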

Conclusion

• Our algorithm reduces space complexity while only requiring O(n) words

• This is the first known attempt to harness the tools of randomized classical algorithms in DNA computing

• Further gains may be obtainable by adapting the algorithm of Paturi (1998)
