XFA: Faster Signature Matching With Extended Automata
Authors: Randy Smith, Cristian Estan, and Somesh Jha
Publisher: IEEE Symposium on Security and Privacy, 2008
Presenter: Yu-Hao Tseng
Date: 2014/01/15
Outline
• Introduction
• Technical overview
• Build XFAs from Regex
• Experimental Results
Introduction
• The paper's primary goal is to improve the time and space efficiency of signature matching in network intrusion detection systems (NIDS).
• To achieve this, the authors introduce extended finite automata (XFAs), which augment traditional FSAs with a finite scratch memory used to remember various types of information relevant to the progress of signature matching.
Technical overview
• For NIDS signatures, REs overlap or subsume each other
  • Matching progress is interleaved
  • Many distinct combinations of reachable states
• Two signatures
  • where all and are distinct strings
  • which consists of all strings of length n
Technical overview (Cont)
• where all and are distinct strings
Technical overview (Cont)
• where all and are distinct strings
  • use a single bit of scratch memory
Technical overview (Cont)
• which consists of all strings of length n
  • use a counter
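The counter idea can be sketched as follows. This is a minimal illustration, not the paper's construction: it assumes the language is "all inputs of length exactly n", and the function name `accepts_length_n` is hypothetical. A plain DFA needs a separate state for every length it has to distinguish, while the sketch keeps a single control state and one bounded counter as scratch memory.

```python
def accepts_length_n(data: bytes, n: int) -> bool:
    """Accept exactly the inputs whose length is n, using one counter."""
    cnt = 0  # scratch memory: a single bounded counter
    for _ in data:
        # Saturate at n + 1 so the counter stays inside a finite domain.
        cnt = min(cnt + 1, n + 1)
    return cnt == n
```

The saturation step keeps the data domain finite, which matches the "finite scratch memory" requirement stated in the introduction.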
Technical overview (Cont)
• XFA = DFAs + auxiliary variables
  • Changes shape of automata
  • Tames state space explosion
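The state-space explosion can be illustrated on a toy case. The sketch below assumes k hypothetical signatures, each of which only has to remember whether one particular character has been seen; the function name `reachable_product_states` and the restriction to distinct single characters are assumptions made for illustration. Combining the k two-state DFAs by the product construction yields one state per combination of remembered facts, whereas scratch memory would need only k bits next to a single control state.

```python
def reachable_product_states(chars: str) -> int:
    """Count reachable states in the product of k two-state DFAs.

    DFA i remembers whether chars[i] has occurred so far. The characters
    in `chars` are assumed to be distinct.
    """
    start = tuple(False for _ in chars)
    seen = {start}
    frontier = [start]
    while frontier:
        state = frontier.pop()
        for sym in chars:
            # Each component DFA flips to "seen" on its own character.
            nxt = tuple(flag or (c == sym) for flag, c in zip(state, chars))
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return len(seen)


for k in (1, 3, 8):
    chars = "abcdefgh"[:k]
    print(k, "bits of scratch memory vs", reachable_product_states(chars), "DFA states")
```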
Build XFAs from Regex
• Annotating regular expressions
• Compiling to an XFA
  • From parse trees to NXFAs
  • From NXFAs to XFAs
    • ε-elimination
    • determinizing transitions
    • data determinization
• Finding efficient implementations
Build XFAs from Regex (Cont)
• Annotate Signature
  • New operators change the parse tree and add domain values
  • Parallel concatenation ( ) adds a bit
    • Breaks up the RE into string-like components
    • Set the bit when the left operand accepts
    • Test the bit when the right operand accepts
  • ex: abcd => abcd
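The set-bit / test-bit behaviour described above can be sketched as follows. This is an illustration under stated assumptions, not the paper's compiler output: `left` and `right` are hypothetical non-empty literal strings, no suffix of `left` overlaps a prefix of `right`, and a naive sliding window stands in for real string-matching automata.

```python
def match_left_then_right(data: bytes, left: bytes, right: bytes) -> bool:
    """Report a match when `right` occurs somewhere after `left`."""
    bit = False                      # scratch memory: one bit
    window = b""                     # last few bytes of input seen so far
    keep = max(len(left), len(right))
    for i in range(len(data)):
        window = (window + data[i:i + 1])[-keep:]
        # Test the bit when the right operand accepts.
        if bit and window.endswith(right):
            return True
        # Set the bit when the left operand accepts.
        if window.endswith(left):
            bit = True
    return False
```

The test happens before the set on each byte, so an occurrence of `right` that ends on the same byte as `left` does not count as coming "after" it.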
Build XFAs from Regex (Cont)
• Compile to XFA
  • Definitions
    • XFA is a 7-tuple (Q, D, Σ, δ, Uδ, (q0, d0), F)
      • Q is the set of states
      • Σ is the set of inputs (input alphabet)
      • δ: Q×Σ→Q is the transition function
      • D is the finite set of values in the data domain
      • Uδ: Q×Σ×D→D is the per-transition update function, which defines how the data value is updated on every transition
      • (q0, d0) is the initial configuration, which consists of an initial state q0 and an initial data value d0
      • F ⊆ Q×D is the set of accepting configurations
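The definition above maps fairly directly onto a small data structure. The sketch below is one possible encoding, not the authors' implementation: the class name `XFA`, its field names, and the choice to check acceptance only after the whole input has been consumed are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Hashable, Set, Tuple

State = Hashable
Data = Hashable


@dataclass
class XFA:
    delta: Dict[Tuple[State, str], State]        # transition function: Q x Sigma -> Q
    update: Callable[[State, str, Data], Data]   # update function: Q x Sigma x D -> D
    initial: Tuple[State, Data]                  # initial configuration (q0, d0)
    accepting: Set[Tuple[State, Data]]           # accepting configurations, subset of Q x D

    def run(self, text: str) -> bool:
        q, d = self.initial
        for sym in text:
            # The data value is updated on every transition, keyed on the
            # source state and the input symbol.
            d = self.update(q, sym, d)
            q = self.delta[(q, sym)]
        return (q, d) in self.accepting


# Hypothetical example: one state plus a saturating counter accepts the
# strings of length 2 over the alphabet {a, b}.
length_two = XFA(
    delta={("q", "a"): "q", ("q", "b"): "q"},
    update=lambda q, sym, d: min(d + 1, 3),
    initial=("q", 0),
    accepting={("q", 2)},
)
```

In this example the data domain is D = {0, 1, 2, 3}; the saturating update keeps it finite, as the definition requires.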
Build XFAs from Regex (Cont)
• Compile to XFA
  • Definitions
    • NXFA is a 7-tuple (Q, D, Σ, δ, Uδ, QD0, F)
      • Q is the set of states
      • Σ is the set of inputs (input alphabet)
      • δ ⊆ Q×(Σ∪{ε})×Q is the nondeterministic relation describing the allowed transitions
      • D is the finite set of values in the data domain
      • Uδ: δ → 2^(D×D) is the nondeterministic update function (or update relation), which defines how the data value is updated on every transition
      • QD0 ⊆ Q×D is the set of initial configurations of the NXFA
      • F ⊆ Q×D is the set of accepting configurations
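One way to read the NXFA definition is as a machine that tracks a set of (state, data) configurations. The sketch below is an assumption-laden illustration rather than the paper's algorithm: the update relation is stored as a dictionary whose keys double as the transition relation, `None` stands for ε, and all names (`step`, `eps_closure`, `nxfa_accepts`, `UPDATE`, `INITIAL`, `ACCEPTING`) are hypothetical.

```python
from typing import Dict, Hashable, Optional, Set, Tuple

Config = Tuple[Hashable, Hashable]                     # (state, data value)
Transition = Tuple[Hashable, Optional[str], Hashable]  # (src, symbol or None for ε, dst)
UpdateRel = Dict[Transition, Set[Tuple[Hashable, Hashable]]]  # transition -> {(d, d')}


def step(configs: Set[Config], update: UpdateRel, sym: Optional[str]) -> Set[Config]:
    """Configurations reachable by one transition labelled `sym`."""
    out: Set[Config] = set()
    for (src, label, dst), pairs in update.items():
        if label != sym:
            continue
        for d_old, d_new in pairs:
            if (src, d_old) in configs:
                out.add((dst, d_new))
    return out


def eps_closure(configs: Set[Config], update: UpdateRel) -> Set[Config]:
    """Add everything reachable through ε-transitions alone."""
    closure = set(configs)
    frontier = set(configs)
    while frontier:
        frontier = step(frontier, update, None) - closure
        closure |= frontier
    return closure


def nxfa_accepts(text: str, update: UpdateRel,
                 initial: Set[Config], accepting: Set[Config]) -> bool:
    current = eps_closure(initial, update)
    for sym in text:
        current = eps_closure(step(current, update, sym), update)
    return bool(current & accepting)


# Hypothetical two-state NXFA whose data value is a single bit.
UPDATE: UpdateRel = {
    (0, "a", 0): {(0, 0), (1, 1)},   # stay in state 0, keep the bit
    (0, "a", 1): {(0, 1), (1, 1)},   # move to state 1, set the bit
    (1, None, 0): {(0, 0), (1, 1)},  # ε-move back to state 0, keep the bit
}
INITIAL: Set[Config] = {(0, 0)}
ACCEPTING: Set[Config] = {(0, 1)}
```

Because both Q and D are finite, the set of configurations is finite and the closure loop terminates.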
Build XFAs from Regex (Cont)
• Compile to XFA
  • From parse trees to NXFAs
    • Ex: ab[^a]1 => ab[^a]1
[Figure: parse tree and resulting NXFA for the example. The NXFA has states 0 to 4 and starts with bit = 0, cnt = 0; its transitions are labeled Σ, a, b, [^a], and ε, and carry the actions "bit = 1", "cnt++", and "if (bit == 1 && cnt == 1) accept()".]
Build XFAs from Regex (Cont)
• Compile to XFA
  • Alg 1: ε-elimination for NXFAs
Build XFAs from Regex (Cont)
• Compile to XFA
  • Alg 2: determinizing transitions for NXFAs
[Figure: worked example built up step by step. Final step: Q′ = {{A, C}, {A, B, C}}; F′ = {({A, C}, (C, 1)), ({A, B, C}, (C, 1))}; D′ is built from (state, value) pairs such as (A, 0), (C, 0), and (C, 2).]
Build XFAs from Regex (Cont)
• Compile to XFA
  • Alg 3: data determinization for NXFAs
[Figure: worked example built up step by step. Final step: Q = {G, H}; D′ = {{3, 5}, {3, 4, 5}, {3, 5, 7}, {3, 5, 6}}; initial configuration (G, {3, 5}); F′ = {(G, {3, 5, 6})}; QD = {(G, {3, 5}), (H, {3, 4, 5}), (G, {3, 5, 7}), (G, {3, 5, 6})}.]
Build XFAs from Regex (Cont)
• Compile to XFA
  • Finding efficient implementations
Experimental Results
• 1450 regular expressions extracted from Snort HTTP
• Characteristics of combined XFA
  • 41994 total states => 42 MB
  • 195 bits (~25 bytes) of aux memory
  • Instruction memory: 35 MB
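The "~25 bytes" figure for auxiliary memory follows from rounding 195 bits up to whole bytes. A minimal check (the variable names are illustrative):

```python
import math

aux_bits = 195
# Scratch memory is allocated in whole bytes, so round up.
aux_bytes = math.ceil(aux_bits / 8)
print(aux_bytes)
```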
Conclusion
• DFAs for regular expressions often blow up when combined
• XFA = DFAs + auxiliary variables
  • Changes shape of automata
  • Tames state space explosion
• Result: compared to other feasible approaches, XFAs reduce both time and space
Technical overview
bull For NIDS signatures REs overlap or subsume each otherbull Matching progress interleavedbull Many distinct combination of reachable states
bull Two signaturesbull where all and are distinct stringsbull which consists of all strings of length n
4
Technical overview (Cont)
bull where all and are distinct strings
5
Technical overview (Cont)
bull where all and are distinct stringsbull use a single bit of scratch memory
6
Technical overview (Cont)bull which consists of all strings of length n
bull use a counter
7
Technical overview (Cont)bull XFA = DFAs+ auxiliary variables1048708
bull Changes shape of automatabull Tames state space explosion
8
Build XFAs from Regex
bull Annotating regular expressionsbull Compiling to an XFAbull From parse trees to NXFAsbull From NXFAs to XFAs
bull ε ndasheliminationbull determinizing transitionsbull data determinization
bull Finding efficient implementations
9
Build XFAs from Regex (Cont)
bull Annotate Signaturebull New operators change parse tree and add domain values 1048708
bull Parallel concatenation ( ) adds a bit1048708bull Breaks up RE into string-like componentsbull Set a bit when the left operand acceptsbull Test the bit when the right operand accepts
bull ex abcd =gt abcd
10
Build XFAs from Regex (Cont)
bull Compile to XFAbull Definitions
bull XFA is a 7-tuple (Q D Σ δ ( )F)bull Q is the set of statesbull Σ is the set of inputs (input alphabet)bull δ QtimesΣrarrQ is the transition functionbull D is the finite set of values in the data domainbull QtimesΣtimesDrarrD is the per transition update function which defines how the
data value is updated on every transitionbull (q0d0) is the initial configuration which consists of an initial state q0 and
an initial data value d0bull F QtimesD is the set of accepting configurationssube
11
Build XFAs from Regex (Cont)
bull Compile to XFAbull Definitions
bull NXFA is a 7-tuple (Q D Σ δ ( )F)bull Q is the set of statesbull Σ is the set of inputs (input alphabet)bull δ Qtimes(Σ ε )timesQ is the nondeterministic relation describing the allowed sube cup
transitionsbull D is the finite set of values in the data domainbull δ rarr is the nondeterministic update function (or update relation) which
defines how the data value is updated on every transitionbull QD0 QtimesD is the set of initial configurations of the NXFAsubebull F QtimesD is the set of accepting configurationssube
12
Build XFAs from Regex (Cont)
bull Compile to XFAbull From parse trees to NXFAs
13
Build XFAs from Regex (Cont)
bull Compile to XFAbull From parse trees to NXFAs
bull Ex ab[^a]1 =gt ab[^a]1
14
a
b
sum
1
[ a]
1
2
0
3
4
bit = 0cnt = 0 sum
a
bsum
[^a]
ε
ε
ε
cnt++
if (bit == 1 ampamp cnt = 1) accept()
bit = 1
Build XFAs from Regex (Cont)
bull Compile to XFAbull From parse trees to NXFAs
bull Ex ab[^a]1 =gt ab[^a]1
15
1
2
0
3
4
bit = 0cnt = 0 sum
a
bsum
[^a]
ε
ε
ε
cnt++
if (bit == 1 ampamp cnt = 1) accept()
bit = 1
120576
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 1 ε -elimination for NXFAs
16
120576
Build XFAs from Regex (Cont)
• Compile to XFA
  • Alg 2: determinizing transitions for NXFAs
[Worked example, built up incrementally over seven slides; the table layout did not survive extraction. Recoverable entries: Q' holds the state sets {A, C} and {A, B, C}; D' is listed over the pairs (A, 0), (B, 0), (C, 0), (C, 2), with (B, 1) in place of (B, 0) on the first three slides; δ' maps between {A, C} and {A, B, C}; F' contains ({A, C}, (C, 1)) and ({A, B, C}, (C, 1)).]
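The state-set entries {A, C} and {A, B, C} in the example follow the pattern of the classical subset construction. The sketch below shows that construction for a plain ε-free NFA only; it is an assumption-laden simplification, since the NXFA algorithm must also carry data values along with the state sets. `determinize` and `EXAMPLE_NFA` are hypothetical names, and the example automaton (recognising strings ending in ab) is not the one from the slides.

```python
from collections import deque


def determinize(start, transitions, alphabet):
    """Subset construction for an ε-free NFA.

    transitions: set of (q, sym, q2) triples.
    Returns (start_set, dfa_delta, reachable_subsets).
    """
    start_set = frozenset([start])
    delta = {}
    seen = {start_set}
    queue = deque([start_set])
    while queue:
        subset = queue.popleft()
        for sym in alphabet:
            target = frozenset(b for (a, s, b) in transitions
                               if a in subset and s == sym)
            delta[(subset, sym)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return start_set, delta, seen


# Hypothetical NFA: state 0 loops on everything, 0 -a-> 1 -b-> 2.
EXAMPLE_NFA = {(0, "a", 0), (0, "b", 0), (0, "a", 1), (1, "b", 2)}
```

Only subsets reachable from the start are created, which is why the worklist form is usually preferred over enumerating all 2^|Q| subsets.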
Build XFAs from Regex (Cont)
• Compile to XFA
  • Alg 3: data determinization for NXFAs
[Worked example, built up incrementally over eleven slides; the table layout did not survive extraction. Recoverable entries: Q = {G, H}; D' grows from {3, 5} to {3, 5}, {3, 4, 5}, {3, 5, 7}, {3, 5, 6}; initial configuration (G, {3, 5}); configurations QD added in order: (G, {3, 5}), (H, {3, 4, 5}), (G, {3, 5, 7}), (G, {3, 5, 6}); F' contains (G, {3, 5, 6}).]
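In the example, each new data value is a set of old data values, such as {3, 5} or {3, 4, 5}. A worklist construction with that shape can be sketched as below. This is a hedged illustration, assuming the transitions are already deterministic and the update relation returns a set; the names `data_determinize`, `EXAMPLE_DELTA` and `example_update` are hypothetical, and the example automaton is not the one from the slides.

```python
from collections import deque


def data_determinize(delta, update, q0, d0_set, accepting, alphabet):
    """Lift a set-valued update relation to a function on sets of data values.

    delta: dict (q, sym) -> q2; update(q, sym, d) -> set of data values.
    Returns (start, new_update, reachable_configs, new_accepting).
    """
    start = (q0, frozenset(d0_set))
    new_update = {}  # (q, sym, set of values) -> set of values
    seen = {start}
    queue = deque([start])
    while queue:
        q, values = queue.popleft()
        for sym in alphabet:
            q2 = delta[(q, sym)]
            values2 = frozenset(d2 for d in values for d2 in update(q, sym, d))
            new_update[(q, sym, values)] = values2
            if (q2, values2) not in seen:
                seen.add((q2, values2))
                queue.append((q2, values2))
    # A lifted configuration accepts if any member value accepts.
    new_accepting = {(q, values) for (q, values) in seen
                     if any((q, d) in accepting for d in values)}
    return start, new_update, seen, new_accepting


# Hypothetical example over the data domain {0, 1, 2}.
EXAMPLE_DELTA = {("G", "x"): "H", ("G", "y"): "G",
                 ("H", "x"): "H", ("H", "y"): "G"}


def example_update(q, sym, d):
    # On "x" the value may stay or advance (nondeterministic); "y" keeps it.
    return {d, (d + 1) % 3} if sym == "x" else {d}
```

The construction terminates because D is finite, so there are only finitely many subsets of it to visit.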
Build XFAs from Regex (Cont)
• Compile to XFA
  • Finding efficient implementations
Experimental Results
• 1450 regular expressions extracted from Snort HTTP
• Characteristics of combined XFA
  • 41994 total states => 42 MB
  • 195 bits (~25 bytes) of aux memory
  • Instruction memory 35 MB
Experimental Results (Cont)
Conclusion
• DFAs for regular expressions often blow up when combined
• XFA = DFAs + auxiliary variables
  • Changes shape of automata
  • Tames state space explosion
• Result: compared to other feasible approaches, XFAs reduce both time and space
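The blow-up claim can be made concrete with a small count. The sketch below assumes n signatures of the form .*a_i.*b_i in which every a_i is a distinct single symbol (a simplification chosen for illustration), and counts the states a combined DFA needs in order to remember which a_i have already been seen. The function name `combined_dfa_states` is hypothetical.

```python
def combined_dfa_states(n: int) -> int:
    """Count reachable 'which a_i were seen' states for n signatures."""
    start = frozenset()
    seen = {start}
    frontier = [start]
    while frontier:
        state = frontier.pop()
        for i in range(n):  # reading the symbol a_i
            nxt = state | {i}
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return len(seen)


def xfa_bits(n: int) -> int:
    """With one scratch bit per signature, the same facts need n bits."""
    return n
```

Under these assumptions the DFA state count doubles with each added signature, while the auxiliary memory grows by a single bit.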
Build XFAs from Regex (Cont)
• Compile to XFA
  • Definitions
• An XFA is a 7-tuple (Q, D, Σ, δ, U_δ, (q0, d0), F)
  • Q is the set of states
  • Σ is the set of inputs (input alphabet)
  • δ: Q×Σ→Q is the transition function
  • D is the finite set of values in the data domain
  • U_δ: Q×Σ×D→D is the per-transition update function, which defines how the data value is updated on every transition
  • (q0, d0) is the initial configuration, which consists of an initial state q0 and an initial data value d0
  • F ⊆ Q×D is the set of accepting configurations
11
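The 7-tuple above maps fairly directly onto a small data structure. The following Python sketch is illustrative only: the class name `XFA`, the function `run`, and the toy automaton (one bit that is set after "ab" and tested after "cd") are assumptions made for this example, and reporting every position at which an accepting configuration is reached is a design choice in the spirit of signature matching, not something the slides specify.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List, Tuple


@dataclass(frozen=True)
class XFA:
    delta: Callable[[int, str], int]          # δ: Q×Σ→Q
    update: Callable[[int, str, int], int]    # U_δ: Q×Σ×D→D
    start: Tuple[int, int]                    # (q0, d0)
    accepting: FrozenSet[Tuple[int, int]]     # F ⊆ Q×D


def run(xfa: XFA, text: str) -> List[int]:
    """Return the input positions at which an accepting configuration is reached."""
    q, d = xfa.start
    hits = []
    for i, ch in enumerate(text):
        d = xfa.update(q, ch, d)   # update uses the source state of the transition
        q = xfa.delta(q, ch)
        if (q, d) in xfa.accepting:
            hits.append(i)
    return hits


# Toy automaton (assumed): states 0 = idle, 1 = saw 'a', 2 = saw 'c',
# 3 = just saw "ab", 4 = just saw "cd"; the data value is a single bit.
def delta(q: int, ch: str) -> int:
    if ch == "a":
        return 1
    if ch == "c":
        return 2
    if ch == "b" and q == 1:
        return 3
    if ch == "d" and q == 2:
        return 4
    return 0


def update(q: int, ch: str, d: int) -> int:
    # Set the bit when "ab" completes; otherwise leave it unchanged.
    return 1 if (q == 1 and ch == "b") else d


toy = XFA(delta=delta, update=update, start=(0, 0), accepting=frozenset({(4, 1)}))
```

With this toy automaton, `run(toy, "abxcd")` reports a match at the final position, while `run(toy, "cdab")` reports none because the bit is still 0 when "cd" is seen.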
Build XFAs from Regex (Cont)
• Compile to XFA
  • Definitions
• An NXFA is a 7-tuple (Q, D, Σ, δ, U_δ, QD0, F)
  • Q is the set of states
  • Σ is the set of inputs (input alphabet)
  • δ ⊆ Q×(Σ∪{ε})×Q is the nondeterministic relation describing the allowed transitions
  • D is the finite set of values in the data domain
  • U_δ: δ → 2^(D×D) is the nondeterministic update function (or update relation), which defines how the data value is updated on every transition
  • QD0 ⊆ Q×D is the set of initial configurations of the NXFA
  • F ⊆ Q×D is the set of accepting configurations
12
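One way to read the NXFA definition is as a machine that tracks a set of (state, data) configurations. The Python sketch below simulates an NXFA that way, applying update relations along ε-transitions as well. It is a hedged illustration: the names `eps_closure`, `step`, and `accepts`, the encoding of ε as `None`, the choice to test acceptance at the end of the input, and the toy automaton are all assumptions, not material from the slides.

```python
from collections import deque

EPS = None  # assumed encoding of an ε-transition label


def eps_closure(configs, delta, U):
    """All configurations reachable through ε-transitions, applying updates."""
    seen = set(configs)
    work = deque(configs)
    while work:
        q, d = work.popleft()
        for (p, sym, r) in delta:
            if p == q and sym is EPS:
                for (d_old, d_new) in U[(p, sym, r)]:
                    if d_old == d and (r, d_new) not in seen:
                        seen.add((r, d_new))
                        work.append((r, d_new))
    return seen


def step(configs, ch, delta, U):
    """Consume one input symbol from every current configuration."""
    nxt = set()
    for (q, d) in configs:
        for (p, sym, r) in delta:
            if p == q and sym == ch:
                for (d_old, d_new) in U[(p, sym, r)]:
                    if d_old == d:
                        nxt.add((r, d_new))
    return eps_closure(nxt, delta, U)


def accepts(text, delta, U, QD0, F):
    configs = eps_closure(set(QD0), delta, U)
    for ch in text:
        configs = step(configs, ch, delta, U)
    return bool(configs & F)


# Toy NXFA (assumed): accepts strings over {a, b} that end in 'a'.
IDENT = {(0, 0), (1, 1)}
delta = {(0, "a", 0), (0, "b", 0), (0, "a", 1), (1, EPS, 0)}
U = {
    (0, "a", 0): IDENT,
    (0, "b", 0): IDENT,
    (0, "a", 1): {(0, 1), (1, 1)},   # set the bit
    (1, EPS, 0): IDENT,
}
QD0 = {(0, 0)}
F = {(1, 1)}
```

The set of live configurations can grow with every nondeterministic choice, which is the cost the later determinization steps are meant to remove.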
Build XFAs from Regex (Cont)
• Compile to XFA
  • From parse trees to NXFAs
13
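Build XFAs from Regex (Cont)
• Compile to XFA
  • From parse trees to NXFAs
• Ex: ab[^a]1 => ab[^a]1
[Figure: parse tree with nodes a, b, Σ, [^a] and 1, and the resulting NXFA with states 0–4, transitions labeled Σ, a, b, [^a] and ε, and the following actions on scratch memory]
  bit = 0, cnt = 0
  bit = 1
  cnt++
  if (bit == 1 && cnt == 1) accept()
14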
Build XFAs from Regex (Cont)
• Compile to XFA
  • From parse trees to NXFAs
• Ex: ab[^a]1 => ab[^a]1
[Figure: NXFA with states 0–4, transitions labeled Σ, a, b, [^a] and ε, and the following actions on scratch memory]
  bit = 0, cnt = 0
  bit = 1
  cnt++
  if (bit == 1 && cnt == 1) accept()
15
Build XFAs from Regex (Cont)
• Compile to XFA
  • Alg. 1: ε-elimination for NXFAs
16
Build XFAs from Regex (Cont)
• Compile to XFA
  • Alg. 1: ε-elimination for NXFAs
17
Build XFAs from Regex (Cont)
• Compile to XFA
  • Alg. 1: ε-elimination for NXFAs
18
Build XFAs from Regex (Cont)
• Compile to XFA
  • Alg. 2: determinizing transitions for NXFAs
Q′ = {{A,C}}
D′ = {(A,0), (B,1), (C,0), (C,2)}
δ′: (empty so far)
q0′ = {A,C}
F′ = (empty so far)
19
Build XFAs from Regex (Cont)
• Compile to XFA
  • Alg. 2: determinizing transitions for NXFAs
Q′ = {{A,C}}
D′ = {(A,0), (B,1), (C,0), (C,2)}
δ′: {A,C}→{A,C}
U_δ′:
  {A,C}→{A,C}: (A,0)→(A,0), (A,0)→(C,0), (C,0)→(C,0), (C,2)→(C,1)
q0′ = {A,C}
F′ = (empty so far)
Step shown: {A,C}→{A,C}
20
Build XFAs from Regex (Cont)
• Compile to XFA
  • Alg. 2: determinizing transitions for NXFAs
Q′ = {{A,C}, {A,B,C}}
D′ = {(A,0), (B,1), (C,0), (C,2)}
δ′: {A,C}→{A,C}, {A,C}→{A,B,C}
U_δ′:
  {A,C}→{A,C}: (A,0)→(A,0), (A,0)→(C,0), (C,0)→(C,0), (C,2)→(C,1)
  {A,C}→{A,B,C}: (A,0)→(A,0), (A,0)→(B,1), (A,0)→(C,0)
q0′ = {A,C}
F′ = (empty so far)
Step shown: {A,C}→{A,B,C}
21
Build XFAs from Regex (Cont)
• Compile to XFA
  • Alg. 2: determinizing transitions for NXFAs
Q′ = {{A,C}, {A,B,C}}
D′ = {(A,0), (B,0), (C,0), (C,2)}
δ′: {A,C}→{A,C}, {A,C}→{A,B,C}, {A,B,C}→{A,B,C}
U_δ′:
  {A,C}→{A,C}: (A,0)→(A,0), (A,0)→(C,0), (C,0)→(C,0), (C,0)→(C,1)
  {A,C}→{A,B,C}: (A,0)→(A,0), (A,0)→(B,1), (A,0)→(C,0)
  {A,B,C}→{A,B,C}: (A,0)→(A,0), (A,0)→(B,1), (A,0)→(C,0)
q0′ = {A,C}
F′ = (empty so far)
Step shown: {A,B,C}→{A,B,C}
22
Build XFAs from Regex (Cont)
• Compile to XFA
  • Alg. 2: determinizing transitions for NXFAs
Q′ = {{A,C}, {A,B,C}}
D′ = {(A,0), (B,0), (C,0), (C,2)}
δ′: {A,C}→{A,C}, {A,C}→{A,B,C}, {A,B,C}→{A,B,C}, {A,B,C}→{A,C}
U_δ′:
  {A,C}→{A,C}: (A,0)→(A,0), (A,0)→(C,0), (C,0)→(C,0), (C,0)→(C,1)
  {A,C}→{A,B,C}: (A,0)→(A,0), (A,0)→(B,1), (A,0)→(C,0)
  {A,B,C}→{A,B,C}: (A,0)→(A,0), (A,0)→(B,1), (A,0)→(C,0)
  {A,B,C}→{A,C}: (A,0)→(A,0), (A,0)→(C,0), (C,0)→(C,0), (B,1)→(C,2), (C,2)→(C,1)
q0′ = {A,C}
F′ = (empty so far)
Step shown: {A,B,C}→{A,C}
23
Build XFAs from Regex (Cont)
• Compile to XFA
  • Alg. 2: determinizing transitions for NXFAs
Q′ = {{A,C}, {A,B,C}}
D′ = {(A,0), (B,0), (C,0), (C,2)}
δ′: {A,C}→{A,C}, {A,C}→{A,B,C}, {A,B,C}→{A,B,C}, {A,B,C}→{A,C}, {A,B,C}→{A,C}
U_δ′:
  {A,C}→{A,C}: (A,0)→(A,0), (A,0)→(C,0), (C,0)→(C,0), (C,0)→(C,1)
  {A,C}→{A,B,C}: (A,0)→(A,0), (A,0)→(B,1), (A,0)→(C,0)
  {A,B,C}→{A,B,C}: (A,0)→(A,0), (A,0)→(B,1), (A,0)→(C,0)
  {A,B,C}→{A,C}: (A,0)→(A,0), (A,0)→(C,0), (C,0)→(C,0), (B,1)→(C,2), (C,2)→(C,1)
  {A,B,C}→{A,C}: (A,0)→(A,0), (A,0)→(C,0), (C,0)→(C,0), (C,2)→(C,1)
q0′ = {A,C}
F′ = (empty so far)
Step shown: {A,B,C}→{A,C}
24
Build XFAs from Regex (Cont)
• Compile to XFA
  • Alg. 2: determinizing transitions for NXFAs
Q′ = {{A,C}, {A,B,C}}
D′ = {(A,0), (B,0), (C,0), (C,2)}
δ′: {A,C}→{A,C}, {A,C}→{A,B,C}, {A,B,C}→{A,B,C}, {A,B,C}→{A,C}, {A,B,C}→{A,C}
U_δ′:
  {A,C}→{A,C}: (A,0)→(A,0), (A,0)→(C,0), (C,0)→(C,0), (C,0)→(C,1)
  {A,C}→{A,B,C}: (A,0)→(A,0), (A,0)→(B,1), (A,0)→(C,0)
  {A,B,C}→{A,B,C}: (A,0)→(A,0), (A,0)→(B,1), (A,0)→(C,0)
  {A,B,C}→{A,C}: (A,0)→(A,0), (A,0)→(C,0), (C,0)→(C,0), (B,1)→(C,2), (C,2)→(C,1)
  {A,B,C}→{A,C}: (A,0)→(A,0), (A,0)→(C,0), (C,0)→(C,0), (C,2)→(C,1)
q0′ = {A,C}
F′ = {({A,C}, (C,1)), ({A,B,C}, (C,1))}
Step shown: {A,B,C}→{A,C}
25
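The frames above suggest that determinizing transitions moves the choice of NXFA state into the data domain: new states are sets of old states, and new data values are (state, data) pairs. A minimal Python sketch of that idea follows. It is an interpretation rather than the paper's Alg. 2 as written: the function name `determinize_transitions`, the input encoding, and the toy NXFA are assumptions, and the input is assumed to be ε-free already.

```python
def determinize_transitions(sigma, delta, U, QD0, F):
    """Subset construction that pushes state nondeterminism into the data.

    delta: set of (q, a, q') transitions (no ε)
    U:     dict (q, a, q') -> {(d, d'), ...} update relation
    QD0:   set of initial (q, d) configurations
    F:     set of accepting (q, d) configurations
    """
    start = frozenset(q for q, _ in QD0)
    states, work = {start}, [start]
    delta2, U2 = {}, {}
    while work:
        S = work.pop()
        for a in sigma:
            moves = [(p, r) for (p, sym, r) in delta if p in S and sym == a]
            if not moves:
                continue                     # leave (S, a) undefined
            T = frozenset(r for _, r in moves)
            delta2[(S, a)] = T               # one target per (S, a)
            # The update relation now acts on (state, data) pairs.
            U2[(S, a)] = {((p, d), (r, d2))
                          for (p, r) in moves
                          for (d, d2) in U[(p, a, r)]}
            if T not in states:
                states.add(T)
                work.append(T)
    QD0_2 = {(start, (q, d)) for (q, d) in QD0}
    F2 = {(S, (q, d)) for S in states for (q, d) in F if q in S}
    return states, delta2, U2, QD0_2, F2


# Toy ε-free NXFA (assumed): A loops on everything, A -a-> B, B -b-> C.
IDENT = {(0, 0), (1, 1)}
delta = {("A", "a", "A"), ("A", "b", "A"), ("A", "a", "B"), ("B", "b", "C")}
U = {
    ("A", "a", "A"): IDENT,
    ("A", "b", "A"): IDENT,
    ("A", "a", "B"): IDENT,
    ("B", "b", "C"): {(0, 1), (1, 1)},   # set the bit on reaching C
}
states, delta2, U2, QD0_2, F2 = determinize_transitions(
    "ab", delta, U, {("A", 0)}, {("C", 1)})
```

After this step the transitions are deterministic but the update relation is still nondeterministic, which is what the data determinization step (Alg. 3 in the slides) then addresses.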
Build XFAs from Regex (Cont)
• Compile to XFA
  • Alg. 3: data determinization for NXFAs
Q = {G, H}
D′ = {{3,5}}
δ′: G→G, G→H, H→H, H→G, H→G
(q0, d0) = (G, {3,5})
F′ = (empty so far)
QD = {(G, {3,5})}
26
Build XFAs from Regex (Cont)
• Compile to XFA
  • Alg. 3: data determinization for NXFAs
Q = {G, H}
D′ = {{3,5}}
δ′: G→G, G→H, H→H, H→G, H→G
U_δ′:
  G→G: {3,5}→{3,5}
(q0, d0) = (G, {3,5})
F′ = (empty so far)
QD = {(G, {3,5})}
Step shown: G→G, {3,5}→{3,5}
27
Build XFAs from Regex (Cont)
• Compile to XFA
  • Alg. 3: data determinization for NXFAs
Q = {G, H}
D′ = {{3,5}, {3,4,5}}
δ′: G→G, G→H, H→H, H→G, H→G
U_δ′:
  G→G: {3,5}→{3,5}
  G→H: {3,5}→{3,4,5}
(q0, d0) = (G, {3,5})
F′ = (empty so far)
QD = {(G, {3,5}), (H, {3,4,5})}
Step shown: G→H, {3,5}→{3,4,5}
28
Build XFAs from Regex (Cont)
• Compile to XFA
  • Alg. 3: data determinization for NXFAs
Q = {G, H}
D′ = {{3,5}, {3,4,5}}
δ′: G→G, G→H, H→H, H→G, H→G
U_δ′:
  G→G: {3,5}→{3,5}
  G→H: {3,5}→{3,4,5}
  H→H: {3,4,5}→{3,4,5}
(q0, d0) = (G, {3,5})
F′ = (empty so far)
QD = {(G, {3,5}), (H, {3,4,5})}
Step shown: H→H, {3,4,5}→{3,4,5}
29
Build XFAs from Regex (Cont)
• Compile to XFA
  • Alg. 3: data determinization for NXFAs
Q = {G, H}
D′ = {{3,5}, {3,4,5}, {3,5,7}}
δ′: G→G, G→H, H→H, H→G, H→G
U_δ′:
  G→G: {3,5}→{3,5}
  G→H: {3,5}→{3,4,5}
  H→H: {3,4,5}→{3,4,5}
  H→G: {3,4,5}→{3,5,7}
(q0, d0) = (G, {3,5})
F′ = (empty so far)
QD = {(G, {3,5}), (H, {3,4,5}), (G, {3,5,7})}
Step shown: H→G, {3,4,5}→{3,5,7}
30
Build XFAs from Regex (Cont)
• Compile to XFA
  • Alg. 3: data determinization for NXFAs
Q = {G, H}
D′ = {{3,5}, {3,4,5}, {3,5,7}}
δ′: G→G, G→H, H→H, H→G, H→G
U_δ′:
  G→G: {3,5}→{3,5}
  G→H: {3,5}→{3,4,5}
  H→H: {3,4,5}→{3,4,5}
  H→G: {3,4,5}→{3,5,7}
  H→G: {3,4,5}→{3,5}
(q0, d0) = (G, {3,5})
F′ = (empty so far)
QD = {(G, {3,5}), (H, {3,4,5}), (G, {3,5,7})}
Step shown: H→G, {3,4,5}→{3,5}
31
Build XFAs from Regex (Cont)
• Compile to XFA
  • Definitions
• An NXFA is a 7-tuple (Q, D, Σ, δ, Uδ, QD0, F)
  • Q is the set of states
  • Σ is the set of inputs (input alphabet)
  • δ ⊆ Q × (Σ ∪ {ε}) × Q is the nondeterministic relation describing the allowed transitions
  • D is the finite set of values in the data domain
  • Uδ: δ → 2^(D×D) is the nondeterministic update function (or update relation), which defines how the data value is updated on every transition
  • QD0 ⊆ Q × D is the set of initial configurations of the NXFA
  • F ⊆ Q × D is the set of accepting configurations
12
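The 7-tuple on this slide maps fairly directly onto a small data structure. The following is a minimal Python sketch, not code from the paper: the class name `NXFA`, its field names, the `accepts` helper, the use of `None` as the ε label, and the toy automaton are all assumptions made for illustration. It accepts when an accepting configuration is reachable at the end of the input.

```python
from dataclasses import dataclass

EPS = None  # label used here for ε-transitions (an assumption of this sketch)


@dataclass
class NXFA:
    states: set      # Q
    domain: set      # D
    alphabet: set    # Σ
    delta: set       # δ: (source, symbol or EPS, target) triples
    update: dict     # Uδ: transition triple -> set of (old value, new value) pairs
    initial: set     # QD0: (state, value) pairs
    accepting: set   # F: (state, value) pairs

    def _closure(self, configs):
        """Extend a set of configurations with everything reachable via ε."""
        seen, stack = set(configs), list(configs)
        while stack:
            q, d = stack.pop()
            for t in self.delta:
                if t[0] == q and t[1] is EPS:
                    for old, new in self.update[t]:
                        if old == d and (t[2], new) not in seen:
                            seen.add((t[2], new))
                            stack.append((t[2], new))
        return seen

    def accepts(self, text):
        """Follow all (state, value) configurations; accept at end of input."""
        configs = self._closure(self.initial)
        for ch in text:
            moved = set()
            for q, d in configs:
                for t in self.delta:
                    if t[0] == q and t[1] == ch:
                        moved |= {(t[2], new) for old, new in self.update[t] if old == d}
            configs = self._closure(moved)
        return bool(configs & self.accepting)


# Hypothetical toy automaton: one state, one bit that is set once an "a" is read.
toy = NXFA(
    states={0}, domain={0, 1}, alphabet={"a", "b"},
    delta={(0, "a", 0), (0, "b", 0)},
    update={(0, "a", 0): {(0, 1), (1, 1)}, (0, "b", 0): {(0, 0), (1, 1)}},
    initial={(0, 0)}, accepting={(0, 1)},
)
print(toy.accepts("bba"), toy.accepts("bbb"))
```

Tracking a set of (state, value) configurations is what makes the automaton nondeterministic in both its transitions and its data. The later compilation steps remove these two sources of nondeterminism one at a time.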
Build XFAs from Regex (Cont)
• Compile to XFA
  • From parse trees to NXFAs
13
Build XFAs from Regex (Cont)
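• Compile to XFA
  • From parse trees to NXFAs
  • Ex: ab[^a]1 => ab[^a]1
[Figure: parse tree for the example (leaves a, b, Σ and [^a], repetition count 1) and the NXFA built from it, with states 0 to 4, transitions on Σ, a, b and [^a], three ε-transitions, initial data bit = 0 and cnt = 0, and the actions bit = 1, cnt++, and if (bit == 1 && cnt == 1) accept()]
14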
Build XFAs from Regex (Cont)
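• Compile to XFA
  • From parse trees to NXFAs
  • Ex: ab[^a]1 => ab[^a]1
[Figure: the NXFA for the example, with states 0 to 4, transitions on Σ, a, b and [^a], three ε-transitions, initial data bit = 0 and cnt = 0, and the actions bit = 1, cnt++, and if (bit == 1 && cnt == 1) accept()]
15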
Build XFAs from Regex (Cont)
• Compile to XFA
  • Alg. 1: ε-elimination for NXFAs
16
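The figures for this step did not survive, so the following is only a sketch of one standard way to remove ε-transitions from an automaton whose transitions also update a data value. It is not a transcription of the paper's Algorithm 1. The function names, the triple-based representation, and the use of `None` for ε are assumptions. The idea is to fold every ε-path that follows a symbol transition into that transition, composing the data updates along the way; the accepting configurations stay unchanged.

```python
def eps_closure(config, delta, update):
    """All (state, value) configurations reachable from `config` via ε only."""
    seen, stack = {config}, [config]
    while stack:
        q, d = stack.pop()
        for t in delta:
            src, sym, dst = t
            if src == q and sym is None:
                for old, new in update[t]:
                    if old == d and (dst, new) not in seen:
                        seen.add((dst, new))
                        stack.append((dst, new))
    return seen


def eliminate_eps(delta, update, initial):
    """Return (delta, update, initial) of an ε-free automaton."""
    new_delta, new_update = set(), {}
    for t in delta:
        src, sym, dst = t
        if sym is None:
            continue  # ε-transitions are absorbed into symbol transitions
        for old, mid in update[t]:
            # Compose the symbol's update with every ε-path that follows it.
            for q2, new in eps_closure((dst, mid), delta, update):
                new_delta.add((src, sym, q2))
                new_update.setdefault((src, sym, q2), set()).add((old, new))
    new_initial = set()
    for config in initial:
        new_initial |= eps_closure(config, delta, update)
    return new_delta, new_update, new_initial


# Hypothetical input: reading "a" moves 0 -> 1, then an ε-move 1 -> 2 sets a bit.
delta = {(0, "a", 1), (1, None, 2)}
update = {(0, "a", 1): {(0, 0)}, (1, None, 2): {(0, 1)}}
d2, u2, i2 = eliminate_eps(delta, update, {(0, 0)})
print(sorted(d2))
```

In the toy input the ε-move that sets the bit disappears, and a direct transition from state 0 to state 2 on "a" carries the combined update instead.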
Build XFAs from Regex (Cont)
• Compile to XFA
  • Alg. 2: determinizing transitions for NXFAs
[Figure sequence, slides 19 to 25: worked example in which the new automaton is built up one transition at a time. New states are sets of NXFA states and new data values are (state, value) pairs. Q′ = { {A,C}, {A,B,C} }; D′ = { (A,0), (B,1), (C,0), (C,2) }; F′ = { ({A,C}, (C,1)), ({A,B,C}, (C,1)) }]
25
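The worked example suggests a subset construction in which each new state is a set of NXFA states and each new data value remembers which NXFA state it belongs to. The sketch below follows that reading. It is an assumption about the construction rather than the paper's Algorithm 2, and the function name, argument layout, and toy input are hypothetical. It expects an ε-free automaton.

```python
def determinize_transitions(alphabet, delta, update, initial, accepting):
    """Subset construction on states; data values become (state, value) pairs."""
    start = frozenset(q for q, _ in initial)
    states, worklist = {start}, [start]
    new_delta, new_update = {}, {}
    while worklist:
        group = worklist.pop()
        for a in alphabet:
            moves = [(q, r) for (q, sym, r) in delta if sym == a and q in group]
            if not moves:
                continue  # no transition on this symbol from this group
            target = frozenset(r for _, r in moves)
            new_delta[(group, a)] = target  # exactly one successor per symbol
            new_update[(group, a)] = {
                ((q, old), (r, new))
                for q, r in moves
                for old, new in update[(q, a, r)]
            }
            if target not in states:
                states.add(target)
                worklist.append(target)
    new_initial = {(start, (q, d)) for q, d in initial}
    new_accepting = {(g, (q, d)) for g in states for q, d in accepting if q in g}
    return states, new_delta, new_update, new_initial, new_accepting


# Hypothetical NXFA: A loops on "a" and "b"; A moves to B on "a" and sets a bit.
delta = {("A", "a", "A"), ("A", "b", "A"), ("A", "a", "B")}
update = {
    ("A", "a", "A"): {(0, 0)},
    ("A", "b", "A"): {(0, 0)},
    ("A", "a", "B"): {(0, 1)},
}
states, nd, nu, ni, na = determinize_transitions(
    {"a", "b"}, delta, update, {("A", 0)}, {("B", 1)}
)
print(len(states))
```

After this step the state transitions are deterministic, but a single transition may still relate one data value to several successors, which is what the data determinization step addresses.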
Build XFAs from Regex (Cont)
• Compile to XFA
  • Alg. 3: data determinization for NXFAs
[Figure sequence, slides 26 to 36: worked example with Q = {G, H} and initial configuration (G, {3,5}). Configurations pair a state with a set of data values, and each slide adds one edge between configurations:
  27: (G, {3,5}) → (G, {3,5})
  28: (G, {3,5}) → (H, {3,4,5})
  29: (H, {3,4,5}) → (H, {3,4,5})
  30: (H, {3,4,5}) → (G, {3,5,7})
  31: (H, {3,4,5}) → (G, {3,5})
  32: (G, {3,5,7}) → (G, {3,5,6})
  33: (G, {3,5,7}) → (H, {3,4,5})
  34: (G, {3,5,6}) → (G, {3,5})
  35: (G, {3,5,6}) → (H, {3,4,5})
Result (slide 36): D′ = { {3,5}, {3,4,5}, {3,5,7}, {3,5,6} }; QD = { (G, {3,5}), (H, {3,4,5}), (G, {3,5,7}), (G, {3,5,6}) }; F′ = { (G, {3,5,6}) }]
36
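The worked example pairs each state with a set of data values and explores configurations from the initial one. A worklist version of that idea can be sketched as follows. This is an illustration under assumptions, not the paper's Algorithm 3: the function name, the dictionary layout, and the toy input are hypothetical, and the sketch assumes that state transitions are already deterministic and that all initial configurations share one state.

```python
def determinize_data(alphabet, delta, update, initial, accepting):
    """Explore (state, set of data values) configurations with a worklist."""
    q0 = next(iter(initial))[0]  # assumes a single initial state
    start = (q0, frozenset(d for _, d in initial))
    configs, worklist, step = {start}, [start], {}
    while worklist:
        q, values = worklist.pop()
        for a in alphabet:
            if (q, a) not in delta:
                continue
            # Apply the update relation to every value in the current set.
            new_values = frozenset(
                new for old, new in update[(q, a)] if old in values
            )
            if not new_values:
                continue  # no data value survives this transition
            nxt = (delta[(q, a)], new_values)
            step[((q, values), a)] = nxt
            if nxt not in configs:
                configs.add(nxt)
                worklist.append(nxt)
    new_accepting = {
        (q, values) for q, values in configs
        if any((q, d) in accepting for d in values)
    }
    return start, configs, step, new_accepting


# Hypothetical input with a nondeterministic update on (G, "x").
delta = {("G", "x"): "H", ("G", "y"): "G", ("H", "x"): "H", ("H", "y"): "G"}
update = {
    ("G", "x"): {(0, 0), (0, 1)},
    ("G", "y"): {(0, 0), (1, 1)},
    ("H", "x"): {(0, 0), (1, 1)},
    ("H", "y"): {(0, 0), (1, 2)},
}
start, configs, step, acc = determinize_data(
    {"x", "y"}, delta, update, {("G", 0)}, {("G", 2)}
)
print(len(configs))
```

Each reachable configuration has exactly one successor per input symbol, so the result can be executed like a DFA whose scratch memory holds a set of data values.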
Build XFAs from Regex (Cont)
• Compile to XFA
  • Finding efficient implementations
37
Experimental Results
• 1450 regular expressions extracted from Snort HTTP
• Characteristics of the combined XFA
  • 41994 total states => 42 MB
  • 195 bits (~25 bytes) of aux memory
  • Instruction memory: 35 MB
38
Experimental Results (Cont)
39
Experimental Results (Cont)
40
Conclusion
• DFAs for regular expressions often blow up when combined
• XFA = DFAs + auxiliary variables
  • Changes shape of automata
  • Tames state space explosion
• Result: compared to other feasible approaches, XFAs reduce both time and space
41
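The blow-up claim in the conclusion can be checked on a toy case. The sketch below is illustrative only and does not reproduce the paper's experiments. It assumes a hypothetical family of signatures of the form `.*x.*y` with distinct single characters, counts the reachable states of the combined DFA, and contrasts that with a matcher that keeps one bit of scratch memory per signature. The names `product_dfa_states` and `xfa_match` are made up for this example.

```python
def product_dfa_states(pairs):
    """Count reachable states of the combined DFA for signatures '.*x.*y'."""
    alphabet = {c for pair in pairs for c in pair}

    def step(state, ch):
        # Per signature: 0 = nothing seen, 1 = x seen, 2 = x then y seen.
        out = []
        for s, (x, y) in zip(state, pairs):
            if s == 0 and ch == x:
                s = 1
            elif s == 1 and ch == y:
                s = 2
            out.append(s)
        return tuple(out)

    start = (0,) * len(pairs)
    seen, stack = {start}, [start]
    while stack:
        cur = stack.pop()
        for ch in alphabet:
            nxt = step(cur, ch)
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return len(seen)


def xfa_match(pairs, text):
    """Single automaton state plus one bit per signature (auxiliary variables)."""
    bits = [False] * len(pairs)
    hits = set()
    for ch in text:
        for i, (x, y) in enumerate(pairs):
            if ch == y and bits[i]:
                hits.add(i)      # y seen after x: signature i matches
            if ch == x:
                bits[i] = True   # remember that x has been seen
    return hits


pairs = [("a", "b"), ("c", "d"), ("e", "f")]
print(product_dfa_states(pairs), "combined DFA states vs", len(pairs), "bits")
```

In this toy family the combined DFA must remember the progress of every signature independently, so its state count grows exponentially with the number of signatures, while the bit-based matcher grows linearly.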
Build XFAs from Regex (Cont)
bull Compile to XFAbull From parse trees to NXFAs
13
Build XFAs from Regex (Cont)
bull Compile to XFAbull From parse trees to NXFAs
bull Ex ab[^a]1 =gt ab[^a]1
14
a
b
sum
1
[ a]
1
2
0
3
4
bit = 0cnt = 0 sum
a
bsum
[^a]
ε
ε
ε
cnt++
if (bit == 1 ampamp cnt = 1) accept()
bit = 1
Build XFAs from Regex (Cont)
bull Compile to XFAbull From parse trees to NXFAs
bull Ex ab[^a]1 =gt ab[^a]1
15
1
2
0
3
4
bit = 0cnt = 0 sum
a
bsum
[^a]
ε
ε
ε
cnt++
if (bit == 1 ampamp cnt = 1) accept()
bit = 1
120576
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 1 ε -elimination for NXFAs
16
120576
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 1 ε -elimination for NXFAs
17
120576
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 1 ε -elimination for NXFAs
18
120576
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 2 determinizing transitions for NXFAs
Qrsquo A C
Drsquo (A 0) (B 1) (C 0) (C 2)
δlsquo
A C
Frsquo
19
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 2 determinizing transitions for NXFAs
Qrsquo A C (A 0) (A 0) (A 0) (C 0) (C 0) (C 0) (C 2) (C 1)
Drsquo (A 0) (B 1) (C 0) (C 2)
δlsquo A C A C
A C
Frsquo
A C
A C
20
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 2 determinizing transitions for NXFAs
Qrsquo A C A B C (A 0) (A 0) (A 0) (C 0) (C 0) (C 0) (C 2) (C 1) (A 0) (A 0) (A 0) (B 1) (A 0) (C 0)Drsquo (A 0) (B 1) (C 0) (C 2)
δlsquo A C A C A C A B C
A C
Frsquo
A B C
A C
21
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 2 determinizing transitions for NXFAs
Qrsquo A C A B C (A 0) (A 0) (A 0) (C 0) (C 0) (C 0) (C 0) (C 1) (A 0) (A 0) (A 0) (B 1) (A 0) (C 0) (A 0) (A 0) (A 0) (B 1) (A 0) (C 0)Drsquo (A 0) (B 0) (C 0) (C 2)
δlsquo A C A C A C A B C A B C A B C
A C
Frsquo
A B C
A B C
22
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 2 determinizing transitions for NXFAs
Qrsquo A C A B C (A 0) (A 0) (A 0) (C 0) (C 0) (C 0) (C 0) (C 1) (A 0) (A 0) (A 0) (B 1) (A 0) (C 0) (A 0) (A 0) (A 0) (B 1) (A 0) (C 0) (A 0) (A 0) (A 0) (C 0) (C 0) (C 0) (B 1) (C 2) (C 2) (C 1)
Drsquo (A 0) (B 0) (C 0) (C 2)
δlsquo A C A C A C A B C A B C A B C A B C A C
A C
Frsquo
A C
A B C
23
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 2 determinizing transitions for NXFAs
Qrsquo A C A B C (A 0) (A 0) (A 0) (C 0) (C 0) (C 0) (C 0) (C 1) (A 0) (A 0) (A 0) (B 1) (A 0) (C 0) (A 0) (A 0) (A 0) (B 1) (A 0) (C 0) (A 0) (A 0) (A 0) (C 0) (C 0) (C 0) (B 1) (C 2) (C 2) (C 1) (A 0) (A 0) (A 0) (C 0) (C 0) (C 0) (C 2) (C 1)
Drsquo (A 0) (B 0) (C 0) (C 2)
δlsquo A C A C A C A B C A B C A B C A B C A C A B C A C
A C
Frsquo
A C
A B C
24
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 2 determinizing transitions for NXFAs
Qrsquo A C A B C (A 0) (A 0) (A 0) (C 0) (C 0) (C 0) (C 0) (C 1) (A 0) (A 0) (A 0) (B 1) (A 0) (C 0) (A 0) (A 0) (A 0) (B 1) (A 0) (C 0) (A 0) (A 0) (A 0) (C 0) (C 0) (C 0) (B 1) (C 2) (C 2) (C 1) (A 0) (A 0) (A 0) (C 0) (C 0) (C 0) (C 2) (C 1)
Drsquo (A 0) (B 0) (C 0) (C 2)
δlsquo A C A C A C A B C A B C A B C A B C A C A B C A C
A C
Frsquo (A C (C 1)) (A B C (C 1))
A C
A B C
25
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 3 data determinization for NXFAs
Q G H
Drsquo 3 5
δlsquo G G G H H H H G H G
( ) (G 3 5)
Frsquo
QD (G 3 5)
26
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 3 data determinization for NXFAs
Q G H 3 5 3 5
Drsquo 3 5
δlsquo G G G H H H H G H G
( ) (G 3 5)
Frsquo
QD (G 3 5)
G G
35 35
27
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 3 data determinization for NXFAs
Q G H 3 5 3 5 3 5 3 4 5Drsquo 3 5 3 4 5
δlsquo G G G H H H H G H G
( ) (G 3 5)
Frsquo
QD (G 3 5) (H 3 4 5)
G H
35 3 4 5
28
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 3 data determinization for NXFAs
Q G H 3 5 3 5 3 5 3 4 5 3 4 5 3 4 5Drsquo 3 5 3 4 5
δlsquo G G G H H H H G H G
( ) (G 3 5)
Frsquo
QD (G 3 5) (H 3 4 5)
H H
3 4 5 3 4 5
29
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 3 data determinization for NXFAs
Q G H 3 5 3 5 3 5 3 4 5 3 4 5 3 4 5 3 4 5 3 5 7
Drsquo 3 5 3 4 5 3 5 7
δlsquo G G G H H H H G H G
( ) (G 3 5)
Frsquo
QD (G 3 5) (H 3 4 5) (G 3 5 7)
H G
3 4 5 3 5 7
30
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 3 data determinization for NXFAs
Q G H 3 5 3 5 3 5 3 4 5 3 4 5 3 4 5 3 4 5 3 5 7 3 4 5 3 5
Drsquo 3 5 3 4 5 3 5 7
δlsquo G G G H H H H G H G
( ) (G 3 5)
Frsquo
QD (G 3 5) (H 3 4 5) (G 3 5 7)
H G
3 4 5 3 5
31
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 3 data determinization for NXFAs
Q G H 3 5 3 5 3 5 7 3 5 6 3 5 3 4 5 3 4 5 3 4 5 3 4 5 3 5 7 3 4 5 3 5
Drsquo 3 5 3 4 5 3 5 7 3 5 6
δlsquo G G G H H H H G H G
( ) (G 3 5)
Frsquo
QD (G 3 5) (H 3 4 5) (G 3 5 7) (G 3 5 6)
G G
3 5 7 3 5 6
32
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 3 data determinization for NXFAs
Q G H 3 5 3 5 3 5 7 3 5 6 3 5 3 4 5 3 5 7 3 4 5 3 4 5 3 4 5 3 4 5 3 5 7 3 4 5 3 5
Drsquo 3 5 3 4 5 3 5 7 3 5 6
δlsquo G G G H H H H G H G
( ) (G 3 5)
Frsquo
QD (G 3 5) (H 3 4 5) (G 3 5 7) (G 3 5 6)
G H
3 5 7 3 4 5
33
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 3 data determinization for NXFAs
Q G H 3 5 3 5 3 5 7 3 5 6 3 5 6 3 5 3 5 3 4 5 3 5 7 3 4 5 3 4 5 3 4 5 3 4 5 3 5 7 3 4 5 3 5
Drsquo 3 5 3 4 5 3 5 7 3 5 6
δlsquo G G G H H H H G H G
( ) (G 3 5)
Frsquo
QD (G 3 5) (H 3 4 5) (G 3 5 7) (G 3 5 6)
G G
3 5 6 3 5
34
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 3 data determinization for NXFAs
Q G H 3 5 3 5 3 5 7 3 5 6 3 5 6 3 5 3 5 3 4 5 3 5 7 3 4 5 3 5 6 3 4 5 3 4 5 3 4 5 3 4 5 3 5 7 3 4 5 3 5
Drsquo 3 5 3 4 5 3 5 7 3 5 6
δlsquo G G G H H H H G H G
( ) (G 3 5)
Frsquo
QD (G 3 5) (H 3 4 5) (G 3 5 7) (G 3 5 6)
G H
3 5 6 3 4 5
35
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 3 data determinization for NXFAs
Q G H 3 5 3 5 3 5 7 3 5 6 3 5 6 3 5 3 5 3 4 5 3 5 7 3 4 5 3 5 6 3 4 5 3 4 5 3 4 5 3 4 5 3 5 7 3 4 5 3 5
Drsquo 3 5 3 4 5 3 5 7 3 5 6
δlsquo G G G H H H H G H G
( ) (G 3 5)
Frsquo (G 3 5 6)
QD (G 3 5) (H 3 4 5) (G 3 5 7) (G 3 5 6)
G H
3 5 6 3 4 5
36
Build XFAs from Regex (Cont)
bull Compile to XFAbull Finding efficient implementations
37
Experimental Results
bull 1450 Regular expressions extracted from Snort HTTPbull Characteristics of combined XFA1048708bull 41994 total states =gt 42 MBbull 195 bits (~25 bytes) of aux memorybull Instruction memory 35 MB
38
Experimental Results (Cont)
39
Experimental Results (Cont)
40
Conclusion
bull DFAs for regular expressions often blow up when combined
bull XFA = DFAs+ auxiliary variables1048708bull Changes shape of automatabull Tames state space explosion
bull Result compared to other feasible approaches reduce both time and space
41
Build XFAs from Regex (Cont)
bull Compile to XFAbull From parse trees to NXFAs
bull Ex ab[^a]1 =gt ab[^a]1
14
a
b
sum
1
[ a]
1
2
0
3
4
bit = 0cnt = 0 sum
a
bsum
[^a]
ε
ε
ε
cnt++
if (bit == 1 ampamp cnt = 1) accept()
bit = 1
Build XFAs from Regex (Cont)
bull Compile to XFAbull From parse trees to NXFAs
bull Ex ab[^a]1 =gt ab[^a]1
15
1
2
0
3
4
bit = 0cnt = 0 sum
a
bsum
[^a]
ε
ε
ε
cnt++
if (bit == 1 ampamp cnt = 1) accept()
bit = 1
120576
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 1 ε -elimination for NXFAs
16
120576
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 1 ε -elimination for NXFAs
17
120576
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 1 ε -elimination for NXFAs
18
120576
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 2 determinizing transitions for NXFAs
Qrsquo A C
Drsquo (A 0) (B 1) (C 0) (C 2)
δlsquo
A C
Frsquo
19
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 2 determinizing transitions for NXFAs
Qrsquo A C (A 0) (A 0) (A 0) (C 0) (C 0) (C 0) (C 2) (C 1)
Drsquo (A 0) (B 1) (C 0) (C 2)
δlsquo A C A C
A C
Frsquo
A C
A C
20
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 2 determinizing transitions for NXFAs
Qrsquo A C A B C (A 0) (A 0) (A 0) (C 0) (C 0) (C 0) (C 2) (C 1) (A 0) (A 0) (A 0) (B 1) (A 0) (C 0)Drsquo (A 0) (B 1) (C 0) (C 2)
δlsquo A C A C A C A B C
A C
Frsquo
A B C
A C
21
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 2 determinizing transitions for NXFAs
Qrsquo A C A B C (A 0) (A 0) (A 0) (C 0) (C 0) (C 0) (C 0) (C 1) (A 0) (A 0) (A 0) (B 1) (A 0) (C 0) (A 0) (A 0) (A 0) (B 1) (A 0) (C 0)Drsquo (A 0) (B 0) (C 0) (C 2)
δlsquo A C A C A C A B C A B C A B C
A C
Frsquo
A B C
A B C
22
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 2 determinizing transitions for NXFAs
Qrsquo A C A B C (A 0) (A 0) (A 0) (C 0) (C 0) (C 0) (C 0) (C 1) (A 0) (A 0) (A 0) (B 1) (A 0) (C 0) (A 0) (A 0) (A 0) (B 1) (A 0) (C 0) (A 0) (A 0) (A 0) (C 0) (C 0) (C 0) (B 1) (C 2) (C 2) (C 1)
Drsquo (A 0) (B 0) (C 0) (C 2)
δlsquo A C A C A C A B C A B C A B C A B C A C
A C
Frsquo
A C
A B C
23
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 2 determinizing transitions for NXFAs
Qrsquo A C A B C (A 0) (A 0) (A 0) (C 0) (C 0) (C 0) (C 0) (C 1) (A 0) (A 0) (A 0) (B 1) (A 0) (C 0) (A 0) (A 0) (A 0) (B 1) (A 0) (C 0) (A 0) (A 0) (A 0) (C 0) (C 0) (C 0) (B 1) (C 2) (C 2) (C 1) (A 0) (A 0) (A 0) (C 0) (C 0) (C 0) (C 2) (C 1)
Drsquo (A 0) (B 0) (C 0) (C 2)
δlsquo A C A C A C A B C A B C A B C A B C A C A B C A C
A C
Frsquo
A C
A B C
24
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 2 determinizing transitions for NXFAs
Qrsquo A C A B C (A 0) (A 0) (A 0) (C 0) (C 0) (C 0) (C 0) (C 1) (A 0) (A 0) (A 0) (B 1) (A 0) (C 0) (A 0) (A 0) (A 0) (B 1) (A 0) (C 0) (A 0) (A 0) (A 0) (C 0) (C 0) (C 0) (B 1) (C 2) (C 2) (C 1) (A 0) (A 0) (A 0) (C 0) (C 0) (C 0) (C 2) (C 1)
Drsquo (A 0) (B 0) (C 0) (C 2)
δlsquo A C A C A C A B C A B C A B C A B C A C A B C A C
A C
Frsquo (A C (C 1)) (A B C (C 1))
A C
A B C
25
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 3 data determinization for NXFAs
Q G H
Drsquo 3 5
δlsquo G G G H H H H G H G
( ) (G 3 5)
Frsquo
QD (G 3 5)
26
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 3 data determinization for NXFAs
Q G H 3 5 3 5
Drsquo 3 5
δlsquo G G G H H H H G H G
( ) (G 3 5)
Frsquo
QD (G 3 5)
G G
35 35
27
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 3 data determinization for NXFAs
Q G H 3 5 3 5 3 5 3 4 5Drsquo 3 5 3 4 5
δlsquo G G G H H H H G H G
( ) (G 3 5)
Frsquo
QD (G 3 5) (H 3 4 5)
G H
35 3 4 5
28
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 3 data determinization for NXFAs
Q G H 3 5 3 5 3 5 3 4 5 3 4 5 3 4 5Drsquo 3 5 3 4 5
δlsquo G G G H H H H G H G
( ) (G 3 5)
Frsquo
QD (G 3 5) (H 3 4 5)
H H
3 4 5 3 4 5
29
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 3 data determinization for NXFAs
Q G H 3 5 3 5 3 5 3 4 5 3 4 5 3 4 5 3 4 5 3 5 7
Drsquo 3 5 3 4 5 3 5 7
δlsquo G G G H H H H G H G
( ) (G 3 5)
Frsquo
QD (G 3 5) (H 3 4 5) (G 3 5 7)
H G
3 4 5 3 5 7
30
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 3 data determinization for NXFAs
Q G H 3 5 3 5 3 5 3 4 5 3 4 5 3 4 5 3 4 5 3 5 7 3 4 5 3 5
Drsquo 3 5 3 4 5 3 5 7
δlsquo G G G H H H H G H G
( ) (G 3 5)
Frsquo
QD (G 3 5) (H 3 4 5) (G 3 5 7)
H G
3 4 5 3 5
31
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 3 data determinization for NXFAs
Q G H 3 5 3 5 3 5 7 3 5 6 3 5 3 4 5 3 4 5 3 4 5 3 4 5 3 5 7 3 4 5 3 5
Drsquo 3 5 3 4 5 3 5 7 3 5 6
δlsquo G G G H H H H G H G
( ) (G 3 5)
Frsquo
QD (G 3 5) (H 3 4 5) (G 3 5 7) (G 3 5 6)
G G
3 5 7 3 5 6
32
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 3 data determinization for NXFAs
Q G H 3 5 3 5 3 5 7 3 5 6 3 5 3 4 5 3 5 7 3 4 5 3 4 5 3 4 5 3 4 5 3 5 7 3 4 5 3 5
Drsquo 3 5 3 4 5 3 5 7 3 5 6
δlsquo G G G H H H H G H G
( ) (G 3 5)
Frsquo
QD (G 3 5) (H 3 4 5) (G 3 5 7) (G 3 5 6)
G H
3 5 7 3 4 5
33
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 3 data determinization for NXFAs
Q G H 3 5 3 5 3 5 7 3 5 6 3 5 6 3 5 3 5 3 4 5 3 5 7 3 4 5 3 4 5 3 4 5 3 4 5 3 5 7 3 4 5 3 5
Drsquo 3 5 3 4 5 3 5 7 3 5 6
δlsquo G G G H H H H G H G
( ) (G 3 5)
Frsquo
QD (G 3 5) (H 3 4 5) (G 3 5 7) (G 3 5 6)
G G
3 5 6 3 5
34
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 3 data determinization for NXFAs
Q G H 3 5 3 5 3 5 7 3 5 6 3 5 6 3 5 3 5 3 4 5 3 5 7 3 4 5 3 5 6 3 4 5 3 4 5 3 4 5 3 4 5 3 5 7 3 4 5 3 5
Drsquo 3 5 3 4 5 3 5 7 3 5 6
δlsquo G G G H H H H G H G
( ) (G 3 5)
Frsquo
QD (G 3 5) (H 3 4 5) (G 3 5 7) (G 3 5 6)
G H
3 5 6 3 4 5
35
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 3 data determinization for NXFAs
Q G H 3 5 3 5 3 5 7 3 5 6 3 5 6 3 5 3 5 3 4 5 3 5 7 3 4 5 3 5 6 3 4 5 3 4 5 3 4 5 3 4 5 3 5 7 3 4 5 3 5
Drsquo 3 5 3 4 5 3 5 7 3 5 6
δlsquo G G G H H H H G H G
( ) (G 3 5)
Frsquo (G 3 5 6)
QD (G 3 5) (H 3 4 5) (G 3 5 7) (G 3 5 6)
G H
3 5 6 3 4 5
36
Build XFAs from Regex (Cont)
bull Compile to XFAbull Finding efficient implementations
37
Experimental Results
bull 1450 Regular expressions extracted from Snort HTTPbull Characteristics of combined XFA1048708bull 41994 total states =gt 42 MBbull 195 bits (~25 bytes) of aux memorybull Instruction memory 35 MB
38
Experimental Results (Cont)
39
Experimental Results (Cont)
40
Conclusion
bull DFAs for regular expressions often blow up when combined
bull XFA = DFAs+ auxiliary variables1048708bull Changes shape of automatabull Tames state space explosion
bull Result compared to other feasible approaches reduce both time and space
41
Build XFAs from Regex (Cont)
bull Compile to XFAbull From parse trees to NXFAs
bull Ex ab[^a]1 =gt ab[^a]1
15
1
2
0
3
4
bit = 0cnt = 0 sum
a
bsum
[^a]
ε
ε
ε
cnt++
if (bit == 1 ampamp cnt = 1) accept()
bit = 1
120576
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 1 ε -elimination for NXFAs
16
120576
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 1 ε -elimination for NXFAs
17
120576
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 1 ε -elimination for NXFAs
18
120576
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 2 determinizing transitions for NXFAs
Qrsquo A C
Drsquo (A 0) (B 1) (C 0) (C 2)
δlsquo
A C
Frsquo
19
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 2 determinizing transitions for NXFAs
Qrsquo A C (A 0) (A 0) (A 0) (C 0) (C 0) (C 0) (C 2) (C 1)
Drsquo (A 0) (B 1) (C 0) (C 2)
δlsquo A C A C
A C
Frsquo
A C
A C
20
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 2 determinizing transitions for NXFAs
Qrsquo A C A B C (A 0) (A 0) (A 0) (C 0) (C 0) (C 0) (C 2) (C 1) (A 0) (A 0) (A 0) (B 1) (A 0) (C 0)Drsquo (A 0) (B 1) (C 0) (C 2)
δlsquo A C A C A C A B C
A C
Frsquo
A B C
A C
21
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 2 determinizing transitions for NXFAs
Qrsquo A C A B C (A 0) (A 0) (A 0) (C 0) (C 0) (C 0) (C 0) (C 1) (A 0) (A 0) (A 0) (B 1) (A 0) (C 0) (A 0) (A 0) (A 0) (B 1) (A 0) (C 0)Drsquo (A 0) (B 0) (C 0) (C 2)
δlsquo A C A C A C A B C A B C A B C
A C
Frsquo
A B C
A B C
22
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 2 determinizing transitions for NXFAs
Qrsquo A C A B C (A 0) (A 0) (A 0) (C 0) (C 0) (C 0) (C 0) (C 1) (A 0) (A 0) (A 0) (B 1) (A 0) (C 0) (A 0) (A 0) (A 0) (B 1) (A 0) (C 0) (A 0) (A 0) (A 0) (C 0) (C 0) (C 0) (B 1) (C 2) (C 2) (C 1)
Drsquo (A 0) (B 0) (C 0) (C 2)
δlsquo A C A C A C A B C A B C A B C A B C A C
A C
Frsquo
A C
A B C
23
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 2 determinizing transitions for NXFAs
Qrsquo A C A B C (A 0) (A 0) (A 0) (C 0) (C 0) (C 0) (C 0) (C 1) (A 0) (A 0) (A 0) (B 1) (A 0) (C 0) (A 0) (A 0) (A 0) (B 1) (A 0) (C 0) (A 0) (A 0) (A 0) (C 0) (C 0) (C 0) (B 1) (C 2) (C 2) (C 1) (A 0) (A 0) (A 0) (C 0) (C 0) (C 0) (C 2) (C 1)
Drsquo (A 0) (B 0) (C 0) (C 2)
δlsquo A C A C A C A B C A B C A B C A B C A C A B C A C
A C
Frsquo
A C
A B C
24
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 2 determinizing transitions for NXFAs
Qrsquo A C A B C (A 0) (A 0) (A 0) (C 0) (C 0) (C 0) (C 0) (C 1) (A 0) (A 0) (A 0) (B 1) (A 0) (C 0) (A 0) (A 0) (A 0) (B 1) (A 0) (C 0) (A 0) (A 0) (A 0) (C 0) (C 0) (C 0) (B 1) (C 2) (C 2) (C 1) (A 0) (A 0) (A 0) (C 0) (C 0) (C 0) (C 2) (C 1)
Drsquo (A 0) (B 0) (C 0) (C 2)
δlsquo A C A C A C A B C A B C A B C A B C A C A B C A C
A C
Frsquo (A C (C 1)) (A B C (C 1))
A C
A B C
25
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 3 data determinization for NXFAs
Q G H
Drsquo 3 5
δlsquo G G G H H H H G H G
( ) (G 3 5)
Frsquo
QD (G 3 5)
26
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 3 data determinization for NXFAs
Q G H 3 5 3 5
Drsquo 3 5
δlsquo G G G H H H H G H G
( ) (G 3 5)
Frsquo
QD (G 3 5)
G G
35 35
27
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 3 data determinization for NXFAs
Q G H 3 5 3 5 3 5 3 4 5Drsquo 3 5 3 4 5
δlsquo G G G H H H H G H G
( ) (G 3 5)
Frsquo
QD (G 3 5) (H 3 4 5)
G H
35 3 4 5
28
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 3 data determinization for NXFAs
Q G H 3 5 3 5 3 5 3 4 5 3 4 5 3 4 5Drsquo 3 5 3 4 5
δlsquo G G G H H H H G H G
( ) (G 3 5)
Frsquo
QD (G 3 5) (H 3 4 5)
H H
3 4 5 3 4 5
29
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 3 data determinization for NXFAs
Q G H 3 5 3 5 3 5 3 4 5 3 4 5 3 4 5 3 4 5 3 5 7
Drsquo 3 5 3 4 5 3 5 7
δlsquo G G G H H H H G H G
( ) (G 3 5)
Frsquo
QD (G 3 5) (H 3 4 5) (G 3 5 7)
H G
3 4 5 3 5 7
30
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 3 data determinization for NXFAs
Q G H 3 5 3 5 3 5 3 4 5 3 4 5 3 4 5 3 4 5 3 5 7 3 4 5 3 5
Drsquo 3 5 3 4 5 3 5 7
δlsquo G G G H H H H G H G
( ) (G 3 5)
Frsquo
QD (G 3 5) (H 3 4 5) (G 3 5 7)
H G
3 4 5 3 5
31
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 3 data determinization for NXFAs
Q G H 3 5 3 5 3 5 7 3 5 6 3 5 3 4 5 3 4 5 3 4 5 3 4 5 3 5 7 3 4 5 3 5
Drsquo 3 5 3 4 5 3 5 7 3 5 6
δlsquo G G G H H H H G H G
( ) (G 3 5)
Frsquo
QD (G 3 5) (H 3 4 5) (G 3 5 7) (G 3 5 6)
G G
3 5 7 3 5 6
32
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 3 data determinization for NXFAs
Q G H 3 5 3 5 3 5 7 3 5 6 3 5 3 4 5 3 5 7 3 4 5 3 4 5 3 4 5 3 4 5 3 5 7 3 4 5 3 5
Drsquo 3 5 3 4 5 3 5 7 3 5 6
δlsquo G G G H H H H G H G
( ) (G 3 5)
Frsquo
QD (G 3 5) (H 3 4 5) (G 3 5 7) (G 3 5 6)
G H
3 5 7 3 4 5
33
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 3 data determinization for NXFAs
Q G H 3 5 3 5 3 5 7 3 5 6 3 5 6 3 5 3 5 3 4 5 3 5 7 3 4 5 3 4 5 3 4 5 3 4 5 3 5 7 3 4 5 3 5
Drsquo 3 5 3 4 5 3 5 7 3 5 6
δlsquo G G G H H H H G H G
( ) (G 3 5)
Frsquo
QD (G 3 5) (H 3 4 5) (G 3 5 7) (G 3 5 6)
G G
3 5 6 3 5
34
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 3 data determinization for NXFAs
Q G H 3 5 3 5 3 5 7 3 5 6 3 5 6 3 5 3 5 3 4 5 3 5 7 3 4 5 3 5 6 3 4 5 3 4 5 3 4 5 3 4 5 3 5 7 3 4 5 3 5
Drsquo 3 5 3 4 5 3 5 7 3 5 6
δlsquo G G G H H H H G H G
( ) (G 3 5)
Frsquo
QD (G 3 5) (H 3 4 5) (G 3 5 7) (G 3 5 6)
G H
3 5 6 3 4 5
35
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 3 data determinization for NXFAs
Q G H 3 5 3 5 3 5 7 3 5 6 3 5 6 3 5 3 5 3 4 5 3 5 7 3 4 5 3 5 6 3 4 5 3 4 5 3 4 5 3 4 5 3 5 7 3 4 5 3 5
Drsquo 3 5 3 4 5 3 5 7 3 5 6
δlsquo G G G H H H H G H G
( ) (G 3 5)
Frsquo (G 3 5 6)
QD (G 3 5) (H 3 4 5) (G 3 5 7) (G 3 5 6)
G H
3 5 6 3 4 5
36
Build XFAs from Regex (Cont)
bull Compile to XFAbull Finding efficient implementations
37
Experimental Results
bull 1450 Regular expressions extracted from Snort HTTPbull Characteristics of combined XFA1048708bull 41994 total states =gt 42 MBbull 195 bits (~25 bytes) of aux memorybull Instruction memory 35 MB
38
Experimental Results (Cont)
39
Experimental Results (Cont)
40
Conclusion
bull DFAs for regular expressions often blow up when combined
bull XFA = DFAs+ auxiliary variables1048708bull Changes shape of automatabull Tames state space explosion
bull Result compared to other feasible approaches reduce both time and space
41
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 1 ε -elimination for NXFAs
16
120576
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 1 ε -elimination for NXFAs
17
120576
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 1 ε -elimination for NXFAs
18
120576
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 2 determinizing transitions for NXFAs
Qrsquo A C
Drsquo (A 0) (B 1) (C 0) (C 2)
δlsquo
A C
Frsquo
19
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 2 determinizing transitions for NXFAs
Qrsquo A C (A 0) (A 0) (A 0) (C 0) (C 0) (C 0) (C 2) (C 1)
Drsquo (A 0) (B 1) (C 0) (C 2)
δlsquo A C A C
A C
Frsquo
A C
A C
20
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 2 determinizing transitions for NXFAs
Qrsquo A C A B C (A 0) (A 0) (A 0) (C 0) (C 0) (C 0) (C 2) (C 1) (A 0) (A 0) (A 0) (B 1) (A 0) (C 0)Drsquo (A 0) (B 1) (C 0) (C 2)
δlsquo A C A C A C A B C
A C
Frsquo
A B C
A C
21
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 2 determinizing transitions for NXFAs
Qrsquo A C A B C (A 0) (A 0) (A 0) (C 0) (C 0) (C 0) (C 0) (C 1) (A 0) (A 0) (A 0) (B 1) (A 0) (C 0) (A 0) (A 0) (A 0) (B 1) (A 0) (C 0)Drsquo (A 0) (B 0) (C 0) (C 2)
δlsquo A C A C A C A B C A B C A B C
A C
Frsquo
A B C
A B C
22
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 2 determinizing transitions for NXFAs
Qrsquo A C A B C (A 0) (A 0) (A 0) (C 0) (C 0) (C 0) (C 0) (C 1) (A 0) (A 0) (A 0) (B 1) (A 0) (C 0) (A 0) (A 0) (A 0) (B 1) (A 0) (C 0) (A 0) (A 0) (A 0) (C 0) (C 0) (C 0) (B 1) (C 2) (C 2) (C 1)
Drsquo (A 0) (B 0) (C 0) (C 2)
δlsquo A C A C A C A B C A B C A B C A B C A C
A C
Frsquo
A C
A B C
23
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 2 determinizing transitions for NXFAs
Qrsquo A C A B C (A 0) (A 0) (A 0) (C 0) (C 0) (C 0) (C 0) (C 1) (A 0) (A 0) (A 0) (B 1) (A 0) (C 0) (A 0) (A 0) (A 0) (B 1) (A 0) (C 0) (A 0) (A 0) (A 0) (C 0) (C 0) (C 0) (B 1) (C 2) (C 2) (C 1) (A 0) (A 0) (A 0) (C 0) (C 0) (C 0) (C 2) (C 1)
Drsquo (A 0) (B 0) (C 0) (C 2)
δlsquo A C A C A C A B C A B C A B C A B C A C A B C A C
A C
Frsquo
A C
A B C
24
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 2 determinizing transitions for NXFAs
Qrsquo A C A B C (A 0) (A 0) (A 0) (C 0) (C 0) (C 0) (C 0) (C 1) (A 0) (A 0) (A 0) (B 1) (A 0) (C 0) (A 0) (A 0) (A 0) (B 1) (A 0) (C 0) (A 0) (A 0) (A 0) (C 0) (C 0) (C 0) (B 1) (C 2) (C 2) (C 1) (A 0) (A 0) (A 0) (C 0) (C 0) (C 0) (C 2) (C 1)
Drsquo (A 0) (B 0) (C 0) (C 2)
δlsquo A C A C A C A B C A B C A B C A B C A C A B C A C
A C
Frsquo (A C (C 1)) (A B C (C 1))
A C
A B C
25
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 3 data determinization for NXFAs
Q G H
Drsquo 3 5
δlsquo G G G H H H H G H G
( ) (G 3 5)
Frsquo
QD (G 3 5)
26
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 3 data determinization for NXFAs
Q G H 3 5 3 5
Drsquo 3 5
δlsquo G G G H H H H G H G
( ) (G 3 5)
Frsquo
QD (G 3 5)
G G
35 35
27
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 3 data determinization for NXFAs
Q G H 3 5 3 5 3 5 3 4 5Drsquo 3 5 3 4 5
δlsquo G G G H H H H G H G
( ) (G 3 5)
Frsquo
QD (G 3 5) (H 3 4 5)
G H
35 3 4 5
28
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 3 data determinization for NXFAs
Q G H 3 5 3 5 3 5 3 4 5 3 4 5 3 4 5Drsquo 3 5 3 4 5
δlsquo G G G H H H H G H G
( ) (G 3 5)
Frsquo
QD (G 3 5) (H 3 4 5)
H H
3 4 5 3 4 5
29
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 3 data determinization for NXFAs
Q G H 3 5 3 5 3 5 3 4 5 3 4 5 3 4 5 3 4 5 3 5 7
Drsquo 3 5 3 4 5 3 5 7
δlsquo G G G H H H H G H G
( ) (G 3 5)
Frsquo
QD (G 3 5) (H 3 4 5) (G 3 5 7)
H G
3 4 5 3 5 7
30
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 3 data determinization for NXFAs
Q G H 3 5 3 5 3 5 3 4 5 3 4 5 3 4 5 3 4 5 3 5 7 3 4 5 3 5
Drsquo 3 5 3 4 5 3 5 7
δlsquo G G G H H H H G H G
( ) (G 3 5)
Frsquo
QD (G 3 5) (H 3 4 5) (G 3 5 7)
H G
3 4 5 3 5
31
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 3 data determinization for NXFAs
Q G H 3 5 3 5 3 5 7 3 5 6 3 5 3 4 5 3 4 5 3 4 5 3 4 5 3 5 7 3 4 5 3 5
Drsquo 3 5 3 4 5 3 5 7 3 5 6
δlsquo G G G H H H H G H G
( ) (G 3 5)
Frsquo
QD (G 3 5) (H 3 4 5) (G 3 5 7) (G 3 5 6)
G G
3 5 7 3 5 6
32
Build XFAs from Regex (Cont)
• Compile to XFA
• Alg. 3: data determinization for NXFAs
Q: G, H
D': {3, 5}, {3, 4, 5}, {3, 5, 7}, {3, 5, 6}
δ': G → G, G → H, H → H, H → G, H → G
Start configuration: (G, {3, 5})
F': ∅
QD: (G, {3, 5}), (H, {3, 4, 5}), (G, {3, 5, 7}), (G, {3, 5, 6})
Step: (G, {3, 5, 7}) → (H, {3, 4, 5})
Build XFAs from Regex (Cont)
• Compile to XFA
• Alg. 3: data determinization for NXFAs
Q: G, H
D': {3, 5}, {3, 4, 5}, {3, 5, 7}, {3, 5, 6}
δ': G → G, G → H, H → H, H → G, H → G
Start configuration: (G, {3, 5})
F': ∅
QD: (G, {3, 5}), (H, {3, 4, 5}), (G, {3, 5, 7}), (G, {3, 5, 6})
Step: (G, {3, 5, 6}) → (G, {3, 5})
Build XFAs from Regex (Cont)
• Compile to XFA
• Alg. 3: data determinization for NXFAs
Q: G, H
D': {3, 5}, {3, 4, 5}, {3, 5, 7}, {3, 5, 6}
δ': G → G, G → H, H → H, H → G, H → G
Start configuration: (G, {3, 5})
F': ∅
QD: (G, {3, 5}), (H, {3, 4, 5}), (G, {3, 5, 7}), (G, {3, 5, 6})
Step: (G, {3, 5, 6}) → (H, {3, 4, 5})
Build XFAs from Regex (Cont)
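• Compile to XFA
• Alg. 3: data determinization for NXFAs
Q: G, H
D': {3, 5}, {3, 4, 5}, {3, 5, 7}, {3, 5, 6}
δ': G → G, G → H, H → H, H → G, H → G
Start configuration: (G, {3, 5})
F': (G, {3, 5, 6})
QD: (G, {3, 5}), (H, {3, 4, 5}), (G, {3, 5, 7}), (G, {3, 5, 6})
Transitions found:
(G, {3, 5}) → (G, {3, 5})
(G, {3, 5}) → (H, {3, 4, 5})
(H, {3, 4, 5}) → (H, {3, 4, 5})
(H, {3, 4, 5}) → (G, {3, 5, 7})
(H, {3, 4, 5}) → (G, {3, 5})
(G, {3, 5, 7}) → (G, {3, 5, 6})
(G, {3, 5, 7}) → (H, {3, 4, 5})
(G, {3, 5, 6}) → (G, {3, 5})
(G, {3, 5, 6}) → (H, {3, 4, 5})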
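The walkthrough above can be sketched as a worklist over (state, set of data values) configurations, in the style of subset construction. This is only a sketch under assumptions: the input symbols a, b, c, the per-value update relations in DELTA, and the accepting pair (G, 6) are hypothetical, chosen so that the run reproduces the QD and F' sets listed on the slides. They are not taken from the paper.

```python
from collections import deque

# Hypothetical NXFA: (state, symbol) -> (next state, update relation).
# The update relation maps one data value to the set of values it may become.
DELTA = {
    ("G", "a"): ("G", {3: {3}, 5: {5}, 7: {6}}),
    ("G", "b"): ("H", {3: {3, 4}, 5: {5}}),
    ("H", "a"): ("H", {3: {3}, 4: {4}, 5: {5}}),
    ("H", "b"): ("G", {3: {3}, 4: {7}, 5: {5}}),
    ("H", "c"): ("G", {3: {3}, 5: {5}}),
}
START = ("G", frozenset({3, 5}))
ACCEPTING = {("G", 6)}  # assumed accepting (state, data value) pairs


def data_determinize(delta, start, accepting):
    """Explore reachable (state, value-set) configurations with a worklist."""
    seen = {start}
    work = deque([start])
    trans = {}
    while work:
        state, values = work.popleft()
        for (src, sym), (dst, update) in delta.items():
            if src != state:
                continue
            # Apply the update to every value in the current set.
            new_values = frozenset(
                v2 for v in values for v2 in update.get(v, ())
            )
            target = (dst, new_values)
            trans[((state, values), sym)] = target
            if target not in seen:
                seen.add(target)
                work.append(target)
    # A configuration accepts if any of its values is accepting in its state.
    final = {
        (q, s) for (q, s) in seen if any((q, v) in accepting for v in s)
    }
    return seen, trans, final


qd, trans, final = data_determinize(DELTA, START, ACCEPTING)
print(len(qd), len(trans), len(final))
```

Each resulting configuration has exactly one data-value set, so the update along every transition becomes a function of the current set rather than a relation.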
Build XFAs from Regex (Cont)
• Compile to XFA
• Finding efficient implementations
Experimental Results
• 1450 regular expressions extracted from Snort HTTP
• Characteristics of combined XFA:
• 41994 total states => 42 MB
• 195 bits (~25 bytes) of aux memory
• Instruction memory 35 MB
Conclusion
• DFAs for regular expressions often blow up when combined
• XFA = DFAs + auxiliary variables
• Changes shape of automata
• Tames state space explosion
• Result: compared to other feasible approaches, XFAs reduce both time and space
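A minimal sketch of the "DFA + auxiliary variables" idea, assuming simplified signatures of the form .*x.*y where x and y are single distinct characters. The function name xfa_match and the sample signatures are hypothetical. A combined DFA has to remember which subset of the first characters has already been seen, which can take up to 2^n states for n signatures; the sketch keeps one bit per signature instead.

```python
def xfa_match(signatures, payload):
    """Return the indices of signatures (first, second) matched in payload.

    One scratch bit per signature records that `first` has been seen;
    the bit is tested when `second` arrives.
    """
    bits = [False] * len(signatures)
    matched = set()
    for ch in payload:
        for i, (first, second) in enumerate(signatures):
            # Test before set, so `second` must come after `first`.
            if ch == second and bits[i]:
                matched.add(i)
            if ch == first:
                bits[i] = True
    return matched


sigs = [("a", "b"), ("c", "d"), ("e", "f")]
print(sorted(xfa_match(sigs, "xaycbd")))  # prints [0, 1]
```

The loop over signatures stands in for the automaton's transitions. In an actual XFA, the set and test instructions would be attached to states or transitions.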
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 2 determinizing transitions for NXFAs
Qrsquo A C A B C (A 0) (A 0) (A 0) (C 0) (C 0) (C 0) (C 0) (C 1) (A 0) (A 0) (A 0) (B 1) (A 0) (C 0) (A 0) (A 0) (A 0) (B 1) (A 0) (C 0)Drsquo (A 0) (B 0) (C 0) (C 2)
δlsquo A C A C A C A B C A B C A B C
A C
Frsquo
A B C
A B C
22
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 2 determinizing transitions for NXFAs
Qrsquo A C A B C (A 0) (A 0) (A 0) (C 0) (C 0) (C 0) (C 0) (C 1) (A 0) (A 0) (A 0) (B 1) (A 0) (C 0) (A 0) (A 0) (A 0) (B 1) (A 0) (C 0) (A 0) (A 0) (A 0) (C 0) (C 0) (C 0) (B 1) (C 2) (C 2) (C 1)
Drsquo (A 0) (B 0) (C 0) (C 2)
δlsquo A C A C A C A B C A B C A B C A B C A C
A C
Frsquo
A C
A B C
23
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 2 determinizing transitions for NXFAs
Qrsquo A C A B C (A 0) (A 0) (A 0) (C 0) (C 0) (C 0) (C 0) (C 1) (A 0) (A 0) (A 0) (B 1) (A 0) (C 0) (A 0) (A 0) (A 0) (B 1) (A 0) (C 0) (A 0) (A 0) (A 0) (C 0) (C 0) (C 0) (B 1) (C 2) (C 2) (C 1) (A 0) (A 0) (A 0) (C 0) (C 0) (C 0) (C 2) (C 1)
Drsquo (A 0) (B 0) (C 0) (C 2)
δlsquo A C A C A C A B C A B C A B C A B C A C A B C A C
A C
Frsquo
A C
A B C
24
Build XFAs from Regex (Cont)
• Compile to XFA
  • Alg. 2: determinizing transitions for NXFAs
Q' A C A B C (A 0) (A 0) (A 0) (C 0) (C 0) (C 0) (C 0) (C 1) (A 0) (A 0) (A 0) (B 1) (A 0) (C 0) (A 0) (A 0) (A 0) (B 1) (A 0) (C 0) (A 0) (A 0) (A 0) (C 0) (C 0) (C 0) (B 1) (C 2) (C 2) (C 1) (A 0) (A 0) (A 0) (C 0) (C 0) (C 0) (C 2) (C 1)
D' (A 0) (B 0) (C 0) (C 2)
δ' A C A C A C A B C A B C A B C A B C A C A B C A C
A C
F' (A C (C 1)) (A B C (C 1))
A C
A B C
25
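The slides above build new states such as {A, C} and {A, B, C} out of sets of NXFA states. The sketch below shows only the classical subset construction for plain NFAs, which is the familiar analogue of that step; it is not the paper's Algorithm 2, which additionally tracks data values. The toy automaton, its symbols `x` and `y`, and the function name are hypothetical.

```python
from collections import deque

def determinize(start, delta, alphabet):
    """Classical subset construction.

    delta maps (state, symbol) -> set of successor states.
    Returns (set of reachable state-sets, deterministic transition dict).
    """
    start_set = frozenset([start])
    seen = {start_set}
    dfa = {}
    queue = deque([start_set])
    while queue:
        current = queue.popleft()
        for sym in alphabet:
            # Union of successors of every NFA state in the current set.
            nxt = frozenset(t for s in current for t in delta.get((s, sym), ()))
            dfa[(current, sym)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen, dfa

# Hypothetical toy NFA, chosen only for illustration.
TOY_DELTA = {
    ("A", "x"): {"A", "B"},
    ("A", "y"): {"A"},
    ("B", "y"): {"C"},
}

if __name__ == "__main__":
    states, dfa = determinize("A", TOY_DELTA, "xy")
    for s in sorted(states, key=sorted):
        print(sorted(s))
```

Each deterministic state is a frozen set of original states, so two paths that reach the same combination share one entry in the worklist.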
Build XFAs from Regex (Cont)
• Compile to XFA
  • Alg. 3: data determinization for NXFAs
Q G H
D' 3 5
δ' G G G H H H H G H G
( ) (G 3 5)
F'
QD (G 3 5)
26
Build XFAs from Regex (Cont)
• Compile to XFA
  • Alg. 3: data determinization for NXFAs
Q G H 3 5 3 5
D' 3 5
δ' G G G H H H H G H G
( ) (G 3 5)
F'
QD (G 3 5)
G G
35 35
27
Build XFAs from Regex (Cont)
• Compile to XFA
  • Alg. 3: data determinization for NXFAs
Q G H 3 5 3 5 3 5 3 4 5
D' 3 5 3 4 5
δ' G G G H H H H G H G
( ) (G 3 5)
F'
QD (G 3 5) (H 3 4 5)
G H
35 3 4 5
28
Build XFAs from Regex (Cont)
• Compile to XFA
  • Alg. 3: data determinization for NXFAs
Q G H 3 5 3 5 3 5 3 4 5 3 4 5 3 4 5
D' 3 5 3 4 5
δ' G G G H H H H G H G
( ) (G 3 5)
F'
QD (G 3 5) (H 3 4 5)
H H
3 4 5 3 4 5
29
Build XFAs from Regex (Cont)
• Compile to XFA
  • Alg. 3: data determinization for NXFAs
Q G H 3 5 3 5 3 5 3 4 5 3 4 5 3 4 5 3 4 5 3 5 7
D' 3 5 3 4 5 3 5 7
δ' G G G H H H H G H G
( ) (G 3 5)
F'
QD (G 3 5) (H 3 4 5) (G 3 5 7)
H G
3 4 5 3 5 7
30
Build XFAs from Regex (Cont)
• Compile to XFA
  • Alg. 3: data determinization for NXFAs
Q G H 3 5 3 5 3 5 3 4 5 3 4 5 3 4 5 3 4 5 3 5 7 3 4 5 3 5
D' 3 5 3 4 5 3 5 7
δ' G G G H H H H G H G
( ) (G 3 5)
F'
QD (G 3 5) (H 3 4 5) (G 3 5 7)
H G
3 4 5 3 5
31
Build XFAs from Regex (Cont)
• Compile to XFA
  • Alg. 3: data determinization for NXFAs
Q G H 3 5 3 5 3 5 7 3 5 6 3 5 3 4 5 3 4 5 3 4 5 3 4 5 3 5 7 3 4 5 3 5
D' 3 5 3 4 5 3 5 7 3 5 6
δ' G G G H H H H G H G
( ) (G 3 5)
F'
QD (G 3 5) (H 3 4 5) (G 3 5 7) (G 3 5 6)
G G
3 5 7 3 5 6
32
Build XFAs from Regex (Cont)
• Compile to XFA
  • Alg. 3: data determinization for NXFAs
Q G H 3 5 3 5 3 5 7 3 5 6 3 5 3 4 5 3 5 7 3 4 5 3 4 5 3 4 5 3 4 5 3 5 7 3 4 5 3 5
D' 3 5 3 4 5 3 5 7 3 5 6
δ' G G G H H H H G H G
( ) (G 3 5)
F'
QD (G 3 5) (H 3 4 5) (G 3 5 7) (G 3 5 6)
G H
3 5 7 3 4 5
33
Build XFAs from Regex (Cont)
• Compile to XFA
  • Alg. 3: data determinization for NXFAs
Q G H 3 5 3 5 3 5 7 3 5 6 3 5 6 3 5 3 5 3 4 5 3 5 7 3 4 5 3 4 5 3 4 5 3 4 5 3 5 7 3 4 5 3 5
D' 3 5 3 4 5 3 5 7 3 5 6
δ' G G G H H H H G H G
( ) (G 3 5)
F'
QD (G 3 5) (H 3 4 5) (G 3 5 7) (G 3 5 6)
G G
3 5 6 3 5
34
Build XFAs from Regex (Cont)
• Compile to XFA
  • Alg. 3: data determinization for NXFAs
Q G H 3 5 3 5 3 5 7 3 5 6 3 5 6 3 5 3 5 3 4 5 3 5 7 3 4 5 3 5 6 3 4 5 3 4 5 3 4 5 3 4 5 3 5 7 3 4 5 3 5
D' 3 5 3 4 5 3 5 7 3 5 6
δ' G G G H H H H G H G
( ) (G 3 5)
F'
QD (G 3 5) (H 3 4 5) (G 3 5 7) (G 3 5 6)
G H
3 5 6 3 4 5
35
Build XFAs from Regex (Cont)
• Compile to XFA
  • Alg. 3: data determinization for NXFAs
Q G H 3 5 3 5 3 5 7 3 5 6 3 5 6 3 5 3 5 3 4 5 3 5 7 3 4 5 3 5 6 3 4 5 3 4 5 3 4 5 3 4 5 3 5 7 3 4 5 3 5
D' 3 5 3 4 5 3 5 7 3 5 6
δ' G G G H H H H G H G
( ) (G 3 5)
F' (G 3 5 6)
QD (G 3 5) (H 3 4 5) (G 3 5 7) (G 3 5 6)
G H
3 5 6 3 4 5
36
Build XFAs from Regex (Cont)
• Compile to XFA
  • Finding efficient implementations
37
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 3 data determinization for NXFAs
Q G H 3 5 3 5 3 5 3 4 5 3 4 5 3 4 5 3 4 5 3 5 7
Drsquo 3 5 3 4 5 3 5 7
δlsquo G G G H H H H G H G
( ) (G 3 5)
Frsquo
QD (G 3 5) (H 3 4 5) (G 3 5 7)
H G
3 4 5 3 5 7
30
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 3 data determinization for NXFAs
Q G H 3 5 3 5 3 5 3 4 5 3 4 5 3 4 5 3 4 5 3 5 7 3 4 5 3 5
Drsquo 3 5 3 4 5 3 5 7
δlsquo G G G H H H H G H G
( ) (G 3 5)
Frsquo
QD (G 3 5) (H 3 4 5) (G 3 5 7)
H G
3 4 5 3 5
31
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 3 data determinization for NXFAs
Q G H 3 5 3 5 3 5 7 3 5 6 3 5 3 4 5 3 4 5 3 4 5 3 4 5 3 5 7 3 4 5 3 5
Drsquo 3 5 3 4 5 3 5 7 3 5 6
δlsquo G G G H H H H G H G
( ) (G 3 5)
Frsquo
QD (G 3 5) (H 3 4 5) (G 3 5 7) (G 3 5 6)
G G
3 5 7 3 5 6
32
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 3 data determinization for NXFAs
Q G H 3 5 3 5 3 5 7 3 5 6 3 5 3 4 5 3 5 7 3 4 5 3 4 5 3 4 5 3 4 5 3 5 7 3 4 5 3 5
Drsquo 3 5 3 4 5 3 5 7 3 5 6
δlsquo G G G H H H H G H G
( ) (G 3 5)
Frsquo
QD (G 3 5) (H 3 4 5) (G 3 5 7) (G 3 5 6)
G H
3 5 7 3 4 5
33
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 3 data determinization for NXFAs
Q G H 3 5 3 5 3 5 7 3 5 6 3 5 6 3 5 3 5 3 4 5 3 5 7 3 4 5 3 4 5 3 4 5 3 4 5 3 5 7 3 4 5 3 5
Drsquo 3 5 3 4 5 3 5 7 3 5 6
δlsquo G G G H H H H G H G
( ) (G 3 5)
Frsquo
QD (G 3 5) (H 3 4 5) (G 3 5 7) (G 3 5 6)
G G
3 5 6 3 5
34
Build XFAs from Regex (Cont)
bull Compile to XFAbull Alg 3 data determinization for NXFAs
Q G H 3 5 3 5 3 5 7 3 5 6 3 5 6 3 5 3 5 3 4 5 3 5 7 3 4 5 3 5 6 3 4 5 3 4 5 3 4 5 3 4 5 3 5 7 3 4 5 3 5
Drsquo 3 5 3 4 5 3 5 7 3 5 6
δlsquo G G G H H H H G H G
( ) (G 3 5)
Frsquo
QD (G 3 5) (H 3 4 5) (G 3 5 7) (G 3 5 6)
G H
3 5 6 3 4 5
35
Build XFAs from Regex (Cont)
• Compile to XFA
• Alg. 3: data determinization for NXFAs
[Figure: final step of the data-determinization worked example over states G and H. D′ = {3,5}, {3,4,5}, {3,5,7}, {3,5,6}; initial pair (G, {3,5}); F′ = (G, {3,5,6}); Q_D = (G, {3,5}), (H, {3,4,5}), (G, {3,5,7}), (G, {3,5,6}); pairs shown for this step: (G, {3,5,6}) and (H, {3,4,5}).]
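The worked example above explores pairs of a state and a set of data values until no new pair appears. The following is a minimal sketch of that general worklist idea, not a transcription of the paper's Alg. 3. The function name `determinize_data`, the toy automaton (states `A` and `B`, data values 0 and 1), and the representation of updates as a relation of `(old, new)` pairs are all assumptions made for illustration.

```python
from collections import deque

def determinize_data(start, alphabet, delta, update, accepting):
    """Worklist exploration of (state, set-of-data-values) pairs.

    delta[(q, sym)]  -> next state (state transitions already deterministic)
    update[(q, sym)] -> set of (old, new) pairs, possibly nondeterministic
    accepting        -> set of (state, data value) pairs that accept
    """
    start_pair = (start[0], frozenset(start[1]))
    seen = {start_pair}
    trans = {}
    work = deque([start_pair])
    while work:
        q, values = work.popleft()
        for sym in alphabet:
            q2 = delta[(q, sym)]
            # Collect every data value reachable from any current value.
            values2 = frozenset(new for (old, new) in update[(q, sym)]
                                if old in values)
            nxt = (q2, values2)
            trans[((q, values), sym)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                work.append(nxt)
    # A pair accepts if at least one of its data values accepts in that state.
    final = {(q, vs) for (q, vs) in seen
             if any((q, d) in accepting for d in vs)}
    return seen, trans, final

# Hypothetical toy input (not the automaton from the slides).
ident = {(0, 0), (1, 1)}
delta = {("A", "x"): "B", ("A", "y"): "A", ("B", "x"): "B", ("B", "y"): "A"}
update = {
    ("A", "x"): {(0, 0), (0, 1), (1, 1)},  # value 0 may stay 0 or become 1
    ("A", "y"): ident,
    ("B", "x"): ident,
    ("B", "y"): ident,
}
pairs, trans, final = determinize_data(("A", {0}), "xy", delta, update,
                                       accepting={("B", 1)})
print(len(pairs), len(final))
```

On this toy input the exploration reaches three pairs, one of which is accepting. The set-valued second component plays the same role as the sets in D′ on the slide.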
36
Build XFAs from Regex (Cont)
• Compile to XFA
• Finding efficient implementations
37
Experimental Results
• 1450 regular expressions extracted from Snort HTTP
• Characteristics of combined XFA
  • 41,994 total states => 42 MB
  • 195 bits (~25 bytes) of aux memory
  • Instruction memory 35 MB
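As a rough sanity check on these figures, the arithmetic can be sketched as follows. The bit-to-byte conversion follows directly from the slide. The transition-table estimate rests on two assumptions that the slide does not state: 256 input symbols per state and 4 bytes per table entry.

```python
import math

states = 41994      # from the slide
aux_bits = 195      # from the slide

# 195 bits rounded up to whole bytes.
aux_bytes = math.ceil(aux_bits / 8)

# Assumed layout (not stated on the slide): one 4-byte entry per
# (state, input byte) pair in a flat transition table.
SYMBOLS = 256
ENTRY_BYTES = 4
table_bytes = states * SYMBOLS * ENTRY_BYTES

print(aux_bytes)                      # 25
print(round(table_bytes / 1e6, 1))    # 43.0
```

Under those assumptions the table comes to about 43 million bytes, which is in the same ballpark as the 42 MB quoted on the slide.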
38
Experimental Results (Cont)
39
Experimental Results (Cont)
40
Conclusion
• DFAs for regular expressions often blow up when combined
• XFA = DFA + auxiliary variables
  • Changes shape of automata
  • Tames state space explosion
• Result: compared to other feasible approaches, XFAs reduce both time and space
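To make "DFA + auxiliary variables" concrete, here is a minimal sketch of a hand-built matcher for a signature of the form `.*ab.*cd`, using three automaton states plus one bit of scratch memory. The function name `xfa_match`, the state numbering, and the choice of signature are assumptions for illustration; this is not the compiler output described in the paper.

```python
def xfa_match(data: str) -> bool:
    """Match a signature of the form .*ab.*cd with 3 states and 1 bit.

    States: 0 = nothing useful seen, 1 = just saw 'a', 2 = just saw 'c'.
    The bit records whether "ab" has already occurred.
    """
    state = 0
    bit = False
    for ch in data:
        if state == 1 and ch == "b":
            bit = True            # left part "ab" recognized: set the bit
        elif state == 2 and ch == "d" and bit:
            return True           # right part "cd" recognized: test the bit
        # Plain DFA step that only tracks the previous character.
        state = 1 if ch == "a" else 2 if ch == "c" else 0
    return False

print(xfa_match("xxabyycd"))  # True
print(xfa_match("cdab"))      # False
```

The automaton itself only spots the strings "ab" and "cd"; the ordering constraint between them lives in the bit rather than in extra states, which is the effect the deck summarizes as changing the shape of the automaton.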
41