[Y2006]a Scalable Architecture for High-Throughput Regular-Expression Pattern Matching

Embed Size (px)

Citation preview

  • 8/11/2019 [Y2006]a Scalable Architecture for High-Throughput Regular-Expression Pattern Matching

    1/26

    A Scalable Architecture For High-

    Throughput Regular-Expression

    Pattern Matching

    Authors Benjamin C. Brodie, Ron K.Cytron, David E. Taylor

    PresenterYu-Tso Chen

    Date September 10, 2007

    Publisher ISCA06

  • 8/11/2019 [Y2006]a Scalable Architecture for High-Throughput Regular-Expression Pattern Matching

    2/26

    2

    Outline

    Introduction

    Regular expression

    Approach and architecture Optimizations

    Performance

  • 8/11/2019 [Y2006]a Scalable Architecture for High-Throughput Regular-Expression Pattern Matching

    3/26

    3

    Introduction

    The problem of discovering credit card numbers,

    currency values, or telephone numbersrequires a

    more general specification mechanism.

    For such applications, the most widely used pattern-

    specification language is regular expressions.

  • 8/11/2019 [Y2006]a Scalable Architecture for High-Throughput Regular-Expression Pattern Matching

    4/26

    4

    Introduction

    We describe a new high throughput FSMrepresentation andits construction algorithm.

    State transitions are triggered by a sequence of msymbols,rather than by a single symbol.

    Character and sequence encoding to reduce a transition tablescolumn count, run-length codingto compress the transition

    table.

  • 8/11/2019 [Y2006]a Scalable Architecture for High-Throughput Regular-Expression Pattern Matching

    5/26

    5

    Regular expression

    \$[0 - 9]+(\.[0 - 9]{2}){0,1}

    A match must begin with $. The match continues only by the appearance of one or more

    decimal digits.

    Optionally, the match may continue with a decimal pointfollowed by exactly two decimal digits.

    EX:

    $3, $34, $347, $347.12

  • 8/11/2019 [Y2006]a Scalable Architecture for High-Throughput Regular-Expression Pattern Matching

    6/26

    6

    FSM exampleRegular expression: \$[0 - 9]+(\.[0 - 9]{2}){0,1}

    (a) symbol denotes all symbols not in{ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, $, . }

    (b) Transition table for the FSM; if is the ASCII character set, then the

    table has 256 columns.

  • 8/11/2019 [Y2006]a Scalable Architecture for High-Throughput Regular-Expression Pattern Matching

    7/26

    7

    Approach

    A set of regular expressions is presented to the

    Regular Expression Compiler.

    Generating three tables: symbol ECI (Equivalence

    Class Identifier) table, indirection table, transition table.

    The Regular Expression Circuits implement high-

    throughput FSMs.

  • 8/11/2019 [Y2006]a Scalable Architecture for High-Throughput Regular-Expression Pattern Matching

    8/26

    8

    Overall Architecture

    Regular Expression Compiler

    Regular Expression Circuits

  • 8/11/2019 [Y2006]a Scalable Architecture for High-Throughput Regular-Expression Pattern Matching

    9/26

    9

    A High-Throughput FSM Modern interconnection interfaces transport multiple bytes per

    cycle, making the FSM the bottleneck in terms of achieving

    higher throughput.

    We consider an extension of the FSM that allows it to performa single transition based on a string of msymbols.

    EX:

    Based on the table in Figure 2(b), if the current state is E, then

    the input sequence 2$would result in a transition to state B:

    2

    (E, 2$) =

    ((E, 2), $) = B.

  • 8/11/2019 [Y2006]a Scalable Architecture for High-Throughput Regular-Expression Pattern Matching

    10/26

    10

    New FSM Because it performs multiple transitions per cycle, the higher

    throughput FSM can traverse an accepting state of the originalFSM during a transition.

    Each accept may be required to report the origination point of

    the match in the target.

    To support recognition of accept and restart conditions, weaugment the traditional transition function with two flags(bits):The first flag indicates a restarttransition and the second flag

    indicates an accepttransition. (like Moore or Mealy model?)

    The automaton includes state variables band eto record thebeginningand endof a match; initially, b =e = 0, and the index ofthe first symbol of the target is 1.

  • 8/11/2019 [Y2006]a Scalable Architecture for High-Throughput Regular-Expression Pattern Matching

    11/26

    11

    black:e is advanced by m.

    A match is in progress and the

    portion of the target string

    participating thus far begins atposition b and ends at position e,

    inclusively.

    red only:e is advanced by m and

    a match is declared.The target

    substring causing the match starts

    at position b and ends at position

    e.

    green only:b is advanced by m

    and e is set to b.The automatonis restarting the match process.

    red and green:The red action is

    performed before the green

    action.

    Transition on a red edgeindicate a pattern

    matchTransition on a green edgeindicates thefirst symbol of a new potential match.

  • 8/11/2019 [Y2006]a Scalable Architecture for High-Throughput Regular-Expression Pattern Matching

    12/26

    12

    Alphabet Encoding

    The size of a FSMs transition table () increasesexponentiallywith the lengthof the input sequenceconsumed in each cycle.

    Given a transition table : Q Q, an O(||2|Q|)algorithm for partitioning into a (typically) reduced set ofequivalence classes.

    We call the index into the encoded table an ECI(Equivalence class Identifier).

    Of the 16 possible two-character input sequences, thetransition table requires only 8columns due to the ECI

    mapping.

  • 8/11/2019 [Y2006]a Scalable Architecture for High-Throughput Regular-Expression Pattern Matching

    13/26

    13

    Example

  • 8/11/2019 [Y2006]a Scalable Architecture for High-Throughput Regular-Expression Pattern Matching

    14/26

    14

    High-Throughput with Encoding

    1. The FSM d is constructed for r (regular expression) in the

    usual way.

    2. A set of transitions is computed that would allow theautomaton to accept based on starting at anyposition in the

    target. This is accomplished by simulating for each state a-

    transition to q0. Specifically, is computed as follows:

    foreachp Q do

    foreach a do

    {(p, a) (q0, a)}

  • 8/11/2019 [Y2006]a Scalable Architecture for High-Throughput Regular-Expression Pattern Matching

    15/26

    15

    High-Throughput with Encoding

    3. From d and, the base FSM1is constructed by the algorithm inFigure 8. The results of that algorithm are shown in Figure 5for our running example.

    4. State minimization is performed on a high-throughput FSM bythe standard algorithm, except that states are initially split byincoming edge color (black, green, red), instead of by whetherthey are final or nonfinal states in a traditional FSM.

    5. Given aFSMk, a higher throughputFSM2kwith alphabetencoding is constructed by the stride-doubling algorithm.

  • 8/11/2019 [Y2006]a Scalable Architecture for High-Throughput Regular-Expression Pattern Matching

    16/26

    16

    The Symbol ECI Encoding Block

    Transforming a sequence of m symbols into an ECI,

    suitable for presentation to a transition table.

  • 8/11/2019 [Y2006]a Scalable Architecture for High-Throughput Regular-Expression Pattern Matching

    17/26

    17

    Transition Table Compression via Run-

    Length Coding

    Run-length codingis a simple technique that can reduce the storagerequirements for a sequence of symbols

    EX:

    String: aaaabbbcbbaaaRun-length coding: 4(a)3(b)1(c)2(b)3(a)

  • 8/11/2019 [Y2006]a Scalable Architecture for High-Throughput Regular-Expression Pattern Matching

    18/26

    18

    Supporting Variable-Length Columns

    in Memory

    The entire table will typically be too large to be fetched in

    one memory access.

    Transition Table Memory (TTM)

    xis the number of entriesper memory wordin the TTM.

    The compressed columns should be placed in physical memory as

    compactly as possible.

  • 8/11/2019 [Y2006]a Scalable Architecture for High-Throughput Regular-Expression Pattern Matching

    19/26

    19

    Supporting Variable-Length Columns

    in Memory

    Indirection table

    Once the Indirection Table entry is retrieved using the input symbol

    ECI, thepointeris used to read the first memory word from the

    TTM. An entire column is accessed by starting at address indexand

    reading w consecutive wordsfrom the TTM, where w is given by:

    Theindex and countvalues determine which entries in the first and

    last memory words participate in the column

  • 8/11/2019 [Y2006]a Scalable Architecture for High-Throughput Regular-Expression Pattern Matching

    20/26

    20

    Supporting Variable-Length Columns

    in Memory

  • 8/11/2019 [Y2006]a Scalable Architecture for High-Throughput Regular-Expression Pattern Matching

    21/26

    21

    Supporting Variable-Length Columns

    in Memory

    State Selection Precompute prefix sums of the compressed entries

    The old count value is translated into Word CountandTerminal Indexvalues.

    TheInitial Index is used to retrieve a set of mask bits

    from theInitial Index Mask ROM. These bits are used

    to mask off preceding entries in the memory word that

    are not part of the transition table column.

    Terminal index = (count + index - 1) mod x

  • 8/11/2019 [Y2006]a Scalable Architecture for High-Throughput Regular-Expression Pattern Matching

    22/26

    22

    Supporting Variable-Length Columns

    in Memory

    The run-length sums for entries that are not part

    of the transition table column are forced to the

    value of theMax Run-Length Register.

    Forcing the run-length sums of trailing entries to be

    the maximum run-length sum value simplifies the

    Priority Encoder that generates the select bits for themultiplexor that selects the next state.

  • 8/11/2019 [Y2006]a Scalable Architecture for High-Throughput Regular-Expression Pattern Matching

    23/26

    23

    State select block

  • 8/11/2019 [Y2006]a Scalable Architecture for High-Throughput Regular-Expression Pattern Matching

    24/26

    24

    Optimizations

    Improving the effectiveness of run-length coding

    State reordering (knapsack problem)

  • 8/11/2019 [Y2006]a Scalable Architecture for High-Throughput Regular-Expression Pattern Matching

    25/26

    25

    Optimizations

    Memory packing

    For example, the compressed column associated with ECI 1 inFigure 11 occupies 3 entries and could fit in a single word of theTTM. However, the column began at a location that caused it tospan 2 memory words instead of 1.

    This problem is a variant of the classical fractional knapsackproblem

    The basic idea is to find the longest run-lengthencodedcolumn and chose it first. We pack it into memory wordsguaranteeing that it achieves the best possible packing. Wethen take the number of remaining entries in the last columnand apply subset sum on it with the remaining run-length

    encoded columns. (greedy)

  • 8/11/2019 [Y2006]a Scalable Architecture for High-Throughput Regular-Expression Pattern Matching

    26/26

    26

    Performance

    Snort(open source)(IPS)(IDS)