SWAMP+: Enhanced Smith-Waterman Search for Parallel Models

Shannon Steinfadt, Ph.D.
Los Alamos National Laboratory
[email protected]

Operated by Los Alamos National Security, LLC for the U.S. Department of Energy’s NNSA

U N C L A S S I F I E D

Slide 2

Operated by Los Alamos National Security, LLC for the U.S. Department of Energy’s NNSA - LA-UR-12-20189

U N C L A S S I F I E D

Outline

Motivation for Sequence Alignment

Smith-Waterman Local Sequence Alignment

SWAMP

ASC
• SWAMP using ASC Emulator

SWAMP+

SWAMP and SWAMP+ on Metal
• ClearSpeed
• Convey Computer

Contributions

Future Work

Questions?

[Figure: example alignment of two DNA sequences]

Slide 3

Motivation: Sequence Alignment

Given two sequences:

Align them to find the longest, most common subsequence

DNA nucleotides {A, G, T, C}
Proteins {A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, W, Y, V}

Query:IHACYSRQPELAAKLMKDVIAEPYRERLLPGFRQARQAVAEIGAVASGISGSGPTLFALCDKPETAQRVA

Subject:MFCVQCEQTIRTPAGNGCSYAQGMCGKTAETSDLQDLLIAALQGLSAWAVKAREYGIINHDVDSFAPRAFFSTLTNVNFDSPRIVGYAREAIALREALKAQCLAVDANARVDNPMADLQLVSDDLGELQRQAAEFTPNKDKAAIGENILGLRLLCLYGLKGAAAYMEHAHVLGQYDNDIYAQYHKIMAWLGTWPADMNALLECSMEIGQMNFKVMSILDAGETGKYGHPTPTQVNVKATAGKCILISGHDLKDLYNLLEQTEGTGVNVYTHGEMLPAHGYPELRKFKHLVGNYGSGWQNQQVEFARFPGPIVMTSNCIIDPTVGAYDDRIWTRSIVGWPGVRHLDGDDFSAVITQAQQMAGFPYSEIPHLITVGFGRQTLLGAADTLIDLVSREKLRHIFLLGGCDGARGERHYFTDFATSVPDDCLILTLACGKYRFNKLEFGDIEGLPRLVDAGQCNDAYSAIILAVTLAEKLGCGVNDLPLSLVLSWFEQKAIVILLTLLSLGVKNIVTGPTAPGFLTPDLLAVLNEKFGLRSITTVEEDMKQLLSA

Slide 4

Motivation: Sequence Alignment

One of the most common fundamental tasks is local sequence alignment

Query:   VIA-EPYRE-RLLPGFRQARQAVAEIGAVASGISGSGPTLFALCDK
Subject: LVSREKLRHIFLLGGCDGARGERHYFTDFATSVPDDCLILTLACGK

Slide 5

Pairwise Local Sequence Alignment

Homologous Sequences

Similar Characters
Similar Structure
Similar Function

(derived by humans)
(preserved by evolution)

Ancestral Relationships
Gene Functionality
Aid in Drug Discovery
Assembly of Raw Data

Slide 6

Aligning using Smith-Waterman Algorithm

Compare all possible combinations of sequence characters against each other

Cost Key
Match +10
Miss -3
Insert a Gap -3
Extend a Gap -1

Slide 11

Aligning using Smith-Waterman Algorithm

Compare all possible combinations - but it has dynamic programming data dependencies

Cost Key
Match +10
Miss -3
Insert a Gap -3
Extend a Gap -1

Slide 18

Smith-Waterman Recursive Matrix Equations

$$D_{i,j} = \max \begin{Bmatrix} C_{i-1,j} - \sigma \\ D_{i-1,j} \end{Bmatrix} - g$$

$$I_{i,j} = \max \begin{Bmatrix} C_{i,j-1} - \sigma \\ I_{i,j-1} \end{Bmatrix} - g$$

$$d(S1_i, S2_j) = \begin{cases} \text{match\_cost} & \text{if } S1_i = S2_j \\ \text{miss\_cost} & \text{if } S1_i \neq S2_j \end{cases}$$

$$C_{i,j} = \max \begin{Bmatrix} D_{i,j} \\ I_{i,j} \\ C_{i-1,j-1} + d(S1_i, S2_j) \\ 0 \end{Bmatrix}$$

g: gap extension cost

σ: gap opening cost
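A minimal Python sketch of the D, I and C recurrences on this slide, filled in row by row on a single processor. The function name `sw_affine_score`, the defaults σ = 3 and g = 1 (read off the cost key), and the zero initialization of D and I are assumptions for illustration, not taken from the SWAMP code. Under this form of the recurrence, the first character of a gap costs σ + g.

```python
def sw_affine_score(s1, s2, match=10, miss=-3, sigma=3, g=1):
    """Return (best score, (i, j) of the best cell) for a local alignment.

    D, I and C follow the slide's recurrences; sigma is the gap opening
    cost and g the gap extension cost (default values are assumptions).
    """
    m, n = len(s1), len(s2)
    # Row 0 and column 0 are boundary cells and stay at zero.
    C = [[0] * (n + 1) for _ in range(m + 1)]
    D = [[0] * (n + 1) for _ in range(m + 1)]
    I = [[0] * (n + 1) for _ in range(m + 1)]
    best, best_pos = 0, (0, 0)
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            # Gap ending here, coming from the cell above (deletion).
            D[i][j] = max(C[i - 1][j] - sigma, D[i - 1][j]) - g
            # Gap ending here, coming from the cell to the left (insertion).
            I[i][j] = max(C[i][j - 1] - sigma, I[i][j - 1]) - g
            # d(S1_i, S2_j): match or mismatch score.
            d = match if s1[i - 1] == s2[j - 1] else miss
            C[i][j] = max(D[i][j], I[i][j], C[i - 1][j - 1] + d, 0)
            if C[i][j] > best:
                best, best_pos = C[i][j], (i, j)
    return best, best_pos
```

For example, `sw_affine_score("AAACCC", "AAATCCC")` aligns AAA and CCC around a one-character gap: six matches (+60) minus σ + g for the gap.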

Slide 19

Traceback in the Smith-Waterman Algorithm

1) Find the maximum computed value

Cost Key
Match +10
Miss -3
Insert a Gap -3
Extend a Gap -1

Slide 20

Traceback in the Smith-Waterman Algorithm

1) Find the maximum computed value
2) Traceback until you reach ‘0’s

Alignment:
CATTG
C--TG

Cost Key
Match +10
Miss -3
Insert a Gap -3
Extend a Gap -1
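The two traceback steps can be sketched in Python as follows. To keep the example short it uses a single linear gap penalty of -3 rather than the separate insert and extend costs in the cost key; that simplification, and the function name `sw_traceback`, are assumptions made for illustration.

```python
def sw_traceback(s1, s2, match=10, miss=-3, gap=-3):
    """Local alignment with a linear gap penalty (a simplification).

    Returns (score, aligned_s1, aligned_s2), with '-' marking gaps.
    """
    m, n = len(s1), len(s2)
    C = [[0] * (n + 1) for _ in range(m + 1)]
    best, bi, bj = 0, 0, 0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d = match if s1[i - 1] == s2[j - 1] else miss
            C[i][j] = max(0, C[i - 1][j - 1] + d,
                          C[i - 1][j] + gap, C[i][j - 1] + gap)
            # Step 1: remember the maximum computed value.
            if C[i][j] > best:
                best, bi, bj = C[i][j], i, j
    # Step 2: walk back from the maximum until a zero is reached.
    a1, a2 = [], []
    i, j = bi, bj
    while i > 0 and j > 0 and C[i][j] > 0:
        d = match if s1[i - 1] == s2[j - 1] else miss
        if C[i][j] == C[i - 1][j - 1] + d:      # diagonal: match or miss
            a1.append(s1[i - 1]); a2.append(s2[j - 1])
            i, j = i - 1, j - 1
        elif C[i][j] == C[i - 1][j] + gap:      # up: gap in s2
            a1.append(s1[i - 1]); a2.append("-")
            i -= 1
        else:                                   # left: gap in s1
            a1.append("-"); a2.append(s2[j - 1])
            j -= 1
    return best, "".join(reversed(a1)), "".join(reversed(a2))
```

The alignment strings are built back to front during the walk and reversed at the end.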

Slide 21

Smith-Waterman Vectorization Approaches

Parallel Processing
• Allows high-quality results in less time using the Smith-Waterman algorithm

Rognes described four basic approaches:
• Vectors along the anti-diagonal (a wavefront approach), described by Wozniak
• Vectors along the query (a single column split downward), described by Rognes and Seeberg
• A striped approach, introduced by Farrar
• Multi-sequence vectors, described by Alpern et al. and again by Rognes

Slide 22

Parallelizing the Smith-Waterman Algorithm

Sequential matrix of computed values

Slide 23

Parallelizing the Smith-Waterman Algorithm

Tilted data arrangement to parallelize and process a diagonal at a time.
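A small Python sketch of the order in which a diagonal-at-a-time scheme visits the matrix. Cells on the same anti-diagonal share the value i + j and depend only on cells from the two previous anti-diagonals, so they can be computed together. The helper name `anti_diagonals` and the 1-based cell indexing are assumptions for illustration.

```python
def anti_diagonals(m, n):
    """Yield the cells (i, j), 1 <= i <= m and 1 <= j <= n, grouped by i + j.

    Each yielded list is one wavefront: cell (i, j) needs (i-1, j), (i, j-1)
    and (i-1, j-1), all of which lie on the previous two wavefronts.
    """
    for a_d in range(2, m + n + 1):
        first_row = max(1, a_d - n)
        last_row = min(m, a_d - 1)
        yield [(i, a_d - i) for i in range(first_row, last_row + 1)]


if __name__ == "__main__":
    for step, cells in enumerate(anti_diagonals(2, 3), start=1):
        print(step, cells)
```

An m x n matrix has m + n - 1 anti-diagonals, which is where a running time proportional to m + n comes from when each wavefront is one parallel step.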

Slide 24

Parallelizing the Algorithm: “Tilting” the Matrix

Slide 26

Parallelizing the Algorithm: “Tilting” the Matrix

Smith-Waterman using Associative Massive Parallelism (SWAMP)

Slide 27

SWAMP (Smith-Waterman using Associative Massive Parallelism)

[Figure: used PEs, unused PEs, and order of computations]

Slide 28

ASC: Associative Architecture

SIMD with special associative features
Fine-grained parallelism

Designed for fast associative searches
Content-based searches, not memory address

Slide 29

ASC Advantages

Quick data movement in SIMD
• Move raw data in parallel
• At each step, PEs follow the algorithmic steps for data movement in lock step

No message passing like MPI/PVM
• No store/forward
• No headers
• No explicit synchronizing

Very fast operations for
• Finding Maximum / Minimum
• Finding if there are “Any Responders”
• “Pick One” active PE

Slide 30

ASC - SWAMP Algorithm

1) Read in S1 and S2
In Active PEs (those with data values for S1 or S2):
2) Initialize the two-dimensional variables D[$], I[$], C[$] to zeros.
3) “Shift” or slide string S2 to create a tilted matrix
4) For every anti_diagonal (a_d) from 2 to m+n-1 do in parallel {
5)   If S2[$,a_d] is valid (S2[$,a_d] ≠ “@” and S2[$,a_d] ≠ “-”) then {
6.1)     Calculate score for a deletion for D[$,a_d]
6.2)     Calculate score for an insertion for I[$,a_d]
6.3)     Calculate matrix score for C[$,a_d] }
7)   local_maxPE = MAXDEX(C[$,a_d])
8)   if C[local_maxPE, a_d] > max_Val then {
9.1)     max_PE = local_maxPE
9.2)     max_Val = C[local_maxPE, a_d] } }
10) Return max_Val, max_PE
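The steps above can be emulated on a single processor to check the logic. In this Python sketch each row index plays the role of one PE, the inner loop stands in for one lock-step parallel step, and Python's `max` stands in for MAXDEX. The name `swamp_emulated`, the cost defaults, and the indexing of anti-diagonals by a_d = i + j (running from 2 to m + n, which differs from the slide's loop bounds) are assumptions for illustration; this is not ASC code.

```python
def swamp_emulated(s1, s2, match=10, miss=-3, sigma=3, g=1):
    """Emulate the anti-diagonal SWAMP loop; returns (max_val, max_pe, column)."""
    m, n = len(s1), len(s2)
    # Step 2: D, I and C start at zero.
    C = [[0] * (n + 1) for _ in range(m + 1)]
    D = [[0] * (n + 1) for _ in range(m + 1)]
    I = [[0] * (n + 1) for _ in range(m + 1)]
    max_val, max_pe, max_col = 0, 0, 0
    # Step 4: one iteration per anti-diagonal.
    for a_d in range(2, m + n + 1):
        updates = []
        # Step 5: only PEs whose cell (pe, a_d - pe) is inside the matrix.
        for pe in range(max(1, a_d - n), min(m, a_d - 1) + 1):
            j = a_d - pe
            dv = max(C[pe - 1][j] - sigma, D[pe - 1][j]) - g      # step 6.1
            iv = max(C[pe][j - 1] - sigma, I[pe][j - 1]) - g      # step 6.2
            d = match if s1[pe - 1] == s2[j - 1] else miss
            cv = max(dv, iv, C[pe - 1][j - 1] + d, 0)             # step 6.3
            updates.append((pe, j, dv, iv, cv))
        if not updates:
            continue
        # Write back after all PEs have read, as in a lock-step SIMD step.
        for pe, j, dv, iv, cv in updates:
            D[pe][j], I[pe][j], C[pe][j] = dv, iv, cv
        # Steps 7 to 9: emulate MAXDEX and keep the running maximum.
        local_pe, local_j, _, _, local_val = max(updates, key=lambda u: u[4])
        if local_val > max_val:
            max_val, max_pe, max_col = local_val, local_pe, local_j
    return max_val, max_pe, max_col
```

Collecting the updates before writing them back makes it explicit that no PE reads a value produced in the same step.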

Slide 32

SWAMP on ASC Analysis

Computation takes O(m+n) time with m+1 PEs

Sequential Smith-Waterman (Gotoh)
• O(m*n) time, m*n space
• When |S1| = |S2|, it becomes an O(n^2) algorithm

SWAMP parallel algorithm
• If actual number of PEs < m+1, assign {(m+1) / # PEs} work to each PE
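A rough Python estimate of how the work divides when fewer than m+1 PEs are available. Rounding the share up with a ceiling, and multiplying the number of anti-diagonals by that share to get a step count, are assumptions made for this sketch rather than figures from the analysis; the function name `swamp_step_estimate` is hypothetical.

```python
import math


def swamp_step_estimate(m, n, num_pes):
    """Return (rows handled per PE, estimated number of sequential steps).

    With at least m + 1 PEs each PE handles one row and the estimate is
    the number of anti-diagonals, m + n - 1. With fewer PEs each physical
    PE is assumed to process its rows one after another in every step.
    """
    rows_per_pe = math.ceil((m + 1) / num_pes)
    diagonals = m + n - 1
    return rows_per_pe, diagonals * rows_per_pe


if __name__ == "__main__":
    print(swamp_step_estimate(96, 96, 97))  # enough PEs for every row
    print(swamp_step_estimate(96, 96, 48))  # fewer PEs than rows
```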

Slide 33

SWAMP on ASC - Performance

• Based on actual measurements using ASC language and emulator.
• Predictions shown with the dashed line.

Predictions calculated using linear regression and the least squares method.

Slide 34

SWAMP+

SWAMP+ returns multiple non-overlapping sequences
• Search and process with SWAMP multiple times
• Return top k non-overlapping, non-intersecting sequences
• Reveal additional information
— Spatial information
— Length of comparisons
— Identify regulatory regions and motifs
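The repeated search can be sketched in Python as: align, trace back, mark the aligned region as used, and search again until k alignments are found or nothing scores above zero. This sketch masks the aligned span in both sequences and uses a linear gap penalty; both choices, and the names `local_align` and `top_k_alignments`, are assumptions for illustration and not the SWAMP+ implementation.

```python
def local_align(s1, s2, used1, used2, match=10, miss=-3, gap=-3):
    """Best local alignment that avoids characters already marked as used.

    Returns (score, (start_i, start_j), (end_i, end_j)) as slice bounds.
    """
    m, n = len(s1), len(s2)
    C = [[0] * (n + 1) for _ in range(m + 1)]
    best, bi, bj = 0, 0, 0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if used1[i - 1] or used2[j - 1]:
                continue  # masked cells stay at zero and act as barriers
            d = match if s1[i - 1] == s2[j - 1] else miss
            C[i][j] = max(0, C[i - 1][j - 1] + d,
                          C[i - 1][j] + gap, C[i][j - 1] + gap)
            if C[i][j] > best:
                best, bi, bj = C[i][j], i, j
    i, j = bi, bj
    while i > 0 and j > 0 and C[i][j] > 0:      # traceback until a zero
        d = match if s1[i - 1] == s2[j - 1] else miss
        if C[i][j] == C[i - 1][j - 1] + d:
            i, j = i - 1, j - 1
        elif C[i][j] == C[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    return best, (i, j), (bi, bj)


def top_k_alignments(s1, s2, k):
    """Return up to k non-overlapping (score, s1 part, s2 part) alignments."""
    used1, used2 = [False] * len(s1), [False] * len(s2)
    results = []
    for _ in range(k):
        score, (i0, j0), (i1, j1) = local_align(s1, s2, used1, used2)
        if score == 0:
            break  # nothing left to align
        results.append((score, s1[i0:i1], s2[j0:j1]))
        for i in range(i0, i1):                 # mark aligned span of s1
            used1[i] = True
        for j in range(j0, j1):                 # mark aligned span of s2
            used2[j] = True
    return results
```

Masking only one of the two sequences instead of both would give a different variation of the search; the sketch does not attempt to reproduce those.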

Slide 35

SWAMP+ - 3 variations

Slide 36

Moving SWAMP+ to Hardware

Have These

Want This

Used This

Associative SIMD Model - ASC

ctcgccgcgc ggcggacgct ccacgtgtcc cccgtctacc

gggccctcct ggctcccaac agcttctcag ttcccacttc

Slide 37

Hardware Analysis: ClearSpeed

Have These

Want This

Use This

ClearSpeed Advance 620 PCI-X board
50 GFLOPS peak performance
25W average power dissipation

ctcgccgcgc ggcggacgct ccacgtgtcc cccgtctacc

gggccctcct ggctcccaac agcttctcag ttcccacttc

Slide 38

SWAMP+ on ClearSpeed

Implemented SWAMP and SWAMP+ on the ClearSpeed board
• Used the software equivalent of
— Maximum
— Any Responders
— Pick One

Allows accurate, deterministic timings for algorithms

Slide 39

ClearSpeed – SWAMP+ Algorithm

1) Read in S1 and S2
In Active PEs (those with data values for S1 or S2):
2) Initialize Row 0, Col 0 variables D[$], I[$], C[$] to zeros.
3) For each PE, shift S2 down 1, copy entire string
4) For every a_d from 2 to m+n-1 do in parallel {
5)   If S2[$,a_d] ≠ “@” and S2[$,a_d] ≠ “-” then {
6.1)     Calculate score for deletion and insertion for D[$,a_d]; Calculate matrix score for C[$,a_d] }
7)   local_maxPE = getPE(max_int(C[$,a_d]))
8)   if C[local_maxPE] > max_Val then {
9.1)     max_PE = local_maxPE
9.2)     max_Val = C[local_maxPE] } }
10) Return max_Val, max_PE
11) Perform traceback
12) Mark aligned values
13) Run alignment for all k values (2-12)

Slide 40

ClearSpeed SWAMP+ CUPS Performance

Slide 41

ClearSpeed Calculation Times

[Figure: "Average Calculation with Eight Highest Outliers Removed". Y axis: cycle counts (0 to 100000). X axis: sequence lengths (10, 20, 30, 40, 50, 60, 70, 80, 90, 96). Series: 1 Alignment; 2 Alignments (1st, 2nd); 3 Alignments (1st, 2nd, 3rd); 4 Alignments (1st through 4th); 5 Alignments (1st through 5th).]

Slide 42

ClearSpeed Traceback Timings

[Figure: "Average Traceback Cycle Counts for all Alignments". Y axis: cycle counts (0 to 400000). X axis: sequence lengths (10, 20, 30, 40, 50, 60, 70, 80, 90, 96). Series: 1 Alignment; 2 Alignments (1st, 2nd); 3 Alignments (1st, 2nd, 3rd); 4 Alignments (1st through 4th); 5 Alignments (1st through 5th). Annotations: "First (Longest) Traceback Cycle Counts" and "Shorter (2 through k) or Second to Fifth Alignments in this instance".]

Slide 43

ClearSpeed Computation Time Comparison

Slide 44

SWAMP+ on ClearSpeed Analysis

Computation was O(m+n) using m+1 PEs

Showed similar performance for theoretical speedup as the ASC code

When m = n, the running time is O(n) with a coefficient of 2

Slide 45

Hardware Analysis: Convey

ctcgccgcgc ggcggacgct ccacgtgtcc cccgtctacc

gggccctcct ggctcccaac agcttctcag ttcccacttc

Have These

Want This

Use This

Convey Computer HC-1
FPGA + x86 Hybrid
Consists of a personality & application
768 GCUPS peak from single machine

Slide 46

SWAMP+ on Convey Computer System

FPGA + x86 Hybrid system
• A daughter FPGA board tightly coupled to x86 hardware

The Smith-Waterman alignment application has multiple components
• Product consists of a personality (SW02) and an aligner application (cnysws)
• Implements multiple systolic arrays on the AEs to perform parallel Smith-Waterman searches

Slide 47

Convey Smith-Waterman Personality‡

4 AEs each contain 4 tiles
• Query loaded into each tile
• May join 2 or 4 tiles
• Up to 1280 query length
• Continuously process reference DB
• Efficient for searching a database with many short strings

‡Slide courtesy of Convey Computer

Slide 48

Convey Computer Smith-Waterman Performance‡

‡Graph courtesy of Convey Computer

Slide 49

Convey Computer
SWAMP+ on Non-Associative Parallel System

1) Run parallel Convey cnysws application with traceback for two files containing sequences (S1^) and database queries (S2^)
2) Capture alignment information for all query and database matches
For each query:
  For each database match:
    3) Copy query and sequence into altered{Query,Database}_n_n.faa
    4) Mark aligned bases for each query-database match
While # of iterations < k-1:
  For each pair of files created in instruction 3:
    5) Run cnysws
    6) Repeat Step 2
    7) If score for hit * δ < current score:
      7.1) Track match
      7.2) Mark aligned bases as matched for query-database match
8) Output the k sub-alignments

Slide 50

Convey Computer SWAMP+ Run Times

[Figure: "Average Computation Times". Y axis: time (seconds), 0 to 16. X axis: sequence length. Series: Avg. Program Time Reported; Avg. System Seconds Recorded Externally.]

*Other files are single sequence to sequence comparison. AA consists of 2 amino acid (AA) queries

Slide 51

SWAMP+ on Convey Analysis

Smith-Waterman execution is efficient
• x86 version uses a combination of the most efficient Smith-Waterman alignment
— Makes use of Farrar extensions with SSE as well as North neighbor approximations
• Will use the x86 for larger comparisons and traceback
• Computation was O(m+n) using m+1 PEs

Handles largest datasets of the implementations

Uses the BLOSUM 25 and BLOSUM 50 substitution lookup tables for evolutionary models of similarity

Maximum GCUPS values for hardware
• HC-2: 1920 cells, 259 GCUPS
• HC-2ex: 5120 cells, 768 GCUPS
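GCUPS counts billions of dynamic programming cell updates per second. The Python sketch below shows that arithmetic and divides the quoted peak figures by the quoted cell counts. The function names `gcups` and `mcups_per_cell` are hypothetical, and reading the per-cell figure as a clock rate would assume one matrix update per cell per clock cycle, which the slide does not state.

```python
def gcups(m, n, seconds):
    """Giga cell updates per second for one m x n alignment matrix."""
    return (m * n) / seconds / 1e9


def mcups_per_cell(peak_gcups, cells):
    """Peak rate per hardware cell, in millions of updates per second."""
    return peak_gcups * 1000.0 / cells


if __name__ == "__main__":
    # Peak figures and cell counts as quoted on the slide.
    for name, peak, cells in [("HC-2", 259, 1920), ("HC-2ex", 768, 5120)]:
        print(name, round(mcups_per_cell(peak, cells), 1), "MCUPS per cell")
```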

Slide 52

Contributions

Created a new algorithm for the ASC platform
• Implemented, tested SWAMP in the ASC language and emulator
• Showed promising results and good scaling

Extended SWAMP to find more information between the sequences
• Created the SWAMP+ suite of algorithms
— Single-to-multiple
— Multiple-to-single
— Multiple-to-multiple

Analyzed different hardware for best fit for ASC algorithms

Implemented the SWAMP and SWAMP+ algorithms on ClearSpeed Co-Processor

Designed and implemented SWAMP+ adaptation on Convey Hybrid System

Slide 53

Future Work

Convey Computer
• Extend and update output to be more comprehensive
• Quantitative / Qualitative comparison of BLAST results against SWAMP+ results
• Timings and work analysis

Running against MPI implementations
• Communication and system overhead impacts on cluster vs. more tightly coupled system in addition to compute time

Slide 54

References

O. Gotoh, "An improved algorithm for matching biological sequences," Journal of Molecular Biology, vol. 162, pp. 705-708, Dec 15 1982.

S. F. Altschul, W. Gish, W. Miller, E. W. Myers, and D. J. Lipman, "Basic Local Alignment Search Tool," Journal of Molecular Biology, vol. 215, pp. 403-410, 1990.

T. Rognes, "Faster Smith-Waterman database searches with inter-sequence SIMD parallelisation," BMC Bioinformatics, vol. 12, 2011.

A. Wozniak, "Using video-oriented instructions to speed up sequence comparison," Computer Applications in the Biosciences (CABIOS), vol. 13, pp. 145-150, 1997.

M. Farrar, "Striped Smith-Waterman speeds database searches six times over other SIMD implementations," Bioinformatics (Oxford, England), vol. 23, pp. 156-161, Jan 15 2007.

T. Rognes and E. Seeberg, "Six-fold speed-up of Smith-Waterman sequence database searches using parallel processing on common microprocessors," Bioinformatics (Oxford, England), vol. 16, pp. 699-706, Aug 2000.

…Please see paper for full reference list

Slide 55

Q & A

Contact Info:

Dr. Shannon Steinfadt

[email protected]

http://www.SwampAlign.com