Transformational Grammars and PROSITE Patterns. Roland Miezianko, CIS 595 - Bioinformatics, Prof. Vucetic


Page 1: Transformational Grammars and PROSITE Patterns

Transformational Grammars and PROSITE Patterns

Roland Miezianko

CIS 595 - Bioinformatics

Prof. Vucetic

Page 2: Transformational Grammars and PROSITE Patterns

Agenda

• Transformational Grammars
  – Definition
  – The Chomsky Hierarchy

• Finite State Automata
  – FMR-1 Triplet Repeat Region
  – Regular Grammar Example

• PROSITE
  – Patterns in Regular Grammar Form

Page 3: Transformational Grammars and PROSITE Patterns

Assumptions

• So far, biological sequences have been treated as one-dimensional strings of independent and uncorrelated symbols.

• Need to address interactions among base pairs to understand secondary structures.

Page 4: Transformational Grammars and PROSITE Patterns

Secondary Structures

• The 3-D folding of proteins and nucleic acids involves extensive physical interactions between residues that are not adjacent in primary sequence. [1]

• Require a model for secondary structure that reflects the interactions among base pairs.

Page 5: Transformational Grammars and PROSITE Patterns

Modeling Strings

• General theories for modeling strings of symbols have been developed by computational linguists
  – Chomsky in 1956, 1959
  – Interested in how a brain or computer program could algorithmically determine whether a sentence was grammatical or not

Page 6: Transformational Grammars and PROSITE Patterns

Transformational Grammars

• Transformational Grammars consist of:
  – Symbols
    • Abstract Nonterminal Symbols
    • Terminal Symbols
  – Rewriting Rules (Productions)
    • A --> B

Page 7: Transformational Grammars and PROSITE Patterns

Transformational Grammars, Example

Example Grammar
  Two-letter terminal alphabet: {a, b}
  Single nonterminal letter: S
  Three Productions:
    S -> aS
    S -> bS
    S -> e (e = special blank terminal symbol)

Example derivation with our simple grammar:
  S -> aS -> abS -> abbS -> abb
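The derivation above can be replayed mechanically. The following is a minimal Python sketch, not part of the original slides; the function name `derive` is hypothetical. It applies S -> aS or S -> bS once per letter of the target string and finishes with S -> e.

```python
def derive(target):
    """Return the sentential forms produced while deriving `target`
    with the grammar S -> aS | bS | e (e = blank symbol)."""
    if set(target) - {"a", "b"}:
        raise ValueError("target must use only the terminals a and b")
    forms = ["S"]
    prefix = ""
    for ch in target:
        prefix += ch                # apply S -> aS or S -> bS
        forms.append(prefix + "S")
    forms.append(prefix)            # apply S -> e to erase the final S
    return forms


print(" -> ".join(derive("abb")))   # S -> aS -> abS -> abbS -> abb
```

Because every string over {a, b} has exactly one such derivation, this grammar generates all strings of a's and b's, including the empty string.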

Page 8: Transformational Grammars and PROSITE Patterns

Chomsky Hierarchy

• Four types of restrictions on a grammar's productions result in four classes of grammars.
  – Regular Grammars
  – Context-Free Grammars
  – Context-Sensitive Grammars
  – Unrestricted Grammars
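For orientation, the restrictions are commonly written as allowed production forms. The notation below is an assumption of this sketch and does not appear on the slide: W is a nonterminal, a is a terminal, and the Greek letters are arbitrary strings of terminals and nonterminals (with beta non-empty in the context-sensitive case).

```latex
\begin{align*}
\text{Regular:}           &\quad W \to aW \quad\text{or}\quad W \to a \\
\text{Context-free:}      &\quad W \to \beta \\
\text{Context-sensitive:} &\quad \alpha_1 W \alpha_2 \to \alpha_1 \beta \alpha_2 \\
\text{Unrestricted:}      &\quad \alpha_1 W \alpha_2 \to \gamma
\end{align*}
```

Each class contains the one above it, so every regular grammar is also context-free, and so on.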

Page 9: Transformational Grammars and PROSITE Patterns

Chomsky Hierarchy

(Figure: the four grammar classes, from most to least restricted: regular, context-free, context-sensitive, unrestricted.)

Page 10: Transformational Grammars and PROSITE Patterns

Automata

• Each grammar has a corresponding abstract computational device called an automaton

Grammar              Parsing Automaton
Regular              Finite State
Context-Free         Push-Down
Context-Sensitive    Linear Bounded
Unrestricted         Turing Machine

Page 11: Transformational Grammars and PROSITE Patterns

FMR-1 Triplet Repeat Region

• The FMR-1 gene sequence contains the triplet CGG, which is repeated a number of times

• The number of triplets is highly variable between individuals

• Increased copy number is associated with a genetic disease (fragile X syndrome)

Page 12: Transformational Grammars and PROSITE Patterns

FMR-1 Triplet Repeat Region

• The FSA will match any string from the “language” that contains the strings:

GCG CTG
GCG CGG CTG
GCG CGG CGG CTG
GCG CGG CGG CGG CGG … CTG
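A finite state automaton for this language can be sketched as a transition table. The Python below is an illustrative assumption, not the automaton drawn on the slide: the state numbering and the names `TRANSITIONS`, `ACCEPTING`, and `accepts` are hypothetical, and the machine is built only to accept the strings listed above, that is, GCG followed by zero or more CGG triplets and then CTG.

```python
# (state, symbol) -> next state; any missing entry means "reject".
TRANSITIONS = {
    (0, "G"): 1, (1, "C"): 2, (2, "G"): 3,   # leading GCG
    (3, "C"): 4,                             # start of CGG or CTG
    (4, "G"): 5, (5, "G"): 3,                # CGG repeat, loop back to state 3
    (4, "T"): 6, (6, "G"): 7,                # closing CTG
}
ACCEPTING = {7}


def accepts(seq):
    """Return True if `seq` (spaces ignored) is in the language GCG (CGG)* CTG."""
    state = 0
    for symbol in seq.replace(" ", "").upper():
        state = TRANSITIONS.get((state, symbol))
        if state is None:        # no transition defined: reject
            return False
    return state in ACCEPTING


for s in ["GCG CTG", "GCG CGG CGG CTG", "GCG CGG", "GCG CGA CTG"]:
    print(s, accepts(s))
```

The loop from state 5 back to state 3 is what lets a machine with a fixed number of states accept an unbounded number of CGG copies.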

Page 13: Transformational Grammars and PROSITE Patterns

FMR-1 Triplet Repeat Region

Page 14: Transformational Grammars and PROSITE Patterns

FMR-1 Triplet Repeat Region

The regular grammar for our finite state automaton finds any number of copies of CGG
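Since the grammar figure is not reproduced in this transcript, the following Python sketch uses a hypothetical right-linear grammar for GCG (CGG)* CTG. The nonterminal names W1..W6 and the function `generate` are assumptions. Each production emits one terminal and moves to the next nonterminal, mirroring one transition of a finite state automaton.

```python
import re
from collections import deque

# nonterminal -> list of (terminal, next nonterminal or None to stop)
GRAMMAR = {
    "S":  [("G", "W1")],
    "W1": [("C", "W2")],
    "W2": [("G", "W3")],
    "W3": [("C", "W4")],
    "W4": [("G", "W5"), ("T", "W6")],   # continue a CGG repeat, or start CTG
    "W5": [("G", "W3")],                # finish CGG, loop back
    "W6": [("G", None)],                # finish CTG and terminate
}


def generate(max_len):
    """Enumerate every string of at most `max_len` symbols derivable from S."""
    results = []
    queue = deque([("", "S")])
    while queue:
        prefix, nonterminal = queue.popleft()
        for terminal, nxt in GRAMMAR[nonterminal]:
            s = prefix + terminal
            if len(s) > max_len:
                continue
            if nxt is None:
                results.append(s)       # derivation complete
            else:
                queue.append((s, nxt))
    return results


strings = generate(12)
print(strings)
# Every generated string also matches the equivalent regular expression.
print(all(re.fullmatch(r"GCG(CGG)*CTG", s) for s in strings))
```

The cycle W3 -> W4 -> W5 -> W3 plays the role of the repeat: each pass through it adds one more CGG.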

Page 15: Transformational Grammars and PROSITE Patterns

PROSITE Patterns

• The PROSITE database is an example of a biological application of regular grammars
  – Unlike methods which assign scores to alignments, PROSITE patterns either match a sequence or do not.

Page 16: Transformational Grammars and PROSITE Patterns

PROSITE Patterns

• A pattern consists of a string of pattern elements separated by dashes and terminated by a period
  – Single letter – that specific residue
  – [ ] – any one of the enclosed letters
  – { } – any letter except the enclosed letters
  – x – any residue can occur
  – x(y) – any residue, repeated y times

Page 17: Transformational Grammars and PROSITE Patterns

PROSITE Patterns

[RK]-G-{EDRKHPCG}-[AGSCI]-[FY]-[LIVA]-x-[FYM].

RNP-1 Motif
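Because PROSITE patterns are regular, they translate directly into regular expressions. The Python below is a minimal sketch under stated assumptions: the function name `prosite_to_regex` is hypothetical, only the elements listed on the previous slide are handled (single letters, [ ], { }, x, and a numeric repeat in parentheses), and the test peptides are made up for illustration rather than taken from real proteins.

```python
import re


def prosite_to_regex(pattern):
    """Translate a PROSITE pattern (subset of the syntax) into a regex string."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        # Split an optional repeat count off the element, e.g. "x(3)" -> "x", "3".
        core, count = re.fullmatch(r"(.+?)(?:\((\d+)\))?", element).groups()
        if core.lower() == "x":
            regex = "."                        # any residue
        elif core.startswith("{") and core.endswith("}"):
            regex = "[^" + core[1:-1] + "]"    # anything but the enclosed letters
        else:
            regex = core                       # single letter or [ ] class as-is
        if count:
            regex += "{" + count + "}"
        parts.append(regex)
    return "".join(parts)


RNP1 = "[RK]-G-{EDRKHPCG}-[AGSCI]-[FY]-[LIVA]-x-[FYM]."
regex = prosite_to_regex(RNP1)
print(regex)                                   # [RK]G[^EDRKHPCG][AGSCI][FY][LIVA].[FYM]
print(bool(re.search(regex, "MMRGAAFLAFMM")))  # made-up peptide that matches
print(bool(re.search(regex, "RGEAFLAF")))      # E is excluded at position 3
```

As with the pattern itself, the result is a yes/no match with no score attached.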

Page 18: Transformational Grammars and PROSITE Patterns

Conclusion

• Transformational grammars are useful in developing acceptors for sequences of different lengths and for matching specific multi-sequence regions.

• Higher-order grammars in the Chomsky hierarchy are more difficult to program and apply

Page 19: Transformational Grammars and PROSITE Patterns

References

[1] Durbin, R. Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids. Cambridge University Press, 1998.

[2] Gibson, G. A Primer of Genome Science. Sinauer Associates, Inc. Publishers, 2002.

[3] Mount, D. Bioinformatics: Sequence and Genome Analysis. Cold Spring Harbor Laboratory Press, 2001.

[4] PROSITE Database http://us.expasy.org/prosite/

Page 20: Transformational Grammars and PROSITE Patterns

Transformational Grammars and PROSITE Patterns

Questions and Answers