Protein Structure Prediction Using Neural Networks Martha Mercaldi Kasia Wilamowska Literature Review December 16, 2003


Protein Structure Prediction Using Neural Networks

Martha Mercaldi, Kasia Wilamowska

Literature Review, December 16, 2003


The Protein Folding Problem


Evolution of Neural Networks

• Neural networks were originally designed to approximate connections between neurons in the brain



Why use Neural Nets for Protein Folding?

• Successful applications in:
– Secondary structure prediction
– Solvent accessibility

• No “inherent shortcoming” yet found

• Can incorporate evolutionary information via multiple alignments

• Detect previous misclassifications


Protein Secondary Structure Prediction Based on Denoeux Belief Neural Network

• Purpose
– Using neural nets, effectively predict the secondary structure of proteins.

• Current best for secondary structure prediction is SSpro8, with accuracy in the range of 62-63%


• Input to the system
– Can choose to use DNA or amino acid sequences
– SSpro8 uses amino acid sequences
– The authors’ system, UTMPred, uses DNA

• Output: forms consisting of alpha helices, beta sheets and loops, expanded to eight structure forms

Protein Secondary Structure Forms
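
As a rough illustration of how eight structure forms relate to the three basic ones, the sketch below collapses an eight-letter alphabet into helix, sheet and loop. It assumes the eight forms follow the DSSP-style letters used by SSpro8 (H, G, I, E, B, T, S, C); the slides name only H, E and I explicitly, so the remaining letters and the grouping are assumptions rather than something taken from the paper.

```python
# Assumed eight-state alphabet (DSSP-style letters); only H, E and I
# are named on the slides, the other letters are an assumption.
EIGHT_TO_THREE = {
    "H": "helix",  # alpha helix
    "G": "helix",  # 3-10 helix
    "I": "helix",  # pi helix
    "E": "sheet",  # extended strand
    "B": "sheet",  # isolated beta bridge
    "T": "loop",   # turn
    "S": "loop",   # bend
    "C": "loop",   # everything else
}


def reduce_to_three_states(eight_state_labels):
    """Map a string of eight-state labels to a list of three-state names."""
    return [EIGHT_TO_THREE[label] for label in eight_state_labels]
```

For example, `reduce_to_three_states("HEC")` gives one three-state name per residue position.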

DBNN architecture diagram, legend:

• x – input to the network
• y – number of input nodes
• p – prototype, representing a number of the k nearest previously trained inputs to the current test input
• i – number of prototypes in the y-dimensional feature space
• s – the activation function
• M – number of output classes
• ujq – a weight which represents the degree of membership of prototype j in output class q
• mjq – the BBA (basic belief assignment) mass, the product of weight ujq and activation function sj
• hjq – the conjunctive combination of the BBAs
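
The quantities in the legend can be read as one forward pass through the network. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the activation is `s_j = alpha_j * exp(-gamma_j * d_j^2)` with `d_j` the distance from the input to prototype j, that each prototype leaves mass `1 - s_j` on the whole set of classes, and that the result is normalized at the end. The parameters `alpha` and `gamma` and the function name are hypothetical; they do not appear on the slides.

```python
import numpy as np


def dbnn_forward(x, prototypes, u, alpha, gamma):
    """Sketch of a belief-network forward pass (assumed form, see lead-in).

    x          : input vector, shape (y,)
    prototypes : prototype vectors p, shape (n_prototypes, y)
    u          : membership weights u_jq, shape (n_prototypes, M)
    alpha, gamma : assumed activation parameters, shape (n_prototypes,)
    Returns (class masses of shape (M,), mass left on the whole class set).
    """
    # Squared distance from the input to every prototype.
    d2 = np.sum((prototypes - x) ** 2, axis=1)
    # Activation s_j of each prototype (assumed exponential form).
    s = alpha * np.exp(-gamma * d2)
    # BBA masses m_jq = u_jq * s_j; the remainder stays on the whole set.
    m_class = s[:, None] * u
    m_all = 1.0 - s

    # Conjunctive combination of the BBAs, one prototype at a time.
    h_class = np.zeros(u.shape[1])
    h_all = 1.0
    for mq, mo in zip(m_class, m_all):
        h_class = h_class * mq + h_class * mo + h_all * mq
        h_all = h_all * mo

    # Normalize so that all masses sum to one.
    total = h_class.sum() + h_all
    return h_class / total, h_all / total
```

The predicted class would then be the one with the largest combined mass, for example `int(np.argmax(class_masses))`.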


• The DBNN inputs are DNA sequences converted to binary format prior to use

• The sequences are:
– 88 Escherichia coli proteins
– 25 yeast (Saccharomyces cerevisiae) proteins
– 166 mammalian proteins (80 of which are human)


• The input window size for UTMPred is set to 7 codons, which results in 84 input nodes and 8 output nodes, which represent the expanded structural forms.
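
The figure of 84 input nodes is consistent with 7 codons × 3 nucleotides × 4 bits. The sketch below shows one encoding that produces exactly that count, assuming each nucleotide is one-hot encoded over A, C, G, T. The slides say only "binary format", so the one-hot scheme, the base order and the function name are assumptions.

```python
import numpy as np

BASES = "ACGT"        # assumed base order for the one-hot code
WINDOW_CODONS = 7     # window size stated on the slide


def encode_window(dna_window):
    """One-hot encode a 7-codon DNA window into an 84-element binary vector."""
    expected = WINDOW_CODONS * 3
    if len(dna_window) != expected:
        raise ValueError(f"expected {expected} nucleotides, got {len(dna_window)}")
    bits = np.zeros((expected, len(BASES)), dtype=np.uint8)
    for position, base in enumerate(dna_window):
        # str.index raises ValueError for anything outside A, C, G, T.
        bits[position, BASES.index(base)] = 1
    return bits.ravel()
```

Each nucleotide contributes exactly one set bit, so a valid window always yields a vector of length 84 with 21 ones.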


• UTMPred used 200 prototypes and, after the training was completed, the system was able to predict H and E forms with accuracy above 75%. At the same time, the system had difficulty predicting form I, due to a small amount of data in the training samples.


Assignment of Protein Sequence to Functional Family Using Neural Network and Dempster-Shafer Theory

• Purpose
– Using neural networks, efficiently predict protein function

• Using databases such as Prosite, Pfam, and Prints, either query the databases for motifs within a protein in question, or query for the absence or presence of arbitrary combinations of motifs.


• Given a training set, induce a classifier able to assign novel protein sequences to one of the protein families represented in the training set

• Once trained, the classifier will be able to assign novel proteins to specific functional families based on its knowledge of the training set


• Input data
– Taken from the Prosite database, which contains over 1100 entries. Each entry describes a function shared by some proteins. In the experiment, one Prosite documentation entry corresponded to a protein class, and each protein class could, in turn, be characterized by one or more motif patterns/profiles. Only motifs considered significant matches by profileScan were chosen.

• DBNN was used as the classifier.
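
One simple way to turn motif matches into classifier input is a presence/absence vector over a fixed motif list. The sketch below assumes that representation; the slides do not state how the motif matches were encoded, and the motif names shown in the usage note are hypothetical placeholders, not real Prosite identifiers.

```python
def motif_feature_vector(matched_motifs, motif_vocabulary):
    """Return a 0/1 list marking which motifs of the vocabulary matched.

    matched_motifs   : motif names reported as significant for one protein
    motif_vocabulary : ordered list of all motif names used as features
    """
    matched = set(matched_motifs)
    return [1 if motif in matched else 0 for motif in motif_vocabulary]
```

With a vocabulary of `["MOTIF_A", "MOTIF_B", "MOTIF_C"]`, a protein matching `MOTIF_C` and `MOTIF_A` would be encoded with ones in the first and third positions.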


• 585 proteins, each belonging to one of ten classes, were used, out of which subsets of varying size were picked randomly to become the training set.

• Once the DBNN was trained, all 585 proteins were used as the test set to determine accuracy

With only 10% of the total training samples, a DBNN could be constructed to classify proteins with 95% accuracy.
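
The evaluation protocol described above (train on a random subset, test on the full set) can be sketched as follows. The classifier here is a plain nearest-neighbour stand-in using Hamming distance, not a DBNN; the function names and the `seed` parameter are hypothetical and serve only to make the sampling repeatable.

```python
import random


def nearest_neighbour_predict(train, features):
    """Stand-in classifier: label of the training item closest in Hamming distance."""
    closest = min(train, key=lambda item: sum(a != b for a, b in zip(item[0], features)))
    return closest[1]


def accuracy_for_fraction(dataset, fraction, seed=0):
    """Train on a random fraction of the data, then score on the whole data set.

    dataset : list of (feature_list, label) pairs
    """
    rng = random.Random(seed)
    n_train = max(1, round(fraction * len(dataset)))
    train = rng.sample(dataset, n_train)
    # As on the slide, the test set is the full data set, training items included.
    correct = sum(nearest_neighbour_predict(train, x) == y for x, y in dataset)
    return correct / len(dataset)
```

Because the test set contains the training samples, accuracy measured this way tends to rise with the training fraction partly by construction, which is worth keeping in mind when reading the reported figures.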


• The number of false positives generated by DBNN was significantly lower than the number resulting from a Prosite search.

• As the size of the training set approaches 100% of the data set, the number of false positives produced by DBNN approaches zero.

The number of false positives resulting from the use of the DBNN trained using training sets of different sizes.


• A second data set of 73 protein sequences drawn from five classes was used to build a DBNN classifier

• Using DBNN classifiers built from randomly sized training sets, accuracy exceeded 96% when the training set contained 22 or more sequences

• Once the training set contained more than 80% (58 or more sequences) of the data set, all sequences were correctly predicted

Result of classifying proteins containing common motifs


Future Work

• Ultimate solution to “protein folding” will probably be a hybrid

• Neural networks likely to be included due to their successful application to related problems
– Secondary structure
– Solvent accessibility
– Distance between residues in final structure
– Protein interface recognition

• In addition, neural nets can combine knowledge from multiple sources


Bibliography

• B. Rost. “Neural networks for protein structure prediction: hype or hit?” Artificial intelligence and heuristic methods for bioinformatics (2003): 34-50.

• S.N.V. Arjunan, S. Deris, R.M. Illias. “Protein Secondary Structure Prediction Based on Denoeux Belief Neural Network.” ICAIET Proceedings (2002): 554-560.

• N.M. Zaki, S. Deris, S.N.V. Arjunan. “Assignment of Protein Sequence to Functional Family Using Neural Network and Dempster-Shafer Theory.” Journal of Theoretics 5-1 (2003).

Background information

• S.N.V. Arjunan, S. Deris, R.M. Illias. “Prediction of Protein Secondary Structure.” Jurnal Teknologi 35(C) (2001): 81-90.

• T. Wessels, C.W. Omlin. “Refining Hidden Markov Models with Recurrent Neural Networks.”