protein RNA DNA Predicting Protein Function

Protein RNA DNA Predicting Protein Function. Biochemical function (molecular function) What does it do? Kinase??? Ligase??? Page 245


Protein RNA DNA

Predicting Protein Function


Biochemical function (molecular function)

What does it do? Kinase? Ligase?

Page 245


Function based on ligand-binding specificity

What (who) does it bind?

Page 245


Function based on biological process

What is it good for? Amino acid metabolism?

Page 245


Function based on cellular location

Where is it active? Nucleolus? Cytoplasm?

[Figure labels: DNA, RNA]

Page 245


Function based on cellular location

Where is the protein expressed? Brain? Testis? Where is it underexpressed?

[Figure labels: DNA, RNA]

Page 245


GO (Gene Ontology)
http://www.geneontology.org/

• The GO project aims to develop three structured, controlled vocabularies (ontologies) that describe gene products in terms of their associated:

  • molecular functions (F)
  • biological processes (P)
  • cellular components (C)

Ontology is a description of the concepts and relationships that can exist for an agent or a community of agents
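
As a rough illustration of how the three GO vocabularies partition what is known about one gene product, the sketch below stores annotations with the slide's aspect codes (F, P, C) and groups them by ontology. The class `GOAnnotation`, the function `group_by_aspect` and the protein name are hypothetical. The three GO identifiers are quoted from memory and should be checked against geneontology.org before reuse.

```python
from dataclasses import dataclass

# Aspect codes as used on the slide.
ASPECTS = {
    "F": "molecular function",
    "P": "biological process",
    "C": "cellular component",
}


@dataclass(frozen=True)
class GOAnnotation:
    """One (gene product, GO term) pair. Hypothetical helper, not a GO API."""
    gene_product: str
    go_id: str
    term: str
    aspect: str

    def __post_init__(self):
        if self.aspect not in ASPECTS:
            raise ValueError(f"unknown GO aspect: {self.aspect!r}")


def group_by_aspect(annotations):
    """Group term names under the ontology they belong to."""
    grouped = {name: [] for name in ASPECTS.values()}
    for annotation in annotations:
        grouped[ASPECTS[annotation.aspect]].append(annotation.term)
    return grouped


# Illustrative annotations for a made-up protein (assumed, not from the slides).
EXAMPLE = [
    GOAnnotation("hypothetical_protein_X", "GO:0004672", "protein kinase activity", "F"),
    GOAnnotation("hypothetical_protein_X", "GO:0006520", "amino acid metabolic process", "P"),
    GOAnnotation("hypothetical_protein_X", "GO:0005730", "nucleolus", "C"),
]

if __name__ == "__main__":
    for ontology, terms in group_by_aspect(EXAMPLE).items():
        print(f"{ontology}: {', '.join(terms)}")
```

The three groups answer the three questions raised on the earlier slides: what does it do, what is it good for, and where is it active.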


Inferring protein function: the bioinformatics approach

• Based on homology

• Based on functional characteristics (the “protein signature”)


Homologous proteins

Rule of thumb: proteins are considered homologous if they are at least 25% identical over a length of more than 100 residues.
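
The rule of thumb can be checked mechanically once two sequences are aligned. The sketch below is one possible reading of it, under two assumptions that the slide does not state: identity is counted only over columns where neither sequence has a gap, and "length" means the number of such columns. The function names are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap_chars="-.~"):
    """Return (percent identity, number of compared columns) for two aligned sequences.

    Columns where either sequence has a gap character are skipped (an assumption).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a in gap_chars or b in gap_chars:
            continue
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0, 0
    return 100.0 * identical / compared, compared


def passes_rule_of_thumb(aligned_a, aligned_b, min_identity=25.0, min_length=100):
    """Apply the slide's rule: at least 25% identity over more than 100 residues."""
    identity, compared = percent_identity(aligned_a, aligned_b)
    return identity >= min_identity and compared > min_length


if __name__ == "__main__":
    print(percent_identity("ACDE", "ACDF"))      # short toy alignment
    print(passes_rule_of_thumb("ACDE", "ACDF"))  # too short for the rule
```

A high identity over a short stretch fails the test on purpose: short matches are common by chance, which is why the rule carries a length condition.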


Homologous proteins

Proteins with a common evolutionary origin

Orthologs - Proteins from different species that evolved by speciation.
Example: hemoglobin (human) vs hemoglobin (mouse)

Paralogs - Proteins encoded within a given species that arose from one or more gene duplication events.
Example: hemoglobin (human) vs myoglobin (human)


COGs (database): Clusters of Orthologous Groups of proteins

> Each COG consists of individual orthologous proteins or orthologous sets of paralogs.

> Orthologs typically have the same function, allowing transfer of functional information from one member to an entire COG.

Reference: Classification of conserved genes according to their homologous relationships. (Koonin et al., NAR)


Inferring protein function based on the protein signature


The Protein Signature

Expression pattern:
• Where is it expressed?

Motif (or fingerprint):
• a short, conserved region of a protein
• typically 10 to 20 contiguous amino acid residues

Domain:
• a region of a protein that can adopt a three-dimensional structure


Protein Motifs

Protein motifs can be represented as a consensus or a profile

        1                                                   50
ecblc   MRLLPLVAAA TAAFLVVACS SPTPPRGVTV VNNFDAKRYL GTWYEIARFD
vc      MRAIFLILCS V...LLNGCL G..MPESVKP VSDFELNNYL GKWYEVARLD
hsrbp   ~~~MKWVWAL LLLAAWAAAE RDCRVSSFRV KENFDKARFS GTWYAMAKKD

Conserved region (alignment columns 41 to 46), with the substitutions observed in each column:

        GTWYEI
         K  AV
             M

Pattern: GXW[YF][EA][IVLM]
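
The pattern GXW[YF][EA][IVLM] maps almost directly onto a regular expression, which is a convenient way to see what a consensus motif does and does not match. The sketch below is a minimal illustration using the three aligned fragments from this slide. The function names are hypothetical, and treating `X` as "any uppercase letter" is an assumption about the slide's notation.

```python
import re

# Aligned fragments as shown on the slide ('.' and '~' mark gaps, spaces are layout).
SEQUENCES = {
    "ecblc": "MRLLPLVAAA TAAFLVVACS SPTPPRGVTV VNNFDAKRYL GTWYEIARFD",
    "vc":    "MRAIFLILCS V...LLNGCL G..MPESVKP VSDFELNNYL GKWYEVARLD",
    "hsrbp": "~~~MKWVWAL LLLAAWAAAE RDCRVSSFRV KENFDKARFS GTWYAMAKKD",
}


def motif_to_regex(motif):
    """Translate the slide's notation into a regex: X stands for any residue."""
    return re.compile(motif.replace("X", "[A-Z]"))


def find_motif(motif, sequence):
    """Return (1-based start, matched text) for each hit in the ungapped sequence."""
    ungapped = re.sub(r"[^A-Z]", "", sequence.upper())
    return [(m.start() + 1, m.group()) for m in motif_to_regex(motif).finditer(ungapped)]


if __name__ == "__main__":
    for name, seq in SEQUENCES.items():
        print(name, find_motif("GXW[YF][EA][IVLM]", seq))
```

Positions are reported in each protein's own ungapped coordinates, so they differ between sequences even though the hits share the same alignment columns.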


Searching for Protein Motifs

- PROSITE: a database of protein patterns that can be searched with either regular-expression patterns or sequence profiles.

- PHI-BLAST: searches for a specific protein sequence pattern, with local alignments surrounding the match.

- MEME: searches for common motifs in unaligned sequences.


Protein Domains

• Domains can be considered as building blocks of proteins.

• Some domains can be found in many proteins with different functions, while others are only found in proteins with a certain function.


DNA-binding domain: zinc finger


Varieties of protein domains

Page 228

Extending along the length of a protein

Occupying a subset of a protein sequence

Occurring one or more times


Example of a protein with 2 domains: Methyl CpG binding protein 2 (MeCP2)

[Figure: domain diagram with the labels MBD and TRD]

The protein includes a Methylated DNA Binding Domain (MBD) and a Transcriptional Repression Domain (TRD). MeCP2 is a transcriptional repressor.


Result of an MeCP2 blastp search: a methyl-binding domain shared by several proteins


Are proteins that share only a domain homologous?


Pfam

> Database that contains a large collection of multiple sequence alignments of protein domains

Based on Profile hidden Markov Models (HMMs).


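Profile HMM (Hidden Markov Model)

An HMM is a probabilistic model of the MSA consisting of a number of interconnected states: match (M), insert (I) and delete (D).

[Figure: profile HMM for alignment columns 16 to 19, with states M16 to M19, I16 to I19 and D16 to D19]

Alignment columns 16 to 19:

16 17 18 19
D  R  T  R
D  R  T  S
S  -  -  S
S  P  T  R
D  R  T  R
D  P  T  S
D  -  -  S
D  -  -  S
D  -  -  S
D  -  -  R

Match-state emission probabilities:

M16: D 0.8, S 0.2
M17: P 0.4, R 0.6
M18: T 1.0
M19: R 0.4, S 0.6

Transition probabilities labelled in the figure: 100% (four transitions) and 50% (two transitions)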
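
The emission probabilities on this slide are simply residue frequencies in each alignment column, ignoring gaps. The sketch below recomputes them from the ten aligned rows of columns 16 to 19. It is a simplification: it uses raw counts with no pseudocounts or sequence weighting, which real profile-HMM software normally applies, and the function names are hypothetical.

```python
from collections import Counter

# Columns 16-19 of the alignment shown on the slide, one string per sequence.
ALIGNMENT = [
    "DRTR", "DRTS", "S--S", "SPTR", "DRTR",
    "DPTS", "D--S", "D--S", "D--S", "D--R",
]


def match_emissions(alignment, gap="-"):
    """Per-column residue frequencies among the non-gap characters."""
    emissions = []
    for column in zip(*alignment):
        residues = [aa for aa in column if aa != gap]
        counts = Counter(residues)
        total = len(residues)
        emissions.append({aa: n / total for aa, n in counts.items()} if total else {})
    return emissions


def gap_fractions(alignment, gap="-"):
    """Fraction of sequences that have a gap in each column."""
    return [sum(aa == gap for aa in column) / len(column) for column in zip(*alignment)]


if __name__ == "__main__":
    for number, probs in zip(range(16, 20), match_emissions(ALIGNMENT)):
        print(f"M{number}:", dict(sorted(probs.items())))
    print("gap fractions:", gap_fractions(ALIGNMENT))
```

Half of the sequences have a gap in columns 17 and 18, which appears to be what the 50% labels in the figure express: half of the paths leave the match states and pass through delete states there.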


Pfam

> The Pfam database is based on two distinct classes of alignments:

– Seed alignments, which are deemed to be accurate and are used to produce Pfam-A

– Alignments derived by automatic clustering of SwissProt, which are less reliable and give rise to Pfam-B


Physical properties of proteins


DNA binding domains have a relatively high frequency of basic (positive) amino acids

GCN4    MKDPAALKRARNTEAARRSSRARKLQRM

zif268  MERPYACPVESCDRRFSRSDELTRHIRIHT

myoD    SKVNEAFETLKRCTSSNPNQRLPKVEILRNAIR
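
To put a number on "relatively high frequency", one can count the basic residues in each fragment. The sketch below counts lysine (K) and arginine (R) only. Whether histidine should also count as basic is left as a parameter, since the slide does not say. The function name is hypothetical.

```python
# DNA-binding fragments as shown on the slide.
FRAGMENTS = {
    "GCN4": "MKDPAALKRARNTEAARRSSRARKLQRM",
    "zif268": "MERPYACPVESCDRRFSRSDELTRHIRIHT",
    "myoD": "SKVNEAFETLKRCTSSNPNQRLPKVEILRNAIR",
}


def basic_fraction(sequence, basic="KR"):
    """Fraction of residues that are basic; non-letter characters are ignored."""
    residues = [aa for aa in sequence.upper() if aa.isalpha()]
    if not residues:
        return 0.0
    return sum(aa in basic for aa in residues) / len(residues)


if __name__ == "__main__":
    for name, seq in FRAGMENTS.items():
        print(f"{name}: {basic_fraction(seq):.2f} (K+R), "
              f"{basic_fraction(seq, basic='KRH'):.2f} (K+R+H)")
```

A single composition value like this is weak evidence on its own. It becomes useful when compared against a control set of proteins that do not bind DNA.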


Transmembrane proteins have a unique hydrophobicity pattern
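
A common way to make the hydrophobicity pattern visible is a sliding-window hydropathy profile. The sketch below uses the Kyte-Doolittle scale. The window of 19 residues and the threshold of 1.6 are the values usually quoted for spotting membrane-spanning helices with this scale. They are assumptions here and are not taken from the slide. The test sequence is synthetic.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=19):
    """Mean hydropathy of every full window along the sequence."""
    scores = [KYTE_DOOLITTLE[aa] for aa in sequence.upper()]  # KeyError on unknown residues
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def candidate_tm_windows(sequence, window=19, threshold=1.6):
    """1-based start positions of windows whose mean hydropathy reaches the threshold."""
    profile = hydropathy_profile(sequence, window)
    return [i + 1 for i, value in enumerate(profile) if value >= threshold]


if __name__ == "__main__":
    # Synthetic example: a leucine stretch flanked by lysines.
    toy = "K" * 10 + "L" * 20 + "K" * 10
    print(candidate_tm_windows(toy))
```

Runs of consecutive window starts above the threshold mark candidate transmembrane segments. Soluble proteins rarely produce such runs.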


Knowledge Based Approach

• IDEA: Find the common properties of a protein family (or any group of proteins of interest) that are unique to the group and distinguish it from all other proteins. Generate a model for the group and predict new members of the family that have similar properties.


Knowledge Based Approach

Basic Steps
1. Building a Model

• Generate a dataset of proteins with a common function (e.g., DNA-binding proteins)

• Generate a control dataset

• Calculate the properties that are characteristic of the protein family you are interested in, for all the proteins in the data (the DNA-binding proteins and the non-DNA-binding proteins)

• Represent each protein in a set by a vector of calculated features and build a statistical model to split the groups


Basic Steps
2. Predicting the function of a new protein

• Calculate the properties for a new protein and represent them in a vector

• Predict whether the tested protein belongs to the family
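
The two basic steps (build a model, then predict) can be sketched end to end in a few lines. Everything specific in the sketch is an assumption made for illustration: the two features (fraction of basic and of acidic residues), the nearest-centroid rule standing in for the "statistical model", and the tiny training sets. The positive set reuses the three DNA-binding fragments from the earlier slide, and the control set reuses the three fragments from the motif slide with gaps removed. Six short fragments are far too few for a real predictor.

```python
import math

BASIC = set("KR")
ACIDIC = set("DE")

# Step 1 data: proteins with the common function, and a control set.
DNA_BINDING = {
    "GCN4": "MKDPAALKRARNTEAARRSSRARKLQRM",
    "zif268": "MERPYACPVESCDRRFSRSDELTRHIRIHT",
    "myoD": "SKVNEAFETLKRCTSSNPNQRLPKVEILRNAIR",
}
CONTROL = {
    "ecblc": "MRLLPLVAAATAAFLVVACSSPTPPRGVTVVNNFDAKRYLGTWYEIARFD",
    "vc": "MRAIFLILCSVLLNGCLGMPESVKPVSDFELNNYLGKWYEVARLD",
    "hsrbp": "MKWVWALLLLAAWAAAERDCRVSSFRVKENFDKARFSGTWYAMAKKD",
}


def features(sequence):
    """Represent a protein as a vector: (fraction basic, fraction acidic)."""
    seq = sequence.upper()
    return (
        sum(aa in BASIC for aa in seq) / len(seq),
        sum(aa in ACIDIC for aa in seq) / len(seq),
    )


def centroid(vectors):
    """Component-wise mean of a collection of feature vectors."""
    vectors = list(vectors)
    return tuple(sum(column) / len(vectors) for column in zip(*vectors))


def build_model(positive, control):
    """Step 1: one centroid per group (a deliberately simple stand-in model)."""
    return {
        "DNA-binding": centroid(features(s) for s in positive.values()),
        "control": centroid(features(s) for s in control.values()),
    }


def predict(model, sequence):
    """Step 2: assign the new protein to the group with the nearest centroid."""
    vector = features(sequence)
    return min(model, key=lambda label: math.dist(vector, model[label]))


if __name__ == "__main__":
    model = build_model(DNA_BINDING, CONTROL)
    print(model)
    print(predict(model, DNA_BINDING["GCN4"]))
```

In practice the feature vector would be much richer (motifs, domains, hydrophobicity, predicted structure), and the model would be evaluated on proteins that were not used to build it.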


TEST CASE

Y14: a protein sequence translated from an ORF (Open Reading Frame) obtained from the complete Drosophila genome

>Y14
PQRSVGWILFVTSIHEEAQEDEIQEKFCDYGEIKNIHLNLDRRTGFSKGYALVEYETHKQALAAKEALNGAEIMGQTIQVDWCFVKGG


>Y14
PQRSVGWILFVTSIHEEAQEDEIQEKFCDYGEIKNIHLNLDRRTGFSKGYALVEYETHKQALAAKEALNGAEIMGQTIQVDWCFVKGG

Y14 DOES NOT BIND RNA


Projects 2011-12


Instructions for the final project
Introduction to Bioinformatics 2011-12

Key dates

19.12 List of suggested projects published
** You are highly encouraged to choose a project yourself or to find a relevant project that can help in your research

29.1 Submission of project overview (PowerPoint presentation, max 5 slides)
- Title
- Main question
- Major tools you are planning to use to answer the questions

30.1/31.1 Presentation of project overview

7.3 Poster submission

14.3 Poster presentation


2. Planning your research

After you have described the main question or questions of your project, you should carefully plan your next steps.

A. Make sure you understand the problem and read the necessary background to proceed.

B. Formulate your working plan, step by step.

C. After you have a plan, start by extracting the necessary data and decide on the relevant tools to use at the first step. When running a tool, make sure to summarize the results and extract the relevant information you need to answer your question. It is recommended to save the raw data for your records; don't present raw data in your final written project. Your initial results should guide you towards your next steps.

D. When you feel you have explored all the tools you can apply to answer your question, summarize and draw conclusions. Remember that NO is also an answer, as long as you are sure it is NO. Also remember this is a course project, not only a HW exercise.


3. Summarizing the final project in a poster (in pairs)

Prepare in PPT, poster size 90-120 cm

Title of the project
Names and affiliation of the students presenting

The poster should include 5 sections:

Background: should include a description of your question (can add a figure)

Goal and Research Plan: describe the main objective and the research plan

Results (main section): present your results in 3-4 figures, describe each figure (figure legends) and give a title to each result

Conclusions: summarize the conclusions of your project in points

References: list the references of papers/databases/tools used for your project

Examples of posters will be presented in class