Sequence variations, protein interactions, and diseases Teresa Przytycka NIH / NLM / NCBI October 7, 2009 Inferring protein interaction from sequence co-evolution


  • Sequence variations, protein interactions, and diseases

    Teresa Przytycka

    NIH / NLM / NCBI

    October 7, 2009

    Inferring protein interaction from sequence co-evolution

  • Protein interaction and sequence co-evolution

  • Interacting proteins are expected to co-evolve to ensure proper binding

  • Mirrortree method: inferring protein interaction from the co-evolution principle [Goh et al. 2000, Pazos and Valencia 2001]

    [Figure: phylogenetic trees of Protein A and Protein B, built from their orthologs in Organism 1, Organism 2, Organism 3, ..., Organism n.]

    BSOSC Review, November 2008

  • Distance matrix and evolutionary vector

    [Figure: distance matrix over organisms 1, 2, 3, ..., n and the evolutionary vector derived from it.]

  • Mirrortree method: inferring protein interaction from the co-evolution principle

    [Figure: phylogenetic trees of Protein A and Protein B, built from their orthologs in Organism 1 through Organism n. Compute the correlation between the two.]
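A minimal sketch of the correlation step, assuming each protein's tree is summarized by a pairwise distance matrix over the same n organisms in the same order. The function names and the toy distances are hypothetical. Pearson correlation is a common choice for mirrortree-style comparisons, but the slide itself only says "compute correlation".

```python
import itertools
import math


def upper_triangle(matrix):
    """Flatten the upper triangle of a square distance matrix into a vector."""
    n = len(matrix)
    return [matrix[i][j] for i, j in itertools.combinations(range(n), 2)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def mirrortree_score(dist_a, dist_b):
    """Correlate the evolutionary vectors of two proteins."""
    return pearson(upper_triangle(dist_a), upper_triangle(dist_b))


# Hypothetical distances between orthologs of Protein A in 4 organisms.
dist_a = [
    [0.0, 0.2, 0.5, 0.9],
    [0.2, 0.0, 0.4, 0.8],
    [0.5, 0.4, 0.0, 0.6],
    [0.9, 0.8, 0.6, 0.0],
]
# Toy Protein B: every distance doubled, so the two vectors are
# perfectly linearly related.
dist_b = [[2 * d for d in row] for row in dist_a]

score = mirrortree_score(dist_a, dist_b)
print(round(score, 6))
```

A score near 1 means the two distance matrices vary together across organisms. On its own this does not distinguish functional co-evolution from shared speciation history.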

  • Simple idea, but lots of questions…

    • How to separate co-evolution due to common speciation history from co-evolution due to function?

    • Is the co-evolution signal distributed uniformly over the sequence, or found between binding sites only?

    • Predicting interaction specificity

    Kann et al., Proteins 2007

    Kann et al., JMB 2009

    Jothi et al., JMB 2007

    Jothi et al., Bioinformatics 2005

  • Challenges

    • How to separate co-evolution due to common speciation history from co-evolution due to function?

    • Is the co-evolution signal distributed uniformly over the sequence, or found between binding sites only?

    • Predicting interaction specificity

    Kann et al., Proteins 2007

    Kann et al., JMB 2009

    Jothi et al., JMB 2007

    Jothi et al., Bioinformatics 2005

  • Do binding sites co-evolve more tightly?

    Kann et al., JMB 2009

  • Binding sites are important, but they are not the only contributor to the co-evolutionary signal

    Kann et al., JMB 2009

  • Predicting interacting domains

    Given interacting multi-domain proteins, predict the domains that are in contact

    Jothi et al., JMB 2007

  • Mirror tree approach can be used to recognize interacting domains

    [Figure panels: MSA of Protein A and MSA of Protein B (orthologs from Organism 1 through Organism n); domain alignments; similarity matrices; correlation. Correlation values shown: 0.63, 0.83, 0.79, 0.59, 0.91, 0.89.]

    RESULTS: In 64% of cases, the domain pair with the highest correlation was interacting (55% expected by chance)

    Jothi et al., JMB 2007
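The selection rule behind the 64% result (take the domain pair with the highest correlation) can be sketched as below. The six correlation values are the ones printed on the slide. Arranging them as a 2 x 3 table, with rows as domains of Protein A and columns as domains of Protein B, is an assumption, and `best_domain_pair` is a hypothetical helper name.

```python
# Correlation values copied from the slide. The 2 x 3 layout (rows =
# domains of Protein A, columns = domains of Protein B) is an assumption.
correlations = [
    [0.63, 0.83, 0.79],
    [0.59, 0.91, 0.89],
]


def best_domain_pair(corr):
    """Return (row, column, value) of the highest-scoring domain pair."""
    candidates = (
        (i, j, value)
        for i, row in enumerate(corr)
        for j, value in enumerate(row)
    )
    return max(candidates, key=lambda item: item[2])


i, j, value = best_domain_pair(correlations)
print(f"Predicted contact: A domain {i}, B domain {j} (r = {value})")
```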

  • Predicting interacting domains

    Given interacting multi-domain proteins, predict the domains that are in contact (Jothi et al., JMB 2007)

    Given a protein-protein interaction network, predict the domains that are in contact (Guimaraes et al., Genome Biology 2008)

  • Protein interaction and sequence co-evolution

  • Predicting domain interaction from protein interaction networks

  • Parsimony approach

    Assumption: Protein interactions are mediated by domain interactions

    Hypothesis: Interactions evolved in the most parsimonious way

    Method: Find the smallest set of domain pairs whose interaction would explain all protein interactions in the network

    Guimaraes et al., Genome Biology 2008
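To make the "smallest explaining set" idea concrete, here is a toy sketch that uses exhaustive search in place of the linear programming formulation from the talk. It is only practical for tiny networks. The protein names, domain names, and the function `smallest_explaining_set` are hypothetical.

```python
import itertools


def smallest_explaining_set(protein_domains, interactions):
    """Find a smallest set of domain pairs that explains every interaction.

    An interaction (p, q) counts as explained if at least one chosen domain
    pair has one domain in p and the other in q.
    """
    # One constraint per protein interaction: the domain pairs that
    # could mediate it.
    constraints = [
        {tuple(sorted((a, b)))
         for a in protein_domains[p]
         for b in protein_domains[q]}
        for p, q in interactions
    ]
    all_pairs = sorted(set().union(*constraints))
    # Try candidate sets in order of increasing size, so the first
    # set that satisfies every constraint is a smallest one.
    for size in range(len(all_pairs) + 1):
        for chosen in itertools.combinations(all_pairs, size):
            chosen_set = set(chosen)
            if all(chosen_set & constraint for constraint in constraints):
                return chosen_set
    return None  # reached only if some interaction has no candidate pair


# Hypothetical network: proteins mapped to their domains.
protein_domains = {
    "P1": ["A", "B"],
    "P2": ["C"],
    "P3": ["C", "D"],
    "P4": ["A"],
}
interactions = [("P1", "P2"), ("P1", "P3"), ("P4", "P2")]

print(smallest_explaining_set(protein_domains, interactions))
```

In this toy network the single pair (A, C) explains all three interactions, so no larger set is needed.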

  • Linear programming formulation

    For each domain pair (Di, Dj): a variable xij taking value 0 or 1

    Objective function (representing the parsimony assumption): [formula not recovered]

    Constraints: one per protein interaction (Pm, Pn) [formula not recovered]

    Interacting domain pairs: domain pairs with xij = 1

    Additional problems to solve:

    • Model the noise in the network

    • Estimate p-values

    Guimaraes et al., Genome Biology 2008
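The objective and constraint formulas on this slide did not survive extraction. One natural reading of the text that remains (a 0/1 variable per domain pair, one constraint per protein interaction, and an objective encoding parsimony) is the covering program below. This is a reconstruction under those assumptions, not necessarily the exact formulation used in the paper.

```latex
\begin{aligned}
\text{minimize}\quad   & \sum_{(i,j)} x_{ij} \\
\text{subject to}\quad & \sum_{D_i \in P_m,\; D_j \in P_n} x_{ij} \;\ge\; 1
                         && \text{for every interacting protein pair } (P_m, P_n), \\
                       & x_{ij} \in \{0, 1\}
                         && \text{for every domain pair } (D_i, D_j).
\end{aligned}
```

Each constraint requires at least one selected domain pair per protein interaction, and minimizing the sum selects as few domain pairs as possible.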

  • Results compared to previous methods: identifying the interacting domain pair in an interacting protein pair

    BoSC, October 2006

  • Protein interaction and sequence co-evolution

  • Predicting domain interaction from protein interaction networks

  • Combining genetic sequence variation, genome-wide expression profiles, and protein interaction to infer pathways dysregulated in complex diseases

  • Genetic variations in individuals affect gene expression levels

    [Figure: gene expression matrix with rows Gene 1, Gene 2, Gene 3, ... and columns Control 1 to Control 3 and Case 1 to Case 8, linked by eQTL to loci carrying genotype variations in the genomes of the cases.]

    Huang et al., Bioinformatics 2009
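As an illustration of what an eQTL association captures, the sketch below correlates genotype at each locus with the expression of each gene across samples and keeps the strong pairs. The slide does not say which association test was used, so plain Pearson correlation with a fixed threshold is an assumption. All names and data are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def eqtl_associations(genotypes, expression, threshold):
    """Return (locus, gene, r) for every pair with |r| >= threshold."""
    hits = []
    for locus, alleles in genotypes.items():
        for gene, levels in expression.items():
            r = pearson(alleles, levels)
            if abs(r) >= threshold:
                hits.append((locus, gene, r))
    return hits


# Hypothetical data for six samples: genotype coded as 0, 1, or 2 copies
# of the minor allele, and expression in arbitrary units.
genotypes = {
    "locus_1": [0, 1, 2, 0, 1, 2],
    "locus_2": [0, 0, 1, 1, 2, 2],
}
expression = {
    "gene_1": [1.0, 2.0, 3.0, 1.0, 2.0, 3.0],
    "gene_2": [5.0, 5.1, 4.9, 5.0, 5.2, 4.8],
}

hits = eqtl_associations(genotypes, expression, threshold=0.9)
print(hits)
```

A real analysis would also need multiple-testing correction and covariate handling, which this sketch leaves out.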

  • Bringing in the PPI network and other high-throughput networks

    [Figure: the same gene expression matrix (Control 1 to Control 3, Case 1 to Case 8) and genomic loci linked by eQTL, with a causal gene and a target gene marked.]

    Kim et al., submitted

  • Associations of genes expressed differently in disease and control groups are the primary target

    [Figure: gene expression matrix split into controls and disease cases, linked by eQTL to loci. Labels: representative target disease genes; putative causal mutations.]

  • Uncovering causal genes and dysregulated pathways

    [Figure: gene expression matrix split into controls and disease cases, linked by eQTL to loci. Labels: disease markers; causal genes.]

    Kim et al., submitted

  • Acknowledgments

    Former lab members

    • Katia Guimaraes (associate professor, Brazil)

    • Raja Jothi (currently PI at NIEHS)

    • Elena Zotenko (currently Max Planck Institute)

    Current lab members

    • Dong Yeon Cho

    • Yoo-ah Kim

    • Yang Huang

    • Damian Wojtowicz

    • Jie Zheng

    Collaborators

    • Maricel Kann (UMBC)


  • Evolutionarily conserved regions help separate functional co-evolution from co-evolution due to common speciation history

    • Compute the conservation profile (low entropy = evolutionarily conserved)

    • Use conserved positions only

    • Apply additional correction using previously mentioned methods

    Kann et al., Proteins 2007
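A minimal sketch of the conservation-profile step, assuming conservation is measured by per-column Shannon entropy of a multiple sequence alignment, as the "low entropy" label suggests. The toy alignment, the entropy cutoff, and the function names are hypothetical. Gap handling is omitted.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of the residues in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def conserved_positions(msa, max_entropy):
    """Return indices of alignment columns with entropy <= max_entropy."""
    n_cols = len(msa[0])
    profile = [column_entropy([seq[i] for seq in msa]) for i in range(n_cols)]
    return [i for i, h in enumerate(profile) if h <= max_entropy]


# Hypothetical alignment of four orthologs, five columns each.
msa = [
    "MKVLA",
    "MKILA",
    "MRVLG",
    "MKVLS",
]

keep = conserved_positions(msa, max_entropy=0.5)
print(keep)
```

Distance matrices for the co-evolution analysis would then be computed from the retained columns only.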