NetBioSIG2014-Talk by Salvatore Loguercio

Network-augmented Genomic Analysis (NAGA) Applied to Cystic Fibrosis studies

Salvatore Loguercio, Ph.D. loguerci@scripps.edu

@sal99k http://sulab.org

July 11, 2014 Network Biology SIG – ISMB 2014

Cystic fibrosis overview

• Inherited recessive chronic disease: chest infections, lung damage, and bowel obstruction.

• 30,000 children and adults in the US (70,000 worldwide); 1,000 new cases diagnosed each year.

• Predicted median age of survival for a person with CF: late 30s.

• Primary therapy: airway clearance techniques (ACT)

Source: Cystic Fibrosis Foundation

CFTR and mucus flow

Source: http://www.flickr.com/photos/ajc1/3737955649

• Mutations cause the body to produce unusually thick, sticky mucus

• Clogs the lungs and leads to life-threatening lung infections

• Obstructs the pancreas and stops natural enzymes from helping the body break down and absorb food

[Figure labels: WT CFTR, apical surface, chloride conductance, endosomes, lysosome, degradation, SDS-PAGE]

ΔF508 CFTR cannot exit the ER

Credit: Bill Balch

CFTR mutations affect protein folding and export

A systematic approach to CF correction

Cell line: CFBE

Functional: siRNA screen

ΔF508 CFTR against PN library*

368 siRNAs that significantly rescue CFTR function

*Collection of 2,500 siRNAs targeting proteins involved in protein homeostasis (‘proteostasis’)

Biochemical: MudPIT proteomics

775 differentially interacting proteins (WT/ΔF508-CFTR)

Connect Functional with Biochemical data

Target

I) Compute all shortest paths from siRNA hits to the target through a weighted protein interaction network (Dijkstra's algorithm)

II) Prioritize connecting proteins specific to the set of high-scoring siRNA hits considered.
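The two steps above can be sketched in a few lines of Python. This is a minimal illustration, not the talk's implementation: the toy network, the node names (`H1`, `A`, `T`, ...) and the helper names `build_graph`, `dijkstra` and `connector_counts` are all hypothetical. It also keeps a single shortest path per hit (ties broken arbitrarily) rather than enumerating all shortest paths.

```python
import heapq
from collections import defaultdict


def build_graph(edges):
    """Build an undirected weighted adjacency map from (u, v, weight) triples."""
    graph = defaultdict(dict)
    for u, v, w in edges:
        graph[u][v] = w
        graph[v][u] = w
    return graph


def dijkstra(graph, source):
    """Return (distance, predecessor) maps for shortest paths from source."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry, a shorter path was already found
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    return dist, prev


def connector_counts(graph, hits, target):
    """Count how many distinct hits route through each intermediate protein."""
    # The graph is undirected, so one run from the target covers every hit.
    _, prev = dijkstra(graph, target)
    counts = defaultdict(int)
    for hit in hits:
        if hit == target or hit not in prev:
            continue  # the hit is the target itself or is unreachable
        node = prev[hit]  # next node on the path from the hit toward the target
        while node != target:
            counts[node] += 1
            node = prev[node]
    return dict(counts)


# Hypothetical toy network: three hits (H1-H3), two connectors (A, B), target T.
toy_edges = [
    ("H1", "A", 1.0), ("H2", "A", 1.0), ("A", "T", 0.5),
    ("H3", "B", 1.0), ("B", "T", 0.5), ("H1", "B", 2.0),
]
toy_graph = build_graph(toy_edges)
toy_counts = connector_counts(toy_graph, ["H1", "H2", "H3"], "T")
```

In this toy network, `toy_counts` records that two hits reach the target through `A` and one through `B`, which is the per-protein count the prioritization step relies on.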

Connect siRNA hits to a target through the Human Interactome

I. Build integrated PPI network

II. Run Shortest Path analysis

III. Control for unrelated protein hubs

Publicly available interaction data from 10 source databases and 11 studies

14,796 proteins, 169,625 interactions

Quality score [0:1] for each interaction, based on experimental evidence*

*Source: Human Integrated Protein-Protein Interaction reference (HIPPIE)

d = 9; average path length: 3.6

I. Build a weighted protein interaction network – include MS data

+ Experimental interactome

(nodes + edges)

Updated scores, based on databases and experimental interactome:

$$S(u,v) = 2 - S_{exp} - S_{db}, \qquad S_{exp} = \begin{cases} 1 & \text{if } e(u,v) \in exp \\ 0 & \text{otherwise} \end{cases}$$
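As a worked example of the updated score, the sketch below evaluates S(u,v) for a few cases. The function name `edge_weight` is hypothetical, and treating an edge seen only in the experimental interactome as having a database score of 0 is an assumption, not something stated on the slide.

```python
def edge_weight(s_db, in_experiment):
    """Return S(u,v) = 2 - S_exp - S_db; a lower value means a more trusted edge."""
    if not 0.0 <= s_db <= 1.0:
        raise ValueError("database quality score must lie in [0, 1]")
    s_exp = 1.0 if in_experiment else 0.0  # indicator for the experimental interactome
    return 2.0 - s_exp - s_db


# Edge with database score 0.75 that is also seen experimentally.
w_both = edge_weight(0.75, True)
# Same database score, but not observed in the experiment.
w_db_only = edge_weight(0.75, False)
# Assumption: an edge found only in the experiment gets a database score of 0.
w_exp_only = edge_weight(0.0, True)
```

Under this scoring, an edge supported by both sources gets the lowest weight, so shortest-path analysis preferentially routes through interactions confirmed by the MS data.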

Target

III. Control for unrelated protein hubs

siRNA library → randomly select a subset of the same size as the target set → shortest path analysis → repeat n times → randomized “hubness” for each connecting node

Target
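The randomization loop described on this slide might look like the following sketch. It assumes a caller-supplied function `count_connectors(subset)` that runs the shortest-path analysis for a set of siRNA genes and returns a per-protein count. That function, the name `randomized_connector_counts`, the default of 1000 iterations and the toy routing table are all hypothetical.

```python
import random
from collections import defaultdict


def randomized_connector_counts(library, n_hits, count_connectors, n_iter=1000, seed=0):
    """Repeat the connector count on random library subsets of size n_hits.

    Returns {protein: [count in run 1, count in run 2, ...]}, with 0 recorded
    for runs in which a protein was not used by any path.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    runs = [count_connectors(rng.sample(library, n_hits)) for _ in range(n_iter)]
    nodes = set()
    for run in runs:
        nodes.update(run)
    return {node: [run.get(node, 0) for run in runs] for node in nodes}


# Hypothetical stand-in for the shortest-path step: each library gene routes
# through a fixed connector (or through none).
toy_route = {"G1": "A", "G2": "A", "G3": "B", "G4": None}


def toy_count(subset):
    counts = defaultdict(int)
    for gene in subset:
        if toy_route[gene] is not None:
            counts[toy_route[gene]] += 1
    return dict(counts)


toy_n_rnd = randomized_connector_counts(list(toy_route), 2, toy_count, n_iter=200)
```

Each list in `toy_n_rnd` is the randomized "hubness" distribution for one connecting protein. A generic hub shows high counts for most random subsets, whereas a protein specific to the real hits does not.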

Randomization – select proteins specific for the set of siRNA hits

For each protein connecting siRNA hits to the target, compute:

Nsp: number of distinct siRNA hits that utilize the protein on its shortest path to the target

Nrnd: randomized Nsp

$$p\text{-value} = \frac{\mathrm{sum}(N_{rnd} \geq N_{sp})}{\mathrm{length}(N_{rnd})}$$

Nsp, Nrnd and the associated p-value are used to prioritize connecting proteins specific to the set of siRNA hits considered
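The empirical p-value is the fraction of randomized runs in which a protein is used at least as often as it is by the real siRNA hits. A minimal sketch, with a hypothetical function name and made-up example numbers:

```python
def empirical_p_value(n_sp, n_rnd):
    """Fraction of randomized counts that are >= the observed count n_sp."""
    if not n_rnd:
        raise ValueError("need at least one randomized count")
    return sum(1 for x in n_rnd if x >= n_sp) / len(n_rnd)


# Made-up example: observed count of 5, ten randomized counts.
example_n_rnd = [1, 2, 5, 0, 3, 6, 2, 1, 0, 4]
example_p = empirical_p_value(5, example_n_rnd)  # two of ten runs reach 5 or more
```

A small p-value indicates that the protein connects the real hits to the target more often than it connects random library subsets, so it is less likely to be an unrelated hub.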

CFTR – PN connectors – first degree – real vs. randomized

Select: Nsp ≥ 3 and Nsp/Nrnd ≥ 2 (12 proteins)
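The selection rule can be written as a simple filter. This sketch assumes that Nrnd in the ratio is the mean of the randomized counts, which the slide does not state. The function name, the protein labels `P1` to `P3` and the numbers are hypothetical.

```python
from statistics import mean


def select_connectors(n_sp, n_rnd, min_nsp=3, min_ratio=2.0):
    """Keep proteins with Nsp >= min_nsp and Nsp / mean(Nrnd) >= min_ratio."""
    selected = []
    for protein, observed in n_sp.items():
        # Assumption: the randomized reference value is the mean over all runs.
        expected = mean(n_rnd.get(protein, [0]))
        # Compare as a product so a randomized mean of 0 needs no division.
        if observed >= min_nsp and observed >= min_ratio * expected:
            selected.append(protein)
    return sorted(selected)


# Made-up counts for three hypothetical connecting proteins.
example_n_sp = {"P1": 6, "P2": 4, "P3": 2}
example_n_rnd = {"P1": [1, 2, 3], "P2": [3, 3, 3], "P3": [0, 0, 0]}
example_selected = select_connectors(example_n_sp, example_n_rnd)
```

Here only `P1` passes: `P2` is used almost as often by random subsets, and `P3` falls below the minimum count.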

Assessing candidate regulators

42 candidate regulators

31 previously screened

11 novel genes

22 (71%) previously identified as hits

8 (73%) validate in de novo experiments

Validation of predicted protein targets

siRNA screen CFTR rescue of function

8/11 (73%) novel candidate regulators validate

Gene Symbol; Solo vs. MudPIT; Vx809 vs. MudPIT

SRRM1 x
CDC5L x
NDKB x
TPR x
AIFM1 x
2ABB x
KPCD2 x
PLSCR1 x
MAP3K14 x
TFG x x
XRCC5 x x
CTNB1 x
XPO1 x
MCM7 x
WDR61 x
PP2AB x
H2AFX x

Validation of predicted targets - Specificity

X: predicted : validated

New condition: Vx-809 drug

X: predicted : validated

Validation of predicted targets - Coverage

Restrain flow through a subset of direct interactors

Gene Symbol; Solo vs. MudPIT (partial); Solo vs. MudPIT (full); Vx809 vs. MudPIT (full)

SRRM1 x x
EIF3L x
STAU1 x
CAN2 x
SNRPA x
AUP1 x

Good specificity; sub-optimal coverage

Summary

• NAGA is a network-based method to integrate functional genomics data (e.g. siRNA screens) with interactomics datasets (e.g. AP-MS, MudPIT)

• Useful for prioritizing novel functional targets and for identifying relevant network modules

• It leverages publicly available information on protein-protein interactions and thus is readily applicable to many scenarios where a connection between functional and biochemical data is sought

• Good specificity, coverage to be improved

Contact loguerci@scripps.edu

@sal99k http://sulab.org

Andrew Su

Su Lab

William Balch

Darren Hutt

Daniela Roth

Chao Wang

Anita Pottekat

Sumit Chanda

Stephen Soon

Dieter Wolf

Trey Ideker

Anne Carvunis

Jean Wang

Daniel Quan

Travel funding to ISMB 2014 was generously provided by NSF and the NetBio SIG committee
