4/28/2006 2:10 AM    MSG Pipeline    Jingjun Sun
Mining the Structural Genomics Pipeline: Identification of Protein Properties that Affect High-throughput Experimental Analysis
by Chern-Sing Goh, Ning Lan et al.
Presented by Jingjun Sun
Background
Methods
Results
Conclusion
Structural genomics generates unique datasets of protein physical and chemical properties.
Discover significant protein features that influence whether a protein's structure is determined.
Identify potential bottlenecks at various stages of the structural genomics pipeline.
Method: decision trees
Features: conservation, charged residues, hydrophobic patches, binding partners, length
Pipeline stages: cloning, expression, purification, structural determination
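As a rough illustration of how a decision tree picks a split such as a length threshold, the sketch below scans candidate thresholds for one numeric feature and keeps the one with the lowest weighted Gini impurity. The toy data and the function names (`gini`, `best_split`) are hypothetical, and Gini impurity is an assumed splitting criterion; this is not the software or criterion used in the paper.

```python
from typing import List, Optional, Tuple


def gini(labels: List[int]) -> float:
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_split(values: List[float], labels: List[int]) -> Tuple[Optional[float], float]:
    """Return (threshold, weighted impurity) of the best 'value < threshold' split."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    # Start from the impurity of the unsplit node; a split must improve on it.
    best_threshold, best_score = None, gini(labels)
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # no threshold separates equal values
        threshold = (lo + hi) / 2.0
        left = [lab for val, lab in pairs if val < threshold]
        right = [lab for val, lab in pairs if val >= threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score


# Hypothetical toy data: protein lengths and whether each target was expressed (1) or not (0).
lengths = [120, 180, 250, 300, 410, 560, 640, 800]
expressed = [1, 1, 1, 1, 1, 0, 0, 0]

threshold, impurity = best_split(lengths, expressed)
print(f"split: length < {threshold}, weighted Gini = {impurity:.3f}")
```

A full tree repeats this search on each resulting subset and over every feature, which is how nested conditions on several features arise.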
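Structure vs No Structure (TargetDB)

proteins that have COGs / total targets = 15952 / 27267 = 0.59
structure proteins that have COGs / total structure targets = 297 / 350 = 0.85

[Figure: decision tree. Root split: COG < 1 [27267]. Further splits: DE < 9.7, GAVLI < 31.7, C < 1.8.
COG < 1 leaf: 11262 no structure, 53 structure (.005). COG >= 1 branch: 15655 no structure, 297 structure.
Remaining leaves (no structure, structure, fraction with structure): 2432, 6 (.002); 2600, 16 (.006); 8044, 249 (.03); 2579, 26 (.01).]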
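The numbers on this slide can be cross-checked with a few lines of arithmetic. The sketch below assumes each leaf pair reads as (no structure, structure); the names `leaf 2` to `leaf 5` are placeholders, since the slide text does not say which split each leaf hangs from.

```python
# Leaf counts as (no structure, structure). The pairing is an assumption
# based on the order in which the numbers appear on the slide.
leaves = {
    "COG < 1": (11262, 53),
    "leaf 2": (2432, 6),
    "leaf 3": (2600, 16),
    "leaf 4": (8044, 249),
    "leaf 5": (2579, 26),
}

total_targets = sum(no + yes for no, yes in leaves.values())
total_structures = sum(yes for _, yes in leaves.values())

# Targets outside the "COG < 1" leaf belong to at least one COG.
with_cog = total_targets - sum(leaves["COG < 1"])
structures_with_cog = total_structures - leaves["COG < 1"][1]

frac_cog_all = with_cog / total_targets
frac_cog_structures = structures_with_cog / total_structures
leaf_fractions = {name: yes / (no + yes) for name, (no, yes) in leaves.items()}

print(f"targets: {total_targets}, structures: {total_structures}")
print(f"with COGs: {with_cog} ({frac_cog_all:.2f})")
print(f"structures with COGs: {structures_with_cog} ({frac_cog_structures:.2f})")
for name, frac in leaf_fractions.items():
    print(f"{name}: {frac:.3f}")
```

Under that reading, the leaf counts add up to the 27267 targets and 350 structures quoted on the slide.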
Feature    Total targets            Structure targets
COGs       0.59 (15952 / 27267)     0.85 (297 / 350)
GAVLI      34.6%                    38.4%
Hp_aa      15                       7
Length     291                      243
KR         12.5%                    12.6%
…

COGs row: proteins that have COGs / total targets, and structure proteins that have COGs / total structure targets.
Feature    Description                                        Total targets    SD targets
GAVLI      Average GAVLI composition (%)                      34.6             38.4
hp_aa      Average no. of hp residues within a hp stretch     15               7
SEGAVLIWKAAVLIRGAVLI…
NQSEGAVLIEKDWRLIRMS…
20
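The two features can be computed for the example fragments above roughly as follows. This is a sketch under an assumption: a "hp stretch" is taken to be a maximal run of consecutive G, A, V, L or I residues, which the slide does not state, so `mean_hp_stretch_length` may not match the authors' hp_aa definition. Both function names are illustrative.

```python
import re

HYDROPHOBIC = "GAVLI"


def gavli_composition(seq: str) -> float:
    """Percentage of residues in seq that are G, A, V, L or I."""
    seq = seq.upper()
    return 100.0 * sum(seq.count(aa) for aa in HYDROPHOBIC) / len(seq)


def mean_hp_stretch_length(seq: str) -> float:
    """Mean length of maximal runs of GAVLI residues (assumed stretch definition)."""
    runs = re.findall(f"[{HYDROPHOBIC}]+", seq.upper())
    return sum(len(run) for run in runs) / len(runs) if runs else 0.0


# The two fragments shown on the slide, without the trailing ellipsis.
for fragment in ("SEGAVLIWKAAVLIRGAVLI", "NQSEGAVLIEKDWRLIRMS"):
    print(fragment,
          f"GAVLI = {gavli_composition(fragment):.1f}%",
          f"mean stretch = {mean_hp_stretch_length(fragment):.1f}")
```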
Expressed vs cloned & not expressed

not-expressed proteins that have COGs / total not expressed = 3182 / 5622 = 0.57
expressed proteins that have COGs / total expressed = 5827 / 8319 = 0.70

[Figure: decision tree. Label: cloned. Split conditions: COG < 1 [14385], any_partners < 1, length < 524, pI < 5.9.
Node fractions shown: .58, .73, .76, .44, .13.
Node count pairs shown: 2203 3043; 979 2785; 1816 1776; 208 659; 416 57; 3182 5827.]
Purified vs expressed & not purified
not-purified proteins that have COGs / total not purified = 0.65
purified proteins that have COGs / total purified = 0.75

Structure vs purified & not structure
not-structure proteins that have COGs / total not structure = 0.75
structure proteins that have COGs / total structure = 0.85
Percent of targets that belong to COGs (number of targets in parentheses)

total         59% (27711)
cloned        63% (14767)      not cloned       54% (12944)
expressed     70% (8587)       not expressed    53% (6180)
purified      75% (4115)       not purified     65% (4472)
structures    85% (370)        not structures   75% (3745)
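The target counts in parentheses also show how many targets survive each stage. The short sketch below derives the stage-to-stage pass rates from those counts; the pass rates are computed here and do not appear on the slide.

```python
# Target counts per pipeline stage, as read off the slide.
passed = [("total", 27711), ("cloned", 14767), ("expressed", 8587),
          ("purified", 4115), ("structures", 370)]
# Targets that did not pass each stage.
failed = {"cloned": 12944, "expressed": 6180, "purified": 4472, "structures": 3745}

pass_rates = {}
for (prev_name, prev_count), (name, count) in zip(passed, passed[1:]):
    # Passed and failed targets should add up to the targets entering the stage.
    assert count + failed[name] == prev_count, name
    pass_rates[name] = 100.0 * count / prev_count
    print(f"{prev_name} -> {name}: {pass_rates[name]:.1f}% pass")
```

The passed and failed counts add up at every stage, so the four splits partition the 27711 targets consistently.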
Numbers of hydrophobic residues (number of targets in parentheses)

total         15.1 (27711)
cloned        16.3 (14767)     not cloned       13.8 (12944)
expressed     11.6 (8587)      not expressed    22.7 (6180)
purified      6.5 (4115)       not purified     16.3 (4472)
structures    6.6 (370)        not structures   6.5 (3745)
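One way to sanity-check these averages is to recombine each pair of subgroups into its parent group with a count-weighted mean. The sketch below does that with the values read off the slide; the 0.1 tolerance is an arbitrary allowance for rounding in the slide's one-decimal figures.

```python
# (mean number of hydrophobic residues, number of targets), as read off the slide.
groups = {
    "total": (15.1, 27711),
    "cloned": (16.3, 14767), "not cloned": (13.8, 12944),
    "expressed": (11.6, 8587), "not expressed": (22.7, 6180),
    "purified": (6.5, 4115), "not purified": (16.3, 4472),
    "structures": (6.6, 370), "not structures": (6.5, 3745),
}

# Each parent group is split into a passed and a failed subgroup.
splits = [
    ("total", "cloned", "not cloned"),
    ("cloned", "expressed", "not expressed"),
    ("expressed", "purified", "not purified"),
    ("purified", "structures", "not structures"),
]


def weighted_mean(a, b):
    """Count-weighted mean of two (mean, count) pairs."""
    return (a[0] * a[1] + b[0] * b[1]) / (a[1] + b[1])


recombined = {}
for parent, yes, no in splits:
    recombined[parent] = weighted_mean(groups[yes], groups[no])
    reported = groups[parent][0]
    status = "consistent" if abs(recombined[parent] - reported) < 0.1 else "differs"
    print(f"{parent}: reported {reported}, recombined {recombined[parent]:.2f} ({status})")
```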
charged residue composition: bottleneck at purification
number of binding partners: bottlenecks at purification and structure
length: bottleneck at expression
COGs
charged residue composition
number of hydrophobic residues in hydrophobic stretches
number of binding partners
length
nuclear localization signals
…
Thanks!