Microbial Functional Genomics, Genomic Technologies, And Their Applications


Environmental Sciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA

Jizhong (Joe) Zhou

[email protected], 865-576-7544


Community Genome Arrays

Whole Genome Microarrays

Functional Gene Arrays

Gene Expression Patterns

Protein array

Microbial Community Diversity & Mechanisms

Genomic Technology

Microbial Functional Genomics

Microbial Ecology & Extremophiles

Oligonucleotide Arrays

Producing Magnetic Nanoparticles

Uranium Reduction

Community & Ecosystem Genomics

Challenges in functional genomics

• Defining gene functions: 30-60% of open reading frames are functionally unknown.

• Regulatory networks: differences in gene number could not explain phenotypic differences, suggesting that regulation is the key.

Microbial Functional Genomics: Integrating Gene Expression Profiling, Bioinformatics, Mutagenesis and Proteomics

2-D Gels

Mass Spectrometry

Genome Sequence

Figure 2

Heat-map columns: F1, F2 (fumarate), N1, N2 (nitrate).
Cluster correlations shown in the figure: Group 1, r = 0.93; Group 2, r = 0.86; Group 3, r = 0.84; Group 4, r = 0.90; Group 5, r = 0.86; Group 6, r = 0.81.

ORF #, putative function (a) | Mean intensity ratio (b), fumarate | Mean intensity ratio (b), nitrate

A. Electron transport:
1203, cytochrome c552, nrfA | 1.11 (±0.04) c | 3.33 (±0.47)
3458, dimethyl sulfoxide reductase, dmsB | 3.16 (±1.26) | 1.93 (±0.04)
4138, Ni/Fe hydrogenase, hydA | 3.37 (±1.34) | 3.46 (±0.16)
2987, fumarate reductase, fcc | 2.25 (±0.35) | 1.78 (±0.10)
3455, outer membrane protein | 4.16 (±1.31) | 2.46 (±0.51)
3457, dimethyl sulfoxide reductase, dmsA | 4.99 (±0.49) | 3.21 (±0.80)
4141, Ni/Fe hydrogenase, hydB | 2.13 (±0.71) | 2.09 (±0.41)
4142, Ni/Fe hydrogenase, hydC | 3.11 (±1.22) | 1.34 (±0.18)
3454, deca-heme cytochrome c | 5.91 (±1.49) | 2.51 (±0.21)
1863, fumarate reductase, flavocytochrome c3 | 2.08 (±0.42) | 2.01 (±0.48)
1752, formate dehydrogenase, fdhA | 5.57 (±0.86) | 10.38 (±4.45)
1754, formate dehydrogenase, fdhC | 4.74 (±0.56) | 12.48 (±1.61)
2851, periplasmic nitrate reductase, napA | 3.53 (±1.43) | 1.05 (±0.31) c
2952, di-heme split-soret cytochrome c | 3.55 (±0.32) | nd d
2849, ferredoxin-type protein napH | 2.04 (±0.15) | 0.82 (±0.04)
3388, prismane | 2.89 (±1.62) | 0.59 (±0.06)
3005, formate dehydrog., Se-cysteine, fdhA | 2.29 (±0.58) | 1.24 (±0.36) c
2389, fumarate reductase, frdA | 2.69 (±0.98) | nd d
2390, fumarate reductase, frdB | 1.80 (±0.05) | nd d
3134, bacterioferritin, bfr | 0.30 (±0.06) | 0.26 (±0.12)
624, cytochrome c' | 0.54 (±0.01) | 0.33 (±0.02)
4403, cbb3-cytochrome oxidase, ccoP | 0.52 (±0.02) | 0.36 (±0.03)
4405, cbb3-cytochrome oxidase, ccoQ | 0.60 (±0.13) | 0.34 (±0.05)
4406, cbb3-cytochrome oxidase, ccoN | 0.64 (±0.33) | 0.36 (±0.04)
487, cytochrome d ubiquinol oxidase, cydA | 0.62 (±0.06) | 0.26 (±0.06)
488, cytochrome d ubiquinol oxidase, cydB | 0.83 (±0.05) | 0.29 (±0.14)
2262, mono-heme c-type cytochrome, scyA | 0.50 (±0.05) | 0.37 (±0.07)
3280, probable oxidoreductase ordL | 0.43 (±0.12) | 0.55 (±0.08)
3290, conserved hypothetical protein | 0.42 (±0.28) | 0.58 (±0.08)
4795, cytochrome b, cybP | 0.37 (±0.09) | 0.47 (±0.02)
722, NADH dehydrogenase, ndh | 0.43 (±0.09) | 0.65 (±0.11)

B. Intermediary carbon metabolism:
3961, succinyl-CoA synthetase, sucD | 0.99 (±0.26) c | 0.54 (±0.06)
749, glucose-6-phosphate isomerase, gpi | 0.59 (±0.11) | 0.67 (±0.11)
748, transaldolase B, talB | 0.51 (±0.10) | 0.70 (±0.04)
3960, succinyl-CoA synthetase, sucC | 0.46 (±0.02) | 0.44 (±0.04)
3956, succinate dehydrogenase, sdhA | 0.54 (±0.08) | 0.50 (±0.08)
3954, citrate synthase, gltA | 0.52 (±0.17) | 0.46 (±0.12)
1073, malate oxidoreductase, sfcA | 0.52 (±0.18) | 0.49 (±0.06)
3958, 2-oxoglutarate dehydrogenase, sucA | 0.40 (±0.08) | 0.41 (±0.05)
2778, malate dehydrogenase, mdh | 0.58 (±0.20) | 0.40 (±0.06)
3959, 2-oxoglutarate dehydrogenase, sucB | 0.75 (±0.12) | 0.41 (±0.05)
3957, succinate dehydrogenase, sdhB | 0.70 (±0.03) | 0.57 (±0.14)

C. Transcription regulation:
3006, H2O2-activator, hpkR, LysR family | 0.44 (±0.11) | 0.57 (±0.03)
2099, histidine utilization repressor, hutC | 0.41 (±0.10) | 0.40 (±0.05)
3965, ferric uptake regulatory protein, fur | 0.59 (±0.01) | 0.60 (±0.06)
1987, transcriptional regulator, DeoR family | 0.65 (±0.24) | 0.24 (±0.05)
4603, sensor histidine kinase, kinA | 0.48 (±0.16) | 1.10 (±0.13)
1386, ATP-dependent protease, hslV | 0.40 (±0.12) | nd d
721, transcriptional regulator, LacI family | 0.43 (±0.05) | 0.93 (±0.21) c
4019, chemotaxis CheV homolog | 2.27 (±0.81) | 1.32 (±0.11)
1382, tetrathionate sensor kinase, ttrS | 2.43 (±1.02) | 1.74 (±0.05)


Structure-Based Function Prediction

BIOINFORMATICS

[Figure: phage display vector diagram. Recoverable labels: Gene, oriR6Kγ, KanR, loxP, oriColE1, AMPR, promoter, Fos, loxP, M13 ori, CMPR, PSP promoter, pJun, oriSC101, Jun gene III, Transcription & Translation, Jun-pIII, Fos, POI, Extracellular, Periplasm, Cytoplasm.]

Phage Display

TRANSCRIPTOMICS

PROTEOMICS

DNA Microarrays

pDS31

sacB

aac1 Gmr

MUTAGENESIS

Whole genome microarrays available at ORNL

Rhodopseudomonas palustris: Photosynthetic bacterium (MGP, GTL)

Nitrosomonas europaea: Ammonium-oxidizing bacterium (MGP)

Desulfovibrio vulgaris: Sulfate-reducing bacterium (GTL, NABIR)

Geobacter metallireducens: Metal-reducing bacterium (GTL)

Shewanella oneidensis MR-1: Metal-reducing bacterium (MGP, GTL)

Deinococcus radiodurans R1: Radiation-resistant bacterium (GTL)

Methanococcus maripaludis (GTL)

Two primary uses of microarrays for functional analysis

• Hypothesis-generating (exploratory): gene expression profiling under different conditions, e.g., radiation responses in Deinococcus radiodurans.

• Hypothesis-driven: e.g., mutant characterization in Shewanella oneidensis MR-1.

Deinococcus radiodurans R1 Genome: 3.3 Mb

Chromosome I: 2.65 Mbp
Chromosome II: 412.3 kbp
Mega-plasmid: 177.5 kbp
Plasmid: 45.7 kbp

% G+C: 66.6%
# ORFs: 3,195
Mean ORF size: 937 bp
% Coding: 91%
# Similar to known proteins: 52.2%
# Conserved hypothetical: 16%
# Hypothetical: 31.5%
rRNA operons: 9

*D. radiodurans R1 genome sequence and annotation courtesy of TIGR
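The genome statistics on this slide can be cross-checked with a little arithmetic. The sketch below sums the four replicon sizes and estimates the coding fraction from the ORF count and mean ORF size. All numbers come from the slide; the variable names are illustrative only.

```python
# Replicon sizes in kbp, as listed on the slide.
replicons_kbp = {
    "Chromosome I": 2650.0,
    "Chromosome II": 412.3,
    "Mega-plasmid": 177.5,
    "Plasmid": 45.7,
}
n_orfs = 3195        # number of ORFs reported on the slide
mean_orf_bp = 937    # mean ORF size in bp reported on the slide

# Total genome size in bp.
genome_bp = sum(replicons_kbp.values()) * 1000

# Rough coding fraction: total ORF length divided by genome length.
coding_fraction = n_orfs * mean_orf_bp / genome_bp

print(f"Genome size: {genome_bp / 1e6:.2f} Mbp")
print(f"Estimated coding fraction: {coding_fraction:.1%}")
```

The total comes to roughly 3.29 Mbp and the estimated coding fraction to about 91%, consistent with the 3.3 Mb and 91% quoted on the slide.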

Radiation Resistance of D. radiodurans R1

• Most E. coli cells are killed at ~500 Gy.

• D. radiodurans exhibits a shoulder of resistance up to ~5,000 Gy, with no loss of viability.

• Very little is known about the DNA repair pathways that enable D. radiodurans to resist ionizing and UV irradiation.

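[Figure: radiation survival curves for E. coli and D. radiodurans R1.]

[Figure: gel of DNA sampled during recovery; size markers 23.1, 9.4, 6.6 and 4.4 (labeled bp in the source); lanes M, CK, 0, 1.5, 3, 5, 9 and 24 hours post irradiation.]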

γ-radiation acting on cells:

• Direct damage: γ-photon (20%)

• Indirect damage: irradiation-induced free radicals (80%)

Consequences: DNA damage; replication impaired; cell division arrested; mRNA degradation; protein degradation; cellular functions impaired; cells grow slowly or die.

Responses:

• DNA damage repair and re-initiation of DNA synthesis (early events after irradiation)

• Minimizing free radical levels (late events after irradiation)

Deinococcus cells can survive acute γ-radiation due to their ability to repair direct damage and remove free radicals.

Gene Expression Profiling: Experimental Design

Recovery of D. radiodurans (wild-type strain R1) from acute radiation (exposure dose = 15,000 Gy of γ-radiation)

Cell sample | Recovery time (hours) at 32 °C
Control (non-irradiated) | none
1 | 0
2 | 0.5
3 | 1.5
4 | 3
5 | 5
6 | 9
7 | 12
8 | 16
9 | 24

3 biological replicates (different mRNAs)

4 technical replicates

Total replicates: 12

Samples compared: irradiated versus control

Collaboration with Mike Daly

Gene #, putative function (a) | Ratio (fold) (b) | Time (hr) (c)

Cluster correlations shown in the figure: r = 0.83, r = 0.71, r = 0.77.

A. recA-like activation pattern
DR0911 | DNA-directed RNA polymerase beta subunit, rpoC | 1.99 (±1.37) | 0.5
DR2220 | Tellurium resistance protein TerB | 3.13 (±1.49) | 5
DR2221 | Tellurium resistance protein TerE | 5.24 (±2.94) | 3
DRB0069 | Subtilisin serine protease | 3.18 (±1.39) | 3
DRB0067 | Extracellular nuclease with fibronectin III domains | 4.37 (±1.21) | 3
DR0261 | 8-oxo-dGTPase, mutT | 3.36 (±1.68) | 0.5
DRA0344 | LexA repressor, HTH + protease, lexA | 1.80 (±1.08) | 1.5
DR0099 | ssDNA-binding protein, ssb | 3.01 (±1.20) | 0.5
DR2129 | Ribosomal component L17, rplQ | 5.92 (±2.09) | 1.5
DR2128 | RNA polymerase alpha subunit, rpoA | 4.03 (±2.80) | 1.5
DR0324 | Probable glutamate formiminotransferase | 3.30 (±1.47) | 0.5
DR2337 | Uncharacterized protein | 7.41 (±5.71) | 1.5
DRA0346 | PprA protein, involved in DNA damage resistance | 3.52 (±1.94) | 0.5
DR1825 | Protein-export membrane protein | 3.21 (±1.48) | 1.5
DR1771 | UvrA ABC family ATPase, uvrA-1 | 3.52 (±1.15) | 1.5
DRA0345 | Predicted esterase | 10.05 (±4.39) | 1.5
DR0422 | Trans-aconitate methylase | 18.85 (±7.46) | 1.5
DR1143 | Uncharacterized protein | 8.85 (±4.26) | 1.5
DR0003 | Uncharacterized protein | 14.03 (±5.53) | 1.5
DR1776 | Nudix family pyrophosphatase | 4.70 (±2.83) | 1.5
DR2340 | RecA, recA | 7.98 (±3.86) | 1.5
DR2610 | Periplasmic binding protein, fliY | 4.13 (±1.67) | 0.5
DR1645 | Teichoic acid biosynthesis protein, wecG | 5.88 (±2.79) | 1.5
DR0696 | V-type ATPase synthase, subunit K | 7.19 (±2.16) | 1.5
DR0421 | Uncharacterized protein | 4.94 (±2.30) | 1.5
DR1775 | Superfamily I helicase, uvrD | 3.30 (±1.69) | 1.5
DR1561 | UDP-N-acetylglucosamine 2-epimerase, wecB | 6.00 (±1.40) | 1.5
DR2285 | MutY, A/G-specific adenine glycosylase, mutY | 2.36 (±0.40) | 3
DR2356 | Nudix family hydrolase | 3.35 (±0.45) | 3
DR2275 | Excinuclease ABC subunit B, uvrB | 4.93 (±1.81) | 3
DR0206 | Uncharacterized protein | 5.45 (±2.65) | 3
DR0204 | Uncharacterized membrane protein | 6.01 (±1.35) | 3
DR1354 | Excinuclease ABC subunit C, uvrC | 3.78 (±0.42) | 3
DR0203 | Uncharacterized membrane protein | 3.82 (±0.86) | 1.5
DR0205 | ABC transporter ATPase | 4.10 (±2.45) | 3
DR1357 | ABC transporter, permease subunit | 6.79 (±2.56) | 1.5
DR2482 | Predicted transcription regulator | 5.75 (±2.92) | 1.5
DR2483 | McrA nuclease | 5.43 (±1.22) | 1.5
DRA0008 | Conserved membrane protein | 6.60 (±2.00) | 3
DRA0234 | Uncharacterized protein | 12.76 (±5.27) | 1.5
DR1359 | ABC transporter, periplasmic subunit | 24.83 (±11.13) | 1.5
DR2127 | Ribosomal protein S4, rpsD | 5.40 (±1.50) | 3
DR1356 | ABC transporter, ATP-binding protein | 9.85 (±5.98) | 3
DRB0136 | Putative DEAH ATP-dependent helicase, hepA | 5.22 (±0.46) | 3
DR1548 | Bacillus ykwD ortholog, PRP1 superfamily protein | 5.62 (±2.35) | 3
DR0207 | ComEA-related protein, secreted | 15.47 (±8.31) | 3
DRA0249 | Metalloproteinase, leishmanolysin-like | 6.47 (±4.43) | 3
DR0665 | Uncharacterized protein | 11.66 (±5.74) | 3
DR0596 | Resolvasome RuvABC, subunit B, ruvB | 3.22 (±1.31) | 0.5
DR0912 | DNA-directed RNA polymerase beta subunit, rpoB | 3.19 (±0.80) | 0.5

B. Growth-related activation pattern
DR1172 | Lea76/Lea29-like desiccation resistance protein | 2.66 (±0.60) | 24
DR0461 | Bacillus yacB ortholog | 2.58 (±0.81) | 24
DR1595 | 6-phosphogluconate dehydrogenase, gnd | 2.30 (±0.52) | 24
DRA0043 | TDP-rhamnose synthetase | 5.08 (±2.12) | 12
DRA0042 | Glucose-1-phosphate thymidylyltransferase, rfbA | 3.70 (±1.19) | 12
DRA0031 | Glucose-1-phosphate thymidylyltransferase | 2.48 (±1.64) | 12
DRA0065 | Chromosomal protein HU HupA, hupA | 7.71 (±2.07) | 24
DR2263 | Bacterioferritin, iron chelating protein | 6.41 (±1.97) | 16
DRA0275 | Soluble cytochrome C | 4.80 (±1.22) | 24
DR1279 | Superoxide dismutase (Mn) | 3.91 (±1.43) | 24

C. Repressed pattern
DR1126 | RecJ-like DHH superfamily phosphohydrolase | 0.33 (±0.12) | 12
DR1337 | Transaldolase, tal | 0.25 (±0.05) | 3
DR0728 | Fructokinase, cscK | 0.37 (±0.13) | 3
DR0977 | Phosphoenolpyruvate carboxykinase, pckA | 0.48 (±0.22) | 1.5
DR1742 | Glucose-6-phosphate isomerase, pgi | 0.42 (±0.12) | 1.5
DR1998 | Catalase, CATX, katA | 0.23 (±0.07) | 3
DR1146 | GSP26 general stress-like protein | 0.25 (±0.06) | 1.5
DR0493 | Formamidopyrimidine-DNA glycosidase, mutM | 0.46 (±0.09) | 1.5
DR0674 | Argininosuccinate synthase, ASSY, argG | 0.35 (±0.15) | 3
DR2620 | Cytochrome oxidase subunit I, COX1, caaA | 0.45 (±0.25) | 5

recA-like expression profile:

DNA replication; DNA repair; recombination; cell wall metabolism; cellular transport; uncharacterized proteins

Induced Genes (early to mid phases):

Glyoxylate shunt

Superoxide dismutase

Stress response

Proteases, nucleases

Repressed Genes (early to mid phases):

TCA cycle

Genes involved in de novo synthesis of amino acids and nucleotides

• More than 800 genes are induced at 1.5 h after irradiation.

• More genes are up-regulated than down-regulated.

• More than 40% of the functionally unknown genes change significantly upon irradiation.

Hierarchical Clustering Analysis of Expression Profile Patterns

Discovery of a Novel ATP-dependent DNA ligase

[Figure: relative expression level (0 to 16) versus time (0 to 25 h) for four genes: DRB0098, HD family phosphohydrolase and nucleotide kinase; DRB0099, uncharacterized conserved protein; DRB0100, predicted DNA ligase; DR2069, NAD-dependent ligase, dnlJ.]

Alignment of conserved ligase motifs (numbers between motifs are the residue offsets and gaps shown in the figure; the final column is the one marked * in the figure):

gi        name         start  motif I       gap  motif III  gap  motif IIIa    gap  motif IV  *
6459863   DNLJ_DR2069  123    FTGELKIDGLSV  44   LEVRGEVYL  44   KAILYAVGKRDG  50   ADGTVLK   300
2506362   DNLJ_ECOLI   110    WCCELKLDGLAV  46   LEVRGEVFL  44   TFFCYGVGVLEG  51   IDGVVIK   290
1352290   DNL1_MOUSE   561    FTCEYKYDGQRA  41   FILDTEAVA  31   CLYAFDLIYLNG  51   CEGLMVK   723
1706482   DNL4_HUMAN   201    FYIETKLDGERM  46   CILDGEMMA  28   CYCVFDVLMVNN  51   EEGIMVK   365
1706481   DNL3_HUMAN   416    MFSEIKYDGERV  40   MILDSEVLL  27   CLFVFDCIYFND  51   LEGLVLK   573
11498455  AF0849       91     VVLEEKMNGYNV  40   YMLCCEAVG  16   EFFLFDVREGKT  46   REGVVFK   232
15894039  CAC0752      38     CVLEEKVDGANC  49   YVMYGEWLY  12   YFMEFDIFDKKE  50   RENLEIR   188
6460914   DRB0100      35     VVVTEKLDGENT  37   WRFCGENVY  12   YFYLFSVWDDLN  42   MEGYVVR   165
consensus/100%                hh...KhsG.th       h.h.sE.hh       .hh.ashh...t       .-sh.h+
secondary str (1DGS): EEEEE EEE EEEEEEEE EEEE EEEEE

• A novel ATP-dependent DNA ligase was highly expressed with a recA-like profile.

• It shares conserved motifs with ligases from eukaryotes.

Ligase (DRB0100)

Liu et al. 2003. PNAS, 100: 4191-4196

• Energy pathway switching, less energy produced.

• Minimizing energy demands --- Shutdown de novo biosynthetic pathways

• Energy pathway switching --- less free radicals produced.

• Increasing activities of the genes involved in removing free radicals.

• Shutdown de novo biosynthetic pathways to minimize energy requirement.

• Increasing activities of proteases and nucleases to provide amino acids and nucleotides for protein, DNA and RNA synthesis.

Energy Free radicals

Biosynthetic precursors

Highly coordinated regulations

Shewanella oneidensis MR-1

Electron donors: formate, lactate, pyruvate, amino acids, H2

Electron acceptors: O2; NO3-, NO2-; Mn(IV), Mn(III), Fe(III); fumarate; DMSO, TMAO, S0; S2O3 2-; U(VI), Cr(VI), Tc, As, Se, I

Redox interfaces: mine waste, Black Sea, Oneida Lake, Green Bay, Panama Basin, Mississippi Delta, North Sea

With this kind of versatility, what will it really do?

Habitats:
• lake & marine sediments
• deep sea
• oil brine
• spoiled food

DOE Shewanella Federation

Members:
• ORNL ESD Microbial Functional Genomics Group
• TIGR (John Heidelberg)
• ORNL LSD, CASD (F. Larimer, B. Hettich)
• Center for Microbial Ecology, MSU (J. Tiedje, J. Cole, J. Klappenbach)
• USC, JPL (K. Nealson)
• ANL (C. Giometti)
• PNNL (J. Fredrickson, D. Smith)
• BCM (T. Palzkill)
• ISB (E. Kolker)
• B. Palsson (UCSD), Adam Arkin (LBL), M. Riley (Woods Hole)
• UCB (J. Keasling)

Activity labels in the diagram: sequencing, annotation; physiology, genetics; 2-D PAGE; microarrays, LIMS, database; physiology, MS proteomics; phage display; modeling; bioinformatics, MS; pathway construction and modeling; metabolomics.

Rapid Deduction of Stress Response Pathways in Metal/Radionuclide Reducing Bacteria

U Washington

U Missouri

National Laboratories Universities Private Organizations

(Consultant)

Large Genomes To Life Project: $38M for 5 years

UC Berkeley

Stress responses in:
• Desulfovibrio vulgaris
• Shewanella oneidensis
• Geobacter metallireducens

Summary of microarray analysis for Shewanella

Responses to 11 different electron acceptors

Mutant characterization with chemostats

Low-pH and high-pH stress

Heat shock, cold shock

Oxidative stress (e.g., H2O2)(Ting Li)

High salt

Carbon starvation

Metal stress: strontium, chromium

Hypothetical proteins

Many mutants

Defining Gene Function through Deletion Mutagenesis, ~ 80 deletion mutants

GLOBAL REGULATORS: etrA, narQ, fur, crp, arcA, envZ

cAMP-BINDING REGULATORS: cAMP1, cAMP2, cAMP3

ADENYLATE CYCLASES: cya1, cya2, cya3

OUTER MEMBRANE PROTEINS AND CYTOCHROMES: mtrC, mtrA, omcA

SIGMA FACTORS: rpoH, rpoE

STRESS RESPONSE: oxyR, bolA, dps, ompR, cpxR

DOUBLE MUTANTS: etrA-fur, etrA-crp, cpxR-cpxA, ompR-envZ

PAS domain (old annotation): 0834, 0906, 1761, 4254, 4326, 4917

Hypothetical proteins: 1377, 3584

Transcriptional factors: 220 genes, 78 within single operon

Cytochrome genes: 42 genes

Computational Prediction of the function of the SO1328 Gene Product (LysR)

C-terminal domain

N-terminal DNA-binding domain

• It was annotated as a LysR family protein.

• It is induced 5- to 7-fold by H2O2 treatment.

• It shares ~34% sequence homology with E. coli OxyR.

• Its 3D structure is similar to that of OxyR in E. coli.

Growth phenotype of LysR deletion mutant (SO1328)

[Figure: two growth-curve panels, "LysR, H2O2" and "WT, H2O2". Each plots log OD (-1.2 to 0.4) versus time (0 to 10 hours) at 0 µM and 2,000 µM H2O2.]

• Less growth was obtained when the WT cells were treated with 2,000 µM H2O2.

• Wild-type cells were sensitive to H2O2.

• No differences were observed between treatment and control for the mutant cells.

• The LysR mutant is not sensitive to H2O2.

• In E. coli, the OxyR mutant is more sensitive to H2O2.

Legend: WT and mutant, each at 0 µM and 2,000 µM H2O2.

Microarray analysis of LysR mutant in response to H2O2 stress

Deregulation of the major H2O2-responsive genes (40 µM, 2 min)

[Figure: fold induction (0 to 100) of Dps family protein, ahpC, katG-1 and ahpF in WT and the LysR mutant.]

• Key genes (e.g., dps, katG) known to be involved in oxidative stress were not affected by H2O2 in the mutant.

• Since the OxyR mutant is more resistant to H2O2, the genes involved in oxidative stress would be expected to be highly expressed, but they are not. This suggests that novel mechanisms and pathways may exist.

• The OxyR-dps double mutant is also resistant to H2O2, suggesting that the oxidative responses in MR-1 are very complicated.

Proteomics

Tools for studying proteomics

2-Dimensional gel electrophoresis

Mass spectrometry

Phage-display

Yeast two hybrid system

Protein arrays

Structural determination: X-rays, NMR

[Figure: phage display vector diagram. Recoverable labels: Gene, oriR6Kγ, KanR, loxP, oriColE1, AMPR, promoter, Fos, loxP, M13 ori, CMPR, PSP promoter, pJun, oriSC101, Jun gene III, Transcription & Translation, Jun-pIII, Fos, POI, Extracellular, Periplasm, Cytoplasm.]

Using phage-display to study protein-protein interactions and regulations

Phage display

• First key step: cloning all genes into universal vector.

• The cloning systems were optimized.

• All primers were synthesized.

• 3,853 genes were cloned.

• 50 clones were sequenced; no errors were found.

Gateway cloning vector

Expression of Shewanella proteins from the pDEST17 vector

[Figure: protein gel. Size markers: 175, 83, 62, 48, 33 and 25 kDa. Lanes: GST, EtrA, ArcA, Fur, NarQ. Band sizes labeled: 34.2 kDa, 70.2 kDa, 32.4 kDa, 20.5 kDa. n = no insert control; i = expression induced with 0.5 mM IPTG.]

Global regulatory genes are well expressed in E. coli

Icd

aceA

aceB

sdhCAB

gltA

sucCD

sucAB

1. Consistent with E. coli: icd, gltA-sdhCAB, sucABCD.

2. Different from E. coli: aceBA, suggesting that ArcA potentially regulates the glyoxylate shunt pathway.

3. Shewanella ArcA can also interact with promoters of other TCA cycle related genes (not found in E. coli): SO0970 (fumarate reductase flavoprotein subunit precursor), SO1538 (isocitrate dehydrogenase), SO2222 (fumarate hydratase).

Identification of binding motifs of ArcA by gel shifting assays

Using promoter microarray for studying protein-DNA interactions to understand regulatory network

Verification by EMSA/RT-PCR/cDNA microarray

In vitro/vivo pull down

qPCR amplification

Non-specific competitors:
1. BSA/milk
2. Random DNA

Direct binding

Challenges in protein arrays

Antibodies are commonly used as probes in protein arrays.

Two big challenges:

• Loss of activity: the antibody's active binding site may bind to the slide surface through chemical bonding, leaving it unavailable to the antigen.

• Cross-reactivity: specificity is also a big issue for antibody protein arrays.

Layer-by-layer coating of a cleaned slide:
1. Polycation
2. Wash
3. Polyanion
4. Wash
5. Polycation
(repeat)

Development of novel chemistry for protein array fabrication

Proteins are affixed on the slide by:
• Entrapment by the porous structure of the polymer
• Electrostatic interaction
• But not by covalent bonding

Langmuir 20 (2004), 8877-8885. Proteomics, in revision.

[Figure: thin film coating on a glass substrate.]

Proteins spotted on different slides: nanofilm-coated, Superamine, Poly-Lysine, Superaldehyde.

Nanofilm-coated slide:
• More sensitive
• Less background noise

[Figure: antibody arrays, panels 1 to 5, spotted with anti-human IgG, anti-fibronectin, streptavidin and BSA; annotation: 2-fold decrease.]

Antibody arrays

• A patent was filed and licensed to a company.

• Nominated by ORNL for an R&D 100 Award.

Very good specificity of the antibody-antigen reactions was obtained.

GAG GGG GAA AGC GGG GGA TCG CAA GAC CTC GCG TGA TTG GAG CGG CCG ATCCT AGC GTT XTG GAG CGC ACCT AGC GTT XYG GAG CGC ACCT AGC GTT XYZ GAG CGC A

Probe types: one-mismatch probe (X = G, T, C or A); two-mismatch probe (XY = GG, AA, AT or GA); three-mismatch probe (XYZ = GAC or AGC).

[Figure: discrimination factor (Fm/Fp, 0 to 1.2) for probes 1 to 16, grouped as perfect match, 1-mismatch, 2-mismatch, 3-mismatch, and 4 & 5 mismatch. Blank bars: polymer-coated slide. Filled bars: SuperAldehyde slide.]

Detection of Single Base Pair Differences

• Short oligos (<25 bp) without end modification, typically $20/oligo.

• More than 5-fold difference in signal intensity between PM and MM probes.

• A single mismatch can be clearly differentiated.

Main challenges

All methods defined a cutoff arbitrarily.

Identified clusters or modules are ambiguous.

Arbitrary cutoff for network identification

Correlation matrix of 5 genes (upper triangle):

1  0.3  0.9  0.5  0.4
   1    0.7  0.3  0.8
        1    0.4  0.2
             1    0.6
                  1

After cutoff Rc = 0.4 (7 interactions left):

1  0    0.9  0.5  0.4
   1    0.7  0    0.8
        1    0.4  0
             1    0.6
                  1

After cutoff Rc = 0.7 (only 3 interactions left):

1  0    0.9  0    0
   1    0.7  0    0.8
        1    0    0
             1    0
                  1
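The effect of the cutoff can be reproduced with a few lines of Python. This is a minimal sketch using the 5-gene correlation matrix from the slide; the function names are illustrative, and it assumes an entry is kept when its absolute value is greater than or equal to the cutoff Rc.

```python
# Symmetric correlation matrix of the 5 genes shown on the slide.
corr = [
    [1.0, 0.3, 0.9, 0.5, 0.4],
    [0.3, 1.0, 0.7, 0.3, 0.8],
    [0.9, 0.7, 1.0, 0.4, 0.2],
    [0.5, 0.3, 0.4, 1.0, 0.6],
    [0.4, 0.8, 0.2, 0.6, 1.0],
]

def apply_cutoff(matrix, rc):
    """Set every entry whose absolute value is below rc to zero."""
    return [[v if abs(v) >= rc else 0.0 for v in row] for row in matrix]

def count_interactions(matrix):
    """Count nonzero entries above the diagonal (each gene pair once)."""
    n = len(matrix)
    return sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if matrix[i][j] != 0.0
    )

for rc in (0.4, 0.7):
    kept = count_interactions(apply_cutoff(corr, rc))
    print(f"Rc = {rc}: {kept} interactions left")
```

With these inputs the counts are 7 at Rc = 0.4 and 3 at Rc = 0.7, matching the slide. The resulting network therefore depends strongly on where the cutoff is placed.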

Level Spacing Distribution of Yeast Gene Correlation Matrix

[Figure: level spacing distribution P(s) (0 to 1) versus level spacing (0 to 3) at cutoffs 0.8, 0.7, 0.6 and 0.5. Wigner-Dyson distribution for cutoff < 0.7; Poisson distribution for cutoff > 0.7.]

Novel approach for network identification

• Random properties: Wigner-Dyson distribution

• Nonrandom properties: Poisson distribution

Main advantages:
• Supported by universal laws
• Automatic cutoff
• Reliable, sensitive, robust

Random Matrix Theory and Level Statistics

Poisson distribution:

$$P(s) = \exp(-s)$$

Wigner-Dyson distribution:

$$P(s) = \frac{\pi}{2}\, s \exp\left(-\frac{\pi s^{2}}{4}\right)$$
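As a quick sanity check on the two level-spacing distributions, the sketch below evaluates the standard Poisson form exp(-s) and the standard Wigner-Dyson (GOE) form (π/2) s exp(-π s²/4), then confirms numerically that each integrates to 1 and has mean spacing 1. The function names and the trapezoid integrator are illustrative and are not taken from the slides.

```python
import math

def poisson_spacing(s):
    """Poisson level-spacing density: P(s) = exp(-s)."""
    return math.exp(-s)

def wigner_dyson_spacing(s):
    """Wigner-Dyson (GOE) density: P(s) = (pi/2) * s * exp(-pi * s^2 / 4)."""
    return (math.pi / 2.0) * s * math.exp(-math.pi * s * s / 4.0)

def integrate(f, a, b, n=100_000):
    """Composite trapezoid rule on [a, b] with n sub-intervals."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

for name, pdf in (("Poisson", poisson_spacing),
                  ("Wigner-Dyson", wigner_dyson_spacing)):
    # The upper limit of 40 is far enough out that both tails are negligible.
    area = integrate(pdf, 0.0, 40.0)
    mean = integrate(lambda s: s * pdf(s), 0.0, 40.0)
    print(f"{name}: area = {area:.4f}, mean spacing = {mean:.4f}")
```

The two forms differ most near s = 0: the Wigner-Dyson density vanishes there (level repulsion), while the Poisson density is at its maximum. That contrast is what allows a transition between the two regimes to be detected as the cutoff changes.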

Identification of 27 Modules from Yeast Cell Cycle Expression Data

Experimental Validation of some hypothetical proteins

• Cycloheximide inhibits protein synthesis by blocking peptidyl transferase.

• The mutants are more sensitive to this drug, suggesting that they have defective ribosomes.

• Thus the genes are likely involved in ribosomal biogenesis.

[Figure: module of seven genes: 1. dnaK, 2. htpG, 3. groEL, 4. groES, 5. Lon, 6. dnaJ, 7. SO2017.]

Functional identification of a hypothetical protein in Shewanella

For Shewanella heat shock data, SO2017 is grouped with heat shock proteins.

Experimental validation of SO2017

• The SO2017 mutant is sensitive to heat shock.

• This gene is indeed involved in the heat shock response.

• This suggests that the prediction is correct.

[Figure: growth curves, OD600 (log scale, 0.01 to 10) versus time (0 to 8 h), for DSP10 at 30 °C, SO2017 at 30 °C, DSP10 at 42 °C and SO2017 at 42 °C.]

Pioneering advances in microarray-based technologies to address challenges in microbial community genomics

Challenges:
• Specificity: environmental sequence divergences.
• Sensitivity: low biomass.
• Quantification: existence of contaminants such as humic materials, organic contaminants, metals and radionuclides.

Solutions:
• Developing different types of microarrays and novel chemistry to address different levels of specificity.
• Developing a novel signal amplification strategy to increase sensitivity.
• Optimizing microarray protocols for reliable quantification.

Summary of 50mer-based FGAs for environmental studies

• Nitrogen cycling: 302
• Sulfate reduction: 204
• Carbon cycling: 566
• Phosphorus utilization: 79
• Organic contaminant degradation: 770
• Metal resistance and oxidation: 85

• Total: 2,006 probes
• All probes are < 88% similarity

Oligonucleotide probe size: 50 bp

Tiquia et al. 2004. BioTechniques 36, 664-675
Rhee et al. 2004. AEM 70: 4303-4317
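The probe total can be verified directly from the per-category counts. The short sketch below uses only the numbers listed on the slide; the dictionary name is illustrative.

```python
# Probe counts per functional category, as listed on the slide.
fga_probes = {
    "Nitrogen cycling": 302,
    "Sulfate reduction": 204,
    "Carbon cycling": 566,
    "Phosphorus utilization": 79,
    "Organic contaminant degradation": 770,
    "Metal resistance and oxidation": 85,
}

total_probes = sum(fga_probes.values())
print(f"Total probes: {total_probes}")

# Share of the array devoted to each category.
for category, count in fga_probes.items():
    print(f"{category}: {count / total_probes:.1%}")
```

The categories sum to 2,006, the total stated on the slide. Organic contaminant degradation is the largest group, at a little over a third of the probes.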

Specificity of 50 mer microarrays

[Figure: hybridization images, panels 1 to 5, for nirS, nirK, nifH, amoA, dsrAB and pmoA probes.]

• 5 nirS genes were mixed together.

• Only corresponding genes were hybridized.

• 6 types of genes were mixed together.

• Only corresponding genes were hybridized.

Specific hybridization was obtained with probes 85% similarity

Sensitivity

Detection limit
• 50 ng pure DNA in the presence of non-target templates
• 10^7 cells

[Figure: hybridization images, panels 1 to 8. Genomic DNA: 500 ng, 50 ng, 25 ng. Cells: 1.6 × 10^9, 1.3 × 10^7, 3.0 × 10^6.]

Quantification and validation

• Microarray result is consistent with real-time PCR

1: gi4704462-TFD2: gi4704463-TFD-Microcosm3: gi4704464-TFD-Enrichment4: gi4704463-TFD5: gi4704464-TFD-Microcosm6: gi4704465-TFD-Enrichment7: gi2828015-TFD8: gi2828016-TFD-Microcosm9: gi2828017-TFD-Enrichment10: gi2828018-TFD11: gi2828019-TFD-Microcosm12: gi2828020

[Figure: real-time PCR (log copy number) and microarray hybridization (log SNR) for each gene; x axis: genes (0 to 14); y axis: log value (-4 to 12); r = 0.86.]

[Figure: log signal ratio (log R) versus log cell number (log N); points labeled 1.6 × 10^9, 8.0 × 10^8, 1.0 × 10^7, 4.0 × 10^9, 2.0 × 10^8, 2.5 × 10^7, 5.0 × 10^7, 6.0 × 10^6, 3.0 × 10^6 and 1.3 × 10^7; r^2 = 0.98.]

Quantification
• Good linear relationship
• Quantitative

[Figure: real-time PCR and microarray hybridization; gel lanes M, A1, B1, A2, B2, A3, B3, A4, B4, A5, B5, A6, B6, A7, B7, A8, B8, M; 10 fg marked.]

As low as 10 fg (2 cells) can be detected

Novel amplification approach for increasing hybridization sensitivity

Amplification is quantitative for majority of the genes

Submitted to PNAS

[Figure: site map showing the S-3 Ponds cap, Areas 1, 2 and 3, wells 005, 010, 015, 003 and 16, a north arrow, and distance labels of 30 m and 275 m.]

Sample     pH    Nitrate   Uranium   Nickel   TOC
FW-300*    6.1   1.200     0.001     0.005    30
FW-003     6.0   1060      0.01      0.015    100
FW-005     3.9   175.0     6.40      5.00     70
FW-010     3.5   42000     0.17      18.0     175
FW-015     3.4   8300      7.70      8.80     65
TPB-16     6.3   30.00     1.10      ND       65

NABIR Field Research Center Samples

• 2 L groundwater
• Genes analyzed: 16S rRNA, nirS, nirK, dsrAB, amoA
• 6 samples were taken to assess the effects of contaminants on microbial community structure

[Figure labels: contaminant source; most contaminated; less contaminated; least contaminated.]

Groundwater samples with very low biomass

• 2 L groundwater from six different sites.

• Cell counts: 1-5 × 10^5/ml

• DNA was isolated, and 1/20 of the DNA was manipulated and used for hybridization.

• Good hybridization was obtained with the DNA manipulated with the new method.

• No hybridization was obtained if the DNA was not manipulated.
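To put the biomass figures in perspective, the following back-of-the-envelope sketch converts the stated cell counts into cell equivalents per hybridization. It assumes 2 L were processed per sample and that DNA recovery was complete; neither is stated on the slide, and the variable names are illustrative.

```python
# Assumptions (not stated on the slide): 2 L processed per sample,
# 100% DNA recovery, so cell equivalents scale directly with volume.
VOLUME_ML = 2000                        # 2 L of groundwater
CELLS_PER_ML = (100_000, 500_000)       # 1-5 x 10^5 cells/ml
FRACTION_DENOMINATOR = 20               # 1/20 of the DNA was used

total_cells = tuple(c * VOLUME_ML for c in CELLS_PER_ML)
cells_per_hybridization = tuple(t // FRACTION_DENOMINATOR for t in total_cells)

print("cells in sample:       %.1e to %.1e" % total_cells)
print("cell equivalents used: %.1e to %.1e" % cells_per_hybridization)
```

Under these assumptions each hybridization received DNA from roughly 10^7 to 5 × 10^7 cells, which is at or just above the 10^7-cell detection limit quoted on the Sensitivity slide.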

[Figure: two plots, one for FW300 and one for FW010; x axis 1 to 53, y axis 0 to 40,000.]

Difference of functional genes in samples from NABIR Field Research Center

• Clear differences were observed between contaminated and noncontaminated sites.

• E.g., some genes are present in the noncontaminated site but not in the contaminated sites

Reference site

Highly contaminated site

                                      FW300     FW003     FW021     FW010     FW024
FW300                                 61(20%)   189(36%)  174(35%)  80(21%)   111(23%)
FW003                                           25(11%)   144(35%)  61(17%)   84(20%)
FW021                                                     10(5%)    64(20%)   90(24%)
FW010                                                               6(5%)     118(37%)
FW024                                                                         30(16%)
Total Genes Detected                  302       219       192       130       190
Genetic diversity, Simpson’s (1/D)a   125.5     67.1      26.6      17.4      35.7

• Overall diversity correlates with contaminant level.
• The proportion of overlapping genes between samples was consistent with the contaminant level and geochemistry.
• A significant portion (5-20%) of all detected genes was unique to each sample, even though the samples are very close to one another. Thus, important microbial populations appear to be highly heterogeneous in this groundwater system.

Overall diversity among different samples
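The reciprocal Simpson index (1/D) reported in the table can be sketched as follows. The slides do not say which abundance measure was used (for example, hybridization signal per gene), so the inputs below are made-up numbers, and the form D = Σ p_i² is the common textbook definition rather than a confirmed description of the authors' calculation.

```python
def simpson_reciprocal(abundances):
    """Reciprocal Simpson index 1/D, with D = sum of squared proportions.

    `abundances` is any list of non-negative values, one per detected gene
    (hypothetical input; the slide does not specify the abundance measure).
    """
    total = sum(abundances)
    if total <= 0:
        raise ValueError("abundances must sum to a positive value")
    d = sum((a / total) ** 2 for a in abundances)
    return 1.0 / d


even_community = [10, 10, 10, 10]     # four equally abundant genes
dominated_community = [97, 1, 1, 1]   # one gene dominates

print(simpson_reciprocal(even_community))
print(simpson_reciprocal(dominated_community))
```

With S equally abundant genes the index equals S, and it falls toward 1 as a few genes dominate, so a value well below the number of genes detected indicates an uneven distribution.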

CommOligo --- New oligo probe design program for community analysis

Number and specificity of designed probes (50-mer) by different programs

Group sequences of nirS and nirK (842 gene sequences)

Program              Total ORFs   ORFs rejected   Probes designed   Specific   Non-specific   Group-specific
ArrayOligoSelector   842          0               842               117        725            0
OligoArray           842          35              807               70         737            0
OligoArray 2.0       842          51              791               35         756            0
OligoPicker          842          657             185               141        44             0
CommOligo            842          512             330               147        0              183

• Useful for both whole genome microarrays and community arrays
• Able to design group-specific probes
• Better performance than other programs

Probes Designed for a Second Generation FGA

• Nitrogen cycling: 5089
• Carbon cycling: 9198
• Sulfate reduction: 1006
• Phosphorus utilization: 438
• Organic contaminant degradation: 5359
• Metal resistance and oxidation: 2303

Total: 23,408 genes
• 23,000 probes designed
• Will be very useful for community and ecological studies

Community Genomics

Grand challenges

• Extremely high diversity, 5000 species/g soil

• 99% of the microbial species are uncultured

Whole community sequencing

[Figure: phylogenetic tree of clones (010A-, 010B- and 010D- series) with reference sequences Ralstonia eutropha, Azoarcus eutrophus, Ralstonia NI1, Azoarcus FL05, Acidovorax 3DHB1, Rhodoferax antarcticus, Aquaspirillum autotrophicum, Pseudomonas marginalis, Pseudomonas stutzeri, Rhizobium gallicum and several uncultured clones; node support values shown; scale bar 0.05.]

• Sample from NABIR Field Research Center at ORNL
• Sequenced by DOE Joint Genome Institute
• 20 species based on 16S rRNA

Sequencing a stable thermophilic terephthalate (TA)-degrading community

• Terephthalate (TA), or 1,4-benzene dicarboxylic acid, is a major byproduct of the plastics manufacturing industry.

• Three dominant populations:
  – Pelotomaculum: converting TA to acetate and hydrogen.
  – Methanothrix: converting acetate to methane and carbon dioxide.
  – A representative of candidate bacterial phylum OP5: unknown function, but may also ferment TA.

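$$
\begin{aligned}
(1)\quad & \mathrm{TA^{2-}} + 8\,\mathrm{H_2O} \rightarrow 3\,\mathrm{acetate^-} + 3\,\mathrm{H^+} + 2\,\mathrm{HCO_3^-} + 3\,\mathrm{H_2} && \Delta G^{\circ\prime} = +43.2 \\
(2)\quad & 4\,\mathrm{H_2} + \mathrm{HCO_3^-} + \mathrm{H^+} \rightarrow \mathrm{CH_4} + 3\,\mathrm{H_2O} && \Delta G^{\circ\prime} = -135.6 \\
(3)\quad & \mathrm{acetate^-} + \mathrm{H_2O} \rightarrow \mathrm{HCO_3^-} + \mathrm{CH_4} && \Delta G^{\circ\prime} = -31.0 \\
(4)\quad & 4\,\mathrm{TA^{2-}} + 35\,\mathrm{H_2O} \rightarrow 17\,\mathrm{HCO_3^-} + 9\,\mathrm{H^+} + 15\,\mathrm{CH_4} && \Delta G^{\circ\prime} = -151.9
\end{aligned}
$$

ΔG°′ values are in kJ/reaction.

[Figure: pathway diagram, panels (A) and (B), with labels TA, Ac, H2 + CO2, CO2 and CH4 + CO2.]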
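As a consistency check on the TA degradation reactions, the sketch below combines the three partial reactions and compares the result with the overall reaction. The stoichiometric coefficients are read from the slide as: (1) TA + 8 H2O → 3 acetate + 3 H+ + 2 HCO3- + 3 H2; (2) 4 H2 + HCO3- + H+ → CH4 + 3 H2O; (3) acetate + H2O → HCO3- + CH4; (4) 4 TA + 35 H2O → 17 HCO3- + 9 H+ + 15 CH4. That reading, and the weights 4, 3 and 12, are assumptions made for this illustration.

```python
# Each reaction maps species -> net coefficient
# (negative = consumed, positive = produced).
R1 = {"TA": -1, "H2O": -8, "acetate": 3, "H+": 3, "HCO3-": 2, "H2": 3}
R2 = {"H2": -4, "HCO3-": -1, "H+": -1, "CH4": 1, "H2O": 3}
R3 = {"acetate": -1, "H2O": -1, "HCO3-": 1, "CH4": 1}
R4 = {"TA": -4, "H2O": -35, "HCO3-": 17, "H+": 9, "CH4": 15}


def combine(*weighted_reactions):
    """Sum (factor, reaction) pairs and drop species that cancel out."""
    net = {}
    for factor, reaction in weighted_reactions:
        for species, coeff in reaction.items():
            net[species] = net.get(species, 0) + factor * coeff
    return {s: c for s, c in net.items() if c != 0}


# 4 x (1) supplies 12 acetate and 12 H2, consumed by 12 x (3) and 3 x (2).
net = combine((4, R1), (3, R2), (12, R3))
print(net)
print(net == R4)
```

Acetate and H2 cancel in the sum, which is the stoichiometric expression of the syntrophy: the intermediates produced by the TA fermenter are fully consumed by the methanogenic partners.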

Syntrophic Interaction

Functional Genomics of Shewanella in Co-Culture – [towards microbial communities]

• Establish Shewanella-Clostridium co-culture: MR-1 & Clostridium acetobutylicum or C. sphenoides
• Global expression analyses of co-cultures

Daniel, Gottschalk et al. 1999

[Figure: Shewanella-Clostridium co-culture with MeOH + Fe(III); labels: growth, Fe(II), 14CO2.]

Also: Desulfovibrio (H2 production) + Methanococcus (H2 utilization)

Genomics, community functions and stability

Proposal to NSF Frontiers In Integrated Biological Research (FIBR) program.

Linking genomics to populations, to community diversity, functions, stability and to global change

Obj 1. Genome diversity of nitrifying community & isolation
Obj 2. AOB-NOB interactions, regulation & stability
Obj 3. Competition, functional redundancy, stresses, & stability
Obj 4. Effects of elevated CO2 on microbial community, functions & stability in nature
Obj 5. Integration, modeling, simulation & prediction across different organization levels

[Figure: diagram linking the objectives. Study systems: natural system, many species; defined system, 2 species; defined systems, 3 & 4 species. Analyses: genome sequencing, FGA microarrays; mRNA, protein, metabolites, population dynamics, community function. Arrow labels: isolates, sequences; probe sequences, diversity; providing signature target genes for monitoring; providing systems and knowledge for constructing more complex systems; insights on stability of the mutualistic interactions in more complex systems; mechanistic understanding of coexistence in nature; dynamics, stability in nature.]

• Nitrifying communities.

• One of the biggest NSF programs in the life sciences.

• 1M/yr for 5 years.

• The preproposal was panel reviewed, and a full proposal was invited.

Qualitative microbial ecology: Because experimental data have been difficult to obtain, microbial ecology has been qualitative rather than quantitative.

Opportunity for quantitative microbial science: With the availability of genomic technologies, microbial ecology is no longer limited by a deficiency of experimental data.

Challenges: Modeling, simulation and prediction. A big mathematical challenge is the dimensionality problem: the number of samples is smaller than the number of genes.

Possible solution: System ecology + Genomics

Predictive Microbial Ecology

i. Modeling microarray data at individual gene level

$$\frac{dx_i^{(k)}(t)}{dt} = \sum_{j=1}^{m_k} W_{ij}^{(k)}\, x_j^{(k)}(t)$$

ii. Modeling interactions between functional gene groups or guilds.

$$\frac{dy_k(t)}{dt} = f_k(t) + \sum_{j=1}^{n} Q_{kj}\, y_j(t), \qquad y_k = \frac{1}{m_k} \sum_{i=1}^{m_k} x_i^{(k)}$$

$$\frac{dz_p(t)}{dt} = g_p(t) + \sum_{q=1}^{N} U_{pq}\, z_q(t), \qquad z_k(t) = \frac{1}{n} \sum_{i=1}^{n} y_i(t)$$
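Reading the gene-level model as a linear system dx/dt = W x, with guild-level variables obtained by averaging over the genes in a guild, a minimal simulation can be sketched as follows. The 2 × 2 interaction matrix, the initial state and the function names are hypothetical; a fixed-step fourth-order Runge-Kutta integrator is used only to keep the example dependency-free, and it is not taken from the slides.

```python
import math


def rk4_step(W, x, dt):
    """One fourth-order Runge-Kutta step for dx/dt = W x."""
    def deriv(v):
        return [sum(W[i][j] * v[j] for j in range(len(v))) for i in range(len(v))]

    k1 = deriv(x)
    k2 = deriv([xi + 0.5 * dt * k for xi, k in zip(x, k1)])
    k3 = deriv([xi + 0.5 * dt * k for xi, k in zip(x, k2)])
    k4 = deriv([xi + dt * k for xi, k in zip(x, k3)])
    return [xi + dt / 6.0 * (a + 2 * b + 2 * c + d)
            for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]


def simulate(W, x0, dt, steps):
    """Integrate dx/dt = W x from x0 for `steps` steps of size dt."""
    x = list(x0)
    for _ in range(steps):
        x = rk4_step(W, x, dt)
    return x


def guild_mean(x):
    """Guild-level variable: the mean over the genes in the guild."""
    return sum(x) / len(x)


# Hypothetical two-gene guild: gene 1 decays, gene 2 is driven by gene 1.
W = [[-1.0, 0.0],
     [1.0, -2.0]]
x_final = simulate(W, [1.0, 0.0], dt=0.01, steps=100)   # state at t = 1

# Closed-form solution of this particular toy system, for comparison.
exact = [math.exp(-1.0), math.exp(-1.0) - math.exp(-2.0)]
print(x_final, exact, guild_mean(x_final))
```

In practice the entries of W would have to be estimated from time-series expression data, which is where the dimensionality problem noted above arises: with fewer samples than genes, the matrix is underdetermined without additional constraints such as grouping genes into guilds.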

An example of the conceptual integration scheme

• Network identification and modeling

• Scaling from single cells to ecosystems

• Spatial

• Temporal

Grand Challenges for Systems Biology

[Figure: integration scheme with labels: design experiment; experiment; Species 1, Species 2, Species 3; microarray; sequencing; sequence and pathway analyses; data analysis & management; population-level modeling; community-level modeling.]

First Book on Microbial Functional Genomics

• Authors: Jizhong Zhou, Dorothea Thompson, Ying Xu, James M. Tiedje
• John Wiley & Sons, March 19, 2004
• 15 chapters, > 600 pages
• Rita Colwell, former NSF Director, wrote a foreword
• To our knowledge, this is the first book in microbial functional genomics

Acknowledgement(1)

• Department of Energy
  – Microbial Genome Program
  – Genomes To Life Program
  – NABIR Program
  – Ocean Margin Program
  – Carbon cycling programs

• Oak Ridge National Laboratory
  – Laboratory Directed Research and Development

Microbial Genomics and Ecology Group at Environmental Sciences Division, ORNL

Acknowledgement

• ORNL: Zhili He, Liyou Wu, Dorothea Thompson, Yongqing Liu, Ting Li, Matthew Fields, Xuedan Liu, Tingfen Yan, Sung-Keun Rhee, Song Chong, Yunfeng Yang, Jost Liebich, Christopher Schadt, Dawn Stanek, Adam Leaphart, Weimin Gao, Terry Gentry, Steve Brown, Qiang He, Feng Luo, Crystal McAlvin, Susan Carroll, Lisa Fagan, Haichun Gao, Hongbin Pan, Xiufeng Wan, Xichun Zhou, Zamin Yang, Jianxin Zhong, Dong Yu, Ying Xu

• Michigan State University: James M. Tiedje, James Cole, Joel Klappenbach

• USUHS: Mike Daly

• USC: Ken Nealson

• Argonne National Lab: Carol Giometti

• Univ of Iowa: Caroline Harwood

• Oregon State Univ: Dan Arp

• UC Berkeley: Jay Keasling

• Ohio State Univ: Bob Tabita

• Univ of Missouri: Judy Wall

• Baylor College: Tim Palzkill

• SREL: Chuanlun Zhang

• PNNL: Jim Fredrickson, Margie Romine, Yuri Gorby, Dick Smith, Mary Lipton

• LBL: Terry Hazen, Adam Arkin

• Perkin Elmer: Xinyuan Li
