13
Supporting Information Ashworth et al. 10.1073/pnas.1300962110 SI Materials and Methods Growth and Sampling of Thalassiosira pseudonana Cultures. Axenic cultures of T. pseudonana CCMP1335 (Provasoli-Guillard Na- tional Center for Culture of Marine Phytoplankton) were grown in enriched articial seawater (ESAW) medium (1) modied with lower levels of nitrate (80 μM), silicic acid (110 μM), and phosphate (20 μM) and grown in bioreactors. Before the ex- periment, the diatoms were acclimated to constant 20 °C tem- perature and ambient CO 2 under a 12:12 h dark:light diurnal cycle at 125 μmol photons·m 2 ·s 1 of photosynthetically active radiation (PAR) provided by uorescent lamps. Cultures were considered acclimated when the growth rates of three consecu- tive growth curves were signicantly equal to each other. Chemicals were purchased from Sigma-Aldrich. During the ex- periments the bioreactors were equilibrated at ambient (400 ppm) or elevated (800 ppm) CO 2 by bubbling of mixed gasses (2 L/min) and stirring (50 rpm) in a 10-L glass bioreactor system (BioFlow). CO 2 -scrubbed ambient air and pure CO 2 gas were mixed and regulated using two mass-ow controller/monitors (G265; Qubit Systems) and solenoid controller. The air supply was monitored using a CO 2 analyzer (model S155; Qubit Sys- tems) and dispersed into the media by bubble injection from orices in a stainless steel tube at the bottom of the bioreactor with stirring. These sterile, equilibrated bioreactors (8 L of me- dia) were inoculated with 50,000 cells/mL of acclimated, axenic T. pseudonana and grown for up to 1 wk on a 12;12 h dark:light cycle. pH was measured continuously with a Mettler Toledo 405 DPAS SC K851425 pH probe, and calibrated with spectropho- tometric measurements, taken every 2 h (2). DIC was measured once daily with an Apollo SciTechs DIC analyzer available us- ing reference materials from A. Dickson. Oxygen production was measured continuously with a Mettler-Toledo Sensor inPro 6830 probe. Cellular uorescent capacity (CFC) was measured using the selective inhibitor of photosystem II, 3-(3,4-dichlorophenyl)-1, 1-dimethylurea (DCMU). Samples (300 μL n = 3) were dark adapted for 5 min and the initial uorescence was measured in a uorescence plate reader (BIOTEK FL800) at 680 nm when excited at 488 nm. Thereafter a solution of DCMU was added to the samples to a nal concentration of 3 × 10 5 M and the maximal uorescence (Fm) measured after 30 s. Cellular uo- rescent capacity was calculated according to the equation CFC = Fm Fo/Fm (3). Specic growth rates were calculated as the natural log of the growth ratio over time (k = ln(N t /N o )/T 2 T 1 ). Cell culture was stained with SYBR Green (Invitrogen) to test for the presence (and ensure the absence of) bacteria. Physio- logical measurements and cell counts were done every 2 h and cell harvesting for RNA extraction and measurements every 12 h. Nutrients (NO 3 , NH 4, PO 3 , and Si) were sampled twice daily and measured according to ref. 4, at the University of Washington Chemical Oceanography facility. Flow Cytometry. For the measurement of cell division, cells were stained with SYBR green counted with an Inux ow cytometer (Cytopeia). Samples were run directly in the Inux ow sorter without concentration. The sorter uses two solid-state lasers, with an excitation source tuned to emit at 488 nm and a second tuned at 405 nm. The cells were counted based on their forward scatter and red uorescence characteristics. The green uorescence emission was measured using a combination of a 500-nm and 600-nm dichroic long-pass lters and a 530/20-nm BP lter attached to a photomultiplier (R928; Hamamatsu). The red uorescence emission was measured with a 600-nm dichroic long-pass lter and a 640-nm long-pass lter with a 700/30-nm lter and a red sensitive photomultiplier (R3829; Hamamatsu). The ow cytometer was calibrated with 1-μm uorescent micro- spheres (Polysciences). Dividing cells were determined from their cell size and DNA content. RNA Extraction, Labeling, and Hybridization. Cells (3 × 10 7 ) were harvested at the end of the dark period (dawn) and at the end of the light period (dusk) by ltration (30200 mL) and then ash frozen. Total RNA was extracted using the mirVANA kit from Invitrogen (total RNA extraction protocol). Transcribed mRNA was amplied and labeled using the Agilent Quick-Amp Label- ing kit (two color). Custom Agilent oligonucleotide array slides (see below) were hybridized with equal amounts of oppositely labeled sample and internal reference RNA according to Agilent array hybridization protocols, and scanned using an Agilent two- color array scanner. Dye-ipped samples and references were hybridized to correct for biases in dye incorporation. In total, 36 arrays were hybridized. Gene-Specic Microarray Design. Gene-specic oligonucleotide arrays were designed and ordered from Agilent Technologies. At least three probes were designed to match the 3end of each of the 11,390 Thaps3.0 nuclear gene models (5), as well as 180 chloroplast genes (6). This array platform is available in the GEO database (7) under accession GPL16806. Microarray Data Processing and Gene Expression Analysis. Two-color expression data were normalized using the limma package in R (8). After normalizing to each uniform internal reference sam- ple, expression ratios were calculated relative to the mean within-experiment expression level for each probe to remove batch effects between experiments. Gene-level correlation be- tween the 400 ppm and 800 ppm expression series was assessed using the Pearson method and compared with a null distribution consisting of 10 × 11,390 randomly resampled gene series pair- ings. Expression data from all corresponding time series sampling times were analyzed as replicates for the purposes of detecting whole-genome expression patterns that were associated with growth phase transitions. The whole-genome expression data were initially clustered using the K-means method to identify the dominant modes of expression. Subsequently, a two-factor ANOVA (dawn:dusk and exponential:stationary) was performed on the data using the MeV program (9). Exponential samples (n = 10) were all of those collected before the light period of day 3 (80 h). Stationary samples (n = 9) were all those collected afterward. This grouping of samples coincided with the decline in specic growth rate of the freely growing cultures to a value below 1/d. Dawn samples (n = 10) were collected in the dark after 12 h of constant darkness. Dusk samples (n = 9) were collected in the light after 12 h of constant light. The default parameters were used and signicance was assessed using an alpha threshold of 0.01 based on 1,000 permutations. Gene functional enrichment within these groups was analyzed using a binomial test with BenjaminiHochberg correction in Gitools (10) and the DAVID database (11) via the Gaggle framework (12). All microarray data used in this study are available in the GEO database (7) under accession no. GSE45252. Transcriptional Regulatory Network Inference. A model for potential gene regulation by transcription factors on target genes was inferred for each transcription factor in ve major families (13). Transcription factors were proposed as candidate regulators for Ashworth et al. www.pnas.org/cgi/content/short/1300962110 1 of 13

Supporting Information - Proceedings of the National ... Information ... 10. Perez-Llamas C, Lopez-Bigas N (2011) Gitools: Analysis and visualisation of genomic ... Matys V, et al

Embed Size (px)

Citation preview

Supporting InformationAshworth et al. 10.1073/pnas.1300962110SI Materials and MethodsGrowth and Sampling of Thalassiosira pseudonana Cultures. Axeniccultures of T. pseudonana CCMP1335 (Provasoli-Guillard Na-tional Center for Culture of Marine Phytoplankton) were grownin enriched artificial seawater (ESAW) medium (1) modifiedwith lower levels of nitrate (80 μM), silicic acid (110 μM), andphosphate (20 μM) and grown in bioreactors. Before the ex-periment, the diatoms were acclimated to constant 20 °C tem-perature and ambient CO2 under a 12:12 h dark:light diurnalcycle at 125 μmol photons·m−2·s−1 of photosynthetically activeradiation (PAR) provided by fluorescent lamps. Cultures wereconsidered acclimated when the growth rates of three consecu-tive growth curves were significantly equal to each other.Chemicals were purchased from Sigma-Aldrich. During the ex-periments the bioreactors were equilibrated at ambient (400ppm) or elevated (800 ppm) CO2 by bubbling of mixed gasses (2L/min) and stirring (50 rpm) in a 10-L glass bioreactor system(BioFlow). CO2-scrubbed ambient air and pure CO2 gas weremixed and regulated using two mass-flow controller/monitors(G265; Qubit Systems) and solenoid controller. The air supplywas monitored using a CO2 analyzer (model S155; Qubit Sys-tems) and dispersed into the media by bubble injection fromorifices in a stainless steel tube at the bottom of the bioreactorwith stirring. These sterile, equilibrated bioreactors (8 L of me-dia) were inoculated with 50,000 cells/mL of acclimated, axenicT. pseudonana and grown for up to 1 wk on a 12;12 h dark:lightcycle. pH was measured continuously with a Mettler Toledo 405DPAS SC K851425 pH probe, and calibrated with spectropho-tometric measurements, taken every 2 h (2). DIC was measuredonce daily with an Apollo SciTech’s DIC analyzer available us-ing reference materials from A. Dickson. Oxygen production wasmeasured continuously with a Mettler-Toledo Sensor inPro 6830probe. Cellular fluorescent capacity (CFC) was measured using theselective inhibitor of photosystem II, 3-(3,4-dichlorophenyl)-1,1-dimethylurea (DCMU). Samples (300 μL n = 3) were darkadapted for 5 min and the initial fluorescence was measured ina fluorescence plate reader (BIOTEK FL800) at 680 nm whenexcited at 488 nm. Thereafter a solution of DCMU was added tothe samples to a final concentration of 3 × 10−5 M and themaximal fluorescence (Fm) measured after 30 s. Cellular fluo-rescent capacity was calculated according to the equation CFC =Fm − Fo/Fm (3). Specific growth rates were calculated as thenatural log of the growth ratio over time (k = ln(Nt/No)/T2 − T1).Cell culture was stained with SYBR Green (Invitrogen) to testfor the presence (and ensure the absence of) bacteria. Physio-logical measurements and cell counts were done every 2 h andcell harvesting for RNA extraction and measurements every12 h. Nutrients (NO3, NH4, PO3, and Si) were sampled twicedaily and measured according to ref. 4, at the University ofWashington Chemical Oceanography facility.

Flow Cytometry. For the measurement of cell division, cells werestained with SYBR green counted with an Influx flow cytometer(Cytopeia). Samples were run directly in the Influx flow sorterwithout concentration. The sorter uses two solid-state lasers, withan excitation source tuned to emit at 488 nm and a second tunedat 405 nm. The cells were counted based on their forward scatterand red fluorescence characteristics. The green fluorescenceemission was measured using a combination of a 500-nm and600-nm dichroic long-pass filters and a 530/20-nm BP filterattached to a photomultiplier (R928; Hamamatsu). The redfluorescence emission was measured with a 600-nm dichroic

long-pass filter and a 640-nm long-pass filter with a 700/30-nmfilter and a red sensitive photomultiplier (R3829; Hamamatsu).The flow cytometer was calibrated with 1-μm fluorescent micro-spheres (Polysciences). Dividing cells were determined from theircell size and DNA content.

RNA Extraction, Labeling, and Hybridization. Cells (∼3 × 107) wereharvested at the end of the dark period (dawn) and at the end ofthe light period (dusk) by filtration (30–200 mL) and then flashfrozen. Total RNA was extracted using the mirVANA kit fromInvitrogen (total RNA extraction protocol). Transcribed mRNAwas amplified and labeled using the Agilent Quick-Amp Label-ing kit (two color). Custom Agilent oligonucleotide array slides(see below) were hybridized with equal amounts of oppositelylabeled sample and internal reference RNA according to Agilentarray hybridization protocols, and scanned using an Agilent two-color array scanner. Dye-flipped samples and references werehybridized to correct for biases in dye incorporation. In total, 36arrays were hybridized.

Gene-Specific Microarray Design. Gene-specific oligonucleotidearrays were designed and ordered from Agilent Technologies.At least three probes were designed to match the 3′ end of eachof the 11,390 Thaps3.0 nuclear gene models (5), as well as 180chloroplast genes (6). This array platform is available in theGEO database (7) under accession GPL16806.

Microarray Data Processing and Gene Expression Analysis. Two-colorexpression data were normalized using the limma package in R(8). After normalizing to each uniform internal reference sam-ple, expression ratios were calculated relative to the meanwithin-experiment expression level for each probe to removebatch effects between experiments. Gene-level correlation be-tween the 400 ppm and 800 ppm expression series was assessedusing the Pearson method and compared with a null distributionconsisting of 10 × 11,390 randomly resampled gene series pair-ings. Expression data from all corresponding time series samplingtimes were analyzed as replicates for the purposes of detectingwhole-genome expression patterns that were associated withgrowth phase transitions. The whole-genome expression datawere initially clustered using the K-means method to identifythe dominant modes of expression. Subsequently, a two-factorANOVA (dawn:dusk and exponential:stationary) was performedon the data using the MeV program (9). Exponential samples(n = 10) were all of those collected before the light period ofday 3 (∼80 h). Stationary samples (n = 9) were all those collectedafterward. This grouping of samples coincided with the declinein specific growth rate of the freely growing cultures to a valuebelow 1/d. Dawn samples (n = 10) were collected in the darkafter 12 h of constant darkness. Dusk samples (n = 9) werecollected in the light after 12 h of constant light. The defaultparameters were used and significance was assessed using analpha threshold of 0.01 based on 1,000 permutations. Genefunctional enrichment within these groups was analyzed usinga binomial test with Benjamini–Hochberg correction in Gitools(10) and the DAVID database (11) via the Gaggle framework(12). All microarray data used in this study are available in theGEO database (7) under accession no. GSE45252.

Transcriptional Regulatory Network Inference.Amodel for potentialgene regulation by transcription factors on target genes wasinferred for each transcription factor in five major families (13).Transcription factors were proposed as candidate regulators for

Ashworth et al. www.pnas.org/cgi/content/short/1300962110 1 of 13

target genes if all of the following were true: (i) the transcriptionfactor (TF) and its target genes were highly correlated in ex-pression changes over multiple published array experiments(two-tailed bootstrap P < 0.025); (ii) the upstream region ofeach target gene (−400 to +50 bp) contains at least one po-tential DNA binding site that matches a conserved DNA-binding motif for the TF family (motif P < 1 × 10−4); and (iii)the upstream regions of highly correlated target genes wereenriched in putative TF-DNA binding sites, compared with all

genes (hypergeometric P < 0.05). Putative DNA-binding motifsfor conserved TF families were assumed from DNA sequencespecificity profiles that are experimentally determined for or-thologous transcription factor families in plants. These includedthe experimentally derived specificity profiles that are repre-sentative of the HSF (14), Myb (15), bZIP (16), AP2 (17), andE2F (17) families of transcription factors. The program FIMO(18) was used to detect transcription factor-DNA binding sites ingene upstream promoter regions.

1. Berges J, Franklin D, Harrison P (2001) Evolution of an artificial seawater medium:Improvements in enriched seawater, artificial water over the last two decades RIDA-4186-2010. J Phycol 37:1138–1145.

2. Dickson AG, Sabine CL, Christian JR (2007) Guide to Best Practices for Ocean CO2

Measurements (PICES Special Publication, Paris), PICES Special Publication 3, IOCCPReport No. 8.

3. Vincent WF (1980) Mechanisms of rapid photosynthetic adaptation in naturalphytoplankton communities. II. Changes in photochemical capacity as measured byDCMU-induced chlorophyll fluorescence. J Phycol 16:568–577.

4. Strickland JDH, Parsons TR (1972) A Practical Handbook of Seawater Analysis(Fisheries Research Board of Canada, Toronto).

5. Armbrust EV, et al. (2004) The genome of the diatom Thalassiosira pseudonana:ecology, evolution, and metabolism. Science 306(5693):79–86.

6. Oudot-Le Secq M-P, et al. (2007) Chloroplast genomes of the diatoms Phaeodactylumtricornutum and Thalassiosira pseudonana: Comparison with other plastid genomesof the red lineage. Mol Genet Genomics 277(4):427–439.

7. Edgar R, Domrachev M, Lash AE (2002) Gene Expression Omnibus: NCBI geneexpression and hybridization array data repository. Nucleic Acids Res 30(1):207–210.

8. Ritchie ME, et al. (2007) A comparison of background correction methods for two-colour microarrays. Bioinformatics 23(20):2700–2707.

9. Saeed AI, et al. (2003) TM4: A free, open-source system for microarray datamanagement and analysis. Biotechniques 34(2):374–378.

10. Perez-Llamas C, Lopez-Bigas N (2011) Gitools: Analysis and visualisation of genomicdata using interactive heat-maps. PLoS ONE 6(5):e19541.

11. Huang W, Sherman BT, Lempicki RA (2009) Bioinformatics enrichment tools: Pathstoward the comprehensive functional analysis of large gene lists. Nucleic Acids Res37(1):1–13.

12. Bare JC, Shannon PT, Schmid AK, Baliga NS (2007) The Firegoose: Two-wayintegration of diverse data from different bioinformatics web resources with desktopapplications. BMC Bioinformatics 8:456.

13. Rayko E, Maumus F, Maheswari U, Jabbari K, Bowler C (2010) Transcription factorfamilies inferred from genome sequences of photosynthetic stramenopiles. NewPhytol 188(1):52–66.

14. Storozhenko S, De Pauw P, Van Montagu M, Inzé D, Kushnir S (1998) The heat-shockelement is a functional component of the Arabidopsis APX1 gene promoter. PlantPhysiol 118(3):1005–1014.

15. Romero I, et al. (1998) More than 80R2R3-MYB regulatory genes in the genome ofArabidopsis thaliana. Plant J 14(3):273–284.

16. Martínez-García JF, Moyano E, Alcocer MJ, Martin C (1998) Two bZIP proteinsfrom Antirrhinum flowers preferentially bind a hybrid C-box/G-box motif andhelp to define a new sub-family of bZIP transcription factors. Plant J 13(4):489–505.

17. Matys V, et al. (2003) TRANSFAC: Transcriptional regulation, from patterns to profiles.Nucleic Acids Res 31(1):374–378.

18. Grant CE, Bailey TL, Noble WS (2011) FIMO: Scanning for occurrences of a given motif.Bioinformatics 27(7):1017–1018.

Fig. S1. Detailed growth analysis of replicate T. pseudonana cultures. Shaded areas indicate dark (gray) or light (yellow) conditions. Upper arrows indicatesampling points for gene expression analysis. (Top) Cell densities are shown in blue, chlorophyll fluorescence (relative units) in green, and cellular fluorescentcapacity in purple. (Middle) pH is shown in black and relative O2 levels are shown in red. (Bottom) Nutrients (NO3, SiO4, and PO4) are shown as percentages ofthe starting concentrations. (A) 400 and (B) 800 ppm CO2. (C) Superposition of the cell density, fluorescence, and cellular fluorescent capacity data from A and B.

Ashworth et al. www.pnas.org/cgi/content/short/1300962110 2 of 13

Fig. S2. Specific growth rates and cytometric measurements during replicate growth experiments. (A) Specific growth rates (SI Materials and Methods).Arrows indicate sampling times for gene expression analysis. The vertical dashed line indicates the division between the exponential and stationary phases. (B)Cytometric measurements from cell culture during the experiment. Dividing and nondividing cells were estimated by cell size and DNA staining (see SI Ma-terials and Methods).

Ashworth et al. www.pnas.org/cgi/content/short/1300962110 3 of 13

Day

1 D

aw

nD

ay1

Du

skD

ay2

Da

wn

Day

2 D

usk

Day

3 D

aw

nD

ay3

Du

skD

ay4

Da

wn

Day

4 D

usk

Day

5 D

aw

nD

ay5

Du

sk

Day

1 D

aw

nD

ay1

Du

skD

ay2

Da

wn

Day

2 D

usk

Day

3 D

aw

nD

ay3

Du

skD

ay4

Da

wn

Day

4 D

usk

Day

5 D

aw

n28334

261635261781

22473240562481629008

263371251162579722213

26147938266

26798741548

1247260835

11615644

38191

292287677

269765269393

5078262753

96883698

263181211754096638248

62903163610943

26827021989

268070261684Cp029

93752591131232

2692744142531818

26889528239

262433134852741426366

2685942234540393

6958269238

38428269127269095

22747270241

488225402

23422341

26934810495174

262463858311501

3816268127270210

303852166733018

26039242962

−4 −2 0 2 4

(GAPD4) glyceraldehyde−3−phosphate dehydrogenase(pcna) pcna−like protein(MutS) MutS familyaliphatic amidase(MLK1) myosin light chain kinase(ASA) aspartate−ammonia ligasebifunctional aspartokinase/homoserine dehydrogenasehistidine triad−like protein(PGK5) phosphoglycerate kinase(Tp_bHLH3) regulator [Rayko](PFK2) phosphofructokinase(Tp_E2F−DP1a_SET1a) regulator [Rayko](GPI1) glucose−6−phosphate isomerase(CYS2) cysteine synthase(GALE3) dual function enzyme: UDP−glucose 4−epimerase(CYS3) cysteine synthasealkaline phosphataseperoxidase/catalase(AK2) adenosine kinase(HSP60) mitochondrial chaperonin

(GcpE) gcpe−like protein(DDE) diadinoxanthin de−epoxidasepsedouridylate synthase(SQDB1) sulfolipid biosynthesis proteinhistidyl−trna synthetase(ApxS) ascorbate peroxidase(Tp_bZIP7a/PAS) regulator [Rayko](Tp_HSF_1a) regulator [Rayko]chloroplast protein Cpn21 in A−like protein(TKT2) transketolase(Tp_sigma70.6) regulator [Rayko](ALATS1) alanine−trna ligase(NDK2) nucleoside diphosphate kinasealdose−1−epimerase(DXR1) 1−deoxy−D−xylulose 5−phosphate reductoisomeraseniemann−pick C type protein−like protein(Tp_bZIP15) regulator [Rayko]threonine synthase(LEUS) leucyl−trna synthetase(rpl1) ribosomal protein L1 [chloroplast]

(Tp_HSF_4d) regulator [Rayko](Tp_bZIP6) regulator [Rayko](PFK1) 6−phosphofructokinase(NRT2) nitrate transportercytosolic malate dehydrogenasemitochondriale tricarboxylate carrier protein(SIT1) silicic acid transporter, silicon transporttriosephosphate isomerase/GA−3−P dehydrogenaseRNA binding protein of the pumilio familyxanthine uracil permease(NRT1) nitrate transporteroxoglutarate/malate translocatortyrosine aminotransferase(PYK2) pyruvate kinase(PYK1) pyruvate kinase(Tp_HSF_4e) regulator [Rayko](Tp_HSF_1.3e) HSF35, heat shock transcription factor(AOX1_2) mitochondrial alternative oxidase(ACD3) probable acyl−coa dehydrogenase, short/branched chainmethyl transferase−like protein

(Lhcr10) fucoxanthin chlorophyll a/c light−harvesting protein(Lhcf11) fucoxanthin−chlorophyll a−c binding protein(Lhcr4) fucoxanthin chl a/c light−harvesting protein(Lhcf10) fucoxanthin chl a/c light−harvesting protein, major type(Lhcr11) fucoxanthin chlorophyll a/c light−harvesting protein(Lhcr12) fucoxanthin chlorophyll a/c proteininorganic pyrophosphataseoxidoreductase(Lhcf8) fucoxanthin chlorophyll a/c protein 8(Lhcr3) fucoxanthin chl a/c light−harvesting protein(Lhcf1) fucoxanthin chlorophyll a/c protein 1(FCP3) pt17531−like protein(Lhcr7) fucoxanthin chl a/c light−harvesting protein(Lhcf9) fucoxanthin chlorophyll a/c protein 8(PsbO) oxygen−evolving enhancer protein 1 precursor(Lhcx6_1) fucoxanthin chlorophyll a/c protein −LI818 clade(Lhcf7) fucoxanthin chlorophyll a/c light−harvesting protein(Lhcf6) fucoxanthin chlorophyll a/c protein 6(Lhcf2) fucoxanthin chlorophyll a/c protein 2(Lhcf5) fucoxanthin chlorophyll a/c protein 5

growth (400ppm) growth (800ppm)

Log2 Expression ratio

A B

C

Exponential

Statio

nary

“Dawn” (12h dark)

“Dusk” (12 h lig

ht)

N−Glycan biosynthesis (24)

Oxidative phosphorylation (55)

Protein export (22)

Purine metabolism (96)

Base excision repair (16)

Valine, leucine and isoleucine degradation (32)

Nucleotide excision repair (36)

Cysteine and methionine metabolism (27)

Fructose and mannose metabolism (15)

Starch and sucrose metabolism (13)

Pyrimidine metabolism (61)

Citrate cycle (TCA cycle) (22)

Phagosome* (36)

Pyruvate metabolism (34)

Glyoxylate and dicarboxylate metabolism (10)

Aminoacyl−tRNA biosynthesis (37)

Pentose phosphate pathway (22)

Proteasome (36)

Amino sugar and nucleotide sugar metabolism (40)

Terpenoid backbone biosynthesis (18)

Photosynthesis (30)

Mismatch repair (19)

DNA replication (30)

Porphyrin and chlorophyll metabolism (29)

Enrichment score (corrected p-value)

KEGG Enrichment

“Daw

n”-a

ssoc

iate

d“D

usk”

-ass

ocia

ted

Expo

nent

ial

Stat

iona

ry

Fig. S3. Summary of genome-wide expression data. (A) The top 20 genes for which expression increased most highly during dawn following 12 h of darkness,dusk following 12 h of light, exponential growth, and nutrient-depleted stationary phase. Numbers are Joint Genome Institute (JGI) protein identifiers. (B)Distribution of gene series correlations between 400 ppm and 800 ppm CO2 replicates. The distribution of gene series correlations between the 400 ppm and800 ppm CO2 expression series is shown in red. Null distribution consisting of resampled gene series pairings from the two series is shown in black. Dashed linesindicate alpha levels of 0.05 and 0.95 for the null distribution. (C) The association of functional pathways [Kyoto Encyclopedia of Genes and Genomes (KEGG)]with the four conditions (dawn, dusk, exponential, and stationary), as a function of the enrichment of significantly up-regulated genes. Genes mapping to thephagosome pathway in KEGG (ko04145, marked by an asterisk) include actin, dynein, tubulin, protein transport protein Sec61-1 and -2, and several vacuolarATPases that are likely involved in cell wall and secretory mechanisms.

Ashworth et al. www.pnas.org/cgi/content/short/1300962110 4 of 13

Fig. S4. Growth phase-associated expression patterns of genes putatively involved in cell morphology, secretion, and cell wall biogenesis. Aqua boxes (Right)indicate significant association with the conditional factors according to the analysis of variance. Numbers indicate JGI protein identifiers.

Ashworth et al. www.pnas.org/cgi/content/short/1300962110 5 of 13

36208

262555

261750

26941

26031

269942

39799

26190

262125

406

25544

2846

32096

35314

29212

269866

656

263562

268449

41433

264121

41979

268293

26457

11615

262809

35409

38724

262753

263062

263061

40713

32874

(GDCT) glycine decarboxylase t−protein

Soluble liver antigenliver pancreas antigen

hydroxypyruvate reductase−like protein

(NIR_1) NADPH nitrite reductase

serine hydroxymethyltransferase

glycine or serine hydroxymethyltransferase

(GDCP) glycine decarboxylase p−protein

serine hydroxymethyltransferase

(NIR_2) nitrite reductase−ferredoxin dependent

(GOX) glycolate oxidase

(PGP) phosphoglycolate phosphatase

glycerate dehydrogenase and hydroxypyruvate reductase

hydroxyacylglutathione hydrolase

glutathione peroxidase

(GSHS) glutathione synthetase

glutathione s−transferase

glutathione s−transferase

glutathione s−transferase

glutathione s−transferase

glutathione reductase

glutathione s−transferase

probable glutathione reductase−like protein

hydroxyacylglutathione hydrolase

(GR) glutathione reductase, gro−2

peroxidase/catalase

peroxisomal ascorbate peroxidase

cytochrome C peroxidase

ascorbate peroxidase

(ApxS) ascorbate peroxidase

superoxide dismutase (Fe)

iron superoxide dismutase

mitochondrial manganese superoxide dismutase

mitochondrial manganese superoxide dismutase

9710

7702

11117

11118

40713

38187

270007

268857

2505

36443

270038

3463

261803

9714

32874

4370

MAP Kinase

viral inhibitor of apoptosis

mitochondrial Mn superoxide dismutase

Tp metacaspase 6

Tp metacaspase 5

Tp metacaspase 4

Tp metacaspase 3

Tp metacaspase 2

Tp metacaspase 1

subtilisin family serine protease

caspase activity recruitment domain

caspase activity recruitment domain

mitochondrial Mn superoxide dismutase

stress−regulated cysteine peptidase

−2 −1 0 1 2

Da

y1

Da

wn

Da

y1

Du

skD

ay

2 D

aw

nD

ay

2 D

usk

Da

y3

Da

wn

Da

y3

Du

skD

ay

4 D

aw

nD

ay

4 D

usk

Da

y5

Da

wn

Da

y5

Du

sk

growth Ex

po

ne

nti

al

Sta

tio

na

ryD

aw

nD

usk

Da

y1

Da

wn

Da

y1

Du

skD

ay

2 D

aw

nD

ay

2 D

usk

Da

y3

Da

wn

Da

y3

Du

skD

ay

4 D

aw

nD

ay

4 D

usk

Da

y5

Da

wn

Da

y5

Du

sk

growth Ex

po

ne

nti

al

Sta

tio

na

ryD

aw

nD

usk

log2 Expression Ratio

A

B

Fig. S5. (A) Growth phase expression patterns of genes involved in photorespiration and oxidative stress. Right bar is colored according to the following genegroups: blue, superoxide dismutases; red, ascorbate peroxidases; gold, glutathione-related enzymes; and gray, other. Right aqua boxes indicate significantassociation with the conditional factors according to the analysis of variance. (B) Growth phase-associated expression patterns of putative metacaspases andcell death-related genes. Right aqua boxes indicate significant association with the conditional factors according to the analysis of variance. Numbers indicateJGI protein identifiers.

Ashworth et al. www.pnas.org/cgi/content/short/1300962110 6 of 13

Fig. S6. RuBisCO subunits and several other enzymes related to photosynthesis were differentially expressed between dawn and dusk. Asterisks indicategenes that were significantly differentially expressed (t test, Bonferroni-corrected P values <0.05).

Fig. S7. The expression of genes for nutrient assimilation increases in response to the depletion of their respective nutrients during growth. Right aqua boxesindicate significant association with the conditional factors according to the analysis of variance. Colored bar indicates the following gene categories: blue, cellmembrane nitrate, urea, and ammonium transporters; orange, cytosolic enzymes; green, plastid-borne genes; cyan, silicate transporters; and brown, phos-phate transporters. Numbers indicate JGI protein identifiers.

Ashworth et al. www.pnas.org/cgi/content/short/1300962110 7 of 13

Fig. S8. The expression patterns of several hundred genes were associated with nonlinear combinatorial effects between light:dark and exponential:sta-tionary conditions. (A) Exponential phase modulation occurred for genes that exhibited diel expression changes only during exponential phase. (B) Stationaryphase modulation occurred for genes that exhibited diel expression changes only during stationary phase. (C) Diel phase reversal occurred for genes whose dielfluctuations in expression apparently switched from high dawn expression to high dusk expression (or vice versa). Genes with less than twofold changes in dielexpression during either relevant phase are excluded from this figure for clarity. See Dataset S1 for individual gene annotations.

20065 AUREO1b

37198 AUREO1c10414 CPD4

262946 CPF114315 CPD1

22848 Dph

263935 PAC33193 PAS20899 bHLH / PAS

4202 bHLH / PAS9688 bZIP / PAS9689 bZIP / PAS

262720 PAS

−2−1

01

2lo

g2 e

xpre

ssio

n ra

tiolo

g2 e

xpre

ssio

n ra

tio−2

−10

12

Day1Day2

Day3Day4

Day5

C

B

A

D

E

33407263504232082223426272179372616312616062627209689968842022089933193263935

−2 0 2

PASPAS / PACbHLH / PASPASPASGC / PASPASPASPASbZIP / PASbZIP / PASbHLH / PASbHLH / PASPASPAC

CPD3CPD2CPF4CPF3CPF2

AUREO1aAUREO2

CPF1CPD1DphAUREO1bCPD4AUREO1c

4099573682350026898835005

333402629461431522848200651041437198

Ex

po

ne

nti

al

Sta

tio

na

ryD

aw

nD

usk

Day

1 D

aw

nD

ay1

Du

skD

ay2

Da

wn

Day

2 D

usk

Day

3 D

aw

nD

ay3

Du

skD

ay4

Da

wn

Day

4 D

usk

Day

5 D

aw

nD

ay5

Du

sk

33407

log2 Expression Ratio

843110912124928082352826229826465326830126972511180392412507820732133262284822792122516961579013134

78913427362025018185015533855933979240561579133955375231403026121726988926336625304353873377287191626558745945110372886535350263251142425775

Da

y1

Da

wn

Da

y1

Du

skD

ay

2 D

aw

nD

ay

2 D

usk

Da

y3

Da

wn

Da

y3

Du

skD

ay

4 D

aw

nD

ay

4 D

usk

Da

y5

Da

wn

Da

y5

Du

sk

growth

growth

Ex

po

ne

nti

al

Sta

tio

na

ry

Da

wn

Du

sk

Fig. S9. Expression profiles for putative photoreceptors (A) and signal-sensing genes (B), including those for which expression was not detectably associatedwith light:dark or exponential:stationary phase transitions. Aqua boxes indicate gene/condition associations by ANOVA at P < 0.01. Multicolored bar indicatesthe following classes of genes: putative photoreceptors and photolyases (blue) and PAS/PAC domain-containing genes (green). (C) The expression time series ofdifferentially expressed photoreceptors (Dph, CPF1, Aureo 1B and 1C) and photolyases (CPD1, CPD4). (D) The expression time series for differentially expressedgenes containing signal-sensing PAS domains. (E) Putative two-component signaling elements and membrane-bound receptors that were significantly asso-ciated with one or more states. Aqua boxes indicate gene/condition associations by ANOVA. Green bar denotes genes containing the following InterPro motifsin conjunction with at least one putative transmembrane domain, as determined by TMHMM v2: protein kinase (IPR000719), response regulator receiver(IPR001789), GAF (IPR003018), GPCR (IPR000337), rhodopsin-like GPCR family (IPR000276), LRR (IPR001611), NB-ARC (IPR002182) and small GTPases (IPR001806).Pink bar indicates putative soluble protein kinases (genes containing IPR00719 motif). Numbers are JGI protein identifiers.

Ashworth et al. www.pnas.org/cgi/content/short/1300962110 8 of 13

Fig. S10. (A) Expression patterns of all transcription factors that were significantly associated with dawn, dusk, exponential, and stationary phases. Aquaboxes indicate gene/condition associations by ANOVA. (B) Transcription factors were differentially expressed along two dimensions: dawn vs. dusk andexponential vs. stationary phases. TFs for which the difference in expression was significant and greater than twofold over at least one transition arehighlighted. Numbers indicate JGI protein identifiers.

Ashworth et al. www.pnas.org/cgi/content/short/1300962110 9 of 13

Fig. S11. (A) Genes involved in the regulation of chromatin state are associated with dawn, dusk, exponential or stationary conditions. Colored bars indicatethe following gene groups: C5 cytosine DNA methylase (yellow), histones (green), lysine acetyltransferases (light blue), lysine deacetylase (dark blue), lysinedemethylases (purple), and lysine methyltransferases (magenta). Aqua boxes in the Middle indicate significant association with the conditional factors ac-cording to the analysis of variance. (B) Genes with annotated functions for chromatin state regulation were differentially expressed along two dimensions:dawn vs. dusk and exponential vs. stationary phases. Numbers indicate JGI protein identifiers.

Fig. S12. Core expression network generated from our data and all other publicly available microarray data. (A) The 1,840 genes that were significantlyassociated with growth phase transitions during this experiment are also highly correlated over 40 array conditions, including data obtained from refs. 12, 15,51, and 52 (bootstrap P < 0.01). Genes (colored circles) are arranged in a force-weighted network that subdivides the transcriptome into units of conditionalcoexpression: dawn (1), dusk (2 and 3), exponential (4 and 5), and stationary (6 and 7) phases. The genes and functional enrichments for each of these clustersare listed in Dataset S2. (B) Sensing, signaling, regulatory, and chromatin-modifying genes that are members of these expression clusters. Numbers indicate JGIprotein identifiers.

Ashworth et al. www.pnas.org/cgi/content/short/1300962110 10 of 13

Table S1. Top 20 genes that were significantly associated and most highly expressed under each of the four phases:(i) dawn, (ii) dusk (iii) exponential, and (iv) stationary

Gene Name Protein Fold change

Higher expression at dawnTHAPSDRAFT_11564 Hypothetical protein 23.3THAPSDRAFT_31636 Aldose-1-epimerase 19.6THAPSDRAFT_2892 Hypothetical protein 16.5THAPSDRAFT_40966 Tp_sigma70.6 Regulator [Rayko] 13.0THAPSDRAFT_22699 Hypothetical protein 11.6THAPSDRAFT_262753 ApxS Ascorbate peroxidase 11.4THAPSDRAFT_269765 Psedouridylate synthase 11.0THAPSDRAFT_1734 Hypothetical protein 10.8THAPSDRAFT_258111 Hypothetical protein 9.0THAPSDRAFT_6731 Hypothetical protein 8.9THAPSDRAFT_3215 Hypothetical protein 8.7THAPSDRAFT_20593 Hypothetical protein 8.6THAPSDRAFT_3698 Tp_HSF_1a Regulator [Rayko] 8.6THAPSDRAFT_25629 Hypothetical protein 8.4THAPSDRAFT_270308 Hypothetical protein 8.0THAPSDRAFT_37277 Hypothetical protein 7.7THAPSDRAFT_22725 Hypothetical protein 7.6THAPSDRAFT_261684 LEUS Leucyl-trna synthetase 7.5THAPSDRAFT_22429 Hypothetical protein 7.4THAPSDRAFT_6290 NDK2 Nucleoside diphosphate kinase 7.4

Higher expression at duskTHAPSDRAFT_6551 Hypothetical protein 9.6THAPSDRAFT_22860 Hypothetical protein 9.6THAPSDRAFT_18846 Hypothetical protein 8.9THAPSDRAFT_25116 PGK5 Phosphoglycerate kinase 7.9THAPSDRAFT_24649 Hypothetical protein 7.4THAPSDRAFT_32555 Hypothetical protein 7.3THAPSDRAFT_7406 Hypothetical protein 7.1THAPSDRAFT_644 AK2 Adenosine kinase 6.9THAPSDRAFT_8522 Hypothetical protein 6.5THAPSDRAFT_34592 Hypothetical protein 6.4THAPSDRAFT_4536 Hypothetical protein 6.4THAPSDRAFT_25797 Tp_bHLH3 Regulator [Rayko] 6.4THAPSDRAFT_262506 Hypothetical protein 6.3THAPSDRAFT_2376 Hypothetical protein 6.2THAPSDRAFT_4214 Hypothetical protein 6.2THAPSDRAFT_5431 Hypothetical protein 6.2THAPSDRAFT_11569 Hypothetical protein 6.2THAPSDRAFT_22213 PFK2 Phosphofructokinase 6.1THAPSDRAFT_10366 Hypothetical protein 6.1THAPSDRAFT_267987 CYS2 Cysteine synthase 6.0

Higher expression during exponential phaseTHAPSDRAFT_2356 Hypothetical protein 30.9THAPSDRAFT_3453 Hypothetical protein 18.4THAPSDRAFT_20603 Hypothetical protein 16.6THAPSDRAFT_20335 Hypothetical protein 15.3THAPSDRAFT_7916 FCP_2 Hypothetical protein 14.3THAPSDRAFT_11557 Hypothetical protein 13.5THAPSDRAFT_2341 Lhcr12 Fucoxanthin chlorophyll a/c protein 13.3THAPSDRAFT_1049 Oxidoreductase 13.3THAPSDRAFT_2673 Hypothetical protein 12.7THAPSDRAFT_33018 Lhcf6 Fucoxanthin chlorophyll a/c protein 6 12.0THAPSDRAFT_2342 Lhcr11 Fucoxanthin chlorophyll a/c protein 11.9THAPSDRAFT_5793 Hypothetical protein 11.3THAPSDRAFT_38583 Lhcf1 Fucoxanthin chlorophyll a/c protein 1 11.3THAPSDRAFT_270210 PsbO Oxygen-evolving enhancer protein 1 11.0THAPSDRAFT_260392 Lhcf2 Fucoxanthin chlorophyll a/c protein 2 10.9THAPSDRAFT_22747 Lhcr10 Fucoxanthin chlorophyll a/c protein 10.9THAPSDRAFT_30385 Lhcx6_1 Fucoxanthin chlorophyll a/c protein 10.8THAPSDRAFT_270241 Lhcf11 Fucoxanthin-chlorophyll a-c binding prot 10.7THAPSDRAFT_1989 Hypothetical protein 10.6

Ashworth et al. www.pnas.org/cgi/content/short/1300962110 11 of 13

Table S1. Cont.

Gene Name Protein Fold change

THAPSDRAFT_268127 Lhcf9 Fucoxanthin chlorophyll a/c protein 8 10.4Higher expression during stationary phase

THAPSDRAFT_24738 Hypothetical protein 26.5THAPSDRAFT_269095 Methyl transferase-like protein 16.1THAPSDRAFT_795 Hypothetical protein 15.4THAPSDRAFT_20827 Hypothetical protein 13.8THAPSDRAFT_268594 Tyrosine aminotransferase 11.6THAPSDRAFT_23160 Hypothetical protein 11.5THAPSDRAFT_39516 Hypothetical protein 11.4THAPSDRAFT_4355 Hypothetical protein 11.1THAPSDRAFT_22731 Hypothetical protein 10.7THAPSDRAFT_10740 Hypothetical protein 10.7THAPSDRAFT_8521 Hypothetical protein 10.7THAPSDRAFT_260934 Hypothetical protein 10.0THAPSDRAFT_5394 Hypothetical protein 9.8THAPSDRAFT_5424 Hypothetical protein 9.3THAPSDRAFT_11569 Hypothetical protein 8.6THAPSDRAFT_268895 SIT1 Silicic acid transporter, silicon transport 8.5THAPSDRAFT_4067 Hypothetical protein 8.4THAPSDRAFT_13485 Xanthine uracil permease 8.4THAPSDRAFT_22345 PYK2 Pyruvate kinase 8.3THAPSDRAFT_22671 Hypothetical protein 7.6

Table S2. DAVID database enrichment categories for condition-associated genes

Categories Cluster enrichment score

Genes that differ between dark and light periods 2,846 genes (1,146 up at dawn and 1,700 up at dusk)

Genes higher at dawn/lower at duskChloroplast, plastid, translation, rRNA-binding, ribosome, mitochondrion 13.57Aminoacyl-tRNA ligase and synthetase, amino acid activation, protein biosynthesis 4.41Transcription factor activity, heat-shock factor, regulation of transcription 1.99ABC transporter, ATPase activity 1.57Isoprenoid, terpenoid, carotenoid, pigment biosynthetic processes 1.36

Genes higher at dusk/lower at dawnDNA replication, mismatch repair, nucleotide excision repair 3.3Microtubule, cytoskeleton, motor activity, kinesin 2.63Proteasome, protein catabolic process, proteolysis 1.78

Genes that differ between exponential and stationary phases 1,747 genes (875 up in exponentialphase, 872 up in stationary phase)

Genes higher during exponential phase/lower during stationary phasePhotosynthesis, thylakoid, light harvesting, chloroplast, electron transport chain 10.64Cis-trans isomerase activity 4.94Polysaccharide metabolic process, chitin binding 3.42Tetrapyrrole/chlorophyll/pigment metabolic process 3.47

Genes higher during stationary phase/lower during exponential phaseGlycolysis, gluconeogenesis, citrate cycle 2.67Glutamate metabolic process, carboxylic acid biosynthetic process 1.51Transcription factor activity, heat-shock factor, regulation of transcription 1.2

Genes for which the effects of dawn:dusk and exponential:stationary transitions interact 478 genesTranslation, plastid, rRNA binding, chloroplast, ribosome, mitochondrion 6.76Aminoacyl-tRNA ligase and synthetase, amino acid activation, protein biosynthesis 5.08ATP binding, ribonucleotide binding 1.89

Significance (−log(P values)) for each gene cluster enrichment is shown in parentheses.

Ashworth et al. www.pnas.org/cgi/content/short/1300962110 12 of 13