

Reconstruction of protein networks from an atlas of maize seed proteotypes

Justin W. Walley, Zhouxin Shen, Ryan Sartor, Kevin J. Wu, Joshua Osborn, Laurie G. Smith, and Steven P. Briggs¹

Division of Biological Sciences, University of California, San Diego, La Jolla, CA 92093

Contributed by Steven P. Briggs, October 15, 2013 (sent for review September 10, 2013)

A comprehensive knowledge of proteomic states is essential for understanding biological systems. Using mass spectrometry, we mapped an atlas of developing maize seed proteotypes comprising 14,165 proteins and 18,405 phosphopeptides (from 4,511 proteins), quantified across eight tissues. We found that many of the most abundant proteins are not associated with detectable levels of their mRNAs, and we provide evidence for three potential explanations: transport of proteins between tissues; diurnal, out-of-phase accumulation of mRNAs and cognate proteins; and differential lifetimes of mRNAs compared with proteins. Likewise, many of the most abundant mRNAs were not associated with detectable levels of their proteins. Across the entire dataset, protein abundance was poorly correlated with mRNA levels and was largely independent of phosphorylation status. Comparisons between proteotypes revealed the quantitative contribution of specific proteins and phosphorylation events to the spatially and temporally regulated starch and oil biosynthetic pathways. Reconstruction of signaling networks established associations of proteins and phosphoproteins with distinct biological processes acting during seed development. Additionally, a protein kinase substrate network was reconstructed, enabling the identification of 762 potential substrates of specific protein kinases. Finally, examination of 694 transcription factors revealed remarkable constraints on patterns of expression and phosphorylation within transcription factor families. These results provide a resource for understanding seed development in a crop that is the foundation of modern agriculture.

quantitative proteomics | protein phosphorylation | systems biology

A central goal of biology is to understand phenotype. Proteins make or regulate every component of cells, and therefore phenotype is an emergent property of the specific state of the proteome. The proteomic state of a cell is its proteotype, which integrates the constraints of its genotype, developmental history, and environment. Thus, a complete description of the proteotype should define a phenotype at the molecular level. Typically, measurements of mRNA abundance are used to infer the proteotype (1, 2). However, it has become clear that mRNA levels are poorly correlated with protein abundance (3–8). Proteome-wide surveys are crucial for bridging this gap and defining specific cellular proteotypes.

Maize is a model organism with a rich history in fundamental research in addition to being the world’s largest production crop. The maize seed is a developmentally complex structure composed of two major compartments, the diploid embryo and the triploid endosperm, that arise from two separate fertilization events (double fertilization) and are enclosed within the maternally derived pericarp (9). As in other grasses, the maize endosperm is persistent throughout seed development (10). The endosperm consists primarily of starchy endosperm cells that are responsible for synthesis of starch and storage proteins, and its perimeter is composed of a single layer of aleurone cells. At maturity, the embryo is composed of a root meristem, a shoot meristem, and five or six leaf primordia enclosed within the scutellum (9, 11). Additionally, the embryo is the primary site of lipid biosynthesis in the seed. The production of storage products during seed formation is tightly regulated, and their accumulation is directly correlated with cell number and cell size (12, 13). Thus, the maize seed is an excellent model for profiling the proteotypes from a complex set of tissues that exhibit extensive spatiotemporal control and coordinated morphogenesis.

We used mass spectrometry (MS) to build an atlas of proteotypes for the developing maize seed based on protein abundance and levels of protein phosphorylation. These quantitative, highly replicated data enabled the reconstruction of protein networks for key biochemical processes and for developmental pathways.

Results

Mapping the Maize Seed Proteotype Atlas. To enhance our understanding of regulatory events controlling seed development as well as the key harvested traits of starch, lipid, and storage protein accumulation, we hand-dissected the maize seed into compartments at seven stages of development for MS analyses (Fig. 1A). These compartments include embryo, endosperm, and aleurone/pericarp tissues. Total protein was extracted from each sample, and tryptic peptides from the samples, with or without phosphopeptide enrichment, were analyzed by MS. The spectra were searched by using the B73 RefGen_v2 5a Working Gene Set (WGS) (14). By using stringent cutoffs to maintain a low false discovery rate at the spectral, peptide, and protein level, we identified 13,459 proteins (protein groups), originating from 13,203 gene models, based solely on 108,786 distinct nonmodified peptides (Fig. 1B and Dataset S1). The genes responsible for producing 12,453 of the proteins could be unequivocally assigned by the identification of at least one uniquely mapping peptide (Dataset S1). These proteins are predominantly in the filtered gene set (FGS), which consists of 39,656 high-confidence gene models that exclude transposons, pseudogenes, and other

Significance

Here we report deep, quantitative, and replicated proteome analysis of a developing multicellular organism. We quantified protein abundance and levels of protein phosphorylation during development of the maize seed. The depth and quantitative nature of the data enabled a network-based approach to identify kinase-substrate relationships as well as the reconstruction of biochemical and signaling networks that underpin seed development and seed storage product production. We found that many of the most abundant proteins are not associated with detectable levels of their mRNAs and vice versa. These data significantly add to our understanding of seed development and facilitate knowledge-based crop improvement.

Author contributions: J.W.W., L.G.S., and S.P.B. designed research; J.W.W., Z.S., J.O., and L.G.S. performed research; J.W.W., Z.S., R.S., K.J.W., and S.P.B. analyzed data; and J.W.W., Z.S., and S.P.B. wrote the paper.

The authors declare no conflict of interest.

Data deposition: Raw spectra have been deposited at the Mass Spectrometry Interactive Virtual Environment (MassIVE) repository, http://proteomics.ucsd.edu/ProteoSAFe/datasets.jsp (nonmodified proteome ID MSV000078444 and phosphoproteome ID MSV000078443).

¹To whom correspondence should be addressed. E-mail: [email protected].

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1319113110/-/DCSupplemental.

E4808–E4817 | PNAS | Published online November 18, 2013 www.pnas.org/cgi/doi/10.1073/pnas.1319113110


low-confidence members of the WGS (Fig. 1 D and E and Dataset S1). Finally, we identified and measured 4,511 phosphoproteins based on 18,405 distinct phosphopeptides containing 19,049 sites of phosphorylation, 8,889 of which were localized to a specific amino acid (Fig. 1C and Dataset S2).

The ∼5 million identified mass spectra, collected from several biological replicates (n = 4–7 for each nonmodified proteotype and n = 3–6 for each phosphoproteotype), were used to quantify protein abundance and phosphorylation levels by spectral counting (15, 16). To assess reproducibility and accuracy of the biological replicates, we computed Pearson correlations and found averages of 0.92 and 0.62 for the nonmodified proteome and phosphoproteome replicates, respectively (Dataset S3). Additionally, the data accurately reflect known patterns of protein accumulation (Fig. S1) (9, 17–22). Examination of the data revealed that the majority of proteins are present in multiple tissues whereas only 1,203 are tissue-specific (Fig. S2 A and B). These tissue-specific proteins are enriched in MapMan functional categories (i.e., ontological terms) (23) including “receptor kinases” and “regulation of transcription.” Additionally, most of the globally expressed proteins exhibit dynamic patterns of accumulation during seed development. Specifically, of the 4,709 globally expressed proteins, only 180 are stably expressed throughout development (change less than twofold in abundance; Fig. S2C), suggesting that dynamic changes in protein abundance underpin seed development.
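The replicate comparison described above can be illustrated with a minimal Python sketch. It assumes that each replicate is a vector of per-protein spectral counts and that the reported average is taken over all replicate pairs; the text states neither detail, and the counts below are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    denom = sqrt(var_x * var_y)
    # A constant vector has no defined correlation; report 0.0 in that case.
    return cov / denom if denom else 0.0


def mean_replicate_correlation(replicates):
    """Average Pearson r over every pair of replicate count vectors."""
    pairs = list(combinations(replicates, 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


# Hypothetical spectral counts for five proteins in three replicates.
replicates = [
    [10, 0, 25, 3, 40],
    [12, 1, 22, 2, 38],
    [9, 0, 27, 4, 43],
]
print(round(mean_replicate_correlation(replicates), 3))
```

Averaging over pairs is only one reasonable reading of "averages"; correlating each replicate against the tissue mean would be an equally plausible alternative.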

Relationship Between Transcript Level and Protein Abundance. We explored the relationship between mRNA and protein levels in the endosperm 12 d after pollination (DAP) and in the embryo 20 DAP by comparing our proteotype data with publicly available transcript profiling data (24). Considering only genes for which mRNA and protein were both reliably measured (endosperm, n = 5,922; embryo, n = 7,257), we found poor correlation between transcript and protein levels (endosperm r = 0.414, embryo r = 0.413; Dataset S4). Although the global correlation was low, there was a wide range of correlations depending on the functional category (MapMan bins; Dataset S4); for instance, “aspartate metabolism” (r = 0.99), “phosphoenolpyruvate carboxylase” (r = 0.98), “oxidative pentose phosphate 6-phosphogluconate dehydrogenase” (r = 0.92), “abscisic acid signal transduction” (r = −0.17), “auxin metabolism” (r = −0.06), “auxin response factor” (ARF; r = −0.001), “basic leucine zipper (bZIP) transcription factor (TF) family” (r = 0.03), and “cell wall modification” (r = −0.70), indicating that posttranscriptional regulation of protein abundance is function-specific.

We observed that 22% (endosperm) and 21% (embryo) of the genes had matching rank abundance between their mRNA and protein (Fig. 2A, yellow dots) and 29% of these genes were the same in both tissues (i.e., protein and mRNA; Fig. 2B), suggesting that they are largely free of posttranscriptional regulation (Dataset S5). Additionally, many genes produced high abundance mRNA but little or no detectable protein (Fig. 2, Fig. S3A, and Dataset S5). Such cases may be explained by translational inhibition or targeted protein degradation.

Surprisingly, for many of the most abundant proteins there was little or no detectable mRNA (Fig. 2, Fig. S3A, and Dataset S5). To verify this remarkable finding, we performed quantitative

[Figure 1, panels A–E.

(A) Diagram of sampled tissues: pericarp (2n) + aleurone (3n), nucellus (2n), endosperm (3n), and embryo (2n), at 8, 10, 12, 20, 27, and 38 DAP and 2 DAI.

(B) Summary table (FDR in parentheses).
Nonmodified proteome: 67,167,550 collected spectra; 4,194,218 identified spectra (0.03%); 108,786 distinct peptides (0.12%); 13,459 proteins (0.82%).
Phosphoproteome: 32,276,673 collected spectra; 961,912 identified spectra (0.03%); 59,780 distinct peptides (0.23%); 4,511 proteins (2.19%).
Total: 99,444,223 collected spectra; 5,156,130 identified spectra; 168,566 distinct peptides; 14,165 proteins.

(C) Localized phosphorylation sites (percent of localized): pSer, 7,671 (86.3%); pThr, 1,164 (13.1%); pTyr, 54 (0.61%).

(D) Pie chart of Working Gene Set and Filtered Gene Set membership: 96.2% and 3.8%.

(E) Pie chart by annotation: protein coding, 97.9%; pseudogene and transposable element account for the remaining 0.9% and 1.2%.]

Fig. 1. Overview of the maize seed proteotype atlas. (A) Nonmodified proteins (n = 4–7 biological replicates) and phosphoproteins (n = 3–6 biological replicates) were identified from diverse tissues in the developing maize seed. Tissues sampled are shown in color and ploidy is indicated in parentheses (DAI, days after imbibition). (B) Summary of sampled spectra, peptides, and proteins identified. (C) Number of total phosphorylation sites as well as localized phosphorylation sites. (D) Percentage of detected proteins that are in the FGS or WGS. (E) Breakdown of detected proteins based on annotation. For D and E, only the subset of proteins (n = 12,453) identified via uniquely mapping peptides were used.


reverse transcriptase-PCR (RT-qPCR) on the same samples that were used for proteomics and found concordance between the published array data and our RT-qPCR values (Fig. S3B). For example, the protein produced by GRMZM2G472236 is a Late Embryogenesis Abundant family member and was among the most abundant proteins in the 20 DAP embryo, but the corresponding transcript was not detected by RT-qPCR or microarray in either tissue (Fig. S3 B and C).

We explored three possible scenarios that could explain the high-protein, low-mRNA discrepancy: (i) transcript levels cycle diurnally while the protein remains, (ii) transcription and translation occur earlier in development and the proteins are stable while the mRNA is not, and (iii) transcription and translation occur in another tissue from which the protein moves. We found evidence to support all three hypotheses by comparing the protein greater than mRNA genes with microarray data characterizing maize leaf circadian cycling genes (25) as well as additional seed microarrays from Sekhon et al. (24), which profiled 12 and 20 DAP whole seed (WS), 12 DAP endosperm, and 16, 18, and 20 DAP embryos. (i) Seven of the endosperm and three of the embryo protein greater than mRNA genes encode transcripts known to be circadian regulated (Fig. 2C). (ii) Transcripts for 5 of the 20 DAP embryo protein greater than mRNA genes are detected in 16 and/or 18 DAP embryos but not 20 DAP embryos (Fig. S3D). Further, 20 DAP embryo protein greater than mRNA genes, for which mRNA is detected in 16 DAP embryos, are transcriptionally expressed at a higher level at 16 DAP compared with 20 DAP (Fig. S3E). (iii) A total of 15 of the 50 endosperm protein greater than mRNA genes, which were not detected at the mRNA level in the 12 DAP endosperm, were detected in the 12 DAP WS (Fig. 2D). Additionally, 15 of the 54 embryo protein greater than mRNA genes, which were not detected at the mRNA level in the 20 DAP embryo, were detected in the 20 DAP WS and/or 20 DAP endosperm (Fig. 2E).
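The rank-based comparison of mRNA and protein abundance used in this section can be sketched in Python as follows. This is an illustration only: the `tolerance` cutoff that decides when two ranks "match" is a hypothetical parameter, since the text does not give the criterion, and the abundance values are invented.

```python
def ranks(values):
    """Rank items from most abundant (rank 1) to least; ties keep input order."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def classify_by_rank(mrna, protein, tolerance):
    """Label each gene by comparing its mRNA rank with its protein rank.

    `tolerance` is a hypothetical cutoff: ranks that differ by no more
    than this amount are treated as matching.
    """
    labels = []
    for m_rank, p_rank in zip(ranks(mrna), ranks(protein)):
        if abs(m_rank - p_rank) <= tolerance:
            labels.append("protein = mRNA")
        elif p_rank < m_rank:
            # The protein ranks higher (is more abundant) than its transcript.
            labels.append("protein > mRNA")
        else:
            labels.append("mRNA > protein")
    return labels


# Hypothetical abundances for four genes (same gene order in both lists).
mrna = [900.0, 5.0, 300.0, 0.0]
protein = [80, 60, 2, 40]
print(classify_by_rank(mrna, protein, tolerance=0))
```

Working on ranks rather than raw values sidesteps the different measurement scales of microarray intensities and spectral counts, which is presumably why Fig. 2A plots rank order.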

Phosphorylation Levels Are Independent of Protein Abundance. The lack of concordance between mRNA and protein levels prompted us to ask whether protein abundance dictates phosphorylation level. For this, we focused on the 3,805 nonmodified proteins that were also observed as phosphoproteins (Fig. 3A). After clustering proteins based on phosphoprotein abundance, it was apparent that phosphorylation level and protein abundance are

[Figure 2, panels A–E. (A) Scatter plots of protein abundance rank (thousands) against mRNA abundance rank (thousands) for endosperm 12 DAP and embryo 20 DAP, with genes classed as protein > mRNA, protein = mRNA, or mRNA > protein. (B) Venn diagrams of endosperm (En) and embryo (Em) genes in each of the three classes. (C) Bar chart of counts of cycling and noncycling genes for En 12 DAP and Em 20 DAP. (D) Bar chart of 12 DAP counts by detection category: not detected, En, WS, En & WS. (E) Bar chart of 20 DAP counts by detection category: not detected, Em, En, WS, Em & WS, En & WS, Em & En & WS.]

Fig. 2. Relationship of transcript level to protein abundance. (A) Rank order abundance of mRNA and protein in the endosperm 12 DAP and the embryo 20 DAP. Protein abundance was quantified by using unique and multimapping peptides. Detected proteins lacking a corresponding microarray probe were excluded. (B) Overlap in gene products exhibiting similar regulation in the endosperm and embryo. (C) Transcript abundance cycling behavior of protein greater than mRNA genes. (D) Number of 12 DAP endosperm protein greater than mRNA genes that are detected at the transcript level in the 12 DAP endosperm and/or WS. (E) Number of 20 DAP embryo protein greater than mRNA genes that are detected at the transcript level in the 20 DAP embryo, endosperm, and/or WS. Transcript data shown in A, B, D, and E were reported by Sekhon et al. (24), whereas transcript data for C were described in the work of Khan et al. (25).


largely independent (Fig. 3 B and C), as observed in mice (15). Further, individual sites of phosphorylation exhibit tissue-specific levels that are not dictated by protein abundance (Fig. 3D).

[Figure 3, panels A–C. (A) Venn diagram: 706 proteins detected only as phosphoproteins, 3,805 detected in both forms, and 9,654 detected only as nonmodified proteins. (B) Paired heat maps of relative abundance (scale −2.3 to 2.5) for the 3,805 shared proteins, shown as phosphoproteins and as nonmodified proteins across Em 20 DAP, Em 38 DAP, Germ Em 2 DAI, En 8 DAP, En 10 DAP, En 12 DAP, En Crown 27 DAP, and Per/Aleu 27 DAP. (C) Bar charts of normalized spectral counts across the same tissues for the nonmodified protein and a localized phosphopeptide: AGO, (R)AAGGsPVHESPTQANK(K); IQD, (R)DsVTAGDDDDKPAEEGK(A); ARFGAP, (K)sISSSQFFGDQASFEK(D); E3 RING, (K)DAEVHGsPVPESNGR(L); Ubiquitin Thiolesterase, (R)NTSFGGAASNsGPVSNAGGR(S); SR Splicing Factor, (R)DVsRsPPPPPDAR(R); bZIP, (R)NNLTEGGAEsDEEIR(R); ZmC3H1, (R)GANEEVSsINVDEDPNVPYER(S).]

Fig. 3. Phosphorylation levels are largely independent of protein abundance. (A) Venn diagram showing the overlap between nonmodified and phosphoproteins. (B) Heat maps ordered by hierarchical clustering of phosphoprotein abundance of all proteins detected at the nonmodified and phosphoprotein level. (C) Selected phosphopeptides exhibiting site-specific phosphorylation that is not dictated by protein abundance. Data are means of independent biological replicates ± SE.

Protein Kinase Network Reconstruction. Next, we used the atlas to reconstruct a regulatory network of protein kinases and their substrates. A common feature of protein kinases is their activation loop, which requires phosphorylation to enable catalysis (26). We exploited this feature to quantify kinase activity during seed development by measuring phosphopeptides from each activation loop. Importantly, kinase activation could not have been predicted from kinase abundance (Fig. 4A). We next performed a correlation analysis to identify proteins whose phosphorylation levels corresponded with activation of a specific kinase, inferring that these proteins may be substrates of the kinase. This enabled the reconstruction of a network containing nine activated kinases and 762 potential substrate proteins (Fig. 4B and Dataset S6). For validation, we compared our predicted substrates of mitogen-activated protein kinase 6 (ZmMPK6) with known substrates of the orthologous Arabidopsis MPK6 (27) and found a significant overlap (P = 2.99 × 10⁻³). Additionally, glycogen synthase kinase 3/SHAGGY (GSK) consensus motif, (S/T)XXX(S/T), is overrepresented (P = 0.045) in the substrates of the GSK-related kinase (GRMZM2G155836). Further, the MAPK consensus motif, PX(S/T)P, is overrepresented in the substrates of AtMPK6-like (P = 0.056) and AtMPK9-like (P = 0.023). Examination of the network revealed kinases predicted to phosphorylate a number of well-studied maize proteins (Dataset S6). For example, a GSK kinase is predicted to phosphorylate the bZIP TFs OPAQUE2 HETERODIMERIZING PROTEIN 1 and 2 (OHP1 and OHP2) (28) on a conserved serine (Fig. S4). Taken together, this approach has enabled the creation of a robust predictive network of potential kinase–substrate pairs.
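The correlation step and the consensus-motif checks described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the cutoff `min_r`, the protein identifiers, the tissue profiles, and the test sequences are all hypothetical, and only the two motif patterns are taken from the text.

```python
import re
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    denom = sqrt(sum((a - mean_x) ** 2 for a in x) *
                 sum((b - mean_y) ** 2 for b in y))
    # A constant profile has no defined correlation; report 0.0 in that case.
    return cov / denom if denom else 0.0


def candidate_substrates(kinase_activity, phospho_levels, min_r):
    """Return IDs of proteins whose phosphorylation tracks kinase activation.

    kinase_activity: activation-loop phosphopeptide counts per tissue.
    phospho_levels: dict of protein ID -> phosphorylation level per tissue.
    min_r: hypothetical correlation cutoff (not given in the text).
    """
    return sorted(
        protein
        for protein, levels in phospho_levels.items()
        if pearson(kinase_activity, levels) >= min_r
    )


# Consensus motifs quoted in the text, written as regular expressions.
GSK_MOTIF = re.compile(r"[ST].{3}[ST]")  # (S/T)XXX(S/T)
MAPK_MOTIF = re.compile(r"P.[ST]P")      # PX(S/T)P

# Hypothetical profiles across eight tissues.
kinase_activity = [1, 4, 9, 2, 0, 0, 3, 7]
phospho_levels = {
    "protA": [2, 8, 18, 4, 0, 0, 6, 14],  # rises and falls with the kinase
    "protB": [9, 1, 0, 5, 7, 8, 3, 1],    # moves opposite to the kinase
}
print(candidate_substrates(kinase_activity, phospho_levels, min_r=0.9))
print(bool(MAPK_MOTIF.search("AAPLSPGG")), bool(GSK_MOTIF.search("ASAAATA")))
```

Correlation alone cannot distinguish direct substrates from proteins that are merely co-regulated, which is why motif enrichment and comparison with known Arabidopsis substrates serve as independent checks in the text.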

[Figure 4, panels A and B. (A) Paired heat maps of relative abundance (scale 0 to 1) for the activated kinase and the nonmodified kinase across Em 20 DAP, Em 38 DAP, Germ Em 2 DAI, En 8 DAP, En 10 DAP, En 12 DAP, En Crown 27 DAP, and Per/Aleu 27 DAP. (B) Network diagram with nine numbered kinase hubs.]

Fig. 4. Prediction of protein kinase–substrate relationships. (A) Hierarchical clustering based upon the number of spectra mapping to the kinase activation loop was used to order heat maps depicting the amount of activated kinase (Left) and nonmodified form of the kinase (Right). (B) Network of activated kinases and proteins whose phosphorylation is correlated. Hub identifiers: 1, GRMZM2G306028 (AtMPK9-like); 2, GRMZM2G424582 (AtKEG-like); 3, GRMZM2G149286 (CDKD related); 4, GRMZM2G028452 (CDKC related); 5, GRMZM2G067734; 6, GRMZM2G171987 (SRPK4-like); 7, GRMZM2G167280 (LRR Receptor Like Kinase Related); 8, GRMZM2G155836 (GSK-related); and 9, GRMZM2G020216 (AtMPK6-like).

Spatiotemporal Regulation of Starch and Lipid Biosynthesis. In the developing maize seed, starch and triglyceride accumulation are spatiotemporally regulated, resulting in their accumulation in endosperm and embryo, respectively (13, 29, 30). Thus, we examined the starch and triglyceride pathways in detail, hypothesizing that protein abundance or phosphorylation may regulate photosynthate partitioning. For this, we manually curated the pathways (SI Materials and Methods) to identify proteins that are known or predicted, based on homology, to perform each biochemical step. This enabled identification and quantification of known and novel paralogs of enzymes and transporters at each stage of development (Fig. 5, Fig. S5, and Dataset S1). Many lipid biosynthesis enzymes, including the key determinant of seed oil content DGAT1-2 (29), peaked in the early embryo, where most of the seed oil accumulates. In contrast, proteins known to be important for seed starch biosynthesis such as SH1, BT2, SH2, BT1, WX1, SU2, DU1, AE, and SU1 exhibited maximal abundance in the endosperm crown at 27 DAP, which corresponds with the peak time of starch synthesis in our samples (Fig. 5).

We also discovered numerous sites of phosphorylation that may regulate starch biosynthesis. In the endosperm of maize and other grasses, glucose-1-P is converted to ADP-glucose predominantly in the cytosol and then transported into the plastid by BT1, which is a nucleotide transporter of the mitochondrial carrier family (MCF) (30). A key feature of MCF proteins is their transmembrane barrel composed of six α-helices (31). Mutations in the yeast MCF protein ANT1 that mimic dephosphorylation of α-helix four abolish transport activity (32). Accordingly, we observed a phosphopeptide that matches α-helix one of BT1, suggesting that serine phosphorylation may regulate its ADP-glucose transport activity (Fig. 5, Fig. S6, and Dataset S2). Research on maize and wheat has established that phosphorylation causes starch synthesis enzymes to form active, multiprotein complexes; the complexes include SBEI/AE/SP, SSI/SU2/AE, and SSI/SU2/SBEI/SBE2a/SP, with SBEI, AE, and SP identified as phosphoproteins (33, 34). The sites of phosphorylation on these proteins have not been reported. Our atlas of proteotypes revealed specific sites of phosphorylation for the starch synthesis complex members SBEI, SBE2a, AE, and SP (Fig. 5 and Dataset S2). Identification of these phosphorylation sites enables targeted mutational studies aimed at regulating starch synthesis complex assembly, with the goal of tailoring starch quantity or quality for specific applications.

Seed Development Proteotypes. To gain insight into biological processes functioning throughout seed development and to associate specific proteins with key seed phenotypes, we performed hierarchical clustering-based network reconstruction (35) on protein abundance and levels of phosphorylation. Consistent with phosphorylation status being independent of protein abundance (Fig. 3), MapMan bins identified as enriched at the phosphopeptide level were largely distinct from bins enriched in the nonmodified proteome (Fig. 6 and Dataset S7). We detected enrichment of MapMan bins for well-characterized biological processes including starch synthesis and lipid metabolism at the expected time and place (10, 12, 13, 29). Additionally, enzymes of phenylpropanoid metabolism were enriched in the pericarp/aleurone (Per/Aleu) tissue, which is known to accumulate phenylpropanoids that are associated with insect and pathogen resistance (36, 37). Jasmonate enzymes were also enriched in the Per/Aleu, suggesting

[Fig. 5 image: starch biosynthesis pathway from sucrose to starch, spanning the cytosol and plastid, with a heat map beside each enzyme. Heat map columns: Em 20 DAP, Em 38 DAP, Germ Em 2 DAI, En 8 DAP, En 10 DAP, En 12 DAP, En Crown 27 DAP, and Per/Aleu 27 DAP. Color scale: relative abundance, low to high.]

Fig. 5. Dynamics of starch biosynthetic enzymes during seed development. Heat maps depict the relative abundance of individual proteins throughout development. The green “P” denotes tissues in which the corresponding phosphoprotein was detected. The starch biosynthesis pathway was adapted from Comparot-Moss and Denyer (30).

Walley et al. PNAS | Published online November 18, 2013 | E4813


that this defense hormone (38) forms an additional layer of defense in the protective pericarp tissue.
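The enrichment calls in this section rest on a hypergeometric test (Fig. 6 legend). A minimal sketch of such a test follows, assuming a one-sided, upper-tail p-value. The function name and the toy counts are illustrative assumptions and are not taken from the paper or its SI; the background set and any multiple-testing correction used by the authors are not reproduced here.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, cluster_size, hits):
    """Upper-tail hypergeometric p-value, P(X >= hits).

    population   -- number of quantified proteins (background set)
    annotated    -- background proteins assigned to the bin of interest
    cluster_size -- proteins in the tissue-enriched cluster
    hits         -- cluster proteins assigned to the bin
    """
    upper = min(annotated, cluster_size)
    total = comb(population, cluster_size)
    # Sum the probability mass for every outcome at least as extreme as `hits`.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, cluster_size - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Toy numbers (hypothetical): 40 of 1,000 background proteins carry the bin,
# and 8 of the 50 proteins in a tissue-enriched cluster carry it.
p_value = hypergeom_enrichment_p(1000, 40, 50, 8)
print(f"p = {p_value:.3g}")
```

With these toy counts only about two annotated proteins would be expected in the cluster by chance, so observing eight yields a small p-value.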

TF Dynamics. Finally, we investigated TFs, which are key regulators of growth, development, and cell fate that have traditionally been difficult to detect by proteomics because of their low abundance (39, 40). However, in the developing seed, we identified and measured 694 (28%) of the 2,516 annotated TFs (Fig. 7). Clustering the TF data revealed extensive enrichment of protein accumulation and phosphorylation in specific tissues (Fig. 7A and Dataset S8). These tissue-enriched TFs represent candidate proteins responsible for patterning maize seed development as well as controlling the spatiotemporal expression of starch, lipid, and storage protein biosynthesis pathways.

As an alternative to looking at individual TFs, we searched for family-level patterns of TF accumulation by summing all the spectral counts within a tissue for each TF family. Surprisingly, the abundance of most TF families was greatest in a specific tissue, with 25 of 47 TF families collectively peaking in the 20 DAP embryo (Fig. 7B). Specifically, TF families with well established

[Fig. 6 image: two heat maps, (A) Non-Modified Proteome and (B) Phosphopeptides. Columns in both panels: Em 20 DAP, Em 38 DAP, Germ Em 2 DAI, En 8 DAP, En 10 DAP, En 12 DAP, En Crown 27 DAP, and Per/Aleu 27 DAP. Color scale: relative abundance, low to high. Category labels beside A: Cell Wall & Vesicle Transport; Protein Synthesis; Protein Synthesis & Degradation; Starch Synthesis - AGPase; Phenylpropanoids & Jasmonate; Lipid Metabolism; Late Embryogenesis Abundant & ABA regulated; Cysteine Protease & Starch Degradation. Category labels beside B: Cell Wall & Cell Cycle; Cellulose Synthesis; bZIP TF & DNA Methyltransferases; Starch Synthesis - Branching; Receptor-Like Kinase & Biotic Stress; Regulation of Transcription; ABA regulated; Light Reaction - Ferredoxin.]

Fig. 6. Hierarchical clustering of protein abundance and phosphorylation status throughout seed development. (A) Clustering of nonmodified proteins that had at least five normalized spectral counts in one or more tissue. (B) Clustering of phosphopeptides containing localized phosphorylation sites that were detected in at least two biological replicates. Vertical bars to the right of the heat maps denote the cluster (composed of all terminal nodes in the hierarchical tree) of proteins (A) or phosphopeptides (B) selected as tissue-enriched. Selected MapMan functional categories that are significantly enriched (hypergeometric test) in a given tissue (cluster) are listed to the right of each heat map. All enriched MapMan categories are listed in Dataset S7.


roles in tissue pattern formation such as MYB, ARF, homeobox, and TCP accumulated predominantly in the 20 DAP embryo, whereas MADS TFs peaked in the 8 DAP endosperm and YABBY TFs built up in the Per/Aleu. Consistent with the enrichment of defense proteins in the Per/Aleu (Fig. 6), the defense-related WRKY TF family peaked in the Per/Aleu. We also observed asymmetries in the phosphorylation status of TF families. For example, an increase in bZIP family abundance corresponded to increases in bZIP phosphorylation, whereas the opposite pattern was observed for the ARF family. Considering that the DNA binding activity of OPAQUE2 (a maize bZIP) and ARF2 (in Arabidopsis) is abolished by hyperphosphorylation (41, 42), our TF phosphorylation data may be used to infer TF activities during development.
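The family-level summary described in this section (summing all spectral counts within a tissue for each TF family, then finding the tissue of maximal abundance) can be sketched as follows. The identifiers, counts, and function name are hypothetical and serve only to illustrate the bookkeeping; they are not values from Dataset S8.

```python
from collections import defaultdict


def family_peak_tissue(tf_counts, tf_family):
    """Sum spectral counts per TF family and tissue; return each family's peak tissue.

    tf_counts -- {tf_id: {tissue: spectral_count}}
    tf_family -- {tf_id: family_name}
    """
    totals = defaultdict(lambda: defaultdict(float))
    for tf_id, by_tissue in tf_counts.items():
        for tissue, count in by_tissue.items():
            totals[tf_family[tf_id]][tissue] += count
    # max() keeps the first tissue encountered when two tissues tie.
    return {fam: max(by_tissue, key=by_tissue.get) for fam, by_tissue in totals.items()}


# Hypothetical identifiers and counts, for illustration only.
tf_family = {"tf1": "MYB", "tf2": "MYB", "tf3": "WRKY"}
tf_counts = {
    "tf1": {"Em 20 DAP": 12, "En 8 DAP": 3, "Per/Aleu 27 DAP": 1},
    "tf2": {"Em 20 DAP": 5, "En 8 DAP": 9, "Per/Aleu 27 DAP": 0},
    "tf3": {"Em 20 DAP": 2, "En 8 DAP": 1, "Per/Aleu 27 DAP": 7},
}
peaks = family_peak_tissue(tf_counts, tf_family)
print(peaks)
```

Note that a single abundant family member can dominate the family total, which is one reason the per-TF clustering in Fig. 7A remains a useful complement to the family-level view.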

Discussion

The maize seed is a developmentally complex entity composed of two major compartments, the diploid embryo and the triploid endosperm, that arise from double fertilization events. The endosperm and embryo develop inside the maternally derived pericarp, which arises from the ovary wall (9). Despite the biological complexity of the sampled tissues relatively few proteins

[Fig. 7 image: (A) heat maps of TF abundance labeled "618 Non-modified Proteins" and "392 Phosphoproteins"; columns: Em 20 DAP, Em 38 DAP, Germ Em 2 DAI, En 8 DAP, En 10 DAP, En 12 DAP, En Crown 27 DAP, and Per/Aleu 27 DAP; color scale: relative abundance, 0 to 2.65. (B) Per-family bar charts for nonmodified proteins and phosphoproteins across the same tissues, with one row per TF family (for example MYB, bHLH, bZIP, ARF, MADS, and WRKY); the per-family TF counts are not recoverable from the extracted text.]

Fig. 7. Dynamics of TFs during seed development. (A) Hierarchical clustering showing the nonmodified and phosphoprotein abundance profiles of TFs. Vertical bars to the right of the heat maps denote the cluster (comprised of all terminal nodes in the hierarchical tree) selected as tissue-enriched. Cluster members are listed in Dataset S8. (B) Spectral counts were summed for each TF family. Red bars indicate the tissue of maximal abundance. Numbers to the right of the bars list the total number of detected TFs for each family.


exhibit tissue-specific accumulation patterns. Although tissue-specific proteins are strong candidates as key regulators of cell fate, the finding suggests that knowledge of quantitative changes in protein abundance and phosphorylation status is critical for understanding seed development as well as biosynthetic processes that occur in a spatiotemporal manner during seed development. Thus, the tissue-enriched proteins and phosphorylation events identified via hierarchical clustering are likely important candidate regulators of tissue identity.

Analysis of the data highlights the complexity of accurately characterizing proteotype. Specifically, measures of mRNA correlate poorly with protein abundance, which is itself largely independent of phosphorylation status. The lack of correlation between mRNA and protein levels has been documented in other organisms (3–8). However, we were surprised by the observation that many of the most abundant proteins had little to no measurable mRNA. We found evidence suggesting that multiple mechanisms underpin this phenomenon, including (i) transcription and translation occur earlier in development and the proteins are stable while the mRNA is not; (ii) transcript levels cycle diurnally while the protein remains; and (iii) transcription and translation occur in another tissue from which the protein moves. Consistent with these mechanisms, in mammalian cells, proteins are approximately five times more stable than mRNAs (3), and, in Arabidopsis, most transcripts cycle diurnally while the encoded proteins do not (43).

Because TFs establish regulatory networks that shape growth and development, knowledge of where and when TFs are active is of widespread interest (40). Interrogation of our proteotype atlas enabled the identification of TFs that exhibit enrichment in abundance or phosphorylation in a specific tissue, making these proteins strong candidate regulators of seed development. Consistent with this idea, we observed maximal accumulation of MADS TFs in the 8 DAP endosperm, which contains maternal nucellar tissue (Fig. S1A and ref. 44). In rice, MADS29 is expressed in the nucellus, where it regulates seed development by controlling programmed cell death of the maternal tissues (45). Finally, the observed conservation of tissue-specific accumulation and phosphorylation within many TF families suggests that there are significant evolutionary constraints on diversification of TF function.

Phosphorylation is a fundamental mechanism for regulating protein activity, and identification of thousands of phosphorylation sites by MS is now feasible. However, identification of the kinase responsible for substrate phosphorylation remains challenging. We therefore reconstructed a kinase–substrate regulatory network by correlating kinase activation with substrate phosphorylation. The resulting network predicts kinases responsible for phosphorylation of numerous “classical” maize genes that have been identified by mutant phenotypes (46). Additionally, we observed multiple instances of a kinase being predicted to phosphorylate a conserved phosphorylation site on paralogous TFs. These phosphorylation sites are of particular interest because functional phosphorylation sites are more likely to be conserved than nonfunctional sites (47). For example, a GSK kinase is predicted to phosphorylate a conserved serine on OHP1 and OHP2, which are involved in regulating zein storage protein synthesis in the endosperm (28).
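The idea of predicting kinase–substrate pairs by correlating kinase activation with substrate phosphorylation can be sketched as below. This is an illustration under stated assumptions, not the authors' procedure: it assumes Pearson correlation across the eight tissues and an arbitrary cutoff (`min_r`), whereas the actual measure, thresholds, and filtering are described in SI Materials and Methods. All identifiers and counts are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def predict_edges(kinase_profiles, substrate_profiles, min_r=0.9):
    """Return (kinase, phosphosite, r) for every pair with correlation >= min_r.

    kinase_profiles    -- {kinase_id: activation-loop phosphopeptide counts per tissue}
    substrate_profiles -- {site_id: phosphopeptide counts per tissue}
    min_r              -- assumed cutoff; not a value reported in the paper
    """
    edges = []
    for kinase, k_profile in kinase_profiles.items():
        for site, s_profile in substrate_profiles.items():
            r = pearson(k_profile, s_profile)
            if r >= min_r:
                edges.append((kinase, site, r))
    return edges


# Hypothetical spectral counts across eight tissues.
kinase_profiles = {"kinase_A": [0, 1, 2, 4, 6, 3, 1, 0]}
substrate_profiles = {
    "site_X": [0, 2, 4, 8, 12, 6, 2, 0],  # tracks kinase_A activation
    "site_Y": [9, 7, 5, 2, 0, 3, 6, 8],   # moves opposite to kinase_A
}
edges = predict_edges(kinase_profiles, substrate_profiles)
print(edges)
```

Correlation across only eight tissues produces many chance associations, so predictions of this kind are best read as candidates for follow-up rather than confirmed kinase–substrate pairs.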

In conclusion, we have created an atlas of maize seed proteotypes by using MS that quantifies protein abundance and phosphorylation levels across developmental time. The atlas comprises 14,165 proteins and 18,405 phosphopeptides, making it the most complete, quantitative proteome to date. The reconstruction of metabolic and developmental networks illustrates the utility of the atlas as well as the causal relationships between proteotypes and phenotypes. The atlas and derived protein networks add significantly to our understanding of seed development, and they should facilitate knowledge-based crop improvement.

Materials and Methods

Plant Material. All samples were collected from Zea mays (maize) inbred line B73 grown outdoors on the University of California, San Diego, campus during summer 2009, following manual self or intersibling pollination. A detailed description of the sampled tissues is provided in SI Materials and Methods.

MS. Sample preparation and MS are based on previously described methods (48–50) and are detailed in SI Materials and Methods. Briefly, the generated spectra were searched using the B73 RefGen_v2 5a WGS (14). Phosphorylation sites were localized to a particular amino acid within a peptide by using the variable modification localization score in Agilent Spectrum Mill software (51). Proteins that share common peptides were grouped by using principles of parsimony to address protein database redundancy. Thus, proteins within the same group share the same set or subset of peptides. Protein abundance and phosphorylation levels were quantified by spectral counting. Spectral counts for each protein represent the total number of peptide spectral matches to that protein (15, 16, 48). MS runs (replicates) were normalized so that the total number of spectral counts was equal for each run. Spectral counts from technical replicates, when present, were then averaged to get the spectral counts for each biological replicate at the protein level. Raw spectra are deposited at the Mass Spectrometry Interactive Virtual Environment (MassIVE) repository (nonmodified proteome ID MSV000078444 and phosphoproteome ID MSV000078443).
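The two quantification steps described above (scaling each run to a common total, then averaging technical replicates within a biological replicate) can be sketched as follows. This is a minimal illustration, not the authors' code. It assumes the common total is the mean of the run totals when none is given and that a protein missing from a run counts as zero; neither choice is stated in the text. Run and protein names are hypothetical.

```python
def normalize_runs(runs, target_total=None):
    """Scale each MS run so that its spectral counts sum to the same total.

    runs         -- {run_id: {protein_id: raw_spectral_count}}
    target_total -- common total; defaults to the mean run total (an assumption)
    """
    totals = {run: sum(counts.values()) for run, counts in runs.items()}
    if target_total is None:
        target_total = sum(totals.values()) / len(totals)
    return {
        run: {prot: c * target_total / totals[run] for prot, c in counts.items()}
        for run, counts in runs.items()
    }


def average_technical_replicates(normalized, bio_replicate_of):
    """Average normalized counts over the runs that belong to each biological replicate.

    bio_replicate_of -- {run_id: biological_replicate_id}
    """
    grouped = {}
    for run, counts in normalized.items():
        grouped.setdefault(bio_replicate_of[run], []).append(counts)
    averaged = {}
    for bio_rep, run_list in grouped.items():
        proteins = set().union(*run_list)
        # A protein absent from a run is treated as zero counts (an assumption).
        averaged[bio_rep] = {
            p: sum(r.get(p, 0.0) for r in run_list) / len(run_list) for p in proteins
        }
    return averaged


# Hypothetical example: two technical replicates of one biological replicate.
runs = {"run1": {"P1": 30, "P2": 70}, "run2": {"P1": 120, "P2": 80}}
normalized = normalize_runs(runs)
per_bio_rep = average_technical_replicates(normalized, {"run1": "bio1", "run2": "bio1"})
print(per_bio_rep)
```

Equalizing totals removes differences in how many spectra each run acquired, so that counts compared between tissues reflect relative abundance rather than instrument sampling depth.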

Relationship of mRNA to Protein. Normalized mRNA expression data from a previous work (24), corresponding to the B73 RefGen_v2 5a Working Gene Set, were downloaded from PLEXdb Accession Zm37 (www.plexdb.org). A detailed description is provided in SI Materials and Methods.
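One way to compare matched mRNA and protein measurements is a rank correlation computed across genes within a tissue. The sketch below uses Spearman correlation as an assumption; this section does not state which correlation measure the authors used, and the values shown are hypothetical rather than drawn from the atlas or from PLEXdb.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation, computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    if sxx == 0 or syy == 0:
        return 0.0  # correlation is undefined for a constant vector
    return sxy / sqrt(sxx * syy)


# Hypothetical per-gene values within one tissue:
# normalized mRNA expression and normalized protein spectral counts.
mrna = [120.0, 15.0, 0.0, 300.0, 42.0]
protein = [8.0, 30.0, 55.0, 12.0, 0.0]
rho = spearman(mrna, protein)
print(f"Spearman rho = {rho:.2f}")
```

A rank-based measure is a reasonable choice here because spectral counts and array intensities are on different scales and both are heavily skewed, but it still treats undetected mRNAs or proteins as ties at zero.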

PCR. Detailed information on PCR is provided in SI Materials and Methods.

Functional Annotations. Detailed information on functional annotations is provided in SI Materials and Methods.

Hierarchical Clustering. Detailed information on hierarchical clustering is provided in SI Materials and Methods.

Functional Category Enrichment. Detailed information on functional category enrichment is provided in SI Materials and Methods.

Pathway Analysis. Detailed information on pathway analysis is provided in SI Materials and Methods.

Protein Kinase Substrate Network. Detailed information on the protein kinase substrate network is provided in SI Materials and Methods.

ACKNOWLEDGMENTS. This work was supported by National Science Foundation Grant 0924023 (to S.P.B.) and a National Institutes of Health National Research Service Award Postdoctoral Fellowship F32GM096707 (to J.W.W.).

1. Lockhart DJ, Winzeler EA (2000) Genomics, gene expression and DNA arrays. Nature 405(6788):827–836.

2. Belmonte MF, et al. (2013) Comprehensive developmental profiles of gene activity in regions and subregions of the Arabidopsis seed. Proc Natl Acad Sci USA 110(5):E435–E444.

3. Schwanhäusser B, et al. (2011) Global quantification of mammalian gene expression control. Nature 473(7347):337–342.

4. Vogel C, et al. (2010) Sequence signatures and mRNA concentration can explain two-thirds of protein abundance variation in a human cell line. Mol Syst Biol 6:400.

5. Ghaemmaghami S, et al. (2003) Global analysis of protein expression in yeast. Nature 425(6959):737–741.

6. Taniguchi Y, et al. (2010) Quantifying E. coli proteome and transcriptome with single-molecule sensitivity in single cells. Science 329(5991):533–538.

7. Vogel C, Marcotte EM (2012) Insights into the regulation of protein abundance from proteomic and transcriptomic analyses. Nat Rev Genet 13(4):227–232.

8. Petricka JJ, et al. (2012) The protein expression landscape of the Arabidopsis root. Proc Natl Acad Sci USA 109(18):6811–6818.

9. Scanlon MJ, Takacs E (2009) Kernel biology. Handbook of Maize: Its Biology, eds Bennetzen J, Hake S (Springer, New York), pp 121–143.

10. Sabelli PA, Larkins BA (2009) The development of endosperm in grasses. Plant Physiol 149(1):14–26.


11. Consonni G, Gavazzi G, Dolfini S (2005) Genetic analysis as a tool to investigate the molecular mechanisms underlying seed development in maize. Ann Bot (Lond) 96(3):353–362.

12. Pirona R, Hartings H, Lauria M, Rossi V, Motto M (2005) Genetic control of endosperm development and of storage products accumulation in maize seeds. Maydica 50(3-4):515–530.

13. Val LD, Schwartz SH, Kerns MR, Deikman J (2009) Development of a high oil trait for maize. Biotechnology in Agriculture and Forestry, Biotechnology in Agriculture and Forestry, eds Kriz AL, Larkins BA (Springer, Berlin), Vol 63, pp 303–323.

14. Schnable PS, et al. (2009) The B73 maize genome: Complexity, diversity, and dynamics. Science 326(5956):1112–1115.

15. Huttlin EL, et al. (2010) A tissue-specific atlas of mouse protein phosphorylation and expression. Cell 143(7):1174–1189.

16. Liu H, Sadygov RG, Yates JR, 3rd (2004) A model for random sampling and estimation of relative protein abundance in shotgun proteomics. Anal Chem 76(14):4193–4201.

17. Shen B, et al. (2010) Expression of ZmLEC1 and ZmWRI1 increases seed oil production in maize. Plant Physiol 153(3):980–987.

18. Bowman VB, Huang V, Huang AH (1988) Expression of lipid body protein gene during maize seed development. Spatial, temporal, and hormonal regulation. J Biol Chem 263(3):1476–1481.

19. Halford NG, Shewry PR (2007) The structure and expression of cereal storage protein genes. Endosperm, Plant Cell Monographs, ed Olsen O-A (Springer, Berlin), Vol 8, pp 195–218.

20. Huang AHC (1992) Oil bodies and oleosins in seeds. Annu Rev Plant Physiol Plant Mol Biol 43(1):177–200.

21. Serna A, et al. (2001) Maize endosperm secretes a novel antifungal protein into adjacent maternal tissue. Plant J 25(6):687–698.

22. Reyes FC, et al. (2011) Delivery of prolamins to the protein storage vacuole in maize aleurone cells. Plant Cell 23(2):769–784.

23. Thimm O, et al. (2004) MAPMAN: a user-driven tool to display genomics data sets onto diagrams of metabolic pathways and other biological processes. Plant J 37(6):914–939.

24. Sekhon RS, et al. (2011) Genome-wide atlas of transcription during maize development. Plant J 66(4):553–563.

25. Khan S, Rowe SC, Harmon FG (2010) Coordination of the maize transcriptome by a conserved circadian clock. BMC Plant Biol 10(1):126.

26. Adams JA (2003) Activation loop phosphorylation and catalysis in protein kinases: Is there functional evidence for the autoinhibitor model? Biochemistry 42(3):601–607.

27. Popescu SC, et al. (2009) MAPK target networks in Arabidopsis thaliana revealed using functional protein microarrays. Genes Dev 23(1):80–92.

28. Pysh LD, Aukerman MJ, Schmidt RJ (1993) OHP1: A maize basic domain/leucine zipper protein that interacts with opaque2. Plant Cell 5(2):227–236.

29. Zheng P, et al. (2008) A phenylalanine in DGAT is a key determinant of oil content and composition in maize. Nat Genet 40(3):367–372.

30. Comparot-Moss S, Denyer K (2009) The evolution of the starch biosynthetic pathway in cereals and other grasses. J Exp Bot 60(9):2481–2492.

31. Pebay-Peyroula E, et al. (2003) Structure of mitochondrial ADP/ATP carrier in complex with carboxyatractyloside. Nature 426(6962):39–44.

32. Feng J, et al. (2010) Tyrosine phosphorylation by Src within the cavity of the adenine nucleotide translocase 1 regulates ADP/ATP exchange in mitochondria. Am J Physiol Cell Physiol 298(3):C740–C748.

33. Liu F, et al. (2009) The amylose extender mutant of maize conditions novel protein-protein interactions between starch biosynthetic enzymes in amyloplasts. J Exp Bot 60(15):4423–4440.

34. Tetlow IJ, et al. (2004) Protein phosphorylation in amyloplasts regulates starch branching enzyme activity and protein-protein interactions. Plant Cell 16(3):694–708.

35. Petricka JJ, Benfey PN (2011) Reconstructing regulatory network transitions. Trends Cell Biol 21(8):442–451.

36. Bily AC, et al. (2003) Dehydrodimers of ferulic acid in maize grain pericarp and aleurone: Resistance factors to Fusarium graminearum. Phytopathology 93(6):712–719.

37. de O Buanafina MM (2009) Feruloylation in grasses: Current and future perspectives. Mol Plant 2(5):861–872.

38. Browse J (2009) Jasmonate passes muster: A receptor and targets for the defense hormone. Annu Rev Plant Biol 60(1):183–205.

39. Kaufmann K, Pajoro A, Angenent GC (2010) Regulation of transcription in plants: Mechanisms controlling developmental switches. Nat Rev Genet 11(12):830–842.

40. Moreno-Risueno MA, Van Norman JM, Benfey PN (2012) Transcriptional switches direct plant organ formation and patterning. Curr Top Dev Biol 98:229–257.

41. Ciceri P, et al. (1997) Phosphorylation of Opaque2 changes diurnally and impacts its DNA binding activity. Plant Cell 9(1):97–108.

42. Vert G, Walcher CL, Chory J, Nemhauser JL (2008) Integration of auxin and brassinosteroid pathways by Auxin Response Factor 2. Proc Natl Acad Sci USA 105(28):9829–9834.

43. Baerenfaller K, et al. (2012) Systems-based analysis of Arabidopsis leaf growth reveals adaptation to water deficit. Mol Syst Biol 8:606.

44. Vernoud V, Hajduch M, Khaled A-S, Depege N, Rogowsky PM (2005) Maize embryogenesis. Maydica 50(3-4):469–483.

45. Yin L-L, Xue H-W (2012) The MADS29 transcription factor regulates the degradation of the nucellus and the nucellar projection during rice seed development. Plant Cell 24(3):1049–1065.

46. Schnable JC, Freeling M (2011) Genes identified by visible mutant phenotypes show increased bias toward one of two subgenomes of maize. PLoS ONE 6(3):e17855.

47. Landry CR, Levy ED, Michnick SW (2009) Weak functional constraints on phosphoproteomes. Trends Genet 25(5):193–197.

48. Qiao H, et al. (2012) Processing and subcellular trafficking of ER-tethered EIN2 control response to ethylene gas. Science 338(6105):390–393.

49. Charest PG, et al. (2010) A Ras signaling complex controls the RasC-TORC2 pathway and directed cell migration. Dev Cell 18(5):737–749.

50. Castellana NE, et al. (2008) Discovery and revision of Arabidopsis genes by proteogenomics. Proc Natl Acad Sci USA 105(52):21034–21038.

51. Chalkley RJ, Clauser KR (2012) Modification site localization scoring: Strategies and performance. Mol Cell Proteomics 11(5):3–14.
