Molecular Biology Volume 47 Issue 6 2013 [Doi 10.1134%2FS0026893313060198] Zhou, C. F.; Lin, P.; Yao, X. H.; Wang, K. L.; Chang, J.; Han, X -- Selection of Reference Genes for Quantitative Real-time P
ISSN 0026-8933, Molecular Biology, 2013, Vol. 47, No. 6, pp. 836–851. © Pleiades Publishing, Inc., 2013. Published in Russian in Molekulyarnaya Biologiya, 2013, Vol. 47, No. 6, pp. 959–975.
INTRODUCTION
The analysis of gene expression has become increasingly important in biological research. Many methods are now used to study gene expression, including northern blotting, serial analysis of gene expression (SAGE), semi-quantitative PCR (semi-PCR), gene microarrays, quantitative real-time PCR (qRT-PCR), RNase protection analysis (RPA), and RNA-seq [1–3]. qRT-PCR, a PCR-based technique for analyzing mRNA expression in various biological samples, is becoming a routine tool in molecular biology. Compared with other gene expression analysis methods, it is highly sensitive and is useful even for genes with low expression levels or small expression changes [4]. It is also easy to perform and enables rapid quantification. However, normalization is required for accurate interpretation of the results. Several strategies have been proposed for normalizing qRT-PCR data, including choosing samples of a similar size, quantifying RNA relative to genomic DNA, and using an internal housekeeping or reference gene.
Normalizing to a reference gene is a simple and popular method for internally controlling for error in qRT-PCR [5]. Ideal reference genes are expected to have a stable expression level across various experimental conditions, such as plant developmental stages, tissue types and external stimuli. The most commonly used reference genes include β-actin (ACTB), 18S ribosomal RNA, translation elongation factor 1 (TEF1) and glyceraldehyde-3-phosphate dehydrogenase (GAPDH) [6–12]. However, the transcript levels of such genes are not always stable under all experimental conditions, which may lead to misinterpretation of results.
Species of the genus Camellia whose seeds can produce oil are referred to as oil-tea camellia. These species are very important woody oil trees that have been cultivated for a long time and are widely distributed in southern China [13]. The oil extracted from oil-tea camellia seeds is called tea oil and is used extensively throughout China. Tea oil is regarded as one of the healthiest edible vegetable oils in the world, with a nutritional composition superior to that of olive oil. The content of unsaturated fatty acids, such as oleic and linoleic acid, is above 90%. It is also very rich in other nutrients such as squalene, sterols, vitamins and tocopherol. Recently, improvement of high oil producing varieties
Selection of Reference Genes for Quantitative Real-Time PCR in Six Oil-Tea Camellia Based on RNA-seq¹
C. F. Zhou*, P. Lin*, X. H. Yao, K. L. Wang, J. Chang, and X. J. Han
Research Institute of Subtropical Forestry of the Chinese Academy of Forestry, Fuyang, Hangzhou, China 311400;
e-mail: [email protected]; [email protected]
Received February 28, 2013; in final form, May 28, 2013
Abstract: qRT-PCR is becoming a routine tool in molecular biology to study gene expression. It is necessary to find stable reference genes when performing qRT-PCR. The expression of genes cloned in oil-tea camellia currently cannot be accurately analyzed due to a lack of suitable reference genes. We collected different tissues (including roots, stems, leaves, flowers and seeds) from six oil-tea camellia species to determine stable reference genes. Five novel and ten traditional reference gene sequences were selected from the RNA-seq database of Camellia oleifera Abel seeds, and specific PCR primers were designed for each. Cycle threshold (Ct) data were obtained from each reaction for all samples. Three different software tools, geNorm, NormFinder and BestKeeper, were applied to calculate the expression stability of the candidate reference genes according to the Ct values. The results were similar between the three software packages, and indicated that the traditional genes TUBα-3 and ACT7α and the novel gene CESA were relatively stable in all species and tissues. However, no genes were sufficiently stable across all species and tissues; thus the optimal number of reference genes required for accurate normalization varied from 2 to 6. Finally, the relative expression of the squalene synthase (SQS) and squalene epoxidase (SQE) genes, which are related to the important ingredients squalene and tea saponin in oil-tea camellia seeds, was compared using stable and less stable reference genes. The comparison validated the selection of reference genes in the current study. In summary, different optimal numbers of suitable reference genes were found for the different tissues of the six oil-tea camellia species.
DOI: 10.1134/S0026893313060198
Keywords: reference genes, real-time PCR, oil-tea camellia
UDC 577.214.6
¹ The article is published in the original.
* Zhou C.F. and Lin P. made an equal contribution to the study, and should be regarded as joint first authors.
GENOMICS. TRANSCRIPTOMICS
MOLECULAR BIOLOGY Vol. 47 No. 6 2013
SELECTION OF REFERENCE GENES FOR QUANTITATIVE REAL-TIME PCR
has become one of the most important areas of camellia research. Selection and hybridization are the most popular and long-established breeding methods, but progress requires considerable time and resources. Therefore, improving the production and quality of tea oil using molecular approaches is likely to become especially necessary in the future. So far, the evolutionary relationships of Camellia species have been studied using molecular markers [14], an EST library has been constructed for camellia seed [15], and transcriptome sequencing of different seed developmental stages has been completed [16]. Many functional genes have been cloned, such as those encoding a dehydrin-like protein [17], metallothionein [18], FAD2 [19], SAD [20], calmodulin [21], and an aldo-keto reductase [22]. However, analyses of gene expression in different species, tissues or growth stages could not be performed because no suitable reference genes for the genus Camellia had been identified. Therefore, the identification of reliable reference genes in Camellia has become crucial for accurate functional gene expression analysis. In this study, we aimed to identify potential reference genes suitable for transcript normalization in different tissues of six camellia species. These reference genes will allow more accurate and reliable qRT-PCR normalization for gene expression studies in the genus Camellia.
EXPERIMENTAL
Plant materials and biological samples. Six economically important and widely cultivated oil-tea camellia species were selected from the Camellia Germplasm Collection Park at the Subtropical Forestry Research Institute of the Chinese Academy of Forestry. These six species were Zhejiang (C. chekiangoleosa Hu), Guangning (C. semiserrata C.W. Chi), Xiaoguo (C. meiocarpa Hu), Wantian (C. polyodonta F.C. How), Youxian (C. yuhsienensis Hu) and Putong (C. oleifera C. Abel). Five different tissues (roots, stems, leaves, flowers and seeds) were collected from six adult trees of each species. Ten samples of each tissue were picked from those 36 trees. Vegetative tissue samples (roots, leaves and stems) were taken from young tissues in March. Flowers were harvested at full bloom; the flowers of Putong and Xiaoguo were harvested in November, and the flowers of the other species were harvested in March of the following year. Fruits at the fast growth stage were collected in July and peeled immediately to extract the seeds. All samples were harvested at about 10:00 in the morning and cut into small pieces. For each species, tissue samples from the six trees were mixed together. Plant materials were frozen in liquid nitrogen immediately after harvest and stored at –80°C until total RNA was isolated.
RNA extraction and cDNA synthesis. The five mixed tissue samples from each species were taken from the –80°C freezer and ground into fine powder in a mortar with liquid nitrogen. About 100 mg of the powder was used for RNA extraction. Total RNA from all samples was prepared using the RN38 EASYspin plus Plant RNA kit (Aidlab Biotech, Beijing, China) according to the manufacturer's instructions. A Nanodrop 2000 microvolume spectrophotometer was used to determine the RNA concentration and quality (Table 1). Only RNA samples with a 260/280 ratio of 1.8–2.1 and a 260/230 ratio between 0.1 and 1 were used for cDNA synthesis; several samples with 260/230 values greater than 1 were diluted. cDNA synthesis was performed with 3–8 μL of total RNA (the final amount of RNA in the reaction mixture was adjusted to about 1 μg for all samples) according to the protocol of the SuperScript™ First-Strand Synthesis System (Invitrogen, Carlsbad, CA, USA) in a total volume of 20 μL. Finally, the cDNA was diluted 1 : 15 with nuclease-free water for qRT-PCR [23].
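The adjustment of the RNA input to about 1 μg can be illustrated with a short calculation. This is only a sketch: the function name and the default target of 1000 ng are assumptions for illustration, and the concentration used is the Youxian roots value from Table 1.

```python
def volume_for_mass_ul(conc_ng_per_ul: float, target_ng: float = 1000.0) -> float:
    """Volume (in µL) of RNA stock that contains target_ng of RNA."""
    if conc_ng_per_ul <= 0:
        raise ValueError("concentration must be positive")
    return target_ng / conc_ng_per_ul


# Youxian roots in Table 1: 136.80 ng/µL
youxian_roots_volume = volume_for_mass_ul(136.80)
print(round(youxian_roots_volume, 2))  # 7.31
```

A stock at 136.80 ng/μL therefore needs roughly 7.3 μL to supply 1 μg, which falls inside the 3–8 μL range quoted above.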
Selection of candidate reference genes based on RNA-Seq and primer design. The transcriptomes of Putong oil-tea camellia seeds collected from July to October were sequenced using Solexa technology. The total number of reads, total nucleotides, Q20 percentage, gap percentage, GC percentage, and numbers of contigs, scaffolds and unigenes in each month are listed in Table 2. On average, 8310777 reads were obtained per month from July to October, and the average gap percentage was only 0.04%. Transcriptome assembly was carried out with the short-read assembly software SOAPdenovo [24]; the detailed assembly steps are presented in Fig. 1. The assembly yielded 43461, 47932, 35589 and 30022 unigenes for July, August, September and October, respectively (Table 2). After repeated unigenes were removed, a total of 80310 unigenes were detected in this RNA-Seq data set of Putong oil-tea camellia seeds. Of these, 21789 unigenes received protein function annotations. The unigenes with function annotations can be classified into twenty-four functional categories and account for 27.13% of all unigenes. All unigenes were queried against the KEGG pathway database, and 42638 unigenes were given pathway annotations related to 265 pathways, including metabolism, genetic information processing, cell metabolism pathways and so on.
RPKM values (Reads per kb per Million reads) [25] of unigenes were used to analyze gene expression in the RNA-Seq database. Genes whose RPKM values fluctuate least among growth periods or tissues are likely to be the most stably expressed. The coefficient of variation (CV) of the RPKM values from July to October was used to quantify fluctuations in gene expression. The CV was calculated as the standard deviation divided by the mean. All unigenes corresponding to two dozen traditional reference genes and several reported novel reference genes were retrieved from the RNA-Seq database. Fifteen unigenes with the least variation in RPKM values among the four seed developmental stages were selected (Table 3), and
included nine traditional reference genes and two reported novel reference genes. The nine traditional genes were Actin 7α (ACT7α), Cyclophilin B (CYPB), Tubulin α3 (TUBα-3), TATA box binding protein component of TFIID and TFIIIB (TBP), Cyclin-D4-2 (CYCD4-2), Translation elongation factor 1α (TEF1-α), phosphoglycerate kinase (PGK), Glyceraldehyde-3-phosphate dehydrogenase (GAPDH) and RNA polymerase II accessory factor (RNAPII). The two reported novel reference genes were clathrin coat associated protein AP-2 complex subunit (CAC) and Coatomer protein subunit ε1 (COPε1), which have also been chosen as novel stable reference genes in Arabidopsis, tomato and Cupressaceae [26, 27]. In addition to these reported reference genes, four unigenes with minimal coefficients of variation among all unigenes were chosen as candidate reference genes in this study: cellulose synthetase A (CESA), Glyoxylate reductase (GARY), nuclear protein localization protein 4 (NPL4) and vacuolar-sorting receptor 3 (VSR3). Primers were designed with Primer3 (v0.4.0; http://frodo.wi.mit.edu/primer3/) based on the conserved sequences, with the following parameters: optimal length 20–22 nucleotides, melting temperature 60–65°C, and product size 160–220 base pairs. We then used OligoAnalyser to confirm that no hairpins or dimers could form.
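The ranking of unigenes by coefficient of variation can be sketched as follows. This is a minimal illustration, not the authors' script: the function name is hypothetical, the sample standard deviation is an assumption, and the monthly RPKM values for NPL4 and SQS are as read from Table 3. The absolute CV values need not match the CV column of Table 3, which may be scaled differently; only the ordering is of interest here.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """CV = sample standard deviation divided by the mean."""
    m = mean(values)
    if m == 0:
        raise ValueError("mean is zero; CV is undefined")
    return stdev(values) / m


# Monthly RPKM values (July, August, September, October) as read from Table 3
rpkm = {
    "NPL4": [65.71, 66.14, 67.88, 66.83],
    "SQS": [83.43, 144.69, 54.37, 29.66],
}

# Lower CV means less fluctuation across the four seed developmental stages
ranked = sorted(rpkm, key=lambda gene: coefficient_of_variation(rpkm[gene]))
for gene in ranked:
    print(gene, round(coefficient_of_variation(rpkm[gene]), 3))
```

NPL4 ranks ahead of SQS, consistent with NPL4 being a reference candidate and SQS a target gene whose expression changes during seed development.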
qRT-PCR conditions. qRT-PCR reactions were performed in fast optical 0.1 mL 96-well plates using an ABI 7300 Real Time PCR System (Applied Biosystems,
Table 1. RNA concentration and quality detected by Nanodrop 2000
Species Tissues Nucleic acid conc., ng/µL A260 A280 260/280 260/230
Zhejiang
Roots 208.60 5.21 2.91 1.79 0.73
Stems 308.30 7.70 3.56 2.16 1.46
Leaves 266.50 3.40 1.60 2.13 0.43
Flowers 326.90 8.10 3.95 2.12 0.62
Seeds 496.90 12.42 5.87 2.12 1.14
Guangning
Roots 107.70 2.64 1.34 1.45 0.47
Stems 225.60 5.64 2.83 2.00 1.26
Leaves 647.60 11.36 5.78 1.97 1.24
Flowers 267.30 9.60 4.53 2.12 0.75
Seeds 319.90 8.00 3.82 2.09 1.12
Wantian
Roots 170.60 4.25 2.20 1.93 1.09
Stems 122.80 3.07 1.51 2.03 0.74
Leaves 289.90 4.30 2.06 1.11 0.41
Flowers 384.10 9.60 4.53 2.12 1.06
Seeds 711.20 17.78 8.25 2.15 1.59
Youxian
Roots 136.80 3.42 1.77 1.94 1.08
Stems 242.50 6.06 2.86 2.12 1.4
Leaves 334.90 8.06 3.91 2.06 1.31
Flowers 174.20 4.17 2.16 1.93 0.59
Seeds 590.40 14.77 7.49 1.97 1.21
Putong
Roots 448.10 9.76 4.99 1.96 0.67
Stems 446.80 12.17 6.28 1.78 0.82
Leaves 254.30 5.73 2.84 2.02 1.02
Flowers 158.60 3.97 1.99 1.99 0.31
Seeds 493.00 12.33 5.82 2.12 0.81
Xiaoguo
Roots 378.90 8.69 4.18 2.08 1.01
Stems 410.20 11.63 5.77 2.02 1.21
Leaves 401.50 10.24 5.38 1.90 0.73
Flowers 132.60 3.57 1.84 1.94 0.68
Seeds 276.40 5.89 3.07 1.92 0.92
Foster City, CA, USA). The qRT-PCR reaction volume was 20 μL, containing 10 μL 2× SYBR® Premix Ex Taq™ (Takara, Tokyo, Japan), 0.4 μL 50× ROX reference dye, 2 μL of 15-fold diluted synthesized cDNA, 0.4 μL 10 μM forward primer, 0.4 μL 10 μM reverse primer, and 6.8 μL sterile distilled water. PCR was then performed following the operation manual (30 s at 95°C; 40 cycles of 95°C for 5 s and 60°C for 34 s). After thermal cycling, melting curve analysis (60–95°C, fluorescence read once every 0.3°C) was performed to verify the specificity of the amplicons. A negative PCR control lacking the cDNA template was run for each primer pair. The threshold cycle (Ct) values were calculated as the mean of four technical replicates for each sample and each candidate reference gene primer pair.
Analysis of gene stability. qRT-PCR was performed for each primer pair on a series of 10-fold dilutions ($10^{-1}$, $10^{-2}$, $10^{-3}$, $10^{-4}$, $10^{-5}$) of the mixed cDNA template to obtain Ct values. The corresponding PCR amplification efficiencies ($E$) were calculated according to the equation $E = (10^{-1/\mathrm{slope}} - 1) \times 100$ [28]; the correlation coefficients ($R$) and slope values were calculated from the standard curve of Ct values for each gene using Excel.
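The efficiency calculation can be sketched in a few lines. This is an illustration under stated assumptions: the Ct values below are hypothetical (constructed to give a slope of exactly -3.32), the helper names are invented for the example, and an ordinary least-squares fit stands in for the Excel standard curve.

```python
def fit_slope_intercept(x, y):
    """Ordinary least-squares line through the points (x, y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def amplification_efficiency(slope):
    """E (%) = (10^(-1/slope) - 1) * 100, as in the equation above."""
    return (10.0 ** (-1.0 / slope) - 1.0) * 100.0


# log10 of the dilution factor for the 10-fold series 10^-1 ... 10^-5
log_dilution = [-1, -2, -3, -4, -5]
# Hypothetical Ct values, not data from this study
ct_values = [18.00, 21.32, 24.64, 27.96, 31.28]

slope, intercept = fit_slope_intercept(log_dilution, ct_values)
efficiency = amplification_efficiency(slope)
print(round(slope, 2), round(efficiency, 1))
```

A slope near -3.32 corresponds to an efficiency close to 100%, that is, a doubling of product per cycle.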
To select suitable reference genes, the expression stability of every candidate reference gene was analyzed with three different Microsoft Excel-based software tools: geNorm, NormFinder, and BestKeeper. The raw Ct values were transformed into the required data input format for geNorm and NormFinder. The maximum expression level (the lowest Ct value) of each gene was set to a value of 1. Relative expression
Table 2. Description of Camellia oleifera C. Abel RNA-seq analysis
Sample      Total reads   Total nucleotides, nt   Q20, %   Gap, %   GC, %   Contig   Scaffold   Unigenes
July        8320068       1248010200              97.02    0.03     47.55   43216    43410      43461
August      8397376       1259606400              97.11    0.02     46.91   47568    47798      47932
September   8385809       1257871350              94.52    0.06     49.03   33827    35543      35589
October     8139855       1220978250              91.20    0.04     48.76   28817    29954      30022
Average     8310777       1246616550              94.96    0.04     48.06   38357    39176      39251
[Figure 1: flowchart. For each sample: reads, assembly into contigs, mapping of reads to contigs, assembly of contigs into scaffolds, gap filling to give unigenes; unigenes from the different samples are then combined by long sequence clustering.]
Fig. 1. Detailed assembling steps of Putong oil-camellia seeds RNA-seq from July to October.
Table 3. Expression and coefficient variation of each gene from July to October based on RNA-Seq database
                               July             August           September        October          Average   Standard deviation          Rank
Unigene ID         Gene name   Reads   RPKM     Reads   RPKM     Reads   RPKM     Reads   RPKM     of RPKM   of RPKM              CV      of CV
Unigene17158_All   ACT7α       258     21.17    269     22.71    223     23.17    174     17.37    84.42     2.63                 0.031   8
Unigene20799_All   CYPB        119     13.44    107     12.43    106     15.16    111     15.26    56.29     1.37                 0.024   5
Unigene10169_All   TUBα-3      273     116.47   283     124.17   183     98.83    172     89.27    428.73    15.97                0.037   9
Unigene13817_All   TBP         369     28.76    534     42.81    305     30.10    328     31.10    132.78    6.48                 0.049   12
Unigene57574_All   CYCD4-2     59      4.70     82      6.72     45      4.54     43      4.17     20.14     1.15                 0.057   14
Unigene36161_All   TEF1-α      978     244.56   976     251.00   774     245.02   990     301.17   1041.74   27.31                0.026   6
Unigene9686_All    PGK         449     256.22   397     232.99   246     177.71   403     279.77   946.69    43.70                0.046   10
Unigene11231_All   GAPDH       327     84.13    214     56.63    285     92.83    274     85.76    319.35    15.93                0.050   13
Unigene6103_All    RNAPII      784     64.62    598     50.69    523     54.57    399     40.01    209.90    10.17                0.048   11
Unigene5695_All    CESA        1604    100.84   2100    135.78   993     79.03    1247    95.37    411.02    23.89                0.058   15
Unigene24263_All   GARY        17      1.52     16      1.47     13      1.47     14      1.53     6.00      0.03                 0.005   2
Unigene14532_All   NPL4        750     65.71    734     66.14    612     67.88    627     66.83    266.57    0.95                 0.004   1
Unigene5330_All    VSR3        317     120.94   325     127.52   250     120.74   265     123.00   492.20    3.15                 0.006   3
Unigene12278_All   CAC         55      4.99     42      3.92     35      4.02     42      4.64     17.57     0.51                 0.029   7
Unigene5501_All    COPε1       1148    154.22   937     129.45   843     143.36   777     126.98   554.02    12.72                0.023   4
Unigene63559_All   SQS         373     83.43    629     144.69   192     54.37    109     29.66    312.14    49.57                0.159   16
Unigene4773_All    SQE         405     208.80   1532    812.31   329     464.05   711     206.35   1691.52   286.39               0.169   17
levels were then calculated from the Ct values using the formula $2^{-\Delta Ct}$, in which $\Delta Ct$ is the Ct value of each sample minus the lowest Ct value for the same gene. Based on the geNorm and NormFinder analyses, the ten most suitable genes were chosen for further analysis with BestKeeper, using untransformed Ct values and amplification efficiencies.
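The transformation of raw Ct values into the relative quantities expected by geNorm and NormFinder can be sketched as below. The function name and the Ct values are hypothetical, and the base of 2 assumes 100% amplification efficiency, as the formula above does.

```python
def relative_quantities(ct_values):
    """Convert one gene's Ct values to relative quantities via 2^(-dCt).

    dCt is each Ct minus the lowest Ct of the gene, so the sample with
    the highest expression (lowest Ct) gets a relative quantity of 1.
    """
    lowest_ct = min(ct_values)
    return [2.0 ** -(ct - lowest_ct) for ct in ct_values]


# Hypothetical Ct values of one gene in three samples
example_ct = [20.0, 21.0, 22.5]
example_rq = relative_quantities(example_ct)
print([round(q, 4) for q in example_rq])  # [1.0, 0.5, 0.1768]
```

Each additional cycle halves the relative quantity, so a sample that crosses the threshold one cycle later is scored at 0.5.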
Reference gene validation. Squalene synthase (SQS) [29, 30] and squalene epoxidase (SQE) [31] are enzymes that catalyze the synthesis of squalene, sterols and terpenoids, which are extremely important ingredients in oil-tea camellia seeds and play a crucial role in biological growth and metabolism. According to the RNA-Seq database, the expression of SQS and SQE changed considerably during the different seed developmental stages (Table 3). Therefore, SQS and SQE were used as target genes, and their expression values in the species and tissues of interest were compared to test the validity of the reference genes identified by the three software tools in the current study.
RESULTS
Characters of Candidate Reference Genes
Fifteen unigenes were chosen as candidate reference genes because their RPKM values fluctuated little across the seed developmental stages. The raw reads and RPKM values of each month, together with the average, standard deviation and coefficient of variation of the RPKM values, are listed in Table 3. Nine of them were traditional reference genes; the others were not commonly used. The RPKM values differed among the candidate reference genes. The sequences of the unigenes were then aligned using BLASTn at NCBI; this alignment revealed that the sequences were highly similar to each particular gene, and the conserved regions were used for primer design. The gene names, descriptions, primer sequences, amplicon lengths, amplification efficiencies, Tm values and correlation coefficients are listed in Table 4. The Tm values varied from 79.9°C (GARY) to 85.9°C (CYPB), and the amplicon lengths were about 200 bp. The amplification efficiencies were between 89.4% (COPε1) and 101.8% (PGK), and the correlation coefficients were all larger than 0.99. Agarose gel electrophoresis (2%) showed unique amplicons of the expected length without primer dimers. The melting curve analysis showed a single peak for each gene, and no non-specific products were detected (Fig. 2).
Comparison of Candidate Reference Gene Expression Stability
Real-time qRT-PCR assays of the fifteen candidate reference genes were performed using the newly designed primer pairs, and cycle threshold (Ct) data were obtained for all samples. The Ct values revealed differences in transcript levels between the various candidate genes (Fig. 3). The Ct values of the fifteen genes ranged from 19 to 23 cycles. Two genes (PGK and TEF1-α) showed the lowest Ct values (mean Ct = 18.3 and 17.8), indicating that they had the most abundant transcript levels. The gene GARY had the lowest transcript abundance (mean Ct = 22.9). Three software tools, geNorm, NormFinder and BestKeeper, were used to calculate the expression stability of the candidate reference genes.
GeNorm Analysis
GeNorm is a Visual Basic application tool for Microsoft Excel that operates on the assumption that the expression ratio of two ideal reference genes is constant throughout the different groups of templates. Gene expression stability values (M) are calculated for all genes, and genes with an M value below the threshold of 1.5 are considered stable [32]. In this analysis, most genes in the different species and tissues had M values less than 1.5, indicating that these genes were stable according to geNorm. When the results of all 30 samples were combined, CESA and TUBα-3 had the highest expression stability (the lowest M value), GAPDH was the least stable, and the other twelve genes varied between the two extremes (Fig. 4). The results changed when the samples were classified by species and tissue (Fig. 4): CAC and PGK had the highest expression stability in Zhejiang, TUBα-3 and RNAPII were the most stable in Guangning, CESA and TUBα-3 were the most stable in both Xiaoguo and Putong, CESA and ACT7α had the highest expression stability in Wantian, and CAC and COPε1 were the most stable in Youxian. GAPDH was the least stable gene in Zhejiang, Guangning and Wantian, while the least stable genes in Xiaoguo, Youxian and Putong were GARY, PGK, and COPε1, respectively. Furthermore, the reference gene expression stability also varied among the five tissue types. PGK and GAPDH had the lowest M value (the highest expression stability) in both roots and stems, CAC and COPε1 had the highest stability in leaves, CESA and TUBα-3 were the most stable in flowers, and PGK and TBP were the most stable in seeds. GAPDH was the least stable gene in both leaves and flowers, and CYPB, TEF1-α and GARY were the least stable genes in roots, stems and seeds, respectively.
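The idea behind the M value can be sketched as follows. This is a simplified illustration of the pairwise-variation principle, not the geNorm program itself: it assumes M for a gene is the mean, over all other genes, of the standard deviation of the log2 expression ratio across samples, and it omits the stepwise exclusion of the least stable gene. The function name and the relative quantities are hypothetical.

```python
import math
from statistics import stdev


def genorm_m(rel_quant):
    """Return {gene: M}; rel_quant maps gene -> relative quantities per sample."""
    genes = list(rel_quant)
    m_values = {}
    for j in genes:
        pairwise_sds = []
        for k in genes:
            if k == j:
                continue
            # log2 expression ratio of gene j to gene k in every sample
            log_ratios = [math.log2(a / b)
                          for a, b in zip(rel_quant[j], rel_quant[k])]
            pairwise_sds.append(stdev(log_ratios))
        m_values[j] = sum(pairwise_sds) / len(pairwise_sds)
    return m_values


# Hypothetical relative quantities for three genes in four samples
example = {
    "geneA": [1.0, 0.5, 0.25, 0.5],
    "geneB": [0.8, 0.4, 0.2, 0.4],   # constant ratio to geneA
    "geneC": [1.0, 1.0, 0.1, 0.6],   # varies independently of the others
}
m_example = genorm_m(example)
for gene, m in sorted(m_example.items(), key=lambda item: item[1]):
    print(gene, round(m, 3))
```

geneA and geneB keep a constant ratio to each other and so share the lowest M, while geneC, whose ratio to the others changes between samples, receives the highest M.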
Evaluation of the optimal number of reference genes required for accurate normalization. The pairwise variation ($V_n/V_{n+1}$) between the consecutively ranked normalization factors $NF_n$ and $NF_{n+1}$ was calculated using geNorm to determine the number of genes required for reliable normalization. It has been suggested that if the pairwise variation is below the threshold value of 0.15, then there is no need for an additional internal control gene [32]. In the current research, six genes were needed when all 30 samples were analyzed together. However, different species
[Figure 2: agarose gel image (lanes ACT7α, CYPB, TUBα-3, TBP, CYCD4-2, TEF1-α, PGK, GAPDH, RNAPII, CESA, COPε1, NPL4, VSR3, GARY, CAC, SQS, SQE and marker M; 200 bp and 500 bp bands marked) and one melting-curve panel per gene (x-axis: T, °C; y-axis: derivative).]
Fig. 2. Specificity and melting curves of candidate genes. 2% agarose gel electrophoresis indicated unique amplicons of the expected length and no primer dimers. The melting curve analysis showed a single peak for each gene, and no non-specific products were detected.
[Figure 3: box plot. x-axis: candidate reference genes (1–15); y-axis: Ct (12–30).]
Fig. 3. Ct values of each reference gene. Expression data are displayed as Ct values for each reference gene in all samples, numbered 1 to 15: 1, ACT7α; 2, CYPB; 3, TUBα-3; 4, TBP; 5, CESA; 6, CYCD4-2; 7, TEF1-α; 8, PGK; 9, GAPDH; 10, RNAPII; 11, GARY; 12, NPL4; 13, VSR3; 14, CAC; and 15, COPε1. The line across each box is the median. The box indicates the 25th and 75th percentiles, whisker caps represent the maximum and minimum values, and dots represent outliers.
needed different numbers of genes: Zhejiang, Youxian, Xiaoguo and Putong needed two genes, Wantian needed three genes, and Guangning needed four genes. The optimal number of reference genes also varied among the tissues. Two genes were needed in leaves and flowers, three in stems and seeds, and five in roots (Fig. 5).
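The pairwise-variation criterion can be sketched as follows. This is an illustration under assumptions rather than the geNorm implementation: $NF_n$ is taken as the geometric mean of the relative quantities of the $n$ best-ranked genes in each sample, and $V_n/V_{n+1}$ as the standard deviation of $\log_2(NF_n/NF_{n+1})$ across samples. The function names, the gene ranking and the data are hypothetical.

```python
import math
from statistics import stdev


def geometric_mean(values):
    return math.exp(sum(math.log(v) for v in values) / len(values))


def normalization_factors(rel_quant, ranked_genes, n):
    """NF_n per sample: geometric mean of the n best-ranked genes."""
    n_samples = len(rel_quant[ranked_genes[0]])
    return [geometric_mean([rel_quant[g][s] for g in ranked_genes[:n]])
            for s in range(n_samples)]


def pairwise_variation(rel_quant, ranked_genes, n):
    """V_n/V_(n+1): SD across samples of log2(NF_n / NF_(n+1))."""
    nf_n = normalization_factors(rel_quant, ranked_genes, n)
    nf_next = normalization_factors(rel_quant, ranked_genes, n + 1)
    return stdev([math.log2(a / b) for a, b in zip(nf_n, nf_next)])


def optimal_number(rel_quant, ranked_genes, threshold=0.15):
    """Smallest n >= 2 for which adding one more gene changes NF little."""
    for n in range(2, len(ranked_genes)):
        if pairwise_variation(rel_quant, ranked_genes, n) < threshold:
            return n
    return len(ranked_genes)


# Hypothetical relative quantities, genes listed from most to least stable
ranked = ["g1", "g2", "g3", "g4"]
data = {
    "g1": [1.0, 0.5, 0.25, 0.5],
    "g2": [0.8, 0.4, 0.2, 0.4],
    "g3": [1.0, 0.52, 0.24, 0.5],
    "g4": [1.0, 1.0, 0.1, 0.6],
}
n_opt = optimal_number(data, ranked)
print(n_opt)
```

In this toy data set two genes suffice because adding the third barely changes the normalization factor, whereas adding the unstable fourth gene would push the pairwise variation above 0.15.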
NormFinder Analysis
NormFinder is another Excel application that uses a model-based approach to identify the most stable reference genes by combining samples into groups. More stably expressed genes should show lower average expression stability values (M values). The stability value of each gene was calculated by NormFinder (Table 5), and the results indicated that TUBα-3 and CESA were the most appropriate reference genes over the 30 samples. The least stable genes were GAPDH and GARY. The most stable and least stable reference genes differed among species and tissues. PGK was the most stable in Zhejiang, GARY in Guangning, TEF1-α in Wantian, RNAPII in Youxian, TUBα-3 in Putong, and CESA in Xiaoguo. The least stable gene in Guangning, Zhejiang and Wantian was GAPDH, while in Youxian, Putong and Xiaoguo the least stable genes were PGK, TBP and GARY, respectively. For the tissues, TUBα-3 was the most stable in roots, PGK in stems, COPε1 in leaves, CAC in flowers, and RNAPII in seeds. The least stable gene in leaves and flowers was GAPDH, while in roots, stems and seeds it was COPε1, TBP and GARY, respectively.
BestKeeper Analysis
BestKeeper, also an Excel-based tool, estimates the inter-gene relationships between possible reference gene pairs by performing numerous pairwise correlation analyses using the raw Ct values of each gene. Most importantly, all genes may be included in the calculation of the BestKeeper index, which is the geometric mean of the Ct values of all candidate reference genes and can be used to rank the reference genes, because stable reference genes show a strong correlation with the BestKeeper index. BestKeeper also calculates the coefficient of variation (CV) and the standard deviation (SD) of the Ct values using the whole data set, which includes all Ct values. The most stable reference genes are those that exhibit the lowest coefficient of variation and standard deviation (CV ± SD). Genes with SD values greater than 1 are considered unacceptable [33, 34].
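The quantities described above can be sketched in plain Python. This is not the BestKeeper spreadsheet: the ordinary sample standard deviation is used here as an assumption (the original tool may compute its SD differently), and the function names and Ct values are hypothetical.

```python
import math
from statistics import mean, stdev


def geometric_mean(values):
    return math.exp(sum(math.log(v) for v in values) / len(values))


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def bestkeeper_summary(ct):
    """ct maps gene -> raw Ct values per sample; returns (index, per-gene stats)."""
    genes = list(ct)
    n_samples = len(ct[genes[0]])
    # Index per sample: geometric mean of the Ct values of all candidate genes
    index = [geometric_mean([ct[g][s] for g in genes]) for s in range(n_samples)]
    summary = {}
    for g in genes:
        sd = stdev(ct[g])
        summary[g] = {
            "sd": sd,
            "cv_percent": 100.0 * sd / mean(ct[g]),
            "r_with_index": pearson_r(ct[g], index),
            "acceptable": sd <= 1.0,  # SD greater than 1 is treated as unacceptable
        }
    return index, summary


# Hypothetical raw Ct values of two genes in four samples
ct_example = {
    "geneA": [20.0, 20.5, 21.0, 20.2],
    "geneB": [18.0, 21.0, 24.0, 19.0],
}
bk_index, bk_summary = bestkeeper_summary(ct_example)
for gene, stats in bk_summary.items():
    print(gene, round(stats["sd"], 2), round(stats["cv_percent"], 2), stats["acceptable"])
```

Here geneA passes the SD criterion and geneB does not, which mirrors how candidate genes with large Ct spread are excluded.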
According to the geNorm and NormFinder analyses, the five least stable genes (TEF1-α, VSR3, CYCD4-2, GARY and GAPDH) were discarded. The remaining ten candidate genes were ranked according to the CV ± SD for each species and tissue (Table 6). The analysis revealed that all ten genes were acceptable in the Xiaoguo and Putong species and in seed tissue. Several genes (NPL4, PGK, COPε1, CAC, RNAPII) were not
[Figure 4: bar charts, panels (a)–(l). y-axis: average expression stability, M; x-axis: candidate reference genes ordered from least stable (left) to most stable (right).]
Fig. 4. Average expression stability values (M) calculated by geNorm for each species and tissue (a lower M value indicates more stable expression). (a) Zhejiang oil-tea camellia; (b) Guangning oil-tea camellia; (c) Wantian oil-tea camellia; (d) Youxian oil-tea camellia; (e) Xiaoguo oil-tea camellia; (f) Putong oil-tea camellia; (g) roots; (h) stems; (i) leaves; (j) flowers; (k) seeds; (l) all samples.
acceptable in some species and tissues because their SD values were greater than 1. ACT7α, TUBα-3 and CESA were the three most stable genes when all samples were combined. These three genes were also stable among different species and tissues, and varied only in their rank positions. NPL4, PGK, COPε1 and CAC exhibited high SD values in most species and tissues, indicating that they were the least stable reference genes; when all tissues and species were analyzed together, these four genes were not acceptable as reference genes. The results of the BestKeeper analysis differed only slightly from those obtained with geNorm and NormFinder.
Reference Gene Validation
The expression values of SQS and SQE were compared among the 5 different tissues of Putong camellia and among seeds of the 6 different oil-tea camellia species, using the most stable and least stable candidate reference genes as internal controls. For seeds of the different oil-tea camellia species, CESA, TUBα-3 and PGK were the most stable reference genes and GARY was the least stable one; for the different tissues of Putong oil-tea camellia, TUBα-3 and CESA were the most stable and COPε1 was the least stable. The results showed that relative expression trends were mainly the same between the stable reference genes, but there were
Fig. 5. Optimal number of reference genes in each species and tissue. Pairwise variation (Vn/Vn+1) between the normalization factors NFn and NFn+1 was calculated by geNorm to determine the number of genes required for reliable normalization. It has been suggested that if the pairwise variation is below the threshold value of 0.15, there is no need for an additional internal control gene. The results show that different numbers of reference genes are needed in different species and tissues.
considerable differences between the results obtained with stable reference genes and those obtained with unstable ones. For example, the expression levels of SQE in different tissues of Putong camellia were almost the same with TUBα-3 and CESA; with both, the relative expression level was highest in seeds, followed by flowers, roots, stems and leaves. However, the results changed when COPε1 was used as the internal reference: the relative expression in flowers was lower than in roots. A notable discrepancy in the SQS results was also observed between stable and unstable reference genes (Fig. 6). The relative expression of SQS and SQE in seeds of the six species showed an even greater difference between the stable reference genes and the unstable one. For example, the relative expression of SQS differed little among the six species with the stable references, but differed greatly with the least stable reference gene: about 150 in Xiaoguo oil-tea camellia, but greater than 7000 in Putong oil-tea camellia. Thus, the application of different reference genes resulted in different relative expression levels, and unsuitable references could cause great deviation from the actual target gene expression levels. It is therefore important to validate reference genes before experimental application.
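The effect described above can be sketched numerically. The example assumes the common 2^(-ΔCt) calculation with 100% amplification efficiency, which may not match the exact formula used for Fig. 6; the function name `relative_expression` and every Ct value are hypothetical and are not taken from the study.

```python
def relative_expression(ct_target, ct_refs):
    """Relative expression of a target gene as 2^(-dCt).

    dCt = Ct(target) - mean Ct of the reference gene(s). Averaging the
    reference Ct values is equivalent to normalizing to the geometric
    mean of the reference quantities 2^(-Ct). Assumes 100% efficiency.
    """
    delta_ct = ct_target - sum(ct_refs) / len(ct_refs)
    return 2.0 ** (-delta_ct)

# Hypothetical Ct values: the stable reference barely changes between
# tissues, while the unstable reference shifts by 3.5 cycles.
ct = {
    "seeds":   {"target": 24.0, "stable_ref": 22.0, "unstable_ref": 22.0},
    "flowers": {"target": 26.0, "stable_ref": 22.1, "unstable_ref": 25.5},
}

by_stable = {t: relative_expression(v["target"], [v["stable_ref"]]) for t, v in ct.items()}
by_unstable = {t: relative_expression(v["target"], [v["unstable_ref"]]) for t, v in ct.items()}

print("stable reference:  ", by_stable)
print("unstable reference:", by_unstable)

# Several reference genes can be combined by passing all of their Ct values
combined = relative_expression(24.0, [22.0, 23.0])
```

In this made-up data set the target appears higher in seeds than in flowers with the stable reference, and the order is reversed with the unstable one, which is the kind of discrepancy reported in the validation experiment.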
DISCUSSION
Recently, the quantification of RNA transcripts has become increasingly rapid and precise because of advances in gene quantification strategies. To remove sampling differences and identify real gene-specific variation, especially when studying samples with subtle gene expression differences, normalization is necessary for accurate gene expression quantification by qRT-PCR. Normalizing to a reference gene is a simple and popular method for this. However, the expression of many traditional housekeeping genes changes considerably under different conditions or in different tissues, which biases the analysis of target genes [35]. For example, 18S rRNA, ACTB and RNAPII were shown to be the most stable genes among six leaf samples of different citrus genotypes, but when they were further analyzed in five other tissues the results indicated that they were not completely stable [36]. GAPDH and ACT7α were shown to have unacceptable variability in peach [37], and two of the most commonly used reference genes, TUBα-3 and ACT7α, are unsuitable for different tissues of Chinese cabbage [11].
[Fig. 5 plot. Y-axis: pairwise variation (0–0.25). X-axis: V2/3 to V14/15. Series: Zhejiang, Guangning, Wantian, Youxian, Xiaoguo, Putong, roots, stems, leaves, flowers, seeds and all samples.]

[Fig. 6 plots: panels (a)–(d). Y-axis: relative expression. Panels (a) and (b): the six species (Zhejiang, Guangning, Wantian, Youxian, Xiaoguo, Putong), normalized to CESA, PGK, TUBα-3, CESA + PGK + TUBα-3, or GARY. Panels (c) and (d): roots, stems, leaves, flowers and seeds, normalized to CESA, TUBα-3, CESA + TUBα-3, or COPε1.]

Fig. 6. Relative expression levels of SQE and SQS in different tissues/organs of Putong oil-tea camellia and in seeds of different species. Expression levels of both SQS (a and c) and SQE (b and d) were normalized to stable and unstable reference genes, respectively. Error bars show the standard error of the mean calculated from two biological replicates.

Oil-tea camellias are a very important woody oil crop of the genus Camellia. So far, studies on gene expression in Camellia have been carried out using single reference genes. For example, the expression of the B function CjDEF-1 gene in Camellia japonica Hongshibaxueshi [38] and of the chalcone isomerase gene in Camellia nitidissima [39] was analyzed across different parts of the flowers. Both studies used 18S rRNA as the reference gene, but neither validated the results with any preliminary expression stability analysis. Furthermore, the expression analysis of two calmodulin genes of Camellia oleifera did not use any reference genes [21]. To our knowledge, this is the first time suitable reference genes have been assessed for qRT-PCR in six oil-tea camellia species and their different tissues. Gene expression normalization by qRT-PCR in camellia can therefore now be put into practice based on this selection of reference genes. Pairwise analysis using geNorm reveals that the optimal number of reference genes varies from two to six depending on the particular sample set. Our results suggest that no particular gene is expressed in a consistently stable way across all species and tissues, and also show that the more complex the samples are, the more reference genes are needed. Thus, it is necessary to select reference genes according to the samples being researched.
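The geNorm criterion for choosing the number of reference genes can be sketched as follows. This is an illustrative reimplementation of the published idea (the stability measure M, stepwise exclusion of the least stable gene, and the pairwise variation Vn/Vn+1 between successive normalization factors), not the geNorm software itself. It assumes the input is relative quantities rather than raw Ct values and uses the sample standard deviation; all function names and data are hypothetical.

```python
import math
import statistics

def m_values(q):
    """Stability M per gene: mean SD of log2 ratios against every other gene."""
    m = {}
    for j in q:
        sds = [
            statistics.stdev([math.log2(a / b) for a, b in zip(q[j], q[k])])
            for k in q if k != j
        ]
        m[j] = sum(sds) / len(sds)
    return m

def rank_genes(q):
    """Order genes from most to least stable by stepwise exclusion."""
    remaining = dict(q)
    dropped = []
    while len(remaining) > 2:
        m = m_values(remaining)
        worst = max(m, key=m.get)      # highest M = least stable
        dropped.append(worst)
        del remaining[worst]
    return list(remaining) + dropped[::-1]

def pairwise_variation(q, ranking):
    """Return {n: V_n/n+1}, the SD of log2(NF_n / NF_n+1) across samples."""
    n_samples = len(q[ranking[0]])

    def nf(n):
        # Normalization factor: geometric mean of the n most stable genes
        return [
            math.exp(sum(math.log(q[g][s]) for g in ranking[:n]) / n)
            for s in range(n_samples)
        ]

    v = {}
    for n in range(2, len(ranking)):
        ratios = [math.log2(a / b) for a, b in zip(nf(n), nf(n + 1))]
        v[n] = statistics.stdev(ratios)
    return v

def optimal_number(v, threshold=0.15):
    """Smallest n with V_n/n+1 below the threshold, or None if there is none."""
    for n in sorted(v):
        if v[n] < threshold:
            return n
    return None

# Hypothetical relative quantities for four genes in five samples
quantities = {
    "gene_a": [1.0, 1.1, 0.9, 1.05, 0.95],
    "gene_b": [2.0, 2.2, 1.8, 2.1, 1.9],
    "gene_c": [1.0, 1.2, 0.85, 1.0, 1.0],
    "gene_d": [1.0, 3.0, 0.4, 2.0, 0.5],
}

ranking = rank_genes(quantities)
v = pairwise_variation(quantities, ranking)
print(ranking, v, optimal_number(v))
```

In this toy data set V2/3 falls below 0.15, so two reference genes would suffice; more heterogeneous sample sets push the first sub-threshold value to a larger n, which is consistent with the observation that complex sample sets need more reference genes.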
18S rRNA has been used as a reference gene in many gene expression studies of the genus Camellia, but it has been reported to be an unsuitable reference gene because it is synthesized by RNA polymerase I, whereas mRNA transcription is carried out by RNA polymerase II. Moreover, the reverse transcription reaction is an oligo(dT) reaction, but rRNA contains no poly(A) tail, so oligo(dT)-primed cDNA cannot be synthesized from rRNA [40]. It was also reported that target gene expression was down-regulated when 18S rRNA was used as a reference gene [41]. Based on these results, 18S rRNA was not chosen for the current study.
The expression results of the fifteen candidate genes were analyzed for each sample with three different software tools: geNorm, NormFinder and BestKeeper. These three tools use different algorithms and can give different results. However, the results differed only slightly among them, and all of them indicated that TUBα-3 and CESA were the most stable reference genes, though they were not the most stable among the fifteen candidate genes based on comparison of the RPKM values in the
Table 4. Description of candidate genes from Camellia for qRT-PCR

Gene symbol | Gene name | Forward primer (5' → 3') | Reverse primer (5' → 3') | Tm, °C | Amplicon length, bp | Amplification efficiency, % | R2
ACT7α | Actin 7 alpha | CAACTTCGCTGGTGTCTTCA | ACCCTCTACGCAGAAGCAAA | 82.4 | 212 | 101.1 | 0.9969
CYPB | Cyclophilin B | ACAGGGAGCTCACCACATTC | TCTTAGCATGGCAAATGCAG | 85.9 | 210 | 93.9 | 0.9989
TUBα-3 | Tubulin alpha-3 | CCATGCCTTGGATCACATTT | TGGGGCCATTAATGTAGACG | 82.63 | 195 | 94 | 0.9949
TBP | TATA box binding protein component of TFIID and TFIIIB | GAAAAGGCACCATGGGAATAG | GAAGATGGTTTGCACTGGT | 81.8 | 193 | 95.6 | 0.9955
CYCD4-2 | Cyclin-D4-2 | GGATTGAGGAATGGGGATTT | ATAAACAGGCCACAGCCAAC | 82.25 | 196 | 90.7 | 0.9993
TEF1-α | Translation elongation factor 1-alpha | TCCAGGAGCATCAATGACAG | ACCACCACTGGTCACCTCAT | 83.75 | 219 | 99.4 | 0.9987
PGK | Phosphoglycerate kinase | CCCAAGGGTACTCAGTTGGAC | CATCCAACCATCAGGGATA | 83.82 | 192 | 101.8 | 0.9995
GAPDH | Glyceraldehyde-3-phosphate dehydrogenase | TCAATCACCCGATTGCTGTA | CTGCTATCAAGGAGGCTTCG | 82.25 | 199 | 96.9 | 0.9993
RNAP II | RNA polymerase II accessory factor | AATGCTCGCTCTCACAACCT | CGAAATCGTTGTCGTCATTG | 85.55 | 196 | 93 | 0.9958
CESA | Cellulose synthetase A | AAGGACCGCTGATACTCGAAA | CACCATGGCCTGGAAATAA | 83.9 | 195 | 99.1 | 0.9939
GARY | Glyoxylate reductase | TGCGGTTCTTGTGGATGATA | GCACTCATGCTTTCCTGACA | 79.9 | 220 | 100.9 | 0.9927
NPL4 | Nuclear protein localization protein 4 | GGCCATGGACTCAATTAGGAA | TCATCTGGACCGAACAAGG | 83.8 | 176 | 98.9 | 0.9935
VSR3 | Vacuolar-sorting receptor 3 | GCACAAATGGCCTTCAAAAC | GGTGACCCAAATGCTGATTC | 82.3 | 167 | 98.3 | 0.9904
CAC | Clathrin coat associated protein AP-2 complex subunit | GGCATTCCAGAAAGAAAGCAA | GGAAGGAGTACGCTCACCA | 82 | 235 | 92.3 | 0.9920
COPε1 | Coatomer protein subunit epsilon-1 | GCCTTTCCATTCAGGATCAAA | TGCGGAAAAACAGTTGAGG | 81.7 | 178 | 89.4 | 0.9900
SQS | Squalene synthetase | TTTCGCCCTCGTAATTCAAC | CATGAAAAATGCCAGTCACG | 81.43 | 180 | 105.3 | 0.9966
SQE | Squalene epoxidase | AAAGAGCAGACCACCACCAC | TCGGGCTCTGTCAAATCTCT | 80.15 | 208 | 100.8 | 0.9941
Table 5. Ranking of candidate reference genes in order of their expression stability as calculated by NormFinder (each cell gives the gene and its stability value)

Rank | Zhejiang | Guangning | Wantian | Youxian | Putong | Xiaoguo | Roots | Stems | Leaves | Flowers | Seeds | All
1 | PGK 0.061 | GARY 0.302 | TEF1-α 0.331 | RNAPII 0.109 | TUBα-3 0.144 | CESA 0.052 | TUBα-3 0.263 | PGK 0.163 | COPε1 0.130 | CAC 0.026 | RNAPII 0.261 | TUBα-3 0.358
2 | COPε1 0.188 | RNAPII 0.371 | TUBα-3 0.390 | CYPB 0.356 | CAC 0.194 | TUBα-3 0.080 | CYPB 0.366 | CESA 0.225 | PGK 0.199 | TUBα-3 0.062 | TUBα-3 0.270 | CESA 0.492
3 | CAC 0.213 | ACT7α 0.385 | CESA 0.411 | CAC 0.393 | VSR3 0.201 | COPε1 0.159 | CESA 0.523 | TUBα-3 0.345 | RNAPII 0.227 | CESA 0.062 | CESA 0.459 | NPL4 0.524
4 | TEF1-α 0.346 | PGK 0.416 | RNAPII 0.430 | COPε1 0.422 | GAPDH 0.205 | CAC 0.159 | PGK 0.524 | GAPDH 0.354 | TUBα-3 0.242 | PGK 0.163 | PGK 0.483 | CAC 0.547
5 | CESA 0.366 | NPL4 0.422 | ACT7α 0.438 | NPL4 0.430 | CYCD4-2 0.216 | GAPDH 0.332 | ACT7α 0.524 | GARY 0.443 | NPL4 0.253 | TBP 0.175 | TBP 0.562 | ACT7α 0.552
6 | TUBα-3 0.447 | TUBα-3 0.564 | CYPB 0.465 | TUBα-3 0.469 | CESA 0.219 | VSR3 0.340 | TBP 0.530 | NPL4 0.527 | CAC 0.274 | NPL4 0.264 | TEF1-α 0.645 | PGK 0.557
7 | CYCD4-2 0.452 | COPε1 0.620 | PGK 0.498 | TBP 0.483 | NPL4 0.240 | CYPB 0.374 | NPL4 0.606 | CYPB 0.529 | ACT7α 0.303 | ACT7α 0.393 | CYPB 0.717 | RNAPII 0.587
8 | NPL4 0.477 | CESA 0.639 | VSR3 0.674 | GAPDH 0.512 | TEF1-α 0.245 | ACT7α 0.386 | GAPDH 0.711 | COPε1 0.589 | TEF1-α 0.402 | RNAPII 0.405 | COPε1 0.718 | TBP 0.627
9 | ACT7α 0.580 | TEF1-α 0.685 | TBP 0.727 | VSR3 0.586 | ACT7α 0.246 | NPL4 0.432 | VSR3 0.767 | TEF1-α 0.628 | TBP 0.416 | COPε1 0.411 | CAC 0.727 | CYPB 0.631
10 | TBP 0.609 | CYPB 0.686 | NPL4 0.733 | ACT7α 0.669 | COPε1 0.290 | PGK 0.442 | CAC 0.775 | CYCD4-2 0.711 | VSR3 0.517 | VSR3 0.607 | GAPDH 0.767 | COPε1 0.641
11 | CYPB 0.676 | CYCD4-2 0.809 | CAC 0.771 | CESA 0.726 | RNAPII 0.292 | TBP 0.510 | RNAPII 0.804 | VSR3 0.726 | CESA 0.519 | CYPB 0.632 | CYCD4-2 0.804 | TEF1-α 0.708
12 | RNAPII 0.728 | TBP 0.860 | CYCD4-2 0.975 | CYCD4-2 0.834 | GARY 0.339 | CYCD4-2 0.517 | TEF1-α 1.016 | CAC 0.737 | CYPB 0.557 | TEF1-α 0.670 | ACT7α 0.890 | VSR3 0.749
13 | VSR3 0.773 | VSR3 0.861 | COPε1 1.166 | GARY 0.849 | PGK 0.340 | RNAPII 0.622 | GARY 1.029 | ACT7α 0.752 | CYCD4-2 0.588 | CYCD4-2 0.724 | NPL4 0.933 | CYCD4-2 0.850
14 | GARY 1.005 | CAC 0.996 | GARY 1.174 | TEF1-α 0.905 | CYPB 0.383 | TEF1-α 0.625 | CYCD4-2 1.129 | RNAPII 0.778 | GARY 0.676 | GARY 0.760 | VSR3 1.249 | GARY 0.917
15 | GAPDH 1.606 | GAPDH 1.174 | GAPDH 1.383 | PGK 1.069 | TBP 0.643 | GARY 0.756 | COPε1 1.244 | TBP 0.852 | GAPDH 1.092 | GAPDH 2.142 | GARY 1.293 | GAPDH 1.231
RNA-seq database of Camellia oleifera seeds from July to October. The CV values of TUBα-3 and CESA ranked ninth and fifteenth, respectively (Table 3). Thus, the most stable reference genes are not necessarily the genes with the lowest CV of RPKM. This may be because the RNA-seq data describe gene expression during seed development rather than across different species and tissues; furthermore, all of the candidate genes were relatively stable unigenes, with only small differences in CV among them. RNA-seq therefore remains a valid source of candidates, and more suitable reference genes may be found using RNA-seq data in future research.
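The CV ranking discussed above can be illustrated with a short sketch. It assumes CV is the standard deviation divided by the mean of a gene's RPKM values across developmental stages, and it uses the sample standard deviation; the gene names and RPKM values are hypothetical.

```python
import statistics

def coefficient_of_variation(values):
    """CV = standard deviation / mean (sample SD assumed)."""
    return statistics.stdev(values) / statistics.mean(values)

# Hypothetical RPKM values at four seed developmental stages
rpkm = {
    "gene_x": [50.0, 52.0, 49.0, 51.0],
    "gene_y": [10.0, 30.0, 5.0, 20.0],
}

# Rank candidates from the lowest CV (most uniform) to the highest
ranked = sorted(rpkm, key=lambda g: coefficient_of_variation(rpkm[g]))
for gene in ranked:
    print(f"{gene}: CV = {coefficient_of_variation(rpkm[gene]):.3f}")
```

A low CV across developmental stages says nothing about stability across species or tissues, which is one reason the CV ranking and the qRT-PCR stability ranking need not agree.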
Finally, reference gene validation was performed. The results showed many discrepancies in relative expression between the stable and unstable reference genes, indicating that using reference genes without validation may bias the measured expression away from the real expression.
Fifteen genes were selected as candidate reference genes from the RNA-seq database, all of which were stable among the different seed developmental stages. Ten genes were expressed at moderate levels, three genes were expressed at low levels (GARY, CAC and CYCD4-2), and the other two genes (TEF1-α and PGK) were expressed at a much higher level, so their Ct values were lower than those of the others. Based on the reference gene validation analysis, we speculated that a highly expressed reference gene would lower the relative expression of the target gene and lead to discrepancies in its apparent transcript abundance. The two highest-abundance genes, TEF1-α and PGK, were the least stable genes in most species and tissues according to the three different software tools. Although PGK was found to be a stable gene in seeds, the relative expression of SQE in the seeds of Guangning, Wantian and Youxian was almost zero when PGK was used as the reference gene, so differences in target gene expression were hard to distinguish. This result supported our assumption that a highly expressed reference gene could lead to discrepancies in target gene expression.
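The arithmetic behind this speculation can be written out under the assumption that relative expression is computed as 2^(-ΔCt) with 100% amplification efficiency; the symbols below are introduced only for illustration and do not appear in the original text.

```latex
\[
R = 2^{-\Delta C_t}, \qquad \Delta C_t = C_t^{\mathrm{target}} - C_t^{\mathrm{ref}}
\]
\[
C_t^{\mathrm{ref}} \;\to\; C_t^{\mathrm{ref}} - d
\quad\Longrightarrow\quad
R \;\to\; 2^{-d}\,R
\]
```

For example, a reference gene whose Ct is 5 cycles lower (d = 5) scales every relative expression value by 1/32, so small differences between samples are compressed toward zero on a linear axis.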
CONCLUSION
In summary, fifteen candidate reference genes were analyzed in different tissues of six oil-tea camellia species with geNorm, NormFinder and BestKeeper. The results indicated that TUBα-3 and CESA were the most stably expressed when all samples were analyzed together. Suitable reference genes were also evaluated for each tissue and species separately. Although no gene was stable across all of the different species and tissues, TUBα-3, CESA and ACT7α were comparatively stable in many species and tissues. The optimal number of reference genes required for accurate normalization varied from two to six.
ACKNOWLEDGMENTS
This work was supported by the Creation of High Yield and Quality New Oil-tea camellia Germplasm (no. 2009BADB1B01) and the Research and Development of Crucial Technology for Oil-tea camellia Industrial Upgrading (no. 2009BADB1B00).
REFERENCES
1. Chang Z., Ling C., Yamashita M., et al. 2010. Microarray-driven validation of reference genes for quantitative real-time polymerase chain reaction in a rat vocal fold model of mucosal injury. Anal. Biochem. 406 (2), 214–221.
2. Umenishi F., Verkman A.S., Gropper M.A. 1996. Quantitative analysis of aquaporin mRNA expression in rat tissues by RNase protection assay. DNA Cell Biol. 15, 475–480.
3. Zhang L., Zhou W., Velculescu V.E., et al. 1997. Gene expression profiles in normal and cancer cells. Science. 276 (5316), 1268–1272.
4. Demidenko N.V., Logacheva M.D., Penin A.A. 2011. Selection and validation of reference genes for quantitative real-time PCR in buckwheat (Fagopyrum esculentum) based on transcriptome sequence data. PLoS ONE. 6 (5), e19434.
5. Huggett J., Dheda K., Bustin S., et al. 2005. Real-time RT-PCR normalization: Strategies and considerations. Genes Immun. 6 (4), 279–284.
6. Paolacci A.R., Tanzarella O.A., Porceddu E., Ciaffi M. 2009. Identification and validation of reference genes for quantitative RT–PCR normalization in wheat. BMC Mol. Biol. 10 (1), 11.
7. Maccoux L.J., Clements D.N., Salway F., Day P.J. 2007. Identification of new reference genes for the normalization of canine osteoarthritic joint tissue transcripts from microarray data. BMC Mol. Biol. 8 (1), 62.
8. Migocka M., Papierniak A. 2011. Identification of suitable reference genes for studying gene expression in cucumber plants subjected to abiotic stress and growth regulators. Mol. Breeding. 28 (3), 343–357.
9. Dheda K., Huggett J.F., Chang J.S., et al. 2005. The implications of using an inappropriate reference gene for real-time reverse transcription PCR data normalization. Anal. Biochem. 344 (1), 141–143.
10. Andersen C.L., Jensen J.L., Ørntoft T.F. 2004. Normalization of real-time quantitative reverse transcription-PCR data, a model-based variance estimation approach to identify genes suited for normalization, applied to bladder and colon cancer data sets. Cancer Res. 64 (15), 5245.
11. Qi J., Yu S., Zhang F., Shen X., Zhao X., Yu Y., Zhang D. 2010. Reference gene selection for real-time quantitative polymerase chain reaction of mRNA transcript levels in Chinese cabbage (Brassica rapa L. ssp. pekinensis). Plant Mol. Biol. Rep. 28 (4), 597–604.
12. Hruz T., Wyss M., Docquier M., et al. 2011. RefGenes, identification of reliable and condition specific reference genes for RT–qPCR data normalization. BMC Genomics. 12 (1), 156.
13. Zhuang R.L., Yao X.H. 2008. Oil-Tea Camellia of China, 2nd ed. Beijing: Chinese Forestry Publ., pp. 1–10 (in Chinese).
14. Vijayan K., Zhang W.J., Tsou C.H. 2009. Molecular taxonomy of Camellia (Theaceae) inferred from nrITS sequences. Am. J. Bot. 96 (7), 1348–1360.
15. Tan X.F., Hu F.M., Xie L.S., et al. 2006. Construction of EST library and analysis of main expressed genes of Camellia oleifera seeds. Sci. Silvae Sinicae (China). 42 (1), 43–48.
16. Lin P., Cao Y.Q., Yao X.H., et al. 2011. Transcriptome analysis of Camellia oleifera Abel seed in four development stages. Mol. Plant Breeding (China). 19 (4), 498–505.
17. Hu X., Tan X., Tian X., et al. 2008. cDNA cloning, sequence analysis and physiological role speculation of a dehydrin-like protein from Camellia oleifera. Acta Bot. Boreali-Occidentalia Sinica (China). 28 (8), 1541–1548.
18. Jiang Y., Tan X.F., Zhang D.Q., et al. 2009. Cloning and sequence analysis of a metallothionein gene from Camellia oleifera. Acta Agric. Univ. Jiangxiensis (China). 31 (4), 699–705.
19. Luo Q., Xie L.S., Tan X.F., et al. 2008. Cloning of full-length cDNA of FAD2 gene from Camellia oleifera. Sci. Silvae Sinicae (China). 44 (3), 70–75.
20. Zhang D., Tan X., Chen H., et al. 2008. Full-length cDNA cloning and bioinformatic analysis of Camellia oleifera SAD. Sci. Silvae Sinicae. 44 (2), 155–159.
21. Wang B., Tan X.F., Chen Y., et al. 2012. Molecular cloning and expression analysis of two calmodulin genes encoding an identical protein from Camellia oleifera. Pak. J. Bot. 44 (3), 961–968.
22. Shao G., Tan X.F., Chen H., et al. 2012. Isolation and characterization of an aldo-keto reductase cDNA from Camellia oleifera seed. Adv. Sci. Lett. (China). 10 (1), 153–157.
23. Han X.J., Lu M., Chen Y., et al. 2012. Selection of reliable reference genes for gene expression studies using real-time PCR in tung tree during seed development. PLoS ONE. 7 (8), e43084.
24. Li R., Zhu H., Ruan J., et al. 2010. De novo assembly of human genomes with massively parallel short read sequencing. Genome Res. 20, 265–272.
25. Mortazavi A., Williams B.A., McCue K., Schaeffer L., Wold B. 2008. Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nature Methods. 5 (7), 621–628.
26. Czechowski T., Stitt M., Altmann T., et al. 2005. Genome-wide identification and testing of superior reference genes for transcript normalization in Arabidopsis. Plant Physiol. 139, 5–17.
27. Expósito-Rodríguez M., Borges A.A., Borges-Pérez A., et al. 2008. Selection of internal control genes for quantitative real-time RT-PCR studies during tomato development process. BMC Plant Biol. 8, 131.
28. Chang E., Shi S., Liu J., et al. 2012. Selection of reference genes for quantitative gene expression studies in Platycladus orientalis (Cupressaceae) using real-time PCR. PLoS ONE. 7 (3), e33278.
29. Do R., Kiss R.S., Gaudet D., et al. 2009. Squalene synthase, a critical enzyme in the cholesterol biosynthesis pathway. Clin. Genet. 75 (1), 19–29.
30. Beytia E., Qureshi A.A., Porter J.W. 1973. Squalene synthetase III. Mechanism of the reaction. J. Biol. Chem. 248 (5), 1856–1867.
31. M’Baya B., Fegueur M., Servouse M., et al. 1989. Regulation of squalene synthetase and squalene epoxidase activities in Saccharomyces cerevisiae. Lipids. 24 (12), 1020–1023.
32. Vandesompele J., De Preter K., Pattyn F., et al. 2002. Accurate normalization of real-time quantitative RT-PCR data by geometric averaging of multiple internal control genes. Genome Biol. 3 (7), res. 0034.
33. Pfaffl M.W., Tichopad A., Prgomet C., et al. 2004. Determination of stable housekeeping genes, differentially regulated target genes and sample integrity: BestKeeper–Excel-based tool using pair-wise correlations. Biotechnol. Lett. 26 (6), 509–515.
34. Livak K.J., Schmittgen T.D. 2001. Analysis of relative gene expression data using real-time quantitative PCR and the 2^(−ΔΔCt) method. Methods. 25 (4), 402–408.
35. Benn C.L., Fox H., Bates G. 2008. Optimisation of region-specific reference gene selection and relative gene expression analysis methods for pre-clinical trials of Huntington’s disease. Mol. Neurodegener. 3 (1), 17.
36. Yan J., Yuan F., Long G., et al. 2011. Selection of reference genes for quantitative real-time RT-PCR analysis in citrus. Mol. Biol. Rep. 39 (2), 1831–1838.
37. Tong Z., Gao Z., Wang F., et al. 2009. Selection of reliable reference genes for gene expression studies in peach using real-time PCR. BMC Mol. Biol. 10 (1), 71.
38. Zhu G.P., Li J.Y., Fan Z.Q., et al. 2011. Isolation of B function CjDEF-1 gene involved in floral development in Camellia japonica Hongshibaxueshi and its expression analysis. J. Agric. Biotechnol. 19 (3), 442–448.
39. Zhou X.W., Li J.Y., Fan Z.Q. 2012. Cloning and expression analysis of chalcone isomerase gene cDNA from Camellia nitidissima. Forest Res. (China). 25 (1), 93–99.
40. Radonić A., Thulke S., Mackay I.M., Landt O., Siegert W., Nitsche A. 2004. Guideline to reference gene selection for quantitative real-time PCR. Biochem. Biophys. Res. Commun. 313 (4), 856–862.
41. Raaijmakers M.H., van Emst L., de Witte T., Mensink E., Raymakers R.A. 2002. Quantitative assessment of gene expression in highly purified hematopoietic cells using real-time reverse transcriptase polymerase chain reaction. Exp. Hematol. 30 (5), 481–487.