

Available online at www.sciencedirect.com

www.elsevier.com/locate/gca

Geochimica et Cosmochimica Acta 75 (2011) 2007–2016

Mammoth and Mastodon collagen sequences; survival and utility

M. Buckley a,b,⇑, N. Larkin c, M. Collins b

a Manchester Interdisciplinary Biocentre, 131 Princess Street, Faculty of Life Sciences, University of Manchester, Manchester M1 7DN, UK

b BioArCh, Departments of Archaeology, Biology and Chemistry, S Block, University of York, York YO10 5YW, UK

c Norfolk Museums and Archaeology Service, Shirehall Study Centre, Market Avenue, Norwich, Norfolk NR1 3JQ, UK

Received 22 August 2010; accepted in revised form 12 January 2011; available online 26 January 2011

Abstract

Near-complete collagen (I) sequences are proposed for elephantid and mammutid taxa, based upon available African elephant genomic data and supported with LC–MALDI-MS/MS and LC–ESI-MS/MS analyses of collagen digests from proboscidean bone. Collagen sequence coverage was investigated from several specimens of two extinct mammoths (Mammuthus trogontherii and Mammuthus primigenius), the extinct American mastodon (Mammut americanum), the extinct straight-tusked elephant (Elephas (Palaeoloxodon) antiquus) and extant Asian (Elephas maximus) and African (Loxodonta africana) elephants and compared between the two ionization techniques used. Two suspected mammoth fossils from the British Middle Pleistocene (Cromerian) deposits of the West Runton Forest Bed were analysed to investigate the potential use of peptide mass spectrometry for fossil identification. Despite the age of the fossils, sufficient peptides were obtained to identify these as elephantid, and sufficient sequence variation to discriminate elephantid and mammutid collagen (I). In-depth LC–MS analyses further failed to identify a peptide that could be used to reliably distinguish between the three genera of elephantids (Elephas, Loxodonta and Mammuthus), an observation consistent with predicted amino acid substitution rates between these species.

© 2011 Elsevier Ltd. All rights reserved.

1. INTRODUCTION

Ancient DNA from extinct taxa has been widely targeted for phylogenetic reconstructions because of its potential to obtain sequence information that no longer exists in nature. This information may be used to uncover shared traits that are obscured by divergence in extant taxa, and is also useful for tree-rooting in the absence of closely-related sister taxa (Yang et al., 1996). The fossil remains of extinct proboscideans lend themselves particularly well to molecular studies not only because of their limited number of extant species, but also due to the availability of large amounts of particularly well-preserved tissues including bone and hair (Gilbert et al., 2007; Miller et al., 2008). Although DNA is more informative, there have already been several reports of the successful sequencing of ancient mammoth and mastodon collagen peptides (Schweitzer et al., 2002; Asara et al., 2007). Collagen is by far the most abundant protein in bone and has long been known to survive well in Quaternary fossils through its use in radiocarbon dating (Longin, 1971) and stable isotope analyses (Jones et al., 2001).

0016-7037/$ - see front matter © 2011 Elsevier Ltd. All rights reserved.

doi:10.1016/j.gca.2011.01.022

⇑ Corresponding author. Tel.: +44 7813667779. E-mail address: [email protected] (M. Buckley).

The Proboscidea are an iconic order with a rich fossil record and have been a focus of molecular analysis (Rohland et al., 2007; Miller et al., 2008). In the Pliocene the group was diverse, including the Deinotheriidae, Mammutidae, Gomphotheriidae and Stegodontidae, but today there is only one family, the Elephantidae, composed of three species in two genera, Elephas (E. maximus) in Asia and Loxodonta (containing both L. africana and L. cyclotis) in Africa (Lister et al., 2007). The woolly mammoth (Mammuthus primigenius) and the American mastodon (Mammut americanum) persisted through the Pleistocene until around 10,000 years ago, apart from a couple of isolated instances where they lasted a few thousand years longer. An earlier mammoth, the Steppe mammoth (Mammuthus trogontherii), is thought to have survived well into the Middle Pleistocene. Morphological comparison indicates that the Mammutidae likely diverged from the lineage leading to the Elephantidae during the early Miocene or before (approximately 24 million years ago); molecular analyses indicate that Elephas and Loxodonta within the Elephantidae diverged from a common ancestor around the Miocene–Pliocene boundary (5–8 million years ago) (Rohland et al., 2007; Miller et al., 2008).

Morphologically, Elephas and Mammuthus are considered to form a monophyletic clade with Loxodonta as a sister group (Shoshani and Tassy, 2005). This has been supported by numerous molecular analyses, primarily using DNA sequences (Yang et al., 1996; Rohland et al., 2007; Miller et al., 2008). The only previous work attempting to use proteins to identify the relationships between Elephas, Mammuthus and Loxodonta was carried out using immunological assays, such as radioimmunoassay, but these were not able to resolve the phylogenetic relationships between these three elephantid genera (Shoshani et al., 1985; Lowenstein and Scheuenstuhl, 1991). We have attempted to sequence Middle and Late Pleistocene fossil proboscidean bone collagen in order to assess whether it is possible to recover useful phylogenetic information from extinct proboscideans beyond the reach of DNA.

Fig. 1. Sequence coverage of African elephant collagen against the Loxodonta collagen (I) alpha 1 (top) and alpha 2 (bottom) chain where unmarked text indicates matched sequence from ESI search, underlined text indicates ESI matches with ion score >32, grey shading indicates MALDI matched peptides with ion score >32 and black shading indicates unmatched sequence (see Supplementary material for further information including sequence coverage of Asian elephant collagen). B = variable Hyp, O = Hyp, J = variable Hyl and U = fixed Hyl.

2. MATERIALS AND METHODS

Cortical bone from modern African elephant (Loxodonta africana) and Asian elephant (Elephas maximus) tarsal bones, a sample of vertebra of the Cromerian (~650 Ka) West Runton Elephant (Steppe mammoth; M. trogontherii; see Lister and Stuart, 2010 for details of taphonomy and preservation), along with two other indeterminate West Runton Forest Bed ‘large mammal’ specimens suspected as being from Steppe mammoth, a woolly mammoth (M. primigenius) specimen dredged from the North Sea (~20–60 Ka), a straight-tusked elephant (Elephas (Palaeoloxodon) antiquus) from Aveley (Essex; Stuart, 1976) and a mammoth specimen (M. trogontherii) from Latton (Wiltshire; Lewis et al., 2006) (both ~200 Ka) were powdered using a diamond-tipped Dremel drill. A sample of American mastodon (M. americanum) “Elmer” (dated to 10,200 ± 170 B.P.) was supplied as a powder (see Shoshani et al., 1985).

Bone powder (~100 mg) was demineralized (1 mL 0.6 M HCl, 4 h, 4 °C) and the acid-insoluble pellet centrifuged (13,000g, 5 min) and rinsed five times with 2 mL distilled deionized H2O until the pH reached neutral. The rinsed pellet was then freeze-dried overnight to allow for accurate weighing. Acid-insoluble collagen (1 mg) was then resuspended in 80 µL 50 mM ammonium bicarbonate and gelatinized for 3 h at 65 °C. Following gelatinization, the sample was centrifuged (13,000g, 15 min) and the supernatant removed for tryptic digestion. 1 µL of 1 µg/µL trypsin solution was added to each sample, which was then incubated at 37 °C for 18 h. Following digestion the sample was centrifuged and acidified with 8 µL 1% TFA. Prior to LC–MALDI and LC–ESI analysis, the digest was centrifuged (13,000g, 10 min) and an aliquot diluted 10-fold in 0.1% TFA.
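The tryptic digestion step can be mimicked in silico to predict which peptides a given collagen sequence should yield. The following is a minimal sketch, not the procedure used in this study: it assumes the common simplified rule that trypsin cleaves C-terminal to K or R unless the next residue is P, and the function name `tryptic_digest` and the toy sequence are hypothetical.

```python
def tryptic_digest(sequence, missed_cleavages=1):
    """Return tryptic peptides of `sequence`, allowing up to `missed_cleavages`.

    Simplified rule (an assumption of this sketch): cleave C-terminal to K or R
    unless the next residue is P.
    """
    # Step 1: cut the sequence into fully cleaved fragments.
    fragments, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            fragments.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        fragments.append(sequence[start:])

    # Step 2: join neighbouring fragments to model missed cleavages.
    peptides = []
    for i in range(len(fragments)):
        for extra in range(missed_cleavages + 1):
            if i + extra < len(fragments):
                peptides.append("".join(fragments[i:i + extra + 1]))
    return peptides


# Hypothetical collagen-like toy sequence, not taken from the paper.
toy = "GPAGPKGEAGPRGLSGER"
for peptide in tryptic_digest(toy, missed_cleavages=1):
    print(peptide)
```

Setting `missed_cleavages=0` returns only the fully cleaved fragments; a value of 1 also returns each pair of adjacent fragments joined together.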

Peptide mass fingerprints (PMFs) were obtained following Buckley et al. (2009) using the remaining supernatant. In brief, 10 µL C18 Millipore ZipTip® pipette tips were prepared with one bed volume of 50% ACN/0.1% TFA, followed by one bed volume of 0.1% TFA. Following enzymatic digestion, the samples were centrifuged and the acidified supernatant applied to the pipette tip. A stepped gradient of increasing ACN concentration following several wash steps (2 × 0.1% TFA) was applied to the pipette tip and the 10% and 50% ACN in 0.1% aqueous TFA fractions in which the peptides eluted were collected. Samples were then dried on a centrifugal evaporator and resuspended with 10 µL 0.1% TFA. Sample solution (1 µL) from each fraction was spotted onto a Bruker ultraflex III target plate, mixed together with 1 µL of α-cyano-4-hydroxycinnamic acid matrix solution (1% in ACN/H2O 1:1 v:v) and allowed to dry. The fractions of each collagen digest were analysed in triplicate by Matrix Assisted Laser Desorption Ionization Mass Spectrometry (MALDI-MS) in reflectron mode using a Bruker ultraflex III MALDI TOF/TOF mass spectrometer equipped with a Nd:YAG smart beam laser. MS spectra were acquired over a mass range of m/z 800–4000. Final mass spectra were externally calibrated against an adjacent spot containing six peptides (des-Arg1-Bradykinin, m/z 904.681; Angiotensin I, m/z 1296.685; Glu1-Fibrinopeptide B, m/z 1750.677; ACTH (1–17 clip), m/z 2093.086; ACTH (18–39 clip), m/z 2465.198; ACTH (7–38 clip), m/z 3657.929). Monoisotopic masses were obtained using a SNAP averagine algorithm (C 4.9384, N 1.3577, O 1.4773, S 0.0417, H 7.7583) and a S/N threshold of 2. De novo sequencing of selected collagen peptides was carried out by manually interpreting their collision-induced dissociation product ion (MS/MS) spectra.

Fig. 2. Amino acid composition profiles of the seven proboscidean fossils compared with modern African and Asian elephant bone.

For LC–MALDI analyses, 3 µL of the diluted sample was applied to an Ultimate nanoLC (LC Packings) using a Dionex 0.2 mm × 50 mm polystyrene divinylbenzene (PSDVB) monolith column with a 1–50% (solvent B in A) continuous gradient in 20 min (solvent A = 2% ACN, 98% H2O, 0.1% HFBA; solvent B = 100% ACN, 0.1% HFBA) at a flow rate of 3.0 µL/min. Six second fractions were plated onto a 180 spot MALDI target plate simultaneously with matrix solution (6 mg/mL α-cyano hydroxycinnamic acid (Sigma), 26 µL/mL of 5 mg/mL ammonium citrate (Sigma) in 60% ACN/40% H2O) using a Dionex Probot sample spotter. Following the gradient, the concentration of solvent B (in A) was raised to 85% in preparation for the following sample. The MALDI analysis was carried out using an Applied Biosystems 4700 Proteomics Discovery System in reflectron detector mode (m/z range 900–4000 for peptide analysis). Following reflectron detector mode analysis, the 15 MS peaks of greatest S/N (above 40) were selected for product ion tandem MS (MS/MS) analysis. The resulting product ion spectra were then collectively converted to peak lists by Applied Biosystems’ 4000 Explorer version 3.6 with a S/N cut-off of 15, and searched against a local database by Mascot version 2.2.

For online LC–ESI (electrospray ionization) analysis, 1 µL of the diluted sample digest was applied to an Ultimate nanoLC system using a Dionex 0.1 mm × 50 mm PSDVB monolith column with a 1–50% (solvent B in A) continuous linear gradient over 7.5 min (solvent A = 2% ACN, 98% H2O, 0.1% formic acid; solvent B = 100% ACN, 0.1% formic acid) at a flow rate of 1.2 µL/min (column temperature = 70 °C) followed by a 5 min wash at 95% B and finally 11 min reconditioning in 1% B. Following LC separation, peptides were analysed on an Applied Biosystems/MDS Sciex QSTAR® API Pulsar i Hybrid LC–MS/MS system with a MicroIonSpray source (fitted with a 20 µm ID capillary). Positive ESI-MS and MS/MS spectra were acquired using information dependent acquisition (IDA) with an ionspray voltage of 5200 V, and an m/z range 300–2000. IDA settings were 0.5 s for the acquisition of the survey MS spectrum, 0.5 s for product ion spectra on the 1st and 2nd most abundant ions that meet the switch criteria (when ions of m/z 300–2000 and a charge state of two to four exceed 20 counts, switch after one spectrum, excluding former target ions for 60 s), and a cycle time of 1.5 s with the collision energy (CE) calculated automatically from the IDA CE parameter table. Peptide MS and MS/MS data were obtained directly from IDA files using the vendor-provided Mascot script (version 1.6b21). The data were then submitted to Mascot and searched against a local database. Mascot’s MudPIT scoring was employed and separate decoy databases were included in each search, with the requirements for matched proteins to contain at least one highest scoring unique peptide match (Mascot’s ‘bold red’ selection) and a probability threshold of 95% (p = 0.05). Variable modifications included deamidation (N/Q), hydroxylation (P/K), pyro-glutamic acid (N-term Q) and oxidation (M), and one missed cleavage site was allowed. For LC–MALDI analyses 0.5 Da error margins were allowed for both MS and MS/MS precursor masses whereas for LC–ESI analyses 50 ppm and 0.2 Da error margins were allowed for MS and MS/MS, respectively. Searches were carried out with and without a peptide ion score cut-off at 32 in order to indicate peptide matches with extensive homology (as determined by Mascot).

Fig. 3. Reflectron MALDI-TOF mass spectra, obtained on digestion of mastodon and woolly mammoth collagen with trypsin and elution into two fractions (10% and 50% acetonitrile in 0.1% TFA).
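The search tolerances quoted in the methods mix relative (ppm) and absolute (Da) units, so it can help to see what each means at a given m/z. The sketch below is illustrative only and is not part of Mascot; the helper names `ppm_to_da` and `within_tolerance` are hypothetical.

```python
def ppm_to_da(mz, ppm):
    """Convert a relative tolerance in ppm to an absolute window at `mz`."""
    return mz * ppm / 1e6


def within_tolerance(observed_mz, theoretical_mz, tolerance, unit="Da"):
    """Check whether an observed m/z matches a theoretical m/z.

    `unit` is either "Da" (absolute) or "ppm" (relative to the theoretical m/z).
    """
    if unit == "ppm":
        window = ppm_to_da(theoretical_mz, tolerance)
    elif unit == "Da":
        window = tolerance
    else:
        raise ValueError("unit must be 'Da' or 'ppm'")
    return abs(observed_mz - theoretical_mz) <= window


# A 50 ppm window (LC-ESI MS setting) at m/z 1000 is 0.05 Da wide on each side,
# ten times tighter than the 0.5 Da window allowed for LC-MALDI.
print(ppm_to_da(1000.0, 50))
print(within_tolerance(1000.4, 1000.0, 50, unit="ppm"))
print(within_tolerance(1000.4, 1000.0, 0.5, unit="Da"))
```

Because a ppm window scales with m/z, the same 50 ppm setting is wider for large peptides than for small ones, whereas a Da window is constant across the mass range.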


To create a local database containing type I collagen sequences from additional species to those publicly available in UniProt or NCBI databases, an alignment of the total genomic sequences from 28 species was carried out (see Buckley et al., 2008a). The 28 genome sequences included two completed genome projects (human and mouse), with an estimated coverage of over 99% of the euchromatin and an error rate of 1 in 100,000 (International Human Sequencing Consortium, 2004). The remainder comprised seventeen high quality draft sequences from different species based on whole-genome shotgun assemblies with five to eightfold coverage of the genome (including the African elephant) and nine lower quality draft sequences from different species based on whole-genome shotgun assemblies with two-fold or lower coverage. Following genome alignment, the collagen (I) sequences were obtained by comparison to the human genome collagen (I) loci. The sequence for African elephant (renamed elephantid) was improved (gaps and obvious errors manually corrected) using LC–MALDI-TOF–TOF-MS/MS and LC–ESI-qTOF-MS/MS analyses of modern African elephant collagen digests. The collagen sequence for mastodon was inferred from LC–MALDI and LC–ESI analyses of mastodon collagen tryptic digests; all candidate peptides for gaps in the sequence matches to the African elephant sequence were investigated manually. Only sequence changes that coincided with the absence of the equivalent African elephant peptide precursor were considered. Approximately 7161 other bone-related protein sequences taken from UniProt (resulting from a search for ‘bone’) and the common Repository of Adventitious Proteins (cRAP) were added to the local collagens database in order to reduce the probability of obtaining false positive matches. For comparative purposes, the African elephant and proposed ‘Mammut’ collagen sequences from Organ et al. (2008) were also added to this local database.

Amino acid composition analyses were carried out on the proboscidean bone samples following the methods described in Buckley et al. (2008b). In brief, approximately 1 mg of bone powder was treated with 100 µL of 7 M HCl under N2 at 110 °C for 18 h to demineralize the hydroxyapatite and hydrolyse peptide bonds in the protein to release amino acids. Samples were then dried with a centrifugal evaporator and rehydrated with 500 µL of 0.01 mM L-homo-arginine solution. A 2 µL sample was then injected and mixed online with 2.2 µL of derivatizing reagent (260 mM N-iso-butyryl L-cysteine and 170 mM o-phthaldialdehyde in 1 M potassium borate buffer, adjusted to pH 10.4 with potassium hydroxide pellets). The amino acids were separated on a C18 HyperSil BDS column (5 mm × 250 mm) at 25 °C with a gradient elution of three solvents: sodium acetate buffer (23 mM sodium acetate trihydrate, 1.5 mM sodium azide, 1.3 mM ethylenediaminetetraacetic acid, adjusted to pH 6.00 ± 0.01 with 10% acetic acid and sodium hydroxide), methanol and acetonitrile.

Fig. 4. MALDI-MS/MS spectra of the homologous peptide at m/z 3001 and 3015 for mastodon and mammoth, respectively, showing that the amino acid sequence variation occurs at the y20 ion. [Panels: mammoth peptide and mastodon peptide; x-axis m/z, y-axis intensity (A.U.), with the y-ion series labelled.]


3. RESULTS

3.1. Elephant collagen (I) sequence

Modern African elephant collagen sequence obtained from publicly available databases was added to a local database, which was used to search LC–MALDI-MS/MS and LC–ESI-MS/MS data from tryptic digests of African and Asian elephant bone collagens (Fig. 1). Error tolerant searches were used to investigate potential sequence variations between the two taxa but no variations could be identified in the portions of the sequences analysed.

3.2. Fossil bone collagen

Amino acid composition analyses were carried out on seven fossil samples (Fig. 2) and modern African and Asian elephant bone. The amino acid concentrations of the mastodon fossil used are as high as those recorded for modern proboscidean bone samples (total amino acid concentrations ~1.5 µmol/mg), indicating the exceptional preservation of this 10 Ka fossil. The concentrations of the woolly mammoth sample are approximately half those of the modern samples, in line with the suspected greater age (20–60 Ka) of the sample and ideal preservation conditions offered by the North Sea deposits. The remaining 200–650 Ka fossils all have much lower recorded concentrations, 50-fold lower in the case of the Latton mammoth. However, even at these concentrations, both the LC–MS and HPLC analyses indicate collagen to be the dominant molecule in the fossils.

Fig. 5. Bar chart comparing the high-quality sequence coverage (ion scores >32) to the total sequence coverages obtained per sample. Numbers above bars indicate number of ions acquired for analysis by tandem mass spectrometry. [Y-axis: percentage sequence coverage (%). Series: all ESI peptide matches; ESI peptide matches with ion score >32; all MALDI peptide matches. Samples: African elephant and Indian elephant (modern); mastodon (~10 Ka); woolly mammoth (~20–60 Ka); Palaeoloxodon and Steppe mammoth (Latton) (~200 Ka); Steppe mammoth WR4, WRE and WR14 (~650 Ka).]

The remarkable similarity of the relative amounts of each of the detected amino acids in the fossils to modern proboscidean collagen indicates that the collagen (I) detected in the ancient samples is the dominant protein; an abundance of exogenous microbial proteins or the enrichment of endogenous non-collagenous proteins would result in deviations from the collagen-dominated composition shown. The consistent pattern is the more remarkable when it is appreciated that some of these samples date to the Cromerian (~650 Ka).

The presence of collagen was confirmed by collagen fingerprinting analysis of the seven fossil samples using methods described in Buckley et al. (2009). Although the fingerprint of woolly mammoth collagen was almost identical to modern African and Asian elephant standards, a clear difference was observed (Fig. 3) between the mastodon (at m/z 3001) and the elephantids (m/z 3015). The woolly mammoth peptide (Fig. 1) has the same m/z as previously reported for African and Asian elephants (Buckley et al., 2009). The peak at m/z 3015.5 represents GSSGEOGTAGPPGTOGOQGILGPOGILGLOGSR with five hydroxylated proline residues (O) and the peak at 2999.5 represents the same peptide with only four hydroxylated proline residues (underlined O represents the variably hydroxylated proline residue).

The mastodon sample has a similar (16 Da shifted) hydroxylation pattern to elephant, but with an m/z 14 Da lower. Manual interpretation of MS/MS spectra obtained for these two peptides (Fig. 4) indicates that the 14 Da mass difference is the result of a T → S substitution at position 14 along the 33 amino acid peptide (residue 779 in the alpha 2 (I) chain). No other peaks were observed in the PMFs that appear useful for further discrimination between any of the proboscideans analysed.
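The 14 Da and 16 Da shifts described above follow directly from residue masses, and a short calculation reproduces the reported peak positions. This is a minimal sketch using standard monoisotopic residue masses; the function name `mh_plus` is hypothetical, O denotes hydroxyproline as in the peptide sequence given in the text, and the choice of which O to revert to P in the four-Hyp form is arbitrary because it does not affect the mass.

```python
# Standard monoisotopic residue masses (Da); O = hydroxyproline (P + oxygen).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "T": 101.04768, "I": 113.08406, "L": 113.08406, "Q": 128.05858,
    "E": 129.04259, "R": 156.10111, "O": 113.04768,
}
WATER = 18.01056   # added once per peptide for the free termini
PROTON = 1.00728   # singly protonated ion, as observed by MALDI


def mh_plus(peptide):
    """Monoisotopic m/z of the singly protonated peptide, [M+H]+."""
    return sum(MONO[residue] for residue in peptide) + WATER + PROTON


elephantid = "GSSGEOGTAGPPGTOGOQGILGPOGILGLOGSR"    # five Hyp (O) residues
four_hyp = elephantid.replace("O", "P", 1)          # one Hyp fewer (position arbitrary)
mastodon = elephantid[:13] + "S" + elephantid[14:]  # T -> S at position 14

print(f"elephantid, 5 Hyp: {mh_plus(elephantid):.2f}")  # approx. 3015.50
print(f"elephantid, 4 Hyp: {mh_plus(four_hyp):.2f}")    # approx. 2999.51
print(f"mastodon,   5 Hyp: {mh_plus(mastodon):.2f}")    # approx. 3001.49
```

Isoleucine and leucine share the same entry value in the table, which is why the two cannot be told apart by mass alone.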

The LC–MALDI and LC–ESI data files for the seven fossil bone collagen digests were then searched against a local database containing 32 sequences of collagen (I) (see Supplementary material), including the proposed African elephant sequence (Fig. 1) and >7000 other sequences from proteins that could be expected in bone or from sample contamination. As expected, the most dominant (and often only) protein matched was collagen (I). In each of the nine elephantid samples analysed, ranging from the modern African and Asian elephant samples, to the ~650 Ka Steppe mammoth samples, the top match was to the African elephant collagen (I) sequence.

Collagen sequence coverage was compared for the two different ionization methods (LC–MALDI and LC–ESI) in the two modern and seven fossil samples analysed (Supplementary material). Comparison of LC–MALDI and LC–ESI revealed little difference between modern samples, but with the ancient samples the LC–ESI technique produced a much greater sequence coverage (Fig. 5). Due to the low number of peptides unique to LC–MALDI, subsequent analysis considered only the LC–ESI datasets. Despite typical sequence coverage of 60–70%, no amino acid variations could be confirmed between the elephantid species. This is consistent with the immunological data reported by Shoshani et al. (1985), and somewhat lower than the average amino acid mutation rate (0.002) proposed by Miller et al. (2008), but consistent with the conserved nature of collagen.

Comparative examination of the mastodon LC–MS data revealed three unique tryptic peptides (m/z 1874.8, 2284.0/2285.0 and 3001.5), the latter of which is the same alpha 2 (I) chain peptide previously observed by the MALDI-ToF-MS fingerprint analysis. There are two differences in the alpha 1 (I) chain, both observed in the MS/MS spectra from LC–ESI analyses, at Val294-Ile/Leu294 (the latter being isobaric amino acids and therefore indistinguishable using this technique) and Ile693-Ala693 (from elephant to mastodon, respectively; Supplementary material; Fig. 6). The observed masses are unique to the mastodon samples, and are also found in the mastodon dataset released by Asara et al. (2007). A fourth sequence difference was noticed between the partial mastodon collagen (I) sequence reported by Organ et al. (2008) and our African elephant collagen (I) sequence derived from genomic data, Ala427-Thr427, but despite the African elephant variant of this tryptic peptide being observed in the LC–ESI analyses of Asian elephant, straight-tusked elephant, steppe mammoth and woolly mammoth (but not African elephant; Supplementary material), the published mastodon variant was not observed in the LC–MALDI or LC–ESI analysis of our mastodon sample. From this comparative data we propose the following mastodon collagen (I) sequence (Fig. 6).

Fig. 6. Sequence coverage of mastodon collagen against the modified ‘elephantid’ (I) alpha 1 (top) and alpha 2 (bottom) chain where unmarked text indicates matched sequence from ESI search, underlined text indicates ESI matches with ion score >32, grey shading indicates MALDI matched peptides with ion score >32, black shading indicates unmatched sequence and strikethrough indicates unmatched sequence not supported by this or any other report (e.g., Organ et al., 2008). Amino acid differences from African elephant collagen (I) are boxed and in grey text. B = variable Hyp, O = Hyp, J = variable Hyl and U = fixed Hyl.
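Sequence coverage of the kind reported in Figs. 5 and 6 is the percentage of residues in the reference chain that fall within at least one matched peptide. The sketch below shows one plausible way to compute it and is not the calculation performed by Mascot; the function name `sequence_coverage` and the toy sequences are hypothetical, and I is treated as L because the two are isobaric.

```python
def sequence_coverage(protein, peptides):
    """Percentage of `protein` residues covered by at least one peptide.

    I and L are treated as identical (an assumption of this sketch), since
    they cannot be distinguished by mass.
    """
    if not protein:
        return 0.0
    reference = protein.replace("I", "L")
    covered = [False] * len(reference)
    for peptide in peptides:
        query = peptide.replace("I", "L")
        if not query:
            continue
        # Mark every occurrence of the peptide, including repeated motifs.
        start = reference.find(query)
        while start != -1:
            for i in range(start, start + len(query)):
                covered[i] = True
            start = reference.find(query, start + 1)
    return 100.0 * sum(covered) / len(reference)


# Hypothetical toy chain and matched peptides, not taken from the paper.
chain = "GPAGPKGEAGPRGLSGER"
matched = ["GPAGPK", "GLSGER"]
print(f"{sequence_coverage(chain, matched):.1f}% coverage")
```

Overlapping peptides are counted once per residue, so adding a missed-cleavage peptide that spans already matched fragments does not inflate the percentage.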

4. DISCUSSION AND CONCLUSIONS

There are several obvious limitations of sequencing proteins from extinct organisms by inference from MS/MS spectra in comparison to DNA sequencing; the first being the ambiguity of isobaric amino acids, and the potential for errors in sequence identification due to incomplete fragment ion series. A second limitation is that, due to the small number of known complete collagen sequences in public databases (limited to cow, human, mouse, chicken, zebrafish and rainbow trout in the UniProt database), information is lost due to an inability to match the sequences to public databases and can only be recovered by (slow) de novo sequencing. Another limitation is that, due to the nature of current approaches to peptide sequencing, sequences need to be thoroughly interrogated manually rather than solely relying on automated sequencers, as are available in DNA sequencing. Despite the fact that a substantial portion of the sequence (~30%) was not covered by mass spectrometric analyses, the observed differences are consistent with the divergence of mastodon at ~24 Ma (Miller et al., 2008; Rohland et al., 2007; Fig. 7). In the near future, with the ever-improving genomic data obtained for modern elephant and well-preserved woolly mammoth and mastodon specimens, as well as ever-improving proteomics techniques, we anticipate that our proposed sequences will indeed be improved upon.

Fig. 7. Relationships and estimated divergence between the elephantids and mastodon with amino acid substitutions labelled.

We have aimed to show a thorough analysis of proboscidean collagen (I) sequences, in light of data from nine proboscidean samples. The results indicate that we cannot currently use collagen sequencing to separate between various taxa from the family Elephantidae, and are unlikely to be able to differentiate between different mammoth species that are known to be present at sites such as West Runton (e.g., M. trogontherii and Mammuthus meridionalis). However, we were able to confirm the identity of two ‘large mammal’ specimens suspected as being from Steppe mammoth as deriving from a member of the taxonomic family Elephantidae. We also show (in three separate individual specimens) that collagen (I) clearly survives in British Middle Pleistocene fossils ~650 Ka. However, the West Runton Freshwater Bed (WRFB; part of the Cromer Forest Bed), from which these fossils were recovered, offers extraordinary preservation in that the deposits are extremely compact with low permeability. The tendency of freshly exposed bone to darken in colour within hours of excavation indicates the anoxic nature of the deposit (Makridou, 1996) that leads to such exceptional preservation of biomolecules. Collagen sequences are currently being retrieved from the Lower Pleistocene deposits of the Weybourne Crag (~1.5 Ma), immediately below the WRFB (Buckley and Collins, submitted for publication). Although further LC–MS analyses may reveal reliable collagen peptide differences between the elephantids, the data in its current state indicates that the simpler peptide mass fingerprinting method (e.g., Fig. 3) provides a similar level of information at a much reduced cost. The ability to distinguish between mastodons and mammoths may be ideal for studies into Upper Pliocene/Lower Pleistocene deposits where both co-existed and fragmentary material is difficult to identify morphologically and cannot be identified using ancient DNA, e.g., separating mastodons Anancus arvernensis and Zygolophodon borsoni from the mammoth M. meridionalis from within the Red Crag deposits that occur over much of south-east Suffolk (Stuart, 1982).

We have demonstrated the ability to obtain a large amount (60–70%) of sequence coverage of the large (>280 kDa) protein collagen (I) from several Middle Pleistocene temperate British fossils. We also provide support for a proposed elephantid collagen (I) sequence (where the analysed portions of the African and Asian elephant sequences were found to be indistinguishable from the mammoth sequences) and a proposed mastodon collagen (I) sequence containing at least four amino acid differences from the elephantid sequence. The amino acid composition analyses and the LC–MS analyses of the seven different proboscidean fossils add support to the longevity of collagen as a protein capable of surviving hundreds of thousands of years.
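As an illustration of what a sequence coverage figure such as the 60–70% quoted above means, the following minimal sketch computes the fraction of residues in a protein sequence that fall within at least one identified peptide. It is not the authors' pipeline: the function name `sequence_coverage`, the exact-substring matching, and the toy sequence and peptides are assumptions made purely for illustration, and post-translational modifications such as proline hydroxylation are not modelled.

```python
def sequence_coverage(protein: str, peptides: list[str]) -> float:
    """Return the fraction of residues in `protein` covered by at least one peptide.

    Peptides are located by exact substring matching (a simplifying assumption);
    every occurrence of a peptide counts towards coverage.
    """
    if not protein:
        return 0.0
    covered = [False] * len(protein)
    for pep in peptides:
        if not pep:
            continue
        start = protein.find(pep)
        while start != -1:
            for i in range(start, start + len(pep)):
                covered[i] = True
            start = protein.find(pep, start + 1)
    return sum(covered) / len(protein)


# Hypothetical toy sequence and peptides (NOT real collagen data).
toy_protein = "GPAGPRGEAGPQGARGSEGPQGVR"
toy_peptides = ["GPAGPR", "GSEGPQGVR"]

coverage = sequence_coverage(toy_protein, toy_peptides)
print(f"Sequence coverage: {coverage:.1%}")
```

In this toy case 15 of the 24 residues lie inside a matched peptide; overlapping peptides would not be double-counted because coverage is tracked per residue.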

ACKNOWLEDGEMENTS

The authors would like to thank Jerry Herman (National Museum of Scotland) for modern samples, Camilla Nichols (Yorkshire Museum) for the dredged woolly mammoth sample, Andy Currant and Adrian Lister (Natural History Museum, London, UK) for supplying a sample of the Aveley elephant and the late Hezy Shoshani for the American mastodon (“Elmer”) sample. We would also like to thank David Ashford and Jane Thomas-Oates (Centre of Excellence in Mass Spectrometry, University of York) for analytical support. This work was supported by the Natural Environment Research Council [NER/S/J/2004/13017 and NE/G000204/1].

APPENDIX A. SUPPLEMENTARY DATA

The supplementary material of this article includes the collagen sequence coverage as well as the processed LC–MALDI and LC–ESI data text files for each sample analysed. Supplementary data associated with this article can be found, in the online version, at doi:10.1016/j.gca.2011.01.022.

REFERENCES

Asara J. M., Schweitzer M. H., Freimark L. M., Phillips M. and Cantley L. C. (2007) Protein sequences from mastodon and Tyrannosaurus rex revealed by mass spectrometry. Science 316(5822), 280–285.

Buckley M., Walker A., Ho S. Y., Yang Y., Smith C., Ashton P., Thomas-Oates J., Cappellini E., Koon H., Penkman K., Elsworth B., Ashford D., Solazzo C., Andrews P., Strahler J., Shapiro B., Ostrom P., Gandhi H., Miller W., Raney B., Zylber M. I., Gilbert M. T., Prigodich R. V., Ryan M., Rijsdijk K. F., Janoo A. and Collins M. J. (2008a) Comment on “Protein sequences from mastodon and Tyrannosaurus rex revealed by mass spectrometry”. Science 319(5859), 33c.

Buckley M., Anderung C., Penkman K., Raney B., Gotherstrom A., Thomas-Oates J. and Collins M. J. (2008b) Comparing the survival of osteocalcin and mtDNA in archaeological bone from four European sites. J. Archaeol. Sci. 35, 1756–1764.

Buckley M., Collins M. J., Thomas-Oates J. and Wilson J. (2009) Species identification of bone collagen using matrix-assisted laser desorption/ionization mass spectrometry. Rapid Commun. Mass Spectrom. 23, 3843–3854.

Buckley M. and Collins M. J. (submitted for publication) Bone collagen, a molecular barcode for Quaternary fossils. J. Quat. Sci.

Gilbert M. T. P., Tomsho L. P., Rendulic S., Packard M., Drautz D. I., Sher A., Tikhonov A., Dalen L., Kuznetsova T., Kosintsev P., Campos P. F., Higham T., Collins M. J., Wilson A. S., Shidlovskiy F., Buigues B., Ericson P. G. P., Germonpre M., Gotherstrom A., Iacumin P., Nikolaev V., Nowak-Kemp M., Willerslev E., Knight J. R., Irzyk G. P., Perbost C. S., Fredrikson K. M., Harkins T. T., Sheridan S., Miller W. and Schuster S. C. (2007) Whole-genome shotgun sequencing of mitochondria from ancient hair shafts. Science 317, 1927–1930.

Jones A. M., O’Connell T., Young E. D., Scott K., Buckingham C. M., Iacumin P. and Brasier M. D. (2001) Biochemical data from well preserved 200 ka collagen and skeletal remains. Earth Planet. Sci. Lett. 193(1), 143–149.

Lewis S. G., Maddy D., Buckingham C., Coope G. R., Field M. H., Keen D. H., Pike A. W. G., Roe D. A., Scaife R. G. and Scott K. (2006) Pleistocene fluvial sediments, palaeontology and archaeology of the Upper River Thames at Latton, Wiltshire, England. J. Quat. Sci. 21(2), 181–205.

Lister A. and Stuart A. J. (2010) The West Runton mammoth (Mammuthus trogontherii) and its evolutionary significance. Quatern. Int. 228(1–2), 180–209.

Lister A., Bahn P. G. and Auel J. M. (2007) Mammoths: Giants of the Ice Age. Frances Lincoln, London.

Lowenstein J. M. and Scheuenstuhl G. (1991) Immunological methods in molecular palaeontology. Philos. Trans. R. Soc. Lond. B 333, 375–380.

Longin R. (1971) New method of collagen extraction for radiocarbon dating. Nature 230, 241–242.

Makridou E. (1996) The West Runton Elephant: research and conservation programme. Unpublished thesis for Master of Arts, Department of Archaeology, University of Durham.

Miller W., Drautz D. I., Ratan A., Pusey B., Qi J., Lesk A. M., Tomsho L. P., Packard M. D., Zhao F., Sher A., Tikhonov A., Raney B., Patterson N., Lindblad-Toh K., Lander E. S., Knight J. R., Irzyk G. P., Fredrikson K. M., Harkins T. T., Sheridan S., Pringle T. and Schuster S. C. (2008) Sequencing the nuclear genome of the extinct woolly mammoth. Nature 456(7220), 387–390.

M. Buckley et al. / Geochimica et Cosmochimica Acta 75 (2011) 2007–2016

Organ C. L., Schweitzer M. H., Zheng W., Freimark L. M., Cantley L. C. and Asara J. M. (2008) Molecular phylogenetics and Tyrannosaurus rex. Science 320(5875), 499.

Rohland N., Malaspinas A. S., Pollack J. L., Slatkin M., Matheus P. and Hofreiter M. (2007) Proboscidean mitogenomics: chronology and mode of elephant evolution using mastodon as an outgroup. PLoS Biol. 5, e207.

Schweitzer M., Hill C. L., Asara J. M., Lane W. S. and Pincus S. H. (2002) Identification of immunoreactive material in mammoth fossils. J. Mol. Evol. 55(6), 696–705.

Shoshani J., Lowenstein J. M., Walz D. A. and Goodman M. (1985) Proboscidean origins of mastodont and woolly mammoth demonstrated immunologically. Paleobiology 11, 429–437.

Shoshani J. and Tassy P. (2005) Advances in proboscidean taxonomy and classification, anatomy and physiology, and ecology and behavior. Quatern. Int. 126–128, 5–20.

Stuart A. J. (1976) The history of the mammal fauna during the Ipswichian/Last Interglacial in England. Philos. Trans. R. Soc. Lond. B 276(945), 221–259.

Stuart A. J. (1982) Pleistocene Vertebrates in the British Isles. Longman, London.

Yang H., Golenberg E. M. and Shoshani J. (1996) Phylogenetic resolution within the Elephantidae using fossil DNA sequence from the American mastodon (Mammut americanum) as an outgroup. Proc. Natl. Acad. Sci. USA 93, 1190–1194.

Associate editor: Graham A. Logan