

Data Descriptor: Household economy, forest dependency & opportunity costs of conservation in eastern rainforests of Madagascar

Mahesh Poudyal1,2, O. Sarobidy Rakotonarivo3, Julie H. Razafimanahaka4, Neal Hockley1 & Julia P. G. Jones1

The Government of Madagascar is trying to reduce deforestation and conserve biodiversity through creating new protected areas in the eastern rainforests. While this has many benefits, forest use restriction may bring costs to farmers at the forest frontier. We explored this through a series of surveys in five sites around the Corridor Ankeniheny Zahamena new protected area and adjacent national parks. In phase one a stratified random sample of 603 households completed a household survey covering demographic and socio-economic characteristics, and a choice experiment to estimate the opportunity costs of conservation. A stratified sub-sample (n = 171) then completed a detailed agricultural survey (including recording inputs and outputs from 721 plots) and wild-harvested product survey. The data have been archived with ReShare (UK Data Service). Together these allow a deeper understanding of the household economy on the forest frontier in eastern Madagascar and their swidden agricultural system, the benefits households derive from the forests through wild-harvested products, and the costs of conservation restrictions to forest edge communities.

Design Type(s): Cross-sectional study • Stratified Random Sampling

Measurement Type(s): Study of Socioeconomic and Demographic Variables • agricultural process • foraged harvest • Cost/Benefit Analysis

Technology Type(s): survey method

Factor Type(s): conservation area status

Sample Characteristic(s): Homo sapiens • Andasibe-Mantadia National Park • area protected according to IUCN guidelines • Zahamena National Park • Madagascar

1School of Environment, Natural Resources and Geography, Bangor University, Bangor, UK. 2Ecosystem Services for Poverty Alleviation (ESPA) Programme Directorate, Edinburgh, UK. 3Biological and Environmental Sciences, Faculty of Natural Sciences, University of Stirling, Stirling, UK. 4Madagasikara Voakajy, Antananarivo, Madagascar. Correspondence and requests for materials should be addressed to M.P. (email: [email protected])

OPEN

Received: 24 April 2018

Accepted: 29 August 2018

Published: 23 October 2018

www.nature.com/scientificdata

SCIENTIFIC DATA | 5:180225 | DOI: 10.1038/sdata.2018.225 1


Background & Summary

Tropical forests provide vitally important ecosystem services and support valued biodiversity1–3. As a result there are many national and global policies seeking to reduce deforestation4,5. However, forest conservation can result in real costs to forest frontier communities, many of whom are poor and marginalised, by preventing agricultural expansion and restricting access to valued wild-harvested products6,7. International conservation policies recognise that conservation should not make local people worse off, and conservation funded by multilateral donors is also subject to stringent standards to ensure poor people are not made worse off by such investments8. However, there is relatively little empirical data allowing a robust evaluation of the costs of conservation for comparison with any benefits distributed.

While forest-dependent people are difficult to define and therefore to count9, many millions of people living on the forest frontier in tropical countries make their living from small-scale swidden agriculture and harvesting products from the wild.10,11 In many areas these livelihoods face rapid change because of increasing land constraints and reductions in the availability of wild-harvested products due to increases in human populations and overexploitation of previously abundant species12–15 as well as conservation restrictions. In other areas, agricultural intensification and increased salaries for off-farm jobs are resulting in a shift away from forest-dependent livelihoods16. Collecting data on the household economy of forest frontier residents in low income countries can be difficult due to poor market integration and highly diversified income sources17. However, detailed data on the economy of such households is needed to understand the likely impact of potential policies, including protected areas and REDD+ (Reducing Emissions from Deforestation and forest Degradation).

The datasets and surveys described in this paper were part of the project Can Paying 4 Global Ecosystem Services Reduce Poverty? (p4ges) (http://p4ges.org/). The p4ges project ran from 2013 to 2018 and was funded under the Ecosystem Services for Poverty Alleviation (ESPA) programme (www.espa.ac.uk). P4ges aimed to influence the development and implementation of international ecosystem service payment schemes in the interests of poverty alleviation. The project focused on the case study of a new protected area and REDD+ pilot project in eastern Madagascar: the Corridor Ankeniheny Zahamena (CAZ). The World Bank funded the establishment of the CAZ protected area18. This meant that households identified as Project Affected Persons (PAPs) were eligible for compensation under the World Bank’s social safeguard scheme for economic displacement caused by the conservation of the area19.

This paper presents the datasets created for estimating the magnitude and distribution of net local welfare impacts from the conservation approaches taken in and around CAZ. The data was collected in two phases. In the first phase we conducted household surveys with a stratified random sample of 603 households across five sites. We collected information on demographic and basic socio-economic characteristics of the households, including land holdings, general information on the use of wild-harvested products, whether the household had received compensation under the World Bank social safeguard scheme, assets and wealth indicators, and information on social and human capital. Alongside these interviews we also conducted a choice experiment, designed to estimate the opportunity cost of conservation restrictions which prevent households from clearing further agricultural land from forests.20,21 The second phase collected much more detailed information on land use, agricultural practices, off-farm income and wild-harvested products with a stratified random sub-sample of 171 households across four sites.

The datasets described here are already providing rich information for the analysis of opportunity costs of forest use restriction,20–22 evaluating the implementation of the World Bank social safeguards,22,23 and patterns of migration and the role of migration in land use change24. They will also allow opportunity cost estimates using the household production function approach, and in-depth analysis of forest dependency and the swidden agricultural system practiced in this area, including how crop production varies with the land use history of a plot.

Methods

This section introduces the case study, the rationale for the selection of study sites, the sampling strategy, the sampling frame, ethical considerations and the implementation of both phases of data collection. Figure 1 provides a map of the study area with pilot and study sites (a), and an overview of the site selection and sampling strategy (b), while Table 1 summarises the study sites selected.

The case study

The Corridor Ankeniheny Zahamena (CAZ) is a 382,000 ha belt of rainforest linking a number of existing protected areas (most notably Zahamena and Mantadia National Parks) in eastern Madagascar. The CAZ, which is managed by Conservation International on behalf of the Malagasy government, was initially granted temporary protected status in 2006. Its status as an IUCN category VI protected area was confirmed in April 2015. More than 60,000 people live in more than 450 villages in and around this protected area and rely primarily on swidden agriculture, and on collecting wild-harvested products, for their livelihoods18. Because the CAZ is an IUCN category VI protected area, local residents are allowed to collect some forest products for personal use from parts of the protected area. However, swidden agriculture is completely prohibited, meaning that the majority of households will be affected by conservation restrictions imposed due to the new protected status.22,23 The CAZ is considered a REDD+ pilot project as carbon finance is part of the long-term funding plan for CAZ25. Because the establishment of the CAZ was funded by the World Bank, the CAZ environmental and social safeguards plan18 follows the World Bank guidelines in identifying and compensating households considered as Project Affected Persons (commonly known as PAPs). This mandates that anyone who will be economically displaced by the project through restriction of access to natural resources should be compensated. Compensation took the form of micro-development projects such as improved agriculture, small-scale livestock and beekeeping projects26. These were distributed to households identified as PAPs in 2014 (soon after the first phase of our field work).

Selection of study sites

Five study sites, two adjacent to the CAZ new protected area, two next to adjacent long-established protected areas, and one away from the forest frontier (Fig. 1a), were purposively selected for this study following an extensive reconnaissance during January to March 2014. The reconnaissance included collecting site-level information on which to base the selection of the study sites and extensive key informant interviews (with local leaders, elders and school teachers) to ensure we understood the local context and to explore the availability of appropriate sampling frames at each site.

The two sites adjacent to the CAZ new protected area were selected to assess the impacts of the recent conservation restriction on the households. In these sites, although the forest was officially protected under a temporary conservation order (from 2006), there was little enforcement of conservation rules at the time of our data collection. This allowed us to explore livelihoods (and estimate the opportunity cost of conservation restrictions) where the enforcement of conservation restrictions has not yet been fully implemented. To allow us to investigate the effectiveness of compensation provided to households under the World Bank social safeguard process18,26 we intentionally selected one site which had been identified to receive compensation (Ampahitra; south-west on the map in Fig. 1a), and one site which had not (Sahavazina; north-east on the map in Fig. 1a). We selected two study sites next to the existing protected areas (Zahamena National Park and Mantadia National Park on the map in Fig. 1a) to allow us to explore livelihoods (and estimate the opportunity costs of conservation restrictions) in communities with a long history of conservation. All four sites were similar in terms of the households’ reliance on swidden agriculture as the primary mode of production and their lack of basic services, market access and accessible roads.

Figure 1. Selection of study sites and sampling. (a) Map of the Corridor Ankeniheny-Zahamena (CAZ) new protected area indicating the location of existing protected areas and our study sites. (b) Schematic showing the systematic site selection process and development of sampling frame in each site. Steps shown in panel (b): reconnaissance visit around CAZ to identify suitable in-depth study sites; piloting of survey instruments; selection of study sites; fokontany level: collection of locally available information on villages (sketch map with local help); village level: collection of information on households and hamlets (sketch map & GPS); hamlet level: hamlets visited in person to cross-check; sampling frame prepared; random sampling.


Only one site, Ampahitra, had a secondary road passing through the fokontany centre, but the road was virtually impassable during the rainy season and did not have regular scheduled transport. Moreover, the villages selected for study at this site were away from the road, at one to four hours' walking distance. Finally, the fifth site (Amporoforo; east on the map in Fig. 1a) was selected to allow us to explore livelihoods and the land and forest use pattern in a site where forest had been lost (in this case more-or-less complete deforestation by the year 2000, with most of the deforestation occurring during the period 1973-1990)27. This final site was important for understanding how livelihoods might change in response to the loss of forest over time. Each site represented one or two fokontany (the smallest administrative unit in Madagascar).

Developing the sampling frame

The sampling unit in this work was the household, defined as one person living alone or a group of people living together, who pool some, or all, of their resources (labour, income and wealth) and who make common provision for food and other essentials for living28. Information on the location of rural communities in Madagascar and the number of households in each is not easily available, and where the data exist, they tend to be poor and not up-to-date23. In order to build a robust sampling frame for this study, most of our study sites required a thorough census of the households within these sites. As outlined in Fig. 1b, once the study sites were selected, we worked through various levels of local communities, starting from the fokontany down to hamlets, to conduct a census of the households in each site for our sampling frame. This process involved starting with available maps and drawing sketch maps with local informants (including the fokontany president, village chiefs, elders and school teachers) and then visiting villages and hamlets to record the location of households with a handheld GPS. Given the difficulty in accessing many of our sites, and the location of individual households in difficult, hilly terrain, this process was time-consuming (approximately fifty person-days per site).

| Sites | Fokontany(s) (Commune) DISTRICT | Protected status | History of conservation | Enforcement of conservation rules∗ | Compensation provided |
|---|---|---|---|---|---|
| Mantadia | Volove & Vohibazaha (Ambatavola) MORAMANGA | Established Protected Area | Long history of conservation (since 1989) on periphery of Mantadia National Park | Relatively strong | Park entry fees shared with community and used to fund local development projects |
| Zahamena | Antevibe & Ambodivoangy (Ambodimangavalo) VAVATENINA | Established Protected Area | Long history of conservation (since 1927) on periphery of Zahamena National Park | Relatively strong | Park entry fees shared with community and used to fund local development projects |
| Ampahitra | Ampahitra (Ambohibary) MORAMANGA | New Protected Area (limited experience of conservation) | Granted temporary protected status in 2006, formally gazetted in 2015. | Weak | Yes (World Bank social safeguards for new protected area) |
| Sahavazina | Sahavazina (Antenina) TOAMASINA II | New Protected Area (limited experience of conservation) | Granted temporary protected status in 2006, formally gazetted in 2015. | Very weak | No |
| Amporoforo | Amporoforo (Amporoforo) TOAMASINA II | Not applicable (not on forest frontier). | The forest at this site was lost in the 1950s and there is no conservation effort. | Not applicable | Not applicable |

Table 1. Characteristics of the study sites and rationale for their selection. ∗Information based on field observation during site reconnaissance and our knowledge based on interactions with conservation organisations over the past five years.

Stratified random sampling

Once we established the sampling frame at each site, stratified by village as settlements were grouped within each fokontany in small villages, the households for the initial household survey and choice experiment (phase one) were identified through random sampling from each village stratum, proportionally, to reach a set sample size for each site. We did not stratify by wealth status as it was not locally appropriate to discuss the wealth status of households in focus groups, which we would have needed to do to collect this information. The sample size for each site was based on the minimum number of households required for the choice experiment analysis29, as this exceeded the minimum number of households required for other types of data analysis planned. As a rule of thumb, surveying 50 individuals per experimentally designed alternative is acceptable30 in choice experiment surveys. This implies that a minimum of 150 respondents were required for our choice experiment design of three alternatives per choice set. This is the lowest limit which provides adequate variation in the variables of interest for which robust models may be fitted. However, other aspects also determined the sample size: the number of choice sets per respondent, the number of attributes and levels, the task complexity, and the field conditions. We administered six choice sets per household in total (with four attributes of varying levels per choice set) and used a minimum sample of around 100 households per site where the choice experiment was conducted. We needed a particularly large sample for the choice experiment in Ampahitra as we were testing willingness-to-pay and willingness-to-accept formulations20 before settling on the best performing approach for roll out across sites. As choice experiments were not being conducted in Amporoforo (there is no forest easily accessible from this site, meaning it did not make sense to explore the opportunity cost of conservation in this way), the sample size in this site reflects the number of households needed in phase two of the survey. We randomly selected 10–15% more households than needed for our survey in each site to allow for refusals. If a household did not want to be interviewed, we turned to the next household from our sample. Refusals and dropout rates were very low (less than 4% across all sites).
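The sample-size rule of thumb and the proportional draw from village strata described in the stratified random sampling section can be illustrated with a short sketch. This is not the authors' procedure: the village names and household counts are invented, the largest-remainder rounding rule and the fixed 15% replacement allowance are assumptions, and only the figures of 50 respondents per alternative, three alternatives per choice set and a 10–15% allowance come from the text.

```python
import math
import random


def min_respondents(alternatives, per_alternative=50):
    """Rule-of-thumb minimum sample: respondents per alternative x alternatives."""
    return alternatives * per_alternative


def proportional_allocation(stratum_sizes, target):
    """Split `target` across strata in proportion to stratum size.

    Uses largest-remainder rounding (an assumption) so the parts sum to `target`.
    """
    total = sum(stratum_sizes.values())
    quotas = {k: target * n / total for k, n in stratum_sizes.items()}
    alloc = {k: math.floor(q) for k, q in quotas.items()}
    shortfall = target - sum(alloc.values())
    # Give the leftover units to the strata with the largest fractional parts.
    by_remainder = sorted(quotas, key=lambda k: quotas[k] - alloc[k], reverse=True)
    for k in by_remainder[:shortfall]:
        alloc[k] += 1
    return alloc


def draw_sample(frame, target, allowance_pct=15, seed=0):
    """Draw a stratified random sample, inflated to allow for refusals."""
    rng = random.Random(seed)
    inflated = math.ceil(target * (100 + allowance_pct) / 100)
    sizes = {village: len(households) for village, households in frame.items()}
    alloc = proportional_allocation(sizes, inflated)
    return {v: rng.sample(frame[v], min(alloc[v], len(frame[v]))) for v in frame}


# Hypothetical sampling frame: household IDs grouped by village stratum.
frame = {
    "village_A": [f"A{i:03d}" for i in range(120)],
    "village_B": [f"B{i:03d}" for i in range(60)],
    "village_C": [f"C{i:03d}" for i in range(20)],
}

print(min_respondents(alternatives=3))
sample = draw_sample(frame, target=100)
print({village: len(ids) for village, ids in sample.items()})
```

Drawing the replacement households at the same time as the main sample, as here, keeps the substitutes random rather than leaving the choice of a replacement to the interviewer in the field.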

Ethical considerations and procedures

The surveys covered potentially sensitive subjects such as wealth indicators and land holdings. As clearing forest for swidden agriculture is forbidden in the study area, questions about land use and the history of swidden agricultural plots are particularly sensitive. Given we needed to record individual identifying information (to allow us to relocate households and tie their information together for the in-depth phase two interviews with a sub-sample of those initially interviewed), we had to be particularly careful to ensure that informants’ data was protected. Bangor University's College of Natural Sciences ethics committee approved this study, including the information sheet, consent protocol and the data collection protocol. All members of the survey team received ethics training before carrying out fieldwork, which covered topics such as the principle of voluntary participation, informed consent, and anonymity and data protection. At the beginning of each survey, we introduced the research team and the project objectives to the respondents, explained why they had been selected for interview, and explained that their participation in the research was voluntary and they could leave at any time. We explained that no information that could identify households or individuals would be shared outside the research teams at Bangor University and the University of Antananarivo and that the research would be used to help others understand about life in their village and how decisions made about conservation might influence them. We gave each household a leaflet to keep which explained the aims of the research in the local language and contained contact details for those responsible for the project with photos and names of the research team overleaf. Details of the consent forms and the information sheet are provided with the archived data. Both archived datasets described in this paper are completely anonymised. We collected some qualitative information alongside the quantitative data recorded in the archive. Given the sensitive nature of the issues discussed and the difficulty in anonymising this qualitative information, it has not been archived alongside the quantitative datasets.

Participants in the household survey and choice experiment (phase one) were given a small gift of useful items to a total value of 3,000 ariary (approximately $1) as a gesture of appreciation at the end of the survey. Before giving this gift we asked households if they would be willing to be included in our sample for follow up interviews. The detailed agricultural surveys (phase two) took a full day and required the household head to work as a guide for the day, taking us to their fields, so we paid the daily wage rate of 5,000 ariary (approximately $1.85). The follow up wild-harvested product survey was compensated in the same way as the initial household survey and choice experiment.

Survey implementation

Figure 2 shows images of the data collection, which help convey the field work context.

Phase one (household survey and choice experiment) and the first round of phase two (agricultural surveys) were implemented collaboratively by Bangor University (School of Environment, Natural Resources and Geography) and the University of Antananarivo (Ecole Supérieure des Sciences Agronomiques). The follow up wild-harvested product survey was implemented by Madagasikara Voakajy in close collaboration with the Bangor University and University of Antananarivo teams. Our team spent more than 150 person-days in each site, usually with two or three people in the field together. We stayed with members of the community rather than camping separately or travelling in every day. This helped greatly in building relationships and trust.

Extensive field testing of the survey instruments was carried out in order to refine the questions and ensure they could be well understood by our target population and that respondents were comfortable answering them. The pilot sites for testing the survey instruments were chosen to resemble the actual study sites in their characteristics (see Fig. 1a, where they are indicated by orange-coloured squares). Initial survey instruments for all phases were first piloted during February and March 2014 in the pilot study sites. The survey questionnaires were tested on individual respondents (over 20 in total), and the relevance of the questions was discussed with focus groups (3-4 in each pilot site). Based on these pilots, survey questionnaires were modified, particularly the sections related to the wealth indicators, land use and types of crops farmed, collection of wild-harvested products and social capital. For the discrete choice experiment (DCE) survey, initial piloting helped decide on the key attributes and their levels for the final design of the choice sets. The choice experiment presentation and framing (lengthy warm up steps and use of dolls and large images to desensitise forest clearance) were also significantly informed by the piloting20. The final survey instruments were tested again in the pilot sites during June 2014 before the main surveys. The final testing was also important for the training of the research assistants.

The bulk of the fieldwork was carried out by a relatively small team (see acknowledgements) of highly trained research assistants who remained involved in the project from the development of the survey instruments, through data collection, cleaning, archiving and analysis. Three additional short-term assistants helped with the initial household survey, and we had one additional assistant helping with the agricultural survey in one site. Interviews were conducted in person with the household heads (or another adult member of the household where household heads were not available). Interviews were conducted in Malagasy by native speakers comfortable with the dialect of the region.

The surveys were implemented between July 2014 and November 2015 (Table 2). We collected the data on paper forms in both phases and for all sites. While we experimented with data entry onto tablet computers in the field, problems with keeping them charged and the challenges of working on small screens meant we entered the majority of data collected after returning from the field. Data was entered by the research assistants themselves after they returned to the capital after each round of interviews in each site. Survey data and accompanying documents including all survey instruments in English and Malagasy are available on the ReShare repository (Data Citation 1; Data Citation 2).

Figure 2. Images showing the field work context. (a) Many of the communities visited during this work were remote and only accessible on foot. (b) Due to lack of a robust sampling frame we used local knowledge to update available maps and then visited every village and hamlet. (c) Most people in the area are poor with most houses made of locally available materials. (d) A swidden agricultural system (known locally as ‘tavy’) is practiced throughout the study area. (e) Conservation restrictions prevent expansion of agricultural land or collection of most wild-harvested products. (f) Micro-development projects such as improved bean cultivation have been offered by the authorities in compensation for conservation costs. (g) & (h) Wild-harvested products including freshwater crabs and products used for medicine and flavouring are important to local livelihoods.

Phase one (household survey and choice experiment)

The purpose of the phase one interviews was to collect data on the demographic and basic socio-economic characteristics of the households (including migration status, land holdings, wealth indicators and general information on the use of wild-harvested products) and to estimate the opportunity cost of conservation from a relatively large sample using a discrete choice experiment, a stated preference technique (further details below). We also recorded whether the household had received compensation under the World Bank social safeguard scheme. A secondary purpose was to provide information from which to identify a representative sample for the more in-depth interviews in phase two. Interviews in phase one took between one and two hours each.

The socio-demographic and wealth indicators in the household survey were developed from a combination of questions from the Poverty and Environment Network (PEN) household surveys31 and the World Bank’s Living Standards Measurement Surveys (LSMS). Many of the assets and wealth questions had to be adapted by the team to the rural Malagasy context: the assets owned by many households are so limited that the standard LSMS items would not separate our households sufficiently. This adaptation was based on the expert judgement of our team (among our wider team we had people with many decades of experience of field research in rural Madagascar), as well as on the information gathered during pilot surveys. Project-specific questions were added about land use and access to and use of forests. The key variables covered in the survey are shown in Table 3.

The choice experiment aimed to assess the opportunity costs experienced by households preventedfrom clearing forest for swidden agriculture due to the introduction of conservation restrictions. Thechoice experiment used the willingness to accept (WTA) format in all sites, with willingness to pay(WTP) format also being conducted in one site to compare the two formats20. Choice experiment isrooted in Lancaster’s model of consumer choice, which proposed that consumers derive satisfaction notfrom goods themselves but from the attributes they provide32. For example, a lake can be decomposed in

Phase one: household survey and choice experiment (Data Citation 1; N = 603)

| Site | Period of survey | Recall period∗ | Sampling frame/sample size† | Households surveyed |
| --- | --- | --- | --- | --- |
| Ampahitra | Jul–Aug 2014 | Sep 2013–Jun 2014 | 431/260 | 203 |
| Mantadia | Aug–Sep 2014 | Sep 2013–Jun 2014 | 241/141 | 104 |
| Zahamena | Oct–Nov 2014 | Sep 2013–Jun 2014 | 673/175 | 152 |
| Sahavazina | Feb 2015 | Sep 2013–Jun 2014 | 409/125 | 95 |
| Amporoforo‡ | Feb–Mar 2015 | Sep 2013–Jun 2014 | 175/50 | 49 |

Phase 2a: Agricultural survey (Data Citation 2; n = 171)

| Site | Period of survey | Recall period∗ | Households surveyed |
| --- | --- | --- | --- |
| Ampahitra | Aug–Nov 2014 | Sep 2013–Jun 2014 | 50 |
| Zahamena | Nov 2014–May 2015 | Sep 2014–May 2015 | 41 |
| Sahavazina | May–Jun 2015 | Sep 2014–May 2015 | 40 |
| Amporoforo | Jun 2015 | Sep 2014–May 2015 | 40 |

Phase 2b: Wild-harvested product survey (Data Citation 2; n = 169)

| Site | Period of survey | Recall period | Households surveyed |
| --- | --- | --- | --- |
| Ampahitra | Oct–Nov 2014 | Last 12 months | 50 |
| Zahamena | Aug–Sep 2015 | Last 12 months | 40 |
| Sahavazina | Oct 2015 | Last 12 months | 40 |
| Amporoforo | Nov 2015 | Last 12 months | 39 |

Table 2. Timing, recall period, sampling frame and sample size (number of households surveyed) in each survey phase. ∗Recall period refers to the period of inputs and outputs relating to agricultural and off-farm livelihood activities. Households’ demographic and other socio-economic data, such as wealth indicators, relate to the situation at the time of the survey. †Sample size is the number of households randomly sampled, including a 10–15% replacement allowance. ‡In Amporoforo, choice experiments were not planned, so we limited the sample size to what was required for phase two of the surveys in this site.


terms of its turbidity, its recreational facilities (fishing, swimming) and its ecological quality. Changing attribute levels essentially results in a different good being valued; choice experiments therefore focus on changes in these attributes. The attributes and levels in our surveys were informed by three focus group discussions and pilot testing of the design with 50 respondents. Based on the focus groups and pilot tests, four attributes with varying levels were included in the final choice tasks: (i) total cash donations (framed as development assistance), or total payments made to the government in the WTP format; (ii) number of annual instalments over which the household would receive or pay the cash; (iii) support for improved rice farming; and (iv) clearance of new forestlands for agriculture. The forest clearance attribute had three levels: free clearance (forest protection is lifted), a permit for one hectare of clearance, and no clearance (strict enforcement of forest conservation). We combined alternative levels of the four attributes in choice tasks using an efficient design that seeks to minimise the standard errors of the coefficients to be estimated33, and optimised the fractional factorial design for D-efficiency for the multinomial logit model based on information on the signs of the parameters obtained from the piloting34. The design generated 12 choice tasks, which were divided into two blocks of six choice tasks each; each respondent was randomly assigned one of the two blocks. An example of a choice task is presented in Fig. 3, and more detail on the choice experiment is available in the documentation included with Data Citation 1 and in refs 20, 21 and 29.
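The blocking step described above can be sketched in a few lines of Python. This is an illustration under stated assumptions only: the task identifiers, the consecutive split into blocks and the respondent IDs are hypothetical, and the text does not describe how tasks were actually allocated to blocks.

```python
import random

# Hypothetical identifiers for the 12 choice tasks produced by the design.
CHOICE_TASKS = list(range(1, 13))


def split_into_blocks(tasks, n_blocks=2):
    """Split the task list into consecutive, equal-sized blocks (an assumed rule)."""
    if len(tasks) % n_blocks != 0:
        raise ValueError("tasks cannot be divided evenly into blocks")
    size = len(tasks) // n_blocks
    return [tasks[i * size:(i + 1) * size] for i in range(n_blocks)]


def assign_blocks(respondent_ids, n_blocks=2, seed=None):
    """Randomly assign each respondent the index of one block."""
    rng = random.Random(seed)
    return {rid: rng.randrange(n_blocks) for rid in respondent_ids}


blocks = split_into_blocks(CHOICE_TASKS)
# Hypothetical respondent IDs; a fixed seed only makes the sketch repeatable.
assignment = assign_blocks(["HH001", "HH002", "HH003", "HH004"], seed=1)
for rid, block_index in assignment.items():
    print(rid, "answers block", block_index + 1, "tasks", blocks[block_index])
```

Simple random assignment like this does not guarantee equal numbers of respondents per block; a survey team wanting balance might alternate blocks within each site instead.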

At the end of the survey, respondents were asked whether they would be willing to be interviewed in further surveys; 94% agreed to be included in our sampling frame for phase two.

Phase two (in-depth agricultural survey and wild-harvested product survey)
Phase two interviews were carried out with a stratified random sample of the phase one households. We stratified households according to household size and land holding, and randomly sampled 40–50 households per site to represent these strata. The agricultural survey often took a whole day to complete, while the wild-harvested product follow-up survey took between 30 minutes and 1.5 hours.
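As a rough illustration of stratified random sampling of this kind, the sketch below draws a sample from strata defined by household size and land holding. The field names (`size_class`, `land_class`), the class boundaries and the proportional allocation rule are all assumptions made for the example; the text does not specify how households were allocated across strata.

```python
import random
from collections import defaultdict


def stratified_sample(households, strata_key, n_total, seed=None):
    """Draw roughly n_total households, allocated proportionally across strata."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for hh in households:
        strata[strata_key(hh)].append(hh)
    sample = []
    for key in sorted(strata):
        members = strata[key]
        # Proportional allocation (assumed), with at least one household per stratum.
        k = max(1, round(n_total * len(members) / len(households)))
        sample.extend(rng.sample(members, min(k, len(members))))
    return sample


# Hypothetical phase one households for a single site.
households = [
    {
        "id": i,
        "size_class": "small" if i % 2 == 0 else "large",
        "land_class": ("low", "mid", "high")[i % 3],
    }
    for i in range(200)
]
sample = stratified_sample(
    households, lambda hh: (hh["size_class"], hh["land_class"]), n_total=40, seed=7
)
print(len(sample), "households sampled")
```

Because each stratum's share is rounded separately, the realised sample can differ slightly from `n_total`.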

Agricultural survey. The agricultural survey comprised two major sections: (i) general household-level information, including land ownership, land access, land use, livestock inputs and outputs, off-farm income, and information on clearance of primary forest; and (ii) field (plot) level data on the history of each individual plot (when it was cleared from forest, crop cycles) and inputs and outputs over the last complete agricultural year. We visited all accessible plots and mapped the boundaries of over 80% of those visited with a handheld GPS, in order to understand the distribution of land holdings relative to the forest frontier and to calculate the size of each plot (649 visited, 520 mapped, out of 898 plots recorded). Field-level data for all accessible plots was recorded at the plot location to aid recall, and in-depth agricultural input-output data was collected for all plots cultivated by the households during the previous full farming year (721 in total). Overall, we collected plot-level data for 898 plots belonging to the 171 households.
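One common way to obtain a plot's size from a GPS-mapped boundary is to project the vertices onto a local plane and apply the shoelace formula. The sketch below shows that approach; it is not necessarily the method used to derive the archived plot sizes, and the coordinates are made up.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def plot_area_ha(boundary):
    """Area in hectares of a plot given (lat, lon) vertices in decimal degrees."""
    if len(boundary) < 3:
        raise ValueError("a plot boundary needs at least three vertices")
    lat_ref, lon_ref = boundary[0]
    mean_lat = math.radians(sum(lat for lat, _ in boundary) / len(boundary))
    # Local equirectangular projection: metres east/north of the first vertex.
    points = [
        (
            EARTH_RADIUS_M * math.radians(lon - lon_ref) * math.cos(mean_lat),
            EARTH_RADIUS_M * math.radians(lat - lat_ref),
        )
        for lat, lon in boundary
    ]
    # Shoelace formula over consecutive vertex pairs, closing the polygon.
    twice_area = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0 / 10_000.0


# Made-up boundary (walked in order) for a hypothetical plot.
example_plot = [
    (-18.0000, 48.5000),
    (-18.0000, 48.5010),
    (-18.0008, 48.5010),
    (-18.0008, 48.5000),
]
print(f"{plot_area_ha(example_plot):.2f} ha")
```

The planar approximation is reasonable for field-sized polygons; handheld GPS error is likely to matter more than the projection for small plots.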

Cognisant of the potential limitations of recall surveys, we identified a number of households in Zahamena who agreed to keep agricultural logs (daily or weekly) for the ongoing agricultural year, allowing us to investigate the reliability of recall in the agricultural surveys. These logs covered the same information as the agricultural input-output and off-farm income surveys, but were recorded either daily or weekly. The formats used by the households to keep these logs and the data produced are archived alongside the main data. Where inputs and outputs were recorded in local units, these have also been converted to standard units in the dataset. Information on specific conversion factors was gathered

| Section | Information included |
| --- | --- |
| Information about the household | Place of birth and ethnicity of the household head, when the household was formed, marital status of the household head, and a detailed roster of all members of the household including age, sex, relation to the household head, education and occupation. |
| Land ownership, land access and land use | Households were asked to list all the plots they had access to in the last agricultural year and, for each, to provide information on the type of field, the tenure situation, plot age, fertility and size. For plots which are loaned or rented there are additional questions about who they are loaned or rented to. They were also asked whether they could have cultivated more land with the resources available and if not, why they didn’t cultivate more land. |
| Wild-product harvest | For each of the categories (firewood, roofing material, material for walls, timber, material for floor and weaving materials) we asked them to report on their preferred species, the type of land they collect them from, and whether they were used in the household or sold. They were also invited to provide such information for other categories of wild-harvested products important to them. |
| Compensation provided for the costs of conservation | Households were asked whether they took part in the safeguard assessment in 2009/2010, whether they were identified as a project affected person, whether they received a micro-development project under the social safeguard scheme, and what the project was. |
| Assets and wealth indicators | We collected information on the number of rooms, roof and floor construction of their main house and any ‘field houses’, the number of each category of livestock owned, and whether they owned each of a list of household items. We also asked the number of months they had enough to eat in the last farming year, the type of light used in the house and how often they had sufficient light. |
| Social and human capital | They were asked about training courses attended by any member of the household and membership of community groups. They were also asked a series of questions aimed at evaluating support networks, e.g. ‘If I have serious problems (losing crops/illness etc.), I get help from my neighbour/wider community’. |

Table 3. The main sections of the household survey showing the groups of variables collected.


at the community level from local markets and through measurement by researchers in the field as required.
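A conversion of this kind amounts to a lookup of site- and product-specific factors. The sketch below is a minimal illustration: the unit names (`tin`, `basket`) and every factor value are hypothetical placeholders, not the factors recorded in the archived metadata.

```python
# Hypothetical conversion factors (kg per local unit), keyed by site, product and unit.
CONVERSION_KG = {
    ("Zahamena", "rice", "tin"): 0.3,
    ("Zahamena", "rice", "basket"): 12.0,
    ("Ampahitra", "rice", "tin"): 0.28,
}


def to_kg(site, product, unit, quantity):
    """Convert a quantity in a local unit to kilograms using site-specific factors."""
    if unit == "kg":
        return float(quantity)
    key = (site, product, unit)
    if key not in CONVERSION_KG:
        # Fail loudly rather than guess a factor for an unknown unit.
        raise KeyError(f"no conversion factor for {key}")
    return quantity * CONVERSION_KG[key]


print(to_kg("Zahamena", "rice", "basket", 2), "kg")
```

Keying the table by site as well as unit reflects the statement that factors were gathered at the community level, since the same named container may hold different amounts in different places.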

Finally, we carried out yield measurements for hill rice on a selected number of plots during the harvest period in Zahamena. The selection of these plots was opportunistic rather than random, based on which plots were ready for harvesting during our team’s visit to the site. A total of 11 hill rice plots were selected for yield measurement. To estimate the plot yield, we randomly selected ten 1 m² quadrats in each plot, harvested the rice from these quadrats with the help of members of the households farming the plot, and measured the harvest in both standard and local units. The average yield from the 10 sample quadrats was used to extrapolate to the whole plot using the plot area.
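The extrapolation is simple arithmetic: mean harvest per square metre multiplied by plot area. A minimal sketch follows, in which the quadrat weights and the plot area are made-up numbers used only to show the calculation.

```python
from statistics import mean


def plot_yield_kg(quadrat_kg, plot_area_m2, quadrat_area_m2=1.0):
    """Extrapolate mean quadrat harvest (kg) to the whole plot."""
    if not quadrat_kg:
        raise ValueError("at least one quadrat measurement is required")
    mean_kg_per_m2 = mean(quadrat_kg) / quadrat_area_m2
    return mean_kg_per_m2 * plot_area_m2


# Made-up harvest weights (kg) from ten 1 m² quadrats in one hypothetical plot.
quadrats = [0.12, 0.15, 0.10, 0.18, 0.14, 0.11, 0.16, 0.13, 0.12, 0.19]
plot_area_m2 = 2500.0  # hypothetical plot of 0.25 ha
estimate = plot_yield_kg(quadrats, plot_area_m2)
print(f"plot yield: {estimate:.0f} kg; per hectare: {estimate / plot_area_m2 * 10_000:.0f} kg/ha")
```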

Wild-harvested product survey. We attempted to reach all households interviewed in the agricultural survey for the wild-harvested product survey (carried out a few months later in each site). We reached all but two households, so the follow-up surveys covered 169 of the 171 households targeted.

Before conducting the wild-harvested product survey in each site, we carried out focus group discussions to identify a prioritised list of the most important wild-harvested products in that site. The survey then gathered data on each of the wild-harvested products on the prioritised list, plus additional products that a household considered important. Data collected included seasonal collection of wild-harvested products, their use and sale, their importance to the livelihood of the household, their historical abundance, and perceptions of their likely availability in the future.

Code availability
This study did not use any custom computer code to generate the dataset. Microsoft Excel was used to enter, store and quality check the collected data. Where new variables and associated data were calculated or derived from other survey variables, these are detailed in the metadata along with the functions used to derive them.

Data Records
Two archived datasets, both containing primary data and accompanying documentation, relate to this data paper (see Table 2). The first covers phase one (the household survey and choice experiment) (P4GES_HHS_CE.zip, Data Citation 1). The data was collected between June 2014 and June 2015 and comprises 603 households. The second covers phase two (the agricultural survey and wild-harvested product survey) (P4GES_AGR_WHP.zip, Data Citation 2).

Technical Validation
A number of steps were taken from the beginning of the study to ensure the quality of the data collected and of the datasets deposited in the archive. First, we conducted extensive reconnaissance, including key informant interviews, to ensure our research team (many of whom had experience of the study region prior to this project) had a good understanding of the local context to inform development of the survey instruments. We also conducted extensive pre-testing of questions to ensure they were relevant and well understood locally.

Second, our research team included native Malagasy speakers, and only one of the non-Malagasy researchers involved in designing the survey and developing protocols in the field does not speak Malagasy. The

Figure 3. Example of a choice card in the willingness to accept (WTA) format of the choice experiment.


Malagasy versions of the survey instruments were therefore developed alongside the English versions, and we constantly discussed how particular concepts would be communicated in Malagasy. The lead researcher from Bangor University who developed the choice experiment (OSR) is a native Malagasy speaker, which was vitally important in ensuring that this challenging survey was developed in a way that could be effectively understood by respondents. The validity of the choice experiment was explicitly tested before being rolled out across study sites20,21. The archived English versions of the instruments were reverse translated from the Malagasy versions used in the field.

Third, we made a conscious decision not to depend on a large team of enumerators hired after the surveys were developed, trained and then deployed to collect data. Instead, we worked closely with a relatively small group of research assistants, the majority of whom already had experience in the study area and of social research, and the majority of whom stayed with the project for at least three years (through reconnaissance, development of the survey instruments, piloting, data collection, cleaning and archiving). Many of these research assistants are co-authors on the papers coming out of this dataset. In the phase one surveys, this core team was helped by between one and three assistants (depending on the site). We provided comprehensive training to everyone involved in data collection. The Acknowledgements section provides the names of everyone involved in data collection.

Fourth, the senior researchers (the authors of this paper) closely supervised data collection in the field, particularly at the beginning of the work in each phase at each site. The lead researcher who developed the choice experiment (OSR) was in the field for the whole of phase one (household survey and choice experiment) at two of the sites (Ampahitra and Mantadia; three months in total). MP was also regularly in the field for extended periods, spending eight weeks in the field in total, while JPGJ, NH and JHR made shorter visits.

Fifth, at the end of each survey our team recorded answers to questions about their perception of how well the interview had gone, including how open and engaged the respondent seemed and how well they had understood the process. They also kept daily logs based on their interviews and interactions with the respondents, and other general observations in the field, and compared notes with each other on a daily basis. These were helpful in learning from each other and in resolving any minor issues during the surveys.

Finally, a great deal of time and effort was put into data entry, data checks and validation. In the majority of cases, the research assistants who carried out the surveys entered the data for the households that they had interviewed. This raw data was then checked systematically by another member of the team for any errors or inconsistencies. The data was then coded for clarity and consistency. The coded data was checked further by senior members of the research team, especially for potential typing errors and extreme values. Where inconsistencies and potential errors were found, these were checked against the original raw data as well as the original paper survey sheets to reach a resolution.
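For readers who want to repeat a screen for extreme values on the archived data, one conventional option is an interquartile-range rule, sketched below. This is an assumed rule chosen for illustration; the checks described above were carried out in Microsoft Excel and the exact criteria are not stated. The plot sizes are made up.

```python
from statistics import quantiles


def flag_extreme(values, k=1.5):
    """Return indices of values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [i for i, v in enumerate(values) if v < low or v > high]


# Made-up plot sizes in hectares; 50.0 mimics a typing error for 0.50.
plot_sizes_ha = [0.4, 0.5, 0.6, 0.5, 0.7, 0.45, 0.55, 50.0, 0.65, 0.5]
suspect = flag_extreme(plot_sizes_ha)
print("rows to check against paper sheets:", suspect)
```

Flagged rows are candidates for checking against the original survey sheets, not values to delete automatically.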

Usage Notes
Data access conditions
Individual household-level data was anonymised before submission to the archive, and we have not included GPS locations of the households (this information was converted to ‘distance to the forest frontier’). However, given the detailed nature of the data collected at the household level, particularly regarding household demography and on- and off-farm income activities, there is a certain level of sensitivity surrounding the data. Households within each survey site, particularly sites with small sampling frames, could potentially be identified from the survey data. For this reason, both datasets described in this paper have been made available as safeguarded data on the UK Data Archive’s data repository, ReShare. Anyone wishing to download and use these data must register with the UKDA and agree to the conditions of their End User Licence, outlined at https://www.ukdataservice.ac.uk/get-data/how-to-access/conditions#/tab-end-user-licence. For commercial use, please contact the UK Data Service at [email protected].
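To illustrate how a precise location can be replaced by a coarser distance measure, the sketch below computes the great-circle distance from a household to the nearest of a set of frontier points. It is a generic approach shown under assumptions: the way the archived ‘distance to the forest frontier’ variable was actually derived is not described here, and all coordinates are made up.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius in kilometres


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (
        math.sin(d_phi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def distance_to_frontier_km(household, frontier_points):
    """Distance from a household to the nearest sampled frontier point."""
    lat, lon = household
    return min(haversine_km(lat, lon, f_lat, f_lon) for f_lat, f_lon in frontier_points)


# Made-up household location and frontier vertices.
frontier = [(-18.00, 48.60), (-18.05, 48.61), (-18.10, 48.63)]
print(f"{distance_to_frontier_km((-18.04, 48.55), frontier):.1f} km")
```

Measuring to sampled points rather than to the line segments between them slightly overestimates the distance when frontier vertices are widely spaced.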

Variables included in both phase one and two
Some questions relating to land use and land holdings, and to the collection and use of wild-harvested products, were asked in both phase one and phase two, so for the subsample interviewed in both phases some of this information is repeated in the two datasets. For a general analysis using a large sample size, the phase one data is most appropriate, whereas the phase two data allows for more in-depth analysis.
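When working with both datasets, the repeated variables can be kept apart by prefixing them with their phase before joining records. The sketch below assumes, hypothetically, that the two datasets share a household identifier column called `hh_id` and a repeated variable called `land_ha`; the real variable names should be taken from the archived documentation.

```python
def link_phases(phase_one, phase_two, key="hh_id"):
    """Join phase two rows to their phase one rows, prefixing variables by phase."""
    by_id = {row[key]: row for row in phase_one}
    linked = []
    for row in phase_two:
        p1 = by_id.get(row[key])
        if p1 is None:
            continue  # phase two household not found in phase one
        merged = {key: row[key]}
        merged.update({f"p1_{name}": value for name, value in p1.items() if name != key})
        merged.update({f"p2_{name}": value for name, value in row.items() if name != key})
        linked.append(merged)
    return linked


# Made-up records; field names are hypothetical.
phase_one = [
    {"hh_id": "A01", "land_ha": 1.2},
    {"hh_id": "A02", "land_ha": 0.8},
    {"hh_id": "A03", "land_ha": 2.0},
]
phase_two = [{"hh_id": "A01", "land_ha": 1.1}, {"hh_id": "A03", "land_ha": 2.0}]
linked = link_phases(phase_one, phase_two)
for row in linked:
    print(row)
```

Keeping both versions side by side makes it easy to compare the quicker phase one answers with the more detailed phase two measurements for the same household.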

References
1. Baccini, A. et al. Estimated carbon dioxide emissions from tropical deforestation improved by carbon-density maps. Nat. Clim. Change 2, 182–185, doi:10.1038/nclimate1354 (2012).

2. Newbold, T. et al. A global model of the response of tropical and sub-tropical forest biodiversity to anthropogenic pressures. P. Roy. Soc. B-Biol. Sci. 281, doi:10.1098/rspb.2014.1371 (2014).

3. Lawrence, D. & Vandecar, K. Effects of tropical deforestation on climate and agriculture. Nat. Clim. Change 5, 27–36, doi:10.1038/nclimate2430 (2015).

4. Nepstad, D. et al. Slowing Amazon deforestation through public policy and interventions in beef and soy supply chains. Science 344, 1118–1123, doi:10.1126/science.1248525 (2014).

5. Gray, C. L. et al. Local biodiversity is higher inside than outside terrestrial protected areas worldwide. Nat. Commun. 7, 12306, doi:10.1038/ncomms12306 (2016).


6. Balmford, A. & Whitten, T. Who should pay for tropical conservation, and how could the costs be met? Oryx 37, 238–250, doi:10.1017/S0030605303000413 (2003).

7. Adams, V. M., Pressey, R. L. & Naidoo, R. Opportunity costs: Who really pays for conservation? Biol. Conserv. 143, 439–448, doi:10.1016/j.biocon.2009.11.011 (2010).

8. IFC. Performance standards on environmental and social sustainability. International Finance Corporation (IFC) (2012).

9. Newton, P., Miller, D. C., Byenkya, M. A. A. & Agrawal, A. Who are forest-dependent people? A taxonomy to aid livelihood and land use decision-making in forested regions. Land Use Policy 57, 388–395, doi:10.1016/j.landusepol.2016.05.032 (2016).

10. Vedeld, P., Angelsen, A., Bojö, J., Sjaastad, E. & Kobugabe Berg, G. Forest environmental incomes and the rural poor. Forest Policy Econ. 9, 869–879, doi:10.1016/j.forpol.2006.05.008 (2007).

11. Angelsen, A. et al. Environmental Income and Rural Livelihoods: A Global-Comparative Analysis. World Dev. 64, S12–S28, doi:10.1016/j.worlddev.2014.03.006 (2014).

12. Coomes, O. T., Takasaki, Y. & Rhemtulla, J. M. What fate for swidden agriculture under land constraint in tropical forests? Lessons from a long-term study in an Amazonian peasant community. J. Rural Stud. 54, 39–51, doi:10.1016/j.jrurstud.2017.06.002 (2017).

13. Jiao, X., Smith-Hall, C. & Theilade, I. Rural household incomes and land grabbing in Cambodia. Land Use Policy 48, 317–328, doi:10.1016/j.landusepol.2015.06.008 (2015).

14. Cramb, R. A. et al. Swidden transformations and rural livelihoods in Southeast Asia. Hum. Ecol. 37, 323–346, doi:10.1007/s10745-009-9241-6 (2009).

15. Alexander, J. S., McNamara, J., Rowcliffe, J. M., Oppong, J. & Milner-Gulland, E. J. The role of bushmeat in a West African agricultural landscape. Oryx 49, 643–651, doi:10.1017/S0030605313001294 (2015).

16. Lambin, E. F. & Meyfroidt, P. Global land use change, economic globalization, and the looming land scarcity. Proc. Natl. Acad. Sci. USA 108, 3465–3472, doi:10.1073/pnas.1100480108 (2011).

17. Angelsen, A., Larsen, H. O., Lund, J. F., Smith-Hall, C. & Wunder, S. (Eds.) Measuring Livelihoods and Environmental Dependence: Methods for Research and Fieldwork. (Routledge, Abingdon: Oxon and New York, NY, 2011).

18. World Bank. Plan de gestion environnementale et de sauvegarde sociale de la reserve de ressources naturelles Ankeniheny Zahamena (2012).

19. Hall, A. Social policies in the World Bank: Paradigms and challenges. Global Social Policy 7, 151–175, doi:10.1177/1468018107078160 (2007).

20. Rakotonarivo, O. S., Bredahl Jacobsen, J., Poudyal, M., Rasoamanana, A. & Hockley, N. Estimating welfare impacts where property rights are contested: methodological and policy implications. Land Use Policy 70, 71–83, doi:10.1016/j.landusepol.2017.09.051 (2018).

21. Rakotonarivo, O. S. et al. Qualitative and quantitative evidence on the true local welfare costs of forest conservation in Madagascar: Are discrete choice experiments a valid ex-ante tool? World Dev. 94, 478–491, doi:10.1016/j.worlddev.2017.02.009 (2017).

22. Poudyal, M. et al. Who bears the cost of forest conservation? PeerJ 6, e5106, doi:10.7717/peerj.5106 (2018).

23. Poudyal, M. et al. Can REDD+ social safeguards reach the 'right' people? Lessons from Madagascar. Global Environ. Chang. 37, 31–42, doi:10.1016/j.gloenvcha.2016.01.004 (2016).

24. Jones, J. P. G. et al. Human migration to the forest frontier: Implications for land use change and conservation management. Geo: Geography and Environment 5, e00050, doi:10.1002/geo2.50 (2018).

25. Rainforest Alliance. Reduced emissions from deforestation in the Ankeniheny-Zahamena Corridor, Madagascar - Validation Report: VCS Version 3 (2013).

26. MEEF/SAPM/CI. Etude de faisabilité des sous-projets de compensation des personnes affectées par le projet, nouvelle aire protégée du Corridor Ankeniheny-Zahamena (NAP CAZ). Programme Environmental III: Madagascar (2013).

27. Harper, G. J., Steininger, M. K., Tucker, C. J., Juhn, D. & Hawkins, F. Fifty years of deforestation and forest fragmentation in Madagascar. Environ. Conserv. 34, 325–333, doi:10.1017/S0376892907004262 (2007).

28. United Nations. Principles and Recommendations for Population and Housing Censuses, Revision 2. (UN Department of Economic and Social Affairs, Statistics Division, 2008).

29. Rakotonarivo, O. S. Improving the choice experiment method in assessing the local welfare impacts of forest conservation in low-income countries: Empirical evidence from Madagascar. PhD thesis (Bangor University and University of Copenhagen, 2016).

30. Hensher, D. A., Rose, J. M. & Greene, W. Applied Choice Analysis: A Primer. (Cambridge University Press, 2005).

31. Poverty Environment Network. The PEN prototype questionnaire in English, version 4.4 https://www.cifor.org/pen/the-pen-prototype-questionnaire/ (2008).

32. Lancaster, K. J. A new approach to consumer theory. J. Polit. Econ. 74, 132–157 (1966).

33. Ferrini, S. & Scarpa, R. Designs with a priori information for nonmarket valuation with choice experiments: A Monte Carlo study. J. Environ. Econ. Manage. 53, 342–363, doi:10.1016/j.jeem.2006.10.007 (2007).

34. Scarpa, R. & Rose, J. M. Design efficiency for non-market valuation with choice modelling: how to measure it, what to report and why*. Aust. J. Agr. Resour. Ec. 52, 253–282, doi:10.1111/j.1467-8489.2007.00436.x (2008).

Data Citations
1. Poudyal, M. et al. Colchester, Essex: UK Data Archive https://dx.doi.org/10.5255/UKDA-SN-852435 (2018).
2. Poudyal, M. et al. Colchester, Essex: UK Data Archive https://dx.doi.org/10.5255/UKDA-SN-852790 (2018).

Acknowledgements
The authors gratefully acknowledge the residents in all the study sites who gave up several hours of their time to be interviewed on multiple occasions. Research permission was granted by the Ministry of Environment, Ecology, Sea and Forests (45/14/MEF/SG/DGP/DCB. SAP/SCBref). The work was conducted under the Bangor University research ethics framework. A brilliant team was involved in data collection and cleaning; we thank them all for their hard work in difficult conditions. The phase one surveys were primarily conducted by Alexandra Rasoamanana, Rina Mandimbiniaina and Nilsen Andrianantenaina (with help from Domoina Tiaray Rafanomezantsoa, Soatata Razafimandimby and Rio Heriniaina). The agricultural surveys were carried out by Alexandra Rasoamanana, Rina Mandimbiniaina and Nilsen Andrianantenaina (with help from Ntsiva Andriatsitohaina). The team for the wild-harvested product surveys was Manjakarivo Randrianarisoa, Victor Rakotomboavonjy, Jean Charles Rabakoson and Roma Randrianavelona (and Jacyntha Ambinintsoa helped clean this dataset). We thank Bruno Ramamonjisoa for insights and advice and Harifidy Rakoto Ratsimba for vital logistical support. This survey was part of the project Can Paying 4 Ecosystem Services reduce poverty? (p4ges, NE/K010220/1),


funded through the Ecosystem Services for Poverty Alleviation (ESPA) programme. The ESPA programme is funded by the Department for International Development, the Economic and Social Research Council and the Natural Environment Research Council. We thank all the p4ges project partners for excellent collaboration.

Author Contributions
M.P.: study site selection and sampling strategy, questionnaire design, survey coordination and supervision, data management and archiving, led the writing of the article; O.S.R.: study site selection and sampling strategy, questionnaire and choice experiment design, household and choice experiment survey, data entry and data management; J.H.R.: design of qualitative and household survey instruments and coordination of the wild-harvested product survey, supervision of wild-harvested product survey data entry and management including data checks; N.H.: site selection and sampling strategy, questionnaire and choice experiment design, piloting of survey instruments, survey supervision, data checks; J.P.G.J.: site selection and sampling strategy, questionnaire design, piloting of survey instruments, survey supervision, data checks. M.P. and J.P.G.J. wrote the article with specific comments and contributions from the other authors.

Additional Information
Competing interests: The authors declare no competing interests.

How to cite this article: Poudyal, M. et al. Household economy, forest dependency & opportunity costs of conservation in eastern rainforests of Madagascar. Sci. Data 5:180225 doi: 10.1038/sdata.2018.225 (2018).

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/

The Creative Commons Public Domain Dedication waiver http://creativecommons.org/publicdomain/zero/1.0/ applies to the metadata files made available in this article.

© The Author(s) 2018
