Census 2000 symposium, session 4, paper 26
Archiving Census Documentation and Microdata: Preserving Memory, Increasing Stakeholders
Wendy L. Thomas and Robert McCaa


Page 1

Archiving Census Documentation and Microdata: Preserving Memory, Increasing Stakeholders

* * *

Wendy L. Thomas and Robert McCaa
Minnesota Population Center
http://www.ipums.org/International

IPUMS International, funded by the National Science Foundation of the United States

Page 2

Subtext: Preserving census metadata and microdata enhances value of census and increases stakeholders

» Microcomputer revolution --> new uses for census data, specifically microdata
» Effective use of microdata requires systematic preservation of metadata
» Availability of microdata --> enhances the value of censuses and increases stakeholders
» IPUMS International consortium promotes preservation and use of census microdata

Page 3

16th century Aztec census (in Nahuatl, 1530s): “Here is the home of...”
(from Museum of Anthropology, Mexico City)

Figure panels: original ms., digitized, transcribed, translated

Page 4

Census microdata of the 21st (and late 20th) century: Who will preserve them? Will they be made usable?

121001026007007200000112100001042220020260070072000001121000010432300100600700720000012123000000423002004007000000000000000000005230020020070000000000000000000062300200000700000000000000000000

Census microdata:
Public goods should be used
Censuses are costly
Where microdata are available, they are used
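
The unbroken digit string on this slide shows why microdata cannot be used without their documentation: meaning lives in the column positions, and only a codebook records them. The following is a minimal Python sketch of reading one fixed-width record against a codebook. The field names, column positions, and sample record are all invented for illustration; they are assumptions and do not describe the layout of the record shown on the slide.

```python
# Hypothetical codebook: field name -> (start, end) column offsets,
# 0-based with end exclusive. These positions are illustrative
# assumptions, not the layout of any real census file.
CODEBOOK = {
    "record_type": (0, 1),
    "household_id": (1, 6),
    "person_number": (6, 8),
    "age": (8, 11),
    "sex": (11, 12),
}


def parse_record(line, codebook):
    """Slice one fixed-width record into named integer fields."""
    width = max(end for _, end in codebook.values())
    if len(line) < width:
        raise ValueError(
            f"record has {len(line)} columns, codebook needs {width}"
        )
    return {
        name: int(line[start:end])
        for name, (start, end) in codebook.items()
    }


# Invented sample record (not taken from the slide).
sample = "200042030351"
person = parse_record(sample, CODEBOOK)
print(person)
```

Without the codebook dictionary the same twelve digits could be split in many other ways, which is the practical argument for archiving metadata alongside the data.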

Page 5

…official statistics that meet the test of practical utility are to be compiled and made available on an impartial basis by official statistical agencies to honor citizens’ entitlement to public information.
-- UN Statistical Commission, 1994

Page 6

How anonymized census samples became a standard statistical product:

» USA: 1960, 1970, 1980, 1990: varying densities; gaining on CPS as most widely used demographic microdata
» Canada:
- 1971, 1976, 1981, 1986, 1991, 1996: varying designs
- 1996: Data Liberation Initiative led to an explosion of usage in research and teaching
» UK:
- 1991: 2% individuals, 0.5% households; hundreds of publications, thousands of users
- 2001: double the densities.

Page 7

IPUMSi helps five ways:

» 1. Inventory the world’s census microdata
» 2. Preserve endangered microdata and documentation
* * *
» 3. Anonymize census microdata to preserve statistical confidentiality, using highest standards (Stat. Nether.)
» 4. Integrate datasets of selected countries using UN, Eurostat and other standards
» 5. Disseminate database free with complete copies to all partners

Integrated Public Use Microdata Series - International

Page 8

IPUMSi PAYS

National experts in each country are contracted to:

» Assemble microdata and documentation
» Develop samples to minimize confidentiality risks and maximize robustness
» Design national integration plan
- census-by-census
- concept-by-concept
- code-by-code
» Write integrated documentation

Page 9

IPUMSi INVENTORIES

» Microdata...for any population or administrative division: Nation, province, district, city, ethnic group, etc.
» Example: Latin America
- 20 countries
- 67 censuses inventoried
- 1% - 100% sample densities
- 100,000 to 150 million cases
- 19th century: 2 censuses; 1960s: 14; 1970s: 17; 1980s: 16; 1990s: 17
» Found: complete census data for Colombia 1973 and 16 other countries

Page 10

IPUMSi PRESERVES

UN Demographic Center for Latin America (CELADE, Santiago, Chile)
~3000 microdata tapes to be preserved, and metadata (documentation)

Page 11

Preserve against accident, deterioration and technological obsolescence

» Microdata:
- transfer to stable media
- use standard data storage protocols
- entrust copies with at least two depositories
» Metadata: collect, catalogue, and reproduce
- Enumeration forms (preserve all versions used)
- Enumerator and data processing instructions
- Codebooks (photocopies and scanned images)
- Technical studies, evaluations, reports

UN Stat. Div.: entire archive to be preserved, catalogued
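
Holding copies in at least two depositories guards against loss only if the copies can be shown to be identical over time. One common way to check this is to compare checksums; the slides do not mention checksums, so the following Python sketch is an assumption about how such a check might look, not a description of IPUMSi practice. The file names are hypothetical and temporary files stand in for the depositories.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(master, copies):
    """Compare each depository copy against the master file's checksum."""
    expected = sha256_of(master)
    return {str(copy): sha256_of(copy) == expected for copy in copies}


# Temporary files stand in for a master file and two depository copies.
with tempfile.TemporaryDirectory() as tmp:
    master = Path(tmp) / "census_master.dat"
    good_copy = Path(tmp) / "depository_a.dat"
    bad_copy = Path(tmp) / "depository_b.dat"
    master.write_bytes(b"1210010260070072\n")
    good_copy.write_bytes(b"1210010260070072\n")
    bad_copy.write_bytes(b"1210010260070073\n")  # one corrupted digit
    report = copies_match(master, [good_copy, bad_copy])
    results = list(report.values())
    print(results)
```

A single changed digit in a fixed-width file silently alters a coded value, so a byte-level comparison of this kind is a reasonable complement to moving data onto stable media.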

Page 12

IPUMSi ANONYMIZES

Using the highest standards currently available:
- technical (Eurostat workshops)
- administrative (license agreement)

Imagine a new statistical product: a scientifically anonymized census microdata sample made up of unidentifiable individuals...

Page 13

Anonymized census microdata samples available for European countries
(* = in IPUMSi consortium, ** = negotiating)

» 16 countries available via PAU, 1990 round (3 in IPUMSi, 4 negotiating):
Belgium, Czech Republic, Estonia, **Finland, *Hungary, **Italy, Latvia, Lithuania, **Norway, Poland, *Spain, Sweden, Switzerland, **Russia, Turkey, *UK
» 11 countries not available via PAU (2 in IPUMSi):
*Austria, Croatia, Denmark, *France, Germany, Iceland, Ireland, **Netherlands, Portugal, Slovak Republic, Slovenia

Page 14

International Monetary Fund’s General Data Dissemination System
52 countries with uniform standards

» All embrace strict standards of statistical confidentiality
» Prohibit disclosure of information which may identify individuals or entities
» 37 of 52 countries distribute anonymized census microdata samples
» Microdata samples are becoming standard statistical products

Page 15

IPUMSi INTEGRATES

Photos from Colombia integration project, February-March, 2000:
4 experts from DANE (census office) + 7 academics (3 universities)

Standard: UN/Eurostat Principles & Recs...

Census documentation compiled for Colombian microdata

Page 16

IPUMSi DISSEMINATES

International web-based access system

» End-User license agreement
- protects privacy and confidentiality
- assures proper use
» User selects
- countries,
- cases,
- variables, and
- samples--makes chronological &/or cross-national research possible using census microdata
» Open architecture software and mirror sites available to all partners
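
The selection steps on this slide (countries, cases, variables, samples) amount to filtering rows and columns of an integrated file. The Python sketch below illustrates that idea only; it is not the IPUMSi access system, and the records, country codes, years, and variable names are invented for the example.

```python
# Invented records for illustration; country codes, years and
# variable names are hypothetical.
RECORDS = [
    {"country": "CO", "year": 1973, "age": 34, "sex": 1, "literate": 1},
    {"country": "CO", "year": 1985, "age": 12, "sex": 2, "literate": 1},
    {"country": "MX", "year": 1970, "age": 67, "sex": 2, "literate": 0},
    {"country": "MX", "year": 1990, "age": 25, "sex": 1, "literate": 1},
]


def make_extract(records, samples, variables, case_filter=lambda r: True):
    """Return the chosen variables for cases in the chosen samples.

    samples is a list of (country, year) pairs; case_filter keeps
    only the cases for which it returns True.
    """
    wanted = set(samples)
    return [
        {name: rec[name] for name in ("country", "year", *variables)}
        for rec in records
        if (rec["country"], rec["year"]) in wanted and case_filter(rec)
    ]


# Adults only, two variables, three samples across two countries.
extract = make_extract(
    RECORDS,
    samples=[("CO", 1973), ("MX", 1990), ("MX", 1970)],
    variables=["age", "literate"],
    case_filter=lambda r: r["age"] >= 18,
)
for row in extract:
    print(row)
```

Because every sample shares the same variable names and codes after integration, one request can span several countries and census years, which is what makes chronological and cross-national comparison practical.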

Page 17

Population censuses became universal in the 20th century. Will census microdata ... in the 21st?

153 countries with 1 million + pop. in 2000
2000 round figures are provisional

Page 18

additional information at: http://www.ipums.org/international

* * * * * *

Thank you

Page 19

Preserving Memory, Increasing Stakeholders

» 1. Introduction: Well-preserved documentation and data --> effective data collection, dissemination, use
» 2. Long-term preservation of documentation and data
» 3. Determining What to Preserve
» 4. Assessing Future Value
» 5. Inventory of available technology/ personnel/ knowledge
» 6. Conclusion: Preserve and make accessible census microdata to enhance value of census (IPUMSi)