ciser-edina
Data Archive - Collection and Services
•Established over 30 years ago
•Collection of numeric datasets to support quantitative research
•c. 27,000 online files, in addition to thousands of studies on CD/DVD
•Emphasis on demography (state/federal censuses), economics, health, labor, election studies, attitudinal and behavioral studies, family life etc.
•Consulting services to match user needs with appropriate data
• finding, accessing and using data
•Current Cornell researchers can download archive files from the online catalog (search & browse) in formats compatible with statistical software
•Data files are identified by a ‘traffic light’ icon that indicates usage level:
• Green – downloadable by anyone
• Yellow – downloadable from links in the catalog with CUWebAuth authentication (for use within the CISER research computing environment - CISERRSCH) – Cornell researchers can apply for a computing account
• Red – restricted-use data (subject to conditions imposed by the data provider)
• Cornell Restricted Access Data Center
•Provides Cornell social science researchers with a repository for sharing and providing long-term preservation of their numeric/statistical research data
•Participates in Cornell’s Research Data Management Service Group
•Assists Cornell social science researchers with Research Data Management (RDM) plans
•Provides Cornell social science researchers with support and expertise in obtaining and using restricted data
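The 'traffic light' usage levels described above can be read as a simple access rule. The sketch below is only an illustration of that rule, not the catalog's actual implementation; the names `AccessLevel` and `can_download` are hypothetical, and it assumes the only input besides the icon colour is whether the user has authenticated via CUWebAuth.

```python
from enum import Enum


class AccessLevel(Enum):
    """Hypothetical encoding of the catalog's 'traffic light' icons."""
    GREEN = "green"    # downloadable by anyone
    YELLOW = "yellow"  # downloadable after CUWebAuth authentication
    RED = "red"        # restricted by conditions set by the data provider


def can_download(level: AccessLevel, authenticated: bool) -> bool:
    """Return True if a direct catalog download would be allowed (assumed rule)."""
    if level is AccessLevel.GREEN:
        return True
    if level is AccessLevel.YELLOW:
        return authenticated
    # RED: never a direct download; use is governed by the provider's conditions.
    return False
```

For example, `can_download(AccessLevel.YELLOW, authenticated=False)` is `False`, while the same call with `authenticated=True` is `True`.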
• Data means different things to different people (informatics, geography, art history, systems biology, architecture, archaeology etc.)
• Definition of data / value of data in a commercial sense is different to that in an academic sense
• Data requirements differ for the undergraduate, postgraduate, teacher, researcher
• Data catalogs, data libraries, gateways, portals exist for a range of disciplinary domains
Believe it or not – not all data are the same …
Research data may include all of the following:
• Text or Word documents, spreadsheets
• Laboratory notebooks, field notebooks, diaries
• Questionnaires, transcripts, codebooks
• Audiotapes, videotapes
• Photographs, films
• Slides, artifacts, specimens, samples
• Database contents including video, audio, text, images
• Models, algorithms, scripts
• Contents of an application such as input, output, log files for analysis software, simulation software, schemas
• Methodologies and workflows
• Standard operating procedures and protocols
Formats, size, volume, open, confidentiality, complexity, flat files – factors to consider as part of the reference interview (computing capabilities, software dependencies, copyright and ethical considerations)
Data Reference Interview - establish what the user actually needs (not what they think they may need!) :
• Statistics or data? Summary statistics, secondary use datasets, raw or derived data
• Software requirements, contingencies
• What is the subject or topic? Health, unemployment, deprivation
• Type of analysis? Visualization, map, statistical analysis, modelling
• What is the unit of analysis? Individual, family, county-level, country-level
• Geographic constraints?
• Time constraints? Range of years, daily, monthly, quarterly, annual
• Cross-sectional or longitudinal?
• Data type? Historic, demographic, financial, administrative, geospatial
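The interview questions above can also be kept as a structured record, so that unanswered questions are easy to spot. This is a minimal sketch under that assumption, not a tool used by the archive; `DataRequest` and `open_questions` are hypothetical names, and the field names simply mirror the checklist.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class DataRequest:
    """One field per interview question; None means not yet established."""
    statistics_or_data: Optional[str] = None
    software: Optional[str] = None
    topic: Optional[str] = None
    analysis_type: Optional[str] = None
    unit_of_analysis: Optional[str] = None
    geography: Optional[str] = None
    time_range: Optional[str] = None
    design: Optional[str] = None       # cross-sectional or longitudinal
    data_type: Optional[str] = None


def open_questions(request: DataRequest) -> List[str]:
    """List the checklist items the interview has not yet answered."""
    return [f.name for f in fields(request) if getattr(request, f.name) is None]


# Illustrative request: only some answers are known after the first contact.
request = DataRequest(
    topic="agrarian violence",
    unit_of_analysis="country-year",
    geography="Latin American countries",
    time_range="1960 to present, yearly",
)
remaining = open_questions(request)
```

Here `remaining` lists the items still to be raised with the user, such as software requirements and the type of analysis.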
Sets the goals and structure for the data interview and helps articulate any decisions made by the data librarian
Establishes the ‘learning stage of the user’ and helps put them at ease
Observations:
Establish time-line for research and data needs (can buy data librarian time, set priorities, allow time for further investigation)
Fine balance between assistance and exploitation!!
Recognition that data finding, data handling etc may be the learning objective itself (e.g. identifying variables and using a codebook)
All data queries should be viewed as new. It will soon become evident if the request has similarities with previous enquiries.
Important not to use too much jargon and to double-check understanding of unfamiliar terms – we often use the same word to mean different things and, conversely, different words to mean the same thing
Users will sometimes say they understand when they don’t. If there’s any doubt, ask and explain again.
Keep a supply of up-to-date user guides to hand
Call Management Systems are great knowledge banks
Be familiar with available expertise (colleagues, organization, national, international)
Google is a friend. A very good friend.
Two recent examples:
Q. Grad student wanting # of plastic surgery clinics in Seoul, South Korea from 1990-2009
A. the International Society of Aesthetic Plastic Surgery (ISAPS - http://www.isaps.org/ ) in particular the ISAPS International Survey on Aesthetic/Cosmetic Procedures – there’s data for 2010 and 2011 (http://www.isaps.org/isaps-global-statistics.html ).
Process:
Check NGO sources (World Bank, UN etc.)
Check Google – deep searching into results using a variety of related terms. Time consuming but often productive. Searches often find references in the literature, or discussion forums, which can be followed up.
User needs statistical data about agrarian violence (originating from land disputes); variables include: food riots, assassinations (if they occurred as a result of a land dispute), imprisonments etc.
Unit of investigation is country-year; area of interest: Latin American countries; period: from 1960 until now, yearly
Process:
Not likely to be available through NGO sources
Try deep searching through Google – find literature sources with summary statistics about land disputes for individual countries – no time series
Responded:
Check the Latin America Network Information Center (LANIC) at the Univ. of Texas at Austin
Speak with our Cornell colleague Sean Knowlton, who has expertise in Latin American statistical resources
Check CEPALSTAT – gateway to statistical information of Latin America and the Caribbean countries published by the Economic Commission for Latin America and the Caribbean
Social Science research data resources
•Inter-University Consortium for Political and Social Research (ICPSR)
•National Archive of Criminal Justice Data
•Minority Data Resource Center
•National Archive of Computerized Data on Aging
•Roper Center for Public Opinion Archives
•International Data Archives e.g. CESSDA, UKDA, Eurostat
• CESSDA catalog (DDI) provides a multi-lingual interface to datasets from member social science data archives across Europe
• Study description and online documentation are free
•Non-Governmental Organizations
•National / Governmental Statistical Agencies
Social science statistical data on the internet:
CISER Internet Data Sources: https://ciser.cornell.edu/info/datasource.shtml
MIT Data Sources: http://libguides.mit.edu/ssds/any-subject
Columbia University Social Science Data: http://library.columbia.edu/locations/dssc/data/socsc.html
University of California, San Diego – Data on the Web: http://3stages.org/idata/
Most research-driven universities have similar listings via Data Library webpages
Location & hours:
CISER Data Archive is located at 391 Pine Tree Road, Ithaca
CISER is open 8.30am – 4.30pm (Mon-Fri) – walk-in assistance is not always available, so appointments are recommended
Contacts:
Tel.: (607) 255 4801
Email: [email protected]