
CISER & the Data Reference Interview



Page 1: CISER & the Data Reference Interview

Stuart Macdonald
CISER Data Services Librarian

Email: [email protected]

Data Reference Interview

Page 2: CISER & the Data Reference Interview

Data Archive - Collection and Services

•Established over 30 years ago

•Collection of numeric datasets to support quantitative research

•c. 27,000 online files, in addition to thousands of studies on CD/DVD

•Emphasis on demography (state/federal censuses), economics, health, labor, election studies, attitudinal and behavioral studies, family life etc.

Page 3: CISER & the Data Reference Interview

•Consulting services to match user needs with appropriate data
• finding, accessing and using data

•Current Cornell researchers can download archive files from the online catalog (search & browse) in formats compatible with statistical software

•Data files are identified by a ‘traffic light’ icon that indicates usage level:

• Green – downloadable by anyone

• Yellow – downloadable from links in the catalog with CUWebAuth authentication (for use within the CISER research computing environment - CISERRSCH) – Cornell researchers can apply for a computing account

• Red – restricted-use data (subject to conditions imposed by the data provider)

• Cornell Restricted Access Data Center
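The traffic-light scheme above can be read as a simple access rule. The following is only an illustrative sketch: the names `AccessLevel` and `can_download` are hypothetical, and treating Red files as never directly downloadable is an assumption, since the actual conditions are set by each data provider.

```python
from enum import Enum


class AccessLevel(Enum):
    """Hypothetical encoding of the catalog's traffic-light icons."""
    GREEN = "green"    # downloadable by anyone
    YELLOW = "yellow"  # downloadable after CUWebAuth authentication
    RED = "red"        # restricted use, conditions set by the data provider


def can_download(level: AccessLevel, cornell_authenticated: bool) -> bool:
    """Return True if a file could be downloaded straight from the catalog."""
    if level is AccessLevel.GREEN:
        return True
    if level is AccessLevel.YELLOW:
        return cornell_authenticated
    # Assumption: Red files are handled case by case, never by direct download.
    return False
```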

Page 4: CISER & the Data Reference Interview

•Provides Cornell social science researchers with a repository for sharing and providing long-term preservation of their numeric/statistical research data

•Participates in Cornell’s Research Data Management Service Group

•Assists Cornell social science researchers with Research Data Management (RDM) plans

•Provides Cornell social science researchers with support and expertise in obtaining and using restricted data

Page 5: CISER & the Data Reference Interview

• Data means different things to different people (informatics, geography, art history, systems biology, architecture, archaeology etc.)

• Definition of data / value of data in a commercial sense is different to that in an academic sense

• Data requirements differ for the undergraduate, postgraduate, teacher, researcher

• Data catalogs, data libraries, gateways, portals exist for a range of disciplinary domains

Believe it or not – not all data are the same …

Page 6: CISER & the Data Reference Interview

Research data may include all of the following:
• Text or Word documents, spreadsheets
• Laboratory notebooks, field notebooks, diaries
• Questionnaires, transcripts, codebooks
• Audiotapes, videotapes
• Photographs, films
• Slides, artifacts, specimens, samples
• Database contents including video, audio, text, images
• Models, algorithms, scripts
• Contents of an application such as input, output, log files for analysis software, simulation software, schemas
• Methodologies and workflows
• Standard operating procedures and protocols

Format, size, volume, openness, confidentiality, complexity, flat files – all factors to consider as part of the reference interview (along with computing capabilities, software dependencies, copyright and ethical considerations)

Page 7: CISER & the Data Reference Interview

Data Reference Interview - establish what the user actually needs (not what they think they may need!):

• Statistics or data? Summary statistics, secondary use datasets, raw or derived data
• Software requirements, contingencies
• What is the subject or topic? Health, unemployment, deprivation
• Type of analysis? Visualization, map, statistical analysis, modelling
• What is the unit of analysis? Individual, family, county-level, country-level
• Geographic constraints?
• Time constraints? Range of years, daily, monthly, quarterly, annual
• Cross-sectional or longitudinal?
• Data type? Historic, demographic, financial, administrative, geospatial
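One way to keep track of these questions during an interview is a small record with one field per question, so that unanswered items are easy to spot. This is a minimal sketch, not a CISER tool: the class `DataRequest`, its field names and the helper `open_questions` are all hypothetical, and the sample values are taken from the examples listed above.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class DataRequest:
    """Hypothetical record of a data reference interview; None = not yet asked."""
    statistics_or_data: Optional[str] = None
    software: Optional[str] = None
    topic: Optional[str] = None
    analysis_type: Optional[str] = None
    unit_of_analysis: Optional[str] = None
    geography: Optional[str] = None
    time_period: Optional[str] = None
    design: Optional[str] = None      # cross-sectional or longitudinal
    data_type: Optional[str] = None


def open_questions(request: DataRequest) -> List[str]:
    """List the interview questions that still have no answer."""
    return [f.name for f in fields(request) if getattr(request, f.name) is None]


# Example: only the topic and unit of analysis are known so far.
request = DataRequest(topic="unemployment", unit_of_analysis="county-level")
print(open_questions(request))
```

The remaining field names show what still needs to be established before searching for data.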

Page 8: CISER & the Data Reference Interview

Sets the goals and structure for the data interview and helps articulate any decisions made by the data librarian

Establishes the ‘learning stage of the user’ and helps put them at ease

Observations:

Establish time-line for research and data needs (can buy data librarian time, set priorities, allow time for further investigation)

Fine balance between assistance and exploitation!!

Recognition that data finding, data handling etc may be the learning objective itself (e.g. identifying variables and using a codebook)

All data queries should be viewed as new. It will soon become evident if the request has similarities with previous enquiries.

Page 9: CISER & the Data Reference Interview

Important not to use too much jargon and to double-check understanding of unfamiliar terms. We often use the same word to mean different things and, conversely, different words to mean the same thing

Users will sometimes say they understand when they don’t. If there’s any doubt, ask and explain again.

Keep a supply of up-to-date user guides to hand

Call Management Systems are great knowledge banks

Be familiar with available expertise (colleagues, organization, national, international)

Google is a friend. A very good friend.

Page 10: CISER & the Data Reference Interview

Two recent examples:

Q. Grad student wanting # of plastic surgery clinics in Seoul, South Korea from 1990-2009

A. The International Society of Aesthetic Plastic Surgery (ISAPS - http://www.isaps.org/ ), in particular the ISAPS International Survey on Aesthetic/Cosmetic Procedures – there are data for 2010 and 2011 (http://www.isaps.org/isaps-global-statistics.html ).

Process:
Check NGO sources (World Bank, UN etc.)
Check Google – deep searching into results using a variety of related terms. Time consuming but often productive. Searches often find references in the literature, or discussion forums, which can be followed up.

Page 11: CISER & the Data Reference Interview


User needs statistical data on agrarian violence (originating in land disputes). Variables include: food riots, assassinations (if they occurred as a result of a land dispute), imprisonments etc.

Unit of investigation is country-year; area of interest: Latin American countries; period: 1960 to the present, yearly

Process:
Not likely to be available through NGO sources
Try deep searching through Google – finds literature sources with summary statistics about land disputes for individual countries, but no time series

Responded:
Check the Latin American Network Information Center (LANIC) at the University of Texas at Austin
Speak with our Cornell colleague Sean Knowlton, who has expertise in Latin American statistical resources
Check CEPALSTAT - gateway to statistical information on Latin American and Caribbean countries, published by the Economic Commission for Latin America and the Caribbean

Page 12: CISER & the Data Reference Interview

Social Science research data resources
•Inter-University Consortium for Political and Social Research (ICPSR)
•National Archive of Criminal Justice Data
•Minority Data Resource Center
•National Archive of Computerized Data on Aging

•Roper Center for Public Opinion Research
•International Data Archives e.g. CESSDA, UKDA, Eurostat
• CESSDA catalog (DDI) provides a multi-lingual interface to datasets from member social science data archives across Europe
• Study description and online documentation are free

•Non-Governmental Organizations
•National / Governmental Statistical Agencies

Page 13: CISER & the Data Reference Interview

Social science statistical data on the internet:

CISER Internet Data Sources: https://ciser.cornell.edu/info/datasource.shtml

MIT Data Sources: http://libguides.mit.edu/ssds/any-subject

Columbia University Social Science Data: http://library.columbia.edu/locations/dssc/data/socsc.html

University of California, San Diego – Data on the Web: http://3stages.org/idata/

Most research-driven universities have similar listings via Data Library webpages

Page 14: CISER & the Data Reference Interview

Location & hours:

CISER Data Archive is located at 391 Pine Tree Road, Ithaca

CISER is open 8.30am – 4.30pm (Mon-Fri). Walk-in assistance is not always available, so appointments are recommended

Contacts:

Tel.: (607) 255 4801
Email: [email protected]