Finding and Using Secondary Data and Resources for Research
Karen Whiteman, PhD
June 10, 2014


Page 1: Finding and Using Secondary Data  and Resources for Research

Finding and Using Secondary Data and Resources for Research

Karen Whiteman, PhD
June 10, 2014

Page 2: Finding and Using Secondary Data  and Resources for Research

Overview of Presentation

• What is Secondary Data?
• Finding and Accessing Data
• Online Demo
• Creating a Personalized Dataset
• Support and Resources

Page 3: Finding and Using Secondary Data  and Resources for Research

Secondary Data Myths

• Secondary data in research is more time consuming and complicated than other methodologies.

False. While there is a certain degree of difficulty using secondary data, working with secondary data can and should be adapted to the skill level of the researcher.

• Secondary data is inferior to the alternative of collecting one’s own data.

False. Using secondary data is not a replacement for personal data collection; it is most useful in conjunction with other methodologies, such as experimentation, survey research, or clinical research.

Page 4: Finding and Using Secondary Data  and Resources for Research

Weighing the Pros and Cons

Pros
• Secondary data are often collected using well-established measures with known psychometric properties for the specific population being studied
• Many secondary datasets contain, or can be created to provide, diverse samples that are likely to be representative of broader populations
• Secondary datasets are often large enough to provide good statistical power for most types of planned analyses
• Cost-effective (e.g., takes less time than collecting your own data)
• IRB review is likely to be expedited

Cons
• Inability to select specific questions or measurements
• Lack of control over precise timing of data collection

Page 5: Finding and Using Secondary Data  and Resources for Research

Questions to Consider…

• Are you interested in looking at data at one point in time or over time?
• Are you interested in qualitative, quantitative, or mixed methods?
• What group of people do you want to study (target population)?
  • Older adults
  • LGBT
  • Specific race/ethnicity
  • Rural/urban
  • City, State, National, International
• What topic area are you interested in?
  • Criminal justice
  • Education
  • HIV/AIDS
  • Mental health
  • Substance use

Page 6: Finding and Using Secondary Data  and Resources for Research

Questions to Consider…


Page 7: Finding and Using Secondary Data  and Resources for Research

Approach to Successful Research with Large Datasets

1. Define your research topic and research questions
2. Select a database
3. Get to know your database
4. Structure your analysis and presentation of findings in a way that is clinically meaningful

Page 8: Finding and Using Secondary Data  and Resources for Research

Data Banks


• Inter-university Consortium for Political and Social Research (United States)

• The UK Data Service

• Council of European Social Science Data Archives

• Australian Social Science Data Archive

Page 9: Finding and Using Secondary Data  and Resources for Research

Inter-university Consortium for Political & Social Research


Page 10: Finding and Using Secondary Data  and Resources for Research

Types of Data

Quantitative
• Micro data are the coded numerical responses to surveys, with a separate record for each individual respondent
• Macro data are aggregate figures, for example country-level economic indicators
* Data formats include SAS, SPSS, Stata, R

Qualitative

Restricted files
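The difference between micro and macro data can be illustrated with a small sketch. It assumes a made-up respondent-level file (the field names `respondent_id`, `country`, and `income` and all values are hypothetical, not taken from any real dataset) and collapses it into one aggregate figure per country.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical micro data: one record per respondent (values are made up).
micro = [
    {"respondent_id": 1, "country": "US", "income": 42000},
    {"respondent_id": 2, "country": "US", "income": 58000},
    {"respondent_id": 3, "country": "UK", "income": 39000},
    {"respondent_id": 4, "country": "UK", "income": 45000},
]


def aggregate_by(records, group_key, value_key):
    """Collapse respondent-level records into one mean value per group."""
    groups = defaultdict(list)
    for row in records:
        groups[row[group_key]].append(row[value_key])
    return {group: mean(values) for group, values in groups.items()}


# Macro data: one aggregate figure per country instead of one row per person.
macro = aggregate_by(micro, "country", "income")
print(macro)
```

Macro files published by data banks typically arrive already aggregated; the sketch only shows how the two levels relate.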

Page 11: Finding and Using Secondary Data  and Resources for Research

Benefits of Large Scale Government Data

• Good quality data
  • Produced by experienced research organizations
  • Usually nationally representative with large samples
  • Good response rates
  • Very well documented
  • Can contact the agency with questions
• Hierarchical data
  • Treatment model effects on individual
  • Intra-household effects on individual
• Longitudinal data
  • Allows for comparisons over time
  • Provides experience working with longitudinal datasets

Page 12: Finding and Using Secondary Data  and Resources for Research

What Can I Do With the Data?

• Comparative research, restudy or follow-up study

• Re-analysis/secondary analysis

• Research design and methodological advancement

• Replication of published statistics

• Teaching and learning

Page 13: Finding and Using Secondary Data  and Resources for Research

Find and Analyze Data

Data search (basic, advanced)
• Enter search term
• Browse by topic
• Browse by series
• Browse by geography
• Browse by investigator
• Browse by data format
• Browse international data
• View all studies

Page 14: Finding and Using Secondary Data  and Resources for Research

Special Conditions

• Anyone can access the data; however, there are special conditions:
  • Need prior IRB approval from your institution
  • Need to complete special licensure
  • Complete Approved Researcher forms

Page 15: Finding and Using Secondary Data  and Resources for Research

Inter-university Consortium for Political & Social Research

• The Inter-university Consortium for Political and Social Research


Page 16: Finding and Using Secondary Data  and Resources for Research

Finding data – Other related catalogues

• Health and Retirement Study
  • Labor force participation, health transitions at end of work life, income, pension plans, health insurance, cognitive function, assets, disability, health care costs.
• National Epidemiologic Survey on Alcohol and Related Conditions
  • Collects data on background, alcohol and drug consumption, abuse and dependence, treatment utilization, family history of alcoholism or drug abuse, tobacco use and dependence, and medicine use. Also covers current and family mental health (e.g., depression, anxiety, and personality disorders), medical conditions, and victimization.
• Youth Risk Behavior Surveillance System
  • Responses include behaviors that contribute to unintentional injuries and violence, tobacco use, alcohol and other drug use, sexual behaviors, unintended pregnancy and sexually transmitted diseases (STDs), unhealthy dietary behaviors, and physical inactivity.

Page 17: Finding and Using Secondary Data  and Resources for Research

Found a database… now what?

• Is the database in a format you understand (SPSS, R, SAS)?
• Download the codebook to check whether the variables you need exist
• Run a simple analysis to find the sample size of the (1) full sample, (2) control variables, (3) independent variables, and (4) dependent variables. Is the sample size adequate?
• Has your study been done before?
• Special populations: race/ethnicity, age, LGBT
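The sample-size check described above might look like the following minimal sketch. The records, variable names, and grouping into control, independent, and dependent variables are all hypothetical; `None` stands in for a missing response, which real files may encode differently (check the codebook).

```python
# Hypothetical extract: None marks a missing response (values are made up).
rows = [
    {"id": 1, "age": 67, "gender": 1, "treatment": 1, "depression": 12},
    {"id": 2, "age": 71, "gender": 2, "treatment": 0, "depression": None},
    {"id": 3, "age": None, "gender": 2, "treatment": 1, "depression": 9},
    {"id": 4, "age": 80, "gender": 1, "treatment": None, "depression": 15},
    {"id": 5, "age": 75, "gender": 2, "treatment": 0, "depression": 7},
]

# Assumed roles for the variables in the planned analysis.
variable_sets = {
    "control": ["age", "gender"],
    "independent": ["treatment"],
    "dependent": ["depression"],
}


def complete_cases(records, variables):
    """Count records with a non-missing value on every listed variable."""
    return sum(all(r[v] is not None for v in variables) for r in records)


sizes = {"full sample": len(rows)}
for name, variables in variable_sets.items():
    sizes[name] = complete_cases(rows, variables)

# Cases usable when all variables enter the same model.
all_vars = [v for vs in variable_sets.values() for v in vs]
sizes["analytic sample"] = complete_cases(rows, all_vars)

for name, n in sizes.items():
    print(f"{name}: n = {n}")
```

Comparing the last figure with the full sample shows how much the usable sample shrinks once every variable is required at the same time.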

Page 18: Finding and Using Secondary Data  and Resources for Research

Creating a Personalized Dataset

• Organizing the project
• Create a central repository of the following information:
  • Contact information of owners of database
  • Codebooks
  • Questionnaires
  • User Guides
  • Articles of interest published by others who used the data

Page 19: Finding and Using Secondary Data  and Resources for Research

Creating a Personalized Dataset

• Create a personal variable codebook

| Variable | Baseline | 3-month | 6-month | Codes |
| Site identification | Site_ID | | | 01 = UCSF; 02 = Chinatown; 03 = Sunset Park; 04 = Rochester; 05 = U. Penn |
| Sociodemographics | | | | |
| Age | B_X1 | | | |
| Financial situation | b_a13 | | | 1=can’t make ends meet; 2=just enough to get along; 3=are comfortable; 4=DK |
| Race | | | | 1=White; 2=Latino/Hispanic; 3=African-American/Black |
| Moderator | | | | |
| Gender | S_A1 | | | 1=male; 2=female |
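A personal codebook like this one can also be kept in machine-readable form so that coded values are labeled consistently. The sketch below copies the codes from the example codebook; the `decode` helper and the sample record are hypothetical illustrations, and site codes are stored as integers (1 rather than 01) as an assumption about how the file is read.

```python
# Codes copied from the example codebook above; keys are the variable names.
codebook = {
    "Site_ID": {1: "UCSF", 2: "Chinatown", 3: "Sunset Park",
                4: "Rochester", 5: "U. Penn"},
    "b_a13": {1: "can't make ends meet", 2: "just enough to get along",
              3: "are comfortable", 4: "DK"},
    "S_A1": {1: "male", 2: "female"},
}


def decode(variable, value):
    """Return the codebook label for a value, flagging anything uncoded."""
    label = codebook.get(variable, {}).get(value)
    return label if label is not None else f"uncoded ({value})"


# Hypothetical record, used only to show the lookup.
record = {"Site_ID": 3, "b_a13": 2, "S_A1": 2}
labels = {var: decode(var, val) for var, val in record.items()}
print(labels)
```

Flagging uncoded values instead of silently passing them through makes stray codes visible during cleaning.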

Page 20: Finding and Using Secondary Data  and Resources for Research

Creating a Personalized Dataset

• Structuring the data
• Merge files
• Clean variables
  • Coding (highest # is the reference)
  • Create new variables if necessary (restructuring, combining)
    • Example: Anxiety Disorders, Time, Depression diagnosis
• Check everything TWICE
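The merge, combine, and check steps could be sketched as follows. Everything here is a hypothetical illustration: the participant key `pid`, the flags `gad` and `panic`, the `depression_dx` field, and the two small files are invented, and the left join is one reasonable way to merge waves, not necessarily the one used for the original project.

```python
# Hypothetical baseline and 3-month files sharing a participant id.
baseline = [
    {"pid": 101, "gad": 1, "panic": 0},
    {"pid": 102, "gad": 0, "panic": 0},
    {"pid": 103, "gad": 0, "panic": 1},
]
month3 = [
    {"pid": 101, "depression_dx": 0},
    {"pid": 103, "depression_dx": 1},
]


def merge_on(left, right, key):
    """Left join: keep every left record; unmatched right fields become None."""
    right_by_key = {row[key]: row for row in right}
    right_fields = sorted({f for row in right for f in row if f != key})
    merged = []
    for row in left:
        match = right_by_key.get(row[key], {})
        combined = dict(row)
        for field in right_fields:
            combined[field] = match.get(field)
        merged.append(combined)
    return merged


merged = merge_on(baseline, month3, "pid")

# New combined variable: any anxiety disorder (1 = yes, 0 = no).
for row in merged:
    row["any_anxiety"] = int(row["gad"] == 1 or row["panic"] == 1)

# "Check everything twice": no records lost, no duplicated ids.
assert len(merged) == len(baseline)
assert len({row["pid"] for row in merged}) == len(merged)
print(merged)
```

Participant 102 has no 3-month record, so `depression_dx` stays missing for that row instead of the row being dropped.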

Page 21: Finding and Using Secondary Data  and Resources for Research

Creating a Personalized Dataset

• Statistical Considerations
• Weighting the sample
  • Weights redistribute the sample to be representative of a larger, well-defined population
• Treatment of missing data
  • Listwise deletion
  • Complete case analysis
  • Multiple imputation
  • Full maximum likelihood estimation
  • Restricted maximum likelihood estimation
  • Last Observation Carried Forward
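Three of the ideas above are simple enough to sketch directly: a weighted mean, listwise deletion (which keeps only complete cases), and Last Observation Carried Forward. The function names, scores, and weights are hypothetical; real survey weights come from the dataset's documentation, and the other missing-data methods listed need dedicated statistical software.

```python
def weighted_mean(values, weights):
    """Mean in which each case counts in proportion to its survey weight."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def listwise_delete(records, variables):
    """Keep only records with no missing value on the listed variables."""
    return [r for r in records if all(r[v] is not None for v in variables)]


def locf(series):
    """Fill each missing value with the most recent observed value.

    Values missing before the first observation stay missing.
    """
    filled, last = [], None
    for value in series:
        if value is not None:
            last = value
        filled.append(last)
    return filled


# Made-up scores; the third case stands for twice as many people.
scores = [10, 20, 30]
weights = [1.0, 1.0, 2.0]
print(weighted_mean(scores, weights))

# Made-up repeated measures for one participant across six visits.
visits = [None, 5, None, None, 8, None]
print(locf(visits))

records = [{"age": 70, "score": 12}, {"age": None, "score": 9}]
print(listwise_delete(records, ["age", "score"]))
```

With these numbers the unweighted mean is 20, while the weighted mean shifts toward the more heavily weighted third case.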

Page 22: Finding and Using Secondary Data  and Resources for Research

Support and Resources

• Statistical consulting
• Statistical forums
  • http://www.bristol.ac.uk/cmm/learning/support/jisc.html
• YouTube
  • https://www.youtube.com/user/ProfAndyField
• Webinars
  • www.theanalysisfactor.com
• Private consulting