11. NLTS2 Documentation: Data Dictionaries


Prerequisites
• Recommended modules to complete before viewing this module
  1. Introduction to the NLTS2 Training Modules
  2. NLTS2 Study Overview
  3. NLTS2 Study Design and Sampling
  NLTS2 Data Sources, either
    • 4. Parent and Youth Surveys or
    • 5. School Surveys, Student Assessments, and Transcripts
  10. NLTS2 Documentation Overview


Overview
• Purpose
• Data dictionary contents
• File specifications
  • Variable prefix
  • Missing values
• Variable documentation
• Variable documentation details
• Parent/youth Part 2 documentation distinctions
• Transcript data documentation distinctions
• Supplemental documentation
• Closing
• Important information


Purpose

• The data dictionary section of the documentation is the most detailed for individual data items.

• The data dictionary includes specific information about each item, such as
  • Which respondents are included in the data element if there is skip logic applied.
  • Documentation of any modification made to the data element, such as a logical assignment to change a value.
  • Variable names of corresponding items in other waves.

• Users should refer to the data dictionary before specifying any analysis.


Purpose
• Why use the data dictionary rather than the data collection instruments?
  • Data collection instruments are extremely useful.
    • Can be a quick reference for finding an item
    • Show the item in the context of other items
    • Contain the exact wording of questions that respondents were asked
  • However, only the data dictionaries describe
    • Complex skip logic, especially from CATI instruments
    • Data issues, such as an addition of response categories from one wave to the next
    • Any programmatic modifications, assignments, or recoding of the data, such as setting a value to yes if a prior response is yes


Data dictionary contents

• There is a data dictionary for every data collection source within each wave.

• Every dictionary begins with a linked contents.
• Links go to
  • File specifications.
  • Variable descriptions by section or topic area.


Data dictionary contents example


File specifications

• The first section of the data dictionary is “File Specifications,” which lists
  • The associated file name
  • The data collection source
  • The prefix for variable names in the file
  • Linking variable (always “ID”)
  • Missing values


File specifications: Example


File specifications

• Variable prefix
  • The prefix for variable names in the file applies to most but not all variables.
    • With a few exceptions, variables found in this file begin with the variable prefix.
      – There are specialized variables that have another prefix structure, such as wave-specific demographic variables.
      – Example: W2_Age2003 is the age of youth during the Wave 2 Parent/Youth data collection and W2_Age2004 is the age of youth for the Wave 2 school data collection; the “W2” prefix indicates Wave 2.


File specifications

• Missing values
  • Can be found in this file.
• Note about missing values
  • User-defined missing values specify why a variable is missing.
  • Missing values are excluded from calculations in procedures unless the user specifies options to include them.
  • Data were developed in SAS and converted to SPSS.
    • There are differences in how missing values are defined and stored in SAS and SPSS.


File specifications
• Missing values in SAS
  • System default missing in SAS is a “.”
  • User-defined missing values in SAS can have a value from “.a” to “.z”
  • Missing values in a numeric variable have a numeric value in a SAS logical statement.
    • For example, the logical statement “If npr1B4 < 1” would include all cases for which the value is “0”, a negative number, or a missing value.
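The comparison behavior described above can be imitated outside SAS. The following Python sketch is an illustration only and is not SAS code. It assumes, purely for this example, that missing values are represented as strings such as "." or ".a" and that valid values are ordinary numbers. Missing values are given a sort key below every number, which is why a condition like `npr1B4 < 1` also selects the missing cases.

```python
def sas_sort_key(value):
    """Return a key that places missing values below every number.

    Missing values are modeled as strings ("." or ".a" to ".z"); valid
    values are ints or floats. This representation is an assumption made
    for illustration only.
    """
    if isinstance(value, str):       # a missing value
        return (0, value.lower())    # group 0 sorts before all numbers
    return (1, value)                # group 1 holds the valid numbers


def sas_less_than(value, threshold):
    """Imitate the SAS comparison `value < threshold` for a numeric threshold."""
    return sas_sort_key(value) < sas_sort_key(threshold)


# Hypothetical values of npr1B4 for six cases
cases = [0, 1, 2, -3, ".", ".a"]
selected = [v for v in cases if sas_less_than(v, 1)]
print(selected)  # [0, -3, '.', '.a']
```

In this sketch, limiting the selection to valid responses requires an explicit check that the value is not missing in addition to the comparison.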


File specifications
• Missing values in SPSS
  • SPSS system missing is a “.” in the data and appears as “System” in SPSS analysis output under “Missing.”
  • SPSS allows for three distinct user-defined missing values, fewer than SAS.
  • With the range option, users can define a range of missing values to work around the limitation of three distinct missing values.
  • Missing values are represented as negative numbers in the NLTS2 SPSS data.
    • -980 through -999 are in the missing values range.
  • Missing values in SPSS do not have a numeric value in a logical statement, unlike in SAS.
    • For example, “IF (npr1B4 < 1) B4New = 0.” would result in a missing value in B4New if npr1B4 is missing.
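The SPSS behavior above can be sketched in Python for comparison with SAS. This is an illustration under stated assumptions, not SPSS syntax: `None` stands in for system-missing, the range -999 through -980 is treated as user-defined missing as noted above, and B4New is assumed to be a new variable that starts out missing. The function names are hypothetical.

```python
MISSING_LOW, MISSING_HIGH = -999, -980  # missing-value range cited above


def is_missing(value):
    """True for system-missing (None here) or a value in the missing range."""
    return value is None or MISSING_LOW <= value <= MISSING_HIGH


def spss_style_recode(npr1b4):
    """Imitate `IF (npr1B4 < 1) B4New = 0.` for a single case.

    Returns the resulting B4New, with None standing in for system-missing.
    """
    if is_missing(npr1b4):
        return None      # the condition is missing, so nothing is assigned
    if npr1b4 < 1:
        return 0         # the condition is true, so B4New is set to 0
    return None          # the condition is false, so B4New stays missing


print([spss_style_recode(v) for v in [0, 1, -985, None]])
# [0, None, None, None]
```

Unlike the SAS case, a missing npr1B4 (whether -985 or system-missing) never satisfies the condition, so B4New is not set to 0.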


Variable documentation
• After “File specifications,” the dictionary lists all variables in tabular format.
• The variables in the data dictionary are organized by section, matching the sections in data collection instruments (source data).
• Within each section, there are two sets of variables.
  • Variables that come directly from the data collection instruments.
  • Variables created from source data within that section.
• Variable descriptions include
  • Name, variable type, variable values, source(s), and information about skip logic, assignments made, and corresponding variable names in other waves.


Variable documentation

• Variables that come directly from the data collection instruments (source data)
  • Variable names usually have the uniform variable prefix.
  • Source data are drawn from the section, question number, and subitems in the source instrument.
  • It can be relatively straightforward to find an item in an instrument and locate it in the dictionary.
  • Example: variable name np4E2c
    • The “np4” prefix is NLTS2 Parent/Youth Survey Wave 4.
    • The “E2c” is Section E of the Parent/Youth Instrument, Question 2, subitem C.
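The naming pattern in the np4E2c example can be sketched as a small parser. The regular expression below is a hypothetical pattern written for this illustration, not an official NLTS2 rule. It handles only the simple shape of prefix, section letter, question number, and optional subitem, and it deliberately returns `None` for names that follow another structure, such as W2_Age2003 or created variables.

```python
import re

# Hypothetical pattern for simple source-variable names such as "np4E2c".
NAME_PATTERN = re.compile(
    r"^(?P<prefix>[a-z]+\d)"   # e.g. "np4": instrument and wave
    r"(?P<section>[A-Z])"      # e.g. "E": instrument section
    r"(?P<question>\d+)"       # e.g. "2": question number
    r"(?P<subitem>[a-z]?)$"    # e.g. "c": optional subitem
)


def parse_source_name(name):
    """Split a simple source-variable name into its parts, or return None."""
    match = NAME_PATTERN.match(name)
    return match.groupdict() if match else None


print(parse_source_name("np4E2c"))
# {'prefix': 'np4', 'section': 'E', 'question': '2', 'subitem': 'c'}
print(parse_source_name("W2_Age2003"))
# None
```

A parser like this is only a convenience for locating items; the data dictionary remains the reference for what each variable contains.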


Variable documentation
• Variables created from source data
  • Variables that are created using data from the associated section are listed at the end of the section.
  • Created variables typically have names that describe the variable rather than relate to a data collection source, but with the same prefix as the source variables.
    • Variable np3_JobCompNow is
      – [np3] Parent/Youth interview Wave 3
      – [JobCompNow] currently competitively employed
  • Collapsed variables, i.e., variables combined from two or more items, sometimes list all contributing variables in the name.
    • Variable np4U8a_J15a is
      – [np4] Parent/Youth Wave 4
      – [U8a] question U8a
      – [_] combined with
      – [J15a] question J15a


Variable documentation

• In addition to variables related to particular items from data collection instruments, there are some other key variables.
  • Demographic variables that are used for many NLTS2 analyses and published Web tables
  • Weights, including replicate weights
  • Linking variable “ID”
  • Preload, CATI, and/or sample variables
• The following slide provides a quick glance at the data dictionary with details in following slides.


Variable documentation: Quick look


Variable documentation: Formatting key
• Bold text in the dictionary indicates a modification to questionnaire categories as a result of coding and categorizing verbatim responses.
• Grey text indicates that there are no data for this item in this wave.
  • For example, Question R1b was asked in Waves 2 to 4 but not in Wave 5; in Wave 5, R1b is shaded.


Variable documentation details
• Variable name
  • Name of the variable as it appears in the data file.
  • In this example, there is a series of variables for item np4F11b, np4F11b_a through np4F11b_h.
    • Each variable in the series is listed separately.

Figure 1-A.

Note: See Figure 1, section C in Module 11 Supporting Materials.


Variable documentation details
• Source
  • Item from data collection source.
  • If there are multiple instrument sources, items from each data source are listed.
  • This example comes from question F11b, subitems a-g.

Figure 1-B.

Note: See Figure 1, section C in Module 11 Supporting Materials.


Variable documentation details
• Variable description
  • Describes the variable.
    • Often the text of the question from the source instrument
  • Variable description corresponds with the variable label in the file contents.

Figure 1-C.

Note: See Figure 1, section C in Module 11 Supporting Materials.


Variable documentation details
• Variable description (cont’d)
  • In this example, the item is described as types of life skills training; the subitems are the individual types of life skills training listed in this question.
  • Subitems “a-g” come from the source and “h” is created.

Figure 1-D.

Note: See Figure 1, section C in Module 11 Supporting Materials.


Variable documentation details
• Variable type and values
  • Shows how the variable is coded and what the codes mean.
  • Variable type is numeric, date, or character.
  • The variable values match the variable’s associated format referred to in the SAS contents.
  • This example is a numeric variable with yes/no values.

Figure 1-E.

Note: See Figure 1, section C in Module 11 Supporting Materials.


Variable documentation details

• Notes: Assignments, modifications, or validations
  • Describe any changes made to a variable.
  • List logic for making an assignment or modification to an existing variable.
  • Specify the logic for how new variables were created.
  • An assignment might increase or decrease the base.
  • In this example, assignments were made to subitems np4F14_[a-g] to set values to “no” if np4F11a is “no.”
  • A new subitem np4F11b_h is created using values from np4F11a and np4F14a_f.

Figure 1-F.

Note: See Figure 1, section C in Module 11 Supporting Materials.
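The kind of logical assignment described in the notes column can be sketched as follows. This is an illustration only, not the code used to build the NLTS2 files. It assumes yes/no items are coded 1 and 0, that a skipped item is represented by `None`, and that each respondent is a plain dictionary; the variable names are taken from the example above.

```python
NO, YES = 0, 1  # assumed coding for yes/no items

# Subitems np4F14_a through np4F14_g, as named in the example
SUBITEMS = [f"np4F14_{letter}" for letter in "abcdefg"]


def apply_assignment(record):
    """Return a copy of the record with the logical assignment applied.

    If np4F11a is "no", the respondent was skipped past the subitems,
    so every skipped (None or absent) subitem is set to "no".
    """
    updated = dict(record)
    if updated.get("np4F11a") == NO:
        for name in SUBITEMS:
            if updated.get(name) is None:
                updated[name] = NO
    return updated


skipped = {"np4F11a": NO}  # hypothetical respondent who was not asked the subitems
print(apply_assignment(skipped)["np4F14_a"])  # 0
```

Because respondents who were never asked the subitems now carry a valid "no," an assignment of this kind increases the base for those variables.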


Variable documentation details
• Base: Which respondents were asked
  • Logic is expressed as who is included, not who is skipped.
  • Explains varying n’s due to skip logic.
  • If “All respondents” is noted, it means no one is skipped.
  • In this example, the respondents asked this item were limited to those who had not been in secondary school in the past year and had specified this service since leaving high school.
    • However, in the notes column in the previous slide there was an assignment made.
    • Although they were not asked this question, those who answered “no” to np4F11a were assigned a “no” to np4F14_[a-g].

Figure 1-G.

Note: See Figure 1, section C in Module 11 Supporting Materials.


Variable documentation details
• Variable name by wave
  • Along with the variable name for the current wave, corresponding variable names are listed for all other waves.
  • There may be minor differences in the variables between waves, or an item may not have been asked in another wave.
  • In this example, there is no corresponding set of variables for this item in Wave 1, and the item is slightly different in Wave 5.

Figure 1-H.

Note: See Figure 1, section C in Module 11 Supporting Materials.


Variable documentation details
• Some of the columns noted above contain information not found elsewhere.
  • “Base” and “Notes” columns are key for understanding the nature of a variable.
    • Provide documentation about who is included in an item and any changes made to the data.
    • Particularly important when using CATI data with complex skip logic.
  • “Variable name by wave” is a resource for finding longitudinal items.
    • Provides wave-by-wave variable names.
    • Indicates if an item was not collected in a given wave and notes if the item differs in other waves.


Parent/youth Part 2 documentation distinctions
• The Waves 2 to 5 Parent/Youth Surveys have a Part 2 that is completed by either the youth or the parent/guardian.
  • Documentation for Part 2 in these waves includes all sources and variable names.
  • For each item, variables are listed in the following order: the youth item, the parent/guardian item, and a collapsed youth/parent item.
  • For collapsed items in cases where there is a value for both items, priority is given to the youth value.
  • Usually there is either a parent/guardian value or a youth value.
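The priority rule for collapsed items can be expressed in a few lines. This Python sketch is an assumption-laden illustration, not the NLTS2 coding itself: `None` stands in for a missing response and the function name is hypothetical. The dictionary notes for each collapsed variable remain the authority on how a given item was actually combined.

```python
def collapse(youth_value, parent_value):
    """Combine a youth item and a parent/guardian item into one value.

    None stands in for a missing response (an assumption of this sketch).
    When both values are present, the youth value takes priority.
    """
    if youth_value is not None:
        return youth_value
    return parent_value


print(collapse(1, 0))        # 1: both present, youth value wins
print(collapse(None, 0))     # 0: only the parent/guardian value is present
print(collapse(None, None))  # None: neither respondent answered
```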


Parent/youth Part 2 documentation: Quick look


Parent/youth Part 2 documentation
• The item is “Youth has done volunteer or community service in the past 12 months”.
• np5P8 is the youth item, np5J4 is the parent/guardian item, and np5P8_J4 is the combined youth/parent/guardian item.
• Data come from interviews (youth item P8 and parent item J4) and mail questionnaires (youth A7a and parent Q20b).

Figure 2-A.

Note: See Figure 2, section C in Module 11 Supporting Materials.


Parent/youth Part 2 documentation
• This example is a numeric variable that has a yes or no value.
• Notes: As we have seen in the previous slide, data come from multiple sources.
  • Youth interview and youth mail questionnaire, parent/guardian interview, abbreviated interview, and mail questionnaires.
• Coding of combined item is described.

Figure 2-B.

Note: See Figure 2, section C in Module 11 Supporting Materials.


Parent/youth Part 2 documentation
• All youth respondents and all Parent Part 2 respondents were asked this question.
• There was no youth interview in Wave 1, but otherwise there are corresponding variable names for each wave for youth, parent/guardian, and combined.

Figure 2-C.

Note: See Figure 2, section C in Module 11 Supporting Materials.


Transcript data documentation distinctions

• Transcript data are in multiple files.
  • Each file is documented in a separate section in the transcript data dictionary.
• Files are either from source data or are summarized data from course-level transcript data.
• Files can have a single record or multiple records per student depending on the type of transcript data.


Transcript data documentation
• Source data files
  • Overall: One record per student with any transcript data.
  • By year: Multiple records per student with one record for every school year recorded in transcripts.
  • Course level: Multiple records per student with one record for every course within a grading period.
• Summary data files
  • Overall summary: One record per student with complete transcript data summarizing course taking across all grades attended.
  • By grade summary: Multiple records per student; one record for every grade attended summarizing course taking within a grade.
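The relationship between course-level records and the two summary layouts can be sketched with a simple aggregation. Everything in this example is hypothetical: the records, the field names `grade` and `credits`, and the choice to sum credits are assumptions made for illustration, and only the linking variable `ID` comes from the documentation. The sketch also does not model the requirement that overall summaries cover students with complete transcript data.

```python
from collections import defaultdict

# Hypothetical course-level records: multiple records per student.
courses = [
    {"ID": 101, "grade": 9, "credits": 1.0},
    {"ID": 101, "grade": 9, "credits": 0.5},
    {"ID": 101, "grade": 10, "credits": 1.0},
    {"ID": 202, "grade": 9, "credits": 1.0},
]

by_grade = defaultdict(float)  # (ID, grade) -> one record per grade attended
overall = defaultdict(float)   # ID -> one record per student

for row in courses:
    by_grade[(row["ID"], row["grade"])] += row["credits"]
    overall[row["ID"]] += row["credits"]

print(dict(by_grade))  # {(101, 9): 1.5, (101, 10): 1.0, (202, 9): 1.0}
print(dict(overall))   # {101: 2.5, 202: 1.0}
```

The point of the sketch is the record structure: the by-grade result has several rows for student 101, while the overall result has exactly one row per student, which matters when linking these files to other NLTS2 data by ID.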


Transcript data documentation: Quick look


Supplemental documentation
• Transcript dictionary
  • List of course codes and course categories
  • Key to composite variable names in summarized data
• Parent/youth survey dictionaries
  • Types of medications
  • Job codes
• Assessment dictionaries
  • Direct and alternate assessment references
• Cross-instrument data dictionary
  • Decision rules for cross-instrument data


Documentation summary

• The data documentation contains a wealth of information organized in a variety of ways.

• It is good practice to refer to the data dictionary before proceeding with analysis.
  • Finding a question in a data collection instrument does not provide enough information about that item.
  • The data dictionary describes each item, including information about skip logic and modifications made to variable values.


Closing
• Topics discussed in this module
  • Purpose
  • Data dictionary contents
  • File specifications
    • Variable prefix
    • Missing values
  • Variable documentation
  • Variable documentation details
  • Parent/youth Part 2 documentation distinctions
  • Transcript data documentation distinctions
  • Supplemental documentation
• Next module: 12. NLTS2 Documentation: Quick References


Important information
• NLTS2 website contains reports, data tables, and other project-related information: http://nlts2.org/
• Information about obtaining the NLTS2 database and documentation can be found on the NCES website: http://nces.ed.gov/statprog/rudman/
• General information about restricted data licenses can be found on the NCES website: http://nces.ed.gov/statprog/instruct.asp

E-mail address: [email protected]