Instrumentation (cont.) February 28 Note: Measurement Plan Due Next Week

Unobtrusive Measures

• Many instruments require the cooperation of the respondent in one way or another.

• Obtaining that cooperation can intrude on an ongoing activity, which may provoke a negative reaction in the respondent.

• To avoid this, researchers use unobtrusive measures: data collection procedures that involve no intrusion into the naturally occurring course of events.

• In most cases, no instrument is used; however, good record keeping is necessary.

• They are valuable as supplements to the use of interviews and questionnaires, often providing a useful way to corroborate what more traditional data sources reveal.

Types of Scores

• Quantitative data are reported in the form of scores.

• Scores are reported as either raw or derived scores.

– A raw score is the initial score obtained.

• Taken by itself, a raw score is difficult to interpret, since it has little meaning.

– Derived scores are scores that have been obtained from raw scores and standardized.

• They enable researchers to say how well the individual performed compared to others taking the same test.

• Examples include:

– Age and Grade-level Equivalents

– Percentile Ranks

– Standard scores are mathematically derived scores having comparable meaning on different instruments
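As a rough illustration of how a raw score becomes a derived score, the sketch below computes a percentile rank and a z-score (one common standard score) for a single raw score. The norm-group scores and the function names are hypothetical, and the percentile rank here uses the "percent of scores below" convention; some texts count scores at or below instead.

```python
from statistics import mean, pstdev

def percentile_rank(norm_scores, raw):
    """Percent of norm-group scores that fall below `raw` (one common convention)."""
    below = sum(1 for s in norm_scores if s < raw)
    return 100.0 * below / len(norm_scores)

def z_score(norm_scores, raw):
    """Standard score: distance of `raw` from the norm-group mean, in SD units.

    The norm group is treated as the whole reference population,
    so the population SD (pstdev) is used. This is an assumption.
    """
    return (raw - mean(norm_scores)) / pstdev(norm_scores)

# Hypothetical norm group, invented for illustration.
norm_group = [40, 45, 50, 55, 60, 65, 70, 75, 80, 85]
raw = 70

pr = percentile_rank(norm_group, raw)
z = z_score(norm_group, raw)
print(f"raw={raw}  percentile rank={pr:.0f}  z={z:.2f}")
```

The raw score of 70 means little alone; the percentile rank and z-score place it relative to the group.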

Four Types of Measurement Scales

Norm-Referenced vs. Criterion-Referenced Instruments

• All derived scores give meaning to individual scores by comparing them to the scores of a group.

• The group used to determine derived scores is called the norm group and the instruments that provide such scores are referred to as norm-referenced instruments.

• An alternative to norm-referenced achievement or performance instruments is a criterion-referenced test.

• This is based on a specific goal or target (criterion) for each learner to achieve.

• The difference between the two is that criterion-referenced tests focus more directly on instruction.

Descriptive Statistics

Statistics vs. Parameters

• A parameter is a characteristic of a population.

– It is a numerical or graphic way to summarize data obtained from the population.

• A statistic is a characteristic of a sample.

– It is a numerical or graphic way to summarize data obtained from a sample.

Types of Numerical Data

• There are two fundamental types of numerical data:

1) Categorical data: obtained by determining the frequency of occurrences in each of several categories

2) Quantitative data: obtained by determining placement on a scale that indicates amount or degree

Techniques for Summarizing and Presenting Quantitative Data

• Visual

– Frequency Distributions

– Histograms

– Stem and Leaf Plots

– Distribution curves

• Numerical

– Central Tendency

– Variability
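A frequency distribution and a simple text histogram can be sketched in a few lines. The scores below are hypothetical, invented only to show the idea.

```python
from collections import Counter

# Hypothetical quiz scores, invented for illustration.
scores = [3, 5, 5, 6, 7, 7, 7, 8, 9, 9]

# Frequency distribution: how many times each score occurs.
freq = Counter(scores)

# Text histogram: one asterisk per occurrence of each score.
for value in sorted(freq):
    print(f"{value:>2} | {'*' * freq[value]}")
```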

Summary Measures

• Central Tendency

– Arithmetic Mean

– Median

– Mode

• Variation

– Variance

– Standard Deviation

– Range

Measures of Central Tendency

• Average (Mean)

• Median

• Mode

Sample mean:

$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$$

Population mean:

$$\mu = \frac{\sum_{i=1}^{N} X_i}{N}$$

Mean

• The most common measure of central tendency

• Affected by extreme values (outliers)

[Figure: two dot plots. First, axis 0 to 10: Mean = 5. Second, axis 0 to 14: Mean = 6.]

Median

• Robust measure of central tendency

• Not affected by extreme values

• In an ordered array, the median is the “middle” number

– If n or N is odd, the median is the middle number

– If n or N is even, the median is the average of the two middle numbers

[Figure: two dot plots. First, axis 0 to 10: Median = 5. Second, axis 0 to 14: Median = 5.]
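The contrast between the mean and the median can be checked numerically. The two data sets below are hypothetical; they were chosen only so that they reproduce the values quoted for the figures (mean 5 and 6, median 5 in both cases) and are not the data behind the original plots.

```python
from statistics import mean, median

# Hypothetical data, chosen to match the quoted summary values.
without_extreme = [1, 3, 5, 7, 9]
with_extreme = [1, 3, 5, 7, 14]   # the largest score is replaced by an extreme value

# The mean shifts when one extreme value is introduced ...
print("mean:  ", mean(without_extreme), "->", mean(with_extreme))

# ... while the median, the middle of the ordered array, stays put.
print("median:", median(without_extreme), "->", median(with_extreme))
```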

Mode

• A measure of central tendency

• Value that occurs most often

• Not affected by extreme values

• Used for either numerical or categorical data

• There may be no mode

• There may be several modes

[Figure: two dot plots. First, axis 0 to 14: Mode = 9. Second, axis 0 to 6: No Mode.]
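A minimal sketch of finding the mode, including the "no mode" and "several modes" cases. The helper name `modes` and all data are hypothetical. Treating a data set in which every distinct value occurs equally often as having no mode follows the convention used on the slide; other conventions exist.

```python
from statistics import multimode

def modes(values):
    """Return the most frequent value(s); [] when every distinct value ties (no mode)."""
    found = multimode(values)
    distinct = set(values)
    if len(distinct) > 1 and len(found) == len(distinct):
        return []          # all values occur equally often
    return found

print(modes([1, 3, 5, 7, 9, 9, 11]))          # one mode
print(modes([0, 1, 2, 3, 4, 5, 6]))           # no mode
print(modes([1, 1, 2, 2, 3]))                 # several modes
print(modes(["red", "blue", "red", "green"])) # categorical data
```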

Variability

• Variability refers to the extent to which the scores on a quantitative variable in a distribution are spread out.

• The range represents the difference between the highest and lowest scores in a distribution.

• A five-number summary reports the lowest score, the first quartile, the median, the third quartile, and the highest score.

– Five-number summaries are often portrayed graphically by the use of box plots.
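The range and the five-number summary can be sketched as follows. The scores and the function name are hypothetical. Quartile conventions differ between textbooks and software, so the first and third quartiles computed here (Python's default "exclusive" method) may not match a hand calculation from a particular text.

```python
from statistics import median, quantiles

def five_number_summary(scores):
    """Return (lowest, Q1, median, Q3, highest) for a list of scores."""
    q1, _, q3 = quantiles(scores, n=4)  # default 'exclusive' method; an assumption
    return min(scores), q1, median(scores), q3, max(scores)

# Hypothetical scores, invented for illustration.
scores = [2, 4, 4, 5, 6, 7, 8, 9, 10, 12]

low, q1, med, q3, high = five_number_summary(scores)
score_range = high - low  # range = highest score minus lowest score

print("five-number summary:", (low, q1, med, q3, high))
print("range:", score_range)
```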

Variance

• The variance, $s^2$, represents the amount of variability of the data relative to their mean.

• As shown below, the variance is the “average” of the squared deviations of the observations about their mean.

$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$$

Standard Deviation

• Considered the most useful index of variability.

• It is a single number that represents the spread of a distribution.

• If a distribution is normal, then the mean plus or minus 3 SD will encompass about 99.7% of all scores in the distribution.

Calculation of the Variance and Standard Deviation of a Distribution (Definitional formula)

Raw Score   Mean   X - X̄   (X - X̄)²
   85        54      31       961
   80        54      26       676
   70        54      16       256
   60        54       6        36
   55        54       1         1
   50        54      -4        16
   45        54      -9        81
   40        54     -14       196
   30        54     -24       576
   25        54     -29       841

$$\text{Variance } (SD^2) = \frac{\sum (X - \bar{X})^2}{N - 1} = \frac{3640}{9} = 404.44$$

$$\text{Standard deviation } (SD) = \sqrt{\frac{\sum (X - \bar{X})^2}{N - 1}}$$
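The definitional calculation can be reproduced step by step in code. The sketch below uses the ten raw scores from the table; the variable names are illustrative, and the result is cross-checked against Python's built-in sample variance and standard deviation, which also divide by N - 1.

```python
from statistics import mean, stdev, variance

# The ten raw scores from the table.
scores = [85, 80, 70, 60, 55, 50, 45, 40, 30, 25]

m = mean(scores)                                # mean of the raw scores
deviations = [x - m for x in scores]            # X - mean
squared_devs = [d ** 2 for d in deviations]     # (X - mean)^2

sum_sq = sum(squared_devs)                      # numerator of the variance
var_by_hand = sum_sq / (len(scores) - 1)        # divide by N - 1
sd_by_hand = var_by_hand ** 0.5                 # SD is the square root of the variance

print(f"mean={m}  sum of squares={sum_sq}")
print(f"variance={var_by_hand:.2f}  SD={sd_by_hand:.2f}")

# Cross-check against the library's sample variance and SD.
print(f"library: variance={variance(scores):.2f}  SD={stdev(scores):.2f}")
```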

Comparing Standard Deviations

[Figure: three dot plots on a common axis from 11 to 21.]

• Data A: Mean = 15.5, S = 3.338

• Data B: Mean = 15.5, S = .9258

• Data C: Mean = 15.5, S = 4.57

Facts about the Normal Distribution

• 50% of all the observations fall on each side of the mean.

• 68% of scores fall within 1 SD of the mean in a normal distribution.

• 27% of the observations fall between 1 and 2 SD from the mean.

• 99.7% of all scores fall within 3 SD of the mean.

• This is often referred to as the 68-95-99.7 rule.
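These percentages can be checked against the standard normal distribution. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `share_within` is hypothetical.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, SD 1

def share_within(k):
    """Proportion of a normal distribution lying within k SD of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {100 * share_within(k):.1f}%")

# Between 1 and 2 SD from the mean (both sides combined).
between_1_and_2 = share_within(2) - share_within(1)
print(f"between 1 and 2 SD: {100 * between_1_and_2:.1f}%")

# Each side of the mean holds half of the observations.
print(f"below the mean: {100 * std_normal.cdf(0):.0f}%")
```

The 27% between 1 and 2 SD is what takes the total from 68% to 95%.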

The Normal Curve

Different Distributions Compared

Fifty Percent of All Scores in a Normal Curve Fall on Each Side of the Mean

Probabilities Under the Normal Curve

Correlation

Correlation Coefficients

• Pearson product-moment correlation

– Describes the degree of relationship between two variables.

• Positive: As one variable increases (or decreases), so does the other.

• Negative: As one variable increases, the other decreases.

– Magnitude or strength of relationship

• Ranges from -1.00 to +1.00

– Correlation does not imply causation
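A minimal sketch of the Pearson product-moment coefficient, written out from its usual definition (the sum of cross-products of deviations divided by the square root of the product of the two sums of squares). The function name and all data are hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5]

# Perfect positive and perfect negative relationships.
print(pearson_r(xs, [2, 4, 6, 8, 10]))
print(pearson_r(xs, [10, 8, 6, 4, 2]))

# A steep line and a shallow line are both perfect correlations:
# r reflects how tightly the points follow a line, not its slope.
steep = pearson_r(xs, [10 * x for x in xs])
shallow = pearson_r(xs, [0.1 * x for x in xs])
print(steep, shallow)

# Hypothetical hours studied vs. test score: positive but not perfect.
r = pearson_r(xs, [52, 60, 57, 70, 71])
print(round(r, 2))
```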

Positive Correlation

Negative Correlation

No Correlation

Correlations

• The thickness of the scatter plot (how tightly the points cluster), not the slope of the line, determines the strength of the correlation.

– For example, see: http://noppa5.pc.helsinki.fi/koe/corr/cor7.html

• Remember: correlation does not imply causation.


Validity and Reliability

Chapter 8

Validity and Reliability

• Validity is an important consideration in the choice of an instrument to be used in a research investigation.

– It should measure what it is supposed to measure.

– Researchers want instruments that will allow them to make warranted conclusions about the characteristics of the subjects they study.

• Reliability is another important consideration, since researchers want consistent results from instrumentation.

– Consistency gives researchers confidence that the results actually represent the achievement of the individuals involved.

Reliability

• Test-retest reliability

• Inter-rater reliability

• Parallel forms reliability

• Internal consistency (commonly estimated with Cronbach’s alpha)
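The slides only name Cronbach's alpha, so the following is a sketch under stated assumptions rather than the course's own procedure. It uses the standard formula, alpha = k/(k - 1) × (1 - sum of item variances / variance of total scores), where k is the number of items. The function name and the response data are hypothetical.

```python
from statistics import variance

def cronbach_alpha(item_scores):
    """Cronbach's alpha for internal consistency.

    item_scores: one list per item, each holding that item's score
    for every respondent (same respondent order in every list).
    """
    k = len(item_scores)
    item_var_sum = sum(variance(item) for item in item_scores)
    # Total score per respondent, summed across items.
    totals = [sum(responses) for responses in zip(*item_scores)]
    return k / (k - 1) * (1 - item_var_sum / variance(totals))

# Hypothetical responses: 3 items answered by 5 respondents.
items = [
    [3, 4, 5, 2, 4],
    [2, 4, 5, 3, 4],
    [3, 5, 4, 2, 5],
]
alpha = cronbach_alpha(items)
print(round(alpha, 2))

# Sanity check: items that agree perfectly give an alpha of 1.
identical = [[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]]
print(cronbach_alpha(identical))
```

Test-retest, inter-rater, and parallel-forms reliability are typically reported as correlations between two sets of scores instead.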

Validity

• Face

– Does it appear to measure what it purports to measure?

• Content

– Do the items cover the domain?

• Construct

– Does it measure the unobservable attribute that it purports to measure?

Validity

• Criterion

– Predictive

– Concurrent

• Consequential

Types of validity

[Diagrams: each pairs a region labeled "The construct" with a region labeled "The instrument".]

• Here the instrument samples some of the construct, and only the construct.

• Here the instrument samples all of the construct, and more.

• Here the instrument fails to sample ANY of the construct.

• Here the instrument samples some, but not all, of the construct.

• Perfection! The construct and the instrument coincide.

Reliability and Validity
