Reliability

TOOLS OF ASSESSMENT

Celine Espada

3. RELIABILITY: IS THE DATA THAT IS COLLECTED RELIABLE ACROSS APPLICATIONS WITHIN THE CLASSROOM, SCHOOL, AND DISTRICT?

Like validity, the term reliability has been used for many years to describe an essential characteristic of sound assessment.

WHAT IS A RELIABLE SCORE?

RELIABILITY

Concerned with the consistency, stability, and dependability of the scores.

A RELIABLE ASSESSMENT IS FREE FROM BIAS AND DISTORTION. TEACHERS MIGHT ASK THEMSELVES:

• Do I have enough information about the learning of this particular student to make a definitive statement?

• Was the information collected in a way that gives all students an equal chance to show their learning?

• Would another teacher arrive at the same conclusion?

• Would I make the same decision if I considered this information at another time or in another way?

ESTIMATES OF RELIABILITY

1. MEASURE OF STABILITY (TEST-RETEST)

Test-retest reliability is usually measured by computing the correlation coefficient between the scores from two administrations of the same test to the same group.
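As a minimal sketch of this computation, the snippet below correlates scores from two administrations using the Pearson correlation coefficient. The choice of Pearson's r, the function name `pearson_r`, and the student scores are all assumptions made for illustration; none of them come from the slides.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    return covariation / sqrt(spread_x * spread_y)


# Hypothetical scores for the same five students on two administrations.
first_administration = [78, 85, 62, 90, 71]
second_administration = [80, 83, 65, 92, 70]

test_retest_r = pearson_r(first_administration, second_administration)
print(f"Test-retest reliability estimate: {test_retest_r:.2f}")
```

A coefficient close to 1 suggests that students kept roughly the same relative standing across the two administrations.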

2. MEASURE OF EQUIVALENCE

The equivalent-forms estimate of reliability is obtained by giving two forms of a test to the same group of individuals on the same day and correlating the results.

Advantages

• Eliminates the problem of memory effect.

• Reactivity effects (i.e., the experience of taking the test) are also partially controlled.

• Can sample a wider portion of the entire domain than the test-retest method.

Possible Disadvantages

• It is unclear whether the two forms of the test are actually measuring the same thing.

• More expensive.

• Requires additional work to develop two measurement tools.

3. MEASURE OF INTERNAL CONSISTENCY

• Estimates the reliability of a test based solely on the number of items on the test and the intercorrelation among the items. In effect, it compares each item to every other item.

• If a scale is measuring a construct, then overall the items on that scale should be highly correlated with one another.

• There are two common ways of measuring internal consistency:

1. Cronbach's Alpha:

.80 to .95 (Excellent)

.70 to .80 (Very Good)

.60 to .70 (Satisfactory)

<.60 (Suspect)

2. Item-Total Correlations: the correlation of the item with the remainder of the items (.30 is the minimum acceptable item-total correlation).
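The two measures above can be sketched in a few lines. The slides do not give formulas, so this example assumes the standard Cronbach's alpha formula, alpha = k/(k-1) × (1 − sum of item variances / variance of total scores), and the "corrected" item-total correlation in which each item is correlated with the sum of the other items. The response data, the function names, and the handling of band boundaries and of values above .95 are hypothetical choices made for illustration.

```python
from math import sqrt
from statistics import variance

MIN_ITEM_TOTAL = 0.30  # minimum acceptable item-total correlation (from the slide)


def cronbach_alpha(responses):
    """responses: one row per respondent, each row a list of item scores."""
    k = len(responses[0])
    item_columns = list(zip(*responses))
    sum_item_variances = sum(variance(col) for col in item_columns)
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum_item_variances / total_variance)


def interpret_alpha(alpha):
    """Map alpha onto the slide's bands; boundary values go to the higher band."""
    if alpha > 0.95:
        return "Above listed bands"  # the slide gives no label beyond .95
    if alpha >= 0.80:
        return "Excellent"
    if alpha >= 0.70:
        return "Very Good"
    if alpha >= 0.60:
        return "Satisfactory"
    return "Suspect"


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    return covariation / sqrt(spread_x * spread_y)


def corrected_item_total(responses):
    """Correlate each item with the sum of the remaining items."""
    results = []
    for i in range(len(responses[0])):
        item = [row[i] for row in responses]
        rest = [sum(row) - row[i] for row in responses]
        results.append(pearson_r(item, rest))
    return results


# Hypothetical data: five respondents answering four items on a 1-5 scale.
responses = [
    [4, 5, 3, 5],
    [2, 3, 3, 2],
    [5, 4, 4, 4],
    [1, 2, 2, 3],
    [3, 3, 4, 4],
]

alpha = cronbach_alpha(responses)
item_total = corrected_item_total(responses)
flagged = [i for i, r in enumerate(item_total) if r < MIN_ITEM_TOTAL]

print(f"alpha = {alpha:.2f} ({interpret_alpha(alpha)})")
print("item-total correlations:", [round(r, 2) for r in item_total])
print("items below the .30 minimum:", flagged)
```

Items that fall below the .30 minimum are candidates for revision or removal, since they do not move together with the rest of the scale.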

Split Half: refers to determining a correlation between the first half of the measurement and the second half of the measurement (i.e., we would expect answers to the first half to be similar to the second half).

Possible Advantages

• Simplest method - easy to perform

• Time and Cost Effective

Possible Disadvantages

• Many ways of splitting

• Each split yields a somewhat different reliability estimate

• Which is the real reliability of the test?
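To illustrate why the choice of split matters, the sketch below computes a split-half correlation twice on the same data: once with a first-half versus second-half split, and once with an odd-item versus even-item split. The response data and function names are hypothetical, and the odd-even split is introduced here only as a second example of splitting; the slides describe only the first-half and second-half split.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    return covariation / sqrt(spread_x * spread_y)


def half_scores(responses, item_indexes):
    """Sum each respondent's scores over the chosen items."""
    return [sum(row[i] for i in item_indexes) for row in responses]


def split_half_r(responses, half_a, half_b):
    """Correlate respondents' scores on two halves of the same test."""
    return pearson_r(half_scores(responses, half_a),
                     half_scores(responses, half_b))


# Hypothetical data: five respondents answering four items on a 1-5 scale.
responses = [
    [4, 5, 3, 5],
    [2, 3, 3, 2],
    [5, 4, 4, 4],
    [1, 2, 2, 3],
    [3, 3, 4, 4],
]

n_items = len(responses[0])

# Split 1: first half of the items versus second half.
first_second = split_half_r(responses,
                            range(0, n_items // 2),
                            range(n_items // 2, n_items))

# Split 2: items in odd positions versus items in even positions.
odd_even = split_half_r(responses,
                        range(0, n_items, 2),
                        range(1, n_items, 2))

print(f"first/second split: r = {first_second:.2f}")
print(f"odd/even split:     r = {odd_even:.2f}")
```

The two splits use exactly the same responses yet give different coefficients, which is the ambiguity the disadvantages above point to.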

FACTORS AFFECTING RELIABILITY

• Poor or unclear directions given during administration, or inaccurate scoring, can affect reliability.

For example, suppose you were told that your scores on a sociability measure would determine your promotion. Your responses would more likely reflect what you think the evaluators want than how you actually behave.

• The larger the number of items, the greater the chance for high reliability.

For example, twenty questions on your leadership style are more likely to give a consistent result than four questions.

• Remedy: Use longer tests or accumulate scores from short tests.
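The effect of test length can be sketched with the Spearman-Brown prophecy formula, which is a standard psychometric formula and is not named in the slides. It assumes the added items are comparable in quality to the existing ones. The starting reliability of .50 for the four-question instrument is a hypothetical value chosen only for illustration.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability when a test is lengthened by length_factor.

    Assumes the added items are comparable to the existing items.
    """
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


# Hypothetical: a four-question instrument with reliability .50,
# lengthened to twenty comparable questions (five times as long).
short_test_reliability = 0.50
predicted = spearman_brown(short_test_reliability, length_factor=20 / 4)

print(f"Predicted reliability of the longer test: {predicted:.2f}")
```

Under these assumptions the predicted reliability rises as items are added, with diminishing gains for each additional item.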

For example, if you took an instrument in August when you had a terrible flu and then in December when you were feeling quite good, we might see a difference in your response consistency. If you were under considerable stress of some sort, or if you were interrupted while answering the instrument questions, you might give different responses.

• The shorter the time interval between administrations, the greater the chance for high reliability correlation coefficients.

• As we have experiences, we tend to adjust our views a little from time to time. Therefore, the time interval between the first time we took an instrument and the second time is really an "experience" interval.

• Experience happens, and it influences how we see things. Because internal consistency has no time lapse, one can expect it to have the highest reliability correlation coefficient.

THANK YOU
