8/9/2019 PPT Valid Reliable
VALIDITY AND RELIABILITY
Delivered By:
Assoc. Prof. Dr. Othman Md. Johan
Primary Criteria of Evaluation
Two primary criteria of evaluation in any measurement or observation are:
- Whether we are measuring what we intend to measure.
- Whether the same measurement process yields the same results.
These two concepts are validity and reliability.
Reliability
Reliability is concerned with questions of stability and consistency.
- Does the same measurement tool yield stable and consistent results when repeated over time?
- E.g., a tape measure is a highly reliable measuring instrument.
Validity
Validity refers to the extent to which we are measuring what we hope to measure (and what we think we are measuring).
E.g., a tape measure that has been created with accurate spacing for inches, feet, etc. should yield valid results as well.
I.e., measuring this piece of wood with a "good" tape measure should produce a correct measurement of the wood's length.
To apply these concepts to psychological assessment and evaluation, we want to use measurement tools that are both reliable and valid.
We want questions that yield consistent responses when asked multiple times; this is reliability.
Similarly, we want questions that get accurate responses from respondents; this is validity.
Reliability
Reliability refers to a condition where a measurement process yields consistent scores (given an unchanged measured phenomenon) over repeated measurements.
Three criteria of reliability:
- Test-retest reliability
- Inter-item reliability
- Interobserver reliability
Test-retest Reliability
A measure has test-retest reliability when a researcher administers the same measurement tool multiple times (asks the same question, follows the same research procedures, etc.), there has been no change in whatever is being measured, and the results obtained are consistent.
E.g., when a researcher asks the same person the same question twice ("What's your name?") and gets back the same result both times, the measure has test-retest reliability.
Measurement of the piece of wood discussed earlier has high test-retest reliability.
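One common way to put a number on test-retest consistency is to correlate the scores from the two administrations. The sketch below assumes a Pearson correlation is an acceptable index (the slides do not name a statistic), and the scores and the name pearson_r are made up for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares around each mean.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return cross / sqrt(ss_x * ss_y)


# Hypothetical scores for the same five people, measured twice
# with the same instrument (illustrative data, not from the slides).
first_administration = [12, 15, 9, 20, 17]
second_administration = [13, 14, 10, 19, 18]

test_retest_r = pearson_r(first_administration, second_administration)
print(f"test-retest correlation: {test_retest_r:.3f}")
```

A value close to 1 suggests the instrument ranks people similarly on both occasions, provided the measured phenomenon itself has not changed in between.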
Inter-item reliability
This is a dimension that applies to cases where multiple items are used to measure a single concept.
In such cases, answers to a set of questions designed to measure some single concept (e.g., altruism) should be associated with each other.
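The slides do not name a statistic for inter-item reliability. A widely used one is Cronbach's alpha, which compares the variance of individual items with the variance of respondents' total scores. The sketch below assumes that choice; the function name and the three-item altruism data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha.

    item_scores: a list of items, each a list holding one score per
    respondent (all items must cover the same respondents, same order).
    """
    k = len(item_scores)
    if k < 2:
        raise ValueError("need at least two items")
    # Total score of each respondent across all items.
    totals = [sum(responses) for responses in zip(*item_scores)]
    item_variance_sum = sum(variance(item) for item in item_scores)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))


# Hypothetical answers of five respondents to three altruism items (1-5 scale).
altruism_items = [
    [4, 2, 5, 3, 1],
    [5, 2, 4, 3, 1],
    [4, 1, 5, 3, 2],
]

alpha = cronbach_alpha(altruism_items)
print(f"Cronbach's alpha: {alpha:.3f}")
```

Higher values indicate that the items move together, which is what we expect when they all measure the same concept.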
Interobserver reliability
The extent to which different interviewers or observers using the same measure get equivalent results.
If different observers or interviewers use the same instrument to score the same thing, their scores should match.
- E.g., the interobserver reliability of an observational assessment of parent-child interaction is often evaluated by showing two observers a videotape of a parent and child at play.
If the instrument has high interobserver reliability, the scores of the two observers should match.
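Agreement between two observers can be summarized with Cohen's kappa, which corrects the raw proportion of matching scores for the agreement expected by chance. Kappa is an assumption here (the slides only say the scores should match), and the category labels and codes below are invented for illustration.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who each coded the same units."""
    n = len(rater_a)
    if n == 0 or n != len(rater_b):
        raise ValueError("both raters must code the same non-empty set of units")
    # Proportion of units on which the two raters gave the same code.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's category frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical codes given by two observers to ten segments of the same video.
observer_1 = ["warm", "warm", "neutral", "harsh", "warm",
              "neutral", "warm", "harsh", "neutral", "warm"]
observer_2 = ["warm", "warm", "neutral", "harsh", "neutral",
              "neutral", "warm", "harsh", "neutral", "warm"]

kappa = cohen_kappa(observer_1, observer_2)
print(f"Cohen's kappa: {kappa:.3f}")
```

Here the observers agree on nine of ten segments; kappa is somewhat lower than 0.9 because part of that agreement would be expected by chance alone.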
Validity
Validity refers to the extent to which we are measuring what we hope to measure (and what we think we are measuring).
A valid measure should satisfy four criteria:
- Face Validity
- Content Validity
- Criterion-related Validity
- Construct Validity
Face Validity
An assessment of whether a measure appears, on the face of it, to measure the concept it is intended to measure.
This is a very minimal assessment: if a measure cannot satisfy this criterion, then the other criteria are inconsequential.
Examples of observational measures of behavior that would have face validity:
- Striking out at another person would have face validity as an indicator of aggression.
- Offering assistance to a stranger would meet the criterion of face validity for helping.
- However, asking people about their favorite movie to measure racial prejudice has little face validity.
Content Validity
The extent to which a measure adequately represents all facets of a concept.
Consider a series of questions that serve as indicators of depression (don't feel like eating, lost interest in things usually enjoyed, etc.).
If there were other kinds of common behaviors that mark a person as depressed that were not included in the index, then the index would have low content validity since it did not adequately represent all facets of the concept.
Criterion-related validity
Criterion-related validity applies to instruments that have been developed to be useful as indicators of a specific trait or behavior, either now or in the future.
E.g., a driving test is a social measurement that has pretty good predictive validity.
That is to say, an individual's performance on a driving test correlates well with his/her driving ability.
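The driving-test claim can be checked by correlating test scores with a later criterion measure. The sketch below assumes a Spearman rank correlation and a hypothetical instructor rating of later driving ability as the criterion; neither is specified in the slides, and the simple rank formula used here only holds when there are no tied values.

```python
def ranks(values):
    """Rank values from 1 (smallest) upward; this sketch rejects ties."""
    if len(set(values)) != len(values):
        raise ValueError("this sketch does not handle tied values")
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


def spearman_rho(x, y):
    """Spearman rank correlation for untied data."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Hypothetical data: driving-test scores and a later rating of driving ability
# for the same five people (illustrative values only).
driving_test_scores = [55, 70, 62, 90, 81]
later_ability_ratings = [3, 5, 6, 9, 7]

rho = spearman_rho(driving_test_scores, later_ability_ratings)
print(f"rank correlation between test and criterion: {rho:.2f}")
```

A strong positive correlation with the criterion is the kind of evidence that supports predictive validity.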
Construct Validity
There is not necessarily a pertinent criterion available for many things we want to measure, so we relate the measure to other measures as specified by theory or previous research.
Construct validity thus concerns the extent to which a measure is related to other measures as specified by theory or previous research.
Does a measure stack up with other variables the way we expect it to?
E.g., self-esteem refers to a person's sense of self-worth or self-respect.
Construct Validity
Clinical observations in psychology had shown that people who had low self-esteem often had depression.
Therefore, to establish the construct validity of the self-esteem measure, the researchers showed that those with higher scores on the self-esteem measure had lower depression scores, while those with low self-esteem had higher rates of depression.
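The comparison described here can be sketched as a simple group contrast: split respondents at the median self-esteem score and compare the mean depression score of the two groups. The median split, the function name, and all scores below are assumptions for illustration, not the procedure or data of the researchers mentioned.

```python
from statistics import mean, median


def outcome_means_by_median_split(predictor, outcome):
    """Mean outcome for cases above vs. at-or-below the predictor's median."""
    if len(predictor) != len(outcome):
        raise ValueError("predictor and outcome must have the same length")
    cut = median(predictor)
    high = [o for p, o in zip(predictor, outcome) if p > cut]
    low = [o for p, o in zip(predictor, outcome) if p <= cut]
    if not high or not low:
        raise ValueError("median split left one group empty")
    return mean(high), mean(low)


# Hypothetical scores for eight respondents (illustrative values only).
self_esteem = [30, 12, 25, 8, 20, 15, 28, 10]
depression = [5, 18, 7, 22, 10, 14, 6, 20]

high_se_mean, low_se_mean = outcome_means_by_median_split(self_esteem, depression)
print(f"mean depression, high self-esteem group: {high_se_mean}")
print(f"mean depression, low self-esteem group:  {low_se_mean}")
```

If theory predicts that low self-esteem goes with depression, the high self-esteem group should show the lower mean depression score, as it does in this made-up data.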
Group Work
Factors affecting test validity
Factors affecting test reliability
State the method/methods used to determine:
- Face Validity
- Content Validity
- Criterion-related Validity
- Construct Validity
State the method/methods used to determine:
- Test-retest Reliability
- Inter-item reliability
- Interobserver reliability