Development of a Public-Domain Measure of Random Responding
Sarah E. Stegall, Darrin L. Rogers, Emanuel Cervantes
Abstract
The Semantic Inconsistency Scale (SIS), a no-cost tool for measuring random responding in questionnaire research, was developed and validated in two independent samples. It shows strong initial evidence of validity: it detects not only computer-generated random responses but also invalid responding produced under more realistic conditions.
Introduction
Invalid responding can threaten validity and interpretation (Huang et al., 2012; however, see Costa & McCrae, 1997, for alternative views).
Random responding (RR; Archer & Smith, 2008) scales measure the consistency of participants' responses to pairs of items with similar (or opposite) meanings (e.g., MMPI-2 scale VRIN; Butcher et al., 2001; PAI scale INC; Morey, 2007).
Methods previously used to develop and evaluate RR scales include:
• Comparing responses from participants instructed to answer questionnaires randomly with responses from participants given standard instructions (Berry et al., 1991; Cramer, 1995; Gallen & Berry, 1996)
• Comparing real responses with computer-generated random responses (Charter & Lopez, 2003)
"Real-world" manipulations:
• More ecologically valid, more externally valid
• Not used so far
Existing RR scales: available in commercially marketed assessments only
Semantic Inconsistency Scale (SIS): a public-domain measure of RR for use with questionnaires
Method
Participants: 482 undergraduate students
• 75% female, 25% male
• 95% Hispanic
Data Collection Phases:
• Phase 1 (February-July, 2012): N = 286, 75% female
• Phase 2 (August-December, 2012): N = 196, 81% female
Procedures & Materials: Anonymous online survey
• Big Five Inventory (BFI; John & Srivastava, 1999)
• SIS item pool
Example items, presented under the stem "I see myself as someone who…":
• Fails to notice beauty until others comment on it
• Leaves things unfinished
SIS Construction
30 pairs of items from the International Personality Item Pool (Goldberg et al., 2006)
Item pairs judged to be semantically related:
• Very similar in meaning: "I need a push to get started." / "I find it difficult to get down to work."
• Apparently opposite in meaning: "I spend time thinking about past mistakes." / "I don't worry about things that have already happened."
Degree of inconsistency in responses indicates RR
Items reworded to fit the stem "I see myself as someone who…":
• Similar items: "Needs a push to get started." / "Finds it difficult to get down to work."
• Opposite items: "Spends time thinking about past mistakes." / "Doesn't worry about things that have already happened." (*Note: reverse coded*)
Response scale: Strongly Disagree, Disagree, Neither Agree Nor Disagree, Agree, Strongly Agree
[Illustration: example responses to each pair on the five-point scale, showing a difference of 1 for the similar items and a difference of 3 for the opposite items]
Experimental Manipulation:
• "Quick" condition (Q or "quick"): subtly encouraged to complete the task quickly; in-test messages emphasized the importance of students' time
• Control condition (A or "accurate"): instructed to complete the survey accurately; in-test messages emphasized accuracy
Scale development and scoring:
• Phase 1: Selection and validation of final item pairs; pairs were chosen to maximize correlations, resulting in a 22-item (11-pair) SIS
• Phase 2: SIS assessed using participants' responses; SIS score = mean discrepancy in SIS pairs (possible range: 0-4)
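The scoring rule (mean discrepancy across pairs, range 0-4) can be sketched in a few lines of Python. This is only an illustration, not the authors' scoring code: it assumes responses are coded 1-5 from Strongly Disagree to Strongly Agree, that the second item of an opposite-meaning pair is reverse coded before differencing, and it uses a hypothetical two-pair key with made-up item names and responses.

```python
SCALE_MIN, SCALE_MAX = 1, 5  # assumed coding: 1 = Strongly Disagree ... 5 = Strongly Agree


def reverse_code(response):
    """Flip a response on the assumed 1-5 scale (1 <-> 5, 2 <-> 4, 3 stays 3)."""
    return SCALE_MIN + SCALE_MAX - response


def pair_discrepancy(first, second, opposite):
    """Absolute difference within one pair; opposite-meaning pairs are reverse coded first."""
    if opposite:
        second = reverse_code(second)
    return abs(first - second)


def sis_score(responses, pairs):
    """Mean discrepancy over all pairs (0 = fully consistent, 4 = maximally inconsistent)."""
    discrepancies = [
        pair_discrepancy(responses[item_a], responses[item_b], opposite)
        for item_a, item_b, opposite in pairs
    ]
    return sum(discrepancies) / len(discrepancies)


# Hypothetical key: (first item, second item, is the pair opposite in meaning?)
example_pairs = [
    ("needs_push", "hard_to_start_work", False),
    ("thinks_about_mistakes", "doesnt_worry_about_past", True),
]
# Made-up responses from one participant
example_responses = {
    "needs_push": 4,
    "hard_to_start_work": 5,
    "thinks_about_mistakes": 5,
    "doesnt_worry_about_past": 4,
}

print(sis_score(example_responses, example_pairs))  # (1 + 3) / 2 = 2.0
```

In this made-up example the similar pair differs by 1 and the opposite pair differs by 3 after reverse coding, so the two-pair score is 2.0.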
Results
• Q vs. A comparison
• Survey completion time
• Attention to survey content
• Real vs. random responses
All results calculated on Phase 2 sample only (unless otherwise specified)
Q vs. A Comparison
Median SIS scores: Q > A (Wilcoxon test z = 2.179, p < .05).
[Figure 1 plot: "IRI-22 by Condition in Phase 2"; x-axis: Condition (a, q); y-axis: 20% Trimmed Mean SIS Score]
Figure 1. Trimmed (20%) means for SIS scores in condition A (“accurate”) versus Q (“quick”).
Survey Completion Time
Correlation between SIS scores and time to complete the full survey: Spearman's rho = -.13 (p = .06)
Attention to Survey Content
Multiple-choice questions about the content of survey items participants had just seen and responded to
• Measure: number of questions answered incorrectly
• Prediction: positive correlation with SIS
• Result: no association, Spearman's rho = .04 (p > .05)
Real vs. 100% Random Responding
SIS discrimination between:
• 100% random responding (computer-generated)
• Actual participant responses
Phase 2 responses were compared with 100,000 records of randomly generated responses, and the SIS was scored on all records.
Real scores < random-response scores (t = 31.56, p < .001; Figure 2).
Figure 2. Distribution of true Phase 2 SIS scores (blue) versus randomly-generated profiles (red).
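A rough sketch of how 100% random records might be generated and scored, offered as an illustration rather than the authors' procedure. It assumes each response is drawn independently and uniformly from a 1-5 scale; the number of opposite-meaning pairs (`N_OPPOSITE`) is a hypothetical value, since the slides do not report it. Under these assumptions the expected absolute difference between two independent uniform responses is 40/25 = 1.6, with or without reverse coding, so simulated SIS scores should center near 1.6.

```python
import random

N_PAIRS = 11      # the final SIS has 11 pairs (22 items)
N_OPPOSITE = 5    # hypothetical: the number of opposite-meaning pairs is not given in the slides
N_RECORDS = 100_000


def random_record_score(rng):
    """SIS score for one record whose responses are all uniform random draws from 1-5."""
    total = 0
    for pair_index in range(N_PAIRS):
        first = rng.randint(1, 5)
        second = rng.randint(1, 5)
        if pair_index < N_OPPOSITE:
            second = 6 - second  # reverse code the second item of an opposite pair
        total += abs(first - second)
    return total / N_PAIRS


rng = random.Random(2012)  # fixed seed so the sketch is reproducible
random_scores = [random_record_score(rng) for _ in range(N_RECORDS)]
mean_random_score = sum(random_scores) / len(random_scores)
print(f"mean SIS score of random records: {mean_random_score:.3f}")
```

Real participants who answer consistently should score well below this random-responding baseline, which is the contrast shown in Figure 2.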
SIS sensitivity of discrimination between:
• True Phase 2 records
• An equal number of randomly generated records
Receiver Operating Characteristic (ROC) analysis:
• Area under the curve (AUC) indexes the discrimination ability of the test
• AUC = .95 (excellent discrimination ability)
Figure 3. ROC analysis for Phase 2 responses vs. (100%) randomly-generated response records.
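For readers unfamiliar with AUC, it can be read as the probability that a randomly chosen random-response record receives a higher SIS score than a randomly chosen real record, with ties counted as half. The sketch below computes it by direct pairwise comparison; the scores are made-up numbers for illustration, not data from the study, and the function name `auc` is ours.

```python
def auc(real_scores, random_scores):
    """Area under the ROC curve via pairwise comparison (ties count as half a win)."""
    wins = 0.0
    for random_score in random_scores:
        for real_score in real_scores:
            if random_score > real_score:
                wins += 1.0
            elif random_score == real_score:
                wins += 0.5
    return wins / (len(real_scores) * len(random_scores))


# Made-up SIS scores for illustration only
real_example = [0.3, 0.5, 0.6, 0.9, 1.2]
random_example = [0.8, 1.4, 1.6, 1.9, 2.1]

print(auc(real_example, random_example))  # 23 of 25 pairs are ordered correctly -> 0.92
```

An AUC of .5 means the scale separates the two groups no better than chance, and 1.0 means perfect separation.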
Real vs. Partial Random Responding
1. Dataset split in half randomly
• Control group: original (real) responses
• Random group: a randomly selected X% of responses replaced with random responses
• X goes from 1% to 100% (i.e., the process is done 100 times)
2. SIS scored and AUC calculated: SIS discrimination between Control and Random groups
3. Result: SIS discrimination between real and partial (from 1 to 100%) random responding
4. We repeated this entire process 100 times, to "even out" random selection
• Each run: AUCs comparing real responses to real + partial random responses, from 0% to 100% random
• Mean of 100 AUCs at each point
[Figure 4 build-up panels: 1, 2, 3, 5, 10, and 100 runs (real vs. 0% to 100% random)]
Figure 4. AUCs for 100 runs of SIS discrimination between original profiles and partially (1% through 100%) random profiles. Light blue lines are AUCs for 100 individual runs; dark blue line indicates mean AUC at each point.
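Steps 1 through 4 can be sketched as two small helpers. This is an illustration under stated assumptions, not the authors' code: the slides do not say whether the X% of responses was selected within each record or across the whole random group, so the sketch replaces X% of the responses within each record; responses are assumed to be coded 1-5; and the function names are hypothetical.

```python
import random


def split_in_half(records, rng):
    """Randomly split the records into a control half and a to-be-randomized half."""
    shuffled = list(records)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def partially_randomize(record, percent, rng):
    """Copy a record, replacing `percent`% of its responses with uniform draws from 1-5.

    Assumption: replacement is done within each record. A replaced response can
    happen to equal the original one.
    """
    n_replace = round(len(record) * percent / 100)
    positions = rng.sample(range(len(record)), n_replace)
    mixed = list(record)
    for position in positions:
        mixed[position] = rng.randint(1, 5)
    return mixed


rng = random.Random(0)
# Made-up data: ten records of 22 responses, all "Neither Agree Nor Disagree"
toy_records = [[3] * 22 for _ in range(10)]
control_group, random_group = split_in_half(toy_records, rng)
half_random = [partially_randomize(record, 50, rng) for record in random_group]
changed = sum(a != b for a, b in zip(random_group[0], half_random[0]))
print(f"responses changed in first record at X = 50%: {changed} of 22")
```

One full run would loop X over the percentages, score the SIS in both groups, and compute an AUC at each X; repeating the run with fresh random splits and averaging the AUCs at each X gives a mean curve like the dark blue line in Figure 4.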
Optimal Cutoff Scores
Discussion
Semantic Inconsistency Scale (SIS)
• Phase 1: Scale development (22-item/11-pair)
• Phase 2: Validation
Identification of random responding
• Excellent with 100% random responses
• Fair performance even with protocols having less than 20% random responding
Discrimination between "Quick" and "Accurate" conditions:
• Participants primed and instructed to answer hastily
• Participants given regular instructions
• Performs as well as (if not better than) comparable tests
• Easily inserted into a variety of psychological and personality tests
• Modification of item stems or formats may allow use with an even wider range of tests
Limitations and Future Directions:
• Not appropriate for all test varieties (e.g., very short research or clinical protocols)
• Random responding is not always a problem: it depends on the clinical/research situation, and the SIS might help you know whether it is
Conclusion
SIS = Robust and valid measure of random responding
FREE: Creative Commons licensed
References
Archer, R. P., & Smith, S. R. (2008). Personality assessment. CRC Press.
Berry, D. R., Wetter, M. W., Baer, R. A., Widiger, T. A., Sumpter, J. C., Reynolds, S. K., & Hallam, R. A. (1991). Detection of random responding on the MMPI-2: Utility of F, back F, and VRIN scales. Psychological Assessment: A Journal of Consulting and Clinical Psychology, 3(3), 418-423. doi:10.1037/1040-3590.3.3.418
Butcher, J. N., Graham, J. R., Ben-Porath, Y. S., Tellegen, A., Dahlstrom, W. G., & Kaemmer, B. (2001). MMPI-2 (Minnesota Multiphasic Personality Inventory-2): Manual for administration and scoring (2nd ed.). Minneapolis, MN: University of Minnesota Press.
Charter, R. A., & Lopez, M. N. (2003). MMPI-2: Confidence intervals for random responding to the F, F Back, and VRIN scales. Journal of Clinical Psychology, 59(9), 985-990.
Costa Jr., P. T., & McCrae, R. R. (1997). Stability and change in personality assessment: The Revised NEO Personality Inventory in the year 2000. Journal of Personality Assessment, 68(1), 86.
Cramer, K. M. (1995). Comparing three new MMPI-2 randomness indices in a novel procedure for random profile derivation. Journal of Personality Assessment, 65(3), 514-520.
Gallen, R. T., & Berry, D. R. (1996). Detection of random responding in MMPI-2 protocols. Assessment, 3(2), 171-178.
Goldberg, L. R., Johnson, J. A., Eber, H. W., Hogan, R., Ashton, M. C., Cloninger, C. R., & Gough, H. G. (2006). The international personality item pool and the future of public-domain personality measures. Journal of Research in Personality, 40(1), 84-96.
Huang, J. L., Curran, P. G., Keeney, J., Poposki, E. M., & DeShon, R. P. (2012). Detecting and deterring insufficient effort responding to surveys. Journal of Business and Psychology, 27(1), 99-114.
John, O. P., & Srivastava, S. (1999). The Big Five trait taxonomy: History, measurement, and theoretical perspectives. Handbook of personality: Theory and research, 2, 102-138.
Morey, L. C. (2007). Personality assessment inventory (PAI).
Contact
http://www.darrinlrogers.com/dissemination/
Sarah Stegall: [email protected]
Darrin Rogers: [email protected]
Emanuel Cervantes: [email protected]