Psychophysiological and behavioral measures for
detecting concealed information: The role of memory
for crime details
GALIT NAHARI (a) AND GERSHON BEN-SHAKHAR (b)
(a) Department of Criminology, Bar Ilan University, Ramat Gan, Israel
(b) Department of Psychology, The Hebrew University of Jerusalem, Jerusalem, Israel
Abstract
This study examined the role of memory for crime details in detecting concealed information using the electrodermal
measure, Symptom Validity Test, and Number Guessing Test. Participants were randomly assigned to three groups:
guilty, who committed a mock theft; informed-innocents, who were exposed to crime-relevant items; and uninformed-
innocents, who had no crime-relevant information. Participants were tested immediately or 1 week later. Results
showed (a) all tests detected the guilty in the immediate condition, and combining the tests improved detection
efficiency; (b) tests’ efficiency declined in the delayed condition, mainly for peripheral details; (c) no distinction between
guilty and informed innocents was possible in the immediate condition, yet some distinction emerged in the delayed condition.
These findings suggest that, while time delay may somewhat reduce the ability to detect the guilty, it also diminishes the
danger of accusing informed-innocents.
Descriptors: Concealed Information Test, Symptom Validity Test, Skin conductance response, Memory
Scientists and forensic experts have attempted for many years to
develop instruments and methods for the purpose of detecting
deception (e.g., Vrij, 2008). One notable approach, which has
spawned several methods over the past century, is the use of
psychophysiological responses (see, e.g., Ben-Shakhar & Fu-
redy, 1990; Marston, 1917; Raskin, 1989; Reid & Inbau, 1977).
In this study, we focus on just one of the two prominent methods
of psychophysiological detection, known as the Guilty Knowl-
edge Test (GKT) or the Concealed Information Test (CIT). This
method, which is designed to detect concealed knowledge, rather
than deception, is based on sound theoretical principles and
proper controls and therefore satisfies the necessary requirements
of an objective test (see Ben-Shakhar, Bar-Hillel, & Kremnitzer,
2002; Ben-Shakhar & Elaad, 2002a; Lykken, 1974, 1998).
The CIT (Lykken, 1959, 1960) utilizes a series of multiple-
choice questions, each having one relevant alternative (e.g., a
feature of the crime under investigation) and several neutral
(control) alternatives, chosen so that an unknowledgeable (in-
nocent) suspect would not be able to discriminate them from the
relevant alternative (Lykken, 1998). These relevant items are
significant only for knowledgeable (guilty) individuals and, thus,
if the suspect’s physiological responses to the relevant alternative
are consistently larger than to the neutral alternatives, knowledge
about the event (e.g., crime) is inferred. As long as information
about the event has not leaked out and assuming that each al-
ternative appears equally plausible to an individual with no guilty
knowledge, the probability that an innocent suspect would pro-
duce consistently larger responses to the relevant than to the
neutral alternatives depends only on the number of questions and
the number of alternative answers per question, and hence it can
be controlled such that maximal protection for the innocent is
provided.
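The chance argument above can be made concrete with a short sketch. It assumes a simple decision rule that is not taken from the source: an examinee is classified as knowledgeable when the relevant alternative elicits the largest response on at least `min_hits` of the questions. Under that assumption, and with equally plausible alternatives, the false positive rate is a binomial tail probability. The function name and the example values are hypothetical.

```python
from math import comb


def chance_hit_probability(n_questions: int, n_alternatives: int, min_hits: int) -> float:
    """Probability that an uninformed examinee gives the largest response to the
    relevant alternative on at least `min_hits` of `n_questions` questions,
    assuming every alternative is equally likely to draw the largest response."""
    p = 1.0 / n_alternatives  # chance that the relevant item "wins" one question
    return sum(
        comb(n_questions, k) * p**k * (1.0 - p) ** (n_questions - k)
        for k in range(min_hits, n_questions + 1)
    )


if __name__ == "__main__":
    # Illustrative values only: 10 questions with 5 alternatives each.
    for hits in range(1, 11):
        print(hits, round(chance_hit_probability(10, 5, hits), 4))
```

Raising the number of questions or alternatives, or the required number of hits, lowers this probability, which is the sense in which protection for the innocent can be controlled.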
Extensive research conducted since the early 1960s has dem-
onstrated that the CIT can be successfully used for detecting
relevant information and discriminating between knowledgeable
(guilty) and innocent individuals (e.g., Ben-Shakhar & Furedy,
1990; Ben-Shakhar & Elaad, 2003; Elaad, 1998; Lykken, 1959,
1960, 1998). In the last decade, the interest in the CIT seems to be
growing, and various studies examining the mechanisms under-
lying this method, as well as applied questions related to its pos-
sible use as an aid in criminal investigations, have been published
(e.g., Gamer, Bauermann, Stoeter, & Vossel, 2007; Gamer &
Berti, 2010; Langleben et al., 2005; Rosenfeld et al., 2008; Rosenfeld,
Shue, & Singer, 2007; Verschuere, Crombez, De Clercq, &
Koster, 2004; Verschuere, Crombez, & Koster, 2004).
However, in spite of the extensive research conducted on the CIT
and its impressive validity estimates, the method has been applied
extensively only in Japan (see Nakayama, 2002; Osugi, 2010). Many
possible accounts have been offered to explain this gap between
research and practice (e.g., Iacono, 2010; Krapohl, 2010; Podlesny,
This research was funded by grants from the Israel Science Foundation
to Gershon Ben-Shakhar. We thank Keren Maoz, Assaf Breska,
and Tamar Pelet for their assistance in this research and Ewout Meijer for
his helpful comments.
Address correspondence to: Galit Nahari, Department of Criminology,
Bar-Ilan University, Ramat Gan, 52900, Israel. E-mail: [email protected]
Psychophysiology (2010), 1–12. Wiley Periodicals, Inc. Printed in the USA. Copyright © 2010 Society for Psychophysiological Research. DOI: 10.1111/j.1469-8986.2010.01148.x
1993), but one notable limitation of the bulk of CIT research conducted
so far is its questionable external validity. Estimates
of CIT validity were based almost exclusively on mock-crime studies,
which differ in many important respects from real polygraph examinations.
Roughly, these differences can be classified into two major
categories: (a) motivational-emotional factors, related to the differences
between committing a real crime and following the instructions
of an experimenter to commit a mock crime, as well as differences
related to the possible consequences of an incriminating polygraph
test compared with failing a laboratory CIT (which typically means
that the participant will not receive a bonus of a few dollars); and
(b) cognitive factors, related to processing the critical information
during the crime and the ability to remember this information during
the test.
As the present study focuses only on the second category of
cognitive factors, only this category will be elaborated. In the
typical mock-crime experiment, it is guaranteed that all subjects
learn all the relevant items (e.g., six features of the mock crime,
such as the color of an envelope stolen and the amount of money
it contained). Furthermore, subjects are typically tested imme-
diately after being exposed to this critical information, thus
memory does not play a role in the experimental situation. In real
life, things are typically entirely different. The guilty person is
faced with a complex scene, and it cannot be assumed that all
details were indeed noticed, processed, and stored in memory.
Criminal suspects are very rarely tested immediately after
committing the criminal act. In most cases they are tested
days, weeks, and sometimes months after the crime was com-
mitted (see Ben-Shakhar & Furedy, 1990; Carmel, Dayan,
Naveh, Raveh, & Ben-Shakhar, 2003).
Carmel et al. (2003) were the first to systematically examine
these cognitive aspects of the external validity of CIT experiments
by comparing the standard mock crime procedure with a more
realistic type of mock crime and by comparing immediate and
delayed CITs. The results of this study revealed that the ‘‘real-
istic’’ mock-crime was associated with overall lower recall rates
and weaker detection efficiency than the standard procedure.
However, these effects were mediated by the type of CIT ques-
tions used, such that the decline in memory and detection effi-
ciency was observed mainly for peripheral items that were not
directly related to the mock crime (e.g., a picture on the wall), but
not for items that were central to the event (e.g., the amount of
money stolen). The results further indicated that a CIT based
exclusively on the central items was unaffected by the type of
mock-crime procedure. More recently, Gamer, Kosiol and
Vossel (2010) also demonstrated that central items, but not pe-
ripheral ones, are recalled after a 2-week period. Thus, these
studies imply that a careful selection of central items (e.g., modus
operandi, type of weapon used) can produce high accuracy levels,
not only in the artificial laboratory conditions, but also in more
realistic settings.
Another potential limitation of the CIT is the possibility that,
in actual criminal cases, some critical information may leak out
to innocent suspects. Leakage of information to unaware sus-
pects may lead to enhanced responses to these items and even-
tually to a misclassification of the informed innocent suspects as
guilty (e.g., Bradley, Barefoot, & Arsenault, 2010). Several stud-
ies examined the effects of exposing the critical information to
‘‘innocent’’ subjects in mock-crime experiments (e.g., Ben-
Shakhar, Gronau, & Elaad, 1999; Bradley, MacLaren, & Carle,
1997; Bradley & Rettinger, 1992; Bradley & Warfield, 1984)
and generally demonstrated that, although informed innocent
subjects showed smaller responses to the critical items than guilty
subjects, they did show significantly larger responses to these
items when compared with uninformed innocent subjects. Bradley
and his colleagues (e.g., Bradley & Warfield, 1984; Bradley et
al., 1997) proposed a method, labeled the Guilty Action Test
(GAT), in which subjects are asked about their actions rather
than their knowledge. Bradley et al. (1997) demonstrated that,
while the GAT was associated with a smaller rate of false positive
outcomes in informed innocents than the standard version of the
CIT, it still produced a much larger rate of false positive outcomes
in informed innocents compared with uninformed innocents.
Recently, Gamer, Verschuere, Crombez, and Vossel (2008)
used the GAT and compared ‘‘guilty’’ subjects with ‘‘informed
innocents’’ both when tested immediately after committing a
mock crime and when tested 2 weeks later. They found that,
while ‘‘guilty’’ subjects tended to forget only the peripheral items
during this 2-week period, the informed innocents forgot all
items. Consequently, detection of guilty subjects remained stable
(i.e., the areas under the Receiver Operating Characteristic
(ROC) were 0.89 and 0.90 in the immediate and delayed con-
ditions, respectively), whereas erroneous detection of informed
innocents was significantly reduced in the delayed condition (the
ROC areas were 0.95 and 0.75 in the immediate and delayed
conditions, respectively).
The purpose of the present study is to continue and extend the
line of research initiated by Carmel et al. (2003) and Gamer et al.
(2010). Specifically, we used the more realistic type of mock
crime proposed by Carmel et al. (2003) and a 3 × 2 between-subjects
design with guilt (‘‘guilty,’’ ‘‘informed innocents,’’ and
‘‘uninformed innocents’’) and time of testing (immediate vs. de-
layed by 1 week) as the two orthogonal factors. Furthermore, in
addition to measuring skin conductance, which has been dem-
onstrated as the most efficient autonomic measure in CIT re-
search (e.g., Gamer, Verschuere, Crombez, & Vossel, 2008), we
examined two behavioral measures that have been rarely applied
for detecting concealed information.
Both of these measures are based on asking examinees, who
deny knowledge of some critical items, to guess these items.
Effective concealment is possible when guessing is random (i.e.,
where the critical alternative is guessed with the same probability
as all other alternatives), but producing random guesses may be
very difficult for those who are actually aware of the true alter-
natives. Consequently, the outcome of multiple guessing at-
tempts may differentiate knowledgeable (who would not be able
to produce random guessing) and unknowledgeable examinees
(whose guesses will be random). Specifically, we adopted the
Symptom Validity Test (SVT), which is a forced-choice self-re-
port test (with two alternative answers for each question) that has
been used to detect malingering in various contexts (e.g., Me-
rckelbach, Hauer, & Rassin, 2002; Pankratz, Fausti, & Peed,
1975; Verschuere, Meijer, & Crombez, 2008). The SVT may be a
promising tool for detecting concealed information because it is
based on an entirely different rationale than the physiological
measures and thus may add non-redundant information. Re-
cently, Meijer, Smulders, Johnston and Merckelbach (2007)
demonstrated that the SVT can be a valuable tool for detect-
ing concealed knowledge and, at least in some conditions, it
can increase the validity of CITs based on skin conductance re-
sponse (SCR).
The second measure adopted in this study was derived from
the Number Guessing Test (NGT) proposed by Lieblich and
Ninio (1972) and by Lieblich, Shaham, and Ninio (1976). It is
based on a similar rationale to the SVT, but relies on guessing
values of continuous variables, rather than guessing which of two
possible alternatives is the correct one. Specifically, this method
utilizes several numerical items (e.g., the house number where a
crime was committed, the day of the month when the event oc-
curred). As in the SVT, examinees are asked to guess the correct
value of each item, and the detection measure is based on the
correlation between the true profile and the profile guessed by
each examinee. It is expected that knowledgeable examinees will
produce larger correlations (either positive or negative) than un-
knowledgeable individuals.
In the present experiment, we examine the utility of the three
detection measures (SCRs based on the CIT, SVT, and NGT) in
differentiating between ‘‘guilty,’’ ‘‘informed innocents,’’ and
‘‘uninformed innocents’’ both when examined immediately after
committing a mock crime and 1 week later. In addition, we ex-
amine whether memory of the critical items and detection effi-
ciency depend on the type of items used (central vs. peripheral).
Methods
Participants
One hundred and twenty Hebrew University of Jerusalem
undergraduate students (86 females and 34 males) participated
in the experiment for course credit or payment (they received 40
New Israeli Shekels (NIS), equivalent to US$10.50).
Their mean age was 24.06 (SD = 3.22) years. Participants
were recruited through ads placed on notice boards
throughout the campus. All participants signed a consent form
indicating that participation was voluntary and that they could
withdraw from the experiment at any time without penalty.
Eleven participants were eliminated due to unusually high skin
resistance levels or excessive movements during the experiment,
and eight additional participants were eliminated because they
did not commit the mock crime or failed to show up to the second
part of the experiment. These participants were replaced, so the
total number of participants remained at 120.
Apparatus
Skin conductance was measured by a constant voltage
system (0.5 V; Atlas Researches, Hod Hasharon, Israel). Two
Ag/AgCl electrodes (0.8-cm diameter) were used with a 0.05 M
NaCl electrolyte. The experiment was conducted in an air-conditioned
laboratory, and an NECCF-500 computer was used
to control the stimulus presentation and compute skin conduc-
tance changes. The stimuli were displayed on the computer
monitor.
Design
A 3 × 2 between-participants design was used, with the following
two orthogonal factors: (a) group: participants either performed
a mock crime (‘‘guilty’’ condition), did not perform a mock
crime but were informed about the relevant details (‘‘informed-innocent’’
condition), or did not perform the mock crime and had
no knowledge of the relevant details (‘‘uninformed-innocent’’
condition); and (b) time of test: immediately after the first stage of
the experiment (see below; immediate condition) or after 1
week (delayed condition). The participants were randomly
allocated to the six conditions created by this design, with 20
participants in each condition.
Procedure
The experiment was conducted in two stages:
Stage 1
Participants arrived at the laboratory individually at the pre-
determined time. They were met by an assistant who read out
loud the instructions appropriate for their particular condition.
Guilty participants. These participants were instructed to go to an office of a
staff member and ask for a particular numbered article. They
were told that, if the staff member was not in his office, they
should open the office using a key that was handed to them in
advance, enter the room, and find the particular article in a pile of
numbered articles placed on the desk. In addition, they were
requested to take advantage of the situation and steal an envelope
with money and a jewel, to hide it in a mail box that was indicated
to them, and then to enter the laboratory and hand over the
requested article to the assistant. Actually, the staff member was
never in the office, and thus all participants in this experimental
condition were able to steal the envelope. Upon arrival at the
designated office, participants faced a locked door with the name
of the staff member and the office number typed on it. They
opened the office using the key, and, when they entered the office,
they saw that the light was turned on. On the desk, they found a
pile of numbered articles with a newspaper on the top of it. Beside
the pile were a family photo and a soft drink bottle. They found
the requested article and looked for the envelope in the room.
The envelope was located in the first drawer of a cabinet. It was a
colored envelope with a date on it. The envelope was open, and
contained Euro bills and a jewel. A note with a name was attached
to the bills by an office clip. After checking its contents,
the participants stole the envelope, dropped it in the mail box,
and returned to the lab with the requested article.
A total of six profiles of items were used in the CIT. Each of
these profiles was composed of 11 items, described in Table 1.
Table 1. Profiles of Items Used in the Experiments
Profile  Envelope color*  Name on note  Victim’s family name*  Soft drink     Newspaper        Jewel*        Article number*  Office number*  Sum of Euros*  Date  Sex of victim
Buffer   Yellow           Lisa          Topaz                  Mineral water  Hazofe           Brooch        6                5               26             15    –
a        Green            Marsha        Koren                  Sprite         Haaretz          Earrings      27               15              8              11    Male
b        Orange           Lora          Morag                  Ice-tea        Yediot aharonot  Ring          15               10              22             26    Female
c        Blue             Susan         Carmel                 Coca-cola      Maariv           Necklace      19               25              6              28    Male
d        Red              Judy          Marom                  Orange juice   Israel hayom     Neck-pendent  22               20              14             6     Female
e        Purple           Ashlee        Zamir                  Soda water     Globes           Bracelet      12               30              4              19    Female
Note: ‘Sex of victim’ is the only item among the 11 items that did not appear in the CIT, but only on the SVT. Thus, it does not have a buffer profile.
*These items were classified as central.
One profile, the buffer profile, was used only in the interrogation
phase of the experiment, and was never used as the relevant
profile. One of the other five profiles of items (a–e) was randomly
chosen as the relevant profile for each participant, such that each
profile served as the relevant profile for 20% of the participants.
Eight additional relevant items, which were identical for all par-
ticipants, were used in this experiment for the SVT. These items
are described in Table 2.
Informed-innocent participants. These participants read an article entitled ‘‘A
Scandal: Theft in the Campus.’’ The article described the mock-
crime and included all the relevant details (according to the par-
ticular profile assigned to the participant). To give the impression
that the article was real, it was embedded among other articles in
a student newspaper. Participants were not asked to memorize
the details in order to preserve the more realistic nature of the
manipulation (see Carmel et al., 2003). After reading the article,
as a control assignment, the participants were requested to go to
the teaching assistants’ mail boxes, where they found a short
questionnaire dealing with personal hobbies and interests. They
were requested to fill out the questionnaire and drop it into an-
other mail box that was pointed out to them (the same mail box
in which the guilty participants were instructed to put the stolen
envelope).
Uninformed-innocent participants. These participants were exposed to the same
procedure as the informed-innocent participants, except that the
article they read did not reveal any of the relevant details.
Stage 2
CIT, SVT, and NGT were administered to all participants.
Participants in the immediate condition took the tests immediately
after Stage 1, and those in the delayed condition took them 1
week later. An experimenter, who was unaware of the exper-
imental condition to which the examinee was assigned, informed
the participants that a theft was committed in the Psychology
Department, and that they are suspects in committing this theft.
He/she explained that the experiment was designed to test
whether they could cope with lie detection tests and convince the
examiner that they are innocent of stealing the money and jewel.
It was emphasized that beating these tests is a difficult assignment
that only few people can succeed in, and they were promised a
bonus of 10 NIS (about $2.50) for a successful performance of
the task. Subsequently, the participant was attached to the electrodes,
and the CIT examination was conducted. The CIT questions
were presented after an initial rest period of 2 min, during
which skin conductance baseline was recorded. All examinees
were presented with ten different questions, each targeting a
different relevant detail of the mock crime (the envelope color,
the name of the person written on the note, the family name of
the office owner, the brand of soft drink, the name of the news-
paper, the type of jewel, the number of the requested article, the
sum of money, the office’s number, and the date written on the
envelope). The questions were simultaneously presented on the
computer monitor and heard through the computer speakers.
Each question was followed by a buffer item, designed to absorb
the initial orienting response, and a set of five items (the relevant
item and four neutral control items). The order of the questions
as well as the order of the five items within each question was
randomized. Each question was presented for 10 s, and each item
(alternative answer) was presented for 5 s. The inter-stimulus
interval (blank screen) ranged randomly from 16 to 24 s with a
mean of 20 s. Participants were asked to respond verbally, saying
‘‘no’’ to every item. A short, participant-terminated break was
given after presentation of five questions.
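The presentation scheme described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the software used in the experiment: the item labels are hypothetical placeholders, one inter-stimulus interval is drawn per item, and the intervals are drawn uniformly from 16 to 24 s (the source gives only the range and the mean of 20 s).

```python
import random


def build_cit_sequence(questions, rng):
    """Return a randomized presentation order for the CIT.

    `questions` maps a question label to a dict with keys 'buffer',
    'relevant', and 'controls' (four neutral items). All labels are
    hypothetical placeholders, not the actual stimuli.
    """
    order = list(questions)
    rng.shuffle(order)  # random order of the questions
    sequence = []
    for label in order:
        q = questions[label]
        items = [q["relevant"], *q["controls"]]
        rng.shuffle(items)  # random order of the five test items
        trials = [q["buffer"], *items]  # the buffer item always comes first
        sequence.append({
            "question": label,
            "items": trials,
            # Assumption: uniform draw on [16, 24] s, which has a mean of 20 s.
            "isi_s": [rng.uniform(16.0, 24.0) for _ in trials],
        })
    return sequence


if __name__ == "__main__":
    demo = {
        f"question_{i}": {
            "buffer": f"buffer_{i}",
            "relevant": f"relevant_{i}",
            "controls": [f"control_{i}_{j}" for j in range(4)],
        }
        for i in range(10)
    }
    for trial in build_cit_sequence(demo, random.Random(1)):
        print(trial["question"], trial["items"])
```

Passing an explicit `random.Random` instance keeps each participant's sequence reproducible from a stored seed.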
Upon completion of the CIT, participants were detached
from the electrodes and performed the SVT and NGT, using a PC.
The SVT consisted of 15 questions, each with 2 alternative
answers: the relevant detail (correct answer) and a
non-relevant detail (wrong answer). Six of the SVT questions
resembled those of the CIT (the envelope color, the name of the
person that was written on the note, the family name of the
office’s owner, the brand of soft drink, name of the newspaper,
and type of jewel). For each of these 6 questions, the alternative
to the correct answer was chosen randomly from among the 4
control items included in the CIT. The other 9 questions appeared
only on the SVT; 8 of them had a fixed alternative answer,
while for the 9th (the victim’s gender), the answer depended on
the specific profile. These questions along with the correct and
incorrect alternative answers are displayed in Table 2.
The questions appeared on the screen, one at a time, with the
two alternative answers. The participants were instructed as
follows: ‘‘Please choose one alternative answer each time and if
you do not know the answer, just guess it!’’ Participants were not
aware of the length of the test, and thus would have had difficulty
adjusting their performance in accordance with chance. The
NGT consisted of 4 open questions referring to numerical rel-
evant details, which were included in the CIT (the number of the
requested article, sum of money, the office’s number, and the
date that was written on the envelope). The participants were
informed that answers should be within the range of 1 to 30, and
were instructed as follows: ‘‘Please type your answer by using the
keyboard and if you do not know the answer, just guess it.’’
Before each test, the experimenter indicated to the participants
that the correct answers were known only to the thief. Following
Carmel et al. (2003) and Gamer et al. (2010), the 19 questions
included in this experiment were classified as either central
(questions directly related to the execution of the mock crime) or
peripheral (questions related to items that were present in the
crime scene, but were unrelated to its execution). Tables 1 and 2
specify for each question whether it was classified as central
or peripheral.
At the end of the questioning session, the experimenter
thanked the participants, and asked them to wait until the com-
puter program processed the data of the tests and reached a
decision as to whether they were found ‘‘guilty’’ or ‘‘innocent.’’
Table 2. Items Included Only in the SVT
Question            Currency*  Victim’s title*  Envelope’s condition*  Family photo  Position of newspaper*  Drawer*  Glasses in drawer  Light in office*  Victim’s gender
Correct answer      Euro       Dr.              Open                   Present       On top of the pile      First    Absent             On                See Table 1
Alternative answer  Dollar     Professor        Closed                 Absent        Not on top              Second   Present            Off               See Table 1
*These items were classified as central.
The processing took 1 min, and subsequently two memory tests
were administered to examine whether participants recalled the
relevant items: the first was a recall memory test consisting of the
19 questions used in the three detection tests, and the second was
a recognition memory test in which participants were requested
to choose the correct alternative on a copy of the SVT and NGT
which, together, covered all the 19 questions that were used.
Guilty and informed-innocent participants responded to both
recall and recognition memory tests, while uninformed-innocents
responded only to the recognition memory test. All participants
were asked to attempt to recall or recognize the relevant details
and guess only when they did not know the answer. Level of confidence
was rated for each answer on a 6-point scale ranging from
1 (not confident at all) to 6 (very confident). In addition,
participants filled out a questionnaire regarding their performance
in the experiment. Specifically, they were asked about
their motivation to beat the tests, whether or not they used a
strategy during the tests, etc. Finally, all participants were de-
briefed and compensated.
Scoring of the Dependent Measures
SCR
Responses were transmitted in real time to the computer.
SCR was defined as the maximal increase in conductance ob-
tained from the examinee, from 1 s to 5 s after stimulus onset and
computed using an A/D (NB-MIO-16) converter with a sam-
pling rate of 50 Hz. To eliminate individual differences in re-
sponsivity and permit meaningful comparisons of the responses
of different examinees, each participant’s SCR was transformed
into within-examinee standard scores (Ben-Shakhar, 1985). To
minimize habituation effects, within-block standard scores were
used (see Ben-Shakhar & Elaad, 2002b; Elaad & Ben-Shakhar,
1997). The 60 items (see Table 1 for a description of the 10 ques-
tions used in the CIT, with 6 alternative items for each question)
were divided into 2 blocks, each consisting of 30 items. Thus, the
z scores used in this study were computed relative to the mean
and standard deviation of the participant’s responses to the 30
items of each block. Finally, two detection scores were computed
for each participant (one for each item-type category) by aver-
aging the standardized SCRs elicited by the critical items within
each item-type category.
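The scoring steps above can be sketched as follows. This is an illustration under stated assumptions rather than the original analysis code: the function names are hypothetical, the population standard deviation is used (the source does not say which form was applied), and a block with no response variability is scored as zero.

```python
from statistics import mean, pstdev


def within_block_z(scr_by_block):
    """Standardize raw SCRs separately within each block.

    `scr_by_block` is a list of blocks, each a list of raw SCR amplitudes
    (30 per block in the design described above).
    """
    z_blocks = []
    for block in scr_by_block:
        m = mean(block)
        sd = pstdev(block)  # assumption: population SD; the source does not say
        if sd == 0:
            z_blocks.append([0.0 for _ in block])  # assumption: flat block scores 0
        else:
            z_blocks.append([(x - m) / sd for x in block])
    return z_blocks


def detection_scores(z_blocks, critical):
    """Average the z scores of the critical items within each item-type category.

    `critical` maps (block_index, item_index) to 'central' or 'peripheral'.
    """
    by_type = {}
    for (block_index, item_index), item_type in critical.items():
        by_type.setdefault(item_type, []).append(z_blocks[block_index][item_index])
    return {item_type: mean(values) for item_type, values in by_type.items()}
```

Standardizing within blocks means that a response is judged relative to the other responses in the same half of the test, so a general decline in responsivity across the session does not by itself lower the detection score.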
SVT
An unknowledgeable individual (uninformed innocent) is expected
to guess the answers on the SVT and thus give about 50%
correct answers (chance level). It is hypothesized that a person
who is aware of the critical items will be unable to ignore this
information when answering the SVT and consequently deviate
from chance level performance. Although it is reasonable to assume
that individuals attempting to conceal critical items will
display below chance level performance on the SVT (e.g., Verschuere
et al., 2008), we defined a detection measure based on
the SVT as the absolute deviation of the percent of correct answers
from chance level (50%). Specifically, this measure was
defined as |P − 50%|, where P is the percent of the participant’s
correct answers. We used this measure because in some cases
knowledgeable individuals may use their knowledge to guess
above chance level.
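As a worked illustration of this measure, the sketch below computes |P − 50%| from the number of correct answers. The second function is an addition that does not appear in the source: it gives the exact probability that pure guessing would produce a deviation at least as large, assuming independent answers with a .5 chance of being correct. Both function names are hypothetical.

```python
from math import comb


def svt_detection_score(n_correct: int, n_questions: int) -> float:
    """Absolute deviation of the percent of correct answers from 50%."""
    percent_correct = 100.0 * n_correct / n_questions
    return abs(percent_correct - 50.0)


def guessing_probability(n_correct: int, n_questions: int) -> float:
    """Illustrative addition: probability that pure guessing (p = .5 per
    question) yields a deviation from chance at least as large as observed."""
    half = n_questions / 2
    observed = abs(n_correct - half)
    favourable = sum(
        comb(n_questions, k)
        for k in range(n_questions + 1)
        if abs(k - half) >= observed
    )
    return favourable / 2**n_questions


if __name__ == "__main__":
    # Hypothetical examinee: 3 correct answers out of the 15 SVT questions.
    print(svt_detection_score(3, 15), guessing_probability(3, 15))
```

Because the score is an absolute deviation, an examinee with 3 correct answers and one with 12 correct answers out of 15 receive the same value.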
NGT
The NGT-based detection measure was defined as
the absolute value of the Pearson correlation coefficient
between the actual values and the values guessed by the
participant (see Footnote 1).
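A minimal sketch of this measure follows. The function name is hypothetical, and returning 0 when either profile has no variability (e.g., an examinee who types the same number for every item) is an assumption, since the correlation is undefined in that case. The example uses the numerical items of profile a in Table 1, in the order article number, sum of Euros, office number, and date.

```python
from math import sqrt


def ngt_detection_score(true_values, guesses):
    """Absolute Pearson correlation between the true and the guessed profile."""
    n = len(true_values)
    if n != len(guesses) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_true = sum(true_values) / n
    mean_guess = sum(guesses) / n
    cov = sum((t - mean_true) * (g - mean_guess) for t, g in zip(true_values, guesses))
    var_true = sum((t - mean_true) ** 2 for t in true_values)
    var_guess = sum((g - mean_guess) ** 2 for g in guesses)
    if var_true == 0 or var_guess == 0:
        return 0.0  # assumption: an undefined correlation is scored as 0
    return abs(cov / sqrt(var_true * var_guess))


if __name__ == "__main__":
    profile_a = [27, 8, 15, 11]
    mirrored = [31 - v for v in profile_a]  # deliberately "opposite" guesses
    print(ngt_detection_score(profile_a, mirrored))
```

Taking the absolute value means that an examinee who systematically avoids the true values is scored in the same way as one who reproduces them.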
Data Analysis and Statistics
Each dependent measure (rates of correctly recognized items and
the three detection scores constructed for SCR, SVT, and NGT; see Footnote 2)
was subjected to a mixed 2 × 3 × 2 analysis of variance
(ANOVA), with item-type (central vs. peripheral) serving as a
within-subjects factor and group (‘‘guilty,’’ ‘‘informed innocents,’’
and ‘‘uninformed innocents’’) and time of CIT (immediate
vs. delayed) serving as the 2 between-subjects factors. This
was followed by two sets of orthogonal planned contrasts. The
first, which was designed to examine more closely the effect of the
item-type factor and its interaction with the other factors by
excluding the ‘‘uninformed innocents’’ (for whom no item-type
effect is expected), included the following contrasts: (1) The dependent
measure difference between central and peripheral items
among ‘‘guilty’’ participants was compared with the respective
difference among ‘‘informed innocents’’ (i.e., examining the item-type
× group interaction, excluding ‘‘uninformed innocents’’); (2)
The dependent measure difference between central and peripheral
items in the immediate condition was compared with the respective
difference in the delayed condition (i.e., examining the item-type
× time of testing interaction, excluding ‘‘uninformed innocents’’);
and (3) A contrast examining whether the item-type
differences reflect a group × time interaction (i.e., whether the
item-type differences among ‘‘guilty’’ participants are less
affected by delaying the test than the respective differences
among ‘‘informed innocents’’).
The second set of contrasts, which was designed to examine
more closely the effects of the between-subjects factors, included
the following four planned contrasts: (1) Combined ‘‘guilty’’ and
‘‘informed innocents’’ (knowledgeable participants) were compared
with the ‘‘uninformed innocents’’; (2) ‘‘Guilty’’ were compared
with ‘‘informed innocents’’; (3) The time effect (defined as
the difference in the dependent variable between the immediate and
the delayed conditions) among knowledgeable participants was compared
with the time effect among ‘‘uninformed innocents’’; and (4)
The time effect among ‘‘guilty’’ participants was compared with the
time effect among ‘‘informed innocents.’’ A rejection region of
p < .05 was used for all statistical tests, and effect size estimates
were computed using Cohen’s f (Cohen, 1988). One-tailed tests were
used to test directional, a priori formulated hypotheses.
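One way to make the four between-subjects contrasts concrete is to write them as weights over the six group-by-time cells and check that they are mutually orthogonal. The cell ordering, the particular integer weights, and all names below are assumptions of this sketch (equal cell sizes are assumed as well); the text does not report the weights actually used.

```python
from itertools import combinations

# Assumed cell order for this sketch:
# guilty/immediate, guilty/delayed, informed/immediate, informed/delayed,
# uninformed/immediate, uninformed/delayed
CONTRASTS = {
    "knowledgeable_vs_uninformed": [1, 1, 1, 1, -2, -2],
    "guilty_vs_informed": [1, 1, -1, -1, 0, 0],
    "time_effect_knowledgeable_vs_uninformed": [1, -1, 1, -1, -2, 2],
    "time_effect_guilty_vs_informed": [1, -1, -1, 1, 0, 0],
}


def dot(u, v):
    """Sum of element-wise products of two weight vectors."""
    return sum(a * b for a, b in zip(u, v))


def all_pairs_orthogonal(contrasts):
    """True if every pair of contrast weight vectors has a zero dot product."""
    return all(dot(u, v) == 0 for u, v in combinations(contrasts.values(), 2))


def contrast_estimate(weights, cell_means):
    """Weighted sum of cell means for one contrast."""
    return dot(weights, cell_means)


print(all(sum(w) == 0 for w in CONTRASTS.values()))  # each contrast sums to zero
print(all_pairs_orthogonal(CONTRASTS))
```

With equal cell sizes, zero dot products between the weight vectors are what make the contrasts orthogonal, so each one tests a separate part of the between-subjects variation.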
Results
Memory Tests
As the patterns of results of the recall and recognition tests
were essentially similar, only the results of the recognition tests
¹ We used a slightly different measure than the one employed by Lieblich and Ninio (1972) and Lieblich et al. (1976). They transformed each negative correlation into the absolute value of the observed correlation plus one. This measure was inefficient because many uninformed innocents produced negative correlations (which were expected when participants are guessing with no prior knowledge) and adding 1 to the absolute value of these correlations inflated the detection measure among these participants and resulted in a high rate of false positives.
² As the NGT is based on a limited number of numerical items, it was impossible to compare central and peripheral items in this context, and thus the item-type factor was not included in the NGT analysis. Thus, a 3 × 2 between-subjects ANOVA, with group and time as the two orthogonal factors, was conducted.
are presented. The recognition results of four participants were
lost, and thus the following analyses are based on 116 participants.
The mean rates of correctly recognized items were computed
across the 12 central and the 7 peripheral items, and they
are displayed in Figure 1 as a function of experimental condition.
The ANOVA conducted on the recognition rates yielded the
following outcomes: The results for the within-subjects factor
revealed a statistically significant interaction between item-type
and group (F(2,110) = 12.40, f = 0.31, p < .05). The main effect
of item-type was not statistically significant (F(1,110) = 2.23,
f = 0.07), mainly because item-type differences are neither
expected nor observed in the ‘‘uninformed innocent’’ condition.
The interaction of item-type with time of testing as well as the
triple interaction produced very small and nonsignificant effects
(F < 1 in both tests).
The three orthogonal contrasts conducted after excluding the
‘‘uninformed innocents’’ revealed that, consistent with our
hypothesis, the advantage of central over peripheral items (i.e.,
higher recognition rates) was significantly more pronounced
among ‘‘guilty’’ than among ‘‘informed innocent’’ participants
(t(110) = 3.49, f = 0.27, p < .001). However, in contrast to our
hypothesis, the advantage of central over peripheral items was
not more pronounced in the delayed than in the immediate test
(t(110) = 0.17). Finally, the item-type differences did not reflect a
group × time interaction (t(110) = 1.87, f = 0.10).
The analysis of the between-subjects factors revealed statistically
significant results for both main effects (F(2,110) = 78.84,
f = 1.16, p < .001 for the group factor and F(1,110) = 17.99,
f = 0.38, p < .001 for the time of testing factor). The interaction
between these two factors was also statistically significant
(F(2,110) = 10.07, f = 0.40, p < .001). The four orthogonal contrasts
conducted following this analysis revealed that: (1) Combined
‘‘guilty’’ and ‘‘informed innocents’’ (knowledgeable
participants) displayed significantly higher rates of correctly
recognized items than unknowledgeable participants (t(110) = 8.88,
f = 0.82, p < .001). (2) The difference in the rate of correctly
recognized items between the ‘‘guilty’’ and the ‘‘informed innocents’’
was not statistically significant (t(110) = 0.95). (3) As expected, the
time effect (i.e., a lower rate of correctly recognized items in the
delayed than in the immediate condition) was significantly larger
for knowledgeable than for unknowledgeable participants
(t(110) = 4.01, f = 0.36, p < .001). (4) Similarly, a significantly
larger time effect was found for ‘‘informed innocents’’ than for
‘‘guilty’’ participants (t(110) = 2.66, f = 0.23, p < .01).
SCR
The means of the SCR detection scores, computed across par-
ticipants within each experimental condition and each item-type
category, are displayed in Figure 2. These data were subjected to
the same ANOVA conducted for the recognition results. The
Figure 1. Means and Standard Errors of the rate of correctly recognized items, computed across the 12 central and 7 peripheral questions within each
experimental condition.
Figure 2. Means and Standard Errors of the Standardized SCRs to the Relevant Items, computed across the 6 central and 4 peripheral items within each
experimental condition.
item-type factor showed a statistically significant main effect
(F(1,114) = 14.18, f = 0.23, p < .001), indicating that central items
elicited larger relative SCRs than the peripheral items, but it did
not show any statistically significant interactions with the other
factors. However, these nonsignificant interactions may be due to
the inclusion of the uninformed innocents, for whom neither
item-type nor time of CIT should make a difference. Indeed, the three
orthogonal contrasts conducted, excluding the ‘‘uninformed
innocents,’’ revealed that, consistent with our hypothesis, the
advantage in detection of central over peripheral items was more
pronounced in the delayed CIT than in the immediate test
(t(114) = 1.75, p < .05, one-tailed, f = 0.11). On the other hand,
in contrast to our hypothesis, the advantage in detection of central
over peripheral items was not more pronounced in the
‘‘guilty’’ than in the ‘‘informed innocents’’ (t(114) = 1.46,
f = 0.08). Finally, the contrast examining whether the item-type
differences reflect a group × time interaction did not yield a
statistically significant result (t(114) = 0.49).
The analysis of the between-subjects factors revealed a statistically
significant group effect (F(2,114) = 14.59, f = 0.48,
p < .001) and a smaller time effect (F(1,114) = 3.56, p < .05,
one-tailed, f = 0.15), reflecting larger relative SCRs in the immediate
than in the delayed condition. The group factor also showed a
statistically significant interaction with time (F(2,114) = 4.49,
f = 0.24, p < .05). This interaction was expected, as time of CIT
should affect only knowledgeable participants. The four orthogonal
contrasts, conducted to examine more closely the group
effect and its interaction with time, revealed that: (1) knowledgeable
participants showed a significantly larger SCR detection
score than non-knowledgeable participants (t(114) = 4.20,
f = 0.37, p < .001); (2) ‘‘guilty’’ participants did not differ
significantly from ‘‘informed innocents’’ (t(114) = 0.47); (3) the
time effect (larger detection score in the immediate than in the
delayed condition) was significantly larger for knowledgeable
participants than for ‘‘uninformed innocents’’ (t(114) = 1.72,
p < .05, one-tailed, f = 0.13); and (4) the comparison of the time
effect on ‘‘guilty’’ vs. ‘‘informed innocents’’ did not yield a
statistically significant outcome (t(114) = −1.02, f = 0.02).
SVT
The means of the SVT detection scores computed across partic-
ipants are displayed in Figure 3 as a function of item-type and
experimental condition.
The data of Figure 3 were subjected to the same analyses
applied to the CIT and the recognition results. Surprisingly,
central items produced a smaller SVT detection score than
peripheral items, although this difference only approached
significance (F(1,114) = 3.86, f = 0.11, p = .052). However,
an inspection of Figure 3 reveals that this trend was due to
differences between the two item-types in the uninformed innocents,
who are obviously guessing and are unable to differentiate
between central and peripheral items. Indeed, when the uninformed
innocents were excluded, the difference between the two
item-types was not statistically significant. In addition, no statistically
significant interactions between item-type and the other factors
were found. The same three planned contrasts involving the
item-type factor were computed as in the previous analyses, and none
revealed a statistically significant outcome (t(114) = 1.26,
f = 0.06 for the item-type × time interaction; t(114) = 0.45 for
the item-type × group interaction; and t(114) = −1.51, f = 0.09
for the triple interaction).
The analysis of the between-subjects factors revealed that
both main effects and their interaction produced statistically
significant outcomes (F(2,114) = 8.58, f = 0.36, p < .001
for the group factor; F(1,114) = 3.23, p < .05, one-tailed,
f = 0.14, for the time factor, reflecting larger SVT detection
scores in the immediate than in the delayed condition; and
F(2,114) = 3.46, f = 0.20, p < .05 for the group × time interaction,
indicating that, as expected, the reduction over time in the
detection measure was small in the ‘‘guilty’’ condition, but much
more pronounced with the ‘‘informed innocents’’). To examine
these effects more closely, we conducted the same four planned
contrasts computed for the CIT and recognition data. The results
of these analyses were generally similar to the SCR results,
indicating that, while knowledgeable participants displayed a
larger average value of the SVT detection score than unknowledgeable
participants (t(114) = 4.27, f = 0.38, p < .001), there
were no significant differences between ‘‘informed innocents’’
and ‘‘guilty’’ participants (t(114) = −0.12). The effect of time of
testing was larger for knowledgeable than for unknowledgeable
participants (t(114) = 3.17, f = 0.27, p < .001) and, unlike the
SCR results, it was larger for ‘‘informed innocents’’ than
for ‘‘guilty’’ participants (t(114) = 2.44, f = 0.25, p < .01).
NGT
The means of the NGT detection scores computed across par-
ticipants within each condition are presented in Figure 4. These
Figure 3. Means and Standard Errors of the SVT-based detection measure, computed across the 9 central and 6 peripheral items within each experimental
condition.
data are based on 113 participants, as seven participants guessed
the same value for all four questions, and thus it was impossible to
compute a detection measure for them. A 3 × 2 between-subjects
ANOVA, with group and time as the two orthogonal factors,
was conducted on the data of Figure 4. This analysis yielded a
statistically significant group × time of test interaction
(F(2,107) = 3.47, f = 0.21, p < .05). To further examine the nature
of this interaction and possible group differences, we conducted
the same four planned contrasts computed for the analysis of
the recognition, CIT, and SVT data. Knowledgeable participants
showed significantly larger NGT detection scores than unknowledgeable
participants (t(107) = 2.24, f = 0.19, p < .05), but there
was no significant difference between the ‘‘guilty’’ and the
‘‘informed innocents’’ (t(107) = 1.02, f = 0.02). In addition, the
reduction in the detection score over time of testing was significantly
larger with knowledgeable than with unknowledgeable
participants (t(107) = 2.57, f = 0.22, p < .01), but no time effect
differences were found between the ‘‘guilty’’ and the ‘‘informed
innocents’’ (t(107) = 0.44).
ROC Curves
An additional approach for describing and comparing detection
efficiency was adopted from Signal Detection Theory (SDT;
e.g., Green & Swets, 1966; Swets, Tanner, & Birdsall, 1961).
This approach is particularly useful for analyzing psychophysiological
as well as behavioral detection data, and it has been
applied extensively in this area (e.g., Ben-Shakhar & Elaad,
2003; National Research Council, 2003). Typically, detection
efficiency is defined in terms of the relationship between the
detection measure and the actual guilt (or knowledge of the relevant
items). In SDT terms, this is measured by a ROC curve reflecting
the degree of separation between the distributions of the detection
score of ‘‘guilty’’ and ‘‘innocent’’ participants. In the present
experiment, there are two groups of knowledgeable participants,
and the ROC for each of these groups was constructed by comparing
the detection score distribution of the knowledgeable
participants (either ‘‘guilty’’ or ‘‘informed innocents’’) with the
respective distribution of the ‘‘uninformed innocents.’’ These
ROCs were constructed within each experimental condition for
each measure, based on the 12 central, the 7 peripheral, as well as
all 19 items. In addition, we examined the possibility of combining
the three detection measures and constructed additional
ROC curves, one based on a combination of the SCR and SVT
and another based on a combination of all three measures. The
measures were combined by using simple averages of the standardized
detection measures (each measure was first transformed
into standard scores based on the entire sample, and then the
three standardized measures were averaged). We did not apply
optimal weights to the three detection measures to avoid
the possibility of inflating detection efficiency estimates due to
capitalization on chance.
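The equal-weight combination described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the sample standard deviation is used for standardization (the text does not say which variant was applied), and every participant is assumed to have a score on every measure.

```python
from statistics import mean, stdev


def standardize(scores):
    """Convert raw scores to standard (z) scores over the whole sample."""
    m = mean(scores)
    s = stdev(scores)  # assumption: sample standard deviation
    return [(x - m) / s for x in scores]


def combine_measures(*measures):
    """Average the standardized scores of several measures per participant.

    Each argument is a list of scores, one per participant, in the same
    participant order. Equal weights are used, so no weights are estimated
    from the data.
    """
    standardized = [standardize(m) for m in measures]
    return [mean(per_participant) for per_participant in zip(*standardized)]


# Hypothetical scores of three participants on two measures.
print(combine_measures([1, 2, 3], [10, 20, 30]))
```

Standardizing first puts the measures on a common scale, so that no single measure dominates the average merely because of its units.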
Table 3 displays the areas under the ROC curves of the
various measures as a function of item-types and experimental
conditions. An inspection of Table 3 reveals that detection effi-
ciency of ‘‘guilty’’ participants, as reflected by the ROC area,
ranged in the immediate testing from 0.69 to 0.82 when a single
Figure 4. Means and Standard Errors of the NGT-based detection measure computed within each experimental condition.
Table 3. Areas Under the ROC Curves Computed for Each Detection Measure and for 2 Combinations of these Measures, Within Each Item
Category, Across Categories, and Within Each Experimental Condition
Measure             CIT                                 SVT                                 NGT         CIT+SVT     All 3 Tests
Condition           All         Central     Peripheral  All         Central     Peripheral  All         All         All
Guilty
  Immediate         0.82**      0.78**      0.77**      0.77**      0.69*       0.73*       0.81**      0.94**      0.97**
  Delayed           0.76**      0.80**      0.55        0.68        0.77**      0.62        0.49        0.84**      0.80**
Informed-innocent
  Immediate         0.91**      0.76**      0.87**      0.87**      0.81**      0.83**      0.70*       0.97**      0.97**
  Delayed           0.64        0.74*       0.44        0.54        0.64        0.50        0.46        0.65        0.66
*p < .05; **p < .01.
measure is considered, and it increased to a level of 0.94, or even
0.97, when two or three measures were combined. This increase
in the ability to differentiate ‘‘guilty’’ from ‘‘uninformed inno-
cents’’ with the addition of the two behavioral measures can be
accounted for by the fact that these behavioral measures reflect
different psychological processes than the psychophysiological
measure. Indeed, the Pearson correlation coefficients among the
three measures, computed across all knowledgeable participants,
were nearly zero (ranging between −0.05 and 0.02).
In the delayed testing condition, detection efficiency generally
decreased, and both the SVT and NGT produced detection efficiency
estimates that did not significantly exceed a chance level of
0.50. The SCR, on the other hand, remained relatively stable and
produced an area of 0.76 when all items were considered and 0.80
when only central items were used. The addition of the SVT
further increased the area to a level of 0.84 in the delayed
condition.
While this may be seen as good news, it must be qualified by
the relatively high areas obtained for the ‘‘informed innocents,’’
which means that the risk of false-positive outcomes when critical
information is leaked may be severe. This danger is particularly
severe in immediate testing, but much less so when the test is
delayed. In fact, in almost all cases, the areas computed for the
‘‘informed innocents’’ in the delayed testing were not signifi-
cantly larger than chance. For example, the ROC area computed
for the ‘‘informed innocents’’ decreased from 0.91 to 0.64 when
only the SCR was used and from 0.97 to 0.66 when all three
measures were used. To further examine whether ‘‘guilty’’ and
‘‘informed innocents’’ can be differentiated, additional ROC
curves were constructed, such that sensitivity represented the rate
of correctly classifying ‘‘guilty’’ participants and false-positive
rate represented the proportion of ‘‘informed innocents’’ classi-
fied as ‘‘guilty.’’ The results of this analysis revealed that all the
areas under the ROC curves in the immediate condition were
around a chance level of 0.50, but increased somewhat in the
delayed condition (e.g., it increased from 0.48 to 0.65 for the
SCR and from 0.44 to 0.72 for the combination of SCR and
SVT), implying that false-positive errors due to information
leakage may be attenuated when the test is delayed.
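The area statistic used throughout this section can be illustrated with a short sketch. It relies on the standard equivalence between the area under the ROC curve and the probability that a randomly chosen member of the ‘‘positive’’ group obtains a higher detection score than a randomly chosen member of the comparison group, with ties counted as one half. The function name and the example scores are hypothetical, and the text does not state how the reported areas were actually computed.

```python
def roc_area(positive_scores, negative_scores):
    """Area under the ROC curve from two lists of detection scores.

    Computed as the proportion of (positive, negative) pairs in which the
    positive score is higher, counting ties as one half.
    """
    if not positive_scores or not negative_scores:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical detection scores: "guilty" as the positive group and
# "informed innocents" as the comparison group.
guilty_scores = [0.9, 0.4, 0.7, 0.2]
informed_scores = [0.8, 0.3, 0.6, 0.1]
print(roc_area(guilty_scores, informed_scores))
```

An area of 0.50 means the two score distributions overlap completely, which is why values near 0.50 indicate that the two groups cannot be told apart.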
Discussion
The results of this experiment join many previous studies in
demonstrating that the CIT can be a powerful tool in differen-
tiating between individuals possessing critical information and
those who were not exposed to this information. However, the
present results also demonstrate that, at least when tested im-
mediately, individuals who actually committed the mock-crime
cannot be differentiated from those who were just exposed to the
critical information in a neutral context. This pattern was re-
vealed in each of the three detection measures employed in this
study as well as when participants’ memory for the critical items
was examined after they took the various tests. It can be argued
that this result reflects the fact that the standard version of the
CIT (the GKT), rather than the GAT proposed by Bradley and
his colleagues (e.g., Bradley et al., 1997), was used in this
experiment. But, on the other hand, our results with respect to the
informed innocents are quite similar to those reported recently by
Gamer et al. (2010) who used the GAT. Furthermore, in an
additional study, Gamer (2010) directly compared the GAT with
the standard CIT (or GKT) and found that, while both formats
were equally effective in differentiating between knowledgeable and
unknowledgeable individuals, they were also equally ineffective in
differentiating between guilty and informed innocents.
All measures used in this experiment reflect, as expected, an
effect of time among knowledgeable participants. More interestingly,
most measures revealed a stronger time effect (a decrement
of the detection measure in the delayed condition relative to
the immediate condition) among ‘‘informed innocents’’ as compared
with ‘‘guilty’’ participants. However, this tendency was
statistically significant only in the ANOVAs conducted for the
recognition test and the SVT, but not when the SCRs and the
NGT were used. The differential time effect is also revealed in the
ROC analysis (see Table 3), where the decline in the area statistic
was smaller among ‘‘guilty’’ participants (e.g., from 0.82 to 0.76
for SCR; from 0.97 to 0.80 for all measures combined) than
among ‘‘informed innocents’’ (e.g., from 0.91 to 0.64 for SCR
and from 0.97 to 0.66 for all three measures). This finding is
consistent with the results reported by Gamer et al. (2010) who
used a combination of autonomic measures and demonstrated
that, while the area statistic did not show any decline in the
delayed test for the ‘‘guilty’’ participants (0.89 and 0.90 in the
immediate and delayed tests, respectively), ‘‘informed innocents’’
showed a considerable decline (from 0.95 to 0.75).
This result may reflect the roles of involvement and active
task-participation in memory. Individuals who actually commit-
ted the mock crime took an active part in producing the items to
be remembered, while ‘‘informed innocents’’ became aware of
the critical details through reading a newspaper. This difference
between the two groups does not affect their responses in the
immediate testing, but it does affect memory and, consequently,
differential responding to the critical items shows greater decline
with time among ‘‘informed innocents’’ than among ‘‘guilty.’’
This account is consistent with an extensive literature on the
‘‘generation effect’’ in memory (e.g., Slamecka & Graf, 1978;
deWinstanley, 1995; deWinstanley & Bjork, 2004), demonstrat-
ing that individuals tend to remember information better when
they take an active part in producing it. For example, partici-
pants who generated words by themselves (e.g., generated the
opposite of a given word) subsequently remembered them better
than participants who read the same words (Slamecka & Graf,
1978). Similarly, the superiority of memory for actions (self-per-
formed tasks) over memory for verbally learned material (ver-
bally learned tasks) has been demonstrated to be highly robust
(‘‘the enactment effect’’; Engelkamp, 1998). By the same token,
‘‘guilty’’ participants, who actually experienced the event, enacted
the mock crime, and had direct contact with the critical items,
were more involved in the task and thus remembered these items
better than ‘‘informed-innocents’’ who were exposed to the con-
cealed items by reading about them.
The practical implication of these results is that, although
great caution must be exercised against the possibility of information
leakage, this problem may be less severe in actual applications
of the CIT, because CITs are typically not conducted
immediately after a crime is committed, and it may often take a
few weeks to identify potential suspects and design a CIT. Ideally,
of course, the CIT should not be conducted at all with suspects
who were informed about the critical information, and sometimes
such suspects can be identified by a proper pre-test interview.
However, suspects in criminal offenses may be reluctant to
disclose knowledge of crime-related items, even when they did
not commit the crime and the critical information was leaked to
them, because they cannot be certain that they will be believed to
have obtained this guilty knowledge through leakage. For ex-
ample, often such suspects are unable to explain how they be-
came aware of the crime-related details (see Ben-Shakhar et al.,
1999, for a more detailed discussion of this issue). Consequently,
it is impossible to guarantee, in practice, that guilty and only
guilty suspects have knowledge of the critical information, and,
therefore, any means that may minimize the risks involved in
testing informed innocent suspects is important.
Our results differ somewhat from those reported by Gamer et
al. (2010) in demonstrating attenuation in detection efficiency of
‘‘guilty’’ participants after 1 week. As indicated earlier, Gamer et
al. (2010) did not find any time effect on ROC area with ‘‘guilty’’
subjects. Similarly, Carmel et al. (2003), who used only ‘‘guilty’’
subjects, reported identical ROC areas in the immediate and de-
layed conditions when the standard mock crime was applied
(0.84 in both conditions), but with the more realistic version of
the mock crime detection efficiency showed some decline when
the test was delayed (from 0.71 to 0.68). Thus, the present results
and those reported by Carmel et al. (2003) suggest that, in a more
realistic mock crime, some reduction in SCR-CIT detection efficiency
may be expected when the test is delayed. However, as
Gamer et al. (2010) reached a different conclusion, this issue may
require further research.
Another important aspect of the present results is the
differentiation between central and peripheral items. As predicted,
central items produced higher SCR detection efficiency,
and this effect was stronger when the test was delayed. This is
most clearly reflected by the ROC analysis, where, in the immediate
CIT, the two item-types produced either similar areas (0.78
and 0.77 for central and peripheral items, respectively, with
‘‘guilty’’ participants) or even an advantage of peripheral items in
the ‘‘informed innocents’’ (0.76 vs. 0.87 for central and peripheral
items, respectively). In the delayed condition, on the other
hand, the areas remained stable for the central items (0.80 and
0.74 for ‘‘guilty’’ and ‘‘informed innocents,’’ respectively) but
declined drastically when only peripheral items were used (0.55
and 0.44, both not significantly different from a chance area of
0.50).
The ROC analysis for the SVT reveals a similar pattern (see
Table 3), although the ANOVA conducted on the SVT detection
measure did not reveal a statistically significant item-type × time
interaction. Furthermore, when the test is delayed, relying on just
the central items results in larger areas than relying on all items,
and this pattern is reflected by both the SCR and the SVT. In this
respect, the present results strengthen the conclusion made by
both Carmel et al. (2003) and Gamer et al. (2010), namely, that
when constructing a CIT, an effort should be made to identify as
many central items as possible. Ben-Shakhar and Elaad (2003)
demonstrated that CITs based on at least five questions produce
optimal detection efficiency. However, it is doubtful whether it
would be possible to identify five central features of a crime, in
the realistic criminal context, and it is unclear from the present
results as well as from Carmel et al. (2003) and Gamer et al.
(2010) whether adding peripheral items would be beneficial. One
option would be to use only central items and repeat each ques-
tion several times (see Ben-Shakhar & Elaad, 2002b; Elaad &
Ben-Shakhar, 1997), but this requires additional research as the
previous examinations of item-repetition effects did not relate to
the distinction between central and peripheral items, nor did they
relate to the crucial factor of delaying the test.
The inclusion of two behavioral measures in addition to SCRs
allows us to examine how these measures are affected by delaying
the test and also to assess their incremental validity when com-
bined with the physiological measure. Both the SVT and the
NGT showed the expected time effect on knowledgeable partic-
ipants. Furthermore, the SVT demonstrated a significantly larger
time effect for ‘‘informed innocents’’ than for the ‘‘guilty’’ par-
ticipants, implying that, in realistic conditions where the CIT is
almost always delayed, its vulnerability to information leakage
may be reduced.
The present results also demonstrate that these behavioral
measures may be useful when used in combination with physiological
measures in enhancing the validity of the CIT. For example,
when adding the SVT to the SCR measure, the area under
the ROC curve for detecting ‘‘guilty’’ participants in the immediate
test increased from 0.82 to 0.94, and adding the NGT further
increased the area to 0.97. In the delayed test, the addition of
SVT increased the area from 0.76 to 0.84, but no further increase
with the NGT was revealed. Clearly, the addition of these behavioral
measures also increases the likelihood of false-positive
outcomes in the ‘‘informed innocents,’’ at least in the immediate
testing (the area increased from 0.91 to 0.97). Interestingly,
erroneous detection of ‘‘informed innocents’’ in the delayed condition
is relatively minor, and the addition of the SVT and NGT
does not make any difference (i.e., the area slightly increased from
0.64 to 0.65 and 0.66; none of these values is significantly larger
than a chance area of 0.50). These results, which are consistent with the
results of the second experiment reported by Meijer et al. (2007),
imply that the SVT can be a valuable addition to the traditional
physiological measures in applied settings.
Of course, it is premature to make a definitive recommenda-
tion at this stage, and various aspects of this behavioral measure
must be further investigated. In particular, it will be important to
study its vulnerability to countermeasures and to devise algo-
rithms protecting it from countermeasure attempts. The present
study did not include a systematic examination of the effects of
countermeasures on the SVT, but a post-experiment interview
with the participants revealed that some knowledgeable partic-
ipants tried to use sophisticated strategies to produce a random
pattern (e.g., ignoring the content of the questions and answers,
choosing always the answers that appeared at the right (or left)
side of the screen). This issue was examined by Verschuere et al.
(2008) who coached half of their participants not to perform
below chance level. Indeed, none of the coached participants
performed below chance level and consequently they were not
detected, but 21% of these participants were detected when a runs
test was applied to detect deviations in the number of response
alternations. Although these results cast doubt on the utility of
the SVT as an aid in criminal investigations, it is possible that
additional algorithms could be developed to detect deviations
from randomness. Future studies should be conducted to further
examine the vulnerability of the SVT to countermeasures and the
effectiveness of various methods to detect deviations from
randomness.
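To indicate what a test for deviations from randomness might look like, the following sketch implements the standard Wald-Wolfowitz runs test with a normal approximation for a two-valued response sequence. It is offered only as an illustration: the coding of responses (for example, left vs. right answer position), the function name, and the example sequences are assumptions of this sketch, and it is not claimed to reproduce the procedure applied by Verschuere et al. (2008).

```python
import math


def runs_test_z(sequence):
    """Wald-Wolfowitz runs test z statistic for a two-valued sequence.

    A large positive z indicates too many alternations (more runs than
    expected by chance); a large negative z indicates too few.
    """
    values = sorted(set(sequence))
    if len(values) != 2:
        raise ValueError("sequence must contain exactly two distinct values")
    n1 = sum(1 for x in sequence if x == values[0])
    n2 = len(sequence) - n1
    n = n1 + n2
    # A new run starts at every position where the response changes.
    runs = 1 + sum(1 for a, b in zip(sequence, sequence[1:]) if a != b)
    expected = 2 * n1 * n2 / n + 1
    variance = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n ** 2 * (n - 1))
    if variance <= 0:
        raise ValueError("sequence too short for the normal approximation")
    return (runs - expected) / math.sqrt(variance)


# Hypothetical response sequences (0 = left answer, 1 = right answer).
strict_alternation = [0, 1] * 10
long_blocks = [0] * 10 + [1] * 10
print(runs_test_z(strict_alternation))
print(runs_test_z(long_blocks))
```

A participant who deliberately alternates between answers to appear random tends to produce more runs than chance would, which is the kind of deviation such a test is meant to flag.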
Even greater caution should be exercised regarding the use of
the NGT. Although the present results show that it may have
potential, it has to be remembered that, in this experiment, the
NGT was based on just four questions, and correlation coefficients
derived from such a small profile may be unreliable. In addition, it
should be noted that the use of only four NGT questions reflects an
inherent difficulty in identifying critical numerical items. Thus, it is
suggested that the validity of the NGT and its potential as an
additional detection measure in forensic applications should be
further explored before any conclusions are reached.
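The instability of a correlation computed over only four items can be illustrated with a small simulation. For four observations and independent, normally distributed guesses, the correlation coefficient under pure guessing is uniformly distributed between −1 and 1, so its absolute value averages about 0.50 and exceeds 0.90 roughly one time in ten. The normal guesses, the placeholder item values, and all names below are assumptions of this sketch and do not describe the actual NGT items.

```python
import math
import random


def abs_pearson(x, y):
    """Absolute Pearson correlation between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return abs(sxy / math.sqrt(sxx * syy))


def simulate_uninformed_scores(n_items=4, n_sims=20000, seed=0):
    """Simulate the detection score of participants who guess at random."""
    rng = random.Random(seed)
    actual = list(range(1, n_items + 1))  # placeholder "true" values
    scores = []
    for _ in range(n_sims):
        guesses = [rng.gauss(0.0, 1.0) for _ in range(n_items)]
        scores.append(abs_pearson(actual, guesses))
    return scores


scores = simulate_uninformed_scores()
mean_score = sum(scores) / len(scores)
share_high = sum(1 for s in scores if s >= 0.9) / len(scores)
print(round(mean_score, 2), round(share_high, 2))
```

Under these assumptions, a sizeable share of purely guessing participants would obtain a very high detection score by chance alone, which is consistent with the caution expressed above.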
Finally, it should be pointed out once again that this study
focused on just two aspects differentiating the laboratory
mock crime set-up from realistic criminal investigations (mem-
ory of various types of critical items when the test is delayed and
leakage of critical information to innocent suspects). Clearly,
there are various emotional and motivational differences
between mock crime studies and criminal investigations that
may affect differential responding to the critical items. It is, of
course, an empirical question whether the present results as well
as the large body of CIT research, which is based on mock crime
paradigms, will generalize to the forensic usage of the CIT.
We believe, following Lykken (1974), that since the CIT, unlike
the Comparison Questions Test (CQT), focuses on the detection
of specific knowledge stored in memory, rather than on
the detection of deception, it would be relatively unaffected
by the increased emotional arousal associated with realistic
criminal investigations. A study by Kugelmass and Lieblich (1966),
who successfully manipulated emotional arousal and stress
and found a general increase in measures of physiological
arousal, but no effect on differential responding to the relevant
information, provides some empirical support for this belief.
Clearly, however, further research is needed on emotional and
motivational factors and their effect on the outcomes of the CIT.
REFERENCES
Ben-Shakhar, G. (1985). Standardization within individuals: A simple method to neutralize individual differences in psychophysiological responsivity. Psychophysiology, 22, 292–299.
Ben-Shakhar, G., Bar-Hillel, M., & Kremnitzer, M. (2002). Trial by polygraph: Reconsidering the use of the GKT in court. Law and Human Behavior, 26, 527–541.
Ben-Shakhar, G., & Elaad, E. (2002a). The Guilty Knowledge Test (GKT) as an application of psychophysiology: Future prospects and obstacles. In M. Kleiner (Ed.), Handbook of polygraph testing (pp. 87–102). San Diego, CA: Academic Press.
Ben-Shakhar, G., & Elaad, E. (2002b). Effects of questions’ repetition and variation on the efficiency of the guilty knowledge test: A reexamination. Journal of Applied Psychology, 87, 972–977.
Ben-Shakhar, G., & Elaad, E. (2003). The validity of psychophysiological detection of deception with the Guilty Knowledge Test: A meta-analytic review. Journal of Applied Psychology, 88, 131–151.
Ben-Shakhar, G., & Furedy, J. J. (1990). Theories and applications in the detection of deception: A psychophysiological and international perspective. New York: Springer-Verlag.
Ben-Shakhar, G., Gronau, N., & Elaad, E. (1999). Leakage of relevant information to innocent examinees in the GKT: An attempt to reduce false-positive outcomes by introducing target stimuli. Journal of Applied Psychology, 84, 651–660.
Bradley, M. T., Barefoot, C., & Arsenault, A. (2010). Leakage of information to innocents. In B. Verschuere, G. Ben-Shakhar, & E. Meijer (Eds.), Memory detection: Theory and application of the Concealed Information Test. Cambridge, UK: Cambridge University Press, Forthcoming.
Bradley, M. T., MacLaren, V. V., & Carle, S. B. (1997). Deception and nondeception in guilty knowledge and guilty actions polygraph tests. Journal of Applied Psychology, 81, 153–160.
Bradley, M. T., & Rettinger, J. (1992). Awareness of crime-relevant information and the guilty knowledge test. Journal of Applied Psychology, 77, 55–59.
Bradley, M. T., & Warfield, J. F. (1984). Innocence, information, and the guilty knowledge test in the detection of deception. Psychophysiology, 21, 683–689.
Carmel, D., Dayan, E., Naveh, A., Raveh, O., & Ben-Shakhar, G. (2003). Estimating the validity of the Guilty Knowledge Test from simulated experiments: The external validity of mock crime studies. Journal of Experimental Psychology: Applied, 9, 261–269.
Cohen, J. E. (1988). Statistical power analysis for the behavioral sciences. Hillsdale, NJ: Lawrence Erlbaum.
deWinstanley, P. A. (1995). A generation effect can be found during naturalistic learning. Psychonomic Bulletin & Review, 2, 538–541.
deWinstanley, P. A., & Bjork, E. L. (2004). Processing strategies and the generation effect: Implications for making a better reader. Memory & Cognition, 32, 945–955.
Elaad, E. (1998). The challenge of the concealed knowledge polygraph test. Expert Evidence, 6, 161–187.
Elaad, E., & Ben-Shakhar, G. (1997). Effects of items’ repetitions and variations on the efficiency of the guilty knowledge test. Psychophysiology, 34, 587–596.
Engelkamp, J. (1998). Memory for actions. East Sussex, UK: Psychology Press Publishers.
Gamer, M. (2010). Does the guilty action test allow for differentiating guilty participants from informed innocents? A re-examination. International Journal of Psychophysiology, 76, 19–24.
Gamer, M., Bauermann, T., Stoeter, P., & Vossel, G. (2007). Covariations among fMRI, skin conductance and behavioral data during processing of concealed information. Human Brain Mapping, 28, 1287–1301.
Gamer, M., & Berti, S. (2010). Task relevance and recognition of concealed information have different influences on electrodermal activity and event-related brain potentials. Psychophysiology, 47, 355–364.
Gamer, M., Kosiol, D., & Vossel, G. (2010). Strength of memory encoding affects physiological responses in the Guilty Action Test. Biological Psychology, 83, 101–107.
Gamer, M., Verschuere, B., Crombez, G., & Vossel, G. (2008). Combining physiological measures in the detection of concealed information. Physiology and Behavior, 95, 333–340.
Green, D. M., & Swets, J. A. (1966). Signal detection theory and psychophysics. New York: John Wiley & Sons.
Iacono, W. I. (2010). Encouraging the use of the guilty knowledge test (GKT): What the GKT has to offer to law enforcement. In B. Verschuere, G. Ben-Shakhar, & E. Meijer (Eds.), Memory detection: Theory and application of the Concealed Information Test. Cambridge, UK: Cambridge University Press, Forthcoming.
Kraphol, D. (2010). Practical limitations of the concealed information test in criminal cases. In B. Verschuere, G. Ben-Shakhar, & E. Meijer (Eds.), Memory detection: Theory and application of the Concealed Information Test. Cambridge, UK: Cambridge University Press, Forthcoming.
Kugelmass, S., & Lieblich, I. (1966). Effects of realistic stress and procedural interference in experimental lie detection. Journal of Applied Psychology, 50, 211–216.
Langleben, D. D., Loughead, J. W., Bilker, W. B., Ruparel, K., Childress, A. R., Busch, S. I., & Gur, R. C. (2005). Telling truth from lie in individual subjects with fast event-related fMRI. Human Brain Mapping, 26, 262–272.
Lieblich, I., & Ninio, A. (1972). Detection of suppressed involvement with information through a forced number-guessing technique. Acta Psychologica, 36, 381–387.
Lieblich, I., Shaham, E., & Ninio, A. (1976). Effects of time stress and stimulus-response set size on the efficiency of detection of involvement with suppressed information through the use of the forced number-guessing technique. Acta Psychologica, 40, 75–84.
Lykken, D. T. (1959). The GSR in the detection of guilt. Journal of Applied Psychology, 43, 385–388.
Lykken, D. T. (1960). The validity of the guilty knowledge technique: The effects of faking. Journal of Applied Psychology, 44, 258–262.
Lykken, D. T. (1974). Psychology and the lie detector industry. American Psychologist, 29, 725–739.
Lykken, D. T. (1998). A tremor in the blood: Uses and abuses of the lie detector. New York: Plenum Trade.
Marston, W. M. (1917). Systolic blood pressure symptoms of deception. Journal of Experimental Psychology, 2, 117–163.
Meijer, E. H., Smulders, F. T. Y., Johnston, J. E., & Merckelbach, H. L. G. J. (2007). Combining skin conductance and forced choice in the detection of concealed information. Psychophysiology, 44, 814–822.
Merckelbach, H. L. G. J., Hauer, B., & Rassin, E. (2002). Symptom validity testing of feigned dissociative amnesia: A simulation study. Psychology, Crime and Law, 8, 311–318.
Nakayama, M. (2002). Practical use of the concealed information test for criminal investigation in Japan. In M. Kleiner (Ed.), Handbook of polygraph testing (pp. 49–86). San Diego, CA: Academic Press.
National Research Council. (2003). The polygraph and lie detection. Committee to Review the Scientific Evidence on the Polygraph. Washington: The National Academies Press.
Osugi, A. (2010). Daily application of the CIT: Japan. In B. Verschuere, G. Ben-Shakhar, & E. Meijer (Eds.), Memory detection: Theory and application of the Concealed Information Test. Cambridge, UK: Cambridge University Press, Forthcoming.
Pankratz, L., Fausti, S. A., & Peed, S. (1975). A forced-choice technique to evaluate deafness in the hysterical or malingering patient. Journal of Consulting and Clinical Psychology, 43, 421–422.
Podlesny, J. A. (1993). Is the guilty knowledge polygraph technique applicable in criminal investigations? A review of FBI case records. Crime Laboratory Digest, 20, 57–61.
Raskin, D. C. (1989). Polygraph techniques for the detection of deception. In D. C. Raskin (Ed.), Psychological methods in criminal investigation and evidence (pp. 247–296). New York: Springer-Verlag.
Reid, J. E., & Inbau, F. E. (1977). Truth and deception: The Polygraph (“Lie Detection”) Technique. Baltimore: Williams and Wilkins.
Rosenfeld, J. P., Labkovsky, E., Winograd, M., Lui, M. A., Vandenboom, C., & Chedid, E. (2008). The Complex Trial Protocol (CTP): A new, countermeasure-resistant, accurate P300-based method for detection of concealed information. Psychophysiology, 45, 906–919.
Rosenfeld, J. P., Shue, E., & Singer, E. (2007). Single versus multiple probe blocks of P300-based concealed information tests for autobiographical versus incidentally learned information. Biological Psychology, 74, 396–404.
Slamecka, N. J., & Graf, P. (1978). The generation effect: Delineation of a phenomenon. Journal of Experimental Psychology: Learning, Memory & Cognition, 4, 492–604.
Swets, J. A., Tanner, W. P. Jr., & Birdsall, T. C. (1961). Decision processes in perception. Psychological Review, 68, 301–340.
Verschuere, B., Crombez, G., De Clercq, A., & Koster, E. (2004). Autonomic and behavioral responding to concealed information: Differentiating defensive and orienting responses. Psychophysiology, 41, 461–466.
Verschuere, B., Crombez, G., & Koster, E. (2004). Orienting to guilty knowledge. Cognition & Emotion, 18, 265–279.
Verschuere, B., Meijer, E., & Crombez, G. (2008). Symptom validity testing for the detection of simulated amnesia: Not robust to coaching. Psychology, Crime, & Law, 14, 523–528.
Vrij, A. (2008). Detecting lies and deceit. Pitfalls and opportunities (Second Edition). West Sussex: John Wiley and Sons.
(Received May 5, 2010; Accepted September 17, 2010)