21
Journal of Experimental Psychology: Human Perception and Performance 1982, Vol. 8, No. 2, 194-214 Copyright 19S2 by the American Psychological Association, Inc. 0096-1523/82/0802-0194$00.75 Perceptual Grouping and Attention in Visual Search for Features and for Objects Anne Treisman University of British Columbia, Vancouver, British Columbia, Canada This article explores the effects-of perceptual grouping on search for targets defined by separate features or by conjunction of features. Treisman and Gelade proposed a feature-integration theory of attention, which claims that in the ab- sence of prior knowledge, the separable features of objects are correctly combined only when focused attention is directed to each item in turn. If items are preat- tentively grouped, however, attention may be directed to groups rather than to single items whenever no recombination of features within a group could generate an illusory target. This prediction is confirmed: In search for conjunctions, sub- jects appear to scan serially between groups rather than items. The scanning rate shows little effect of the spatial density of distractors, suggesting that it reflects serial fixations of attention rather than eye movements. Search for features, on the other hand, appears to be independent of perceptual grouping, suggesting that features are detected preattentively. A conjunction target can be camou- flaged at the preattentive level by placing it at the boundary between two adjacent groups, each of which shares one of its features. This suggests that preattentive grouping creates separate feature maps within each separable dimension rather than one global configuration. The separate existence of an object and the boundaries that define and divide it from its surroundings must be established before the object can be identified. Perceptual seg- regation must therefore occur early in visual processing. There has been considerable in- terest recently in the nature of the grouping operation, which organizes the visual field into objects and their backgrounds. Object boundaries can be signaled by differences between adjacent areas in color or bright- ness, or in texture. Perceptual grouping is a special case of texture segregation, in which the texture is composed of discrete elements above some minimal size. The pres- ent article is concerned with the relations between attention and perceptual grouping in determining the perception of objects. This research was supported by a grant from the National Scientific and Engineering Research Council of Canada. I am grateful to Daniel Kahneman and William Prinzmetal for helpful criticism and discussions, and to Randy Paterson for his help in running the experiments. Requests for reprints should be sent to Anne Treis- man, Department of Psychology, University of British Columbia, 154 - 2053 Main Mall, Vancouver, British Columbia, Canada V6T 1Y7. The phenomenology of perceptual orga- nization was first extensively studied by the Gestalt psychologists (e.g., Wertheimer, 1923). They showed that similarity, prox- imity, and common movement of discrete elements are strong determinants of percep- tual grouping. These simple principles max- imize the chance of putting together parts of the same object and separating parts of different objects; presumably, they evolved for this reason. Some organisms have also evolved means of outwitting the parsing skills of their predators, using camouflage to break up local continuities of color, brightness, line, or texture (Fox, 1978). The manner in which similarity controls perceptual grouping was further explored by Beck (1966, 1967) and by Olson and Att- neave (1970). Beck found that differences in line orientation could be as effective as differences in brightness in segregating two groups of elements. A display divided into regions containing Ts and tilted Ts segre- gates easily, whereas one divided between Ts and Ls does not. Beck suggests that simi- larity grouping is "fundamentally an acuity task involving the simultaneous discrimina- tion of stimulus differences prior to a nar- 194

Treisman (1982) Perceptual grouping and attention in …wexler.free.fr/library/files/treisman (1982) perceptual...Perceptual Grouping and Attention in Visual Search for Features and

Embed Size (px)

Citation preview

Journal of Experimental Psychology:Human Perception and Performance1982, Vol. 8, No. 2, 194-214

Copyright 19S2 by the American Psychological Association, Inc.0096-1523/82/0802-0194$00.75

Perceptual Grouping and Attention in Visual Searchfor Features and for Objects

Anne TreismanUniversity of British Columbia, Vancouver, British Columbia, Canada

This article explores the effects-of perceptual grouping on search for targetsdefined by separate features or by conjunction of features. Treisman and Geladeproposed a feature-integration theory of attention, which claims that in the ab-sence of prior knowledge, the separable features of objects are correctly combinedonly when focused attention is directed to each item in turn. If items are preat-tentively grouped, however, attention may be directed to groups rather than tosingle items whenever no recombination of features within a group could generatean illusory target. This prediction is confirmed: In search for conjunctions, sub-jects appear to scan serially between groups rather than items. The scanning rateshows little effect of the spatial density of distractors, suggesting that it reflectsserial fixations of attention rather than eye movements. Search for features, onthe other hand, appears to be independent of perceptual grouping, suggestingthat features are detected preattentively. A conjunction target can be camou-flaged at the preattentive level by placing it at the boundary between two adjacentgroups, each of which shares one of its features. This suggests that preattentivegrouping creates separate feature maps within each separable dimension ratherthan one global configuration.

The separate existence of an object andthe boundaries that define and divide it fromits surroundings must be established beforethe object can be identified. Perceptual seg-regation must therefore occur early in visualprocessing. There has been considerable in-terest recently in the nature of the groupingoperation, which organizes the visual fieldinto objects and their backgrounds. Objectboundaries can be signaled by differencesbetween adjacent areas in color or bright-ness, or in texture. Perceptual grouping isa special case of texture segregation, inwhich the texture is composed of discreteelements above some minimal size. The pres-ent article is concerned with the relationsbetween attention and perceptual groupingin determining the perception of objects.

This research was supported by a grant from theNational Scientific and Engineering Research Councilof Canada.

I am grateful to Daniel Kahneman and WilliamPrinzmetal for helpful criticism and discussions, and toRandy Paterson for his help in running the experiments.

Requests for reprints should be sent to Anne Treis-man, Department of Psychology, University of BritishColumbia, 154 - 2053 Main Mall, Vancouver, BritishColumbia, Canada V6T 1Y7.

The phenomenology of perceptual orga-nization was first extensively studied by theGestalt psychologists (e.g., Wertheimer,1923). They showed that similarity, prox-imity, and common movement of discreteelements are strong determinants of percep-tual grouping. These simple principles max-imize the chance of putting together partsof the same object and separating parts ofdifferent objects; presumably, they evolvedfor this reason. Some organisms have alsoevolved means of outwitting the parsingskills of their predators, using camouflageto break up local continuities of color,brightness, line, or texture (Fox, 1978).

The manner in which similarity controlsperceptual grouping was further explored byBeck (1966, 1967) and by Olson and Att-neave (1970). Beck found that differencesin line orientation could be as effective asdifferences in brightness in segregating twogroups of elements. A display divided intoregions containing Ts and tilted Ts segre-gates easily, whereas one divided between Tsand Ls does not. Beck suggests that simi-larity grouping is "fundamentally an acuitytask involving the simultaneous discrimina-tion of stimulus differences prior to a nar-

194

GROUPING AND OBJECT PERCEPTION 195

rowing or focusing of attention" (1972, p.13). Any features that are easily discrimi-nated with peripheral vision in a patternedvisual field will also mediate grouping. Beck(1972) also showed that discriminability isdetermined differently when attention is fo-cused and when it is spread across a display.Thus the advantage in discriminability of Tsversus tilted Ts compared to Ts and Ls dis-appears when single items are presented inan otherwise empty field.

Julesz (1975) tried to specify in statisticalterms the properties that determine texturesegregation. He proposed that the visual sys-tem automatically detects first- and second-order dependencies among local elementsand segregates any areas that differ system-atically at either of these two levels. Higherorder dependencies are detected only slowly,with effort and "scrutiny"; they cannot me-diate the spontaneous emergence of salientboundaries. Later, Caelli, Julesz, and Gil-bert (1978) and Julesz and Caelli (1979)included as potential mediators of texturesegregation a few specific higher order com-binations, which represent the local geomet-ric properties of collinearity, corner, and clo-sure. Most recently, Julesz (1980) reducedto just two basic elements the local cues thathe believes can distinguish textures sharingthe same second-order statistics: these ele-ments, which he calls "textons," are (a)"elongated blobs" and (b) the number of lineends or "terminators."

The theories all agree that perceptualgrouping occurs automatically and in par-allel, without attention or "scrutiny," as Ju-lesz (1980) puts it. This preattentive orga-nization should then affect all subsequentstages of processing. Treisman, Sykes, andGelade (1977) proposed a new hypothesisabout the role of attention in the perceptionof objects that has implications for the na-ture and use of perceptual grouping. Thetheory suggests that focused attention is nec-essary for the accurate combination of fea-tures or properties of complex objects when-ever more than one object is presented in anunpredictable context. Separable featuresare registered independently and in parallelacross the visual field. However, when ob-jects must be identified and no contextualcues are available to select the expected com-

binations of features, a correct synthesis canbe achieved only by focusing attention onone location at a time. Features which co-occur in a single fixation of attention arecombined to form an object. If attention isdiverted or overloaded, the features may bewrongly recombined, forming "illusory con-junctions" (Treisman & Schmidt, 1982).

Treisman and Gelade (1980) tested thisfeature-integration theory of attention inseveral experiments. We showed that visualsearch for targets defined by one or moredisjunctive features (e.g., blue or curved)occurs in parallel across a spatial display,whereas search for a conjunction target (e.g.,red and vertical) requires a serial, self-ter-minating scan through items in the display,suggesting that attention must be focused oneach item in turn. The linear functions and2:1 slope ratio of negative to positive searchtimes were maintained across two differentlevels of target/distractor discriminability,which produced markedly different searchrates (40 msec and 92 msec per item). Theserial scan for conjunction targets was there-fore not due to the overall level of difficulty.Conversely, a difficult feature search (forellipses differing slightly in size) producedvery slow search rates but neither linearfunctions nor 2:1 slope ratios.

In a different paradigm, we showed thatfeature targets can be identified withoutbeing correctly localized, whereas for con-junction targets, identity and location ap-pear to be interdependent. The interdepen-dence is consistent with the claim thatconjunctions are correctly registered onlywhen attention is focused on the relevantlocation. Finally, when attention is divertedor overloaded, illusory conjunctions of fea-tures are in fact experienced (Prinzmetal,1981; Treisman & Schmidt, 1982). Illusoryobjects may recombine either dimensions,like the shape, size, and color of differentreal objects, or component parts, like thecurves, lines, and angles of more complexshapes.

The convergence of these results from avariety of different experimental paradigmsprovides stronger support for the theory thanany single finding on its own. Townsend(1972) has shown that some models of par-allel as well as of serial search can predict

196 ANNE TREISMAN

linear functions, but the fact that attentionlimits are implicated in the formation of il-lusory conjunctions and in the accurate lo-calization of conjunction targets makes itplausible in conjunction search to attributethe linear increase in latencies with in-creased display size to a serial self-termi-nating scan with focused attention directedto each item in turn.

Some predictions about perceptual group-ing and segregation also follow from the the-ory. The first was tested by Treisman andGelade (1980), and the others are exploredin the present aricle.

1. Determinants of Grouping

If grouping is an early, .preattentive pro-cess, it should be mediated only by the dis-crimination of simple, separable features.Grouping should therefore not be immedi-ately apparent when two regions of textureelements differ only in the way in which thesame sets of features are combined. Weshowed that elements that differ either incolor (red Os and Xs vs; blue Os and Xs)or in shape (red and blue Os vs. red and blueXs) segregate easily into salient perceptualgroups, whereas elements that differ only inthe conjunction of colors and shapes (forexample red Os mixed with blue Xs in onearea vs. red Xs with blue Os in the other)do not automatically segregate into percep-tual groups (Treisman & Gelade, 1980). Theboundaries of groups defined only by con-junctions of features are found through slow,perhaps serial, scanning and seem to be in-ferred rather than directly perceived.

In general this hypothesis makes similarpredictions to those of Julesz (1975) andBeck (1972); but whereas Julesz tries to de-fine a priori the statistical criteria that resultin preattentive grouping, we propose thatpreattentive grouping can be mediated byany independent population of feature de-tectors, which are present in the perceptualsystem. Thus we predict a correlation be-tween the ease of segregation and the pres-ence of any features that in other tests havedemonstrated separability (Garner, 1974;Treisman & Gelade, 1980). It is also pos-sible that new, functionally independent de-tectors could be set up through prolonged

practice, as in Laberge's (1973) "unitiza-tion" process or Shiffrin and Schneider's(1977) "automatization" of search with con-stant mapping of target and distractors. Anynewly acquired feature detectors that allowautomatic detection or matching unaffectedby expectancy or attention, should then alsobe capable of mediating perceptual segre-gation.

2. Grouping Effects on Attention

A second prediction (tested in Experiment1) is that the results of preattentive groupingshould control the direction of focused at-tention. If the function of perceptual group-ing is to set up candidate objects for furtherprocessing, and if attention is necessary forobject identification, it follows that we shouldattend initially to groups of ite.ms when theseare present rather than to single items withingroups. Thus the perceptual organization ofa display should have marked effects onsearch for conjunction targets but not forfeature targets, since these are detected au-tomatically without attention.

The relation between attention and per-ceptual grouping has been discussed byKahneman and Henik (1977). They describea phenomenon that they call "group pro-cessing." Group processing in the report oftachistoscopic arrays is characterized bysimilar and highly correlated probabilitiesof report for items within the same percep-tual group and by discontinuities and neg-ative correlations in the probabilities of re-port for items in different groups. Kahnemanand Henik (1977) suggested that attentionis allocated in a hierarchical fashion, first toa group as a whole and only subsequentlyto elements within a group. They also lookedat selective attention in relation to the degreeof spatial segregation or mixing of red andblue letters. In attempting to recall the redletters only, subjects showed a marked ad-vantage of grouped displays compared to acheckerboard arrangement. The results sug-gest that attention can only (or much moreeffectively) be focused on relevant items ifthese are spatially grouped.

The metaphors chosen for attention inKahneman's model and in feature-Integra-

GROUPING AND OBJECT PERCEPTION 197

tion theory imply rather different functions.For Kahneman, attention plays the role ofan enabling input or supply of resources thatis allocated to items to facilitate their pro-cessing. In feature-integration theory, atten-tion functions as a selective device or spot-light, selecting which features should beconjoined and how. These two analogies dif-fer in their emphasis but are not incompat-ible. A spotlight can vary in intensity as wellas in spatial direction and area. The resultof distributing a resource more widely whenthe pool is limited is that each recipient getsless of it; in feature-integration theory, theresult of spreading the attention spotlightover a larger number of items is to increaseuncertainty in the allocation of features toobjects. Attention may serve both functions:first selecting which objects will be accu-rately resynthesized and then facilitatingany further processing they receive.

3. Grouping and Illusory Conjunctions

Another prediction also follows from theidea that attention is allocated to groupsrather than to items: If an otherwise ho-mogeneous group contains two items withcontrasting features (e.g., red in a greengroup), the theory implies that their loca-tions and conjunctions should remain uncer-tain until attention is narrowed to containeach unique item alone. We would expect,therefore, more illusory conjunctions to beformed within perceptual groups than acrossthem. Prinzmetal (1981) has shown that il-lusory conjunctions are more often seenwhen two interchangeable features are pres-ent within a single group than when theyoccur in separate groups.

4. Spatial Density and Search forConjunctions or Features

^It should be possible to distinguish theeffects of perceptual grouping from those ofspatial distribution. The physical or retinalseparation of items should affect conjunctionformation only if and when it acts as a meansof segregating groups of items that can thenreceive separate attention. Any differencesin the spatial distribution of items that haveno impact on their preattentive grouping

should also leave conjunction search unaf-fected. Retinal location may affect the ac-curacy of feature registration, since acuityand color discrimination decrease away fromthe fovea. To counteract these peripherallimits, we expect more eye movements to bemade with displays in which items are dis-tributed over a wider area whenever theirfeatures become confusable at the peripherallocations. This should not, however, resultin the strictly serial checking with the mental(rather than the physical) "eyeball," whichis necessary in conjunction search. Thesepredictions are tested in Experiment 3.

5. Separate Preattentive Feature Mapsand Serial Scanning of Groups

A fifth prediction is somewhat counter-intuitive and contrasts with the generallyaccepted view first proposed by Neisser(1967). If features are registered indepen-dently of each other, they should form sev-eral separate preattentive organizationsrather than one global configuration. In factthere should be one for every population ofseparable feature detectors. Each should of-fer its own candidates for object and eventidentification, and they should be interre-lated only with the aid of focused attention.

The idea of separate feature maps mayneed some clarification. Features that arespatially defined, like shapes and sizes, arethemselves arrangements or spatial extentsof other features like brightnesses or colors,with which they are physically conjoined.We should, however, distinguish real dis-plays from coded representations of thosedisplays. It is logically possible to registerthe presence of something red without spec-ifying what or where it is. Sizes, line ori-entations, curvature, intersections and so onmay be abstracted from the particularbrightness contrasts that physically embodythem. Treisman and Schmidt (1982) showedthat illusory conjunctions may exchange theshapes, sizes, colors, or solidity (outline vs.filled) of real objects and as a consequencesubstantially alter, for example, the amountof a certain color that is seen, or alter itsspatial distribution.

Location may diffep from other features,however. In order for a feature "map" to be

198 ANNE TREISMAN

structured, at least two dimensions must beconjoined, one to act as medium for theother. Kubovy (1981) suggested that for vi-sual stimuli, space and time are the "indis-pensable attributes" that perform the roleof media to carry variation in other attri-butes. This would give location a special sta-tus in perceptual coding. There is in factconsiderable evidence that locations are (insome sense of the word) registered early;examples are the localized receptive fieldsof visual cortical cells and the fine spatialresolution that mediates stereoscopic depth.It does not follow, however, that the loca-tions of features are also automatically avail-able to the mechanisms that discriminateand identify multidimensional objects. Lo-cation is an ambiguous term: It could refersimply to relations in space, such as "left ofor "below," or it could be defined in absolutecoordinates, either on the retina or in thephysical three-dimensional world. If loca-tions were coded relationally within eachfeature map, without absolute coordinatesor common scale, this would allow percep-tual grouping and segregation of differentcolors, orientations, and so on, but would notbe sufficient to allow correct alignment ofone feature map with another. Locationsmay therefore be conjoined with each set ofseparable features at an earlier stage and bya different process from that which conjoinsother features with each other.

If focused attention is, as we suggest, themeans by which cross-dimensional integra-tion takes place, it must allow access tocross-dimensional spatial coordinates so thatabsolute locations can be matched across thedifferent feature maps. These absolute co-ordinates would presumably refer to real-world locations in three-dimensional space.Serial processing with focused attention maytherefore be needed not only to conjoin fea-tures but also to interpret relative featurelocations as points in a common spatial rep-resentation of the external world.

Kahneman and Henik's (1977) resultssuggest that focused attention can be di-rected to the locations of preattentively de-fined groups. Since we claim that there areseveral separate preattentive feature maps,one of them must be selected to control at-tention at any given time, either by its task

relevance or by the amount of organizationthat it contains and the economy of searchthat it consequently allows. A relational cod-ing of spatial layout could be sufficient toguide the direction of attention to particularitems within a single feature map. The at-tended features could then be localized inabsolute coordinates within a representationof external space. Features at the same lo-cation but on other dimensions would alsobe accessed and all would be conjoined toform a structured, multidimensional object.The mechanisms that achieve the integratedspatial representation and that conjoin fea-tures to identify objects are not our presentconcern. This article attempts to explore thenature of preattentive processing and tospecify which operations require attentionand must therefore be performed serially.

What effects would this account entail forsearch tasks with grouped displays? Theclaim is that the serial scan with focusedattention, which mediates search for con-junction targets, is needed to bring the dif-ferent feature maps into register with eachother. When the display is randomly mixed,it seems this must be done sequentially foreach item in turn. However, when the dis-play is divided into regions of homogeneousitems, the exact location of each feature rel-ative to the others within its group may beirrelevant; illusory exchanges of color andshape between two identical items wouldproduce no detectable error. In this case, itmay be sufficient to match up feature mapsat a cruder resolution, focusing attention ongroups to integrate these, rather than indi-vidual items. This matching at cruder res-olution would be sufficient to determine, forexample, that all items in a group of Xs aregreen, but would not decide the color of eachX in turn. It should then be possible to checkitems within each group in parallel, even insearch for a conjunction target. In lookingfor a green H in a background of separatelygrouped green Xs and red Hs, early segre-gation of items by color, for example, couldallow attention to be focused on the homo-geneous groups that share one target feature(e.g., green). A parallel search within eachfocused group should then allow detectionof the other target feature (e.g., vertical)without requiring a serial scan to locate it

GROUPING AND OBJECT PERCEPTION 199

more precisely. Search times should varywith the number of groups rather than withthe number of items. This prediction is testedin Experiment 2.

Our earlier results with conjunction searchconfirm the conclusion reached by Kahne-man and Henik (1981) that there are con-straints on the spatial distribution of selec-tive attention. They show that attentioncannot be distributed over a subset of items(e.g., the red ones or the curved ones) whenthese are spatially scattered among otheritems in a randomly mixed display. If sub-jects could attend selectively and in parallelto all the red items in a mixed display of redhorizontal and green vertical lines, thisshould allow them to detect a conjunctiontarget (e.g., red and vertical) within the redsubset, as if it were defined by a single fea-ture (e.g., a vertical line among horizontals).The fact that this could not be done suggeststhat attention can be focused only on setsof spatially grouped items. The spotlight ofattention appears to be unitary (or at leastnot multiply divisible) and may be con-strained in the possible shapes it can assume,although it may vary in size (Eriksen &Hoffman, 1972).

6. Grouping and Camouflage

A final prediction, tested in Experiment4, also follows from the idea that a numberof separate preattentive feature maps areformed, rather than one global configura-tion. If a conjunction target is placed at theboundary between two groups, each of whichshares one of its features, it should beembedded as one of a group in both preat-tentive feature maps and should exist as aunique entity in neither. Thus a red X at theboundary between blue Xs and red Osshould be preattentively invisible. It shouldbecome visible only when attention is nar-rowed sufficiently to exclude at least onegroup of distractor items. In the everydayworld, of course, such situations would berare; the different preattentive maps willnormally give highly correlated parsings offigure and ground. Our hypothesis may helpto explain some cases of natural camouflage,however. It could be that animals concealthemselves from predators by sharing, for

example, their color with some elements oftheir background and their shape withothers.

In summary, the predictions to be testedin the present article are that (a) perceptualgrouping should have little effect on featuresearch but large effects on conjunction searchthrough its role in the control of attention;(b) spatial density without perceptual group-ing should have no effect on the rate of con-junction search, since the serial scan involvesthe mental rather than the physical eyeball;(c) when homogeneous groups of items arepresent in the display, groups rather thansingle items will be serially scanned; and (d)adjacent groups of items should preatten-tively camouflage an item at their boundaryif the item conjoins one feature from eachgroup and has no unique distinguishing fea-ture of its own.

Experiment 1: Perceptual Grouping andSearch for Conjunctions

Experiment 1 investigates the effect ofperceptual grouping on search for objectsdefined either by features or by conjunctions.If feature targets are detected through thesame preattentive operations that segregategroups and textures, they should be affectedby the same variables that control the group-ing itself—the discriminability of adjacentshapes or colors in the display. Detection offeature targets should vary with the size andnumber of perceptual groups only to the ex-tent that these determine the local contextof the target. Thus, a green target might beeasier to detect when surrounded only byyellow distractors in its immediate vicinitythan when surrounded by a mixture of blueand yellow distractors; this would follow ifand because two different discriminationsare harder to make than one. The effectsshould be small, unless the discriminabilityof the target from the different distractorsvaries substantially within the displays.

Detection of conjunction targets, on theother hand, requires attention. If items arespatially grouped by a shared feature, at-tention may be directed to groups ratherthan to items. We should then expect a largeeffect of the degree of spatial homogenityon search for conjunctions of features.

200 ANNE TREISMAN

Experiment 1 compared the effects ofgrouping and of display size on visual search.Two groups of subjects were tested: one onconjunction targets and one on feature tar-gets. Each group was run in two conditions:(a) with display size varying from 1 to 36items and (b) with display size fixed at 36items, but the number of homogeneousgroups within the display varying from 1to 36.

Method

Subjects. Eight subjects (five female and three male)were tested with the conjunction target and eight sub-jects (three female and five male) with the feature tar-gets. All were students, aged 16-25, who volunteeredto take part in the experiment and were paid $3 an hour.

Stimulus materials. Two sets of stimulus cards wereconstructed: a feature set and a conjunction set. Thedistractors in both sets were,red Hs and green Xs. Therewere two types of target: In the conjunction condition,the target was a green H, which differed from the dis-tractors only by the conjunction of its color and shape.In the feature condition, the target was defined dis-junctively: On each trial subjects were asked to searchfor both a blue letter and an S without knowing which,if either, would be present. The blue letter was an Hor an X and the S was green or red, but these featureswere irrelevant to the task. Thus each distractor sharedone of the target features in both feature and conjunc-tion conditions. Subjects had to check both color andshape for the distractors in both the conjunction and thefeature condition, but they had to check how color andshape were conjoined only in the conjunction condition.

Two sets of displays wer^ made for both the featureand the conjunction search conditions. In one set, displaysize was varied; in the other, display size was fixed at36 items, and the number of homogeneous groups ofdistractors was varied. The sets that varied display sizecontained examples of cards with 1,4, 16, and 36 items;eight cards for each display size in each set containeda target (positive), and eight cards did not (negative).The items were arranged in regular rectangular matrices( 2 X 2 , 4 X 4 , 6 X 6 ) with a 1.3:1 ratio of width toheight. Displays containing only one item used one lo-cation from the 2 X 2 matrix, taking each of the fourlocations equally often. The average distance of itemsfrom the fixation point was 2.14° for all display sizes.Each item subtended .57°. Target positions were chosenso that the mean distance from fixation for targets wasequal to the mean distance from fixation for the dis-tractors in each set. Targets were placed equally oftenin each quadrant of the matrices. If negative conjunctiondisplays of four items had always contained equal num-bers of each distractor type, it would have been possiblefor subjects to use an easily subitized number of featuresas a cue to the presence of the target. To avoid this, wemade an equal number of positive displays consistingof (a) two red Hs, a green X and a green H, and (b)two green Xs, a red H and a green H. Similarly, therewere equal numbers of negative displays with two red

Hs and two green Xs and with three of one color orshape and one of the other. A 3:1 ratio of green to redor of Hs to Xs could therefore not be a cue to the pres-ence of a target. The same distractors were used in thefeature target condition.

For the grouped condition, the same matrix of 36positions was used and all displays contained 36 items.However, the red Hs and the green Xs were arrangedin varying numbers of homogeneous groups: They wereeither randomly placed (as 36 separate items), orgrouped in horizontal pairs of identical items (to makea checkerboard of 18 groups), or grouped into 9 alter-nating squares of 2 X 2 identical items (to make 9groups), or grouped in 4 alternating squares of 3 X 3identical items (to make 4 groups), or formed a singlegroup of identical items, either all green Xs or all redHs. The green H conjunction target or the blue or Sdisjunctive target in positive displays replaced one item,equally often in each quadrant and in each type of group(red Hs and green Xs).

The cards were shown in a two-field Cambridge ta-chistoscope. The onset of each display started a milli-second timer, which stopped when the subject pressedeither of two response keys.

Procedure. One group of subjects was tested onlyon conjunction targets and one group only on featuretargets. They were tested in two sessions consisting ofone block of varied display size, followed by one blockof grouped displays on the first day and then one blockof grou.ped displays and one of varied display sizes onthe second day. Each block consisted of eight trials withand eight without a target at each display size or ineach grouping condition. Subjects were informed of thenature of the displays in both conditions and were shownexamples of the cards. In the varied display size blocks,trials with displays of 1, 4, 16, and 36 were randomlymixed, as were trials with varying numbers of groupsin the grouped display blocks.

The instructions were to decide as quickly as possible,while minimizing errors, whether the display containeda target and to press the right-hand key if it did andthe left-hand key if it did not. Subjects were given a fewtrials in each condition for practice before beginning theexperimental trials.

Results and Discussion

The mean search times are shown in Fig-ure 1, and details of the linear regressionsare given in Table 1. The results with varieddisplay size confirm our previous finding(Treisman & Gelade, 1980). Conjunctiontargets give strictly linear functions relatingsearch time to display size, and^a 2:1 ratioof negative to positive slopes. This patternsuggests a serial self-terminating scan andis consistent with the hypothesis that focusedattention is directed to each item in turn tocheck the conjunction of color and shape.The search rate is similar to the rate we ob-tained earlier with a green T target, green

GROUPING AND OBJECT PERCEPTION 201

2400

^/)§ 2000

8IUV)

3 1600

I

u. 1200

800

400

CONJUNCTIONS FEATURES

Grouping

Display Size

16 16 36 DISPUY SIZE

9 18 36 1 4 9 18 36 NUMBER Of GROUPS

Figure I. Mean search times for feature and conjunction targets in Experiment 1.

1 4

1 4

36

36

1 4

1 4

X, and brown T distractors spread haphaz-ardly over the larger area of 15° by 8°.

The feature targets again showed muchless effect of display size, averaging a searchrate of only 2 msec per item on positive dis-plays compared to 28 msec for conjunctions.For color targets, the effect of display sizedid not reach significance. The functionswere also less linear (linearity accounting foronly 85% and 80% of the variance due todisplay size for color and for shape, respec-tively) compared to almost 100% for con-junctions. The slope ratios for features weremuch larger than 2:1. Again these resultsreplicate our earlier finding and suggest thatdisjunctive features, when present, can bedetected in parallel across the display. Notethat with single-item displays, there is nodifference between feature and conjunctionlatencies. This makes it unlikely that thedifference is due to the discriminability ofthe two types of target. The difficulty withconjunctions arises only when more than oneitem is presented and the load on attentionis increased.

Individual subjects varied considerably intheir search rates; the slopes of response timeagainst display size ranged from 23 to 41(SD = 5.3) for conjunction positives andfrom 38 to 72 (SD = 10.7) for conjunctionnegatives. In every case, however, the pro-portion of the variance with display size that

was due to linearity was over 99%. In thefeature search condition, the positive slopesranged from -1.0 to 4.6 (SD = 1.6) and thenegative slopes ranged from 2.5 to 28.0(SD = 7.9). The proportion of the variancedue to linearity ranged from 6% to 99% with85% for the median subject.

The main interest in this study was in theeffect of grouping. With feature targets inpositive displays, there is little scope forgrouping to have an effect, since targets aredetected only 60 msec to 80 msec moreslowly in displays of 36 items than in displaysof 1. The effect of grouping did not reachsignificance for the positive displays; how-ever, it was significant for the negatives, F(4,28) = 5.43, p = .002. Farmer and Taylor(1980) reported a similar result: Searchtimes for a target defined by color were un-affected by the color grouping of distractorson positive trials but showed a substantialeffect of heterogeneous mixing of distractorson negative trials. It seems that on negativetrials, subjects adopt the strategy of scan-ning the display more carefully to confirmthe absence of the target. They then showsome effect both of display size and of group-ing, although these are an order of magni-tude smaller than with conjunction targets.

In conjunction search, on the other hand,the number of groups has a striking effect.The results show a major decrease in search

202 ANNE TREISMAN

Table 1Analysis of Mean Search Times in Experiment 1

Condition

ConjunctionPositiveNegative

P<

.001

.001

Slope %V«

Display size

28.2 > 9958.0 > 99

Sloperatio

.49

Int.

486511

%errors

6.74.5

FeatureShape positiveColor positiveNegative

.001—

.001

2.41.78.6

858096

.28

.20521539589

8.77.24.5

ConjunctionPositiveNegative

.001

.00121.545.8

Grouping

75"82"

Note. Int. = Intercept." Percentage of variance with, display size due to linearity.b Departure from linearity significant (p < .01).

.47 8091,202

8.62.7

FeatureShape positiveColor positiveNegative

——.002

.62.25.9

159977

.10

.37527552640

4.28.72.9 '

time as the groups become larger and fewer,until with a homogenous display of 36 itemsthe positive search time is only slightlylonger than with a single target item. It isclear that subjects do not serially check eachof the 36 items when these are perceptuallygrouped. However, the search time does notdecrease linearly with the number of groups.It decreases much more slowly at first, asthe number of groups drops to 18 and thento 9 and then more steeply as it drops to 4and then to 1. Subjects benefit less fromsmall groups (of 2 or 4 items) than theyshould if they were serially checking groupsof items. One reason might be that focusingattention accurately enough to include morethan one homogeneous item but to excludeany heterogeneous items is itself a time-con-suming operation. The smaller the groupsthe less perceptually salient they become.

Another possibility is that in randomlymixed displays, subjects also perceptuallygroup items into clusters averaging 2 or 3homogeneous items that happen to be spa-tially adjacent to each other. This would

change the interpretation of the slope in thevaried display size condition; it should thenbe understood not as 58 msec per serialcheck of each item in turn, but, perhaps, as116 msec per serial check of each group of2 items in turn, or 174 msec per group of3 homogeneous items. If this hypothesis iscorrect, the grouped condition would showno difference between search times for 18groups and for 36 randomly mixed items,simply because the 36 items are in facttreated as approximately 18 groups. As thegroups in the grouped condition grow largerthan the randomly occurring groups in het-erogeneous displays, the search times startto reflect their presence.

These results differ from those of Kahne-man and Henik (1981), who found no effectof grouping on latencies for target detection.There are many differences between the ex-periments: the displays in Kahneman andHenik's experiment were presented for afixed, short interval (200 msec), whereasours remained available until the subject re-sponded. Perhaps more important is the fact

GROUPING AND OBJECT PERCEPTION 203

that their targets were digits in digit dis-tractors, and the basis they used for orga-nizing groups was a spatial discontinuity, aproperty that was irrelevant to the definitionof the target. In terms of feature-integrationtheory, there would be no gain in Kahnemanand Henik's task from focusing attentionserially on groups: If the target digit hap-pened to have a unique feature, it would bepicked up in parallel across the whole dis-play; otherwise it would require serial check-ing, item by item, regardless of whether theitems were spatially grouped.

Experiment 2: Visual Search Betweenand Within Perceptual Groups

Experiment 1 explored the effect of per-ceptual grouping on displays of constantsize. The number of groups therefore variedinversely with the number of items pergroup. The next experiment varied these twofactors orthogonally to compare directlytheir effects on search time. The theory pre-dicts that when groups are homogeneous,subjects carry out a self-terminating series(one per group) of feature checks, which areparallel within each group of homogeneousitems. The total search time should thereforereflect the number of groups to be checkedmultiplied by the time taken for a featurecheck of a group of that size. The search ofgroups should be self-terminating, while thesearch of items within groups should be ex-haustive up to the last group searched.

MethodSubjects. Two groups of nine subjects were tested,

four men and five women in each. They were students,aged 16-30, who volunteered to take part in the exper-iment and were paid $3 an hour. They had not takenpart in Experiment 1.

Stimulus materials. We used the same colored let-ters as in the conjunction condition of the previous ex-periment. They were spread over a larger area however,and displays of up to 81 letters were included. Thegroups in this experiment were spatially separated byempty space as well as segregated by the color and shapeof their items. There were nine types of display, com-prising all the combinations of one, four, and nine groupsand one, four, and nine letters per group. Examples offour types are shown in Figure 2. The groups were ar-ranged in a 3 X 3 square matrix with the outer cornersat 6.9° from the center. Each group of nine letters sub-tended 2.35°, and the total display of 81 letters sub-tended 9.67° X 9.67°. Letters within a group were ar-

ranged in a 3 X 3 matrix, or in a smaller 2 X 2 matrixkeeping the same interletter spacing as the 3 X 3 matrix,or a single letter was placed at the center position withina group. For displays containing only one group, eachof the nine group positions was used equally often; fordisplays containing four groups, the positions of thegroups were chosen randomly from the nine possible,except that each position was used equally often and nocombination of four positions was used more than once.

The distractors within each group were always ho-mogeneous, comprising either green Xs or red Hs. Therewere equal numbers of green X and of red H groupsacross the complete set of displays. Within the displaysof four groups, there were two of each kind, and withinthe displays of nine groups, there were four of one kindand five of the other. Their positions were randomlymixed in the matrix of nine group positions. Nine dis-plays of each type contained no target. Nine other dis-plays had one item replaced by a green H. In five dis-plays of each type it replaced a green X and in four itreplaced a red H. It appeared once in each group andonce in each position within the groups of nine, twiceor three times in each position within the groups of four.The stimuli were again presented in a Cambridge two-field tachistoscope, and subjects' responses were timedwith a millisecond timer.

Procedure. Two groups of nine subjects each wereused in this experiment. One group was given all thedisplay types randomly mixed together. The other groupwas given separate blocks containing the displays withone letter per group, four letters per group, and nineletters per1 group to see whether different strategieswould be adopted. The order of these three conditionswas counterbalanced across subjects and reversed forthe second session of each subject.

The procedure was otherwise the same as in Exper-iment 1. The task was to search for the conjunctiontarget (green H) and to press the right-hand key if itwas present and the left-hand key if it was not.

Results and Discussion

There were no significant differences be-tween the results of the two groups of sub-jects (those who had all the trial types ran-domly mixed and those who had displayswith different numbers of letters per groupin separate blocks). It seems that any strat-egy differences were minor and insignificant.

The theory predicts serial, self-terminat-ing search of groups and feature searchwithin each group. The feature search timewithin groups would be expected to show asmall effect of group size. In Experiment 1,search times on feature trials increasedslightly with display size. In the present ex-periment all groups except the last onechecked on positive trials would give nega-tive results; extrapolating from the negativelatencies in the feature trials of Experiment

204 ANNE TREISMAN

X X XX X XX X X

X X XX X XX X X

H H HgreerMH HJH

H H H

X X XX X XX X X

H H HHH HHH H

HH HH H HH H H

H H HH H HH H H

X X XX X XX X X

X X XX X XX X X

H H H-« green

HHHH

X X.X X

HHHH

X XX X

X X XX X XX X X

Figure 2. Examples of displays used in Experiment 2: (1) Nine groups of nine letters per group withtarget present; (2) four groups of four letters per group with target absent; (3) nine groups of one letterper group with target present; (4) one group of nine letters with target absent. (All Xs were green andall Hs were red with the exception of the green H targets, labeled in the figure, examples [1] and [3].)

1, we expect the effect of the numbers ofletters per group in Experment 2 to average8.6 msec per extra letter. There could alsobe an effect of group size on the intercept,for example, through changes in confidenceor in lateral masking. This makes the num-ber of parameters to be fitted rather large.However, three predictions can be testedagainst the data: (a) for each group size, theeffect of the number of groups on searchtimes should be linear; (b) for each groupsize, the ratio of positive to negative slopesfor groups should be 1:2; (c) the effect of thenumber of letters per group should be shownin the differences between the slopes forgroups of one, four, and nine letters; thiseffect should be small and nonlinear, ap-proximating 8.6 msec per letter. If, on theother hand, search were serial and self-ter-minating across items, regardless of group-ing, the slope for groups of nine lettersshould be nine times the slope for groupsof one.

The results, shown in Figure 3, clearly

reject the prediction of serial search of eachitem in turn. The ratio of the slope for nine-letter groups to the slope for one-lettergroups is closer to two than to nine. Oh theother hand, the results,are consistent withthe prediction that search of groups is serial,whereas search within groups, is parallel.Table 2 gives the best-fitting slopes and in-tercepts and the proportion of the variancewith number of groups that is due to lin-earity. The ratios of positive to negativeslopes approximate .5, and in each conditionat jeast 96% of the variance due to displaysize is accounted for by a linear function.Departures from linearity were significantonly in one case, positive displays with fourletters per group, F(\, 34) = 12.1, p < .01).When present, these departures all take thesame form, displays of four groups being alittle slower than expected. A likely reasonfor this disparity is that more eye movementsper group were required in these displays;the four groups were randomly dispersedover the nine possible locations, whereas

GROUPING AND OBJECT PERCEPTION 205

Response Time(in seconds)

800

Letters per GroupI 4 9

Positives fl c j.

Negatives {>_._o_.^

I1 4 9

Number of Croups

Figure 3. Mean search times in Experiment 2.

with nine groups all locations were filled.Analyzing separately' the negative displaysin which the four groups were located in twoadjacent rows or columns, we found a meanreduction of 50 msec for the four-lettergroups and 90 msec for the nine-lettergroups. This brings the mean reaction timesmuch closer to the predicted values on a lin-ear function.

The difference between the negative slopesfor groups of different sizes indicates a dif-ference of 72 msec between search timesthrough groups of nine homogeneous lettersand search times for a single letter. This isvery close to the 69 msec difference pre-dicted from the feature search results ofExperiment/1 and supports the claim thatsearch within each group consists of check-ing for single features.

The search rate through groups, on theother hand, is similar to but slightly slowerthan the rate for conjunctions in the un-grouped displays of Experiment 1, averaging69 msec per group compared to 58 msec peritem in the more densely packed displaysused earlier. Thus the present data appearto combine a serial conjunction search ofgroups with a parallel feature check withingroups.

Individual subjects again varied consid-erably in their search rates, but the general

pattern of results was similar for all. Positiveslopes for individual subjects ranged from10 to 48 (SD = 13) for displays with oneletter per group, from 36 to 76 (SD =16)for four-letter groups and from 48 to 205(SD = 45) for nine-letter groups; negativeslopes ranged from 33 to 186 (SD = 35) forone-letter groups, from 44 to 219 (SD = 44)for four-letter groups, and from 42 to 259(SD = 55) for nine-letter groups. Linearityaccounted for more than 98% of the variancewith display size in 60 of the 108 cases andfor less than 90% in only 10 cases.

Experiment 3: Spatial Densityand Search for Conjunction and

Feature Targets

A possible confounding variable in Ex-periment 2 is the spatial density of the items.The displays with nine groups of nine lettersused the whole area of 9.7° X 9.7°, whereasthe displays with one group occupied only2.35° X 2.35°. Eye movements could there-fore have contributed to the effect of groupsbut much less to the effect of items withingroups. The question of how spatial densityaffects search for conjunctions and for fea-tures is also of some general interest. Weclaim that serial search for objects definedby conjunctions of features is centrally ratherthan peripherally determined; it represents

Table 2Linear Regressions of Mean Search Times onNumber of Groups in Experiment 2

Lettersper

group Slope Intercept

Positive

149

325789

583654653

9896"99

Negative

73119145

638696771

989998

Note. Slope ratio for 1-letter groups = .44; for 4-lettergroups = .48; for 9-letter groups = .61 (M = .51).a % Variance with display size due to linearity.b Departure from linearity significant (p < .01).

206 ANNE TREISMAN

successive fixations of attention rather thanof the eyes. We expect an effect of spatialspread only if and when it affects the dis-criminability of features.

There are alternative accounts that wouldlink the difficulty of forming conjunctionsto the characteristics of peripheral vision,however. It is likely, for example, that lo-calization becomes less precise the more pe-ripherally the items are presented. Receptivefields are much larger in the periphery thanin the fovea.

Some findings by Beck (1972), Beck andAmbler (1973), and Ambler and Finklea(1976) are relevant here. Beck (1967) andBeck and Ambler (1973) found much worseperformance with distributed than with fo-cused attention in a task requiring discrim-ination of line arrangements (e.g., T vs. L)but not in a task involving line orientation(T vs. X). However, Ambler and Finklea(1976) found that the difference betweenfocused and distributed attention for linearrangements was absent or much reducedwhen the stimuli were presented foveally,suggesting that the difficulty is character-istic only of peripheral vision. More recently,Ambler, Keel, and-Phelps (1978) obtainedthe opposite results in a better-controlledexperiment. With line arrangements, per-formance deteriorated as the number ofitems increased, even with foveal presenta-tion, whereas with line orientation, parallelprocessing was possible in both cases. Itseemed worth investigating with color-shapeconjunctions whether retinal area was rele-vant or whether search depends, as we pre-dict, on a more central attentional scan.

We also compared the effects of the spa-tial density of the display on search for fea-ture targets. Here search should be parallelrather than serial in both central and pe-ripheral vision; the only differences shouldbe due to poorer acuity with peripheral rel-ative to central vision. Eye movements mightbecome necessary when features are hard todiscriminate and could introduce an effectof display size on sparsely scattered itemsthat would be absent with foveal displays.

MethodSubjects. The eight subjects (three male and five

female) were students at high school or university aged

between 16 and 25. They volunteered for the experimentand were paid $3 an hour.

Stimulus materials. Two sets of stimulus cards wereused; a sparse set and a dense set. The dense set wasthe cards used in Experiment 1 for conjunction and forfeature search in the varied display size condition; thesparse set was newly constructed to match in everythingbut spatial location. On both sets of cards, the itemswere arranged in regular rectangular matrices (2 X 2,4 X 4, 6 X 6) with a 1.3:1 ratio of width to height. Theitems in the sparse set for each display size were onaverage 4.28° from the fixation point; the items in thedense set for each display size were on average 2.14°from the fixation point. All the letters subtended .57°.Target positions were chosen so that the mean distancefrom fixation for targets was equal to the mean distancefrom fixation for the distractors in each set. Targetswere placed equally often in each quadrant of the ma-trices. The apparatus and task were the same as thosein Experiment 1.

Procedure. The four conditions, dense and sparseconjunction search and dense and sparse feature search,were run in four separate blocks with the order coun-terbalanced Across subjects as follows: Half the subjectsstarted with conjunctions and were tested for two ses-sions, one starting with a block of 64 dense trials, thena block of 64 sparse trials, and the other starting with64 sparse trials followed by 64 dense trials. These sub-jects were then tested in the same way on two furthersessions with feature targets. The other half of the sub-jects started with feature targets and then did the con-junction targets. The different display sizes, 1, 4, 16,and 36, were randomly mixed within blocks, as werepositive and negative trials.

Results

The mean correct response times in eachof the four conditions are shown in Figure4. Table 3 shows the slopes and the inter-cepts relating response times to display sizein each condition. The slopes for individualsubjects ranged from 22.8 to 45.6 for theconjunction positives (SD = 5.5) and from44.5 to 86.9 (SD = 14.4) for the conjunctionnegatives. In all but three of the 32 cases theproportion of the variance with display sizethat was due to linearity was over 98%. Forthe feature targets, individual slopes rangedfrom 1.8 to 7.3 (SD = 1.8) for the positivesand from -.6 to 37.6 (SD = 10.1) for thenegatives. Twenty-one out of 32 slopes hadless than 98% variance atrributable to lin-earity, the lowest being 15%.

Analysis of variance (ANOVA) on the con-junction positive trials showed a small butsignificant effect of density, F(l, 7) = 6.72,p = .04, but no interaction between displaysize and density F(3, 21) = 1.0. Thus, for

GROUPING AND OBJECT PERCEPTION

CONJUNCTIONS FEATURES

207

28OO —

24OO-

2OOO —

1600-

12OO —

8OO —

4OO-

Pbs. Neg.

I I1 4 36 1 4

DISPLAY SIZE

36

Figure 4. Mean search times in Experiment 3.

positive displays, doubling the spatial densityslightly speeded up the decision times buthad no effect on the rate of serial search forconjunction targets. On negative trials, den-sity did interact with display size, F(3,21) =4.78, p = .01, as well as affecting the overalldecision time, F( 1,7) = 11.86,/> = .01. Sub-jects took an extra 5 msec per item on av-erage, perhaps reflecting some additionalrechecking before the decision that there wasno target.

For feature targets there was a smalloverall effect of density on search for posi-tives, F(l, 7) = 14.5, p < .01, and a largereffect on negatives, F(l, 7) = 28.8,;? = .001,and in both cases a significant interactionwith display size, F(3, 21) = 3.79, p < .05,andF(3, 21) = 26.06, p< .001, for positivesand negatives, respectively.

Discussion

The results with conjunction targets againconfirm our previous findings of linear func-tions relating search time to display size anda 2:1 ratio of negative to positive slopes. Theaim of the present experiment was to see

whether search rates were affected by thespatial spread of items in the display. Thefact that doubling the area to be searchedhad little or no effect on search rates is con-

Table 3: Linear Regressions of Mean SearchTimes on Display Size in Experiment 3

Displaytype

DensePos.Neg.

SparsePos.Neg.

Slope

28.562.3

30.667.7

Int.

Conjunction

482486

495523

Sloperatio

.46

.45

%V"

>99.9>99.9

>99.9->99.9

Feature

DensePos.Neg.

SparsePos.

' Neg.

3.413.0

5.924.7

539603

559638

.26

.24

96.499.3

99.599.8

Note. Int. = Intercept. Pos. = positive. Neg. = negative.a Variance with display size due to linearity.

208 ANNE TREISMAN

sistent with the hypothesis that the serialscan is centrally controlled and representssuccessive fixations of attention rather thanperipheral fixations of the eyes. Eye move-ments were certainly unnecessary with thedense targets, which were all within 2.2° ofthe fovea and well within the acuity limitsfor discrimination, as shown by the speed ofdetecting the feature targets.

The only large effect of density was shownon feature negative trials, where the decisionthat no target was present was made muchmore slowly with the sparse than with thedense displays. Subjects may have been lessconfident that the target was absent whenthe distractors were peripheral than whenthey were centrally located. The increasedtime in this condition could reflect increasedeye movements and fixations. The fact thatfeature targets, when they were present inthe sparse displays, were still detected easilyand with little effect of display size showsthat the slow checking of negative displayswas actually unnecessary. However, it ap-pears to be a typical strategy with featuresearch under difficult or confusable condi-tions (cf. Treisman & Gelade, 1980, Exper-iments 3 and 4).

The absence of much effect of density onconjunction search contrasts with the sub-stantial effect of discriminability that weobtained in an earlier experiment (Treisman& Gelade, 1980). The speed of identifyingthe features that define a conjunction targetshould affect each serial check in the con-junction search and should therefore be re-flected in slope differences. The resultsconfirmed this prediction: The rate forconjunctions of easy features (red, green, O,and N) was more than double the rate forconjunctions of more difficult features (blue,green, X, and T). On the other hand, wehave little reason for the theory to expectslope differences in the present experimentwhere only spatial density was varied, sincethis should effect the time taken for eachserial check only if the shapes (X and H)or the colors (red and green) were harder todistinguish in the outside positions of sparsedisplays.

The results are consistent with Ambler etal.'s (1978) most recent findings, and notwith the earlier ones of Ambler and Finklea

(1976), and extend their conclusion fromline arrangement to conjunctions of valueson the different dimensions of shape andcolor. The conclusions may also be relevantto an alternative account of errors in objectperception proposed by Wolford (1975) inhis feature perturbation model. Wolfordsuggested that features become locally dis-placed over time and produce errors of iden-tification among neighboring items. Sincethese perturbations are local, they shouldaffect dense displays more than the sparseones, making them more prone to error andtherefore inducing slower, more carefulsearch. The fact that search rates hardlydiffered in our experiment suggests that spa-tial distance between features may be irrel-evant. However, one would need to varydensity and distance from the foveaindependently of each other before drawingany firm conclusion, since the two variablesmay have opposite effects.

Experiment 4: The Group Boundary Effecton Visual Search for Conjunctions

and Features

The final prediction we test in this articleis the most direct test of the idea that preat-tentive organization exists only within di-mensions—within a color map, within ashape map, within a map of movements ororientations-and that these maps are relatedto each other where and when attention isfocused.

If the generally accepted view were cor-rect, the early level of preattentive organi-zation should result in a single, global con-figuration to which all the different dimen-sions contribute according to their salience.A target should emerge spontaneously when-ever it is locally salient as figure againstground. On this view, the conjunction targetswere difficult to find in our earlier experi-ments (Treisman & Gelade, 1980) becausethe preattentive organization was too com-plex; the colors and shapes were randomlyintermingled. If, however, we introduce lo-cally homogeneous groups within which aconjunction target contrasts on a highly dis-criminable feature, it should "pop out" byautomatic preattentive segregation. The re-

GROUPING AND OBJECT PERCEPTION 209

suits so far described refute this prediction.The conjunction target did not emerge au-tomatically from grouped displays, evenwhen the groups were few and the targetvery salient in its local context. Feature-in-tegration theory, on the other hand, pre-cludes conjunction "pop out." If texture-seg-regation occurs only within each separatefeature map, focal attention will be requiredbefore the different feature maps (e.g., thecolor map and the orientation map) can beintegrated or interrelated to locate the uniqueconjunction of color and shape.

The further possibility then arises that wemight effectively camouflage an object byplacing it at a boundary between two groups,each of which shares one of its features. Wecan choose an object that, within eithergroup alone, would be quite salient and seeif adding the second group makes it harderto see. For example, we can place a red Xat the boundary between a group of red Osand a group of blue Xs. Figure 5 shows anexample. In order to detect the presence ofthe red X at the boundary, we predict thatattention must be narrowed down to excludethe adjacent red Os and blue Xs and to focuson the item itself, although in either groupalone it would be salient and easy to detect.When the target is in the center of a group,attention need be narrowed only to the groupwhich contains it. Thus detection should befaster for a conjunction target placed in thecenter of a group than at the boundary be-tween groups, although it should be slowerin both cases than with one group only. Onthe other hand, if the target is defined by aunique feature (the color green or verticallines), we would expect its detection to beindependent of its position relative to theboundary.

The displays in this experiment consistedof just six items arranged in a row. Therewere always two, three, or four blue Xs andfour, three, or two red Os homogeneouslygrouped. In half the displays one item in agroup of three or four was replaced eitherby a conjunction target (a red X or blue O)or by a feature target (a red or blue H ora green X or O). The task was to detectwhether the display contained an odd itemthat was not a blue X or a red O. The targetwas placed either at the boundary between

TARGET AT BOUNDARY

[XOOO

OOOTOI

TARGET IN CROUP

oo©o

raooo

Figure 5. Examples of displays used in Experiment 4.(The solid letters represent red, the outline letters blueand the dotted letter green.)

the blue Xs and red Os or one position awayfrom the boundary.

MethodSubjects, There were 10 subjects (6 women and 4

men), all students at the University of British Columbia,who volunteered for the experiment and were paid $3an hour.

Stimulus material. Each display contained a row ofsix colored letters, each subtending 1.18 °, with the wholerow subtending 8.5°. There were 30 negative displays,equal numbers containing two, three, and four blue Xson one side of a boundary and four, three, and two redOs on the other. Half of each type had the red Os onthe left and half on the right. The positive displays weredivided into four categories, depending on whether thetarget was a conjunction target (red X or blue O) ora feature target (green X, green O, blue H or red H)and on whether the target was next to the boundary orone item away from it. The target occurred equally oftenin positions 2, 3, 4, and 5, and never in positions 1 and6. It never replaced one of a group of two distractors,since this would have resulted in two unique items forthat, display. These variations produced 8 displays foreach of the four categories of positive display. Thus thecomplete pack, with the 30 negative displays, consistedof 62 displays.

Procedure. The task was to decide, as quickly aspossible without-making errors, whether the display con-tained a unique item (i.e., one which was neither a blueX nor a red O). Subjects pressed the right-hand key ifit did and the left-hand key if it did not. They were toldwhat the six unique items could be and were shownexamples of the positive and negative displays. Theywere given a few trials for practice before beginning theexperiment proper. Each subject was tested on fourblocks of 62 trials.

Results and Discussion

The mean reaction times and the meanpercentage of errors are shown in Table 4.

210 ANNE TREISMAN

Reaction times that were more than 3 stan-dard deviations from the mean for a givensubject in a given condition were discarded;these averaged .5%. An ANOVA showed sig-nificant effects of target type: conjunctionversus feature, F(l, 9) = 34.28, p = .0002;of target position (at the boundary or cen-tered in one group), F(l, 9) = 14.97, p =.004, and a significant interaction betweentarget type and target position, F(l, 9) =11.72, p = .0076. Newman-Keuls tests werecarried out to compare specific pairs ofmeans and are reported in context below.

There are two main findings: first, a cam-ouflaging effect of placing a conjunction tar-get at the boundary between groups; second,the fact that even within a homogeneousdistractor group, the conjunction target isdetected more slowly than the feature tar-gets.

Conjunction targets at the boundary weredetected 135 msec more slowly than featuretargets at the boundary (p < .01). Conjunc-tion targets were also more slowly detectedat the boundary than in the center of a group(p < .01), whereas for feature targets therewas no difference. Thus a unique feature,whether color or shape, can be detectedequally well at a boundary between two dif-

ferent groups and when placed between iden-tical items. A conjunction target, on theother hand, is harder to detect when placedat a boundary between items that on one sideshare its color and on the other its shape.Subjects missed it altogether on 9.3% trials(compared to a mean of 2.6% for all otherconditions) and took considerably longer(104 msec) to find it when they were suc-cessful. Phenomenologically, the conjunc-tion target is effectively camouflaged at theboundary unless attention is directed specif-ically to its location. There appear to be twocompeting ways of preattentively groupingthis display—one within the color map andone within the shape map. The conjunctiontarget exists in neither, although the featuretarget is always unique in one of the two. Inorder to see the conjunction target, the twoautomatic groupings must be broken. Thisinvolves relating shapes to colors and ap-pears to require an extra operation that takestime and focuses attention specifically on thelocation of the unique conjunction.

A second inference can be made by com-paring detection of a conjunction targetwithin a group with detection of a featuretarget within a group. Even when the targetis removed from the group boundary, it is

Table 4Mean Reaction Times, Standard Deviations, and Percent Errors in Experiment 4

Reaction times % Errors

Display

ConjunctionMSD

FeatureMSD

Feature shapeMSD

Feature colorMSD

NegativesMSD

Atboundary

769262

634206

625198

643216

681265

Ingroup

665195

623203

616179

630230

At InDifference boundary group

104 9.3 3.3

11 2.1 2.4

9 1.8 1.2

13 2.4 3.6

3.3

Difference

6.0

T

.6

-1.2

GROUPING AND OBJECT PERCEPTION 211

detected more slowly when defined by a con-junction of features than by a unique feature(p < .05). This is unlikely to result from adifference in discriminability at the featurelevel: The conjunction features (red and blueand X and O) are less confusable with eachother on average than they are with greenor with H. If discriminability at the featurelevel had been matched, we would expect,if anything, a larger advantage of the featureover the conjunction targets.

Thus, the presence of distractors else-where in the display, which share the locallydistinctive feature of the conjunction target,forces attention to narrow down at least suf-ficiently to exclude them. This conclusionwas also implicit in the results of the earlierexperiments. In Experiment 2, for example,the time to find a green H in an otherwisehomogeneous group of eight red Hs wasmore than 800 msec longer when there werefour other groups of red Hs and four of greenXs present in the display than when the sin-gle group of eight red Hs with its green Htarget was presented alone (in an unpre-dictable location). Here again the presence,even spatially segregated some distance awayfrom the target, of distractor letters thatshared the target color acted as potent cam-ouflage, destroying the salience of the localcolor contrast, which would otherwise havemade the target impossible to miss. The con-junction target apparently does not "popout" of the display as a whole, however ho-mogeneous its local context.

Although grouping variables had little orno effect on feature targets over the rangeof conditions we tested here, it would prob-ably be possible to devise displays in whichthey did: For example, in a display with 60%red and 20% each of blue and of yellow dis-tractors, we could probably make a greentarget more salient by embedding it in ahomogeneous group of red distractors thanby embedding it in a mixed group of blueand yellow distractors. However, local con-trast and discriminability interactions mightaccount for all or part of this difference. Atpresent, the claim should simply be that theeffects of grouping are both greater and alsodifferently mediated when conjunctions mustbe specified than when features will suffice.

General Discussion

We began with three premises: (a) Tex-ture segregation and perceptual grouping areearly preattentive processes, which are de-signed to specify candidates for object iden-tification (Neisser, 1967); (b) perceptualgroups are organized separately within eachdimension, and the different configurationsformed within the color and the shape mapsare interrelated only when attention is nar-rowed onto an item or group; (c) focusedattention is necessary to identify objectswhenever these must be specified by con-junctions of separable features (Treisman& Gelade, 1980). This article describes fourexperiments that relate these premises byexploring the effects of spatial configura-tions on search for objects and for features.

We infer a link between grouping and thedistribution of attention from the pattern ofsearch times produced by different spatialarrangements of targets and distractors.Variations that changed the predicted riskof illusory conjunctions affected search times,whereas those that did not had no effect.Detection of feature targets was, in general,independent of the spatial arrangement ofdistractors, although negative trials wereslower when distractors were more distantfrom the fovea. Search for conjunction tar-gets was serial across items when items wererandomly mixed and was hardly affected bytheir spatial density. It is unlikely, therefore,to be causally determined by eye movements,although it is certainly the case that subjectsmoved their eyes during the 1,200 and 2,600msec they spent searching the displays of 16or 36 items and presumably moved themmore often during the longer search latenciesthan during the shorter ones. Perceptualgrouping, on the other hand, had a very sub-stantial effect on conjunction search. Searchappeared to be serial across groups of itemswhenever the display contained spatially ho-mogeneous groups. We predicted and ex-plain this result by the claim that when thestructure of the display allows attention tobe spread over several homogeneous itemswithout risking an illusory target, the fea-tures of those items will be checked inparallel. Prinzmetal (1981) has obtained

212 ANNE TREISMAN

converging evidence consistent with this hy-pothesis. He found that subjects were morelikely to interchange features (e.g., movinga diameter from one circle to another) withina perceptual group than between perceptualgroups, even when the physical distance wasgreater within than between groups. His dis-plays were too .brief to allow serial attentionto each item in turn. If attention were at-tracted, at least on some proportion of trials,to one or other perceptual group, we wouldexpect illusory conjunctions to form morefrequently within than across the groupboundaries.

The contrast between attending to singleitems and attending to groups is clearly re-lated to the difference between local andglobal processing. When we attend to theglobal configuration, each local elementloses its separate identity, except insofar asit contributes to the identity of the whole.Thus, for example, the location of each dotin a row is specified at the global level onlyby the fact that it is colinear with the others.A change of processing level entails a changein the encoding of the same physical stim-ulus. Thus a change in the direction of cur-vature of the mouth becomes a change froma sad to a smiling face; at this global levelthe identity of the local cue may be lost.However, the existence of emergent prop-erties (Pomerantz, Sager, & Stoever, 1977)at the global level should place constraintson the acceptability of illusory conjunctionsbetween local features within an attendedgroup of items. Feature exchanges thatwould alter global as well as local unitsshould more often be rejected than those thatleave the global encoding unchanged. Thusa red X might be easier to detect and lesslikely to exchange its color or shape whenit is at a uniquely specified point in a globalconfiguration.

The claim that preattentive organizationoccurs only within and not between dimen-sions explains why conjunction targets areinvisible until attention is narrowed down tothe local group within which they possess aunique, distinctive feature. It also predictsthat embedding a conjunction target be-tween two competing groups should makeit more difficult to detect. Experiment 4 de-

monstrated this conjunction camouflage.Although in either group alone the targetwould be salient (a red H with red Os or ablue O with red Os), adding the secondgroup effectively concealed it at the preat-tentive level. The results suggest that wehave many separate preattentive worlds,each organized along its own lines and eachoffering its jpwn particular candidates forobjecthood. In normal life these worlds willtend to coincide, since physical objects havehighly correlated feature boundaries. If everthey disagree, we may discover the truth onlywhen we turn our attention to the misleadingcues.

It would be interesting to know whetherthe different feature maps are simulta-neously accessible. One way to test this isto see whether subjects detect a feature tar-get more quickly when they know which mapto consult than when either could reveal thetarget. In a supplementary experiment, eightnew subjects were tested on the feature cardsfrom Experiment 4 in two different condi-tions, one with the H and the green targetsrandomly mixed and one with the H and thegreen targets separated in different blocks.When subjects knew which feature to lookfor (green or H) they found it 68 msec fasterthan when they were looking for either, F(l,7) = 18.49, p = .005. However, there was nointeraction between known versus unknownfeature and the position of the target, sug-gesting that attention in the spatially selec-tive sense was not involved. Although thefeatures within both the shape and the colormap are registered in parallel, the two di-mensions may compete or alternate witheach other in their access to decision andresponse.

What is the relation between the preat-tentive grouping that segregates potentialobjects for focused attention and the phe-nomenal experience of a perceptually orga-nized world?, It follows from the present ac-count that preattentive parsing cannot beavailable to consciousness. We do not ex-perience shapeless color or colorless shapesbut always some combination of the two.Neisser (1967) appeared to equate preatten-tive organization with the representationthat we consciously experience of regions

GROUPING AND OBJECT PERCEPTION 213

outside the current focus of attention. Thushe suggested that preattentive processingguides our movements and creates a globalimpression of our surroundings. Other wri-ters have also operationally equated preat-tentive with phenomenological organization.For example, Banks and Prinzmetal (1976)asked subjects to indicate the phenomenalgrouping that they experienced in a displayand correlated this with the efficiency ofsearch for target items in the same display.It seems empirically plausible, although notlogically necessary, that preattentive featuresegregation should also determine experi-enced organization. However, our claim isthat the separate preattentive feature mapsare not themselves directly accessible to con-scious awareness. Conscious access must fol-low a level of resynthesis at which colors areallocated to shapes and movements to ob-jects, however imprecisely defined these fea-tures may be. Whether the combinations arecorrect or not will depend (a) on whetherattention was focused on the item or groupsin question and (b) whether, in the absenceof attention, prior expectations and knowl-edge of the context can select the possibleor plausible features to conjoin.

A final question concerns the nature of theattention spotlight. What are the constraintson the shape it can adopt? Can it focus ona concave or indented shape like a cross, ormust the edges of the spotlight be convex?Can it lock onto a group of moving stimulior continuously vary its shape to track achanging configuration of elements overtime? Can it in fact be spatially split whenits target is a moving group interspersed withstationary stimuli? Does it conform to all theGestalt grouping principles or is it con-strained by rules of its own? If the functionof both preattentive grouping and focusedattention is the same—to specify objects—we might expect the same laws to determineboth. The paradigm of conjunction searchcould perhaps be used to throw more lighton these issues.

ReferencesAmbler, B., & Finklea, D. L. The influence of selective

attention in peripheral and foveal vision. Perception& Psychophysics. 1976, 19, 518-526.

Ambler, B. A., Keel, R., & Phelps, E. A foveal dis-criminability difference for one vs. four letters. Bul-letin of the Psychonomic Science, 1978, //, 317-320.

Banks, W., & Prinzmetal, W. Configurational effectsin visual information processing. Perception & Psy-chophysics, 1976, 19, 361-367.

Beck, J. Effect of orientation and of shape similarity onperceptual grouping. Perception & Psychophysics,1966, ;, 300-302.

Beck, J. Perceptual grouping produced by line figures.Perception & Psychophysics, 1967, 2, 491-495.

Beck, J. Similarity grouping and perceptual discrimi-nability under uncertainty. American Journal of Psy-chology, 1972, 85. 1-19.

Beck, J., & Ambler, B. The effects of concentrated anddistributed attention on perceptual acuity. Perception& Psychophysics, 1973, 12, 482-486.

Caelli, T. M., Julesz B., <%. Gilbert, E. N. On perceptualanalyzers underlying visual texture discrimination:Part II. Biological Cybernetics, 1978, 29, 201-214.

Eriksen, C., & Hoffman, J. Temporal and spatial char-acteristics of selective coding from visual displays.Perception & Psychophysics, 1972, 12, 201-294.

Farmer, E. W., & Taylor, R. M. Visual search throughcolor displays: Effects of target-background similar-ity and background uniformity. Perception & Psy-chophysics, 1980, 27, 267-272.

Fox, J. R. Continuity, concealment and visual attention.In G. Underwood (Ed.), Strategies of informationprocessing. New York: Academic Press, 1978.

Garner, W. R. The processing of information and struc-ture. Potomac, Md. Erlbaum, 1974.

Julesz, B. Experiments in the visual perception of tex-ture. Scientific American, 1975, 232(4), 34-43.

Julesz, B. Spatial nonlinearities in the instantaneousperception of textures with identical power spectra.In C. Longuet-Higgins & N. S. Sutherland (Eds.),The psychology of vision. Philosophical Transactionsof the Royal Society, London, 1980, 290, 83-94.

Julesz, B. & Caelli, T. On the limits of Fourier decom-position in visual texture perception. Perception,1979,, 5, 69-73.

Kahneman, D., & Henik, A. Effects of visual groupingon immediate recall and selective attention. In S.Domic (Ed.), Attention and performance VI. Hills-dale, N.J.: Erlbaum, 1977.

Kahneman, D., & Henik, A. Perceptual organizationand attention. In M. Kubovy & J. R. Pomerantz(Eds.), Perceptual organization. Hillsdale, N.J.: Erl-baum, 1981.

Kubovy, M. Concurrent-pitch segregation and the the-ory of indispensable attributes. In M. Kubovy & J.R. Pomerantz (Eds.), Perceptual organization. Hills-dale, N.J.: Erlbaum, 1981.

Laberge, D. Attention and the measurement of percep-tual learning. Memory & Cognition, 1973, /, 268-276.

Neisser, U. Cognitive psychology. New York: Appleton-Century-Crofts, 1967.

Olson, R., & Attneave, F. What variables produce sim-ilarity, grouping? American Journal of Psychology,1970, 83, 1-21.

Pomerantz, J. R., Sager, L. C., & Stoever, R. J. Per-

214 ANNE TREISMAN

ception of wholes and of their component parts: Someconfigural superiority effects. Journal of Experimen-tal Psychology: Human Perception and Performance,1977, 3, 422-435.

Prinzmetal, W. Principles of feature integration in vi-sual perception. Perception & Psychophysics, 1981,30, 330-340.

Shiffrin, R., & Schneider, W. Controlled and automatichuman information processing: II. Perceptual learn-ing, automatic attending, and a general theory. Psy-chological Review, 1911,84, 127-190.

Townsend, J. T. Some results on the identifiability ofparallel and serial processes. British Journal ofMathematical and Statistical Psychology, 1972, 25,168-199.

Treisman, A., & Gelade, O. A feature integration theoryof attention. Cognitive Psychology, 1980,12, 97-136.

Treisman, A., & Schmidt, H. Illusory conjunctions inthe perception of objects. Cognitive Psychology,1982, 14, 107-141.

Treisman, A., Sykes, M., & Gelade, G. Selective at-tention and stimulus integration. In S. Dornic (Ed.),Attention and performance, VI. Hillsdale, N.J.: Erl-baum, 1977.

Wertheimer, M. Untersuchungen zur lehre von derGestalt. Psychologische Forschung, 1923, 4, 301-350.

Wolford, G. Perturbation model for letter identification.Psychological Review, 1915,82, 184-199.

Received June 26, 1981 •