Motion Processing: How Low Can You Go?

  • Motion Processing: How Low Can You Go?


    Richard J.A. van Wezel and Maarten J. van der Smagt

    Neurons at early stages in the visual system can only view small parts of the visual world, impeding their ability to determine correctly an object's motion direction. New studies suggest that this aperture problem is already solved by special neurons at the first stage of motion detection in primary visual cortex.

    An important task for the visual system is to determine the direction and speed of moving objects. At the first stages of cortical motion processing, in the primary visual cortex (V1), motion-sensitive neurons each respond to only a very small region of the visual field (Figure 1). This means that, for most of these neurons, the object's contours will extend beyond their small receptive fields. This would seem to present a problem, as these neurons only measure the motion component perpendicular to the moving contour, which does not necessarily have the same direction or speed as the whole moving object. This is what has become well known as the aperture problem [1,2]. Now Pack et al. [3] have reported that a certain class of primary visual neurons, the so-called end-stopped cells, responds in a direction-selective manner specifically to the end-points of a moving contour, irrespective of its orientation. These neurons should in principle be capable of solving the aperture problem.
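    The geometry behind the aperture problem can be illustrated with a small numerical sketch. It is not taken from the studies discussed here; the function names and the example values are assumptions chosen for illustration. The sketch projects an object's true velocity onto the normal of a straight contour, which is the only component a detector viewing the contour through a small aperture can measure.

```python
import math

def normal_component(velocity, contour_angle_deg):
    """Return the part of `velocity` that is perpendicular to a straight contour.

    velocity: (vx, vy), the true velocity of the object (illustrative units).
    contour_angle_deg: orientation of the contour, in degrees from the x-axis.
    """
    theta = math.radians(contour_angle_deg)
    # Unit vector perpendicular to the contour.
    nx, ny = -math.sin(theta), math.cos(theta)
    # Project the true velocity onto that normal.
    speed_along_normal = velocity[0] * nx + velocity[1] * ny
    return (speed_along_normal * nx, speed_along_normal * ny)

def direction_deg(v):
    """Direction of a 2-D vector in degrees, measured from the x-axis."""
    return math.degrees(math.atan2(v[1], v[0]))

# Hypothetical example: the object moves rightward, the contour is tilted 45 degrees.
true_velocity = (1.0, 0.0)
seen = normal_component(true_velocity, 45.0)
print(direction_deg(true_velocity), direction_deg(seen), math.hypot(*seen))
```

    In this example the measured motion points 45 degrees away from the true direction and is slower than the object, so both direction and speed are misjudged from the contour alone.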

    Neurons in V1 respond to contours of a specific orientation. Most of these neurons, especially the so-called complex cells, are also direction selective. This means that they are excited when the contour is moving in one direction and inhibited by motion in the opposite direction. Most of those complex cells respond well to short contours or endings of contours, but are suppressed by longer contours. This characteristic of these neurons is referred to as end-stopping. V1 contains neurons with a whole range of degrees of end-stopping [4]. Strongly end-stopped cells were identified in the pioneering days of neurophysiological recordings by Hubel and Wiesel [5], who referred to them as hyper-complex cells.
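    As a rough illustration of what end-stopping means for length tuning, the toy function below may help. It is a deliberately simplified sketch, not a model from the cited work: the piecewise-linear form and the parameters `center_length` and `flank_weight` are assumptions. Excitation grows with bar length until the bar fills the central zone, and any part of the bar extending beyond it suppresses the response.

```python
def end_stopped_response(bar_length, center_length=1.0, flank_weight=1.0):
    """Toy length-tuning curve (hypothetical, for illustration only).

    bar_length: length of the bar, in the same arbitrary units as center_length.
    center_length: extent of the excitatory central zone (assumed).
    flank_weight: strength of suppression from the end zones (assumed);
        0.0 gives a cell without end-stopping, larger values give stronger end-stopping.
    """
    excitation = min(bar_length, center_length)
    overshoot = max(0.0, bar_length - center_length)
    # Responses cannot drop below zero.
    return max(0.0, excitation - flank_weight * overshoot)

for length in (0.5, 1.0, 1.5, 2.0):
    print(length,
          end_stopped_response(length),                    # end-stopped cell
          end_stopped_response(length, flank_weight=0.0))  # non-end-stopped cell
```

    Varying `flank_weight` between 0 and 1 gives a crude picture of the range of degrees of end-stopping mentioned above.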

    A number of researchers in computational vision and human psychology [6,7] have suggested that neurons with end-stopping behavior are optimally suited to solving the aperture problem, as these cells respond well to contour-endings. As illustrated in Figure 1, for instance, an end-stopped neuron responds to the contour-ending of the wing of a plane in accordance with the perceived motion direction. Non-end-stopped cells, however, respond to the long contour of the wing but signal the wrong direction. In contrast with these theoretical considerations, neurophysiological evidence has tended to support the idea that the aperture problem is solved at a later stage of motion processing, where information from V1 neurons is combined. Single unit recordings have shown that a group of strongly direction-selective neurons in the middle temporal area (MT, also known as V5) that receive input from V1 have solved the aperture problem [8–10]. The paper by Pack et al. [3] seems to refute this generally accepted hierarchical model.

    Pack et al. [3] trained macaque monkeys to maintain visual fixation while a stimulus of white and black bars was presented on a grey background at the receptive field location of a recorded cell. The researchers used stimulus sequences that contained bars moving in different directions in combination with different orientations. End-stopped cells responded best when the bar endpoints were in the receptive field, and poorly when the bar was centred on, and extended beyond, the receptive field. The cells' preferred motion direction was independent of the bar's orientation. It is this independence of stimulus orientation that makes these cells suitable candidates for solving the aperture problem, without the need for further integration in area MT.

    Another recent study, by Tinsley et al. [11], has shown that a specific class of V1 neurons in the marmoset monkey responds to the pattern direction of a moving plaid. A plaid is composed of two sinusoidal gratings of different orientations added together. Even though the stimulus consists of two independently moving component gratings, it is perceived as a plaid moving in a single direction. For component gratings, the aperture problem holds: only the direction perpendicular to the orientation is perceived. For plaids, the aperture problem is solved: there is only one unique possible plaid direction.
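    Why a plaid has a single possible direction can be sketched with the standard intersection-of-constraints construction. This is a geometric illustration only, not a claim about how V1 or MT neurons compute pattern motion, and the function name and example values are assumptions. Each grating constrains the pattern velocity v through v · n = s, where n is the unit normal to the grating and s its normal speed; two non-parallel gratings give two such equations, which have exactly one solution.

```python
import math

def plaid_velocity(normal1_deg, speed1, normal2_deg, speed2):
    """Solve v . n1 = s1 and v . n2 = s2 for the 2-D pattern velocity v.

    normalX_deg: direction of the normal of grating X, in degrees.
    speedX: speed of grating X along its own normal.
    """
    a1, a2 = math.radians(normal1_deg), math.radians(normal2_deg)
    n1 = (math.cos(a1), math.sin(a1))
    n2 = (math.cos(a2), math.sin(a2))
    det = n1[0] * n2[1] - n1[1] * n2[0]
    if abs(det) < 1e-12:
        # Parallel gratings give only one independent constraint.
        raise ValueError("parallel gratings: the velocity is not uniquely determined")
    # Cramer's rule for the 2x2 linear system.
    vx = (speed1 * n2[1] - speed2 * n1[1]) / det
    vy = (n1[0] * speed2 - n2[0] * speed1) / det
    return vx, vy

# Hypothetical plaid: component normals at +45 and -45 degrees, equal normal speeds.
s = math.cos(math.radians(45.0))
vx, vy = plaid_velocity(45.0, s, -45.0, s)
print(vx, vy)
```

    With these values the two components drift obliquely, yet the only velocity consistent with both is purely rightward.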

    Current Biology, Vol. 13, R840–R842, October 28, 2003, © 2003 Elsevier Science Ltd. All rights reserved. DOI 10.1016/j.cub.2003.10.017

    Helmholtz Institute, Utrecht University, Padualaan 8, 3584 CH Utrecht, The Netherlands. E-mail:

    Figure 1. The aperture problem of motion vision. A non-end-stopped V1 neuron can signal an incorrect motion direction of a contour of a moving object. Through an aperture the contour seems to move perpendicular to the orientation of the contour, and this does not always coincide with the direction of the whole object. An end-stopped neuron signals the correct motion direction for a contour end. (Panel labels: non-end-stopped cell; end-stopped cell.)

  • Again, traditional views hold that gratings are processed in V1, and that those signals are combined in a subsequent stage of visual motion processing. Evidence for this integration stage has been found in area MT, which contains neurons responding specifically to moving plaids, the so-called pattern neurons [8,9,12]. Tinsley et al. [11] show for the first time that neurons in V1 that are broadly tuned for direction and have short, wide receptive fields prefer a direction that coincides with the direction of the moving plaid. Even though Tinsley et al. [11] do not report the end-stoppedness of the cells that they recorded from, it is quite possible that this group of neurons is similar to the end-stopped neurons described by Pack et al. [3].

    The conclusion drawn by both Pack et al. [3] and Tinsley et al. [11] is that the aperture problem can be solved as early as in V1, and that integration at higher cortical stages is not necessary to solve the aperture problem and explain phenomena such as plaid motion. Yet the hierarchical (two-stage) model of motion processing has long been the consensus view in the field. Indeed, several studies have demonstrated that motion integration takes place in MT, for instance in the pattern neurons described above [8,9,12]. Furthermore, Pack et al. [3] show that their results on end-stopped neurons are very similar to what they find in MT. Do these pattern-selective MT neurons perhaps receive their input from end-stopped V1 cells? Currently, there is no direct evidence for this, but it might be obtained from simultaneous recording of cells in V1 and MT. MT pattern cells respond differently to moving plaid patterns that are changed slightly to mimic transparency [13], in parallel with the change in perceptual performance [14,15]. What about those end-stopped V1 cells?

    A closer inspection of the temporal dynamics of the cell responses can also give more insight into the flow of information. One very interesting aspect of Pack et al.'s [3] study is that they used a receptive-field mapping technique that reveals the temporal dynamics of neural responses in detail. This so-called reverse correlation technique is based on a very rapid presentation of a noisy stimulus sequence (in the 10–20 millisecond range). By reverse correlating the neural response with this stimulus sequence, it is possible to obtain the temporal aspects of receptive field characteristics very efficiently. This technique, developed to study auditory cortical neurons [16], is nowadays increasingly in vogue in vision research to study dynamical processes in the visual cortex [17,18]. Pack et al. [3] observed that end-stopping grows over time. At first, the cell responds well to a bar placed anywhere in the receptive field, and only 20–30 milliseconds later the end-stopped characteristic is clearly visible (Figure 2A). This means that end-stopping takes time to evolve, just like many surround effects documented in V1 [19,20]. It is interesting to note that such a time course for end-stopped neurons has also been previously suggested from the results of human psychophysical experiments [6].
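    The principle of reverse correlation can be sketched with a simulated cell. The code below is a minimal illustration under stated assumptions, not the analysis used by Pack et al. [3]: the "neuron" is a hypothetical deterministic unit that fires two stimulus frames after a bar appears at one preferred position, and all names and parameter values are invented for the example. For every spike, the analysis looks back in time and tallies which bar position was shown at each lag.

```python
import random

def reverse_correlate(stimulus, spikes, n_positions, max_lag):
    """For each lag, estimate P(bar position shown at t - lag | spike at t)."""
    counts = [[0] * n_positions for _ in range(max_lag + 1)]
    n_spikes = 0
    for t in range(max_lag, len(stimulus)):
        if spikes[t]:
            n_spikes += 1
            for lag in range(max_lag + 1):
                counts[lag][stimulus[t - lag]] += 1
    return [[c / n_spikes for c in row] for row in counts]

random.seed(0)
n_positions, delay, preferred = 5, 2, 2  # assumed values for the toy cell

# Rapid random sequence: one bar position per stimulus frame.
stimulus = [random.randrange(n_positions) for _ in range(5000)]
# Toy cell: spikes exactly `delay` frames after the preferred position was shown.
spikes = [t >= delay and stimulus[t - delay] == preferred
          for t in range(len(stimulus))]

kernel = reverse_correlate(stimulus, spikes, n_positions, max_lag=4)
best_lag, best_pos = max(
    ((lag, pos) for lag in range(5) for pos in range(n_positions)),
    key=lambda lp: kernel[lp[0]][lp[1]],
)
print(best_lag, best_pos)
```

    The recovered kernel peaks at the lag and position built into the toy cell, while at other lags all positions are roughly equally likely. With real spike trains the same tally, computed lag by lag, shows how the spatial profile of a receptive field changes over time.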

    In conclusion, the new studies [3,11] put forward the possibility that motion direction (and speed) is already adequately coded in primary visual cortex, yet important questions remain. The most important of these questions is how end-stopping emerges in V1. Is this accomplished at the level of V1, for instance by horizontal connections, or is it a consequence of feedback from higher cortical areas like area MT (see Figure 2B)? In the latter case the aperture problem would not be solved in primary visual cortex, even though we now know it can at least be observed this low in the cortical hierarchy.

    References

    1. Stumpf, P. (1911). Über die abhängigkeit der visuellen bewegungsrichtung und negativen nachbildes von den reizvorgängen auf der netzhaut. Zeitschr. Psychol. 59, 321–330.

    2. Wallach, H. (1935). Über visuell wahrgenommene bewegungsrichtung. Psychol. Forsch. 20, 325–380.

    3. Pack, C.C., Livingstone, M.S., Duffy, K.R., and Born, R.T. (2003). End-stopping and the aperture problem: two-dimensional motion signals in macaque V1. Neuron 39, 671–680.

    Figure 2. Response dynamics of end-stopped cells. (A) Time course of the response in V1 when a long bar is presented in the center of the receptive field (green line), or when the endpoint of the bar appears in the receptive field (black line); data averaged over 29 end-stopped V1 neurons, and taken from [3]. The cells initially respond well to a bar placed anywhere in the receptive field, but after 20–30 milliseconds, they respond only when the endpoints are in the recept