
CYBERPSYCHOLOGY & BEHAVIOR

Volume 3, Number 3, 2000
Mary Ann Liebert, Inc.

Bridging the Gap Between “Real Reality” and Virtual Reality: Intelligent Human–Machine Therapeutic Interaction in Patient Videospace

MORRIS STEFFIN, M.D.

Scottsdale, Arizona.

ABSTRACT

Any virtual reality (VR) system may be considered to be a bidirectional interface between patient and effector devices. Most standard configurations of VR interaction present a computer-generated, virtual environment for the patient with measurement of the patient response, followed usually by rather stereotyped changes in the computer-generated presentation. The patient’s behavior is thus molded to the computer “conception” of reality. In contrast, techniques of quantitative patient videospace analysis developed in this laboratory provide for use of real-world input/output in patient and clinician interfaces. Monitoring of patient movement during epileptic seizures with spatially oriented time-domain and spectral processing of video intensity data leads to a mapping procedure that allows pattern recognition of ictal behaviors, thus providing the substrate for the clinician virtually to observe the behavior with greater analytic detail. A similar technique, based on real-world videospace and applied to patients with movement disorders, produces signals that should be suitable for controlling haptic therapeutic and assistive devices. Close linking of trigger signals to video material presented to subjects in simulation training environments provides the methodology that will be useful for monitoring cognitive responses through event-related evoked potentials. Such monitoring provides a basis for closing psychophysical feedback loops to increase effectiveness of training paradigms and cognitive therapies. The technique also provides for bidirectional patient interaction by tying video displays to physiologic responses and to electrophysiologically measurable cognitive responses, thus suggesting enhanced modes of biofeedback and cognitive feedback approaches.


INTRODUCTION

HUMAN–COMPUTER INTERACTION can assist in both the diagnosis of neurologic and psychiatric disease and in the generation of novel management and therapeutic strategies. An essential component of such interaction is real-time quantitative video analysis of the patient visuospatial milieu, that is, quantitative patient videospace analysis. As the interaction among patient, therapist, and machine attains increasing fidelity to these human environments, the involvement at both cognitive and affective (limbic) levels becomes more focused in generating useful behaviors for both the patient and the therapist.

An effective virtual reality (VR) interventional system must be a bidirectional interface between patient and devices, and, just as importantly, an interface between therapist and patient. In most standard configurations of VR interaction, a computer-derived, virtual environment is presented for the patient with measurement of the patient response, followed usually by rather stereotyped changes in the computer-generated presentation. The patient’s behavior is thus molded to the computer “conception” of reality. But in many medical applications, the reverse philosophy must be adopted. The computer must sense the actual (“real”) environment generated by the patient and his actions and then react “intelligently” to the patient. Two effects flow from such computer behavior. First, the immersiveness of the patient environment is greatly enhanced by the essential realism of the presentation. Second, the computer assessment of the patient can be used to provide insights for the therapist that may not be evident from observing the patient’s behavior unaided by machine filtering and processing.

In this connection, it is useful to recall the dichotomous relationship between human and machine. The human abstracts subtle, “fuzzy” (in the artificial intelligence sense), and inherently parallel multisensory perceptions of the environment. In particular, this method of data processing leads to the appreciation of affective components of nonverbal behavior and, in many therapeutic situations (neurologic and psychiatric patient evaluation in the usual clinical settings, for example), to semiquantitative assessments of patient performance. When more quantitative behavioral data are required, for the most part, assessments remain primarily behavioral (neuropsychologic test instruments or simple transducers for physiologic monitoring). The computer, on the other hand, quintessentially discretizes input following a primarily serial, von Neumann approach. Fuzzy pattern recognition is difficult to implement with power even remotely approaching that available to a human in an interpersonal encounter.

Results of bidirectional quantitative videospace techniques under development in this laboratory suggest that the bidirectional videospace approach, coupled with novel integration of simultaneously measured electrophysiologic data including event-related evoked potentials (ERPs), can provide new methods of patient–therapist–machine interaction that should produce greater insight into patient behavior at both cognitive and physical levels. Four clinical applications to which these methods apply are epilepsy monitoring for automated seizure detection and analysis, visual-haptic therapeutic interfaces, simulation training environments, and biofeedback. The first two applications are dependent upon machine-based interpretive aids for the objective assessment of physiology. The last two applications depend upon a bidirectional flow of information between machine and patient or subject, with the establishment of psychophysical feedback loops for behavioral and affective modification.

The following methods and discussion will be divided between the primarily physiologic (epilepsy monitoring and haptics) and the primarily psychophysical (simulation training and biofeedback) applications. Both types are heavily dependent upon the quantitative videospace technology under development here.

PHYSIOLOGIC APPLICATIONS

Computerized pattern analysis of movements during video-EEG monitoring can flag and quantify convulsions and activity suspicious for complex partial seizure activities. Such ictal events are often of mesial temporal or frontal lobe origin, and the EEG manifestation (especially with scalp electrode recordings) may be equivocal or nil.1–3 The movement patterns may involve head adversion, facial myoclonus, automatisms, or simple semipurposeful movements. With present technology, the observing neurologist categorizes such movements by visually matching behavior to what is expected, through clinical experience, to be observed in a seizure. While the human “fuzzy” approach allows screening of a wide variety of possible ictal behaviors, in equivocal situations this subjectivity can be insufficient to discriminate epileptic seizures from nonepileptic seizures. On the other hand, while videospace analysis techniques now can quantify aspects of particular events preselected by the clinician, as yet, ongoing automated pattern recognition of candidate ictal behaviors remains elusive.4 In an attempt to provide a substrate for pattern recognition technology in this area, an expanded time-domain, frequency-domain method has been implemented to process data obtained from video-EEG telemetry.

Methods

The computer algorithms currently under development, which have increased intelligence and discrimination abilities, require continuous input of time- and frequency-sensitive measurements of patient motor behavior. This information becomes available by expansion of the methods previously described.1 Multiple regions of interest (ROIs) are established over the patient videospace. Video intensity in each of these regions is continuously recorded. The frequency and time characteristics are then analyzed online.
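As a concrete illustration of the ROI/SR measurement, the following minimal Python sketch computes the per-frame mean intensity for four subregions of each rectangular ROI. The function name, the ROI representation, and the quadrant layout are assumptions for illustration only, not the laboratory's actual implementation.

```python
import numpy as np

def sr_intensities(frame, rois):
    """Mean video intensity for each subregion (SR) of each ROI.

    frame : 2-D grayscale video frame (array of pixel intensities)
    rois  : list of (row, col, height, width) rectangles (hypothetical layout)
    Returns one value per SR: upper-left, upper-right, lower-left, lower-right.
    """
    channels = []
    for (r, c, h, w) in rois:
        roi = frame[r:r + h, c:c + w].astype(float)
        hh, hw = h // 2, w // 2
        channels += [
            roi[:hh, :hw].mean(),   # SR 1: upper-left quadrant
            roi[:hh, hw:].mean(),   # SR 2: upper-right quadrant
            roi[hh:, :hw].mean(),   # SR 3: lower-left quadrant
            roi[hh:, hw:].mean(),   # SR 4: lower-right quadrant
        ]
    return np.array(channels)

# Three ROIs (head, left arm, right arm) -> 12 channels per frame;
# collecting one value per channel per frame yields the time-domain
# traces plotted below the video display.
```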

Figure 1 demonstrates this approach for acquisition of data generated by ictal motor activity. In this example, three ROIs are established: ROI 1 over the face, ROI 2 over the left upper extremity, and ROI 3 over the right upper extremity. Each ROI is further subdivided into four subregions (SR 1–SR 4), as labeled in ROI 1. The ROI locations are indicated by the overlying rectangles of different color intensities, delineating left upper, right upper, left lower, and right lower SRs respectively, for each ROI. The video intensity averaged over each SR is computed frame by frame (real time) and plotted over a 6-second epoch sweep. One such epoch is displayed in Figure 1. Thus, below the video display are 12 channels of intensity data (3 ROIs × 4 SRs), each corresponding to an SR within the respective ROI. (Diagonal lines at the left of the time-domain display are reference pointers linking the graphs to the ROIs in the video display. Time-domain plots for the SRs, from top to bottom, represent SR 1–SR 4 respectively.) Repetitive spiking can be observed in several channels, especially those associated with ROI 1 and ROI 2. These spikes are the result of slight, fairly regular, clonic activity causing facial and extremity movement during partial seizure activity.

FIG. 1. Videospace technique applied to complex partial seizure. Video intensity is averaged separately over four subregions (SR 1 to SR 4) within each of three regions of interest (ROI 1 to ROI 3). These are placed under positioning control by the observer at the start of recording and may be modified during recording. The ROI and SR placement is labeled over ROI 1, covering the head region. ROI 2, over the left upper extremity, and ROI 3, over the right upper extremity, are indicated by overlying intensity markers varying with the SRs; layout in all ROIs is similar, though the size of each ROI with its associated SRs is separately set. Traces below the ROIs show time-domain intensity data. Diagonal lines at the left of intensity traces are pointers to the respective ROIs. EEG recorded in the videotelemetry suite is shown at the immediate left of the video. Graphs at far left are variable-resolution Fourier power plots for each SR in each ROI, for the full epoch (P) and subepochs (P/2 and P/4). Spectral mapping of this graphic data is shown in the auxiliary ROI display at lower right, as described in the text. Scale at right qualitatively shows increasing power in higher frequency bands. The dark mask is used to obscure patient identity. Full epoch is 6 seconds. Time shown is total (over multiple epochs) since beginning recording of the event. Results of movement spiking are seen, especially involving ROI 1 and ROI 2, with associated SRs (see text).
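The repetitive spiking just described lends itself to a simple automated check. The sketch below is a hypothetical illustration, not the system's actual detector: it flags an SR intensity channel as showing clonic-like activity when supra-threshold deviations recur at regular intervals, with the two-standard-deviation threshold and the rate window as assumed parameters.

```python
import numpy as np

def detect_repetitive_spiking(trace, fps, min_rate=1.0, max_rate=5.0):
    """Flag roughly periodic spiking in one SR intensity trace.

    trace : per-frame intensity values for one epoch
    fps   : video frame rate
    Marks frames deviating from the epoch mean by more than two
    standard deviations, then checks that inter-spike intervals are
    regular and fall in a clonic-like rate range (min_rate..max_rate Hz).
    """
    x = np.asarray(trace, dtype=float)
    dev = np.abs(x - x.mean())
    spikes = np.flatnonzero(dev > 2 * x.std())
    if len(spikes) < 3:
        return False
    # collapse runs of adjacent supra-threshold frames to single events
    events = spikes[np.insert(np.diff(spikes) > 1, 0, True)]
    if len(events) < 3:
        return False
    intervals = np.diff(events) / fps
    rate = 1.0 / intervals.mean()
    regular = intervals.std() < 0.5 * intervals.mean()
    return bool(regular and min_rate <= rate <= max_rate)
```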

Frequency-domain data are extracted by a modified hybrid FFT-wavelet technique that uses sine/cosine basis functions similar to a short-term Fourier transform (STFT) approach, but also provides for changing resolution over the epoch and translation over subepochs.5 The frequency-domain data are plotted at left. The first trace (headed P) shows the power spectrum over 5 bands, from 0 to 4f, where P is the period of the full epoch (6 seconds in this case), and f = 1/P. The second and third traces, headed P/2, show each half-epoch (first 3 seconds and last 3 seconds of the full epoch), again with five bands scaled similarly for these half-epochs. This approach provides twice the time and frequency resolution, and selects mid-frequencies. The four traces at right show the same results, except for quarter-epochs (1.5 seconds), beginning with the first quarter-epoch at left, with time and frequency resolution four times greater than over the full epoch. At the end of each full epoch sweep, all data are computed, although computation may also be observed during the sweep, selectable during display by a switch option.

FIG. 2. Plots are arranged as in Figure 1 and are applied to adversive head movement during complex partial seizure. However, the spectral map is being computed during the epoch and is shown on both main and auxiliary displays as described in the text. Subepoch displays for each SR in each ROI are labeled according to the schema described in the text.

The frequency-domain graphic depiction, then, shows a more detailed spectral pattern with greater time/frequency resolution than would be possible with an STFT approach. The resolution is, of course, substantially less than would be available from a pure wavelet analysis, but the latter could not be realistically incorporated into a real-time method running on a PC under the constraints of this application. A future modification of the technique will allow for a parallel processing approach to this problem.
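The multiresolution scheme can be made concrete as follows. The paper does not give the hybrid FFT-wavelet transform (ref. 5) at code level, so this sketch substitutes plain FFT band sums over the full epoch, half-epochs, and quarter-epochs; all names and the equal-width five-band split are assumptions.

```python
import numpy as np

def band_powers(x, n_bands=5):
    """Power in n_bands equal-width frequency bands of signal x."""
    spec = np.abs(np.fft.rfft(x - x.mean())) ** 2   # remove DC, take power
    bands = np.array_split(spec, n_bands)
    return np.array([b.sum() for b in bands])

def multiresolution_spectra(x):
    """Band powers for the full epoch (P), the two half-epochs (P/2),
    and the four quarter-epochs (P/4): shorter segments trade frequency
    resolution for time resolution, as in the variable-resolution plots."""
    n = len(x)
    full = [band_powers(x)]
    halves = [band_powers(seg) for seg in (x[:n // 2], x[n // 2:])]
    quarters = [band_powers(x[i * n // 4:(i + 1) * n // 4]) for i in range(4)]
    return full, halves, quarters
```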

While the graphic portrayal allows for more detailed examination of movement patterns, it is at this point still somewhat counterintuitive. Therefore, the frequency data are also color-mapped over the ROIs. The second, smaller picture shows the ROIs overlaid with a color/intensity code (employing a scale shown qualitatively at right) indicating the relative power over the frequency bands for each epoch. It can be observed that a shift toward higher amplitudes of higher frequencies is present in ROI 1, with somewhat less high frequencies and power in ROI 2, and still less in ROI 3 (all SRs). This pattern indicates that most of the movement spiking activity is occurring in the first two regions, and this fact corresponds with the spiking pattern seen in the ROI 1 and ROI 2 time-domain intensity plots. This level of spectral quantification would be difficult to portray to the observer with more conventional data output formats.

FIG. 3. Plots are arranged as in previous figures and are applied to wingbeat automatism involving mainly the right upper extremity during complex partial seizure.
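One plausible reading of the color mapping described above, offered only as an illustrative assumption since the paper does not specify the code, collapses each SR's band powers into a single 0..1 value that brightens as energy shifts into higher frequency bands:

```python
import numpy as np

def spectral_color_value(powers):
    """Collapse band powers to one display value: the power-weighted
    mean band index, scaled 0..1, so an SR overlay brightens as power
    shifts into higher frequency bands (hypothetical mapping)."""
    powers = np.asarray(powers, dtype=float)
    if powers.sum() == 0:
        return 0.0
    idx = np.arange(len(powers))
    return float((idx * powers).sum() / (powers.sum() * (len(powers) - 1)))

# Painting each SR rectangle with intensity proportional to this value
# would reproduce the qualitative scale at right: ROI 1 renders brighter
# than ROI 2 and ROI 3 when its movement energy sits in higher bands.
```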

Figure 2 demonstrates in more detail the layout of the spectral components over the ROIs and the respective SRs. Here, the spectral display is seen in both displays (main and also auxiliary spectral) for the case of an adversive head movement. Again, three ROIs are shown, positioned as in Figure 1. In the main display, the layout is labeled. Each ROI contains four SRs. In each SR, the color power display rectangle for the full epoch (labeled P) is at the lower left, the two display rectangles for the two half-epochs (labeled P/2) are shown one level higher, and the four display rectangles corresponding to the four quarter-epochs (labeled P/4) are shown at the top of the SR. The same pattern is followed for all ROIs and all subregions within the ROIs. When spectra are computed during the epoch, the color maps are shown on both the main video and the auxiliary spectral displays. Occasionally, this mode allows a more intuitive understanding of the process as the video plays. For this intraepoch computation mode, spectra were computed for nearly the full epochs (Figs. 1 and 2). Truncated computation (only half the epoch, for example) was avoided in final displays because it is less accurate and does not show all the epoch data. Where standard displays (only final computation of the complete epoch) are shown, the spectral maps are presented only in the auxiliary displays and show data (also displayed graphically at left) from the epoch just preceding that shown in the time-domain graphs below the main display.

FIG. 4. Plots are as in previous figures and are applied to adversive head movement preceding the clonic phase of a generalized seizure.

Results

Epilepsy monitoring with automated seizure detection and analysis. Figure 1 depicts data for the movement spikes described above. These occurred in this patient as the presenting motor manifestation during a frontal lobe seizure. Shortly thereafter, a head adversive movement occurred (patient right to left), as shown in Fig. 2. The time-domain graphs show a portion of this movement as a large, low-frequency displacement mainly in the head-related (ROI 1 and ROI 2) channels. Note the opposite direction of the intensity change as the head shifts from ROI 1 into ROI 2. The spectral graphs also show rather less higher frequency components over ROI 1 and ROI 2 in Figure 2, and the spectral maps confirm this. There is still substantial high frequency background activity, more random, corresponding to “movement noise,” which is seen especially over ROI 3 (right upper extremity) compared to ROI 1 and ROI 2 (head and left upper extremity) as opposed to the rhythmic spiking seen in Figure 1. Note that this is essentially a reversal of the pattern of Figure 1, in which the higher frequency components were more prominent in ROI 1 and ROI 2. It is suggested that sufficient distinction exists that an automated pattern recognition routine, currently under development here, should be able to discriminate these two conditions and also the conditions that arise for different movement patterns.

FIG. 5. Plots are as in Figure 4 and are applied to the clonic phase of the same seizure. Higher frequency components are present diffusely in maps and graphs (see text).
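A minimal form of the proposed discrimination, assuming a template-based routine (the paper leaves the method unspecified, so the names and the nearest-centroid rule are hypothetical), could compare an epoch's per-ROI band-power vector against stored seizure-pattern templates:

```python
import numpy as np

def classify_event(features, templates):
    """Nearest-centroid discrimination of candidate ictal patterns.

    features  : 1-D vector of per-ROI band powers for one epoch
    templates : dict of label -> mean feature vector from training epochs
    Returns (best_label, distances) so a reviewer can inspect the margin.
    """
    f = features / (np.linalg.norm(features) + 1e-12)  # scale-invariant
    dists = {lbl: float(np.linalg.norm(f - t / (np.linalg.norm(t) + 1e-12)))
             for lbl, t in templates.items()}
    return min(dists, key=dists.get), dists
```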

Figure 3 demonstrates a different behavior. Here, the patient is manifesting a wingbeating movement that is regular in frequency but nonsinusoidal, or “jerky,” in character. The movement involves both upper extremities, especially the right, and traverses all the ROIs (and their SRs) as positioned for this episode. The time-domain waveforms clearly show the periodic sawtooth movement (period approximately 2 seconds). The spectral graphs show substantial power in both high and lower frequency bands across the epoch, corresponding to the fairly low fundamental frequency with the higher frequencies arising from the nonsinusoidal characteristic of the movement. Of course, there are also still substantial amounts of movement noise that could be averaged out over multiple episodes with the proposed technique described below.

A generalized tonic-clonic seizure is seen in Figures 4 and 5. In Figure 4, there is a head adversive movement, indicated in the time domain as an upward drift of the ROI 1 and, to a lesser extent, the ROI 2 waveforms. In Figure 5, the clonic phase begins, as manifested by high amplitude continuous oscillatory activity in all the ROI time-domain signals. Comparison of both the graphic spectral data and the maps indicates this increase in high-frequency power.

FIG. 6. Complex partial seizure showing head adversion and handclasp. Display as described in previous figures. Note coherence of head and extremity movements in time-domain plots and greater high frequency components in ROI 1 and ROI 2.

A complex partial seizure is demonstrated in Figures 6 and 7. First, in Figure 6, nearly synchronous movements of head (ROI 1), left upper extremity (ROI 2), and right upper extremity (ROI 3) occur. (Toward the end of the movement, the patient’s head becomes situated partly outside the ROI in this example, and is seen in that final position at the conclusion of the epoch.) The time-domain graphs, spectral graphs, and the spectral maps show clearly increased high frequency activity for the head movement (ROI 1) and also a substantial high frequency component for the left upper extremity (ROI 2). Smaller amplitude high frequency components occur in ROI 3 (right upper extremity). Shortly afterward, as shown in Figure 7, there is a relaxation phase for the head and extremities, though less well defined. Here, too, however, a predominance of high frequency activity is localizable to ROI 1 and ROI 2.

The described method of videospace analysis exhibits increased definition of patterns inherent in seizure activity and renders these patterns with high sensitivity. This form of data presentation appears, therefore, to provide a basis for the construction of pattern-recognition algorithms for flagging seizure activity. Further refinements will be necessary to provide reliable automation of the recognition process.

FIG. 7. As in Figure 6, but at relaxation phase (end of motor event).

Visual haptic interface for therapy and motor task assistance. According to principles of movement analysis,6,7 the videospace methods described can be utilized to generate data applicable to haptic drivers. For visual-haptic feedback to be useful in a clinical setting, machine detection of erratic patterns of motor behavior must be directly related to the relevant features in the patient’s visual space.

Consider application of the technique to the case of cerebellar tremor (dysmetria). The basic defect is a nonlinear oscillation due to inadequate control feedback mechanisms. Sensing this oscillation as a pattern requires detailed analysis of its spatial characteristics in real time and in relation to videospace target acquisition. Figure 8 shows an epoch during which the patient attempts to reach a glass but overturns it. (Here, total epoch length is 3 seconds.) Previous analytic approaches to this problem6–8 have allowed quantitative characterization of the oscillatory behavior but with insufficient spatial and pattern resolution for the establishment of a force corridor, which is a videospace region for a target trajectory that is translatable into haptic space. Upper extremity movement is allowed within the force corridor, but must be damped by appropriate counterforce when movement aberration from the trajectory arises. With the method described here, a videospace mapped to appropriate force space can be set up with multiple ROIs and SRs as cues from the videospace.
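The force-corridor idea translates naturally into a haptic control law. The sketch below is an assumption-laden illustration, not the laboratory's controller: positions are taken to come from videospace ROI tracking, and a spring-like counterforce (the spring law and all parameter names are hypothetical) is applied only when the hand strays beyond the corridor radius around the target trajectory.

```python
import numpy as np

def corridor_counterforce(pos, target_path, radius, k):
    """Damping force for a haptic driver from videospace cues.

    pos         : current 2-D hand position in video coordinates
    target_path : (N, 2) array of points along the desired trajectory
    radius      : half-width of the force corridor
    k           : restoring-force gain

    Inside the corridor the device applies no force; outside it, a
    spring-like counterforce pushes back toward the nearest point
    on the trajectory.
    """
    deltas = target_path - np.asarray(pos, dtype=float)
    dists = np.linalg.norm(deltas, axis=1)
    i = int(np.argmin(dists))
    aberration = dists[i] - radius
    if aberration <= 0:
        return np.zeros(2)          # movement allowed within corridor
    return k * aberration * deltas[i] / dists[i]
```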

The approach is shown in Figure 8. ROIs are positioned along the patient’s forearm, wrist, and over the target object (glass). Encroachment on the subregions is signaled by intensity alterations plotted on the time-domain displays as described above. Rather than three channels, as in the previously described approach,6 there are 12 channels, each with variable frequency resolution. Spectral plots and maps are rendered as above. Substantial aberration is evident in both low frequency and higher frequency movement components. Overall, there is less aberration demonstrated when the patient’s movement is stabilized by assistance from the therapist, as shown in Figure 9, although there is a single large amplitude movement spike at the initiation of movement. This approach, therefore, allows measurement of the time and frequency components of the aberration coregistered with position information related to the target in videospace.

FIG. 8. Trajectory display as would be required for force corridor setup. Patient with cerebellar tremor attempts to reach the glass target, but overturns the glass. ROIs with SRs are placed along forearm, wrist, and hand. High amplitude, high frequencies are seen throughout in all displays (high amplitude tremor). Full epoch (P) is 3 seconds.

Discussion

A method is presented that increases precision of extraction of clinically relevant data from patient videospace. Such data may be used to monitor behavior more precisely in assessment of aberration of motor activity in several diverse settings, ranging from epilepsy to motor dysfunction.

FIG. 9. As in Figure 8 but with therapist manual assist to steady the patient’s right upper extremity. Overall, there is less high frequency activity with a more accurate approach (glass is grasped).

Currently, while the technique is an improvement over previous methods, a major problem remains. Pattern structure is now visible in much more quantitative detail, but correlation to discrete clinical events is still not automated. In the case of epileptic monitoring, it remains to produce an intelligent system that can be trained on the time-domain and spectral characteristics of the selected individual patient behavior so as to allow unattended flagging of behavior and assessment of the probability that the discriminated behavior is ictal and localizable. In the case of visual target interaction, a similar problem arises, with the additional complication of generating a quantitatively appropriate response to drive a haptic device to correct or assist with aberrant movements.

The benefits of such automation include more rapid and accurate diagnosis of atypical seizure activity and the creation of more responsive motor physical therapy and assistive approaches. Progress in such approaches requires refinement of the movement data acquisition techniques described here.

PSYCHOPHYSICAL APPLICATIONS

The following applications can be developed on the basis of a videospace-electrophysiologic interface being designed here. Rationale for the concept follows, along with preliminary results indicating the directions for implementing the interface for psychophysical studies.

Methods

Enhancement of pilot simulation and related cognitive training techniques. Development of effective pilot simulation training environments requires objective measures of subject performance. Such measures include subject output, particularly motor responses to stimulus input. However, such measures, while documenting the final goal of training, are limited in the elucidation of the process of learning. The simulation environment presents an array of stimuli that the subject must perceive and prioritize. Measurement of these cognitive and executive steps is largely subjective and relies to a great extent on reports of the subjects as to the perception of stimuli and the motor response process. However, by application of interactive videospace stimulus presentations using techniques already developed in this laboratory in combination with simultaneous acquisition of ERPs generated by target stimuli, it should be possible to achieve quantitative online assessments of the fundamental and integrative cognitive processes essential to the performance of required tasks and the generation of required behavior.

The operating hypothesis is thus the following: ERP monitoring, appropriately coupled to relevant stimuli in the simulator environment, will provide a robust means of detecting pilot cognition of important target stimuli independent of behavioral responses to those stimuli. As a corollary, this method of cognition detection in the simulator environment will be a powerful tool to elucidate the process by which the behavioral response is evoked and, thus, to guide operationally the optimization of stimulus parameters in training paradigms and protocols.

In this laboratory, techniques and instrumentation are currently being developed to realize this goal. Work has been completed on the software allowing for simultaneous online video and evoked potential processing. The system under development thus allows for integrating evoked potential recording with the specialized requirements for simulation pilot training. The result of this approach will be to close the psychophysical feedback loops required to implement analysis of cognitive processing, as well as motor output, and so to facilitate design and monitoring of simulation training protocols.

The system imports real-world audio and video. A wide variety of data can be processed, including ERPs and physiologic input. By using the videospace analysis technique developed here, it is possible to pick off trigger signals from the video that represent complex and multiple stimuli. Several different triggers can be used for different audio/video stimuli in this configuration. Such triggers can initiate ERP averaging epochs in bins corresponding to the individual targets.
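A minimal sketch of such trigger-binned averaging follows; the class name and interface are hypothetical, since the paper describes the acquisition system only at block-diagram level. Each video-derived trigger routes the subsequent EEG epoch into the averaging bin for its target, so each bin converges on the average response to its own stimulus.

```python
import numpy as np

class ERPBins:
    """Trigger-driven ERP averaging, one bin per video-derived target."""

    def __init__(self, targets, epoch_len):
        # running sums and epoch counts, one pair per target stimulus
        self.sums = {t: np.zeros(epoch_len) for t in targets}
        self.counts = {t: 0 for t in targets}

    def add_epoch(self, target, eeg_epoch):
        """Accumulate the EEG samples following one trigger for `target`."""
        self.sums[target] += np.asarray(eeg_epoch, dtype=float)
        self.counts[target] += 1

    def average(self, target):
        """Current averaged evoked response for one target's bin."""
        return self.sums[target] / max(self.counts[target], 1)
```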

The test milieu allows interfacing with other neurological methods of functional imaging. In particular, the trigger signals available from stimulus paradigms of the type described here can be employed to control acquisition of functional MRI (fMRI) data in combination with ERP data in an approach that should lead to more precise cortical mapping. Triggers derived from this approach will be used in analogous fashion to time fMRI recording epochs. There is inherent complementarity of fMRI and ERP: the former has superior spatial resolution, while the latter has superior temporal resolution. With the video trigger system described, this complementarity can be exploited in future studies to enhance the temporospatial functional data that would be difficult to obtain with either technique alone. Greater precision can theoretically be achieved in the charting of cortical function in training and cognitive processing environments applicable to simulation settings and to the analysis of cortical dysfunction in a variety of settings, including learning disorders.9–12

Optimization of display parameters for high performance stimulus response environments in air traffic control requires many of the same general cognitive mechanisms as those required for pilot functioning. Specifically, target stimuli on the monitor screen must be distinguished from nontarget stimuli, and the target status varies contextually with the immediate tasks and their priorities. For example, two radar indicators may show reduced proximity, and it is important to determine what display parameters will trigger the perception by the controller of an impending collision course. Similarly, a missile warning monitor must abstract changing patterns of radar to decide when an atypical cluster pattern may represent hostile activity. In each case, the effectiveness of the stimulus can be determined by ERP measurements prior to and independent of any motor response by the training subject.

Objective evaluation of psychophysical phenomena is simplified by examination of ERPs coupled to quantitative videospace. The particular importance of the ERPs is that their measurement provides objective, quantitative evidence of the degree of attention and perceptual activity that the subject is producing regarding the target stimuli. Thus, if a subject indicates that he is perceiving a given type of stimulus, but generates no corresponding ERP, the likelihood is less that his description is accurate. Conversely, if a subject does not recall perception of a target stimulus, but his ERP record indicates robust electrophysiologic evidence of such perception, then his report of the events at the time of stimulation is probably inaccurate; while he may actually not recall perception of the stimulus at a later time, his awareness at the time of presentation would be established. Such analysis will better define the stages in grapheme and phoneme processing in dyslexic patients, and should also be applicable to the study of autism, brain injury, and dementia. The study of fluctuation in cognition during complex partial seizures can also be enhanced by correlating ERP data with behavioral and fMRI data.9,10

Enhancement of pilot simulation training techniques. Central to the rationale for pilot training by simulation (and, by extension, to other stimulus-response learning paradigms) is the requirement that subjects must learn relevant cockpit signals and must develop appropriate responses to those signals. On the basis of well established neurophysiologic principles, it is possible to study pilot perception of relevant stimuli and reaction times electrophysiologically, as related to cognitive processing of such stimuli.

Considerable evidence exists to indicate that integration of ERP data with quantitative videospace techniques will realize these goals of objective metrics of cognition and related increases in training efficiency. ERP data have shown promise both with regard to the early components and the P3 complex, with some specificity for the processes measured.

Attention has been given to the significance of the early ERPs, particularly N1, and possibly N2, which are more exogenous (that is, less dependent on mental set and more dependent on physical stimulus parameters) than the later potentials, such as P3 and following. The exogenous potentials are classically and generally more associated with preprocessing and, for the visual stimulation case, probably originate in occipitotemporal and occipitoparietal projection systems.13,14

The P300 (P3) potential arises whenever the subject becomes aware of the target stimulus that he or she has been instructed or conditioned to anticipate, and as such is endogenous. The exact brain structures that generate this potential have not been entirely elucidated, but it is clear that temporal lobe (probably including hippocampus), frontal lobe, and parietal lobe, as well as additional portions of the limbic system, all contribute to this potential. Virtually any form of stimulus can be used to evoke the N1-P3 complex. A blip, or pattern of blips, on a screen that varies in color, intensity, or position from other blips or patterns will evoke the P3 response provided the rare blip is considered to be relevant. Multimodality (audio-visual or tactile-visual) stimulation can also be used to evoke N1-P3 complexes.

However, it is not necessary that the subject be constantly attending to a particular stimulus as a target to mount a P3 response to it. It has been shown that the appearance, with perception and recognition, of a novel or rare stimulus against the background, when the subject is primed appropriately, is sufficient to produce a P3 response. The brain systems generating this characteristic response probably have had evolutionary protective value in mediating the recognition of a predator or other rapidly developing dangerous situations.15 In that regard, the applicability to the pilot situation is clear.

ERPs, and particularly P3, are robust indicators of subject attention and recognition of stimulus relevance in situations requiring behaviors relating to cockpit performance. Target shooters demonstrate greater amplitude of P3, as well as earlier components, than control subjects. Decreased reaction time has been shown to be correlated with decreased P3 latency.16

Clusters of objects, with a target being one atypical object in the cluster, elicit changes both in the earlier ERP components, indicating general perception of the stimulus, and in P3 components, indicating perception of the target, or atypical, stimulus.17 Additional P3 components can be resolved that are induced by target-like stimuli as distractors.18 P3 morphology can be shown to vary with targets having multiple features required to be relevant, with dependence upon whether the targets are superimposed or separated spatially.19 Division of attention between multiple relevant stimuli can compromise performance, and such compromise has correlated well with decreased P3 amplitude in these situations.20 And, as also noted above, P3 potentials can be evoked by target stimuli alone, without nontarget stimuli.21 Apparently, because of the increased complexity of processing, P3 latencies tend to increase with competing or conflicting features in the target stimulus.22 P3 amplitude decrements in prolonged stimulus situations can reflect decreasing vigilance with more difficult discrimination tasks, and such decrement is associated with poorer response performance.23 Multiple types of ongoing processing (e.g., determining the direction of a dot placement and searching for a target match) produce changes in P3 associated with each type of activity, and performance impairments in such cases are associated with decreases in the amplitudes of associated P3 components.24 Presentation of stereoscopically relevant data can also generate P3 signals in perceivers and fail to generate P3 in nonperceivers.25

Thus, the neurophysiologic substrate exists for integrating ERP measurement with the videospace technique to produce more efficient training routines for pilots and to advance studies of learning disabilities. Based upon this neurophysiology, a video-ERP trigger acquisition system has been developed as follows.

Implementation of the videospace–physiologic interface is shown in Figure 10. Imported real-world video, simulator external imagery, is combined with two types of simulated displays. The instrumentation and target displays comprise two oscilloscopic images and a tracking display that may vary arbitrarily. The warning displays include moderate to high priority indicators that may appear infrequently, according to a training paradigm. The pilot view is a combination of the external images and the machine-generated displays. The external images can be processed before display in the pilot view; for example, notice the image rotation between the two views. The machine-generated displays and/or the external imagery can be downloaded from a simulator environment as well as from real-world imagery. Integration of subject video presentation (that is, the pilot view) and physiologic data is demonstrated in the evaluator view with physiologic data.

Results

FIG. 10. Schema of interactive, simulation training paradigm. Simulator external imagery is combined with warning displays, instrumentation, and target displays. Video feature intensity data and physiologic data, including provision for evoked potentials, are presented in the evaluator view with physiologic data. Time-domain graphs of ROI intensities are plotted at bottom. ROI 1 (plotted in top time-domain trace) is over the dual red flags, ROI 2 (middle trace) is over the region of the circle in the rectangle at lower right, and ROI 3 is over the upper trace of the left oscilloscopic display. Standard Fourier power plots are shown respectively in the graphs at left, with 50 power frequency bands (0 to 50/P). ROIs each contain only one SR (no subdivision of ROIs) in this display only. Full epoch (P) is 18 seconds with no subepoch calculations in this display only (thus, a total of 3 intensity channels). No spectral maps are displayed in this figure. (ERP traces at left are representative, for illustrative purposes.)

In Figure 10, intensity plots are presented in the bottom graphs that are derived from the video intensity as described above. However, in contrast to the configuration in those applications, in which patient-generated video was monitored, here, instead, the monitored video is that which appears in the pilot view screen. Three ROIs are used here without subregions: the first over the upper warning display (at the two obliquely placed flags), the second over the lower warning display (orange circle in blue rectangle), and the third over a small region of the left oscilloscope trace.

Thus, the upper trace of the three time-domain plots shows the intensity in the region of the upper warning display. Several step changes are seen when the display occurs, but when the display is absent, the intensity plot shows the average over the region of the imagery intensity. Therefore, any changes in the region will affect the display and enable its use as a trigger signal for ERP detection. Similarly, the ROI for the second trace is on the filled circle within the rectangle. The downward (negative) deflections here indicate the appearance of the display as monitored directly from the video.

This configuration allows for abstracting ERP triggers directly from video features. Conversely, with this same approach, a video display can be programmed to arise precisely in synchronization with external trigger signals, including physiological measurements (heart rate, electromyogram, skin resistance, and temperature) and ERPs. Individual features can be linked to separate triggers that can control averaging bins in the evoked potential acquisition system. Such triggers, derived from the video or used to alter the video display, can also be used to trigger acquisition of fMRI images. In its present form, then, the system is quite flexible. The key point here is that there is close (frame-by-frame) linkage between video features and the trigger input/output interface.
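As an illustration of abstracting triggers directly from video features, the following sketch detects the step changes in an ROI intensity trace described above and returns frame indices that could start ERP averaging epochs or time fMRI acquisitions. The threshold and refractory parameters are assumptions for illustration, not values from the paper.

```python
import numpy as np

def video_triggers(intensity, threshold, refractory):
    """Detect trigger events as step changes in an ROI intensity trace.

    intensity  : per-frame mean intensity for one ROI (1-D array)
    threshold  : minimum frame-to-frame intensity jump (assumed parameter)
    refractory : minimum frames between successive triggers

    Returns frame indices where a display feature appears or disappears.
    """
    steps = np.abs(np.diff(np.asarray(intensity, dtype=float)))
    events, last = [], -refractory
    for i, s in enumerate(steps):
        if s >= threshold and i - last >= refractory:
            events.append(i + 1)   # frame at which the step is complete
            last = i
    return events
```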

Discussion

Traditional approaches to behavioral modification and assessment in training and diagnostic sessions have been directed at behaviorally relating output to stimulus input. Although electrophysiologic measures of cognition have been utilized as a general assessment of higher cortical function, precise linking to varied and realistic stimuli has been limited. The presently described methodology provides an alternative approach to this problem. By closely linking measures of cognitive response to specific real-world or devised complex video stimuli, it should be possible to produce closed psychophysical feedback loops. There is evidence to indicate that such loops, particularly in more immersive environments, will profoundly alter affective responses to anxiety stressors and pain.26 The close linking possible with this approach should also facilitate development of novel psychometric instruments for mapping cortical function in learning disorders and dementia. Like the physiologic applications, these psychophysical feedback approaches will lead to stronger functional and theoretical bridging between real-world perceptions (“real reality”) and machine-generated perceptions. The methodology presented here represents a starting point on this road.

ACKNOWLEDGMENTS

Raw video-EEG patient data were provided by David Blum, M.D., Barrow Neurologic Institute, Phoenix, AZ.

REFERENCES

1. Steffin, M. (1999). Quantitative video analysis of complex epileptic seizures during videotelemetry: Increasing the reliability of EEG correlation and behavioral autocorrelation. Cyberpsychol Behav 2(1):25–33.

2. Thomas, P., Zifkin, B., Migneco, O., Lebrun, C., Darcourt, J., & Andermann, F. (1999). Nonconvulsive status epilepticus of frontal origin. Neurology 52(6):1174–1183.

3. Krumholz, A. (1999). Nonepileptic seizures: Diagnosis and management. Neurology 53(5):876–883.

4. Steffin, M. (1999). Virtual reality to evaluate motor response during seizure activity. In Lorenzo, N.Y., & Lutsep, H. (eds.), Neurology.

5. Polikar, R. (1999). The wavelet tutorial. Online document: www.public.iastate.edu/~rpolikar/WAVELETS/WTutorial.html

6. Steffin, M. (1999). Visual-haptic interfaces: Modification of motor and cognitive performance. In Lorenzo, N.Y., & Lutsep, H. (eds.), Neurology.

7. Steffin, M. (1997). Virtual reality therapy of multiple sclerosis and spinal cord injury: Design considerations for a haptic-visual interface. In Riva, G. (ed.), Virtual Reality in Neuro-Psycho-Physiology. Amsterdam: IOS Press, pp. 185–208.

8. Steffin, M. (1997). Computer assisted physical therapy of multiple sclerosis and spinal cord injury patients: An application of virtual reality. In Morgan, K.S., Hoffman, H.M., Stredney, D., Weghorst, S.J., et al. (eds.), Medicine Meets Virtual Reality. Amsterdam: IOS Press, pp. 64–72.

9. Benson, R.R., FitzGerald, D.B., LeSueur, L.L., Kennedy, D.N., Kwong, K.K., Buchbinder, B.R., Davis, T.L., Weisskoff, R.M., Talavage, T.M., Logan, W.J., Cosgrove, G.R., Belliveau, J.W., & Rosen, B.R. (1999). Language dominance determined by whole brain functional MRI in patients with brain lesions. Neurology 52(4):798–809.

10. Kwee, I.L., Yukihiko, F., Matsuzawa, H., & Nakada, T. (1999). Perceptual processing of stereopsis in humans: High field (3.0 Tesla) functional MRI study. Neurology 53(7):1599–1601.

11. McAllister, T.W., Saykin, A.J., Flashman, L.A., Sparling, B.A., Johnson, S.C., Guerin, S.J., Mamourian, A.C., Weaver, J.B., & Yanofsky, N. (1999). Brain activation during working memory 1 month after mild traumatic brain injury: A functional MRI study. Neurology 53(6):1300–1308.

12. Matthews, P.M., Clare, S., & Adcock, J. (1999). Functional magnetic resonance imaging: Clinical applications and potential. J Inherit Metab Dis 22(4):337–352.

13. Bertus, M., Wijers, A.A., Lange, J.J., Mulder, G., & Mulder, L.J. (1997). An ERP study of visual spatial attention and letter target detection for isoluminant and nonisoluminant stimuli. Psychophysiology 34(5):553–565.

14. Anllo-Vento, L., & Hillyard, S.A. (1996). Selective attention to the color and direction of moving stimuli: Electrophysiological correlates of hierarchical feature selection. Percept Psychophys 58(2):191–206.

15. Rappaport, M., Clifford, J.O., & Winterfield, K.M. (1990). P300 response under active and passive attention states and uni- and bimodality stimulus presentation conditions. J Neuropsychiatry Clin Neurosci 2(4):399–407.

16. Czigler, I., Balazs, L., & Lenart (1998). Attention to features of separate objects: An ERP study of target-shooters and control participants. Int J Psychophysiol 31(1):77–87.

17. Luck, S.J., & Hillyard, S.A. (1994). Electrophysiological correlates of feature analysis during visual search. Psychophysiology 31(3):291–308.

18. Makeig, S., Westerfield, M., Jung, T.P., Covington, J., Townsend, J., Sejnowski, T.J., & Courchesne, E. (1999). Functionally independent components of the late positive event-related potential during visual spatial attention. J Neurosci 19(7):2665–2680.

19. Czigler, I., & Balazs, L. (1998). Object-related attention: An event-related potential study. Brain Cogn 38(2):113–124.

20. Ullsperger, P., & Grune, K. (1995). Processing of multi-dimensional stimuli: P300 component of the event-related brain potential during mental comparison of compound digits. Biol Psychol 40(1–2):17–31.

21. Mertens, R., & Polich, J. (1997). P300 from a single-stimulus paradigm: Passive versus active tasks and stimulus modality. Electroencephalogr Clin Neurophysiol 104(6):488–497.

22. Smid, H.G., Mulder, G., & Mulder, L.J. (1990). Selective response activation can begin before stimulus recognition is complete: A psychophysiological and error analysis of continuous flow. Acta Psychol 74(2–3):169–201.

23. Pritchard, W.S., Brandt, M.E., Shappell, S.A., O’Dell, T.J., & Barratt, E.S. No decrement in visual P300 amplitude during extended performance of the oddball task. Int J Neurosci 29(3–4):199–204.

24. Hoffman, J.E., Houck, M.R., MacMillan, F.W. III, Simons, R.F., & Oatman, L.C. (1985). Event-related potentials elicited by automatic targets: A dual-task analysis. J Exp Psychol Hum Percept Perform 11(1):50–61.

25. Fenelon, B., Neill, R.A., & Manning, M. (1984). Stereoscopic cerebral evoked potentials of Air Force pilots and civilian comparison groups. Aviat Space Environ Med 55(10):914–920.

26. Steffin, M. (1999). Virtual reality biofeedback in chronic pain and psychiatry. In Lorenzo, N.Y., & Lutsep, H. (eds.), Neurology. http://www.emedicine.com.

Address reprint requests to:
Morris Steffin, M.D.
P.O. Box 5654
Scottsdale, AZ 85261-5654

E-mail: [email protected]
