
ACTA UNIVERSITATIS UPSALIENSIS Uppsala Dissertations from the Faculty of Science and Technology

115

Identification Techniques for Mathematical Modeling of the Human Smooth Pursuit System

Daniel Jansson

Dissertation presented at Uppsala University to be publicly examined in P2446, Lägerhyddsvägen 2, Uppsala, Friday, 27 November 2015 at 13:15 for the degree of Doctor of Philosophy (Faculty of Science and Technology). The examination will be conducted in English. Faculty examiner: Professor Johan Schoukens (Electrical Engineering (ELEC) Department of the Vrije Universiteit Brussel).

Abstract

Jansson, D. 2015. Identification Techniques for Mathematical Modeling of the Human Smooth Pursuit System. Uppsala Dissertations from the Faculty of Science and Technology 115. xiv+176 pp. Uppsala: Acta Universitatis Upsaliensis. ISBN 978-91-554-9367-7.

This thesis proposes nonlinear system identification techniques for the mathematical modeling of the human smooth pursuit system (SPS) with application to motor symptom quantification in Parkinson's disease (PD). The SPS refers to the complex neuromuscular system in humans that governs the smooth pursuit eye movements (SPEM). Insight into the SPS and its operation is of importance in a wide and steadily expanding array of application areas and research fields. The ultimate purpose of the work in this thesis is to attain a deeper understanding and quantification of the SPS dynamics and thus facilitate the continued development of novel commercial products and medical devices. The main contribution of this thesis is in the derivation and evaluation of several techniques for SPS characterization. While attempts to mathematically model the SPS have been made in the literature before, several key aspects of the problem have been previously overlooked. This work is the first one to devise dynamical models intended for extended-time experiments and also to consider systematic visual stimuli design in the context of SPS modeling. The result is a handful of parametric mathematical models outperforming current State-of-the-Art models in terms of prediction accuracy for rich input signals. As a complement to the parametric dynamical models, a non-parametric technique involving the construction of individual statistical models pertaining to specific gaze trajectories is suggested. Both the parametric and non-parametric models are demonstrated to successfully distinguish between individuals or groups of individuals based on eye movements. Furthermore, a novel approach to Wiener system identification using Volterra series is proposed and analyzed. It is exploited to confirm that the SPS in healthy individuals is indeed nonlinear, but that the nonlinearity of the system is significantly stronger in PD subjects. The nonlinearity in healthy individuals appears to be well-modeled by a static output function, whereas the nonlinear behavior introduced to the SPS by PD is dynamical.

Keywords: nonlinear system identification, biomedical signal processing

Daniel Jansson, Department of Information Technology, Division of Systems and Control, Box 337, Uppsala University, SE-75105 Uppsala, Sweden.

© Daniel Jansson 2015

ISSN 1104-2516
ISBN 978-91-554-9367-7
urn:nbn:se:uu:diva-264292 (http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-264292)

To whomever it may concern

Acknowledgements

I would like to start by thanking my supervisor Professor Alexander Medvedev for providing guidance and knowledge whenever I needed it, but letting me work as independently as I wished to. You have been very encouraging and always had great confidence in me, probably more than I had myself. You have taught me that if I just skip sleep, I can write a full-length research paper in four days! I am also grateful that you gave me the opportunity to travel the world. We have met in 17 cities in 10 countries and we have had a good time together in all of them.

I also want to thank my second supervisor, Professor Petre Stoica, both for your wisdom and help, but also for your anecdotes and comic relief. You taught me to always wear sunglasses when biking to avoid flies, to not eat the potatoes at BMC, to never buy cheap binoculars, and that Hu is the president of China.

A big thanks to everyone at SysCon for creating a great work environment. Thank you Marcus for convincing me to apply for the PhD position, and thank you Patrik for getting me to study engineering physics. Also thank you Olov and Marcus for our daily discussions about my and your research; I think we all benefited from them, and without you this thesis would not be what it is today.

Of course, I thank my mother and my father for always being supportive and for raising me to become what I am. This would definitely not have been possible if it were not for you.

Finally, I would like to thank Caroline for being there and supporting me during my work on this thesis. You gave me inspiration and made it all much easier.

Thank you,
Daniel

Oh and thank you Advanced Grant 247035 from European Research Council for all the cash!

Sammanfattning på svenska (Summary in Swedish, translated)

This thesis proposes a number of nonlinear system identification techniques for mathematical modeling of the human smooth pursuit system (SPS) and examines their usefulness in motor symptom quantification in Parkinson's disease (PD). The SPS refers to the complex neuromuscular system that governs the smooth pursuit eye movements (SPEM). The task of the SPS is to keep a moving object within the visual field. The resulting eye movements are voluntary in the sense that the observer can choose whether or not to follow the moving target, but once smooth pursuit has been initiated the SPS is controlled subconsciously. A moving visual stimulus is generally a prerequisite for activating the SPS; attempting to initiate smooth pursuit movements without a stimulus is difficult or impossible for most people.

Insight into this system and its function is important for a wide and steadily growing range of application areas and research fields. In recent years, the improved performance and availability of eye-tracking equipment have generated great interest in the commercial sector. Examples are found in market research, gaze-based interaction, assistive aids for physical disabilities, sports education, etc. Furthermore, the SPS is adversely affected in a number of neurological diseases. Among others, PD is associated with impairments in SPEM function. A deeper insight into the SPS and its dynamics can therefore contribute to an increased understanding of the disease and to improved possibilities for diagnosis and quantification.

In most applications, eye movements are studied explicitly; the only interest is in where a person is looking at a specific point in time. This thesis aims to develop more sophisticated techniques for the analysis of eye-tracking data in order to extract previously hidden information about the individual's SPS and thereby about the individual. This opens the door to new and more advanced application areas.

Two approaches to SPS analysis are presented in this work. The first builds on systems theory; the SPS is modeled mathematically as a parameterized dynamical system that relates gaze direction to visual stimulus. The autonomy of the SPS and its relatively weak interaction with consciousness justify such an approach. Creating a reliable dynamical model of the SPS requires some physiological knowledge and understanding, but the problem is mainly an engineering one in which mathematical methods and tools must be used, developed, and evaluated.

First, different model structures must be proposed and implemented. This step involves a thorough review of previous research on SPEM, the SPS, and mathematical modeling theory, but also requires extensive experimentation and, not least, inventiveness. The model structure is the skeleton of the model, which in general terms describes and constrains the model dynamics. Assumptions must be made about the system as a whole, but also about its components, such as the extraocular muscle system and its communication with the brain. Once a suitable model structure has been chosen, the unknown parameters of the model must be estimated from experimental data. This is done by means of system identification, whose methods rely on principles from mathematical and engineering fields such as linear algebra, spectral analysis, optimization, and signal design. Finally, the prediction accuracy of the obtained models, their biometric potential, and their usability in quantifying PD motor symptoms must be evaluated. Here, theory and techniques from fields such as statistics, anomaly detection, and classification are used.

The second approach to SPS analysis focuses on the possibility of distinguishing between individuals based on their eye movements. The method is based on non-parametric statistical models and presupposes no physiological knowledge of the SPS. Instead, large amounts of eye-tracking data are used to estimate eye-tracking profiles that represent the norm for a certain individual or group of individuals. Deviations from this norm, that is, individuals who do not fit the profile, are identified using anomaly detection methods. This approach also provides specific information about human gaze trajectories and reveals how the SPS of different individuals responds to similar stimuli.

The main contribution of this thesis is thus the derivation and evaluation of several techniques for SPS characterization. The purpose is to attain a deeper understanding of the SPS and its dynamics and thereby support the continued development of related commercial and medical products. The thesis places particular focus on the quantification of PD motor symptoms. At present, this is done by a physician through interview and clinical observation, which is subjective and time-consuming. One goal of the thesis is therefore to take a first step towards new technology that can be used to obtain individualized models of the SPS from eye-tracking data, enabling automatic and objective diagnosis and determination of the stage of the disease.

Glossary and Notation

Notation

A                 Matrices are written in bold upper case letters
x                 Vectors are written in bold lower case letters
ˆ·                A hat is used to denote an estimate
I                 The identity matrix
(·)^T             Vector or matrix transpose
(·)^*             Complex conjugate or, for vectors and matrices, the conjugate transpose
i or j            The imaginary unit (√−1) unless otherwise specified
X                 Stochastic variable
f(x)              Vector-valued function
R^{m×n}           The real-valued m×n-dimensional matrix space
R^n               The real-valued n-dimensional vector space
C^{m×n}           The complex-valued m×n-dimensional matrix space
C^n               The complex-valued n-dimensional vector space
N^n               n-dimensional set of natural numbers
L_2               The space of square-integrable functions
ℓ_2               The space of square-summable functions
L(·)              The Laplace transform
Z(·)              The Z-transform
Re{·}             The real part of a complex number
Im{·}             The imaginary part of a complex number
arg(·)            The argument of a complex number
tr(·)             The trace of a matrix
vec(·)            The column-wise vectorized version of a matrix
q^{-1}            The time-shift operator
{x_k}_{k=1}^{K}   A set of K elements x_k
⊗                 The Kronecker product
∗                 The convolution operator
≜                 Defined as equal to
∈                 Belongs to
| · |             Magnitude
‖ · ‖_p           The ℓ_p-norm
‖ · ‖_F           The Frobenius norm
E{·}              The expected value operator
V{·}              The variance operator
n!!               Double factorial, the product of all the integers from 1 up to n that have the same parity as n
PAR(·)            The Peak-to-Average ratio
P(A)              Probability of a random event A
∞                 Infinity
5                 The number 5

Abbreviations

BLA       Best Linear Approximation
DFT       Discrete Fourier Transform
DSPG      Dynamic Smooth Pursuit Gain
EOG       Electrooculogram
FFT       Fast Fourier Transform
FIR       Finite Impulse Response
GQ        Goldman Quacks
GMM       Gaussian Mixture Model
GOBF      Generalized Orthonormal Basis Function
i.i.d.    Independent and Identically Distributed
IIR       Infinite Impulse Response
KDE       Kernel Density Estimation
LASSO     Least Absolute Shrinkage and Selection Operator
LP        Linear Program
LS        Ordinary Least Squares
MDL       Minimum Description Length
MIMO      Multiple Input Multiple Output
MISO      Multiple Input Single Output
MSE       Mean Square Error
MMSE      Minimum Mean Square Error
NLS       Nonlinear Least Squares
OSA       Orthogonal Series Approximation
PAR       Peak-to-Average power Ratio
PEM       Prediction Error Method
PD        Parkinson's Disease
PDF       Probability Density Function
PSD       Power Spectral Density
RPEM      Recursive Prediction Error Method
RSS       Residual Sum of Squares
SIMO      Single Input Multiple Output
SISO      Single Input Single Output
SNR       Signal to Noise Ratio
SotA      State-of-the-Art
SPEM      Smooth Pursuit Eye Movements
SPICE     SParse Iterative Covariance-based Estimation
SPG       Smooth Pursuit Gain
SPS       Smooth Pursuit System
VL        Volterra-Laguerre

Contents

1 Introduction
    1.1 Thesis Outline
    1.2 Overview of The Appended Papers
        1.2.1 Paper I: Visual Stimulus Design in Parameter Estimation of the Human Smooth Pursuit System from Eye-Tracking Data
        1.2.2 Paper II: Volterra Modeling of the Smooth Pursuit System with Application to Motor Symptoms Characterization in Parkinson's Disease
        1.2.3 Paper III: Identification of Polynomial Wiener Systems via Volterra-Laguerre Series with Application to Smooth Pursuit System Characterization
        1.2.4 Paper IV: Non-Parametric Analysis of Eye-Movement Data by Anomaly Detection
        1.2.5 Paper V: Mathematical Modeling and Grey-Box Identification of the Human Smooth Pursuit Mechanism
    1.3 List of Publications

2 Background
    2.1 The Extraocular Muscles
    2.2 Smooth Pursuit
    2.3 Eye Tracking
    2.4 Parkinson's Disease

3 Dynamical Modeling of the SPS
    3.1 Input Design
    3.2 Proposed Models
        3.2.1 Grey-Box Model
        3.2.2 ARX Wiener Model
        3.2.3 Volterra-Laguerre Model
        3.2.4 Polynomial Wiener Model
    3.3 Simulation
        3.3.1 Grey-Box Model
        3.3.2 ARX Wiener Model
        3.3.3 Volterra-Laguerre Model
        3.3.4 Polynomial Wiener Model
    3.4 Identification
        3.4.1 Grey-Box Model
        3.4.2 ARX Wiener Model
        3.4.3 Volterra-Laguerre Model
        3.4.4 Polynomial Wiener Model
    3.5 Model Comparison
        3.5.1 Considered Models
        3.5.2 Validation
        3.5.3 User Parameters
        3.5.4 Experiment
        3.5.5 Results

4 Non-parametric modeling
    4.1 Method
        4.1.1 Anomaly Detection
        4.1.2 Distribution Estimation
        4.1.3 Illustrative Example
        4.1.4 Method extension
    4.2 Some Results
        4.2.1 Experiment
        4.2.2 Results

5 Concluding Remarks
    5.1 Future Research
        5.1.1 Model Structure
        5.1.2 Input Design
        5.1.3 Wiener System Identification: The Non-Gaussian Case
        5.1.4 Improved Eye-Tracking Profiles

6 References

Chapter 1
Introduction

This thesis proposes nonlinear system identification techniques for the mathematical modeling of the human smooth pursuit system (SPS) with application to motor symptom quantification in Parkinson's disease (PD).

The SPS refers to the complex neuromuscular system in humans that governs the smooth pursuit eye movements (SPEM). The task of smooth pursuit is to maintain a moving target within the visual field. Such movements are under voluntary control in the sense that the observer can choose whether or not to track the moving stimulus, but once the pursuit is initiated the SPS operates subconsciously [21]. A moving visual stimulus is generally required to activate the SPS and attempting to initiate smooth pursuit in the absence of stimulus is difficult or impossible for most individuals [21].

Insight into the SPS and its operation is of importance in a wide array of application areas and research fields. Lately, the increased performance and accessibility of eye-tracking technologies have generated a great deal of interest in the commercial sector. Examples are found in market research, gaze-based interaction, sports education, etc. [5, 8]. Moreover, the SPS is impaired in a variety of neurological diseases. In particular, PD is known to be associated with deficits in SPEM control [6]. A deeper insight into the SPS may thus lead to a better understanding, diagnostics, and staging of the disease.

In most applications, the eye-tracker output is explicitly examined, with interest only in where the subject is looking at a given time. This thesis makes an effort towards developing more sophisticated eye-tracking data analysis techniques to extract previously hidden information about the individual and to open up for new, more advanced applications.

Two approaches for SPS analysis are presented in this thesis. One is a systems theory approach where the SPS is mathematically modeled as a parameterized dynamical system relating gaze direction to visual stimulus. The automaticity of the SPS and its relatively weak conscious interaction make such an approach viable. Studying the SPS from this point of view demands some physiological knowledge and understanding of the system, but mainly presents an engineering problem where mathematical methods and tools must be utilized, developed and evaluated.

First, model structures must be suggested and implemented. This step involves a thorough review of previous research on SPEM, the SPS and theory of mathematical modeling as well as extensive experimentation and not least innovation. The model structure is the skeleton of the model, imposing general restrictions on the model dynamics by specifying the functional dependences between the model variables. Assumptions must be made regarding the system as a whole as well as its components, such as the workings of the extraocular system and its communication with the brain. Then, when appropriate model structures have been proposed, the models must be completed by determining unknown parameters from observed experimental data. This is done by means of system identification, whose methods rely on principles from such mathematical and engineering fields as linear algebra, spectral analysis, optimization, sequence design, etc. Finally, the obtained models must be evaluated in terms of their prediction accuracy, performance as SPEM biometrics and usability in the quantification of PD motor symptoms. Here, techniques from other fields must be applied, such as statistics, anomaly detection, and classification.

The second approach of this thesis focuses on the task of distinguishing between individuals based on eye movements. It employs non-parametric statistical models and requires no physiological knowledge of the SPS. Instead, it uses large amounts of recorded eye-tracking data to establish eye-tracking profiles, representing the smooth pursuit norm behavior of certain individuals or groups. Deviations from said norms, i.e. individuals not fitting the profiles, are identified by means of anomaly detection methods. This approach is also meant to provide more specific information about human gaze trajectories and reveal in what way the SPS responses of different individuals to a certain stimulus deviate from each other.

The main contribution of this thesis is thus in the derivation and evaluation of several techniques for SPS characterization. The purpose is to attain a deeper understanding of the SPS dynamics and consequently to aid the continued development of several commercial and medical applications. Specifically, the thesis focuses on the application of quantifying the motor symptoms of PD. Currently, this is done by interview and clinical observation, which is subjective and requires hours of interaction between the patient and a qualified clinician. An aim of the thesis is to take a first step towards new technology, in the form of a complete technique for obtaining individualized parametric and non-parametric models of the SPS from eye-tracking data, that allows for fast, automatic and objective staging of PD.


1.1 Thesis Outline

The contribution of this doctoral thesis is conveyed in five papers. The papers are based on 10 publications in conference proceedings, 1 journal publication, 1 published book chapter and 1 journal article under review (see List of Publications below). They present practical and theoretical results related to SPS characterization and PD diagnosing and staging via eye movements.

The second chapter of the thesis gives a short review of relevant background topics. The following chapters constitute a comprehensive summary of the appended papers where their respective methods and results are briefly presented and compared. Some additional results and analyses not included in the papers are also provided to give a wider perspective of the problem formulation and the thesis contribution.

Each of the five papers describes a single approach to the SPS characterization problem and draws on knowledge from different areas of engineering and mathematics. Every paper contains a review of the literature relevant to its specific topic and provides discussions on how its contribution relates to previous research. For this reason an overall literature review is left out of the introductory chapters.

1.2 Overview of The Appended Papers

During the work on this doctoral thesis, countless models of the SPS and approaches for SPS characterization have been considered and efforts have been made to further improve results, simplify identification, and reduce computational burden. Among these, the five most prominent are presented in five papers that are appended to this thesis. The papers are rewritten versions of published papers, augmented with new results and further analysis. In some cases, content has been moved from one paper to another for better consistency and continuity.

Paper I utilizes a simple SPS model and lays focus on input signal design for improved identification results and PD quantification. In Paper II, black-box Volterra series are used to model the SPS and to study how it is affected by PD symptoms. Paper III presents a novel approach to the identification of polynomial Wiener systems for white and non-white input signals. Paper IV suggests a non-parametric approach to explicitly study the gaze trajectories of healthy and PD subjects. In Paper V, a physiologically motivated grey-box model of the SPS is constructed and evaluated.

The papers are not appended in chronological order. Paper III is the most recent one, based on papers from 2014 and 2015. Paper II is based on a paper from 2014. Paper IV is based mainly on a paper from 2013 with elements from a paper from 2011. Paper I is based on papers from 2013 and Paper V is based on a paper from 2010.

The appended papers can be summarized as follows:


1.2.1 Paper I: Visual Stimulus Design in Parameter Estimation of the Human Smooth Pursuit System from Eye-Tracking Data

To allow for easy implementation and less demanding identification, the SPS is modeled as a simple Wiener system with an ARX linear block and a continuous piecewise-linear static nonlinearity. Despite its simplicity, this model is demonstrated to yield strong results in terms of its ability to distinguish between individuals and to recognize the symptoms of PD based on eye movements.
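To make the model structure concrete, the following is a minimal simulation sketch of a Wiener system: a linear difference equation followed by a continuous piecewise-linear static nonlinearity. The function name `simulate_wiener`, the noise-free setting, and all coefficient and knot values are illustrative assumptions, not the parameterization used in Paper I.

```python
import numpy as np


def simulate_wiener(u, a, b, knots, values):
    """Simulate a noise-free Wiener system (hypothetical helper).

    Linear block (ARX-type difference equation without noise):
        z[k] = -a[0]*z[k-1] - ... + b[0]*u[k-1] + ...
    Static block: continuous piecewise-linear map through (knots, values).
    np.interp holds the end values constant outside the knot range.
    """
    u = np.asarray(u, dtype=float)
    z = np.zeros_like(u)
    for k in range(len(u)):
        acc = 0.0
        for i, a_i in enumerate(a, start=1):
            if k - i >= 0:
                acc -= a_i * z[k - i]
        for j, b_j in enumerate(b, start=1):
            if k - j >= 0:
                acc += b_j * u[k - j]
        z[k] = acc
    y = np.interp(z, knots, values)
    return z, y


# Illustrative values only: one pole at 0.5, gain 1 for positive
# intermediate signals and gain 0.5 for negative ones.
z, y = simulate_wiener(
    u=[1.0, 1.0, 1.0, 1.0],
    a=[-0.5],
    b=[1.0],
    knots=[-2.0, 0.0, 2.0],
    values=[-1.0, 0.0, 2.0],
)
```

In this sketch the intermediate signal `z` is not measured; only the stimulus `u` and the output `y` would be available in an experiment, which is what makes the identification problem nonlinear.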

The main contribution of this paper is in a novel approach to design visual stimuli for use in SPS experiments. In most previous studies of the SPS, focus has been on its transient behavior and therefore the applied input signals have been steps or ramps. Some medical papers have utilized sinusoidal inputs, exciting only single modes of the system. In this paper, stimuli with the rich spectral and amplitude excitation needed for accurate system identification are generated using the presented method. The method effectively solves an optimization problem that imposes constraints on the power spectral density of the signal and shapes its amplitude distribution. It is shown that by using visual stimuli generated with the proposed method rather than less exciting input signals of past studies, reduced estimation variance and more apparent separation between the parameter clouds associated with different individuals are achieved.
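As a rough illustration of what "rich spectral excitation under an amplitude constraint" can mean, the sketch below builds a multisine with a flat spectrum over chosen harmonics and picks, among random phase draws, the one with the lowest peak-to-average power ratio (PAR). This random-phase search is a simple stand-in chosen here for illustration; it is not the optimization method of Paper I, and all function names and parameter values are assumptions.

```python
import numpy as np


def multisine(n_samples, harmonics, phases):
    """Sum of unit-amplitude cosines at integer harmonics, scaled to unit power."""
    k = np.arange(n_samples)
    u = np.zeros(n_samples)
    for h, phi in zip(harmonics, phases):
        u += np.cos(2.0 * np.pi * h * k / n_samples + phi)
    return u / np.sqrt(np.mean(u**2))


def peak_to_average_ratio(u):
    """Peak power divided by average power."""
    return np.max(np.abs(u)) ** 2 / np.mean(u**2)


def low_par_multisine(n_samples, harmonics, n_trials=200, seed=0):
    """Keep the random-phase multisine with the lowest PAR (crude search)."""
    rng = np.random.default_rng(seed)
    best_u, best_par = None, np.inf
    for _ in range(n_trials):
        phases = rng.uniform(0.0, 2.0 * np.pi, size=len(harmonics))
        u = multisine(n_samples, harmonics, phases)
        par = peak_to_average_ratio(u)
        if par < best_par:
            best_u, best_par = u, par
    return best_u, best_par


# Ten excited harmonics over one period of 512 samples (illustrative values).
u, par = low_par_multisine(512, range(1, 11))
```

The phases change only the amplitude distribution of the signal; the magnitude spectrum stays flat over the excited harmonics in every trial.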

This paper also introduces the Orthogonal Series Approximation (OSA) approach to probability density estimation. It is used to obtain the statistical properties of the parameter values of test subjects. Furthermore, the notion of outlier regions along with a method for their derivation for general probability density functions (PDFs) are introduced. The outlier regions are used to visualize the clustering of parameters to reveal inter-subject differences.
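The general idea of an orthogonal series density estimate can be sketched as follows: the density is written as a truncated expansion in an orthonormal basis, and each coefficient is estimated by the sample mean of the corresponding basis function. The cosine basis on a bounded interval, the one-dimensional setting, and the function names below are assumptions made for illustration; the basis and dimensionality used in Paper I may differ.

```python
import numpy as np


def _cosine_basis(t, n_terms):
    """Orthonormal cosine basis on [0, 1]: 1, sqrt(2)cos(pi t), sqrt(2)cos(2 pi t), ..."""
    j = np.arange(n_terms)
    scale = np.where(j == 0, 1.0, np.sqrt(2.0))
    return scale * np.cos(np.pi * np.outer(t, j))


def osa_fit(samples, lo, hi, n_terms):
    """Estimate expansion coefficients as sample means of the basis functions."""
    t = (np.asarray(samples, dtype=float) - lo) / (hi - lo)
    return _cosine_basis(t, n_terms).mean(axis=0)


def osa_pdf(x, coeffs, lo, hi):
    """Evaluate the truncated series density estimate on [lo, hi]."""
    t = (np.asarray(x, dtype=float) - lo) / (hi - lo)
    return _cosine_basis(t, len(coeffs)) @ coeffs / (hi - lo)


# Illustrative data: 2000 draws concentrated around 0.5 on the interval [0, 1].
rng = np.random.default_rng(0)
samples = rng.normal(0.5, 0.1, size=2000)
coeffs = osa_fit(samples, 0.0, 1.0, n_terms=8)
density_at_half = osa_pdf(0.5, coeffs, 0.0, 1.0)[0]
```

The estimate integrates to one by construction, but a truncated series can dip below zero in low-density regions, so the number of terms acts as a smoothing parameter.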

1.2.2 Paper II: Volterra Modeling of the Smooth Pursuit System with Application to Motor Symptoms Characterization in Parkinson's Disease

The SPS is represented using a Volterra series expansion which can be consid-ered a black-box way of modeling general time-invariant nonlinear systems.To increase parsimony, the kernel functions are expanded in terms of the or-thonormal set of multivariate discrete Laguerre functions, effectively elimi-nating the need for time truncation and reducing the identification problem tothe estimation of the expansion coefficients. Moreover, through the use of thesparse estimation method SPICE, the number of nonzero parameters is furtherreduced. The obtained models are shown to outperform earlier SPS models.Specifically, they perform better than the Wiener model of Paper I, suggest-ing that the assumptions on model structure given there were partially invalid.


Furthermore, it is shown that the SPS in PD exhibits dynamical nonlinear behavior, whereas the nonlinearity in health is largely static. Finally, by using Gaussian mixture models, estimated parameter vectors are successfully partitioned into those associated with healthy subjects and those associated with PD subjects.

1.2.3 Paper III: Identification of Polynomial Wiener Systems via Volterra-Laguerre Series with Application to Smooth Pursuit System Characterization

A new method for identification of polynomial Wiener systems is suggested. The analysis relates the Volterra-Laguerre model presented in Paper II to polynomial Wiener systems and shows how to effectively divide the nonlinear identification problem into two linear ones, reducing the computational burden significantly. In the case of model mismatch, the resulting polynomial Wiener model is biased compared to the true system. However, for white inputs, the analysis of the paper provides explicit analytical expressions for the resulting parameter bias, which are then used in the derivation of a bias reduction algorithm. In many cases, the algorithm manages to exactly recover the true parameter values.

The strengths and drawbacks of the method are evaluated in several numerical experiments, where it is shown to perform better than current off-the-shelf methods both in terms of identification accuracy and computational speed.

Although the contribution of this paper is mainly theoretical, the proposed method is demonstrated through experiment to produce high-quality models of the SPS, outperformed only by the more general Volterra-Laguerre models of Paper II.

1.2.4 Paper IV: Non-Parametric Analysis of Eye-Movement Data by Anomaly Detection

A non-parametric method of SPS characterization is proposed as a complement to the parametric methods of this thesis. The method is derived within an anomaly detection framework and is aimed at finding gaze trajectories deviating from the norm. The norm is constituted by a statistical model obtained using large amounts of eye-tracking data from an individual or a select group of individuals, and is referred to as the eye-tracking profile of that individual or group.

The eye-tracking profiles are established by methods of trajectory distribution estimation based on the same PDF approximation techniques as described in Paper I. Here, Kernel Density Estimation (KDE) is used instead of OSA, but the same procedure of finding outlier regions is applied. Furthermore, it is revealed that the distribution of gaze trajectory data is generally not Gaussian and that KDE therefore produces favorable results compared to the common approach of normal distribution fitting.
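To illustrate why a kernel estimate can be preferable to a normal fit for non-Gaussian data, the following is a minimal sketch of one-dimensional Gaussian KDE. The function name `gaussian_kde`, the bandwidth value and the sample data are hypothetical and chosen only for illustration; the kernel and bandwidth selection actually used in Paper IV are not specified here.

```python
import math

def gaussian_kde(samples, bandwidth):
    """Return a function x -> estimated density built from 1-D samples."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # Sum of Gaussian kernels centered on each sample
        return norm * sum(
            math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in samples
        )

    return density

# Hypothetical bimodal gaze positions (cm). A single normal fit would place
# its peak near 0, where no data were actually observed.
samples = [-2.1, -1.9, -2.0, 2.0, 1.9, 2.1]
pdf = gaussian_kde(samples, bandwidth=0.3)
```

For these samples the estimate is large around the two clusters and nearly zero between them, which a fitted normal distribution cannot reproduce.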

To further improve the non-parametric model, the dynamical nature of the SPS is partly accounted for by also including estimates of gaze velocity in the eye-tracking profiles. Without this augmentation, the method succeeds in recognizing age differences between different test subjects, but occasionally fails to distinguish the effects of PD from those of healthy aging. However, with the position/velocity profiles, this problem is overcome.

1.2.5 Paper V: Mathematical Modeling and Grey-Box Identification of the Human Smooth Pursuit Mechanism

A grey-box model of the SPS is constructed by putting a fourth-order nonlinear white-box model of the eye plant in a feedback loop together with an empirical nonlinear controller. This approach to modeling the SPS is novel, since no previous studies have adopted a physiologically accurate eye-plant model. The resulting fifth-order nonlinear model is shown to produce favorable results compared to what has been achieved in earlier research. System properties such as the Smooth Pursuit Gain (SPG) and angular velocity step responses are evaluated using the obtained model and shown to agree with what has been previously observed in clinical experiments. Furthermore, the model is analyzed to foster a better understanding of the nonlinear dynamics of the SPS. A drawback of the model is that, due to its complexity, the identification is computationally demanding.


1.3 List of Publications

Journal Publications

1. D. Jansson, A. Medvedev, Identification of Polynomial Wiener Systems via Volterra-Laguerre Series, IEEE Transactions on Signal Processing, Under review, 2015.

2. D. Jansson, O. Rosen, A. Medvedev, Parametric and Non-Parametric Analysis of Eye-Tracking Data by Anomaly Detection, IEEE Transactions on Control Systems Technology, vol. 23, 2015.

Conference Publications

1. D. Jansson, A. Medvedev, Identification of Polynomial Wiener Systems via Volterra-Laguerre Series with Model Mismatch, IFAC Conference on Modelling, Identification and Control of Nonlinear Systems, Saint-Petersburg, Russia, 2015.

2. D. Jansson, A. Medvedev, System Identification of Wiener Systems via Volterra-Laguerre Models: Application to Human Smooth Pursuit Analysis, 14th IEEE European Control Conference, Linz, Austria, 2015.

3. D. Jansson, A. Medvedev, Volterra Modeling of the Smooth Pursuit System with Application to Motor Symptoms Characterization in Parkinson’s Disease, 13th IEEE European Control Conference, Strasbourg, France, 2014.

4. D. Jansson, A. Medvedev, H. Axelson, D. Nyholm, Stochastic Anomaly Detection in Eye-Tracking Data for Quantification of Motor Symptoms in Parkinson’s Disease, International Symposium on Computational Models for Life Sciences: CMLS 2013, Sydney, Australia, 2013 (Best Student Paper Prize).

5. D. Jansson, A. Medvedev, H. Axelson, D. Nyholm, Parametric and Non-Parametric System Identification of Oculomotor System with Application to the Analysis of Smooth Pursuit Eye Movements in Parkinson’s Disease, INCF Congress on Neuroinformatics, Stockholm, Sweden, 2013.

6. D. Jansson, A. Medvedev, Parametric and Non-Parametric Stochastic Anomaly Detection in Analysis of Eye-Tracking Data, 52nd IEEE Conference on Decision and Control, Florence, Italy, 2013.

7. D. Jansson, O. Rosen, A. Medvedev, Non-Parametric Analysis of Eye-Tracking Data by Anomaly Detection, 12th IEEE European Control Conference, Zurich, Switzerland, 2013.

8. D. Jansson, A. Medvedev, Visual Stimulus Design in Parameter Estimation of the Human Smooth Pursuit System from Eye-Tracking Data, IEEE American Control Conference, Washington D.C., 2013.

9. D. Jansson, A. Medvedev, Dynamic Smooth Pursuit Gain Estimation from Eye-Tracking Data, 50th IEEE Conference on Decision and Control, Orlando, Florida, 2011.

10. D. Jansson, A. Medvedev, P. Stoica, H. Axelson, Mathematical Modeling and Grey-Box Identification of the Human Smooth Pursuit Mechanism, IEEE Multi-conference on Systems and Control, Yokohama, Japan, 2010.

Licentiate Thesis

D. Jansson, Mathematical Modeling of the Human Smooth Pursuit System, IT Licentiate theses, Uppsala University, Department of Information Technology, 2014.

Book Chapter

D. Jansson, A. Medvedev, H. Axelson, D. Nyholm, Stochastic Anomaly Detection in Eye-Tracking Data for Quantification of Motor Symptoms in Parkinson’s Disease, Signal and Image Analysis for Biomedical and Life Sciences, vol. 823 of Advances in Experimental Medicine and Biology, Springer, 2015.


Chapter 2
Background

This chapter reviews some background topics that are relevant to the contributions of the thesis and with which the reader may not be familiar. Although the topics are discussed in the appended papers, the mention is often very brief. Here, a concise but informative summary is given to put the remainder of the thesis into context.

2.1 The Extraocular Muscles

There are six muscles governing the movement of the eye, referred to as the extraocular muscles. Fig. 2.1 shows the right eye with its accompanying extraocular muscles. Four of the muscles control the movement of the eye in the four directions up, down, left, and right. The remaining two muscles control the adjustments in gaze direction involved in counteracting head movements. The four muscles controlling standard eye movement are the superior, inferior, lateral and medial recti. The two remaining muscles are the superior and inferior oblique. The primary actions of the superior and inferior recti are elevation (upward movement) and depression (downward movement), respectively, and those of the lateral and medial recti are abduction (away from the median sagittal plane of the body) and adduction (towards the median sagittal plane).

Figure 2.1. The extraocular muscles of the right eye. Labeled muscles: superior oblique, superior rectus, medial rectus, lateral rectus, inferior rectus, inferior oblique.

2.2 Smooth Pursuit

The two ways in which humans can voluntarily shift gaze are smooth pursuit eye movements (SPEM) and saccades. Saccades are discrete movements that quickly change the orientation of the eyes, thereby translating the image of the object of interest from an eccentric retinal location to the fovea (the center of the retina responsible for sharp central vision). Smooth pursuit is a continuous movement that slowly rotates the eyes to track the motion of an object and to keep it within the visual field. SPEM are governed by the smooth pursuit system (SPS). Smooth pursuit is primarily driven by visual motion, which makes it difficult for most individuals to initiate it without a moving target. The maximum angular velocity of the eyes during smooth pursuit is about 80 to 100 °/s [15]. For targets exceeding this velocity, the SPS passes control to the saccadic system. Research has shown that direction-selective, motion-sensitive cells in the primary visual cortex estimate target angular velocity [17] and that the SPS acts as a velocity servo, in that it tries to minimize the angular velocity error between the gaze and the target [20]. Any stationary error in angular position will be left uncorrected by this mechanism.

There are several research papers on quantifying the SPS in an attempt to use SPEM as a biometric. Most papers on the subject are published in medical journals and apply straightforward and simple techniques for data analysis. An example of such a technique is the evaluation of the Smooth Pursuit Gain (SPG), which is the ratio between the eye velocity and the stimulus velocity. It is used as a measure for characterizing the SPS in e.g. [11, 13]. However, since the SPG is nothing but the steady-state angular velocity gain, it is merely one point in the frequency response of the SPS and may thus not be an exhaustive metric.
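As a minimal sketch of how an SPG value might be computed from sampled position data, the example below estimates velocities by forward differences and forms a least-squares ratio of eye velocity to stimulus velocity. The function names, the forward-difference scheme and the least-squares averaging are assumptions made for illustration, not the procedure of the cited studies.

```python
def velocity(position, dt):
    """Forward-difference velocity estimate from a sampled position signal."""
    return [(b - a) / dt for a, b in zip(position, position[1:])]

def smooth_pursuit_gain(eye_pos, stim_pos, dt):
    """Least-squares ratio of eye velocity to stimulus velocity."""
    v_eye = velocity(eye_pos, dt)
    v_stim = velocity(stim_pos, dt)
    num = sum(e * s for e, s in zip(v_eye, v_stim))
    den = sum(s * s for s in v_stim)
    return num / den

# Hypothetical ramp stimulus sampled at 60 Hz, tracked at 90 % of its velocity
dt = 1.0 / 60.0
stim = [10.0 * k * dt for k in range(120)]
eye = [0.9 * p for p in stim]
gain = smooth_pursuit_gain(eye, stim, dt)
```

A single number of this kind summarizes the response at one operating point only, which is the limitation noted above.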

2.3 Eye Tracking

Eye tracking is the process of measuring either the point of gaze or the motion of the eye relative to the head. The two most common techniques for eye-movement registration are electrooculography (EOG) and video eye tracking. In EOG, electrodes are placed around the eye to measure the potential differences produced by the retina as it turns, see Fig. 2.2. Assuming that the resting potential is constant, the recorded potential is a measure of the eye’s angular position. Therefore, to calculate the point of gaze on the monitor, the distance from the test subject to the monitor must be known. If this distance is not measured continuously throughout the experiment, it is important that the head remains still, for example by using a mounted chin rest.

The signal acquired from an EOG measurement is called the electrooculogram. Because the EOG relies on the potential differences produced by a shift in the angular position of the retina, it is possible to use EOG even when the eyes are closed, and it can thus be used in, for example, sleep studies. A drawback of the EOG is the fact that the resting potential is often not constant, resulting in nonlinear trends in the recorded data. Another drawback is the somewhat daunting task of placing the electrodes, which also induces some discomfort in the test subject due to the need for a thorough scrubbing where the electrodes are to be placed.

Figure 2.2. EOG electrodes placed around the eyes to measure the potential differences produced when the eye turns.

Video eye tracking uses one or more cameras, usually infrared, together with image analysis algorithms to locate the pupils via the corneal reflections and to determine the gaze direction. Video eye tracking is non-invasive and quick, but a simple calibration procedure for the individual is needed before using the eye tracker. When more than one camera is utilized, the images from the different cameras can be combined to form a 3D environment, allowing for accurate tracking of the position and orientation of the head, which greatly improves the gaze direction measurements.

In Paper V of this thesis, the EOG method of eye tracking is employed. In the remaining papers, a two-camera video eye tracking system from Smart Eye AB, Sweden is used. The eye tracker outputs the distance in centimeters (horizontal and vertical components separately) from the monitor center to the point where the gaze vector intersects the monitor. The system samples the gaze direction at 60 Hz. Fig. 2.3 is a screenshot from the eye tracking software showing how the algorithms have found the gaze direction and the orientation of the head.

Figure 2.3. A screenshot from the eye tracking software.
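Since the video eye tracker reports on-screen offsets in centimeters while the SPS is usually discussed in terms of angles, a conversion is sometimes needed. The sketch below shows one such conversion under the simplifying assumptions that the eye is located on the normal through the monitor center, that the viewing distance is known, and that each axis is treated separately. The function name and the example distance are hypothetical.

```python
import math

def gaze_angle_deg(offset_cm, distance_cm):
    """Gaze angle (degrees) for an on-screen offset from the monitor center.

    Assumes the eye sits on the normal through the monitor center at the
    given viewing distance; horizontal and vertical axes are handled
    independently, which is an approximation.
    """
    return math.degrees(math.atan2(offset_cm, distance_cm))

# Hypothetical example: a gaze point 10 cm right of center, viewed from 60 cm
angle = gaze_angle_deg(10.0, 60.0)
```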

2.4 Parkinson’s Disease

Parkinson’s disease (PD) is a degenerative disorder of the central nervous system. The cause of the disease is attributed to degeneration of dopaminergic neurons in the substantia nigra [10]. The impaired ability of the substantia nigra to synthesize dopamine causes a progressive depletion of this significant neurotransmitter in the putamen and caudate nucleus. The progression of PD is characterized by tremor during rest, abnormal gait, muscular rigidity and impaired balance [10].

Currently, the status of PD in a patient is evaluated through the Unified Parkinson’s Disease Rating Scale (UPDRS), which is qualitatively interpreted by a clinician [19]. Knowledge of the current status of the disease in a patient is important for the selection and dosage of drug therapy. There are two major issues with the scale. The process of observing and interviewing the patient to determine the UPDRS result is time consuming and often tiring for both the patient and the clinician. Moreover, since the scale is qualitatively interpreted, there may be variation among the subjective decisions of different clinicians. Hence, it is of great interest to find means for quick and objective quantification of the PD status in a patient.

It has been shown that the SPS is negatively affected by PD and that the severity and character of the impairment are related to the progression of the disease [13, 16, 34]. Consequently, acquiring a full understanding of the SPS may be a first step towards developing a technology that allows for fast and automatic PD staging.


Chapter 3
Dynamical Modeling of the SPS

The SPS is a dynamical system relating gaze direction to the movements of a dynamical visual stimulus. The main contributors to the inertia of the system are the eye globe and the extraocular system, together with the various delays and dynamics introduced in the feedback loop implementing the interaction with the brain. The SPS can thus be described by a dynamical model in the form of differential or difference equations. The input to the system is the motion of the visual stimulus, henceforth denoted by s. The output is the resulting gaze direction, y. Because the eye can move in both the horizontal and the vertical plane, both the input and the output signals can be considered to be two-dimensional, rendering the SPS a Multiple Input Multiple Output (MIMO) system. In this thesis, the weak interaction between the horizontal and the vertical part of the SPS is mostly ignored. Instead, the SPS is viewed as two SISO systems in parallel, one governing the horizontal and one the vertical movements.

As a coarse classification of dynamical models in system identification, the model types white-box, grey-box and black-box are often used [27].

White-box modeling requires complete knowledge of the system to be modeled. The models are based on first principles and are derived from physical laws. Although very useful, white-box models are relatively uncommon due to the exceeding complexity of most processes in nature.

Grey-box modeling requires partial knowledge of the system to be modeled. Through certain insight into the system, a semi-physical description of it can be obtained, where one or more of the model parameters have been assigned physical meaning. Grey-box models are important tools and widely used to predict and evaluate the behavior of countless processes in a range of industrial and scientific applications [1].

In black-box modeling [27], a general model structure is assumed without any physical meaning in the parameters. There are a number of common black-box structures for both linear and nonlinear systems. Black-box models provide simple means of evaluating the dynamics of a system, generally at low computational cost and without the need for any deeper understanding of its nature.

In the case of the SPS, white-box modeling is a near impossible task due to the vast complexity of the feedback loop, that is, the interaction between the eyes and the brain, which is affected by the not easily modeled human consciousness. Instead, this thesis considers various parametric and non-parametric black- and grey-box models, both linear and nonlinear, to portray the SPS.

3.1 Input Design

If the SPS is viewed as a dynamical system where gaze direction or gaze velocity is the output, the input is the position or velocity of a smoothly moving object or target. Throughout the experiments of this thesis, the target is a white circle moving in a black background window on a computer monitor. The color and size of the circle are not of great significance when studying the SPS, since the SPS responds to motion. However, the circle should be small enough so that saccades between different parts of it are negligible, and both the size and color must be chosen so that the circle is clearly visible.

The way the circle moves has a crucial effect on the outcome of any experiment, no matter which method is used to evaluate the eye tracker output. For the purposes of this study, it is important that the stimulus is designed to excite all the relevant modes of the SPS to allow for accurate system identification. It is not obvious how to achieve such excitation, but an attempt is made in Paper I, where a stimulus generation method is derived to yield stimuli that excite both the linear and nonlinear parts of the system.

3.2 Proposed Models

Four different dynamical models of the SPS are proposed in this thesis.

The first, which is described in Paper V, is a grey-box model devised from physical and biological knowledge of the system. It is an attempt to accurately model the SPS by capturing all the fine nonlinear dynamics and by assigning physical meaning to all parameters. The input and output of the system are the stimulus and gaze angular velocities, respectively.

The second is a Wiener model described in Paper I, adopting an ARX linear block and a continuous piecewise-linear static nonlinearity. It is intended to be a simpler model that is easier to work with, retaining the overall behavior of the first model, but without the physically meaningful parameters. Here it is important to note that the model input and output are the stimulus and gaze positions instead of velocities. For this model, the increase in model accuracy when using velocity signals was not large enough to motivate the extra effort required for signal differentiation, which is not only time consuming, but also introduces extra uncertainty depending on the differentiation method at hand.

The third is a sparse Volterra-Laguerre model, described in Paper II, which can be considered a more general nonlinear black-box model. It is used to show that the nonlinearities of the SPS are in fact dynamical and that the Wiener model of Paper I merely approximates the true nonlinear behavior.

The fourth model is the polynomial Wiener model treated in Paper III. The model is obtained by projecting a Volterra-Laguerre model onto the set of polynomial Wiener models, thereby reducing the number of parameters. This Wiener model differs from the one in Paper I in that the latter assumes a certain structure for the linear block, while the former is black-box without a priori assumptions.

3.2.1 Grey-Box Model

The grey-box model of the SPS derived in Paper V is of the form shown in Fig. 3.1. The eye plant is put in a feedback loop together with a controller representing the involvement of the brain and the nervous system. The output angular velocity, y(t), of the eye plant is fed back and subtracted from the stimulus velocity, s(t), to create a velocity error signal e(t). The velocity error is passed to the controller, which produces the neural input signals n(t) to the eye plant based on the dynamics of the error. Depending on the structure and properties of the biomechanical model and the controller model, the degree of accuracy to which this negative feedback-based model simulates the actual extraocular system will vary.

[Block diagram: s(t) and the fed-back y(t) form the error e(t), which enters the Controller; the controller output n(t) drives the Eye plant, whose output is y(t).]

Figure 3.1. Overview of the model structure for the considered grey-box models of the SPS.

The assumed biomechanical model of the eye plant is a fourth-order nonlinear model devised from knowledge of the physical nature of the eye globe and its appendages, the muscles and tendons of the extraocular system. The purpose of the biomechanical model is to constrain the modeled ocular movements to be within the limits of what is physically possible. Velocities and accelerations must be correctly modeled and motion must be increasingly restrained as the angle of gaze grows.

The controller model must mimic the behavior of the part of the brain and nervous system involved in the SPS control. There are many plausible ways to model the controller. Since the complex structure of the brain and nervous system is neither exactly known nor possible to model with simple dynamic models, an empirical controller must be used. It is noted in previous research that the eye is more sensitive to perturbations when at a high angular velocity than during fixation [2], which motivates the use of a dynamic gain control, where the gain depends linearly on the ocular rotational speed. In Paper V, such a dynamic gain controller is combined with the mentioned eye plant description to form the closed-loop model of the SPS.

The complete model can be described by a set of differential equations

$$\dot{x} = f(x, s), \tag{3.1}$$

where the elements of the state vector $x$ represent various quantities such as the gaze angle $y$, the gaze angular velocity $\dot{y}$ and the forces in the extraocular muscles, and where $f$ describes the dynamics of the SPS. For a full description of this model, see the details of Paper V.

The unknown parameters to be estimated in this model are the two parameters of the controller. Values for all the parameters in the eye plant model have been determined experimentally in previous research.

3.2.2 ARX Wiener Model

The grey-box model of Paper V is accurate and its parameters have physical and biological meaning. However, due to the model complexity, identification is computationally demanding and theoretical evaluation of the obtained estimates is difficult. The main focus of Paper I is SPS input design, and it is desirable to use a model that is easy to work with but still captures the overall behavior of the system adequately. A linear black-box model may seem an appropriate first candidate for the task. However, as is revealed in the appended papers, linear models are unable to accurately predict the amplitudes of the SPS output. A nonlinearity must be included in the model in order for the model output to comply with data. A simple approach to alleviating this problem is to augment a linear model with a static output nonlinearity to yield a Wiener model. The Wiener model consists of a linear dynamic block in cascade with a static nonlinear function, as shown in Fig. 3.2.

[Block diagram: input → linear dynamics → static nonlinearity → output]

Figure 3.2. The structure of a Wiener system.

In Paper I, the linear block is assumed to have a time-invariant ARX structure. The output of the linear block is fed to a static nonlinearity, which in this study is for simplicity chosen to be a continuous piecewise-linear function. The resulting discrete Wiener model of the SPS is given by

$$\begin{cases} (1 + a_1 q^{-1} + a_2 q^{-2} + a_3 q^{-3})\, y_\ell(k) = q^{-4} s(k), \\ y(k) = f(y_\ell(k)) + \zeta(k), \end{cases} \tag{3.2}$$

where $y(k)$ is the gaze direction at time $kT_s$, $y_\ell(k)$ is the output of the linear block, $s(k)$ is the input (i.e. the position of the visual stimulus), $\zeta(k)$ is zero-mean white Gaussian noise with variance $\sigma^2$, $\theta = [a_1, a_2, a_3]^T$ is the parameter vector of the linear block, $f(x)$ is the static nonlinearity, $q$ is the forward time shift operator, $k = 0, 1, \dots$, and $T_s$ is the sampling time.

This model has a user parameter to be specified, namely the number of components in the piecewise-linear output block. A larger value yields a smoother nonlinearity but also increases the number of unknown parameters to be estimated.

For details on the choice of the linear block structure and the static nonlinearity, refer to Paper I. The unknown parameters to be estimated in this model are $\theta$ and the parameters of $f$.

3.2.3 Volterra-Laguerre Model

The assumed Wiener model of Paper I produces better results than linear models do. However, since the input and output of the model are target position and gaze direction respectively, the system has unity gain for any constant input (the gaze is on the target during fixations), which contradicts the assumption of a static nonlinearity in the SPS. The static nonlinearity of the Wiener model thus only approximates the actual dynamical nonlinear behavior of the system. In Paper II, a Volterra series approach is used to capture the nonlinear dynamics of the SPS, without imposing any restrictions on the structure of its nonlinearity.

The Volterra series can be viewed as a dynamical analogue to the Taylor series. It is a functional expansion of a dynamic, nonlinear, time-invariant system. The Volterra series expansion of a continuous time-invariant system with input $s(t)$ and output $y(t)$ is

$$y(t) = k_0 + \sum_{n=1}^{\infty} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} k_n(t_1, \dots, t_n)\, s(t - t_1) \cdots s(t - t_n)\, dt_1 \cdots dt_n + \zeta(t),$$

where $k_n$ is the $n$-th order Volterra kernel, which can be regarded as a higher-order impulse response of the system. The signal $\zeta(t)$ represents noise.

Since in any physically realizable system the output can only depend on previous values of the input, the kernels will be zero for any negative values of the time variables $t_1, \dots, t_n$. The integrals may then be written over the interval from zero to infinity. In Paper II, the SPS is modeled by a discrete system and thus the discrete version of the Volterra series must be used. A discrete time-invariant system with input $s(k)$ and output $y(k)$, $k = 0, \dots, K-1$, can be approximated by the Volterra series

$$y(k) = y_0 + \sum_{n=1}^{N} H_n s(k) + \zeta(k), \tag{3.3}$$

where $\zeta(k)$ is a noise term, $N \in \mathbb{N}$ is the chosen Volterra order and

$$H_n s(k) = \sum_{i_1=0}^{\infty} \cdots \sum_{i_n=0}^{\infty} h_n(i_1, \dots, i_n)\, s(k - i_1) \cdots s(k - i_n)$$

are the Volterra functionals. The functions $h_n$ are the discrete Volterra kernels.

In system identification, an appropriate Volterra representation has to be found by e.g. minimizing the sum of squared errors. In general, this minimization requires the solution of a simultaneous set of series equations, which in most practical cases is difficult or even impossible to obtain. Hence, estimation of Volterra coefficients is generally performed by estimating the coefficients of an orthogonalized series, e.g. the Wiener series, and then recomputing the coefficients of the original Volterra series. In many cases the number of unknown coefficients to be estimated is immense even if the considered time window is of moderate size and the chosen series order is low.
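To make the Volterra functionals concrete, the following minimal sketch evaluates a second-order discrete Volterra series at a single time instant. It assumes, purely for illustration, that the kernels are truncated to a finite memory M and that the input is zero before the start of the record; the function name `volterra2` and the kernel values are hypothetical.

```python
def volterra2(s, k, h0, h1, h2):
    """Second-order discrete Volterra output at time k (noise-free).

    h1 has length M and h2 is an M x M nested list: the kernels are
    truncated to memory M, which is an illustrative assumption.
    """
    M = len(h1)

    def past(i):
        # s(k - i), taken as zero outside the recorded input
        return s[k - i] if 0 <= k - i < len(s) else 0.0

    linear = sum(h1[i] * past(i) for i in range(M))
    quadratic = sum(
        h2[i][j] * past(i) * past(j) for i in range(M) for j in range(M)
    )
    return h0 + linear + quadratic

# Hypothetical kernels with memory M = 2
y1 = volterra2([1.0, 2.0], 1, 0.0, [1.0, 0.5], [[0.1, 0.0], [0.0, 0.0]])
```

Even in this small case the second-order kernel has M² entries, and an n-th order kernel has Mⁿ, which illustrates why the number of unknown coefficients grows so quickly with the memory length and the series order.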

To avoid solving the system of series equations for a very large number of unknown variables, Paper II presents an alternative method. The kernels are there expanded in terms of the orthonormal Laguerre functions. The resulting Volterra functionals can then be written as

$$H_n s(k) = \sum_{j_1=0}^{L} \cdots \sum_{j_n=0}^{L} \gamma_n(j_1, \dots, j_n)\, \psi_{j_1}(k) \cdots \psi_{j_n}(k), \tag{3.4}$$

where $L$ is the chosen Laguerre order and the sequences $\{\psi_j(k)\}_{j=0}^{L}$ are the responses of the Laguerre filters $\phi_j(k)$ to the input sequence $s(k)$, i.e.

$$\psi_j(k) = \sum_{i=0}^{\infty} \phi_j(i)\, s(k - i). \tag{3.5}$$

The convolutions in (3.5) can be computed exactly through the system

$$\psi_L(k+1) = F_L \psi_L(k) + G_L s(k), \tag{3.6}$$

where

$$\psi_L(k) = [\psi_0(k)\ \ \psi_1(k)\ \ \dots\ \ \psi_L(k)]^T,$$

and $F_L \in \mathbb{R}^{(L+1) \times (L+1)}$ and $G_L \in \mathbb{R}^{(L+1) \times 1}$ are matrices derived in Paper II. Possible initializations of (3.6) are also discussed there.

With these functionals, model (3.3) is referred to as a Volterra-Laguerre model, where the unknown parameters to be estimated are the Volterra-Laguerre coefficients $\gamma_n$. The number of unknown Volterra-Laguerre coefficients is much lower than the number of unknown parameters in the standard Volterra series.
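As an illustration of how the Laguerre filter outputs can be generated recursively, the sketch below uses the textbook cascade realization of the discrete Laguerre filter bank: a first-order low-pass section followed by identical all-pass sections with pole α. This is an assumption made for illustration; it is not necessarily identical to the state-space matrices or the initialization derived in Paper II, and the function name `laguerre_outputs` is hypothetical.

```python
import math

def laguerre_outputs(s, L, alpha):
    """Return psi, where psi[k][j] is the output of Laguerre filter j at time k.

    Standard cascade form with pole alpha (|alpha| < 1) and zero initial state:
      psi_0(k) = alpha * psi_0(k-1) + sqrt(1 - alpha^2) * s(k)
      psi_j(k) = alpha * psi_j(k-1) + psi_{j-1}(k-1) - alpha * psi_{j-1}(k)
    """
    gain = math.sqrt(1.0 - alpha ** 2)
    prev = [0.0] * (L + 1)  # filter outputs at the previous time instant
    psi = []
    for sk in s:
        cur = [0.0] * (L + 1)
        cur[0] = alpha * prev[0] + gain * sk
        for j in range(1, L + 1):
            cur[j] = alpha * prev[j] + prev[j - 1] - alpha * cur[j - 1]
        psi.append(cur)
        prev = cur
    return psi

# Impulse responses of the first three filters (alpha chosen arbitrarily)
impulse = [1.0] + [0.0] * 199
phi = laguerre_outputs(impulse, L=2, alpha=0.5)
```

Feeding an impulse through the bank returns the Laguerre functions themselves, so their orthonormality can be checked numerically by summing products of the columns of `phi`.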

It should be noted that a Volterra-Laguerre model is a special case of a Wiener-Schetzen model [23], [32], which can be viewed as a Wiener system with Single-Input-Multiple-Output (SIMO) linear dynamics and a Multiple-Input-Single-Output (MISO) static nonlinearity, as depicted in Fig. 3.3. The linear dynamics are given by (3.6), where each $\psi_j(k)$ can be viewed as a separate output signal. The static nonlinearity is constituted by the combination of these signals in a multivariate polynomial, given in (3.4).

[Block diagram: input → linear dynamics → static nonlinearity → output]

Figure 3.3. The structure of a Volterra-Laguerre model.

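To simplify both simulation and identification, it is convenient to write the Volterra-Laguerre model in matrix form

$$y = [A \;\; I] \begin{bmatrix} \gamma \\ \zeta \end{bmatrix} = B\beta, \tag{3.7}$$

where $y$ is a vector of output measurements at time instances $k = 0, \dots, K-1$, $\gamma$ is a vector with all unknown coefficients to be estimated, $\zeta$ is a vector of noise terms, $A \in \mathbb{R}^{K \times (N_c + 1)}$ is the coefficient matrix constructed from the functionals in (3.4) and $I$ is the identity matrix of order $K$. $A$ is given by

$$A = \begin{bmatrix} 1 & \varphi_1^T(0) & \varphi_2^T(0) & \dots & \varphi_N^T(0) \\ 1 & \varphi_1^T(1) & \varphi_2^T(1) & \dots & \varphi_N^T(1) \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & \varphi_1^T(K-1) & \varphi_2^T(K-1) & \dots & \varphi_N^T(K-1) \end{bmatrix} \tag{3.8}$$

where $\varphi_1(k) = \psi_L(k)$, $\varphi_2(k) = \psi_L(k) \otimes \psi_L(k)$, $\varphi_3(k) = (\psi_L(k) \otimes \psi_L(k)) \otimes \psi_L(k)$ etc., with $\otimes$ denoting the Kronecker product.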
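The construction of one row of the coefficient matrix, and the corresponding noise-free model output, can be sketched as follows. The helper names `kron`, `regressor_row` and `simulate_output` are hypothetical, and the full Kronecker products are kept even though they contain repeated monomials; any reduction to unique terms is left out of this sketch.

```python
def kron(a, b):
    """Kronecker product of two vectors given as flat lists."""
    return [x * y for x in a for y in b]

def regressor_row(psi, N):
    """Row [1, phi_1^T, ..., phi_N^T] of A for one time instant.

    psi is the vector of Laguerre filter outputs at that instant and
    phi_n is the n-fold Kronecker power of psi.
    """
    row = [1.0]
    phi = list(psi)
    for _ in range(N):
        row.extend(phi)
        phi = kron(phi, psi)
    return row

def simulate_output(psi_seq, gamma, N):
    """Noise-free output: each row of A multiplied by the coefficient vector."""
    return [
        sum(a * g for a, g in zip(regressor_row(psi, N), gamma))
        for psi in psi_seq
    ]

# Two Laguerre outputs (L = 1) and Volterra order N = 2 give 1 + 2 + 4 columns
row = regressor_row([1.0, 2.0], 2)
```

With this full Kronecker layout, the number of columns is 1 + (L+1) + (L+1)² + ... + (L+1)ᴺ, which shows how strongly N and L influence the size of the estimation problem.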

The user parameters of this model are the Volterra order $N$, the maximum function order in the Laguerre expansion $L$, and the Laguerre decay parameter $\alpha$. The choice of $N$ and $L$ greatly influences the number of unknown model parameters to be estimated, so both should be chosen as small as possible. The Laguerre parameter $\alpha$ should be chosen to match the dominating time constant of the underlying system, preferably determined through gridding.

3.2.4 Polynomial Wiener Model

The model presented in Paper III is a special case of the more general Volterra-Laguerre model. As mentioned, the Volterra-Laguerre model constitutes a Wiener model with SIMO and MISO subsystems. The model in Paper III requires the static nonlinearity to be SISO, thus forming a standard polynomial Wiener model. Mathematically, the model is expressed as

$$\begin{cases} \psi_L(k+1) = F \psi_L(k) + G s(k), \\ y_\ell(k) = \psi_L^T(k)\, c + \varepsilon(k), \end{cases} \tag{3.9}$$

$$y(k) = y_\ell(k) + d_2 y_\ell^2(k) + \dots + d_N y_\ell^N(k) + \zeta(k), \quad k = 0, 1, \dots \tag{3.10}$$

where $s(k)$ is the input, $y_\ell(k)$ is the output of the linear block, $y(k)$ is the output of the entire system, $F \in \mathbb{R}^{(L+1) \times (L+1)}$ and $G \in \mathbb{R}^{(L+1) \times 1}$ are the same matrices as in (3.6), $\zeta(k)$ is measurement noise and $\varepsilon(k)$ represents the dynamics in the linear block not captured by the first $L + 1$ Laguerre functions. Note here that any orthonormal basis could be used to model the linear block [32]. The choice of the Laguerre basis is motivated in Paper III.

The unknown parameters to be estimated are the Laguerre coefficients $c$ and the output-polynomial coefficients $\{d_n\}_{n=2}^{N}$.

The user parameters of the polynomial Wiener model are the output-polynomial order $N$, the maximum function order in the Laguerre expansion $L$, and the Laguerre parameter $\alpha$.
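The two stages of the polynomial Wiener model can be sketched in a few lines: a linear combination of the Laguerre filter outputs followed by the output polynomial, whose first-order coefficient is fixed to one. The function names and numerical values are hypothetical, and both noise terms are set to zero.

```python
def linear_block_output(psi, c):
    """y_l = psi^T c for one time instant (unmodeled dynamics set to zero)."""
    return sum(p * ci for p, ci in zip(psi, c))

def polynomial_output(y_lin, d):
    """y = y_l + d_2 y_l^2 + ... + d_N y_l^N, with d = [d_2, ..., d_N]."""
    return y_lin + sum(dn * y_lin ** n for n, dn in enumerate(d, start=2))

# Hypothetical Laguerre outputs and coefficients, third-order output polynomial
y_lin = linear_block_output([1.0, 2.0], [1.0, 0.5])
y = polynomial_output(y_lin, [0.5, 0.25])
```

Fixing the linear coefficient of the polynomial removes the gain ambiguity between the linear block and the nonlinearity, which is why only the coefficients from order two upward appear among the unknowns.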

3.3 Simulation

Simulation is the process of finding the output of a mathematical model for given input signals and initial conditions. This is of interest in model evaluation, where the ability of the identified model to produce accurate output is investigated and compared to that of other models.

Depending on the model type, simulation is carried out in different ways.

3.3.1 Grey-Box Model

The grey-box model in Paper V is a continuous dynamical model given by the set of differential equations in (3.1). Assuming that all the parameters of the model are known, simulation of the model for a certain input is carried out by plugging the input function into the model expressions and solving the resulting system of equations under the given initial conditions. Because of the high complexity of the model and the fact that the input function is arbitrary, solving the problem analytically is not possible. Instead, numerical methods must be used. Here, the classical Runge-Kutta method is employed where the state at time instance $n + 1$, denoted by $x_{n+1}$, is given as a function of the state at time instance $n$, $x_n$, by

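$$
x_{n+1} = x_n + \frac{h}{6}\left(k_1 + 2k_2 + 2k_3 + k_4\right),
$$

where

$$
\begin{aligned}
k_1 &= f(t_n, x_n), \\
k_2 &= f\!\left(t_n + \tfrac{1}{2}h,\; x_n + \tfrac{h}{2}k_1\right), \\
k_3 &= f\!\left(t_n + \tfrac{1}{2}h,\; x_n + \tfrac{h}{2}k_2\right), \\
k_4 &= f(t_n + h,\; x_n + h k_3),
\end{aligned}
$$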

$f$ is the right-hand side of (3.1), $h$ is the time-step length and $t_n = nh$. Simulation requires the specification of an initial state $x_0$, which should be estimated from data. However, as the input signals considered in this work all start at zero, the initial output of the linear block can also be assumed to be zero.

The simulated gaze direction and gaze velocity at time instance $n$ are extracted from the state vector $x_n$.
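The update above can be sketched in a few lines of Python. Since the grey-box equations (3.1) are not reproduced in this section, the right-hand side used here is a stand-in test system ($\dot{x} = -x$) chosen only because its exact solution is known; the names `rk4_step`, `simulate` and `f_demo` are illustrative.

```python
import numpy as np

def rk4_step(f, t, x, h):
    """One classical Runge-Kutta step from (t, x) to t + h."""
    k1 = f(t, x)
    k2 = f(t + 0.5 * h, x + 0.5 * h * k1)
    k3 = f(t + 0.5 * h, x + 0.5 * h * k2)
    k4 = f(t + h, x + h * k3)
    return x + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

def simulate(f, x0, h, n_steps):
    """Return the state trajectory x_0, ..., x_{n_steps} with t_n = n * h."""
    x = np.asarray(x0, dtype=float)
    trajectory = [x]
    for n in range(n_steps):
        x = rk4_step(f, n * h, x, h)
        trajectory.append(x)
    return np.array(trajectory)

def f_demo(t, x):
    # Stand-in dynamics (not the grey-box model): dx/dt = -x
    return -x

traj = simulate(f_demo, x0=[1.0], h=0.01, n_steps=100)   # integrates up to t = 1
```

For a model driven by a stimulus, `f` would additionally evaluate the input signal at the intermediate times $t_n + h/2$, which may require interpolating sampled stimulus data.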

3.3.2 ARX Wiener Model

Simulation of the Wiener model in (3.2) for a given input sequence $s(k)$, under the assumption that the model parameters are known, is done by first using the recurrence relation to simulate the linear block and then plugging the resulting output sequence into the static nonlinearity. The output of the linear block, $y_\ell(k)$, and the remaining state variables are assumed to be zero for $k \le 0$.
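A minimal sketch of this two-stage simulation is given below. Because (3.2) is not reproduced in this section, the ARX sign convention, the coefficient values and the breakpoints of the piecewise-linear nonlinearity are all assumptions made for illustration. Note also that `np.interp` holds the end values constant outside the breakpoint range, which may differ from how the nonlinearity is extended in Paper I.

```python
import numpy as np

def simulate_arx_wiener(s, a, b, breakpoints, values):
    """Simulate an ARX linear block followed by a piecewise-linear nonlinearity.

    Assumed convention (hypothetical):
        y_lin(k) = -sum_i a[i-1] * y_lin(k-i) + sum_j b[j-1] * s(k-j)
    with all signals equal to zero before k = 0.
    """
    s = np.asarray(s, dtype=float)
    y_lin = np.zeros(len(s))
    for k in range(len(s)):
        acc = 0.0
        for i, a_i in enumerate(a, start=1):
            if k - i >= 0:
                acc -= a_i * y_lin[k - i]
        for j, b_j in enumerate(b, start=1):
            if k - j >= 0:
                acc += b_j * s[k - j]
        y_lin[k] = acc
    # Static nonlinearity: continuous piecewise-linear map through the given points
    y = np.interp(y_lin, breakpoints, values)
    return y_lin, y

s = np.ones(5)                                   # hypothetical input sequence
y_lin, y = simulate_arx_wiener(
    s, a=[-0.5], b=[1.0],
    breakpoints=[-2.0, 0.0, 2.0], values=[-1.0, 0.0, 4.0],
)
```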

3.3.3 Volterra-Laguerre Model

The Volterra-Laguerre model in (3.7) is simulated by first calculating the Laguerre-filter outputs $\{\psi_j(k)\}_{j=0}^{L}$ at times $k = 0, 1, \ldots, K-1$ using (3.6) for the given input sequence $s(k)$. The matrix $A$ in (3.8) is then constructed, after which the model output is obtained by carrying out the matrix multiplication in (3.7) with all the noise terms in $\zeta$ equal to zero.

3.3.4 Polynomial Wiener Model

The polynomial Wiener model given by (3.9) and (3.10) is simulated by first calculating the Laguerre-filter outputs $\{\psi_j(k)\}_{j=0}^{L}$ at times $k = 0, 1, \ldots, K-1$ using (3.6) for the given input sequence $s(k)$. The output of the linear block is then computed as $y_\ell(k) = \psi_L^T(k)\, c$ and inserted into (3.10) to give $y(k)$.
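The simulation steps for the polynomial Wiener model can be sketched as follows. The matrices `F` and `G` below are arbitrary placeholders, not the Laguerre matrices of (3.6), which are not reproduced in this section; a zero initial filter state and zero noise terms are also assumed.

```python
import numpy as np

def laguerre_outputs(F, G, s):
    """Iterate psi(k+1) = F psi(k) + G s(k) from psi(0) = 0; returns a K x (L+1) array."""
    K = len(s)
    psi = np.zeros((K, F.shape[0]))
    for k in range(K - 1):
        psi[k + 1] = F @ psi[k] + G * s[k]
    return psi

def simulate_polynomial_wiener(F, G, c, d, s):
    """Noise-free output of (3.9)-(3.10); d holds d_2, ..., d_N."""
    psi = laguerre_outputs(F, G, s)
    y_lin = psi @ c                       # linear-block output
    y = y_lin.copy()                      # first-order term has unit coefficient
    for n, d_n in enumerate(d, start=2):
        y += d_n * y_lin ** n
    return y_lin, y

# Placeholder system matrices (assumed values, L = 1)
F = np.array([[0.5, 0.0],
              [0.75, 0.5]])
G = np.array([1.0, -0.5])
c = np.array([1.0, 0.0])
d = [0.1]                                 # d_2 only, so N = 2
y_lin, y = simulate_polynomial_wiener(F, G, c, d, s=np.ones(4))
```

The same `laguerre_outputs` routine would serve the Volterra-Laguerre model, where the rows of the regressor matrix are built from Kronecker products of the filter outputs instead.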

3.4 Identification

In order for a mathematical model to be meaningful, the values of its parameters must be such that the model accurately reproduces experimental data. Using optimization methods to find appropriate values for the model parameters based on measured data is referred to as system identification.

In grey-box models, some parameters represent physical elements of the system and therefore have true values, e.g. the elasticity of a tendon or the mass of a muscle. In this case, the goal of the system identification is to recover the true parameter values. In black-box models, the parameters have no true values. Instead, those giving the best fit between model output and measured data in some given metric are sought.

In the case of the SPS, the input data are the movement of the visual stimulus, which is readily obtained from the generation method discussed in Section 3.1. The output data are the eye tracker output, i.e. the measured eye movements of the test subject attempting to follow the moving target.

In order to find the parameter values which give the best correspondence between model output and measured data, one must first decide on an appropriate way to quantify this correspondence; a cost function must be chosen. A commonly used cost function in system identification is the prediction error (PE) defined as

$$
V(\theta) = \frac{1}{N} \sum_{i=0}^{N-1} \left( y(t_i) - \hat{y}(t_i, \theta) \right)^2, \tag{3.11}
$$


where $y(t_i)$ is the measured system output at time $t_i$, $\hat{y}(t_i, \theta)$ is the model output at time $t_i$, $N$ is the number of samples and $\theta$ is the vector comprising the unknown model parameters. The task of system identification is to find the vector of parameters $\theta$ which minimizes cost function (3.11).

Depending on the considered model structure, the appearance of the cost function in (3.11) will vary and different methods must be applied to minimize it. Below, approaches to identify each of the proposed models are outlined.
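For concreteness, the cost in (3.11) and the closely related residual sum of squares can be computed as in the sketch below. The function names and the three-sample data are made up for illustration.

```python
import numpy as np

def prediction_error_cost(y_meas, y_model):
    """Mean squared difference between measured and model output, as in (3.11)."""
    y_meas = np.asarray(y_meas, dtype=float)
    y_model = np.asarray(y_model, dtype=float)
    return np.mean((y_meas - y_model) ** 2)

def residual_sum_of_squares(y_meas, y_model):
    """Sum of squared residuals, i.e. the cost (3.11) multiplied by the sample count."""
    return len(y_meas) * prediction_error_cost(y_meas, y_model)

y_meas = [1.0, 2.0, 3.0]        # hypothetical measured output
y_model = [1.0, 1.0, 5.0]       # hypothetical model output
V = prediction_error_cost(y_meas, y_model)
rss = residual_sum_of_squares(y_meas, y_model)
```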

3.4.1 Grey-Box Model

The grey-box model given by (3.1) is nonlinear both in the states and in the parameters, requiring the use of some nonlinear least-squares method to find the minimum of (3.11). In Paper V, the trust-region reflective Newton method is utilized [4]. At each iteration, this method starts at a point $\theta_i$ and attempts to find a nearby point $\theta_{i+1}$ for which $V$ is smaller. To do this, $V(\theta)$ is approximated by a simpler function $h(\theta)$. Preferably, $h$ should reflect the behavior of $V$ reasonably well in some neighborhood $\mathcal{N}$ around $\theta_i$, called the trust-region. The minimum of $h$ is then found over $\mathcal{N}$ at the point $\theta_s$. If $V(\theta_s) < V(\theta_i)$, then $\theta_s$ is chosen to be our next guess, $\theta_{i+1}$, and the algorithm can continue to the next iteration. If $V(\theta_s) > V(\theta_i)$, the trust-region $\mathcal{N}$ is shrunk and the process is repeated. The properties that define which type of trust-region reflective algorithm is used are how the trust-region $\mathcal{N}$ is altered in each step and how $h$ is minimized over $\mathcal{N}$. In this work, built-in functions in MathWorks MATLAB® are used to do these minimizations.

Because of the nonlinearity of (3.1), the cost function in (3.11) is not necessarily convex, meaning it may have several local minima. Hence, it is not guaranteed that the global minimum will be found. The outcome will depend on the initial guess of the parameters.
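The accept-or-shrink logic described above can be illustrated with a deliberately simplified trust-region loop. This is not the MATLAB routine used in Paper V: the local model $h$ is taken to be the Gauss-Newton quadratic, its minimizer is merely rescaled to the trust-region radius rather than minimized exactly over the region, and the fitted model (a decaying exponential) is a toy stand-in for (3.1). All names and values are assumptions.

```python
import numpy as np

def model(theta, t):
    a, b = theta
    return a * np.exp(-b * t)            # toy model, not the grey-box model

def cost(theta, t, y):
    return np.mean((y - model(theta, t)) ** 2)

def jacobian(theta, t):
    """Derivative of the residual y - model with respect to (a, b)."""
    a, b = theta
    e = np.exp(-b * t)
    return np.column_stack((-e, a * t * e))

def trust_region_fit(theta0, t, y, radius=1.0, max_iter=200, tol=1e-12):
    theta = np.asarray(theta0, dtype=float)
    for _ in range(max_iter):
        r = y - model(theta, t)
        J = jacobian(theta, t)
        # Minimizer of the local quadratic model h (Gauss-Newton step)
        step, *_ = np.linalg.lstsq(J, -r, rcond=None)
        norm = np.linalg.norm(step)
        if norm < tol:
            break
        if norm > radius:
            step = step * (radius / norm)   # keep the step inside the trust region
        candidate = theta + step
        if cost(candidate, t, y) < cost(theta, t, y):
            theta = candidate               # accept and enlarge the region
            radius *= 2.0
        else:
            radius *= 0.5                   # reject and shrink the region
    return theta

t = np.linspace(0.0, 5.0, 50)
y = model((2.0, 0.5), t)                    # noise-free synthetic data
theta_hat = trust_region_fit((1.0, 1.0), t, y)
```

Starting the loop from a different `theta0` can lead to a different result for non-convex costs, which is the sensitivity to the initial guess noted above.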

3.4.2 ARX Wiener Model

Many ways of identifying Wiener-type models exist; a list of batch methods is provided in [7]. The ARX Wiener model of Paper I, given by (3.2), is identified using [Algorithm VII] in [35], which recursively estimates the unknown parameters of a Wiener system with a continuous piecewise-linear static nonlinearity by application of a recursive prediction error method (RPEM).

3.4.3 Volterra-Laguerre Model

In the Volterra-Laguerre model (3.3), the model output depends linearly on the unknown parameters and thus (3.11) will be convex and its minimum can be found using ordinary Least Squares (LS). However, as the Laguerre order $L$ grows, the number of parameters becomes large and the variance in the estimates increases. This makes a sparse estimation method appropriate for identification, yielding parameter vectors with only a few nonzero elements. This may be at the cost of reduced model accuracy, but can also improve model performance as overparametrization is avoided.

The most widely used sparse estimation method is perhaps the Least Absolute Shrinkage and Selection Operator (LASSO), which uses the constraint that the $\ell_1$-norm of the parameter vector is no greater than a given value [33]. LASSO requires the selection of a user parameter, which is usually a daunting task. In Paper II, the SParse Iterative Covariance-based Estimation (SPICE) method is used, which does not suffer from this drawback [28]. SPICE is readily applied to the system of equations given by (3.7). In fact, when the underlying data is real, SPICE is reduced to solving a linear program (LP) [28].
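To illustrate what sparse estimation does to a linear regression such as (3.7), the sketch below solves a LASSO problem with the iterative soft-thresholding algorithm (ISTA). It is LASSO, not SPICE, and it uses the penalized form of the problem, which corresponds to the constrained form for a suitable pairing of the two user parameters. The regressor matrix is random stand-in data and the value of `lam` is an arbitrary choice, which is exactly the user parameter SPICE avoids.

```python
import numpy as np

def soft_threshold(v, tau):
    """Shrink each entry of v toward zero by tau."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def lasso_ista(A, y, lam, n_iter=2000):
    """Minimize 0.5 * ||y - A theta||^2 + lam * ||theta||_1 by ISTA."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2     # 1 / largest eigenvalue of A^T A
    theta = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ theta - y)
        theta = soft_threshold(theta - step * grad, step * lam)
    return theta

rng = np.random.default_rng(0)
A = rng.standard_normal((100, 10))             # stand-in regressor matrix
theta_true = np.zeros(10)
theta_true[[1, 6]] = [1.5, -2.0]               # only two active parameters
y = A @ theta_true                             # noise-free output
theta_hat = lasso_ista(A, y, lam=1.0)
support = set(np.flatnonzero(theta_hat).tolist())
```

The nonzero estimates are slightly shrunk toward zero, which is the price paid for the sparsity-inducing penalty.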

3.4.4 Polynomial Wiener Model

Identification of the polynomial Wiener model given by (3.9) and (3.10) is done using a novel method proposed in Paper III: First, the Volterra-Laguerre coefficients are found using LS as described in Section 3.4.3. Next, the coefficients pertaining to the first-order kernel are used to generate an approximation of the linear-block output. With the output of the linear block given, standard polynomial fitting is used to obtain the polynomial coefficients.

As is shown in Paper III, the described way of identifying the model results in parameter bias under model mismatch, i.e. when $\varepsilon(k) \neq 0$ in (3.9). However, an iterative technique for reduction or even elimination of this bias is also derived in Paper III. By applying the bias reduction algorithm, high quality estimates of both the linear and the nonlinear block of the underlying Wiener system are obtained.
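The three steps can be sketched as below under idealized conditions: no noise, no model mismatch ($\varepsilon(k) = 0$) and random numbers standing in for the Laguerre-filter outputs. How Paper III handles the unit first-order coefficient in the polynomial fit is not stated in this section, so fitting $y - \hat{y}_\ell$ against the powers 2 to $N$ is an assumption, and the bias reduction step is not included.

```python
import numpy as np

def identify_polynomial_wiener(psi, y, poly_order):
    """Two-stage estimate of c and d_2..d_N from filter outputs psi (K x (L+1))."""
    K, m = psi.shape
    # Step 1: LS fit of the Volterra-Laguerre regression (kernel orders 1..N)
    blocks, phi = [psi], psi
    for _ in range(poly_order - 1):
        # Row-wise Kronecker product phi(k) (x) psi(k)
        phi = np.einsum('ki,kj->kij', phi, psi).reshape(K, -1)
        blocks.append(phi)
    A = np.hstack(blocks)
    theta, *_ = np.linalg.lstsq(A, y, rcond=None)   # minimum-norm LS solution
    # Step 2: first-order kernel coefficients give the linear-block output
    c_hat = theta[:m]
    y_lin = psi @ c_hat
    # Step 3: polynomial fit of the remaining part on powers 2..N
    P = np.column_stack([y_lin ** n for n in range(2, poly_order + 1)])
    d_hat, *_ = np.linalg.lstsq(P, y - y_lin, rcond=None)
    return c_hat, d_hat

rng = np.random.default_rng(1)
psi = rng.standard_normal((200, 3))        # stand-in for Laguerre-filter outputs
c_true = np.array([1.0, -0.5, 0.25])
d_true = np.array([0.3])                   # d_2 only, so N = 2
y_lin_true = psi @ c_true
y = y_lin_true + d_true[0] * y_lin_true ** 2
c_hat, d_hat = identify_polynomial_wiener(psi, y, poly_order=2)
```

The Kronecker regressors contain duplicate columns (for example $\psi_0\psi_1$ and $\psi_1\psi_0$), so the LS problem is rank deficient; `np.linalg.lstsq` returns the minimum-norm solution, which leaves the first-order block uniquely determined in this setting.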

3.5 Model Comparison

3.5.1 Considered Models

The four different models of the SPS presented in the previous sections are all of different character and each model has strengths and weaknesses. In this section, the relative performance of the models in terms of fit is investigated.

In what follows, the models are denoted:

• Model 1: The grey-box model in Section 3.2.1

• Model 2: The ARX Wiener model in Section 3.2.2

• Model 3: The Volterra-Laguerre model in Section 3.2.3

• Model 4: The polynomial Wiener model in Section 3.2.4


Furthermore, the performance of the models is compared to that of a State-of-the-Art (SotA) model of the SPS suggested in [18]. Finally, a linear model of the SPS obtained by letting the output nonlinearity of Model 2 be a unit gain is considered.

3.5.2 Validation

Identification is carried out by minimizing cost function (3.11). However, the resulting value of (3.11) only reflects the performance of the model on the data used for identification, the identification data. The model may perform worse on other data sets, which is often the case for overparameterized models and cases when the underlying system is not covered by the assumed model structure. It is therefore preferable to validate the obtained models on validation data, i.e. data sets acquired independently of the identification data.

3.5.3 User Parameters

Each of the presented models requires the specification of one or more user parameters. In the following, the user parameter values that were found through experiments to give the best fit were used. For details on how different choices of the user parameters affect the identification outcome and model performance, refer to the appended papers.

3.5.4 Experiment

An input signal was generated with the stimulus generation method derived in Paper I and presented to a healthy test subject and a test subject diagnosed with PD; the subjects were age-matched. The resulting input-output data were used to identify the models. Then, 10 additional input-output data sets were acquired by presenting 10 new stimuli to the two test subjects. The 10 additional sets were used to validate the identified models by simulating them with the input signals and comparing the obtained model output with the corresponding measured eye movement data, using (3.11). Although eye-movement data from only two test subjects are analyzed here, the obtained results are representative of those obtained from all healthy and PD test subjects participating in this study (see the appended papers for details).

3.5.5 Results

Table 3.1 shows the average value of the residual sum of squares (RSS), which is cost function (3.11) multiplied by $N$, for the 10 validation sets for the different models and test subjects.


Table 3.1 reveals some noteworthy results. First, all models of the SPS proposed in this thesis, except the ARX Wiener one, outperformed the previous SotA model. The best performing models were the Volterra-Laguerre model and the polynomial Wiener model. These two models impose few restrictions on the system dynamics whereas the remaining models assume more specific structures for both the linear and the nonlinear components. Evidently, some properties of the underlying system are not captured by the postulated structures of the grey-box model and the ARX Wiener model.

Another explanation for the weaker performance of the grey-box models may lie in the fact that their associated cost functions are non-convex due to their nonlinear structure. The minimization algorithm may thus have converged to a local minimum so that the optimal parameters were not recovered.

Despite the better performance in terms of fit of the two black-box models 3 and 4 compared to that of grey-box model 1, the latter has the advantage of physical meaning in the obtained parameters, which may be of interest in some applications.

The linear model was outperformed by all the nonlinear models. This is expected, since studies have shown that the system is in fact nonlinear.

The number of parameters in Model 3 varies depending on the test subject, due to the sparse technique used for model identification. For the healthy subject, the number of nonzero parameters needed to accurately describe the system is just 3, whereas 8 are needed for the more complex dynamics of the SPS in the PD patient. Refer to Paper II for more discussion on this matter.

It is argued in Paper II that the SPS is dynamically nonlinear and that this is the reason for the poorer performance of the ARX Wiener model, where a static nonlinearity is assumed. On the other hand, the polynomial Wiener model of Paper III is almost on a par with the Volterra-Laguerre model for healthy test subjects, despite the lack of any dynamic nonlinearity. However, the results of Table 3.1 show that, in the PD case, the Volterra-Laguerre model exhibits significantly better performance than both Wiener models, supporting the inference that the system is dynamically nonlinear in PD. In fact, the performance of all models is significantly worse for PD data than for healthy data. In Paper II, this is explained by stating that the SPS in PD subjects is strongly nonlinear and perhaps also time-variant, which gives the impression that eye movements of PD patients are more random than in healthy subjects, making it more difficult to model the underlying system correctly. Nevertheless, the overall dynamics are captured well by all four models even in PD patients and the papers of this thesis show some promising results on the matter.

| Model  | Healthy: #Parameters | Healthy: RSS | PD: #Parameters | PD: RSS |
|--------|----------------------|--------------|-----------------|---------|
| 1      | 2                    | 2.21         | 2               | 10.5    |
| 2      | 9                    | 6.31         | 9               | 46.8    |
| 3      | 3                    | 0.66         | 8               | 3.9     |
| 4      | 6                    | 0.81         | 6               | 15.2    |
| SotA   | 3                    | 3.14         | 3               | 14.8    |
| Linear | 4                    | 7.82         | 4               | 51.2    |

Table 3.1. The number of parameters (#Parameters) and the average value of the RSS in the validation of models 1–4 of this thesis, the SotA model suggested in [18] and a linear ARX model.


Chapter 4
Non-parametric modeling

The obtained dynamical models of Chapter 3 are used in Papers I, II, III and V partly as means for distinguishing between individuals by their estimated model parameters. The parameter values can be used as a biometric to characterize the SPS and to tell one individual from another based only on eye movement.

In contrast to dynamical modeling, which can also be referred to as parametric modeling, a non-parametric approach is presented here. The non-parametric method is the main topic of Paper IV and is aimed at detecting gaze trajectories, or parts of gaze trajectories, that deviate from the norm, where the norm is determined from a selected group of individuals. For example, if the normal trajectory and its probability density are estimated from a group of healthy individuals to establish a healthy eye-tracking profile, independently acquired data sets of eye movement can via the presented non-parametric method be tested against said profile to identify possible deterioration of SPS function in the considered subject.

4.1 Method

Assume that $N_s$ data sets of eye movements are recorded from a test subject tracking the same trajectory of the visual stimulus multiple times on different occasions. Due to the complex nature of the oculomotor system, the response to a visual stimulus will not be the same for repeated exposures. Hence, the $N_s$ data sets will not be equal. For each of the $N_t$ time instances at which the gaze direction is sampled, there will be $N_s$ data points, one from each set of recorded eye movements. Since horizontal and vertical gaze direction coordinates are logged separately, the data points will have two components. The data points at time instance $k$ will be seemingly random with some expected value, and can thus be seen as $N_s$ observations of a two-dimensional stochastic variable, $X(k)$. Note that there will be one stochastic variable per time instance. Assuming that $X(k)$ are independent for different $k$, the associated distributions can be estimated from data.

4.1.1 Anomaly Detection

The distribution of $X(k)$ will depend on the trajectory of the visual stimulus, but also on the individual tracking ability. If the probability density function (PDF) of $X(k)$ for $k = 1, 2, \ldots, N_t$ is known for an individual, i.e. if an eye-tracking profile has been established for that individual, it is possible to determine whether a given data set is likely to come from the same subject or not. For each of the $N_t$ time instances, a hypothesis test with the null hypothesis that the data are indeed observations of $X(k)$ can be carried out. If the number of time instances at which the data deviate strongly from the distribution in question is large, the data set is deemed not to come from the considered eye-tracking profile. The approach generalizes in a straightforward manner to the case of a group of test subjects sharing a property, such as healthy persons or persons of a certain age.

Testing the hypothesis that the data are from $X(k)$ is done by means of outlier regions. The outlier region is the set of all possible observations deemed unlikely to come from the distribution in question. The confidence region is its complement. The outlier and confidence regions are properly defined in Paper I.

4.1.2 Distribution Estimation

In practice, the distributions of $X(k)$, $k = 1, 2, \ldots, N_t$, are not known, but can be estimated from data. The simplest way is to use the histogram. However, since the data are two-dimensional, a large number of data points is needed to achieve sufficiently small bin widths for reliable statistical testing. To acquire a large number of data points, a test subject would have to track the same visual stimulus a large number of times, which would be time-consuming and tedious. Therefore, other techniques of PDF estimation are used. A common approach is to use a normal distribution approximation. However, such an approach does not always provide satisfactory results. In these cases, more general means for estimating PDFs should be applied. Examples of such techniques are presented in both Paper I and Paper IV.
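As a simple illustration of the profile-and-test idea, the sketch below uses the normal distribution approximation mentioned above: a mean and covariance are estimated per time instance, and a sample is flagged as an outlier when its squared Mahalanobis distance exceeds the chi-square quantile with two degrees of freedom, which has the closed form $-2\ln\alpha$. This is not the estimator of Paper I or Paper IV; the synthetic data, function names and the neglect of estimation uncertainty in the mean and covariance are all simplifying assumptions.

```python
import numpy as np

def fit_gaussian_profile(data):
    """data has shape (Ns, Nt, 2); returns per-instance means and covariances."""
    means = data.mean(axis=0)                                  # (Nt, 2)
    centered = data - means
    covs = np.einsum('snp,snq->npq', centered, centered) / (data.shape[0] - 1)
    return means, covs                                         # covs: (Nt, 2, 2)

def outlier_fraction(trajectory, means, covs, alpha):
    """Fraction of time instances at which the trajectory lies in the outlier region."""
    diff = trajectory - means
    inv_covs = np.linalg.inv(covs)
    d2 = np.einsum('np,npq,nq->n', diff, inv_covs, diff)       # squared Mahalanobis
    threshold = -2.0 * np.log(alpha)       # chi-square quantile, 2 degrees of freedom
    is_outlier = d2 > threshold
    return is_outlier.mean(), is_outlier

rng = np.random.default_rng(0)
Ns, Nt = 15, 50                            # repetitions and time instances (assumed)
tau = np.linspace(0.0, 1.0, Nt)
mean_path = np.column_stack((tau, np.sin(np.pi * tau)))        # synthetic gaze path
data = mean_path + 0.1 * rng.standard_normal((Ns, Nt, 2))

means, covs = fit_gaussian_profile(data)
frac_same, _ = outlier_fraction(means, means, covs, alpha=0.01)
frac_shifted, _ = outlier_fraction(means + np.array([5.0, 0.0]), means, covs, alpha=0.01)
```

A data set would then be rejected as not belonging to the profile when the returned fraction is large compared with what the chosen significance level would produce by chance.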


Figure 4.1. Six time instances of an eye-tracking profile (black dashed lines: inlierregion boundaries, blue line: trajectory mean) and an independently acquired data set(green/red dashed line) demonstrating a typical case.

4.1.3 Illustrative Example

To illustrate how the method works, Fig. 4.1 shows a typical example where six time instances of a gaze trajectory are compared to an eye-tracking profile. The black dashed lines represent the boundaries of the PDF outlier regions in each time step and the blue line shows the mean trajectory of the profile. The green/red dashed line shows the gaze path of the independently acquired data set. Only in time instance $k = 5$ does the considered gaze trajectory deviate from the eye-tracking profile, detected by the fact that the measurement falls in the outlier region of the associated PDF.

4.1.4 Method extension

The assumption that $X(k)$ are independent greatly reduces the dimensionality of the problem and thus also the number of observations required for accurate density estimation. Unfortunately, the independence assumption violates the postulation that the SPS is a dynamical system, which may result in false positives. Such a case is demonstrated in Fig. 4.2 where six time instances of a simulated gaze trajectory and eye-tracking profile are shown. It is clear that the characteristics of the red path are different from those of the blue one. Nevertheless, the red path falls within the inlier regions of the considered eye-tracking profile in each time instance and will therefore be accepted as being part of it, resulting in a false positive. In fact, the red path exemplifies the typical oscillating SPEM in PD patients. This example therefore demonstrates how neglecting the dynamical aspects of the system may impair PD-detection performance. In Paper IV, this problem is addressed by also taking the gaze velocity into account when generating eye-tracking profiles. This partly accounts for the dynamical nature of the SPS and reduces the probability of false positives. If the red path in the provided example of Fig. 4.2 were compared to the blue one in terms of both position and velocity, it would likely yield outliers in all six time instances. Refer to the paper for further results and discussion on the matter.

Figure 4.2. Six time instances of an eye-tracking profile (black dashed lines: inlier region boundaries, blue line: trajectory mean) and an independently acquired data set (red dashed line) demonstrating a false positive case of the non-parametric method.

In addition, Paper IV provides some discussion regarding local deviation, i.e. when certain parts of a gaze trajectory, but not all, deviate from the considered eye-tracking profile. It is demonstrated how the difficulty level of following a particular stimulus depends on its dynamics. Specifically, the deviation is stronger in trajectory segments of higher stimulus acceleration. It is further discussed how this may be exploited in improved stimulus design for quantification of SPS deterioration.


Figure 4.3. Heat map of the OSA estimates of the PDFs of H4 tracking a short segmentof a stimulus. Red indicates high values. The blue line shows a trajectory of P1attempting to track the same stimulus.

4.2 Some Results

4.2.1 Experiment

Three stimuli were generated with the stimulus generation method of Paper I and presented three times per stimulus to three healthy test subjects (H1, H2 and H3) and three PD subjects (P1, P2 and P3). A fourth healthy subject (H4) was shown the three stimuli 15 times each. An eye-tracking profile was established based on the gaze data of H4 for each of the three stimuli, against which the gaze trajectories of the six other subjects were tested. Both OSA and KDE were used to estimate the PDF at each time instance. User parameters were chosen through experiment to give the best results.

A heat map of the estimated trajectory distribution of H4 tracking a short segment of a stimulus, overlaid by the gaze trajectory of P1, is shown in Fig. 4.3. It is apparent that the trajectory of P1 deviates from the eye-tracking profile of H4 at several time instances.


4.2.2 Results

Tab. 4.1 shows the number of outliers in every set of the six subjects at the 0.01% significance level when testing against the profiles established from H4 using OSA, and Tab. 4.2 shows the results obtained using KDE.

The results of Tab. 4.1 and Tab. 4.2 show that the gaze paths of the healthy subjects deviate less from the trajectories of H4 than do those of subjects diagnosed with PD. This implies that the SPS function differs between PD subjects and healthy subjects. The results also suggest that for this particular application, the OSA method of PDF estimation is preferable as it shows better separation of gaze trajectories of different individuals.

| Subject | Set 1 (%) | Set 2 (%) | Set 3 (%) |
|---------|-----------|-----------|-----------|
| P1      | 51.8      | 64.9      | 72.2      |
| P2      | 32.8      | 42.8      | 41.5      |
| P3      | 64.1      | 67.9      | 70.0      |
| H1      | 7.2       | 7.1       | 9.7       |
| H2      | 7.6       | 5.5       | 5.7       |
| H3      | 13.1      | 12.6      | 12.1      |

Table 4.1. The number of outliers at the 0.01% significance level in the data sets of three PD subjects (P1, P2, P3) and the sets of three healthy subjects (H1, H2, H3) when comparing to the OSA estimates of the trajectory distribution of a fourth healthy subject (H4). The numbers are given as percent of the total number of samples in the data set.

| Subject | Set 1 (%) | Set 2 (%) | Set 3 (%) |
|---------|-----------|-----------|-----------|
| P1      | 42.2      | 50.9      | 51.1      |
| P2      | 27.1      | 29.1      | 29.0      |
| P3      | 50.0      | 49.2      | 52.7      |
| H1      | 4.5       | 3.1       | 5.5       |
| H2      | 7.5       | 6.1       | 6.2       |
| H3      | 12.9      | 13.1      | 11.5      |

Table 4.2. The number of outliers at the 0.01% significance level in the data sets of three PD subjects (P1, P2, P3) and the sets of three healthy subjects (H1, H2, H3) when comparing to the KDE estimates of the trajectory distribution of a fourth healthy subject (H4). The numbers are given as percent of the total number of samples in the data set.


Chapter 5
Concluding Remarks

The focus of this doctoral thesis is on the modeling and identification of the SPS. While this topic has been treated in the literature before, several key aspects of the problem have been previously overlooked. In earlier research on SPS modeling, the design goal has been to match the transient behavior of the system, ascertaining model quality for step and ramp inputs. However, due to the nonlinear features of the SPS, step and ramp responses are not sufficient for complete characterization of the system.

Attempts at using SPEM as a biometric or a PD diagnosing tool can also be found in earlier literature, but few studies manage to produce convincing results, often because dynamical systems theory is neglected. Most studies are in medicine, where a common approach is to use sinusoidal inputs and their response in the computation of various quantities for SPS characterization. The studies fail to acknowledge that by making inference from single-frequency excitation experiments, the dynamical nature of the SPS is ignored and, as a consequence, the obtained quantities are neither robust nor representative of the entire system.

This work is the first to devise models intended for extended-time experiments and also the first to consider input design in the context of SPS modeling. The result is a handful of models outperforming current SotA models in terms of prediction accuracy for rich input signals. Moreover, by addressing previously overlooked problem aspects, a deeper insight into both the SPS and practical modeling concerns is achieved.

It is found that even though complex grey-box models provide physically meaningful parameter values and yield accurate simulation results, more lightweight black-box models are preferable for SPS characterization in different individuals. It is also found that the properties of the visual stimuli have great effect on the SPS identification results and that more reliable and robust models can be attained if care is put into stimulus design. The importance of input design is further stressed by the results of the presented non-parametric method revealing that certain stimulus movements elicit greater differences between the gaze paths of different individuals. Proper input design is thus paramount for technology aiming to use SPEM as a biometric. Another strong result of this thesis is that with properly designed visual stimuli, high-quality models of the SPS can be obtained by which the deteriorating effects of healthy aging on SPEM are distinguishable from those of PD.

Finally, the results presented herein confirm that the SPS in healthy individuals is indeed nonlinear, but it is concluded that the nonlinearity of the system is significantly stronger in PD subjects. Additionally, it is shown that the nonlinearity in healthy individuals is well-modeled by a static output function, whereas the stronger nonlinear behavior introduced to the SPS by PD is dynamical.

5.1 Future Research

There are several possible directions in which to continue the research of this thesis. Each of the appended papers presents a new approach to the problem of SPS characterization and naturally gives rise to a plethora of new research questions. Some examples of areas for future research are suggested below.

5.1.1 Model Structure

During the work of this thesis, a large number of model structures have been investigated in terms of suitability for SPS modeling. The five most prominent are analyzed in more detail in the appended papers. However, there are still stones left unturned and there is room for potential improvement, e.g. by attaining a better understanding of the neurological feedback controller and finding a suitable way to implement it in the grey-box model of Paper V.

5.1.2 Input Design

Although the method of stimulus design presented in Paper I generates sequences which provide satisfactory results, even better performance may be achievable with further improved input design. It would be of interest to evaluate the performance of existing methods for nonlinear-system input design (see e.g. [3], [9]) applied to the SPS modeling problem.

5.1.3 Wiener System Identification: The Non-Gaussian Case

The method for polynomial Wiener system identification suggested in Paper III is only analyzed under the assumption of white Gaussian input signals. In many applications, and indeed in the case of SPS modeling, white input signals are infeasible. The performance of the method and that of the accompanying bias reduction technique should therefore also be evaluated for colored input sequences. In this case, the regressors are no longer uncorrelated and the provided analysis in Paper III is no longer valid. However, bounds for the obtained bias or exact bias expressions under certain assumptions may be derivable.

5.1.4 Improved Eye-Tracking Profiles

The non-parametric approach presented in Paper IV may be improved by building eye-tracking profiles based on distributions estimated from short snippets of gaze trajectories rather than just single points. In a larger experiment involving more test subjects, enough data could be collected to allow for accurate density estimation even in this case of high dimensionality. This would better account for the dynamical nature of the SPS and thereby provide more reliable results.


Chapter 6
References

[1] T. Bohlin. Practical grey-box process identification: theory and applications. Springer Science and Business Media, 2006.

[2] A. K. Churchland and S. G. Lisberger. Gain Control in Human Smooth-Pursuit Eye Movements. Journal of Neurophysiology, Vol. 87, pp. 2936-2945, 2002.

[3] A. De Cock, M. Gevers, J. Schoukens. A Preliminary Study on Optimal Input Design for Nonlinear Systems. 52nd IEEE Conference on Decision and Control, Florence, Italy, 2013.

[4] J. E. Dennis, Jr. Nonlinear least-squares. State of the Art in Numerical Analysis, ed. D. Jacobs, Academic Press, pp. 269-312, 1977.

[5] A. Duchowski. A breadth-first survey of eye-tracking applications. Behavior Research Methods, Instruments, and Computers, 34(4):455–479, 2002.

[6] J. M. Gibson, R. Pimlott, C. Kennard. Ocular motor and manual tracking in Parkinson’s disease and the effect of treatment. Journal of Neurology, 50:853–860, 1987.

[7] A. Hagenblad. Aspects of the Identification of Wiener Models. Thesis No. 793, Linköping University, Sweden, 1999.

[8] R. Hammoud. Passive Eye Monitoring: Algorithms, Applications and Experiments. Springer Publishing Company, Incorporated, ISBN: 3540754113, 9783540754114, 2008.

[9] H. Hjalmarsson, J. Martensson, B. Ninness. Optimal Input Design for Identification of Non-Linear Systems: Learning from the Linear Case. American Control Conference, 2007.

[10] E. R. Kandel, J. H. Schwartz, and T. M. Jessell. Principles of Neural Science. McGraw Hill, New York, 2000, ch. 43.

[11] N. Kathmann, A. Hochrein, R. Uwer, B. Bondy. Deficits in Gain of Smooth Pursuit Eye Movements in Schizophrenia and Affective Disorder Patients and Their Unaffected Relatives. American Journal of Psychiatry, Vol. 160, pp. 696-702, 2003.

[12] L. Ljung. System Identification: Theory for the User. Englewood Cliffs, NJ: Prentice-Hall, 1987.

[13] S. Marino, E. Sessam, G. Di Lorenzo, P. Lanzafame, G. Scullica, A. Bramanti,

45

F. La Rosa, G. Iannizzotto, P. Bramanti, P. Di Bella Quantitative Analysis ofPursuit Ocular Movements in Parkinson’s Disease by Using a Video-Based EyeTracking System. European Neurology, 58:193–197, 2007.

[14] V. Z. Marmarelis. Identification of Nonlinear Biological Systems Using Laguerre Expansions of Kernels. Annals of Biomedical Engineering, 21(6):573–589, 1993.

[15] C. H. Meyer, A. G. Lasker and D. A. Robinson. The upper limit of human smooth pursuit velocity. Vision Res, Vol. 25, pp. 561–563, 1985.

[16] T. Nakamura, R. Kanayama, R. Sano, M. Ohki, Y. Kimura, M. Aoyagi, Y. Koike. Quantitative Analysis of Ocular Movements in Parkinson’s Disease. Acta Oto-Laryngologica, Vol. 111, pp. 559–562, 1991.

[17] W. T. Newsome, R. H. Wurtz, M. R. Dürsteler, A. Mikami. Deficits in visual motion processing following ibotenic acid lesions of the middle temporal visual area of macaque monkey. Journal of Neuroscience, Vol. 5, pp. 825–840, 1985.

[18] U. Nuding, S. Ono, M. J. Mustari, U. Büttner and S. Glasauer. A Theory of the Dual Pathways for Smooth Pursuit Based on Dynamic Gain Control. Journal of Neurophysiology, Vol. 99, pp. 2798–2808, 2008.

[19] C. Ramaker, J. Marinus, A. M. Stiggelbout, B. J. van Hilten. Systematic evaluation of rating scales for impairment and disability in Parkinson’s disease. Movement Disorders, Vol. 17, pp. 867–876, 2002.

[20] C. Rashbass. The relationship between saccadic and smooth tracking eye movements. Journal of Physiology, Vol. 159, pp. 326–338, 1961.

[21] D. A. Robinson. The mechanics of human smooth pursuit eye movement. Journal of Physiology, Vol. 180, pp. 569–591, 1964.

[22] W. J. Rugh. Nonlinear System Theory: The Volterra-Wiener Approach. Johns Hopkins University Press, Baltimore, 1981.

[23] M. Schetzen. The Volterra and Wiener Theories of Nonlinear Systems. Malabar, FL: Krieger, 1980.

[24] S. C. Schwartz. Estimation of probability density by an orthogonal series. The Annals of Mathematical Statistics, 38:1261–1265, 1967.

[25] A. B. Sereno, P. S. Holzman. Antisaccades and Smooth Pursuit Eye Movements in Schizophrenia. Biological Psychiatry, 37:394–401, 1995.

[26] B. W. Silverman. Density Estimation for Statistics and Data Analysis. London: Chapman & Hall/CRC, ISBN 0-412-24620-1, 1998.

[27] T. Soderstrom, P. Stoica. System Identification. Prentice-Hall, Inc., Upper Saddle River, NJ, USA, 1988.

[28] P. Stoica, P. Babu. SPICE and LIKES: Two hyperparameter-free methods for sparse-parameter estimation. Signal Processing, 92:1580–1590, 2012.

[29] T. A. Stuve, L. Friedman, J. A. Jesberger, G. C. Gilmore, M. E. Strauss, H. Y. Meltzer. The relationship between smooth pursuit performance, motion perception and sustained visual attention in patients with schizophrenia and normal controls. Psychological Medicine, Vol. 27, Issue 1:143–152, 2000.

[30] M. Tarter, R. Kronmal. On multivariate density estimates based on orthogonal expansions. The Annals of Mathematical Statistics, 41:718–722, 1970.

[31] J. R. Thompson, P. R. A. Tapia. Non-parametric function estimation, Modeling & Simulation. Misc. Bks. Society for Industrial and Applied Mathematics, SIAM, 3600 Market Street, Floor 6, Philadelphia PA 19104, 1990.


[32] K. Tiels, J. Schoukens. Wiener system identification with generalized orthonormal basis functions. Automatica, 50:3147–3154, 2014.

[33] R. Tibshirani. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society B, 58:267–288, 1996.

[34] O. B. White, J. A. Saint-Cyr, R. D. Tomlinson, J. A. Sharpe. Ocular motor deficits in Parkinson’s Disease, II. Control of the saccadic and smooth pursuit systems. Oxford Journals of Medicine, Brain, Vol. 106, pp. 571–587, 1983.

[35] T. Wigren. Recursive identification based on the nonlinear Wiener model. Ph.D. thesis, Acta Universitatis Upsaliensis, Uppsala Dissertations from the Faculty of Science 31, Uppsala University, Uppsala, Sweden, 1990.


Paper I

Visual Stimulus Design in Parameter Estimation of the Human Smooth Pursuit System from Eye-Tracking Data∗

Daniel Jansson and Alexander Medvedev
Department of Information Technology, Uppsala University
Uppsala, Sweden
E-mail: [email protected], [email protected]

Abstract

The dynamical properties of the human smooth pursuit system (SPS) are studied. Linear black-box and nonlinear Wiener models of the SPS are identified from eye-tracking data in view of their potential applications in diagnosing and staging various clinical conditions. A novel approach to visual stimulus design is suggested and evaluated. Accurate estimation of the linear dynamics requires sufficient input frequency excitation, while the identification of the nonlinear part is dependent upon the signal amplitude distribution. Both aspects of input design are taken into account. Visual stimuli generated using the presented method are shown to yield favorable identification results compared to existing stimulus design techniques in terms of reduced variance of parameter estimates and smaller spread of the parameter clouds pertaining to different individuals. The nonlinear Wiener models of the SPS appear to outperform the linear ones provided the visual stimuli are properly designed.

1 Introduction

Studies of human eye movements have been greatly simplified by the introduction of modern video-based eye-tracking techniques. In particular, studies aimed at developing technology for eye movements as a biometric have become more frequent [4, 13, 15, 17, 18]. The goal of such studies is to use eye-tracking data to quantify the functionality of the human oculomotor system and/or to distinguish between individuals. There are numerous possible applications, for example in medicine, security, and disability aids [7, 12].

∗This study is in part financed by Advanced Grant 247035 from European Research Council entitled “Systems and Signals Tools for Estimation and Analysis of Mathematical Models in Endocrinology and Neurology”.


There are different types of eye movement. All of them are governed by complex neuromuscular systems and are thus susceptible to impairment by disease or normal aging. Saccades and smooth pursuit are perhaps the most frequently mentioned types of eye movement [6].

Various clinical conditions such as Huntington’s Chorea [3] and Schizophrenia [22] are known to impair the smooth pursuit system (SPS). In particular, Parkinson’s disease has been shown to undermine the SPS [10, 15], motivating the search for accurate quantification methods that could then be further used as diagnosing or even staging tools.

The function of the SPS can be described in control engineering terms as velocity tracking since the gaze angular velocity effectively follows the target velocity. This remarkable property resulted in a measure of the SPS performance called smooth pursuit gain (SPG), prevalent in the medical literature on smooth pursuit eye movements (SPEM) [1, 2, 10, 15, 24]. The SPG is defined as the ratio of the angular velocity of the eye to that of the target, typically measured for a certain frequency of the (harmonic) visual input. SPS eye movements are commonly elicited in medical studies by requiring the subjects to track a constant-velocity or sinusoidally oscillating target [11]. While useful in medical examination of the SPS, these simple types of visual stimuli are hardly sufficient for mathematical modeling of eye movements as they yield only one point in the frequency response of the system [18].

In this study, the dynamical nature of the SPS is stressed and its entire frequency response is modeled to gain more insight into the system properties. The frequency response of the SPS is therefore referred to as dynamic SPG (DSPG), indicating that the system gain is evaluated over all frequencies available in the visual stimuli spectrum. Systems appearing equivalent in terms of SPG may be distinguished using DSPG.

A first step towards producing a practically useful technology is the development of a dynamical mathematical model of the SPS and a method that yields low-variance and consistent estimates of the model parameters. The aim of this paper is to obtain such a model of the SPS from eye-tracking data and evaluate its performance. Specifically, the obtained models will be applied to the quantification of Parkinson’s disease symptoms. The results are based on the analysis of gaze-direction data obtained from healthy test subjects and test subjects diagnosed with Parkinson’s disease attempting to track visual stimuli on a computer monitor.

In previous research, little effort has been put into visual stimulus design. Common input signals have been simple ramps [20] and sinusoids [15], and the focus has been on the transient behavior of the SPS. In the identification of dynamical systems, the trajectories of the stimuli must be properly designed to excite the nonlinear dynamics (in frequency and amplitude) of the SPS. The authors are unaware of any previous attempts at appropriate extended-time stimulus design. Therefore, a new way of designing visual stimuli by specifying the excitation properties in the frequency domain and enforcing optimization constraints to shape the signal amplitude distribution is suggested, and the necessary mathematical tools to obtain these desired properties are derived.

The method developed herein augments the linear dynamical model of the SPS in [18] with a static output nonlinearity, forming a Wiener model of the SPS. The parameters of the model are then estimated from a large number of recorded data sets from the same individual. The parameter estimates are used together with orthogonal series approximation (OSA) to capture the shape of the probability density function (PDF) describing the distribution of the parameters specific to the individual in question. Parameter estimates of other test subjects can then be tested against said distribution to statistically determine whether they belong to it or not, thereby providing means to distinguish between individuals based on their eye movement.

The paper is organized as follows: In Section 2, the SPS models under consideration are introduced. The excitation necessary for model identification is then discussed and a method of generating suitable input signals is provided in Section 3. Section 4 describes how the input signals are generated and implemented as visual stimuli. Probability density estimation through OSA is then described in Section 5, followed by the definition of the outlier region and means to find it in Section 6. In Section 7, the experimental setup is described and the experiments conducted are explained. The results of the experiments are then presented in Section 8, followed by conclusions and a discussion in Section 9.

2 Modeling

The initial structure to model the SPS in this study is a linear time-invariant ARX model given by

$$(1 + a_1 q^{-1} + a_2 q^{-2} + a_3 q^{-3})\,x(n) = b\,q^{-4} s(n) + e(n), \tag{1}$$

where $x(n)$ is the output at time $nT_s$, $s(n)$ is the input at time $nT_s$, $e(n)$ is zero-mean white Gaussian noise with variance $\sigma^2$, $\theta = [a_1, a_2, a_3, b]^T$ is the parameter vector of the model, $q$ is the forward time shift operator, and $T_s$ is the sampling time. It is assumed that the SPS is described by two independent single-input-single-output (SISO) models in parallel, one for horizontal and one for vertical movements of the eye. This particular ARX model structure was chosen because repeated experiments showed that it produced the best identification results out of all considered linear models. The model order was chosen using the Rissanen minimum description length (MDL) criterion [19].

The simple linear model (1) captures the overall dynamics of the SPS well enough to be used in some applications [18], but for other applications, a higher modeling accuracy is needed. A fifth-order nonlinear model derived from the physical properties of the external oculomotor system is presented in [17]. Although such a model gives insight into the physical nature of the SPS, estimating its parameters is difficult due to a multitude of local minima in the resulting loss function. The most apparent flaw of the linear models used to model the SPS is their inability to accurately predict the output amplitudes [17]. One way to alleviate this problem is to augment model (1) with a static output nonlinearity, $f(x)$, yielding a Wiener-type model. In this paper, $f$ was for simplicity chosen to be a continuous piecewise-linear function defined on $x \in (-\infty, \infty)$ and specified by its values at the grid points $x_j$, where $\{x_j\}_{j=1}^{K}$ is a uniform grid on the symmetric interval

$$I = [-\lambda, \lambda]. \tag{2}$$

The lines in the first and last subintervals are extrapolated to $x = \pm\infty$. Since zero input is assumed to give zero output, the center grid point is always at the origin, and the number of unknown parameters in the static nonlinearity is thus $K - 1$, the same as the number of subintervals. The resulting nonlinear model of the SPS is given by

$$\begin{cases} (1 + a_1 q^{-1} + a_2 q^{-2} + a_3 q^{-3})\,x(n) = q^{-4} s(n), \\ y(n) = f(x(n)) + e(n). \end{cases} \tag{3}$$

The parameter $b$ has been set to 1 in this model to remove ambiguities in the gain estimation. Note that no attempt is made herein to assign any physiological meaning to the nonlinearity.

Identification of the model parameters of (1), in the absence of the nonlinearity $f$, was carried out by solving the standard linear least-squares problem, i.e. by minimizing the sum of squared residuals between the model output and the measured output data. When the full nonlinear model (3) was used, the parameters of the model were estimated using a MATLAB toolbox for identification of Wiener systems [27].
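As an illustration of the least-squares step for model (1), the following minimal Python sketch simulates the ARX structure, recovers $\theta$ from input and output data, and evaluates a piecewise-linear $f$ with extrapolated end segments. The function names and the numerical parameter values used with them are assumptions made for this example; the sketch does not reproduce the MATLAB toolbox of [27] or any estimate reported in the paper.

```python
import numpy as np


def simulate_arx(a, b, s, noise=None):
    """Simulate (1): x(n) = -a1 x(n-1) - a2 x(n-2) - a3 x(n-3) + b s(n-4) + e(n)."""
    s = np.asarray(s, dtype=float)
    e = np.zeros(len(s)) if noise is None else np.asarray(noise, dtype=float)
    x = np.zeros(len(s))
    for n in range(len(s)):
        acc = e[n]
        for k, ak in enumerate(a, start=1):
            if n - k >= 0:
                acc -= ak * x[n - k]
        if n - 4 >= 0:
            acc += b * s[n - 4]
        x[n] = acc
    return x


def fit_arx(s, x):
    """Least-squares estimate of theta = [a1, a2, a3, b] in (1)."""
    s = np.asarray(s, dtype=float)
    x = np.asarray(x, dtype=float)
    n = np.arange(4, len(x))
    # Regressor rows: [-x(n-1), -x(n-2), -x(n-3), s(n-4)]
    Phi = np.column_stack([-x[n - 1], -x[n - 2], -x[n - 3], s[n - 4]])
    theta, *_ = np.linalg.lstsq(Phi, x[n], rcond=None)
    return theta


def piecewise_linear(x, grid, values):
    """Continuous piecewise-linear f given by its values on an increasing grid.

    The first and last segments are extrapolated beyond the grid.
    """
    x = np.asarray(x, dtype=float)
    idx = np.clip(np.searchsorted(grid, x) - 1, 0, len(grid) - 2)
    slope = (values[idx + 1] - values[idx]) / (grid[idx + 1] - grid[idx])
    return values[idx] + slope * (x - grid[idx])
```

With noise-free data the estimate returned by `fit_arx` should match the simulated parameters closely; with measurement noise on the output, the usual caveats about least squares for ARX structures apply.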

3 Input Design

The models in (1) and (3) aim at capturing the dynamics of the SPS. When estimating the model parameters from experimental data, the parameter values are individual and may depend on e.g. age, gender and clinical conditions. The properties of the model input have great influence on the resulting parameter estimates and hence effort should be put into designing appropriate input signals. In this study, the input is a dynamical visual stimulus in the form of a white moving circle on a computer monitor. Horizontal and vertical components are studied separately, and an input signal is thus a one-dimensional function of time constituting the horizontal or vertical movements of the stimulus. The movement of the circle must be smooth and rich enough to provide excitation for consistent parameter estimation. Two important measures are taken to achieve this:

• For the linear dynamics, the spectrum of the input must be designed so that it excites all the important frequency modes of the system. High frequencies must be suppressed to avoid additional nonlinear phenomena not described by models (1) and (3). If the stimuli are too rapid, the smooth pursuit mechanism will transfer control to other mechanisms, e.g. the saccadic system [6].

• The static nonlinearity determines how different input values are scaled and can only be accurately estimated if its input signal possesses sufficiently many values in the range of interest (I in (2)). By introducing a peak-to-average power ratio (PAR) constraint, the signal distribution over its dynamic range may be brought closer to uniform and previously under-represented amplitudes become more frequent. This effect of the PAR constraint was discovered through experiment.
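To give a feel for the PAR mentioned in the second item, the short Python sketch below computes the ratio of peak power to average power for a sampled signal. The function name `par` is an assumption for illustration only. A sinusoid sampled over whole periods has a PAR of about 2, while a single impulse of length $N$ reaches the largest possible value, $N$.

```python
import numpy as np


def par(s):
    """Peak-to-average power ratio: max_n |s_n|^2 divided by the mean of |s_n|^2."""
    s = np.asarray(s, dtype=float)
    return np.max(s ** 2) / np.mean(s ** 2)


# Example signals: five full periods of a sinusoid, and a single impulse.
n = np.arange(1000)
sinusoid = np.sin(np.pi * n / 100)
impulse = np.zeros(64)
impulse[3] = 1.0
```

A low PAR bound forces the signal energy to be spread over many samples, which is why tightening it tends to flatten the amplitude histogram.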

Since estimation of the linear and the nonlinear part cannot be done separately, the designed input signal must have both of the properties mentioned above. Let $s(t)$ be the sought continuous input signal over the time interval $[0, T_n]$. Discretize the interval in $N$ steps by $0 = t_0 < t_1 < \dots < t_{N-1} < T_n$. The discrete version of the signal is then $s_n = s(t_n)$, $n = 0, \dots, N-1$. Let $s = [s_0\ s_1\ \dots\ s_{N-1}]^T$ be the signal written in vector form. Assume that there is some desired power spectral density (PSD) $d = [d_0\ d_1\ \dots\ d_{N-1}]^T$ that the spectrum of $s$ should be as close to as possible in some appropriate metric. For $s$ to be real-valued, $d$ needs to be symmetric around its center point. Furthermore, it is also desired for $s$ to have a maximum allowed PAR. If the vector 2-norm of $s$, denoted by $\|\cdot\|$, is constrained to be equal to $\sqrt{N}$, the PAR can be written as

$$\mathrm{PAR}(s) = \frac{\max_n |s_n|^2}{\frac{1}{N}\sum_{n=0}^{N-1} |s_n|^2} = \max_n |s_n|^2.$$

Since $\|s\|^2 = N$, the upper bound for the PAR is $N$. The following optimization problem can be formulated to obtain an $s$ with the two properties mentioned above:

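$$\begin{aligned} \min_{s}\quad & f = \sum_{p=0}^{N-1} \left|\,|z_p| - \sqrt{d_p}\,\right|^2 \\ \text{s.t.}\quad & \|s\|^2 = N, \\ & \mathrm{PAR}(s) \le \rho, \end{aligned} \tag{4}$$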

where $\rho$ is the maximum allowable PAR and $z = [z_0\ z_1\ \dots\ z_{N-1}]^T$ is the Fourier domain representation of $s$, which can also be written as $z = F^H s$, where $\cdot^H$ denotes the conjugate transpose operator. $F^H$ is the Fourier matrix given by

$$F^H = \frac{1}{\sqrt{N}} \begin{bmatrix} 1 & 1 & \cdots & 1 \\ 1 & \omega & \cdots & \omega^{N-1} \\ 1 & \omega^2 & \cdots & \omega^{2(N-1)} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & \omega^{N-1} & \cdots & \omega^{(N-1)(N-1)} \end{bmatrix}_{N \times N}$$

with $\omega = e^{-2\pi i/N}$.

This way of generating real-valued input sequences with a desired spectral density and a maximum allowable PAR is a modified version of the method derived in [23], where complex-valued white sequences are generated.

Expanding the expression in the criterion function of problem (4) gives

$$f = \sum_{p=0}^{N-1} \left|\,|z_p| - \sqrt{d_p}\,\right|^2 = \sum_{p=0}^{N-1} \left( |z_p|^2 - 2|z_p|\sqrt{d_p} + d_p \right).$$

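By introducing the auxiliary variables $\phi = [\phi_0\ \phi_1\ \dots\ \phi_{N-1}]^T$ with $\phi_p \in [-\pi, \pi]$, and denoting $\arg z_p$ by $\theta_p$, problem (4) can be rewritten as

$$\begin{aligned} \min_{s,\phi}\quad f &= \sum_{p=0}^{N-1} \left( |z_p|^2 - 2|z_p|\sqrt{d_p}\cos(\theta_p - \phi_p) + d_p \right) \\ &= \sum_{p=0}^{N-1} \left( |z_p|^2 - |z_p|\sqrt{d_p}\left( e^{i(\theta_p - \phi_p)} + e^{-i(\theta_p - \phi_p)} \right) + d_p \right) \\ &= \sum_{p=0}^{N-1} \left( |z_p|^2 - z_p\sqrt{d_p}\,e^{-i\phi_p} - z_p^{*}\sqrt{d_p}\,e^{i\phi_p} + d_p \right) \\ \text{s.t.}\quad & \|s\|^2 = N, \\ & \mathrm{PAR}(s) \le \rho. \end{aligned}$$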

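Thus (4) is equivalent to the following minimization problem

$$\begin{aligned} \min_{s,\phi}\quad & \sum_{p=0}^{N-1} \left| z_p - \sqrt{d_p}\,e^{i\phi_p} \right|^2 \\ \text{s.t.}\quad & \|s\|^2 = N, \\ & \mathrm{PAR}(s) \le \rho. \end{aligned}$$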

The problem can be rewritten in a more convenient vector form

$$\begin{aligned} \min_{s,\phi}\quad & f = \|F^H s - v\|^2 \\ \text{s.t.}\quad & \|s\|^2 = N, \\ & \mathrm{PAR}(s) \le \rho, \end{aligned} \tag{5}$$

where $v = [\sqrt{d_0}\,e^{i\phi_0}\ \sqrt{d_1}\,e^{i\phi_1}\ \dots\ \sqrt{d_{N-1}}\,e^{i\phi_{N-1}}]^T$.

The optimization problem in (5) can be solved in a cyclic way. Fix $s$ to any real sequence and compute the $v$ that minimizes $f$. The $v$ that minimizes $f$ for a fixed $s$ is obtained by letting

$$\phi_p = \arg(\text{the } p\text{th element of } F^H s), \tag{6}$$

where $\arg(\cdot)$ denotes the complex argument. Next fix $v$ and write the minimization problem as

$$\begin{aligned} \min_{s}\quad & \|s - Fv\|^2 \\ \text{s.t.}\quad & \|s\|^2 = N, \\ & \mathrm{PAR}(s) \le \rho, \end{aligned} \tag{7}$$

where the fact that $F^H F$ equals the identity matrix, i.e. that $F$ is unitary, was used to rewrite the criterion function. Problem (7) has been called the “nearest-vector” problem and can be solved using the methodology in [26]. To disregard the PAR constraint, $\rho$ can be set to $N$, which is the maximum possible PAR for any sequence with the norm equal to $\sqrt{N}$. Iterating between (6) and (7) until convergence gives a signal $s$ satisfying the design specifications. The design procedure is outlined in Algorithm 1. The algorithm has the property of monotonically decreasing the criterion as the iteration proceeds [8].

Algorithm 1 Minimization of the criterion in (5)

1. Initialize $s$ to a random vector $s^0$. Iterate steps 2 and 3 below, for $i = 0, 1, \dots$ until convergence.

2. $\phi^i = \arg(F^H s^i)$, taken elementwise as in (6).

3. Let $s^{i+1}$ be the solution to the nearest-vector problem (7) solved in [26] and set $i \leftarrow i + 1$.
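The cyclic procedure can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are invented for the example, the unitary DFT of NumPy (`norm="ortho"`) plays the role of $F^H$, and the nearest-vector step is a simple bisection-based stand-in for the method of [26]. The stand-in relies on the observation that, for a target vector $c$ without zero entries and $\rho \ge 1$, the constrained minimizer has the form $s_n = \mathrm{sign}(c_n)\min(\beta |c_n|, \sqrt{\rho})$ with $\beta$ chosen so that $\|s\|^2 = N$.

```python
import numpy as np


def nearest_vector(c, rho, n_bisect=200):
    """Real s closest to c subject to ||s||^2 = N and max_n s_n^2 <= rho.

    Assumed stand-in for the method of [26]: scale c by beta, clip the
    magnitudes at sqrt(rho), and find beta by bisection on the energy.
    """
    N = c.size
    mag = np.abs(c)
    delta = np.sqrt(rho)
    lo, hi = 0.0, delta / mag[mag > 0].min()  # at hi every nonzero entry is clipped
    for _ in range(n_bisect):
        beta = 0.5 * (lo + hi)
        if np.sum(np.minimum(beta * mag, delta) ** 2) < N:
            lo = beta
        else:
            hi = beta
    return np.sign(c) * np.minimum(hi * mag, delta)


def design_signal(d, rho, n_iter=50, seed=0):
    """Cyclic minimization of (5); returns the signal and the criterion history."""
    N = d.size
    rng = np.random.default_rng(seed)
    s = rng.standard_normal(N)
    s *= np.sqrt(N) / np.linalg.norm(s)           # random start with ||s||^2 = N
    history = []
    for _ in range(n_iter):
        z = np.fft.fft(s, norm="ortho")           # z = F^H s
        history.append(np.sum((np.abs(z) - np.sqrt(d)) ** 2))
        phi = np.angle(z)                         # step 2, eq. (6)
        v = np.sqrt(d) * np.exp(1j * phi)
        target = np.fft.ifft(v, norm="ortho").real  # real part of F v
        s = nearest_vector(target, rho)           # step 3, problem (7)
    return s, np.array(history)
```

Taking the real part of $Fv$ before the nearest-vector step is exact for real $s$, since the imaginary part only adds a constant to the criterion in (7). The random start need not satisfy the PAR bound, so the criterion is only expected to be non-increasing from the first feasible iterate onward.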

4 Data Generation

Identifying the models of Section 2 requires input and output data of the system. Input data are visual stimuli in the form of a moving circle on a computer monitor that are generated using two different methods:

1. Method 1 (M1): Using a method presented in [18], where the acceleration of the circle is white Gaussian noise in both the horizontal and the vertical direction and the circle “bounces” when it reaches the window boundaries. Using this method the spectrum of the signal cannot be assigned beforehand and there is no control of the signal distribution.

2. Method 2 (M2): Using Algorithm 1 to obtain two signals, one for the horizontal and one for the vertical component of the stimuli, with the desired spectral content d and a maximum PAR, ρ.

The latter method requires the specification of the desired signal amplitude spectrum, denoted by d in (4), and the maximum allowable PAR, denoted by ρ in (4). As mentioned in Section 3, signals with rich spectral content up to a certain cutoff frequency, above which the SPS will transfer control to other mechanisms, are sought. The ideal spectrum is thus the one that is constant for all frequencies smaller than the chosen cutoff and drops to zero for all frequencies above it. When specifying the sequence $d_p$ in (4) that represents the desired spectrum of the discrete signal, the sampling frequency, $f_s$, and the number of samples, $N$, must be considered. The sequence $d_p$ must also be made symmetric around its center in order for the corresponding time domain sequence to be real-valued. The desired sequence $d_p$ is thus

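$$d_p = \begin{cases} \dfrac{f_s}{2 f_c}, & 0 \le p < N\dfrac{f_c}{f_s}, \\[2ex] 0, & N\dfrac{f_c}{f_s} \le p < \dfrac{N(f_s - f_c)}{f_s}, \\[2ex] \dfrac{f_s}{2 f_c}, & \dfrac{N(f_s - f_c)}{f_s} \le p \le N - 1, \end{cases} \tag{8}$$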

where $f_c$ is the desired cutoff frequency. The constant value of the spectrum for frequencies smaller than $f_c$ is chosen so that the total signal energy is indeed $N$, as was required in (4). The cutoff frequency was chosen to be $f_c = 1.5$ Hz and the maximum allowable PAR was set to $\rho = N/10$; see Section 9 for details on the choice of spectrum and PAR constraint. The signals are scaled to fit inside the window in which they are displayed on the monitor, so the signal energy may vary between different stimuli sets.
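A small Python sketch of the desired spectrum in (8) is given below; the function name `desired_spectrum` is an assumption for illustration. For the values used in this study ($N = 2560$, $f_s = 60$ Hz, $f_c = 1.5$ Hz) the passband covers 64 bins at each end of the sequence with level $f_s/(2f_c) = 20$, so the entries sum to $N$ as required.

```python
import numpy as np


def desired_spectrum(N, fs, fc):
    """Sequence d_p of (8): flat up to the cutoff fc, mirrored at the upper end."""
    p = np.arange(N)
    level = fs / (2.0 * fc)
    passband = (p < N * fc / fs) | (p >= N * (fs - fc) / fs)
    return np.where(passband, level, 0.0)


d = desired_spectrum(2560, 60.0, 1.5)
```

When $N f_c / f_s$ is not an integer the number of passband bins is rounded by the inequalities in (8), and the sum of $d_p$ then only approximates $N$.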

5 Distribution Estimation

The normal distribution (Gaussian distribution) is commonly used to approximate the statistical properties of the underlying stochastic variable of a given data set. However, normal distribution fitting may not always give satisfactory results, depending on the studied quantity. Should the data appear not to be normally distributed, an alternative method of distribution estimation is OSA. In OSA nothing is presumed about the underlying distribution to be estimated. It is thus a non-parametric way of estimating an unknown PDF, in contrast to the parametric method of normal distribution fitting.

5.1 Orthogonal Series Approximation

A non-parametric way of obtaining smooth estimates of the PDF of an unknown distribution is by OSA [21, 25]. Assume that the PDF $f$ of the $N$-dimensional continuous random variable $X$ is to be estimated. Further assume that $X$ is supported in the domain $D$, i.e. $P(X \in D) = 1$. If $f$ is square integrable ($f \in L_2(D)$), the density may be approximated with arbitrary accuracy by a truncated orthogonal series

$$f(x) \approx \sum_{j \in J} c_j \phi_j(x), \quad x \in D,$$

where

$$c_j = \int_D f(x)\phi_j(x)\,dx,$$

$J$ is a finite set of $N$-tuples of integers and $\{\phi_j(x)\}$ is an orthonormal basis. The largest integer in each dimension in $J$ gives the highest approximation order of that dimension and must be chosen by the user. The highest order in each dimension will decide the number of the basis functions that will be used in the approximation. Note that because $f$ is a probability density, each coefficient in the above mentioned partial sum can be written as the expectation

$$c_j = \int_D f(x)\phi_j(x)\,dx = E\{\phi_j(X)\}.$$

Hence, estimating $c_j$ can be done via the sample mean

$$\hat{c}_j = \frac{1}{N_s} \sum_{i=1}^{N_s} \phi_j(x_i),$$

where $(x_1, x_2, \dots, x_{N_s})$, $x_i \in \mathbb{R}^N$, are independent observations of the underlying stochastic variable. In this study, all considered distributions are of dimension two and the orthonormal basis is chosen to be

$$\{\phi_{n_1,n_2}(x)\} = \left\{ \sqrt{\det(\Gamma)}\,\varphi_{n_1,n_2}(\Gamma(x - \mu)) \right\}, \tag{9}$$

where $\{\varphi_{n_1,n_2}\}$ is the complete set of two-dimensional Hermite functions [14]. They are orthonormal with respect to the $L_2$ inner product. The Hermite functions, $\{\varphi_{n_1,n_2}(x)\}$, $x \in \mathbb{R}^2$, are given by

$$\varphi_{n_1,n_2}(x) = \frac{1}{\sqrt{2^{n_1} 2^{n_2} n_1!\, n_2!\, \pi}}\, e^{-(x^T x)/2} H_{n_1,n_2}(x),$$

where

$$H_{n_1,n_2}(x) = 2^{\frac{n_1 + n_2}{2}}\, He_{n_1,n_2}(\sqrt{2}\,x)$$

are the physicists’ Hermite polynomials [14] and

$$He_{n_1,n_2}(x) = (-1)^{n_1 + n_2} e^{(x^T x)/2} \frac{\partial^{n_1}}{\partial x_1^{n_1}} \frac{\partial^{n_2}}{\partial x_2^{n_2}} e^{-(x^T x)/2}$$

are the probabilists’ Hermite polynomials [14].


The reason for using Hermite functions is their effectiveness in approximating distributions similar to normal ones, but in practice the choice of basis usually influences estimation only slightly [5]. The vector $\mu \in \mathbb{R}^N$ and the matrix $\Gamma \in \mathbb{R}^{N \times N}$ are the user parameters for scaling and translating the functions. The choice of the user parameters $\mu$ and $\Gamma$ will depend on the data. Choosing $\mu$ to be the sample mean of the observations reduces the number of the Hermite functions required in the truncated series to achieve a given estimation error [21]. If $\Gamma$ is chosen as a diagonal matrix, its diagonal elements will decide the width of the functions in the corresponding dimension. The choice of the diagonal elements should be based on the variance of the considered observations in each dimension. Choosing the functions to be too narrow will increase the required function order for accurate estimation, and choosing the functions to be too wide will ’smudge’ the estimated distribution, reducing the significance of single observations.
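The estimator of this section can be sketched in a few lines of Python. The sketch assumes a diagonal $\Gamma$ and builds the two-dimensional Hermite functions as products of one-dimensional ones, using the physicists' Hermite polynomials provided by NumPy; all function names are assumptions made for the example.

```python
import math

import numpy as np
from numpy.polynomial.hermite import hermval


def hermite_function(n, x):
    """Orthonormal one-dimensional Hermite function of order n."""
    coeffs = np.zeros(n + 1)
    coeffs[n] = 1.0                      # selects the physicists' H_n
    norm = math.sqrt(2.0 ** n * math.factorial(n) * math.sqrt(math.pi))
    return np.exp(-0.5 * x ** 2) * hermval(x, coeffs) / norm


def basis(n1, n2, x, mu, gamma_diag):
    """Scaled basis function of (9) for diagonal Gamma; x has shape (M, 2)."""
    u = (x - mu) * gamma_diag
    scale = math.sqrt(float(np.prod(gamma_diag)))
    return scale * hermite_function(n1, u[:, 0]) * hermite_function(n2, u[:, 1])


def osa_fit(samples, mu, gamma_diag, order):
    """Sample-mean estimates of the coefficients c_j for n1, n2 <= order."""
    return {(n1, n2): basis(n1, n2, samples, mu, gamma_diag).mean()
            for n1 in range(order + 1) for n2 in range(order + 1)}


def osa_density(coeffs, x, mu, gamma_diag):
    """Truncated series estimate of the density at the points x (shape (M, 2))."""
    return sum(c * basis(n1, n2, x, mu, gamma_diag)
               for (n1, n2), c in coeffs.items())
```

As a sanity check, for standard normal data with $\mu = 0$ and $\Gamma$ equal to the identity, the true density is proportional to the lowest-order basis function, so the coefficient for $(0, 0)$ should be close to $1/(2\sqrt{\pi})$ and the remaining coefficients close to zero.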

6 Outlier Region

Recall that all considered distributions in this study are of dimension two. Assume that an observation, $x \in \mathbb{R}^2$, is made and that it must be determined whether it is likely to be an observation of a given random variable $X$, or not. A hypothesis test with the null hypothesis:

• H0: x is an observation of X

must be carried out. One way to do this is to define an outlier region, $S$, of the random variable, being the set of all possible observations deemed unlikely to come from the considered distribution, i.e. all $x$ for which $H_0$ is rejected. The probability that an observation of $X$ lies in $S$ should be low. The complement of the outlier region is the confidence region corresponding to the set of all possible observations deemed likely to come from the distribution. Define $\alpha$ such that

$$P(X \in S) = \int_S f(x)\,dx = \alpha.$$

Thus, $\alpha$ is the probability with which an observation of the considered random variable is incorrectly deemed to be from some other distribution. The choice of $\alpha$ will influence the size of the outlier region $S$.

The outlier region of a random variable $X$ with probability density $f$ is given by

$$S = \left\{ x : \int_S f\,dx = \alpha,\ f(x) \le f(x_c),\ \forall x_c \in S^c \right\}, \tag{10}$$

where $S^c$ is the complement set of $S$ in $\mathbb{R}^2$. If $f$ is positive and continuous, $S$ will be uniquely defined by (10). This can be seen by considering the integral

$$J = \int_Q f(x)\,dx \tag{11}$$

where

$$Q = \{x : f(x) \le \gamma\}, \tag{12}$$

and $\gamma$ is some positive constant. If $f$ is positive and continuous, increasing $\gamma$ in (12) will continuously increase the value of $J$ in (11). For some value of $\gamma$, denoted by $\gamma_T$, $J$ will be equal to $\alpha$ and thus $Q$ will be equal to $S$. Since there is only one value of $\gamma$ which yields $J = \alpha$, $S$ is unique.

Determining whether a given observation $x$ is part of $S$ is not straightforward from (10), and finding $S$ analytically is not possible in the general case, but $S$ can be approximated numerically through Algorithm 2.

Algorithm 2 Finding the outlier region of a PDF numerically

1. Calculate $f$ for the finite set of uniformly spaced grid points $\{x_i\}_{i=1}^{M}$ to obtain $\{f_i\}_{i=1}^{M}$.

2. Let $\{f_{(i)}\}_{i=1}^{M}$ be $\{f_i\}_{i=1}^{M}$ sorted in ascending order.

3. Find $K$ such that $\sum_{i=1}^{K} f_{(i)} \le \frac{\alpha}{A} < \sum_{i=1}^{K+1} f_{(i)}$, where $A$ is the area of a grid element.

4. An approximation, $\hat{S}$, of $S$ is then given by

$$\hat{S} = \{x : f(x) \le f_{(K)} = \gamma_T\}. \tag{13}$$

The idea of the algorithm is as follows. Keep summing the smallest elements of $\{f_i\}_{i=1}^{M}$ while the sum does not exceed $\frac{\alpha}{A}$. Denote the largest term in the sum by $\gamma_T$. Let $\hat{S}$ be the set of all $x_i$ corresponding to the summed $f_i$. For a given observation $x$, $H_0$ is rejected if $x \in \hat{S}$, which is easily checked by inserting $x$ into (13).

The error in approximating $S$ by $\hat{S}$ is not discussed here, but depends on the grid size and the smoothness of $f$. The grid size should be chosen so that $f$ does not change significantly within one grid element.
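Algorithm 2 translates almost directly into Python. The sketch below is an illustration under assumptions (the function names are invented here, and returning minus infinity when no grid value fits under the threshold is a choice made for the example). For a standard two-dimensional normal density the exact threshold is $\gamma_T = \alpha/(2\pi)$, which gives a convenient reference for checking the grid resolution.

```python
import numpy as np


def outlier_threshold(f_values, cell_area, alpha):
    """Steps 1 to 3 of Algorithm 2: return gamma_T for density values on a grid."""
    f_sorted = np.sort(np.ravel(f_values))            # ascending order
    cumulative = np.cumsum(f_sorted) * cell_area      # approximates the integral J
    K = int(np.searchsorted(cumulative, alpha, side="right"))
    if K == 0:
        return -np.inf                                # assumed convention: empty region
    return f_sorted[K - 1]


def is_outlier(f_at_x, gamma_T):
    """Step 4, eq. (13): reject H0 when the density at x is at most gamma_T."""
    return f_at_x <= gamma_T


# Reference case: standard 2-D normal density on a uniform grid.
axis = np.linspace(-6.0, 6.0, 601)
X1, X2 = np.meshgrid(axis, axis)
f_grid = np.exp(-0.5 * (X1 ** 2 + X2 ** 2)) / (2.0 * np.pi)
gamma_T = outlier_threshold(f_grid, cell_area=0.02 ** 2, alpha=0.05)
```

In practice `f_values` would hold the OSA density estimate evaluated on the grid rather than a known density.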

7 Experiments

Gaze direction data of test subjects attempting to track the moving circle on a computer monitor were recorded using a video-based eye tracker from Smart Eye AB, Sweden. Test subjects were placed 50 cm from the monitor with the monitor center at eye height. The eye tracker output is the distance in centimeters (horizontal and vertical components separately) between the monitor center and the point where the gaze direction line intersects the plane of the monitor. Eye-tracking data were sampled at a sampling frequency of $f_s = 60$ Hz.

The size of the window limiting the movements of the circle was 25 × 25 cm. The interval in (2) is thus chosen to be [−12.5, 12.5] cm, which will be the maximum dynamic range of the input and output signals’ horizontal and vertical components, respectively.

Stimuli of length N = 2560 samples (about 42 seconds) were generated with both M1 and M2 of Section 4 and displayed to test subjects of different ages. The conducted experiment involved three healthy test subjects:

• H1: Man, 26 years old



and five test subjects diagnosed with Parkinson’s disease:

• P1: Woman, 57 years old

• P2: Man, 71 years old




Data sets obtained using stimuli generated with M1 will in what follows be referred to as M1 sets, and sets obtained using M2-generated stimuli will be referred to as M2 sets.

Two experiments were conducted.

7.1 Experiment 1

The first experiment was designed to investigate the importance of input signal frequency content when estimating the linear dynamics (1) of the SPS. Stimuli shown to the test subjects were generated with both M1 and M2 of Section 4. In M2, the PAR constraint was omitted, as only frequency excitation was to be considered in this experiment.

7.2 Experiment 2

The second experiment was designed to indicate the importance of the input signal amplitude distribution when estimating the static nonlinearity in (3) of the SPS. Using M2 of Section 4, 25 stimuli were generated with maximum allowable PAR ρ = N/10, and another 25 using the same method, but without the PAR constraint. H1 was asked to watch each stimulus, generating a total of 50 data sets.

Figure 1: The PSD of three signals generated with a) Method 1, b) Method 2 without PAR constraint, c) Method 2 with PAR ≤ ρ. (Axes: frequency in Hz, PSD.)

Figure 2: 25-bin histograms of three signals generated with a) Method 1, b) Method 2 without PAR constraint, c) Method 2 with PAR ≤ ρ. (Axes: centimeters, number of occurrences.)

8 Results

8.1 Input Design

Fig. 1 depicts the spectra (periodograms) of three signals: one obtained by means of Method 1, one obtained by means of Method 2 without the PAR constraint, and one obtained by means of Method 2 with ρ = N/10 (i.e. constraining the PAR to be less than or equal to N/10). Each signal is typical of the method by which it was generated. The spectrum in Fig. 1a) is far from the desired spectrum, d, in (8). It has a few large peaks and is lacking excitation for many frequencies. The spectrum in Fig. 1b) is significantly closer to the desired spectrum, d. Fig. 1c) shows a similar spectrum to that in Fig. 1b).

Fig. 2 shows 25-bin histograms of the three signals whose spectra are given in Fig. 1, after they were scaled to fit the display window. The histograms are approximations of the signal distributions. The histogram in Fig. 2a) shows that for this signal, some values occur far more frequently than others. The same can be said about the histogram in Fig. 2b), where values close to zero seem to have a higher occurrence rate. The histograms of Fig. 2a) and b) show that the signal distribution is almost zero over some intervals. The histogram of the PAR-constrained signal is closer to uniform than the two others, as can be seen in Fig. 2c), and all values in its dynamic range are well represented.

Figure 3: 25-bin histograms showing the distributions of the input (left) and the output (right) of the linear block in (3) for two different signals. (Axes: normalized amplitude, number of occurrences.)
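The two summaries compared in Figs. 1 and 2 can be computed for any candidate stimulus with a few lines of Python. The sketch below is illustrative only: the function names are assumptions, and the periodogram is taken as $|z_p|^2$ with the unitary DFT used in (4), which may differ by a scale factor from the PSD plotted in Fig. 1.

```python
import numpy as np


def periodogram(s, fs):
    """Frequencies (Hz) and |z_p|^2 for z = F^H s, using the unitary DFT."""
    z = np.fft.fft(s, norm="ortho")
    freqs = np.fft.fftfreq(len(s), d=1.0 / fs)
    return freqs, np.abs(z) ** 2


def amplitude_histogram(s, half_width=12.5, bins=25):
    """Histogram of signal values over [-half_width, half_width] (cm)."""
    counts, edges = np.histogram(s, bins=bins, range=(-half_width, half_width))
    return counts, edges


# Example: a random signal scaled to fill the 25 cm display window.
rng = np.random.default_rng(0)
s = rng.standard_normal(2560)
s *= 12.5 / np.max(np.abs(s))
freqs, psd = periodogram(s, fs=60.0)
counts, edges = amplitude_histogram(s)
```

By Parseval's relation the periodogram values sum to the signal energy, so a flat passband and an even histogram can be checked on the same scaled signal.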

8.2 Input to the Nonlinear Block

Fig. 3 shows that the input signal distribution is nearly preserved when passing through the linear part of (3) with typical parameters identified using the data of this paper. This means that input signal design by distribution shaping, to allow for more accurate identification of the static nonlinearity, makes sense.

8.3 Experiment 1

The parameters of linear model (1) were estimated for the horizontal part of 20 M1 sets and 20 M2 sets of all test subjects. The parameter estimates were used together with OSA to approximate the distribution of the parameters in each test subject for each stimulus type. To facilitate visual presentation, only the parameters $a_1$ and $b$ were considered in the distribution estimation.

In the OSA estimation, the user parameter $\mu$ in (9) was chosen to be the sample mean of the observations. $\Gamma$ in (9) was chosen to be a diagonal matrix with diagonal elements $r_i = \frac{2}{\sigma_i}$, $i = 1, 2$, where $\sigma_i$ is the sample standard deviation in each dimension of the considered sample of observations, i.e. of the horizontal and vertical data in each time instant. This choice of $\Gamma$ was made through experiment and shown to give the most satisfactory results. The highest order of the Hermite functions was set to 4 in each dimension. Using higher orders showed little or no improvement in the obtained results.

64

Fig. 4 and Fig. 5 depict the boundary of the confidence regions of the distribution estimates of all subjects and highlight three important results: the confidence regions shrink (reduced variance in the parameter estimates), and the estimated distributions for different test subjects are more separated when appropriately designed stimuli are used. Furthermore, it can be seen that the model parameters of the PD patients differ significantly from those of healthy subjects. In fact, since there is no intersection between the outlier regions of the model parameters corresponding to the PD patients and those of healthy subjects, statistically testing each PD parameter set against the healthy distributions is redundant.

Figure 4: The confidence regions for α = 0.05 of the estimated parameter distributions for different test subjects and stimulus types. Dashed vs. solid lines show the effects of using stimuli generated with M1 compared to M2. (Panel a); horizontal axis: a1; vertical axis: b; legend: H1, M1; H1, M2; H2, M1; H2, M2; H3, M1; H3, M2.)

Table 1 gives the standard deviation of the estimates of b and a1 for H1, H2 and H3. The intra-subject variance is significantly lower when using M2 compared to M1 for input signal generation, which can also be seen in Fig. 4. Similar results were obtained for vertical movements.

Figure 5: The confidence regions for α = 0.05 of the estimated parameter distributions for different test subjects and stimulus types. Dashed and solid lines are associated with PD and healthy parameter distributions respectively. (Panel b); horizontal axis: a1; vertical axis: b; legend: H1, H2, H3, P1, P2, P3, P4, P5.)

Person   Method   a1 (STDV)   b (STDV)
H1       M1       0.12        0.024
H1       M2       0.048       0.013
H2       M1       0.087       0.026
H2       M2       0.051       0.010
H3       M1       0.14        0.020
H3       M2       0.036       0.010

Table 1: The standard deviation (STDV) of the estimates of a1 and b for H1, H2 and H3 using different input design methods.

8.4 Experiment 2

For the horizontal part of each of the 50 data sets obtained from the experiment in Section 7.2, the nonlinear Wiener model (3) was identified with two different choices of the static nonlinearity f: one with 6 and one with 10 parameters, i.e., one with the interval I in (2) divided into 6 subintervals and one with 10 subintervals. To show the effects of using a PAR constrained input compared to an unconstrained input, the mean ± one standard deviation of the 25 estimated nonlinearities f was plotted for each input type in Fig. 6 and Fig. 7. The standard deviations of each parameter are given in Table 2. It can be seen that the standard deviation is significantly smaller in the cases where PAR constrained inputs, i.e., inputs with closer-to-uniform amplitude distributions, were used. Similar results were obtained for vertical movements.

            6-parameter nonlinearity          10-parameter nonlinearity
Parameter   ρ = N (no const.)   ρ = N/10      ρ = N (no const.)   ρ = N/10
1           0.27                0.071         0.22                0.11
2           0.21                0.034         0.16                0.067
3           0.17                0.027         0.14                0.047
4           0.17                0.029         0.10                0.041
5           0.18                0.040         0.085               0.030
6           0.30                0.087         0.075               0.041
7                                             0.11                0.055
8                                             0.13                0.059
9                                             0.21                0.082
10                                            0.32                0.11

Table 2: The standard deviations of the estimated parameters of the 6- and 10-parameter static nonlinearities in (3) calculated using the horizontal part of 25 datasets for each input type. Using ρ = N in (4) disregards the PAR constraint.

8.5 Linear vs. Wiener

For the horizontal part of each of the 25 data sets in Section 7.2 with PAR constrained inputs, both the linear model (1) and the nonlinear Wiener model (3) were identified. The remaining 24 data sets were then used for validation. The average decrease in the Mean Square Error (MSE) between the modeled and the measured output when using nonlinear models compared to using linear models was then calculated. Static nonlinearities with 4, 6, 10 and 14 parameters were used. The results are shown in Table 3. Similar results were obtained for vertical movements.

Nonlinearity     Average decrease in MSE (%) compared to linear model
4 parameters     12.5
6 parameters     13.0
10 parameters    12.1
14 parameters    11.9

Table 3: Average decrease in MSE between the modeled and the measured output when using nonlinear models compared to using linear models. 25 models were validated with 24 data sets each.
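One plausible way of arriving at a figure such as those in Table 3 is sketched below. It assumes that the percentage decrease is computed per model and validation set and then averaged; the text does not state whether this or a ratio of average MSEs was used. The function names and all numbers are made up for illustration.

```python
def mse(y_model, y_meas):
    """Mean square error between modeled and measured output sequences."""
    return sum((a - b) ** 2 for a, b in zip(y_model, y_meas)) / len(y_meas)

def average_mse_decrease(mse_linear, mse_nonlinear):
    """Mean percentage decrease over paired (linear, nonlinear) validation MSEs."""
    decreases = [100.0 * (lin - nl) / lin
                 for lin, nl in zip(mse_linear, mse_nonlinear)]
    return sum(decreases) / len(decreases)

# Made-up validation errors for three model/data-set pairs, for illustration only.
mse_linear = [2.0, 4.0, 1.0]
mse_nonlinear = [1.8, 3.4, 0.9]
avg = average_mse_decrease(mse_linear, mse_nonlinear)

# Tiny check of the MSE helper on a hand-computable case.
example_mse = mse([1.0, 2.0, 3.0], [1.0, 1.0, 5.0])   # (0 + 1 + 4) / 3
```

In the setting of Section 8.5 the two lists would hold 25 × 24 entries each, one per identified model and validation data set.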

Figure 6: The mean ± one standard deviation of the 25 estimated 6-parameter nonlinearities f (horizontal movements) using stimuli generated with a) Method 2 with no PAR constraint b) Method 2 with ρ = N/10. (Axes: input vs. output, both ranging from −13 to 13.)

Figure 7: The mean ± one standard deviation of the 25 estimated 10-parameter nonlinearities f (horizontal movements) using stimuli generated with a) Method 2 with no PAR constraint b) Method 2 with ρ = N/10. (Axes: input vs. output, both ranging from −13 to 13.)

9 Conclusions and Discussion

The purpose of this study was to explore the dynamical properties of the human SPS and to develop means for reliable identification of the suggested models. The main contribution of the paper was the suggestion and derivation of a novel method for input signal design, allowing for accurate identification of both the linear dynamics in (1) and the static nonlinearity in (3) of the SPS.

Estimation of the dynamic part requires a sufficiently exciting input with rich spectral content. However, if the input contains components of too high frequency, the control of the eye movements will be transferred from the smooth pursuit system to other mechanisms. The PSD of the input signals was thus designed to resemble that of (8). Fig. 1 shows that the resulting spectra of the frequency domain designed signals are indeed close to d, and that good results are achieved even when constraining the PAR of the signal. Repeated experiments showed that a cutoff frequency fc in (8) of about 1.5 Hz gave the most favorable results, although any value between approximately 1 and 2 Hz produced similar outcomes.

Fig. 4 and Table 1 suggest that the intra-subject variance of the parameters is significantly decreased when the stimuli generated using the presented input design method are used compared to using stimuli with poorer spectral excitation. With low intra-subject variance, individuals will be easier to distinguish between, even those with similar SPS features. This is particularly important in applications aimed at diagnosing and staging diseases, where even small discrepancies in the SPS must be detected.

Estimation of the static nonlinearity requires information about all input values in the dynamic range of the signal. By constraining the generated input signals to have a certain peak-to-average power ratio, the signal distribution can be brought closer to uniform, with more contribution of each value in the signal range, as shown in Fig. 2. The significance of this can be seen in Fig. 6, Fig. 7 and Table 2, where the estimates of the nonlinearities using PAR constrained inputs are more consistent than those obtained with arbitrarily distributed signals. The choice of the maximum allowable PAR will affect the resulting signal distribution. In this study the maximum PAR was set to N/10, because it was shown through repeated experiments to give the most favorable signal distributions.

To motivate the use of nonlinear model (3), it must be shown to perform better than the less complex linear model in (1). Table 3 indicates that an improvement of performance is indeed achieved when using a nonlinear model of the SPS. The MSE reduction is the greatest with a 6-parameter nonlinearity. Using only 4 parameters may not be sufficient to model the true nonlinearity, and using 10 or 14 parameters may overparameterize the system, making model validation difficult.

A secondary objective of the study was to evaluate the ability of the suggested method to distinguish between healthy individuals and individuals diagnosed with Parkinson’s disease based on smooth pursuit eye movements. Fig. 5 indicates that the parameters, and thereby also the dynamics, of the SPS in individuals with Parkinson’s disease differ from those in healthy individuals. The parameters of the SPS also vary depending on age, as can be seen in Fig. 5. However, it is made apparent that deviations in the parameters due to Parkinson’s disease are larger and of different character than the deviations due to age. This is supported by the fact that the parameters of healthy and Parkinson subjects differ even when the compared subjects are of similar age, as can be seen in Fig. 5.

The results suggesting that there indeed are differences between the smooth pursuit eye movements of the considered patients with Parkinson’s disease and the healthy individuals are indicative of the potential of the presented method and encourage future research to further investigate the methods as tools for diagnosing or staging of Parkinson’s disease.

References

[1] L.A. Abel, L. Friedman, J. Jesberger, A. Malki, H.Y. Meltzer. Quantitative assessment of smooth pursuit gain and catch-up saccades in schizophrenia and affective disorders. Biological Psychiatry, 29:1063–1072, 1991.

[2] A.P. Accardo, S. Pensiero, P. Perissutti. Differences in Smooth Pursuit Parameters Evaluated in Adults and Children. 18th Annual International Conference of the IEEE Engineering in Medicine and Biology, Amsterdam, 1996.

[3] G. Avanzini, F. Girotti, T. Caraceni, R. Spreafico. Oculomotor disorders in Huntington’s chorea. Journal of Neurology, Neurosurgery and Psychiatry, 42:581–589, 1979.

[4] R. Bednarik, T. Kinnunen, A. Mihaila, P. Fränti. Eye-Movements as a Biometric. Lecture Notes in Computer Science, 3540:780–789, 2005.

[5] S. T. Buckland. Fitting Density Functions with Polynomials. Journal of the Royal Statistical Society, Series C, Applied Statistics, 41:63–76, 1992.

[6] R. Dodge. Five types of eye movements in the horizontal meridian plane of the field of regard. American Journal of Physiology, 8:307–329, 1903.

[7] A. Duchowski. A breadth-first survey of eye-tracking applications. Behavior Research Methods, Instruments, and Computers, 34(4):455–479, 2002.

[8] J. R. Fienup. Phase retrieval algorithms: a comparison. Applied Optics, 21:2758–2769, 1982.

[9] R. W. Gerchberg, W. O. Saxton. A practical algorithm for the determination of the phase from image and diffraction plane pictures. Optik (Stuttgart), 35:237–246, 1972.

[10] J. M. Gibson, R. Pimlott, C. Kennard. Ocular motor and manual tracking in Parkinson’s disease and the effect of treatment. Journal of Neurology, 50:853–860, 1987.

[11] M. Gorges, E. Pinkhardt, J. Kassubek. Alterations of Eye Movement Control in Neurodegenerative Movement Disorders. Journal of Ophthalmology, 2014.

[12] R. Hammoud. Passive Eye Monitoring: Algorithms, Applications and Experiments. Springer Publishing Company, Incorporated, ISBN: 3540754113, 9783540754114, 2008.

[13] P. Kasprowski. Eye Movements in Biometrics. Lecture Notes in Computer Science, 3087:248–258, 2004.

[14] T. H. Koornwinder, R. Wong, R. Koekoek, R. Swarttouw. Orthogonal Polynomials. NIST Handbook of Mathematical Functions, Cambridge University Press, ISBN 978-0521192255, 2010.

[15] S. Marino, E. Sessam, G. Di Lorenzo, P. Lanzafame, G. Scullica, A. Bramanti, F. La Rosa, G. Iannizzotto, P. Bramanti, P. Di Bella. Quantitative Analysis of Pursuit Ocular Movements in Parkinson’s Disease by Using a Video-Based Eye Tracking System. European Neurology, 58:193–197, 2007.

[16] A. Meyer, M. Böhme, T. Martinetz, E. Barth. A Single-Camera Remote Eye Tracker. Lecture Notes in Computer Science, 4021:208–211, 2006.

[17] D. Jansson, A. Medvedev, H. W. Axelson. Mathematical modeling and grey-box identification of the human smooth pursuit mechanism. IEEE Multi-conference on Systems and Control, Yokohama, Japan, 2010.

[18] D. Jansson, A. Medvedev. Dynamic Smooth Pursuit Gain Estimation from Eye Tracking Data. IEEE Conference on Decision and Control, Orlando, Florida, 2011.

[19] J. Rissanen. Modeling by shortest data description. Automatica, 14:465–471, 1978.

[20] D. A. Robinson, J. L. Gordon, S.E. Gordon. A Model of the Smooth Pursuit Eye Movement System. Biological Cybernetics, 55:43–57, 1986.

[21] S. C. Schwartz. Estimation of probability density by an orthogonal series. The Annals of Mathematical Statistics, 38:1261–1265, 1967.

[22] A. B. Sereno, P. S. Holzman. Antisaccades and Smooth Pursuit Eye Movements in Schizophrenia. Biological Psychiatry, 37:394–401, 1995.

[23] H. He, J. Li, P. Stoica. Waveform Design for Active Sensing Systems - A computational approach. Cambridge University Press, New York, 2011.

[24] T.A. Stuve, L. Friedman, J.A. Jesberger, G.C. Gilmore, M.E. Strauss, H.Y. Meltzer. The relationship between smooth pursuit performance, motion perception and sustained visual attention in patients with schizophrenia and normal controls. Psychological Medicine, 27(1):143–152, 2000.

[25] M. Tarter, R. Kronmal. On multivariate density estimates based on orthogonal expansions. The Annals of Mathematical Statistics, 41:718–722, 1970.

[26] J. A. Tropp, I. S. Dhillon, R. W. Heath, T. Strohmer. Designing structured tight frames via an alternating projection method. IEEE Transactions on Information Theory, 51:188–209, 2005.

[27] T. Wigren. MATLAB Software for Recursive Identification of Wiener Systems. Systems and Control, Department of Information Technology, Uppsala University, 2007.

Paper II

Volterra Modeling of the Smooth Pursuit System with Application to Motor Symptoms Characterization in Parkinson’s Disease

Daniel Jansson and Alexander Medvedev
Department of Information Technology, Uppsala University


Abstract

A new way of modeling the Smooth Pursuit System (SPS) in humans by means of Volterra series expansion is suggested. Together with Gaussian Mixture Models (GMMs), it is utilized to successfully distinguish between healthy controls and Parkinson patients based on their eye movements. To obtain parsimonious Volterra models, orthonormal expansion of the Volterra kernels in Laguerre functions with the coefficients estimated by SParse Iterative Covariance-based Estimation (SPICE) is used. A combination of these two techniques is shown to greatly reduce the number of model parameters without significant performance loss. In fact, the resulting models outperform the Wiener models of previous research despite the significantly lower number of model parameters. Furthermore, the results of this study indicate that the nonlinearity of the system is likely to be dynamical in nature, rather than static, which was previously presumed. The difference between the SPS in healthy controls and Parkinson patients is shown to lie largely in the higher order dynamics of the system. Finally, without the model reduction provided by SPICE, the GMM estimation fails, rendering the model unable to separate healthy controls from Parkinson patients.

1 Introduction

With the help of modern video-based eye trackers, attempts at developing technology to enable the use of human eye movements as a biometric have increased in number and quality [1,5,7,8,10,11]. The goal of such studies is to use eye-tracking data to quantify the functionality of the human oculomotor system and use it to distinguish between individuals. A particular type of eye movement is smooth pursuit, which is activated when an individual observes a smoothly moving target, keeping it within the visual field. The smooth pursuit system (SPS) is, like all other subsystems governing eye movements, a complex neuromuscular system and, as such, is susceptible to impairment by various medical conditions. In [4], it is shown that Parkinson’s disease affects the SPS negatively and this is confirmed in [7–11], where possibilities of quantifying the differences in smooth pursuit eye movements between healthy individuals and individuals diagnosed with Parkinson’s disease are brought to light. However, none of the mentioned papers show any results suggesting what the nature of the discrepancies is.

In [7], the only considered metric is the smooth pursuit gain (SPG), which is pointed out in [9] to be only one point in the frequency characteristics of the system. The SPS is modeled by a fifth-order nonlinear dynamical system in [8] with good results, but at the cost of a heavy computational burden imposed in its individualization. A reduced linear model of the SPS is investigated in [9] and shown to perform well compared to more complicated nonlinear ones. In [10], the linear block is augmented with a static nonlinearity, yielding a Wiener system, an extension that improves on the results of the linear model without significant increase in model complexity.

In this paper, the SPS is modeled by a Volterra series [18]. A number of important applications and theoretical studies have established the strengths and limitations of this approach [12]. One of the most commonly mentioned shortcomings of Volterra models is the large number of parameters required for accurate modeling of even simple nonlinear systems. Here, two measures are taken to overcome overparameterization. First, by expanding the Volterra kernels in the orthonormal Laguerre basis to yield so-called Volterra-Laguerre models [12], the number of required parameters is greatly reduced without performance loss. Moreover, the method presented herein utilizes a sparse estimation technique when identifying the unknown model parameters, to further increase model parsimony. Specifically, the SParse Iterative Covariance-based Estimation (SPICE) method is used, which is hyperparameter-free and derived from a statistically and computationally sound covariance fitting criterion [15]. It yields sparse solutions, i.e., solutions with few non-zero elements, at the cost of somewhat reduced model accuracy. However, it is shown that the benefits gained from increased parsimony outweigh this slight loss of performance.

The obtained models are exploited in two ways. First, the inter-subject variation of the model parameters is evaluated using a Gaussian mixture model (GMM), which is a probabilistic model for identifying subpopulations within an overall population [3]. This is in an attempt to distinguish between healthy individuals and patients diagnosed with Parkinson’s disease based on their eye movements. Secondly, the estimated higher-order dynamics of the models of healthy controls are compared to those of test subjects diagnosed with Parkinson’s disease, in search of consistent condition-specific discrepancies.

The paper is composed as follows: In Sec. 2, the applied method is summarized, including the design of visual stimuli, Volterra-Laguerre and Wiener modeling, details on how the models are identified, and a short discussion on population partitioning. The experimental setup and conducted experiments are described in Sec. 4. The obtained results are presented in Sec. 5. Finally, a discussion and conclusions are provided in Sec. 6.

2 Method

2.1 Visual stimulus

The visual stimuli of this study are generated using the approach in [10]. The stimulus generation method presented therein gives input sequences with the rich spectral and amplitude excitation needed to accurately identify the nonlinear Wiener-type models used to portray the SPS. It should also be appropriate for identification of more general nonlinear models such as the Volterra models, as they essentially represent the same dynamics. The stimuli presented to the test subject are smooth random movements of a white circle in a 25 cm × 25 cm black background window on a computer monitor.

2.2 Laguerre Representation of Linear Dynamics

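The discrete Laguerre functions, $\phi_j(k)$, are given in time domain by

$$\phi_j(k) = \alpha^{\frac{k-j}{2}} \sqrt{1-\alpha} \sum_{\ell=0}^{j} (-1)^{\ell} \binom{k}{\ell} \binom{j}{\ell} \alpha^{j-\ell} (1-\alpha)^{\ell},$$

for all non-negative j and k [12]. They constitute an orthonormal basis in $\ell_2[0,\infty)$ [14] with respect to the inner product

$$\langle f(k), g(k) \rangle = \sum_{k=0}^{\infty} f(k) g(k).$$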
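The orthonormality property can be checked numerically. The sketch below evaluates the closed-form time-domain expression and forms a truncated Gram matrix; the function names, the value α = 0.5 and the truncation length of 400 terms are illustrative assumptions, chosen so that the neglected tail of the infinite sum is negligible.

```python
from math import comb, sqrt

def laguerre(j, k, alpha):
    """Closed-form discrete Laguerre function phi_j(k) for 0 < alpha < 1."""
    s = sum((-1) ** l * comb(k, l) * comb(j, l)
            * alpha ** (j - l) * (1 - alpha) ** l
            for l in range(j + 1))
    return alpha ** ((k - j) / 2) * sqrt(1 - alpha) * s

def gram(order, alpha, n_terms=400):
    """Truncated inner products <phi_i, phi_j> for i, j = 0..order."""
    return [[sum(laguerre(i, k, alpha) * laguerre(j, k, alpha)
                 for k in range(n_terms))
             for j in range(order + 1)]
            for i in range(order + 1)]

# Gram matrix of the first four functions; it should be close to the identity.
G = gram(order=3, alpha=0.5)
```

For values of α close to 1 the functions decay slowly, so a longer truncation would be needed for the same accuracy.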

Widely used in system identification due to their ability to accurately approximate the transfer functions of linear systems [17], Laguerre functions may be viewed as the impulse responses of a family of linear systems parameterized in the Laguerre parameter α, 0 < α < 1, that determines their exponential decay behavior [12]. The discrete Laguerre functions are given in Z-domain by

$$\Phi_j(z) = \frac{\sqrt{1-\alpha}}{z - \sqrt{\alpha}} \left(S(z)\right)^j, \tag{1}$$

where j = 0, 1, 2, . . . is the Laguerre order and

$$S(z) = \frac{1 - \sqrt{\alpha}\, z}{z - \sqrt{\alpha}}$$

is the discrete Laguerre shift operator. By inspection, the recurrence

$$\Phi_{j+1}(z) = \Phi_j(z) S(z) \tag{2}$$

with the initialization

$$\Phi_0(z) = \frac{\sqrt{1-\alpha}}{z - \sqrt{\alpha}}$$

is obtained.

Given the Laguerre functions in time-domain, $\phi_j(k)$, the impulse response of a stable discrete-time linear time-invariant system can be written as

$$h(k) = \sum_{j=0}^{\infty} c_j \phi_j(k).$$

The response $y_\ell(k)$ of the system to input u(k), where u(k) = 0, k < 0, is then

$$y_\ell(k) = (h * u)(k) = \sum_{j=0}^{\infty} c_j (\phi_j * u)(k),$$

where ∗ denotes the discrete convolution operator and k = 0, 1, . . . is the discrete time-variable. Introduce the Laguerre filter outputs

$$\psi_j(k) \triangleq (\phi_j * u)(k) = \sum_{l=0}^{k} \phi_j(l) u(k-l), \tag{3}$$

whose Z-transform is

$$\Psi_j(z) = \Phi_j(z) U(z). \tag{4}$$

Combining (2) and (4) results in

$$\Psi_j(z) = \Phi_{j-1}(z) S(z) U(z),$$

that can be rewritten recursively as

$$\Psi_{j+1}(z) = \Psi_j(z) S(z), \tag{5}$$

with the initialization

$$\Psi_0(z) = \frac{\sqrt{1-\alpha}}{z - \sqrt{\alpha}} U(z). \tag{6}$$

Equations (5) and (6) can be inverse-transformed to give the two-variable recursion

$$\psi_{j+1}(k+1) = \psi_j(k) + \sqrt{\alpha}\left(\psi_{j+1}(k) - \psi_j(k+1)\right)$$

initialized with

$$\psi_0(k+1) = \sqrt{\alpha}\,\psi_0(k) + \sqrt{1-\alpha}\,u(k).$$

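Carrying out the recursion in j reveals that the Laguerre filter outputs satisfy the state-space equation

$$\psi(k+1) = F\psi(k) + G u(k), \tag{7}$$

where $\psi(k) = [\psi_0(k)\ \psi_1(k)\ \psi_2(k)\ \ldots]^T$ and

$$F = \begin{bmatrix} \sqrt{\alpha} & 0 & 0 & 0 & \cdots \\ 1-\alpha & \sqrt{\alpha} & 0 & 0 & \cdots \\ -\sqrt{\alpha}(1-\alpha) & 1-\alpha & \sqrt{\alpha} & 0 & \cdots \\ (-\sqrt{\alpha})^2(1-\alpha) & -\sqrt{\alpha}(1-\alpha) & 1-\alpha & \sqrt{\alpha} & \cdots \\ \vdots & \vdots & \vdots & \vdots & \ddots \end{bmatrix}, \qquad G = \sqrt{1-\alpha} \begin{bmatrix} 1 & -\sqrt{\alpha} & (-\sqrt{\alpha})^2 & (-\sqrt{\alpha})^3 & \ldots \end{bmatrix}^T.$$

The alternating signs follow from the term $-\sqrt{\alpha}\,\psi_j(k+1)$ in the two-variable recursion. The eigenvalues of the lower-triangular Toeplitz matrix F are all equal to $\sqrt{\alpha}$ and the system is thus asymptotically stable for all admissible values of α.

To compute ψ(k), the initial conditions of the filters $\phi_j(k)$ have to be specified. The initial values may be set to zero, which is a logical choice if the system is at rest at the beginning of data collection. Alternatively, one can estimate the initial conditions by including them as unknown parameters in the model. However, for large data sets, the choice of initial conditions is insignificant as the model quality will be almost completely defined by its stationary performance.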
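The Laguerre filter bank can be sketched in two equivalent ways: by iterating the state-space equation (7) or by running the two-variable recursion directly. The code below is a minimal illustration with zero initial conditions; the entries of F and G are obtained by unrolling the two-variable recursion in j, and the function names, the input sequence and α = 0.4 are assumptions made for the example.

```python
from math import sqrt

def laguerre_state_space(order, alpha):
    """Build F and G of psi(k+1) = F psi(k) + G u(k) for orders 0..order."""
    s = sqrt(alpha)
    n = order + 1
    F = [[0.0] * n for _ in range(n)]
    for i in range(n):
        F[i][i] = s
        for j in range(i):
            # Entries below the diagonal, from unrolling the recursion in j.
            F[i][j] = (1 - alpha) * (-s) ** (i - j - 1)
    G = [sqrt(1 - alpha) * (-s) ** i for i in range(n)]
    return F, G

def filter_state_space(u, order, alpha):
    """Laguerre filter outputs psi[k][j], starting from rest (psi(0) = 0)."""
    F, G = laguerre_state_space(order, alpha)
    n = order + 1
    psi = [[0.0] * n]
    for k in range(len(u) - 1):
        prev = psi[-1]
        psi.append([sum(F[i][j] * prev[j] for j in range(n)) + G[i] * u[k]
                    for i in range(n)])
    return psi

def filter_recursion(u, order, alpha):
    """Same outputs computed from the two-variable recursion in j and k."""
    s = sqrt(alpha)
    n = order + 1
    psi = [[0.0] * n for _ in range(len(u))]
    for k in range(len(u) - 1):
        psi[k + 1][0] = s * psi[k][0] + sqrt(1 - alpha) * u[k]
        for j in range(n - 1):
            psi[k + 1][j + 1] = psi[k][j] + s * (psi[k][j + 1] - psi[k + 1][j])
    return psi

u = [1.0, 0.5, -0.25, 0.0, 2.0, -1.0, 0.3, 0.0]   # arbitrary test input
psi_ss = filter_state_space(u, order=3, alpha=0.4)
psi_rec = filter_recursion(u, order=3, alpha=0.4)
max_diff = max(abs(a - b)
               for row_a, row_b in zip(psi_ss, psi_rec)
               for a, b in zip(row_a, row_b))
```

Both routes cost a fixed number of operations per sample and filter order, which avoids evaluating the convolutions in (3) explicitly.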

2.3 The Volterra-Laguerre Model

Assuming the SPS to be composed of two independent dynamical systems in parallel, one governing the horizontal eye movements and one the vertical movements, each system can be modeled in terms of a Volterra-Laguerre model. A discrete time-invariant system with input $u(k) \in \mathbb{R}$ (visual stimulus) and output $y(k) \in \mathbb{R}$ (gaze direction), k = 0, . . . , K − 1, can be approximated by the Volterra series

$$y(k) = y_0 + \sum_{n=1}^{N} H_n u(k) + e(k), \tag{8}$$

where $e(k) \in \mathbb{R}$ is a noise term, $N \in \mathbb{N}$ is the chosen Volterra order and

$$H_n u(k) = \sum_{i_1=0}^{\infty} \cdots \sum_{i_n=0}^{\infty} h_n(i_1, \ldots, i_n)\, u(k-i_1) \cdots u(k-i_n) \tag{9}$$

are the Volterra functionals. The functions $h_n$ are called the Volterra kernels.

In system identification, an appropriate Volterra representation is found by, e.g., minimizing the sum of squared errors. In general, this minimization requires the solution of a simultaneous set of series equations which in most practical cases is difficult or even impossible to obtain. Instead, the kernels can be expanded in terms of orthogonal basis functions, e.g., the discrete Laguerre functions. Using the Laguerre functions, the Volterra kernels may be expanded as

$$h_n(i_1, \ldots, i_n) = \sum_{j_1=0}^{L} \cdots \sum_{j_n=0}^{L} \gamma_n(j_1, \ldots, j_n)\, \phi_{j_1}(i_1) \cdots \phi_{j_n}(i_n), \tag{10}$$

where L is the chosen Laguerre order of the expansion. Substituting this into (9) yields

$$H_n u(k) = \sum_{j_1=0}^{L} \cdots \sum_{j_n=0}^{L} \gamma_n(j_1, \ldots, j_n)\, \psi_{j_1}(k) \cdots \psi_{j_n}(k), \tag{11}$$

where the sequences $\psi_j(k)$ are given in (3). The number of coefficients in (11) is

$$N_c = \sum_{n=1}^{N} (L+1)^n = (L+1)\frac{(L+1)^N - 1}{L}.$$

However, due to the commutativity of multiplication, $\gamma_n$ is symmetric with respect to its indices. Thus, the number of nonzero coefficients can be reduced by writing (11) in terms of the new coefficients $\bar{\gamma}_n$ as

$$H_n u(k) = \sum_{j_1=0}^{L} \cdots \sum_{j_n=j_{n-1}}^{L} \bar{\gamma}_n(j_1, \ldots, j_n)\, \psi_{j_1}(k) \cdots \psi_{j_n}(k). \tag{12}$$

The number of model parameters increases drastically with increased Volterra order. In this paper, second-order models, i.e., with N = 2 in (8), produce sufficiently accurate results. Then, the number of nonzero coefficients in (11) is $N_c = L^2 + 3L + 2$ but is reduced to $\frac{L^2}{2} + \frac{5L}{2} + 2$ in (12). Note that

$$\bar{\gamma}_2(j_1, j_2) = \begin{cases} \gamma_2(j_1, j_2) & j_1 = j_2 \\ 2\gamma_2(j_1, j_2) & j_1 < j_2 \\ 0 & \text{otherwise} \end{cases}.$$
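The two coefficient counts can be reproduced by simply enumerating index tuples, which gives a quick sanity check of the closed-form expressions. The sketch below is illustrative only; the function names are not taken from the paper.

```python
from itertools import combinations_with_replacement, product

def count_full(L, N):
    """Coefficients in (11): all index tuples (j1, ..., jn) for n = 1..N."""
    return sum(len(list(product(range(L + 1), repeat=n)))
               for n in range(1, N + 1))

def count_symmetric(L, N):
    """Coefficients in (12): only nondecreasing index tuples are kept."""
    return sum(len(list(combinations_with_replacement(range(L + 1), n)))
               for n in range(1, N + 1))

# Second-order models (N = 2) for the Laguerre orders L = 1, ..., 4.
counts = {L: (count_full(L, 2), count_symmetric(L, 2)) for L in (1, 2, 3, 4)}
```

For L = 3, for example, the enumeration gives 20 coefficients in the full form and 14 in the symmetric form, in agreement with $L^2 + 3L + 2$ and $\frac{L^2}{2} + \frac{5L}{2} + 2$.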

In the case of a general orthogonal basis, the model with functionals (12) cannot be implemented directly because of the infinite upper limits in the convolutions. The convolutions must generally be truncated, but in the case of Laguerre functions, the filter responses $\psi_j(k)$ can be computed using (7).

Taking into account all of the above, the final Volterra-Laguerre model used in this paper is

$$y(k) = y_0 + \sum_{j_1=0}^{L} \bar{\gamma}_1(j_1)\psi_{j_1}(k) + \sum_{j_1=0}^{L} \sum_{j_2=j_1}^{L} \bar{\gamma}_2(j_1, j_2)\psi_{j_1}(k)\psi_{j_2}(k) + e(k), \tag{13}$$

where $\bar{\gamma}_n$ denote the symmetry-reduced coefficients of (12). Given measurements of y(k) at time instances k = 0, . . . , K − 1, a system of equations for the unknown parameters $c = [y_0\ \bar{\gamma}_1(0) \ldots \bar{\gamma}_1(L)\ \bar{\gamma}_2(0,0) \ldots \bar{\gamma}_2(L,L)]^T$ (length $N_c + 1$) and the noise terms $e = [e(0) \ldots e(K-1)]^T$ can be formulated and is given by

$$y = [A\ \ I] \begin{bmatrix} c \\ e \end{bmatrix} = B\beta, \tag{14}$$

where $A \in \mathbb{R}^{K \times (N_c+1)}$ is the coefficient matrix constructed from (13) and I is the identity matrix of order K.

It will be of interest to calculate the $\ell_2$-norm of the estimated Volterra kernels. The $\ell_2$-norm of a discrete function $f(i_1, \ldots, i_n)$, denoted by $\|f\|_2$, is defined as

$$\|f\|_2 = \sqrt{\sum_{i_1=0}^{\infty} \cdots \sum_{i_n=0}^{\infty} |f(i_1, \ldots, i_n)|^2}. \tag{15}$$

Due to the orthonormality of the Laguerre functions, the $\ell_2$-norm of a kernel can be evaluated via its expansion (10), to

$$\|h_n\|_2 = \sqrt{\sum_{j_1=0}^{\infty} \cdots \sum_{j_n=0}^{\infty} |\gamma_n(j_1, \ldots, j_n)|^2}.$$
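To make the construction of (14) and the kernel norm concrete, the sketch below builds one row of the coefficient matrix A from the Laguerre filter outputs at a single time instant, and evaluates the norm of a second-order kernel from symmetry-reduced coefficients. It assumes the relation stated above between the reduced and the full coefficients (off-diagonal reduced coefficients are twice the full ones); the function names and numbers are hypothetical.

```python
from math import sqrt

def regressor_row(psi_k):
    """Row of A in (14) for one time instant: [1, psi_j, psi_j1 * psi_j2 with j2 >= j1]."""
    n = len(psi_k)
    row = [1.0] + list(psi_k)
    row += [psi_k[j1] * psi_k[j2] for j1 in range(n) for j2 in range(j1, n)]
    return row

def second_order_kernel_norm(gamma2_reduced):
    """l2-norm of h2 from reduced coefficients keyed by (j1, j2) with j1 <= j2."""
    total = 0.0
    for (j1, j2), g in gamma2_reduced.items():
        if j1 == j2:
            total += g ** 2
        else:
            # Full kernel has gamma2(j1, j2) = gamma2(j2, j1) = g / 2, counted twice.
            total += 2 * (g / 2) ** 2
    return sqrt(total)

# Filter outputs for L = 2 at one time instant (made-up values): 1 + 3 + 6 = 10 regressors.
row = regressor_row([0.5, -1.0, 2.0])

# Made-up reduced coefficients for L = 1.
norm = second_order_kernel_norm({(0, 0): 3.0, (0, 1): 4.0, (1, 1): 0.0})
```

Stacking such rows for k = 0, . . . , K − 1 gives A, after which any linear estimator can be applied to the resulting regression problem.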

2.4 The Wiener Model

In [10], the SPS is parametrically modeled by two parallel SISO Wiener systems, for horizontal and vertical movements separately. In each dimension, the model is given by

$$\begin{cases} (1 + a_1 q^{-1} + a_2 q^{-2} + a_3 q^{-3})\, x(k) = q^{-4} u(k), \\ y(k) = f(x(k)) + e(k), \end{cases} \tag{16}$$

where y(k) is the gaze direction at time $kT_s$, u(k) is the visual stimulus, x(k) is the output of the linear block, e(k) is zero-mean white Gaussian noise with variance $\sigma_0^2$, q is the time-shift operator, $\theta = [a_1, a_2, a_3]^T$ is the parameter vector of the model, f is a static nonlinearity, and $T_s$ is the sampling time. The static nonlinearity is parameterized in [10] as a continuous piece-wise linear function with 6 unknown parameters to achieve satisfying results.
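A noise-free simulation of the Wiener structure in (16) can be sketched as follows. The linear block is the third-order autoregression with a four-sample input delay, followed by a continuous piecewise-linear map. How the 6 parameters of [10] define the nonlinearity is not stated here, so the breakpoint/value parameterization, the saturation outside the breakpoint range, and all numerical values are assumptions made for illustration.

```python
def simulate_wiener(u, a, f):
    """Noise-free output of (16): x(k) = -a1 x(k-1) - a2 x(k-2) - a3 x(k-3) + u(k-4)."""
    x = []
    for k in range(len(u)):
        xk = u[k - 4] if k >= 4 else 0.0
        for i, ai in enumerate(a, start=1):
            if k - i >= 0:
                xk -= ai * x[k - i]
        x.append(xk)
    return [f(v) for v in x]

def piecewise_linear(breakpoints, values):
    """Continuous piecewise-linear map through the points (breakpoints[i], values[i])."""
    def f(x):
        if x <= breakpoints[0]:
            return values[0]            # assumed saturation below the range
        points = list(zip(breakpoints, values))
        for (x0, y0), (x1, y1) in zip(points, points[1:]):
            if x <= x1:
                return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
        return values[-1]               # assumed saturation above the range
    return f

f = piecewise_linear([-2.0, 0.0, 2.0], [-1.0, 0.0, 1.5])   # hypothetical shape
u = [1.0] + [0.0] * 9                                        # unit impulse
y = simulate_wiener(u, a=[-0.5, 0.0, 0.0], f=f)              # hypothetical a1, a2, a3
```

With these made-up values the impulse reaches the output after four samples and then decays geometrically, scaled by the slope of the nonlinearity on the positive side.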

2.5 Identification

2.5.1 The Volterra-Laguerre Model

System identification of (13) is carried out by finding the $N_c + 1$ unknown parameters c in (14) through minimization of an appropriate criterion. Ordinary least squares (LS) is a straightforward choice, but as the Laguerre order L grows, the number of parameters becomes large, and the variance in the estimates increases. It is therefore of interest to reduce the number of parameters in the model. This motivates the use of a sparse estimation method, yielding parameter vectors with only few nonzero elements at the cost of reduced model accuracy. Simply neglecting the parameters with the lowest modulus is suboptimal since the regressors in (13) are not orthogonal. Hence, more rigorous sparse estimation methods must be utilized.

The most popular sparse estimation method is perhaps the Least Absolute Shrinkage and Selection Operator (LASSO), which constrains the $\ell_1$-norm of the parameter vector to be no greater than a given value [16]. However, the selection of the $\ell_1$ threshold in LASSO is usually a daunting task. Herein, the SParse Iterative Covariance-based Estimation (SPICE) method is used, which does not suffer from this drawback [15].

In SPICE, it is assumed that the elements of β in (14) are uncorrelated random variables with zero means and variances $\{p_i\}_{i=1}^{N_c+1}$ for $\{c_i\}_{i=1}^{N_c+1}$ and $\{\sigma_i\}_{i=1}^{K}$ for $\{e_i\}_{i=1}^{K}$, where $x_p$ denotes the p:th element of vector x. The covariance matrix of y is then

$$R = E(yy^T) = BPB^T,$$

where $P = \mathrm{diag}([p_1 \ldots p_{N_c+1}\ \sigma_1 \ldots \sigma_K]^T)$. SPICE minimizes the following weighted covariance fitting criterion

$$\|R^{-1/2}(R - yy^T)\|_F^2,$$

where $\|\cdot\|_F$ denotes the Frobenius norm and $R^{-1/2}$ is the Hermitian square root of the inverse matrix $R^{-1}$. In the case of real-valued data, the SPICE method can be reduced to solving a linear program (LP) [15]. Introduce the weights

$$w_k = \frac{\|b_k\|_2}{\|y\|_2},$$

where $b_k$ are the columns of B and $\|\cdot\|_2$ denotes the $\ell_2$-norm. The LP to be solved is then

$$\min_{\alpha_i, \beta_i} \sum_{i=1}^{N_c+1+K} w_i \alpha_i \quad \text{s.t.} \quad -\alpha_i \le \beta_i \le \alpha_i,\ \ \alpha_i \ge 0,\ \ i = 1, \ldots, N_c+1+K, \quad y = B\beta,$$

where $\{\alpha_i\}$ are auxiliary variables.
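As a reading aid, and not as part of the original derivation, the LP can be rewritten in a form that makes its sparsity-promoting character explicit. For a fixed β, the constraints force each auxiliary variable to satisfy $\alpha_i \ge |\beta_i|$, and since the weights are nonnegative the minimum is attained at $\alpha_i = |\beta_i|$. Using $B = [A\ \ I]$ and $\beta = [c^T\ e^T]^T$ from (14), and noting that the columns of I have unit norm, this gives the following equivalent problems (the symbol $a_i$ for the i:th column of A is introduced here for illustration):

```latex
\begin{align*}
\min_{\beta}\ \sum_{i=1}^{N_c+1+K} w_i\,|\beta_i|
  \quad \text{s.t.}\quad y = B\beta
\qquad\Longleftrightarrow\qquad
\min_{c}\ \ \|y - Ac\|_1 \;+\; \sum_{i=1}^{N_c+1} \|a_i\|_2\,|c_i| ,
\end{align*}
```

where the second form follows by eliminating $e = y - Ac$ and multiplying the cost by the constant $\|y\|_2$. The criterion thus combines an $\ell_1$ data-fit term with a weighted $\ell_1$ penalty on the parameters, with the weights fixed by the data rather than chosen by the user.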

2.5.2 The Wiener Model

Wiener model (16) is estimated using a MATLAB toolbox for identification of Wiener systems [19]. It requires the user to specify the structure of the linear block and has specifically designed recursive algorithms for identification of Wiener systems with continuous piece-wise linear static nonlinearities.

3 Population Partitioning

An important application of SPS modeling lies in using the obtained mathematical models from different test subjects to partition the subjects into groups of different properties or features. A convenient way to approach this problem is through GMMs [3]. A GMM is a probabilistic model for identifying subpopulations within an overall population. The mixture model framework is extensive and adapted in different ways to a number of problems in various fields, see e.g. [3, 6, 13] for details. In this study, it is of interest to distinguish healthy controls from Parkinson patients by studying the parameters of the estimated models. A mixture model with two components, healthy and Parkinson, is thus assumed and the posterior probability of each parameter vector to belong to a certain component is evaluated. In this way, eye movement data can be used to make inference about whether the SPS in the considered test subject is deteriorated by Parkinson’s disease or not.

To estimate a GMM, initial component parameters must be chosen. One way is to specify the assumed component affiliation for some of the collected observations. This requires prior knowledge about which group a subset of the subjects in the experiment belongs to, which is usually the case in medical studies where there is a known control group of healthy test subjects. The GMM is thus trained using the data of the healthy control group.
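The posterior evaluation described above can be illustrated with a deliberately simplified sketch: a two-component mixture over a single scalar feature with known component parameters. The study itself works with parameter vectors and estimated mixture parameters, so the one-dimensional setting, the function names and every number below are assumptions made for the example.

```python
from math import exp, pi, sqrt

def gaussian_pdf(x, mean, std):
    """Density of a univariate normal distribution."""
    return exp(-0.5 * ((x - mean) / std) ** 2) / (std * sqrt(2 * pi))

def posterior(x, weights, means, stds):
    """Posterior probability of each mixture component given the observation x."""
    joint = [w * gaussian_pdf(x, m, s) for w, m, s in zip(weights, means, stds)]
    total = sum(joint)
    return [j / total for j in joint]

# Hypothetical scalar feature; component order is (healthy, Parkinson).
weights = [0.5, 0.5]
means = [1.0, 3.0]
stds = [0.5, 0.5]

p_mid = posterior(2.0, weights, means, stds)       # halfway between the two means
p_healthy = posterior(1.0, weights, means, stds)   # at the healthy component mean
```

An observation halfway between the component means is assigned equal posterior probabilities, whereas one at the healthy mean is attributed almost entirely to the healthy component.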

4 Experiment

Gaze direction data of test subjects attempting to track the moving circle on a computer monitor were recorded using a video-based eye tracker from Smart Eye AB, Sweden. Test subjects were placed 50 cm from the monitor with the monitor center at eye height. The eye tracker output is the distance in centimeters (horizontal and vertical components separately) between the monitor center and the point where the gaze direction line intersects the plane of the monitor. Eye-tracking data were sampled at a sampling frequency of fs = 60 Hz.

Input signals of length T = 26 s, N = 1560 samples, were generated using the method mentioned in Sec. 2.1.

The conducted experiment involved three healthy controls (H1–H3) and five test subjects diagnosed with Parkinson’s disease (P1–P5). Ten data sets were obtained from each healthy control and six from each Parkinson patient.

5 Results

The Volterra-Laguerre model suggested in this paper (13) and the Wienermodel of previous research (16) were identified for all acquired data sets usingboth SPICE and LS. Model (13) was estimated for four different values of theLaguerre order L. In total for each estimation method and Laguerre order,60 models were identified; 10 models per healthy control and 6 models perParkinson patient. The choice of the Laguerre parameter α in (1) was madeper data set to minimize the residual sum of squares (RSS) through griddingthe parameter in 200 steps.Fig. 1 and Fig. 2 show the Laguerre coefficients of the second order Volterrakernel, γ2(j1, j2) in (11), for models of H3 and P4 respectively estimated usingboth SPICE and LS with Laguerre order L = 3. The figures demonstrate theeffects of sparse estimation.The models were validated per test subject using the remaining data sets notused for identification. Typical results are shown in Tab. 1 and Tab. 2 wherethe mean RSSs when validating models associated with H1, P1 and H3, P4are shown. Models estimated from other data sets showed similar values. Thenumber in brackets after each mean RSS value gives the number of nonzeroparameters in the corresponding model. Here it should be noted that whenusing SPICE, parameters with little impact on the model behavior are forcedto zero and that for different data sets, the model parameters forced to zero arenot necessarily the same. However, estimation results showed that the nonzero-parameter indices did not vary much for models of the same test subject. Forthe few data sets that yielded a deviating number of nonzero parameters, themodels were simply re-identified using LS under the assumption that all param-eters were zero except for the nonzero-parameter indices found for the otherdata sets.


Figure 1: The Laguerre coefficients of the second order Volterra kernel of H3, $\gamma_2(j_1, j_2)$ in (11), estimated using a) LS (RSS: 2.21), b) SPICE (RSS: 2.64), with Laguerre order L = 3.

Figure 2: The Laguerre coefficients of the second order Volterra kernel of P4, $\gamma_2(j_1, j_2)$ in (11), estimated using a) LS (RSS: 8.05), b) SPICE (RSS: 9.24), with Laguerre order L = 3.

There are some noteworthy results in Tab. 1 and Tab. 2. First of all, the Volterra-Laguerre models performed better than the Wiener models, despite the lower number of parameters.

Secondly, SPICE seems to be preferable over LS as the results of the former were similar to the latter, but achieved with significantly fewer parameters. In fact, with Laguerre order L = 3 and L = 4, using LS sometimes reduced model accuracy, likely due to overparametrization.

Furthermore, the model accuracy in healthy controls did not improve when the Laguerre order was increased beyond L = 2 in Tab. 1 and L = 3 in Tab. 2; this was also the case in other subjects. For the Parkinson subjects, a Laguerre order of L = 3 or higher seemed to produce the best results.

Finally, the RSS of models pertaining to the Parkinson subjects was substantially higher than for the healthy controls, even in the age-matched pair shown in Tab. 2.

Tab. 3 shows the average $\ell_2$-norm, as defined in (15), of the first and second order Volterra kernels, $h_1$ and $h_2$ in (10), of the estimated models for different test subjects. It is evident that the second-order kernel norms were larger in Parkinson patients than in healthy controls, but that there was no significant difference in the norm of the first-order kernels. This indicates that the dynamics of the SPS are essentially linear in healthy subjects, whereas in Parkinson subjects the SPS exhibits significantly stronger nonlinear dynamics.

| L | H1, SPICE | H1, LS | P1, SPICE | P1, LS |
|---|-----------|--------|-----------|--------|
| 1 | 1.08 (2) | 0.98 (6) | 36.4 (3) | 36.8 (6) |
| 2 | 0.66 (3) | 0.56 (10) | 15.0 (4) | 14.4 (10) |
| 3 | 0.66 (3) | 0.67 (15) | 5.6 (7) | 11.4 (15) |
| 4 | 0.66 (3) | 0.69 (21) | 3.9 (8) | 12.0 (21) |
| 5 | 0.66 (3) | 0.70 (28) | 3.9 (8) | 15.2 (28) |

Wiener model: H1 6.31 (9), P1 46.8 (9).

Table 1: The mean RSS when validating models estimated from one data set of H1 and one of P1. The number of parameters in each model is written within brackets after each mean RSS value.
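The LS parameter counts in the brackets (6, 10, 15, 21, 28 for L = 1, ..., 5) are consistent with counting the distinct monomials of degree at most two in the L + 1 Laguerre filter outputs, i.e. a constant, L + 1 linear terms and the symmetric second-order terms. The short check below is a sketch under that assumption; the function name is hypothetical.

```python
from math import comb

def n_vl_parameters(L, N=2):
    """Number of distinct monomials of degree <= N in the L + 1 Laguerre filter outputs."""
    return comb(L + 1 + N, N)

counts = [n_vl_parameters(L) for L in range(1, 6)]
print(counts)  # [6, 10, 15, 21, 28]
```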

| L | H3, SPICE | H3, LS | P4, SPICE | P4, LS |
|---|-----------|--------|-----------|--------|
| 1 | 3.81 (2) | 3.21 (6) | 32.1 (3) | 29.1 (6) |
| 2 | 3.26 (3) | 3.09 (10) | 12.0 (4) | 12.0 (10) |
| 3 | 2.01 (4) | 1.99 (15) | 5.7 (7) | 9.5 (15) |
| 4 | 2.01 (4) | 2.19 (21) | 4.2 (8) | 10.1 (21) |
| 5 | 2.01 (4) | 2.20 (28) | 4.2 (8) | 11.2 (28) |

Wiener model: H3 8.13 (9), P4 39.1 (9).

Table 2: The mean RSS when validating models estimated from one data set of H3 and one of P4. The number of parameters in each model is written within brackets after each mean RSS value.

The parameter vectors of the models were used to estimate GMMs to identify two subpopulations, namely healthy subjects and Parkinson subjects. The best results were achieved when the Laguerre order was L = 3. In this case, the number of model parameters when using LS is 15, whereas the maximum number of nonzero parameters in any of the models identified using SPICE was 7.


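| Subject | $\|h_1\|_2$ | $\|h_2\|_2$ ($\cdot 10^{-3}$) |
|---------|-------------|-------------------------------|
| H1 | 0.22 | 1.8 |
| H2 | 0.29 | 2.2 |
| H3 | 0.31 | 0.5 |
| P1 | 0.49 | 13.1 |
| P2 | 0.26 | 13.2 |
| P3 | 0.31 | 9.6 |
| P4 | 0.27 | 15.0 |
| P5 | 0.21 | 10.5 |

Table 3: The average $\ell_2$-norms of the first and second order Volterra kernels of the estimated models for different test subjects.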
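The group-level contrast in Tab. 3 can be made explicit with a few lines of arithmetic on the tabulated values. The snippet only restates the table; the helper name `group_mean` is introduced here for illustration.

```python
# Kernel norms copied from Tab. 3 (second-order norms in units of 1e-3).
h1_norm = {"H1": 0.22, "H2": 0.29, "H3": 0.31,
           "P1": 0.49, "P2": 0.26, "P3": 0.31, "P4": 0.27, "P5": 0.21}
h2_norm = {"H1": 1.8, "H2": 2.2, "H3": 0.5,
           "P1": 13.1, "P2": 13.2, "P3": 9.6, "P4": 15.0, "P5": 10.5}

def group_mean(table, prefix):
    """Mean over subjects whose label starts with 'H' (healthy) or 'P' (Parkinson)."""
    values = [v for name, v in table.items() if name.startswith(prefix)]
    return sum(values) / len(values)

mean_h1 = {g: group_mean(h1_norm, g) for g in "HP"}
mean_h2 = {g: group_mean(h2_norm, g) for g in "HP"}
print(mean_h1)
print(mean_h2)
```

The first-order means of the two groups are close to each other, whereas every Parkinson subject has a larger second-order norm than every healthy control.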

As was indicated in Tab. 3, the difference between healthy controls and Parkinson subjects seems to lie in the second-order Volterra kernel. For this reason, for the SPICE-identified models, GMMs were fitted both for the parameters describing the first order kernel (the linear part of the dynamics), and for those describing the second order kernel (the nonlinear part of the dynamics). For the LS-identified models only the second-order kernel was considered.

Because LS yields parameter vectors of full dimension, attempts at fitting GMMs to the data failed due to ill-conditioned covariance matrix estimates. This was overcome by regularization of the covariance matrix. Fig. 3 presents the posterior probabilities of each mixture model. Blue markers represent healthy controls and red markers represent Parkinson patients. From Fig. 3a it is apparent that it was difficult to distinguish the two groups from each other by only studying the linear dynamics. Parkinson and healthy subjects alike were assigned to both groups. However, Fig. 3b and Fig. 3c show that when considering nonlinear dynamics, the differences were far more apparent. With the prior information that the control subjects belong to the healthy component, the partitioning was near perfect with only one false affiliation. Even without any presumptions regarding the affiliation of the parameter vectors, the groups were well separated with few errors. The mixture model was not successful in correctly partitioning the populations based on parameters estimated using LS, as can be seen in Fig. 3d.
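The covariance problem mentioned above can be illustrated with synthetic numbers. The sketch assumes a ridge-type regularization (adding a scaled identity matrix); the text does not state which regularization was used, and the dimensions, the value of `lam` and the helper `gaussian_logpdf` are assumptions. Fewer vectors than dimensions are used to force an exactly singular estimate.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, n_sets = 15, 10                        # 15 parameters, only 10 parameter vectors
params = rng.normal(size=(n_sets, dim))     # synthetic stand-in for LS parameter vectors

cov = np.cov(params, rowvar=False)          # sample covariance, rank at most n_sets - 1
rank = np.linalg.matrix_rank(cov)

lam = 1e-2                                  # ridge weight (assumed value)
cov_reg = cov + lam * np.eye(dim)           # regularized, positive definite estimate
cond_reg = np.linalg.cond(cov_reg)

def gaussian_logpdf(x, mean, cov):
    """Log-density of a multivariate normal, as needed in the E-step of a GMM."""
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (maha + logdet + len(x) * np.log(2 * np.pi))

logp = gaussian_logpdf(params[0], params.mean(axis=0), cov_reg)
print(rank, cond_reg, logp)
```

The regularized matrix can be inverted, but the added term biases the covariance estimate, which is the trade-off discussed in Sec. 6.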

6 Discussion and Conclusions

A new way of modeling the SPS in humans by means of Volterra series expansion is suggested in this paper and used together with GMMs to successfully distinguish between healthy controls and Parkinson patients based on their eye movements. The problem of high-dimensional parameter spaces is overcome by Laguerre expansion of the Volterra kernels in combination with the use of

Figure 3 (panels a) to d); horizontal axis: components C1 and C2; vertical axis: dataset index, 0 to 70): The posterior probabilities of the parameter vector of each data set to belong to either component 1 (C1) or component 2 (C2) when studying the parameters of a) the first order kernel (estimated with SPICE), b) the second order kernel (estimated with SPICE), c) the second order kernel (estimated with SPICE) with the initial assumption that the parameter vectors of the healthy controls are in C1, d) the second order kernel estimated with LS. Markers above the C1 label are in the first component with probability 1 and in the second component with probability 0. The opposite is true for markers above the C2 label.

sparse methods for Laguerre coefficient estimation. Based on the results of this work, the SPS is well-suited for Volterra-Laguerre modeling. It is shown that the Volterra kernels of the obtained models are sparse in the Laguerre domain and that the performance of the models is preserved even when the dimension of the parameter space is heavily reduced through SPICE. Moreover, the models of this paper are shown to yield better results than models of previous research, even at higher parsimony. Finally, it is shown that the model reduction provided by SPICE is not only helpful, but also necessary in order to successfully distinguish the models of healthy controls from those of Parkinson patients by means of GMMs.

Another insight from this study is the fact that the SPS in the test subjects with Parkinson’s disease exhibited stronger nonlinear behavior than that in the healthy controls. In fact, it seems that the discrepancies in the SPS between healthy and Parkinson subjects are far more apparent when studying the nonlinear compared to the linear dynamics. It should be noted here that though the Laguerre functions have been shown to accurately approximate the impulse response of well-damped systems with dominant first-order dynamics [17], they may perform worse for oscillating dynamics. Thus if the underlying linear block has complex poles, the Volterra-Laguerre model may compensate for the linear-block model-mismatch by altering the nonlinear part. However, the authors are unaware of any literature suggesting that the SPS dynamics in Parkinson’s disease oscillate more than healthy dynamics, nor has such behavior been observed in experimental data.

Tab. 1 and Tab. 2 show that the considered Volterra-Laguerre models generally


performed better than the Wiener models considered in previous research, even with significantly fewer parameters. This may suggest that the nonlinearity is in fact dynamical in nature, and that the static nonlinearities of Wiener models merely approximate the true nonlinear behavior. Another reason for the relatively poor performance of the Wiener models may be the restrictive assumptions imposed in terms of model structure of both the linear and the nonlinear parts. The Volterra-Laguerre models assume a more general structure which may be beneficial for their relative performance.

The fact that the RSS of models pertaining to Parkinson subjects was substantially higher than for healthy controls indicates that the SPS dynamics are more complex in Parkinson’s disease than in health.

The results of Tab. 3 suggest that there are differences in the nonlinear parts of the SPS between healthy controls and Parkinson patients. This is also hinted at when comparing Fig. 1 to Fig. 2. This is an interesting discovery that should be more thoroughly investigated in order to further improve techniques aimed at using eye movements as a biometric and diagnostic tool.

Fig. 3 shows that healthy controls and Parkinson patients can be successfully distinguished using Volterra-Laguerre models estimated from eye movement data. However, the use of sparse estimation techniques for model identification is necessary to achieve sufficient population partitioning. In the case of LS-estimated models, regularization had to be carried out on the covariance matrix estimates in the mixture model estimation. Regularization adds bias, which in this case caused the population partitioning to fail. If more data were available, estimation of mixture models without bias-inducing regularization may be possible even for models estimated using LS. However, more data requires more experimental time, which further supports the use of sparse estimation methods, as it is often preferred in practice to keep the experiments as time-efficient as possible.

The results of Sec. 5 imply that there are indeed differences between the smooth pursuit eye movements of the considered patients with Parkinson’s disease and the healthy individuals. Although the number of participating test subjects in this study was small, and although no effort was made herein to explain the medical reason for the deviating parameters and gaze trajectories in Parkinson subjects, these results are still indicative of the potential of the presented method and encourage future research to further investigate the methods as tools for diagnosing or staging of Parkinson’s disease.

References

[1] R. Bednarik, T. Kinnunen, A. Mihaila, P. Fränti. Eye-Movements as a Biometric. Lecture Notes in Computer Science, 3540:780–789, 2005.

[2] K. H. Chon, N. H. Holstein-Rathlou, D. J. Marsh, V. Z. Marmarelis. Comparative nonlinear modeling of renal autoregulation in rats: Volterra approach versus artificial neural networks. IEEE Transactions on Neural Networks, Vol. 9, Issue 3:430–435, 1998.

[3] N. E. Day. Estimating the Components of a Mixture of Normal Distributions. Biometrika, Vol. 56, Issue 3:463–474, 1969.


[5] P. Kasprowski. Eye Movements in Biometrics. Lecture Notes in Computer Science, 3087:248–258, 2004.

[6] S. M. Khansari-Zadeh, A. Billard. Learning Stable Nonlinear Dynamical Systems With Gaussian Mixture Models. IEEE Transactions on Robotics, 27(5):943–957, 2011.

[7] S. Marino, E. Sessa, G. Di Lorenzo, P. Lanzafame, G. Scullica, A. Bramanti, F. La Rosa, G. Iannizzotto, P. Bramanti, P. Di Bella. Quantitative Analysis of Pursuit Ocular Movements in Parkinson’s Disease by Using a Video-Based Eye-Tracking System. European Neurology, 58:193–197, 2007.


[9] D. Jansson, A. Medvedev. Dynamic Smooth Pursuit Gain Estimation from Eye-Tracking Data. IEEE Conference on Decision and Control, Orlando, Florida, 2011.

[10] D. Jansson, A. Medvedev. Visual Stimulus Design in Parameter Estimation of the Human Smooth Pursuit System from Eye-Tracking Data. IEEE American Control Conference, Washington D.C., 2013.

[11] D. Jansson, O. Rosén, A. Medvedev. Non-parametric analysis of eye-tracking data by anomaly detection. IEEE European Control Conference, Zürich, Switzerland, 2013.


[13] D. A. Reynolds, T. F. Quatieri, R. B. Dunn. Speaker Verification Using Adapted Gaussian Mixture Models. Digital Signal Processing, Vol. 10, Issue 1–3:19–41, 2000.

[14] W. J. Rugh. Nonlinear System Theory: The Volterra-Wiener Approach. Johns Hopkins University Press, Baltimore, 1981.

[15] P. Stoica, P. Babu. SPICE and LIKES: Two hyperparameter-free methods for sparse-parameter estimation. Signal Processing, 92:1580–1590, 2012.

[16] R. Tibshirani. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society B, 58:267–288, 1996.

[17] B. Wahlberg. System Identification Using Laguerre Models. IEEE Transactions on Automatic Control, 36:551–562, 1991.

[18] N. Wiener. Nonlinear problems in random theory. Wiley, New York, 1958.

[19] T. Wigren. MATLAB Software for Recursive Identification of Wiener Systems. Systems and Control, Department of Information Technology, Uppsala University, 2007.


Paper III

Identification of Polynomial Wiener Systems via Volterra-Laguerre Series with Application to Smooth Pursuit System Characterization

Daniel Jansson, Alexander Medvedev
Department of Information Technology, Uppsala University


Abstract

This paper presents a novel approach to the identification of discrete polynomial Wiener systems in the absence of a priori information about the linear part. To capture the system structure, the identification is performed via a Volterra series model whose kernels are parameterized in terms of Laguerre functions. A property of the resulting Volterra-Laguerre (VL) model enables a straightforward estimation of the output polynomial coefficients. It is shown that, under model structure mismatch, the identified VL model maintains the polynomial Wiener structure, but with biased estimates of both the linear and nonlinear block. Explicit expressions for the bias are derived and exploited in the design of an iterative bias reduction technique that yields improved estimates of the linear block as well as of the nonlinearity and also estimates the model error magnitude. The proposed approach is shown to outperform standard Wiener system identification methods in terms of both model fit and computational burden. Furthermore, the bias reduction algorithm is proven to uniformly converge for models with cubic output polynomials, but also demonstrated in numerical experiments to be effective for higher-order polynomials. Finally, the utility of the proposed method is demonstrated by applying it to human smooth pursuit system (SPS) characterization, where experimental eye-tracking data is used in the identification. The resulting models of the SPS provide better prediction accuracy than the previously studied ones.



1 Introduction

Nonlinear system identification is one of the most challenging system-theoretical problems with ubiquitous applications. Numerous identification approaches have been developed for a plethora of system structures. Comprehensive reviews of nonlinear system models and identification algorithms are provided in e.g. [3], [22], [29].

A general black-box model for Single-Input-Single-Output (SISO) discrete time-invariant nonlinear systems is the discrete Volterra series [28], [36]. The Volterra series employ higher-order impulse responses, thus generalizing the convolution description of linear time-invariant (LTI) operators to nonlinear dynamics. A number of important applied and theoretical studies have examined the strengths and limitations of this approach, e.g. [4], [8], [20], [24]. One of the most commonly mentioned shortcomings of the Volterra models is the large number of parameters required for accurate modeling of even simple nonlinear systems. To partially alleviate the over-parametrization problem in Volterra models, the Volterra-Laguerre (VL) model, where the Volterra kernels are parameterized in terms of the orthonormal multivariate Laguerre functions [15], can be used. The identification problem is thus reduced to the estimation of the expansion coefficients. The VL model is a special case of the Wiener-Schetzen model [30].

A common class of nonlinear models are the Wiener systems, comprising an LTI block cascaded with a memoryless output nonlinearity. Wiener systems have been the topic of a vast number of studies where various parametric and non-parametric approaches for identification have been proposed and explored in detail (see e.g. [1], [7], [10], [25], [32], [34], [35], and [37]). Many of the methods for Wiener system identification require the nonlinearity to be known, differentiable, odd or invertible.
Such assumptions simplify the problem considerably. This paper considers a more general case of a polynomial nonlinearity, where no assumptions are made regarding its invertibility or parity. Polynomial Wiener systems can be seen as practical approximations of Wiener systems with more general nonlinear output mappings.

In Wiener system identification, there are two plausible and generally distinct goals. One is to obtain accurate estimates of each of the underlying constituting blocks, i.e. to approximate the coefficients in the output polynomial and those of the linear dynamics closely. An alternative goal is to find the Wiener model that yields the best data fit, i.e. that most accurately predicts the output from the input. It is herein shown that the two goals are not equivalent under model mismatch.

In this paper, a novel approach to the identification of polynomial Wiener systems is presented. The proposed method provides means for achieving both of the aforementioned objectives. It is semi-parametric and neither requires a priori information about the linear block nor relies on the output polynomial nonlinearity being invertible.

An advantage of the presented method is that it effectively splits a nonlinear optimization problem into two separate linear ones, thus significantly improving the computation time. Similar approaches have been proposed previously. In [5] and [21], identification methods based on the same type of over-parametrization are derived. A limitation of the methods presented in [5] and [21] is the requirement of the linear subsystem to have a finite impulse response (FIR) structure, resulting in a significant estimation bias and performance loss if the underlying system has a long-tail infinite impulse response (IIR). It is not possible to extend the methodology in [5] and [21] to incorporate IIR structures, as it would require observations of the linear block output that is an internal variable and generally not available for measurement.

Here, the above problem is circumvented by replacing the FIR representation with a Laguerre basis expansion via the VL approach. The Laguerre functions have been shown to accurately approximate the impulse response of well-damped systems with dominant first-order dynamics [30], [33] and will generally yield a better fit than FIR models, particularly if the underlying system is IIR. Expressing the linear block in a truncated Laguerre basis rather than assuming a FIR structure may thus reduce model mismatch and thereby improve model quality.

In [30], the linear dynamics are represented in terms of the Generalized Orthonormal Basis Functions (GOBFs) [12]. The use of the GOBF basis further improves model fit compared to Laguerre functions, particularly in cases of oscillating dynamics, but also increases model complexity and computational burden of identification. Provided that the orders of the linear and nonlinear blocks are known, the obtained GOBF approximation of the former is shown in [30] to be highly accurate. However, if no a priori information about the linear block is available, or if a less general basis is used in the expansion, it is likely that there will be model mismatch.
As a consequence, the estimates of the model parameters of both the linear and nonlinear blocks will be biased.

In this paper, a detailed analysis of the obtained parameter estimates and the associated bias due to model mismatch is provided. The analysis enables the construction of an algorithm that improves the estimates of the output-polynomial coefficients and yields a more accurate approximation of the linear dynamics. Although the Laguerre functions are in focus here, the bias expressions and the bias reduction technique derived are independent of the basis choice. Specifically, the analysis in its entirety also applies in the GOBF case.

The utility of the proposed method is demonstrated by applying it to human smooth pursuit eye movements (SPEM) in order to obtain a model of the smooth pursuit system (SPS). SPEM and the SPS have been intensively studied recently in medicine (see e.g. [11, 13, 23]). Lately, as video-based eye-tracking techniques have been greatly improved and become more accessible, a plethora of engineering applications has emerged [2, 18, 40]. In [14] and [15], the SPS is modeled as a dynamical system relating gaze direction to visual input. It is found that the SPS is accurately described by a Wiener model. In [14], an ARX-structure of fixed order is chosen to represent the linear dynamics. Generally, in many applications where Wiener models are considered, the structure


of the linear block is assumed to be known. However, because the SPS is such a vastly complex neuromuscular system, it is difficult to make well-grounded assumptions regarding the structure of the LTI block. The problem of SPS identification is therefore well-suited for the identification approach of this paper.

The main contribution of this work is the novel identification method of polynomial Wiener systems with unknown structure of the linear block and the complete mathematical analysis of the method for general output polynomials, provided for the most part in Appendix A and Appendix B. The analysis facilitates insights into both the performance of the method itself and the connection of the underlying Wiener model to the VL series and the Wiener-Schetzen model. Based on the analysis, an iterative bias reduction technique is designed for the proposed identification method, thus constituting another contribution of the paper. Furthermore, theoretical results on the convergence of the bias reduction algorithm are presented and relevant numerical experiments illustrating the theoretical results are conducted.

The paper is composed as follows: Sec. 2 and Sec. 3 detail the Laguerre and the VL model frameworks. The system to be identified is described in Sec. 4. In Sec. 5, the method is presented followed by bias derivations in Sec. 6, with proofs of theorems in the appendices. In Sec. 7 the bias reduction algorithm is outlined and its convergence is discussed. Numerical simulations illustrating the analytical results are presented in Sec. 8 together with a performance evaluation of the method compared to previously proposed approaches. The method is also evaluated on experimental SPS data. Finally, a discussion of the results is provided in Sec. 9.

2 Laguerre Representation of Linear Dynamics

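The discrete-time Laguerre functions $\phi_j(k)$ are given by

$$\phi_j(k) = \alpha^{\frac{k-j}{2}} \sqrt{1-\alpha} \sum_{\ell=0}^{j} (-1)^{\ell} \binom{k}{\ell} \binom{j}{\ell} \alpha^{j-\ell} (1-\alpha)^{\ell},$$

for all non-negative $j$, $k$, and $0 < \alpha < 1$ [15]. They constitute an orthonormal basis in $\ell_2[0,\infty)$ [28] with respect to the inner product

$$\langle f(k), g(k) \rangle = \sum_{k=0}^{\infty} f(k) g(k). \tag{1}$$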
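The orthonormality with respect to (1) can be checked numerically by evaluating the closed-form expression on a finite horizon. The following is a minimal sketch: the horizon of 200 samples, the value α = 0.5 and the function name `laguerre` are choices made for the example, and the infinite sum in (1) is truncated.

```python
from math import comb, sqrt
import numpy as np

def laguerre(j, alpha, n):
    """First n samples of the discrete Laguerre function of order j."""
    phi = np.empty(n)
    for k in range(n):
        # math.comb(k, l) is zero for l > k, as required by the binomial coefficient
        s = sum((-1) ** l * comb(k, l) * comb(j, l)
                * alpha ** (j - l) * (1 - alpha) ** l for l in range(j + 1))
        phi[k] = alpha ** ((k - j) / 2) * sqrt(1 - alpha) * s
    return phi

alpha, horizon = 0.5, 200
phi = np.array([laguerre(j, alpha, horizon) for j in range(5)])
gram = phi @ phi.T          # truncated inner products <phi_i, phi_j>
print(np.round(gram, 6))
```

The Gram matrix is the identity up to rounding, since the functions have decayed to a negligible level well before the end of the horizon for this α.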

Discrete Laguerre functions are widely used in system identification [33] and may be viewed as the impulse responses of a family of linear systems parameterized in the Laguerre parameter $\alpha$ that determines their exponential decay behavior. In the Z-domain, the discrete Laguerre functions are

$$\Phi_j(z) = \frac{\sqrt{1-\alpha}}{z - \sqrt{\alpha}} \left(S(z)\right)^j,$$

where $j = 0, 1, 2, \ldots$ is the Laguerre order and

$$S(z) = \frac{1 - \sqrt{\alpha}\, z}{z - \sqrt{\alpha}}$$

is the discrete Laguerre shift operator.

Interpreting $\Phi_j(z)$ as a transfer function of a filter, introduce the outputs of the Laguerre filters

$$\psi_j(k) \triangleq (\phi_j * u)(k) = \sum_{l=0}^{k} \phi_j(l) u(k-l), \tag{2}$$

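where $k = 0, 1, \ldots$ is the discrete time-variable, $u(k)$ is the filter input with $u(k) = 0$, $k < 0$. It is shown in [16] that the Laguerre filter outputs satisfy the state-space equation

$$\boldsymbol{\psi}(k+1) = F \boldsymbol{\psi}(k) + G u(k),$$

where $\boldsymbol{\psi}(k) = [\psi_0(k)\ \psi_1(k)\ \psi_2(k)\ \ldots]^T$ with

$$F = \begin{bmatrix} \sqrt{\alpha} & 0 & 0 & 0 & \cdots \\ 1-\alpha & \sqrt{\alpha} & 0 & 0 & \cdots \\ \sqrt{\alpha}(1-\alpha) & 1-\alpha & \sqrt{\alpha} & 0 & \cdots \\ \sqrt{\alpha}^{2}(1-\alpha) & \sqrt{\alpha}(1-\alpha) & 1-\alpha & \sqrt{\alpha} & \cdots \\ \vdots & \vdots & \vdots & \vdots & \ddots \end{bmatrix}, \tag{3}$$

and

$$G = \sqrt{1-\alpha} \begin{bmatrix} 1 & \sqrt{\alpha} & \sqrt{\alpha}^{2} & \sqrt{\alpha}^{3} & \ldots \end{bmatrix}^T. \tag{4}$$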
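A truncated version of the matrices in (3) and (4) can be built directly from their Toeplitz pattern. The sketch below takes the entries as printed above; a different sign convention for the shift operator would change the signs of the off-diagonal entries but not the diagonal, which is what determines the eigenvalues of a triangular matrix. The function name and the chosen values of α and L are assumptions.

```python
import numpy as np

def laguerre_state_space(alpha, L):
    """Truncated (L+1)-state matrices following the pattern printed in (3) and (4)."""
    a = np.sqrt(alpha)
    n = L + 1
    F = np.zeros((n, n))
    for i in range(n):
        F[i, i] = a                                    # diagonal: sqrt(alpha)
        for j in range(i):
            F[i, j] = a ** (i - j - 1) * (1 - alpha)   # constant along sub-diagonals
    G = np.sqrt(1 - alpha) * a ** np.arange(n)
    return F, G

F, G = laguerre_state_space(alpha=0.5, L=3)
# F is lower triangular, so its eigenvalues are its diagonal entries.
eigenvalues = np.diag(F)
decay = np.linalg.norm(np.linalg.matrix_power(F, 200))  # F^k vanishes for large k
print(eigenvalues, decay)
```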

The eigenvalues of the Toeplitz matrix $F$ are all equal to $\sqrt{\alpha}$ and the system is thus asymptotically stable for all admissible values of $\alpha$.

To compute $\boldsymbol{\psi}(k)$, the initial conditions of the filters $\phi_j(k)$ have to be specified. The initial values may be set to zero, which is a logical choice if the system is at rest at the beginning of the data collection. Alternatively, one can estimate the initial conditions by including them as unknown parameters in the model. However, for large data sets, the choice of initial conditions is insignificant as the model quality will be almost completely defined by its stationary performance.

The impulse response of a stable discrete-time linear time-invariant system can be approximated in terms of the $L+1$ first Laguerre functions by

$$\hat{h}(k) = \sum_{j=0}^{L} c_j \phi_j(k).$$

The actual impulse response is then given by

$$h(k) = \sum_{j=0}^{L} c_j \phi_j(k) + \nu(k) = \hat{h}(k) + \nu(k),$$

where $\nu(k) = \sum_{j=L+1}^{\infty} c_j \phi_j(k)$ represents the truncation error, i.e. the part of the impulse response that is orthogonal in the sense of (1) to the subspace of $\ell_2[0,\infty)$ spanned by the $L+1$ first Laguerre functions. It should be noted that

$$\|\nu(k)\|_2^2 = \sum_{k=0}^{\infty} \nu^2(k) = \sum_{k=0}^{\infty} \left( \sum_{j=L+1}^{\infty} c_j \phi_j(k) \right)^{2} = \sum_{j=L+1}^{\infty} c_j^2, \tag{5}$$

where the last equality is due to the orthonormality of the Laguerre functions. In the following, it is assumed that $\{c_j\}_{j=0}^{L}$ are the coefficients that minimize $\|h(k) - \hat{h}(k)\|_2$. Consequently, the output $y_\ell(k)$ of the system with input $u(k)$ can be approximated in terms of the $L+1$ first Laguerre filter responses by

$$\hat{y}_\ell(k) = \sum_{j=0}^{L} c_j \psi_j(k). \tag{6}$$

The approximation $\hat{y}_\ell(k) \approx y_\ell(k)$ turns into an equality if the impulse response of the system is given by a linear combination of the first $L+1$ Laguerre functions. If not, the actual system output will be in the form

$$y_\ell(k) = \sum_{j=0}^{L} c_j (\phi_j * u)(k) + (\nu * u)(k) = \sum_{j=0}^{L} c_j \psi_j(k) + \varepsilon(k),$$

where

$$\varepsilon(k) = \sum_{j=L+1}^{\infty} c_j \psi_j(k) \tag{7}$$

is the error in the output due to the impulse response truncation error.
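The truncated expansion and the energy of its error can be illustrated on a simple impulse response. In the sketch below, the first-order response 0.9^k, the values α = 0.5 and L = 3, and the finite horizon are all assumptions for the example. The coefficients are obtained as inner products with the Laguerre functions, which for an orthonormal basis gives the minimizing coefficients; the error energy is then compared with the energy not captured by the first L + 1 coefficients.

```python
from math import comb, sqrt
import numpy as np

def laguerre(j, alpha, n):
    """First n samples of the discrete Laguerre function of order j."""
    phi = np.empty(n)
    for k in range(n):
        s = sum((-1) ** l * comb(k, l) * comb(j, l)
                * alpha ** (j - l) * (1 - alpha) ** l for l in range(j + 1))
        phi[k] = alpha ** ((k - j) / 2) * sqrt(1 - alpha) * s
    return phi

alpha, L, n = 0.5, 3, 400
h = 0.9 ** np.arange(n)                       # example impulse response (assumed)
Phi = np.column_stack([laguerre(j, alpha, n) for j in range(L + 1)])

c = Phi.T @ h                                 # c_j = <h, phi_j>
h_hat = Phi @ c                               # truncated Laguerre approximation
err_energy = float(np.sum((h - h_hat) ** 2))  # squared norm of the truncation error
energy_gap = float(np.sum(h ** 2) - np.sum(c ** 2))
print(err_energy, energy_gap)
```

Both numbers agree, in line with (5): the squared norm of the truncation error equals the sum of the squared coefficients that were left out.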

3 Volterra-Laguerre Representation of Nonlinear Dynamics

A stable discrete nonlinear system with input $u(k)$ and output $y(k)$, $k = 0, \ldots, K-1$, can be approximated by Volterra series (see e.g. [28])

$$y(k) = y_0 + \sum_{n=1}^{N} H_n u(k) + \zeta(k),$$

where $\zeta(k)$ is a noise term, $N \in \mathbb{N}$ is the chosen Volterra series order, and

$$H_n u(k) = \sum_{i_1=0}^{\infty} \cdots \sum_{i_n=0}^{\infty} h_n(i_1, \ldots, i_n)\, u(k-i_1) \cdots u(k-i_n) \tag{8}$$

are the Volterra functionals [28]. The functions $h_n$ are called the Volterra kernels.

By approximating the Volterra kernels in terms of Laguerre functions, the functionals in (8) may be written as

$$H_n u(k) = \sum_{j_1=0}^{L} \cdots \sum_{j_n=0}^{L} \gamma_n(j_1, \ldots, j_n) \prod_{\ell=1}^{n} \psi_{j_\ell}(k),$$

where $\gamma_n(j_1, \ldots, j_n)$ are the VL coefficients and $\psi_{j_\ell}$ are the Laguerre filter outputs defined in (2). The VL model is thus

$$y(k) = y_0 + \sum_{n=1}^{N} \sum_{j_1=0}^{L} \cdots \sum_{j_n=0}^{L} \gamma_n(j_1, \ldots, j_n) \prod_{\ell=1}^{n} \psi_{j_\ell}(k) + \zeta(k). \tag{9}$$

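In what follows, the time-index $k$ will occasionally be omitted from functions for brevity. The expression in (9) may thus be written as

$$y = \boldsymbol{\varphi}^T \boldsymbol{\gamma} + \zeta, \tag{10}$$

where $\boldsymbol{\gamma} \in \mathbb{R}^M$ is the parameter vector containing the VL coefficients and $\boldsymbol{\varphi} \in \mathbb{R}^M$ is the regression vector given by

$$\boldsymbol{\varphi} = \Big[\ \underbrace{1}_{\boldsymbol{\varphi}_0} \quad \underbrace{\boldsymbol{\psi}_L^T}_{\boldsymbol{\varphi}_1^T} \quad \underbrace{(\boldsymbol{\psi}_L \otimes \boldsymbol{\psi}_L)^T}_{\boldsymbol{\varphi}_2^T} \quad \underbrace{((\boldsymbol{\psi}_L \otimes \boldsymbol{\psi}_L) \otimes \boldsymbol{\psi}_L)^T}_{\boldsymbol{\varphi}_3^T} \quad \ldots \quad \underbrace{\cdots}_{\boldsymbol{\varphi}_N^T}\ \Big]^T_{1 \times M}, \tag{11}$$

with $\otimes$ representing the Kronecker product, where

$$\boldsymbol{\psi}_L(k) = [\psi_0(k)\ \psi_1(k)\ \ldots\ \psi_L(k)]^T \tag{12}$$

is a vector composed of the $L+1$ first Laguerre filter responses at time $k$. The regression vector $\boldsymbol{\varphi}$ thus contains the Laguerre filter outputs $\{\psi_j\}_{j=0}^{L}$ and all possible products of them. The constant $M$ is the total number of VL parameters in the model (in practice the number of parameters is reduced by grouping equivalent products, e.g. $\psi_i \psi_j$ and $\psi_j \psi_i$). The sub-vectors $\boldsymbol{\varphi}_n$ are of lengths $(L+1)^n$. Using these sub-vectors, the output can be rewritten as

$$y = \boldsymbol{\varphi}^T \boldsymbol{\gamma} + \zeta = \sum_{n=0}^{N} \boldsymbol{\varphi}_n^T \boldsymbol{\gamma}_n + \zeta \tag{13}$$

where $\boldsymbol{\gamma}_n$ denotes the vector of VL coefficients associated with the Volterra kernel of order $n$.

The VL model is a generalization of (6) to nonlinear dynamics as it incorporates higher-order impulse responses, modeled by the multivariate Laguerre functions, the latter being products of the univariate ones.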
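The construction of the regression vector in (11) from repeated Kronecker products can be sketched as follows. The function name `vl_regressor` and the numerical values are illustrative assumptions; the redundant products mentioned above are kept, so the length is the sum of (L+1)^n for n = 0, ..., N.

```python
import numpy as np

def vl_regressor(psi_L, N):
    """Stack 1, psi, psi⊗psi, (psi⊗psi)⊗psi, ... up to order N into one vector."""
    blocks = [np.ones(1)]
    current = np.ones(1)
    for _ in range(N):
        current = np.kron(current, psi_L)   # next Kronecker power of psi_L
        blocks.append(current)
    return np.concatenate(blocks)

psi_L = np.array([1.0, 2.0, 3.0])           # filter outputs at one time instant, L = 2
phi_vec = vl_regressor(psi_L, N=3)
print(phi_vec.size)                          # 1 + 3 + 9 + 27 entries
```

The second-order block equals the row-major flattening of the outer product of the filter outputs with themselves, which makes the duplicated entries for swapped index pairs easy to see.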


4 The Considered System

In this paper, the system to be identified is a SISO Wiener system driven by a zero-mean i.i.d. Gaussian process $u(k)$, $k = 0, 1, \ldots$ of variance $\sigma^2$. Nothing is assumed about the linear subsystem other than stability and time-invariance. The linear part can be written as

$$\begin{cases} \boldsymbol{\psi}_L(k+1) = F_L \boldsymbol{\psi}_L(k) + G_L u(k), \\ y_\ell(k) = \boldsymbol{\psi}_L^T(k)\, \boldsymbol{c} + \varepsilon(k), \end{cases}$$

where $F_L \in \mathbb{R}^{(L+1) \times (L+1)}$ and $G_L \in \mathbb{R}^{(L+1) \times 1}$ are truncated versions of (3) and (4), respectively, $\boldsymbol{\psi}_L$ is defined in (12), $\boldsymbol{c} = [c_0\ c_1\ \ldots\ c_L]^T$ and $\varepsilon(k)$ is the truncation error given in (7).

The output mapping is characterized by $m : \mathbb{R} \to \mathbb{R}$, i.e.

$$y(k) = m(y_\ell(k)) + \zeta(k), \quad k = 0, 1, \ldots \tag{14}$$

where $\zeta(k)$ is the part of the output that cannot be explained by the input, e.g. measurement noise. The function $m(\cdot)$ is a polynomial of order $N \in \mathbb{N}$, i.e. $m(x) = d_0 + d_1 x + d_2 x^2 + \ldots + d_N x^N$. To eliminate gain ambiguities, $d_1 = 1$ will be assumed. Allowing a non-zero offset is a manageable technical complication, but for the sake of simplifying result interpretation, $d_0 = 0$ will be assumed in the following analysis. Hence, the output nonlinearity $m$ is given by

$$m(x) = x + d_2 x^2 + \ldots + d_N x^N.$$
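The gain ambiguity removed by fixing the linear coefficient to one can be seen in a small numerical example. Scaling the linear block output by a gain g and dividing the n-th polynomial coefficient by g^n leaves the output unchanged, so choosing g equal to the original linear coefficient yields an equivalent system with unit linear coefficient. The coefficient values below are arbitrary assumptions.

```python
import numpy as np

def poly_output(x, d):
    """m(x) = sum_n d[n] * x**n with d = [d0, d1, ..., dN]."""
    return sum(dn * x ** n for n, dn in enumerate(d))

rng = np.random.default_rng(0)
x = rng.normal(size=50)                       # output of the linear block
d = np.array([0.0, 2.0, 0.8, -0.4])           # d1 = 2: not yet normalized

g = d[1]                                      # gain moved into the linear block
x_scaled = g * x
d_scaled = d / g ** np.arange(len(d))         # d_n / g^n, so that d_scaled[1] = 1

y_original = poly_output(x, d)
y_rescaled = poly_output(x_scaled, d_scaled)
print(d_scaled)
```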

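The goal is to estimate $\boldsymbol{c}$ and $\{d_n\}_{n=2}^{N}$ from observations $\{u(k), y(k)\}$, $k = 0, 1, \ldots, K-1$.

The system output is in the form

$$y = \boldsymbol{\psi}_L^T \boldsymbol{c} + d_2 (\boldsymbol{\psi}_L^T \boldsymbol{c})^2 + \ldots + d_N (\boldsymbol{\psi}_L^T \boldsymbol{c})^N + g(\varepsilon) + \zeta, \tag{15}$$

where $g(\varepsilon)$ represents the effect on the system output due to the dynamics in the linear block not captured by the first $L+1$ Laguerre functions. Expanding the powers of $\boldsymbol{\psi}_L^T \boldsymbol{c}$ yields an expression in the form

$$y = \boldsymbol{\varphi}^T \boldsymbol{\gamma}^* + g(\varepsilon) + \zeta,$$

where $\boldsymbol{\varphi}$ is the same as in (11) and

$$\boldsymbol{\gamma}^* = \Big[\ 0 \quad \underbrace{\boldsymbol{c}^T}_{\boldsymbol{\gamma}_1^{*T}} \quad \underbrace{d_2 (\boldsymbol{c} \otimes \boldsymbol{c})^T}_{\boldsymbol{\gamma}_2^{*T}} \quad \underbrace{d_3 ((\boldsymbol{c} \otimes \boldsymbol{c}) \otimes \boldsymbol{c})^T}_{\boldsymbol{\gamma}_3^{*T}} \quad \ldots\ \Big]^T_{1 \times M}. \tag{16}$$

Modeling system (14) by the VL model in (10) apparently results in a model mismatch, whose effect is represented by $g(\varepsilon)$.

An important observation to be made here is the relationship

$$\boldsymbol{\varphi}_n^T \boldsymbol{\gamma}_n^* = d_n (\boldsymbol{\varphi}_1^T \boldsymbol{\gamma}_1^*)^n, \quad n = 1, 2, \ldots \tag{17}$$

that follows directly from (16) and (11).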
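Relationship (17) is an instance of the mixed-product property of the Kronecker product and is easy to confirm numerically. In the sketch below the vectors and polynomial coefficients are random or arbitrary, and the helper `kron_power` is a name introduced for the example.

```python
from functools import reduce
import numpy as np

def kron_power(v, n):
    """n-fold Kronecker product ((v ⊗ v) ⊗ v) ⊗ ... for n >= 1."""
    return reduce(np.kron, [v] * n)

rng = np.random.default_rng(0)
c = rng.normal(size=4)                   # Laguerre coefficients of the linear block
psi = rng.normal(size=4)                 # Laguerre filter outputs at one time instant
d = {1: 1.0, 2: 0.7, 3: -0.3}            # polynomial coefficients, with d1 = 1

lhs, rhs = [], []
for n, dn in d.items():
    phi_n = kron_power(psi, n)           # n-th block of the regression vector
    gamma_n = dn * kron_power(c, n)      # n-th block of the coefficient vector in (16)
    lhs.append(phi_n @ gamma_n)
    rhs.append(dn * (psi @ c) ** n)
print(lhs, rhs)
```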


4.1 Summary of Assumptions

The following assumptions are made

1. The linear block is stable, proper and finite-dimensional

2. The output nonlinearity is a polynomial of order at most N

3. The input to the system is zero-mean i.i.d. Gaussian noise with variance $\sigma^2$

4. There is no constant offset in the output nonlinearity (d0 = 0)

5. The output nonlinearity is non-even (d1 = 1)

The third assumption is restrictive, but as is demonstrated below, the proposed method works well, and sometimes even better, in certain cases of colored input. The fourth and fifth assumptions are common in Wiener system identification [30] and are for the application of this work motivated by the nature of the SPS nonlinearity described in [14].

5 Estimating the Coefficients of the Output Polynomial

The method below yields estimates of the linear block output y`(k) and theunknown polynomial coefficients {dn}Nn=2 by sequentially solving two linearleast-squares (LS) problems.Given observations of the input and output {u(k), y(k)}, k = 0, . . . ,K − 1 ofsystem (14), the method first obtains estimates γ of the VL coefficients γ in(10) through solving

γ = arg minγ

K−1∑k=0

|y(k)−ϕT (k)γ|2, (18)

with ϕ as in (11), where the output-polynomial order N , the largest functionorder in the Laguerre expansion L, and the Laguerre parameter α must bechosen. The optimization problem in (18) is solved by LS. When γ is found,an explicit approximation of the output of the linear block is

y`(k) = ϕT1 γ1.

Consequently, the coefficients in the output nonlinearity can be estimated using standard polynomial fitting, i.e. by LS, solving

$$\hat{d} = \arg\min_{d} \sum_{k=0}^{K-1} \left| z(k) - \phi^T(k) d \right|^2, \tag{19}$$

where $d = [d_2 \ldots d_N]^T$, $z = y - \hat{\gamma}_0 - \varphi_1^T \hat{\gamma}_1$ and

$$\phi = \left[ (\varphi_1^T \hat{\gamma}_1)^2 \; \ldots \; (\varphi_1^T \hat{\gamma}_1)^N \right]^T. \tag{20}$$

Note that even though it is assumed that $\gamma_0^* = 0$, the same cannot be assumed for the estimate $\hat{\gamma}_0$, which must thus be included in the criterion of (19).
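The two LS steps (18) and (19) can be sketched in a few lines of Python. This is a minimal illustration and not the implementation used in this work. It assumes NumPy is available, assumes the common discrete-time Laguerre filter form (a first-order low-pass followed by all-pass sections, which may differ from (2) by a delay convention), and builds the VL regressor from the distinct monomials of the Laguerre filter outputs rather than from the exact layout of (11). The function names are hypothetical.

```python
import itertools
import numpy as np


def laguerre_outputs(u, alpha, L):
    """Filter u through Laguerre filters of order 0..L (assumed standard form)."""
    K = len(u)
    psi = np.zeros((L + 1, K))
    gain = np.sqrt(1.0 - alpha**2)
    # Order 0: low-pass, psi_0(k) = alpha*psi_0(k-1) + gain*u(k)
    for k in range(K):
        psi[0, k] = (alpha * psi[0, k - 1] if k else 0.0) + gain * u[k]
    # Order j: all-pass section (z^-1 - alpha) / (1 - alpha z^-1) applied to order j-1
    for j in range(1, L + 1):
        for k in range(K):
            past = alpha * psi[j, k - 1] + psi[j - 1, k - 1] if k else 0.0
            psi[j, k] = past - alpha * psi[j - 1, k]
    return psi


def vl_regressors(psi, N):
    """Columns: constant, Laguerre outputs, and their distinct products up to order N."""
    n_filters, K = psi.shape
    cols = [np.ones(K)]
    for n in range(1, N + 1):
        for idx in itertools.combinations_with_replacement(range(n_filters), n):
            cols.append(np.prod(psi[list(idx)], axis=0))
    return np.column_stack(cols)


def identify_wiener(u, y, alpha, L, N):
    """Two-stage LS: VL coefficients as in (18), then polynomial coefficients as in (19)."""
    psi = laguerre_outputs(u, alpha, L)
    gamma, *_ = np.linalg.lstsq(vl_regressors(psi, N), y, rcond=None)
    gamma0, gamma1 = gamma[0], gamma[1:L + 2]
    y_lin = psi.T @ gamma1                      # estimate of the linear block output
    z = y - gamma0 - y_lin
    powers = np.column_stack([y_lin**n for n in range(2, N + 1)])   # regressor (20)
    d, *_ = np.linalg.lstsq(powers, z, rcond=None)
    return gamma0, gamma1, d


# Noise-free example without model mismatch (all values chosen for illustration)
rng = np.random.default_rng(0)
u = rng.standard_normal(2000)
psi_true = laguerre_outputs(u, 0.4, 1)
x = psi_true[0] + 0.5 * psi_true[1]
y = x + 0.2 * x**2 - 0.4 * x**3
gamma0, gamma1, d = identify_wiener(u, y, alpha=0.4, L=1, N=3)
```

Because the simulated linear block lies in the span of the first two Laguerre functions, this example contains no truncation error and the estimates coincide with the true coefficients up to numerical precision. Adding higher-order Laguerre terms to `x` while keeping `L=1` would introduce the mismatch studied in the next section.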

6 Bias due to Model Mismatch

Under model mismatch, the estimates $\hat{\gamma}$ of the VL coefficients obtained by the method in Sec. 5 will be biased, i.e. generally not in the form of (16). Thus, the estimates $\hat{d}$ obtained by solving (19) will also be biased. However, for any structure of the linear block, as long as it is well-approximated by a basis expansion, the form of the vector of estimated coefficients, $\hat{\gamma}$, will not deviate strongly from $\gamma^*$ and good estimates of $\{d_n\}_{n=2}^{N}$ may still be achievable.

To derive explicit expressions for the bias due to model mismatch in the asymptotic case, some prerequisites are required. The following analysis assumes the choice of expansion basis to be the discrete-time Laguerre functions, but the analysis holds for any choice of orthonormal basis.

6.1 Properties of the Laguerre Filter Outputs

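If $u(k)$ is a zero-mean i.i.d. Gaussian process with variance $\sigma^2$ for $k \ge 0$ and $u(k) = 0$ for $k < 0$, then $\psi_j(k)$ is an asymptotically stationary Gaussian process [39] with $E\{\psi_j(\infty)\} = \mu_\infty$ and $V\{\psi_j(\infty)\} = \eta_\infty^2$, where $\psi_j(\infty) = \lim_{k\to\infty} \psi_j(k)$ is a Gaussian stochastic variable. Because $\psi_j(k)$ is a linear combination of zero-mean Gaussian i.i.d. variables for a given $k$, as seen in (2), $E\{\psi_j(k)\} = 0$ for all $k$ so that $\mu_\infty = 0$. The variance $\eta_\infty^2$ is given by

$$\eta_\infty^2 = E\{(\psi_j(\infty))^2\} = \lim_{k\to\infty} E\left\{ \left( \sum_{l=0}^{k} \phi_j(k-l)\,u(l) \right)^2 \right\} = \sigma^2 \sum_{m=0}^{\infty} (\phi_j(m))^2 = \sigma^2.$$

The equalities follow from the fact that $E\{u(l_1)u(l_2)\} = \sigma^2$ if $l_1 = l_2$ and $E\{u(l_1)u(l_2)\} = 0$ if $l_1 \neq l_2$, together with the unit norm of the Laguerre functions. Thus, $E\{\psi_j(\infty)\} = \mu_\infty = 0$ and $V\{\psi_j(\infty)\} = \eta_\infty^2 = \sigma^2$.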
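The last equality relies on the Laguerre functions having unit norm, and Proposition 6.1 below additionally uses their mutual orthogonality. Both properties are easy to check numerically. The sketch below assumes the common discrete-time Laguerre filter form (low-pass section followed by all-pass sections), which may differ from (2) by a delay convention; the function names and parameter values are illustrative only.

```python
import math


def laguerre_impulse_responses(alpha, L, length):
    """Impulse responses phi_0..phi_L of the (assumed) discrete Laguerre filters."""
    gain = math.sqrt(1.0 - alpha**2)
    phi = [[gain * alpha**k for k in range(length)]]   # order 0: low-pass section
    for j in range(1, L + 1):
        prev, cur = phi[-1], []
        for k in range(length):
            # all-pass section (z^-1 - alpha) / (1 - alpha z^-1)
            past = alpha * cur[k - 1] + prev[k - 1] if k else 0.0
            cur.append(past - alpha * prev[k])
        phi.append(cur)
    return phi


def inner(a, b):
    """Truncated inner product of two impulse responses."""
    return sum(x * y for x, y in zip(a, b))


phi = laguerre_impulse_responses(alpha=0.4, L=4, length=400)
sigma2 = 1.5
variances = [sigma2 * inner(p, p) for p in phi]   # eta_inf^2 for each filter output
cross = inner(phi[0], phi[3])                      # vanishes for an orthonormal basis
```

Each entry of `variances` equals `sigma2` up to the negligible truncation of the impulse responses, in line with the derivation above.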

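Proposition 6.1. Let $u(k)$ be a zero-mean i.i.d. Gaussian process with variance $\sigma^2$ for $k \ge 0$ and $u(k) = 0$ for $k < 0$, and let $\psi_j(k)$ be given by (2). Then,

$$\lim_{K\to\infty} \frac{1}{K} \sum_{k=0}^{K-1} \psi_i(k)\psi_j(k) = \begin{cases} \sigma^2, & i = j, \\ 0, & i \neq j. \end{cases}$$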

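Proof. First consider the case $i = j$. Parseval's theorem [27] gives

$$\lim_{K\to\infty} \frac{1}{K} \sum_{k=0}^{K-1} (\psi_i(k))^2 = \lim_{K\to\infty} \frac{1}{K} \sum_{k=0}^{K-1} (\phi_i * u)^2(k) = \lim_{K\to\infty} \frac{1}{K^2} \sum_{n=0}^{K-1} |\Phi_i(n)|^2 |U(n)|^2 = \lim_{K\to\infty} \frac{\sigma^2}{K} \sum_{n=0}^{K-1} |\Phi_i(n)|^2,$$

where $\Phi_i(n)$ and $U(n)$ denote the discrete Fourier transforms (DFTs) of $\phi_i(k)$ and $u(k)$, respectively. The last equality follows from the fact that $u(k)$ is an i.i.d. Gaussian process with variance $\sigma^2$. Applying Parseval's theorem again yields

$$\lim_{K\to\infty} \frac{\sigma^2}{K} \sum_{n=0}^{K-1} |\Phi_i(n)|^2 = \lim_{K\to\infty} \sigma^2 \sum_{k=0}^{K-1} (\phi_i(k))^2 = \sigma^2.$$

Similarly, for the case $i \neq j$, Parseval's theorem for cross-energy gives

$$\lim_{K\to\infty} \frac{1}{K} \sum_{k=0}^{K-1} \psi_i(k)\psi_j(k) = \lim_{K\to\infty} \frac{\sigma^2}{K} \sum_{n=0}^{K-1} \Phi_i(n)\Phi_j^*(n) = \lim_{K\to\infty} \sigma^2 \sum_{k=0}^{K-1} \phi_i(k)\phi_j(k) = 0.$$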

Lemma 6.2. Let $u(k)$ be a zero-mean i.i.d. Gaussian process with variance $\sigma^2$ and $\psi_j(k)$ be given by (2). Then,

$$\lim_{K\to\infty} \frac{1}{K} \sum_{k=0}^{K-1} \psi_{j_1}(k)\psi_{j_2}(k) \cdots \psi_{j_m}(k) = E\{w_{j_1} w_{j_2} \cdots w_{j_m}\},$$

for any $j_1, j_2, \ldots, j_m \in \mathbb{N}$, where $\{w_j\}$ are zero-mean i.i.d. Gaussian variables with variance $\sigma^2$.

Proof. It is shown in [19] that for any asymptotically stationary process $X(k)$ it holds that

$$\lim_{K\to\infty} \frac{1}{K} \sum_{k=0}^{K-1} X(k) = E\{X(\infty)\}, \tag{21}$$

meaning that the initial transient behavior of the process has no effect on the average over a long time interval. Thus,

$$\lim_{K\to\infty} \frac{1}{K} \sum_{k=0}^{K-1} \psi_{j_1}(k)\psi_{j_2}(k) \cdots \psi_{j_m}(k) = E\{\psi_{j_1}(\infty)\psi_{j_2}(\infty) \cdots \psi_{j_m}(\infty)\}, \tag{22}$$

where $\psi_j(\infty)$ are zero-mean Gaussian variables with variance $\sigma^2$, as was previously argued. Furthermore, Proposition 6.1 yields

$$E\{\psi_i(\infty)\psi_j(\infty)\} = \lim_{K\to\infty} \frac{1}{K} \sum_{k=0}^{K-1} \psi_i(k)\psi_j(k) = 0, \quad i \neq j,$$

proving that the Gaussian variables $\{\psi_j(\infty)\}_{j=0}^{\infty}$ are independent, which together with (22) concludes the proof.

Lemma 6.2 implies that a stationary i.i.d. Gaussian process filtered by two filters orthogonal to each other with respect to inner product (1) produces two asymptotically independent Gaussian processes.

6.2 Properties of the Truncation Error

The severity of the model mismatch is communicated by $\varepsilon(k)$ in (7). Notice that $\varepsilon(k)$ is orthogonal to $\sum_{j=0}^{L} c_j \psi_j$ with respect to the asymptotic averaging operator, due to Proposition 6.1. An important property of the truncation error in (7) is proved by the following lemma.

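Lemma 6.3. For $\varepsilon(k)$ given by (7), it holds that

$$\lim_{K\to\infty} \frac{1}{K} \sum_{k=0}^{K-1} \varepsilon^n(k) = \begin{cases} (n-1)!!\, \|\nu\|_2^n \sigma^n, & n \text{ even}, \\ 0, & n \text{ odd}, \end{cases}$$

where $p!!$ denotes the double factorial, i.e. the product of every number from $p$ to 1 that has the same parity as $p$, and $\|\nu\|_2$ is calculated according to (5).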

Proof. Equation (21) gives

$$\lim_{K\to\infty} \frac{1}{K} \sum_{k=0}^{K-1} \varepsilon^n(k) = E\{\varepsilon^n(\infty)\}.$$

As was shown in the proof of Lemma 6.2, $\{\psi_j(\infty)\}_{j=L+1}^{\infty}$ are zero-mean i.i.d. Gaussian variables. Therefore, $\varepsilon(\infty)$ is also a zero-mean Gaussian variable with variance

$$V\{\varepsilon(\infty)\} = \sigma^2 \sum_{j=L+1}^{\infty} c_j^2 = \sigma^2 \|\nu\|_2^2.$$

$E\{\varepsilon^n(\infty)\}$ thus represents the $n$:th moment of a zero-mean normal distribution of variance $\sigma^2\|\nu\|_2^2$, which according to [38] evaluates to $(n-1)!!\,\|\nu\|_2^n\sigma^n$ if $n$ is even and to zero if $n$ is odd.
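As a small worked illustration of Lemma 6.3, the asymptotic averages of the powers of the truncation error can be tabulated directly from the double-factorial formula. The helper names and the numerical values below are chosen for illustration only.

```python
def double_factorial(p):
    """p!! = p (p-2) (p-4) ... ; by convention (-1)!! = 0!! = 1."""
    result = 1
    while p > 1:
        result *= p
        p -= 2
    return result


def truncation_error_moment(n, nu_norm, sigma):
    """Asymptotic time average of eps^n(k) according to Lemma 6.3."""
    if n % 2:
        return 0.0
    return double_factorial(n - 1) * (nu_norm * sigma) ** n


# Example: ||nu||_2 = 0.2 and sigma = 1
moments = [truncation_error_moment(n, nu_norm=0.2, sigma=1.0) for n in range(1, 7)]
```

For these values the second, fourth and sixth moments are 0.04, 3 · 0.04² and 15 · 0.04³, while all odd moments vanish.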


6.3 Derivation of an Expression for Bias

Making use of Lemma 6.2 and Lemma 6.3, explicit expressions for the bias in the estimates $\hat{\gamma}$ of the VL coefficients $\gamma^*$ due to model mismatch are provided in the following theorem.

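Theorem 6.4. Let system (14) be identified using the method of Sec. 5 under the assumptions in Sec. 4.1. Then, the asymptotically obtained bias $\Delta\gamma$ in the estimate of the VL coefficients $\gamma^*$, i.e. the deviation of $\hat{\gamma}$ from $\gamma^*$ in (16), is given by

$$\Delta\gamma = I_0 \Delta\gamma_0 + \sum_{n=1}^{N-2} I_n \Delta\gamma_n, \tag{23}$$

where

$$\Delta\gamma_0 = \delta_0, \qquad \Delta\gamma_n = \frac{\delta_n}{d_n}\gamma_n^*, \qquad \delta_n = \sum_{m=n+2}^{N} \frac{(-1)^{m-n} + 1}{2}\, C_{m,n}, \tag{24}$$

and where $C_{m,n} = \binom{m}{n} d_m \cdot (m-n-1)!!\,(\sigma\|\nu\|_2)^{m-n}$. $\Delta\gamma_n \in \mathbb{R}^{(L+1)^n}$ is the deviation of $\hat{\gamma}_n$ from $\gamma_n^*$ due to model mismatch.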

Proof. See Appendix A.

Using Theorem 6.4, the bias in the estimates $\{\hat{d}_n\}_{n=2}^{N}$ of the output polynomial coefficients $\{d_n\}_{n=2}^{N}$ is given in the following theorem.

Theorem 6.5. Let system (14) be identified using the method of Sec. 5 under the assumptions in Sec. 4.1. Then, the asymptotically obtained bias $\Delta d_n$ in the estimate of the polynomial coefficients $\{d_n\}_{n=2}^{N}$ is given by

$$\Delta d_n = \frac{\delta_n - d_n\beta_n}{1 + \beta_n},$$

where

$$\beta_n = \sum_{p=1}^{n} \binom{n}{p} \delta_1^p = (1 + \delta_1)^n - 1,$$

and $\delta_n$ is as defined in Theorem 6.4.

Proof. See Appendix B.
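The quantities in Theorem 6.4 and Theorem 6.5 are straightforward to evaluate numerically. The sketch below computes $\delta_n$ from (24) and the asymptotic limits of the estimates. It assumes the relation $\hat{d}_n = (d_n + \delta_n)/(1 + \delta_1)^n$, which is the fixed-point form of (30) and agrees with (28) for $N = 3$. The function names, the dictionary layout and the numerical values are illustrative choices.

```python
from math import comb


def double_factorial(p):
    """p!! with the convention (-1)!! = 0!! = 1."""
    result = 1
    while p > 1:
        result *= p
        p -= 2
    return result


def delta(n, d, V):
    """delta_n of (24); d maps degree m to d_m, V = sigma^2 * ||nu||_2^2."""
    N = max(d)
    total = 0.0
    for m in range(n + 2, N + 1, 2):        # only terms with even m - n are nonzero
        total += comb(m, n) * d[m] * double_factorial(m - n - 1) * V ** ((m - n) // 2)
    return total


def asymptotic_estimates(d, V):
    """Assumed limits of the LS estimates: (d_n + delta_n) / (1 + delta_1)^n."""
    N = max(d)
    scale = 1.0 + delta(1, d, V)
    return {n: (d[n] + delta(n, d, V)) / scale**n for n in range(2, N + 1)}


# Third-order example with illustrative values
d_true = {1: 1.0, 2: 0.3, 3: -0.6}
V = 0.02
d_hat = asymptotic_estimates(d_true, V)
bias = {n: d_hat[n] - d_true[n] for n in d_hat}

# Offset delta_0 for a fifth-order polynomial: d_2 V + 3 d_4 V^2
d_fifth = {1: 1.0, 2: 0.3, 3: -0.6, 4: 0.1, 5: 0.1}
offset = delta(0, d_fifth, V)
```

As `V` tends to zero, every `delta` vanishes and the entries of `bias` go to zero, in line with the observation that the estimates are exact without model mismatch.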


6.4 Bias Expression Analysis

Useful observations can be made from the proof and results of Theorem 6.4 and Theorem 6.5. First, it follows from (60) that

$$\lim_{\|\nu\|_2 \to 0} \Delta d_n = 0,$$

implying that the estimates of the polynomial coefficients are exact if there is no model mismatch. Furthermore, from (53), the deviation due to model mismatch in the estimated coefficients associated with the Volterra kernel of order $n$ is

$$\Delta\gamma_n = \begin{cases} \delta_0, & n = 0, \\ \dfrac{\delta_n}{d_n}\gamma_n^*, & n = 1, 2, \ldots, \end{cases} \tag{25}$$

where $\delta_n$ is given in (24). Apparently, the bias in the coefficients of kernel $n$ depends only on the true coefficients of the same kernel. Additionally, the coefficients in kernel $n$ are only affected if the output polynomial is of degree $N \ge n + 2$. Thus, $\hat{\gamma}_{N-1} = \gamma_{N-1}^*$ and $\hat{\gamma}_N = \gamma_N^*$, meaning that the model mismatch does not affect the estimation of the coefficients associated with the two kernels of highest order.

According to (57), the estimated VL model is indeed a SISO Wiener model, but one where the approximation of the linear dynamics, $\hat{y}_\ell = \varphi_1^T\hat{\gamma}_1$, is suboptimal and where the output-polynomial coefficients are different from those of the true system. This demonstrates that in the case of model mismatch, estimating the linear and nonlinear parts of the Wiener system separately yields a model with a suboptimal fit.

Another noteworthy result is that although $\Delta\gamma_n$ goes to infinity as $\|\nu\|_2 \to \infty$, the same is not true for $\Delta d_n$. In fact, letting $\|\nu\|_2$ approach infinity in (59) gives $\lim_{\|\nu\|_2\to\infty} \Delta d_n = -d_n$. This means that as the model mismatch becomes more severe, the estimates of the output-polynomial coefficients go to zero and do not diverge.

It is also important to note that bias (60) depends directly on $\|\nu\|_2$ and that the latter scales with the norm of the linear block impulse response. This implies that, with everything else being equal, a linear block with a high gain results in less accurate estimates of the system parameters.

Finally, (60) suggests that the bias increases with the variance $\sigma^2$ of $u$ and that the lower this variance is, the closer $\hat{d}_n$ will be to $d_n$. It would seem that choosing a very low energy input signal would yield better estimates of the polynomial coefficients. However, a small $\sigma^2$ will result in poor estimates if the number of samples is finite. This is partly due to the fact that for a finite number of samples, the measurement noise will not be white and thus cannot completely be averaged out. A certain input energy is then required to achieve an adequate signal-to-noise ratio (SNR). The choice of input energy will thus present a trade-off between SNR and bias.


6.5 A Special Case

To illustrate the results of Theorem 6.4 and Theorem 6.5, the special case of third-order output polynomial nonlinearities is examined here. The estimates of the VL coefficients for $N = 3$ in (53) are

$$\hat{\gamma} = \gamma^* + \|\nu\|_2^2\sigma^2 \left( d_2 s_1 + 3d_3 (c_0 s_2 + \ldots + c_L s_{L+2}) \right), \tag{26}$$

with $s_j$ defined in Theorem 6.4 and where $\{c_j\}_{j=0}^{L}$ are the LS optimal Laguerre coefficients for approximation of the linear subsystem. It can be seen in (26) that if $d_3 = 0$, i.e. if the output polynomial is of order 2, only the first coefficient in $\hat{\gamma}$ will be different from $\gamma^*$. The obtained parameters, $\hat{\gamma}$, will thus still be given by (16), but with a non-zero offset $\hat{\gamma}_0 = d_2\|\nu\|_2^2\sigma^2$. This means that, as the number of time samples $K$ goes to infinity, regardless of how poorly the linear part is approximated by the first $L + 1$ Laguerre functions, it is possible to determine $d_2$ exactly through (17), i.e. the estimate of $d_2$ will be consistent. Moreover, once $d_2$ is computed, it is possible to exactly calculate the truncation error squared norm

$$\|\nu\|_2^2 = \frac{\hat{\gamma}_0}{d_2\sigma^2}, \tag{27}$$

where $\hat{\gamma}_0$ is readily available as the first element in the parameter vector $\hat{\gamma}$. If $d_3$ is non-zero, i.e. if the output polynomial is of degree three, (26) shows that the first coefficient of $\hat{\gamma}$ will still be non-zero and equal to $d_2\|\nu\|_2^2\sigma^2$. The squared norm of the truncation error is thus still given by (27). However, there will also be terms added to the coefficients of $\hat{\gamma}$ corresponding to the first-order kernel of (9) (i.e. $\gamma_1^*$), which will result in bias in the estimates of the polynomial coefficients $d_2$ and $d_3$.

From (58), $\hat{d}_2$ and $\hat{d}_3$ are obtained:

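$$\hat{d} = [\hat{d}_2 \;\; \hat{d}_3]^T = \begin{bmatrix} \dfrac{d_2}{(1 + 3d_3\|\nu\|_2^2\sigma^2)^2} \\[2ex] \dfrac{d_3}{(1 + 3d_3\|\nu\|_2^2\sigma^2)^3} \end{bmatrix}, \tag{28}$$

which can be put as the true values plus a bias term according to

$$\hat{d} = d + \Delta d = d - \begin{bmatrix} \dfrac{3d_2d_3V(2 + 3d_3V)}{(1 + 3d_3V)^2} \\[2ex] \dfrac{9d_3^2V(1 + 3d_3V + 3d_3^2V^2)}{(1 + 3d_3V)^3} \end{bmatrix},$$

where $V = \|\nu\|_2^2\sigma^2$.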
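The step from (28) to the bias form is pure algebra. A short derivation, using the shorthand $x = 3d_3V$ (introduced here only for compactness), reads:

```latex
\begin{align*}
\Delta d_2 &= \frac{d_2}{(1+x)^2} - d_2
            = -d_2\,\frac{2x + x^2}{(1+x)^2}
            = -\frac{3 d_2 d_3 V\,(2 + 3 d_3 V)}{(1 + 3 d_3 V)^2}, \\
\Delta d_3 &= \frac{d_3}{(1+x)^3} - d_3
            = -d_3\,\frac{3x + 3x^2 + x^3}{(1+x)^3}
            = -\frac{9 d_3^2 V\,(1 + 3 d_3 V + 3 d_3^2 V^2)}{(1 + 3 d_3 V)^3}.
\end{align*}
```

Both expressions vanish as $V \to 0$, consistent with the absence of bias when there is no truncation error.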

7 Bias Reduction

If the identification goal is to obtain accurate estimates of both the linear and nonlinear parts of the Wiener system separately, the bias in the output-polynomial coefficients derived in the proof of Theorem 6.5 is undesirable. This section presents means for reducing the estimation bias due to model mismatch.

Since $\gamma_0^* = 0$, it follows from (25) that

$$\hat{\gamma}_0 = \Delta\gamma_0 = \sum_{m=2}^{N} \frac{(-1)^m + 1}{2}\, d_m (m-1)!!\, \|\nu\|_2^m \sigma^m. \tag{29}$$

This expression together with (58) suggests an iterative way of reducing the bias in the estimates of $d_n$. First, estimates of the VL coefficients, $\hat{\gamma}$, and the polynomial coefficients, $\{\hat{d}_n\}_{n=2}^{N}$, are obtained using the method of Sec. 5. Inserting $\hat{\gamma}_0$ and the estimated polynomial coefficients into (29) gives a $\lfloor N/2 \rfloor$:th order equation in $\|\nu\|_2^2$, where $\lfloor\cdot\rfloor$ denotes the floor operator. Let $w = \|\nu\|_2^2$. The equation is solved to obtain an estimate of the Laguerre truncation error squared norm, denoted by $\hat{w}$. For $N \le 3$, the equation is linear in $\|\nu\|_2^2$ and has a unique solution. For $4 \le N \le 5$ the equation is quadratic and has two solutions. In the latter case, the smallest positive solution is chosen, if there is one. If the model mismatch is severe, there may be no positive solution or the solutions may be complex. The output of the proposed bias reduction method is not considered for such cases. For larger $N$, the equation has several solutions and it is not obvious which solution should be chosen. Therefore, only the case of $N \le 5$ is treated here. See Sec. 9 for further discussion.

With $\hat{w}$ obtained from (29), updated values of the coefficients are obtained by solving the equations

$$\hat{d}_{n_0} = \frac{\hat{d}_{n_{i+1}} + \delta_{n_i}}{(1 + \delta_{1_i})^n}, \quad n = 2, \ldots, N, \tag{30}$$

with respect to $\hat{d}_{n_{i+1}}$, where the current estimates of the polynomial coefficients, $\{\hat{d}_{n_i}\}_{n=2}^{N}$, are used to evaluate $\{\delta_{n_i}\}_{n=1}^{N}$ and $\hat{d}_{n_0}$ denotes the initial coefficient estimate. The updated coefficient estimates can in turn be plugged into (29), which is solved to update the truncation error, and so on.

Assume that the iterative method described above yields better estimates of $\{d_n\}_{n=2}^{N}$ and $\|\nu\|_2^2$ than those initially obtained. Denote the updated estimates by $\{d_n^\dagger\}_{n=2}^{N}$ and $w^\dagger$. An improved approximation of the linear subsystem is then given by

$$y_\ell^\dagger = \varphi_1^T \gamma_1^\dagger,$$

where $\gamma_1^\dagger$ is obtained by inserting $\{d_n^\dagger\}_{n=2}^{N}$ and $w^\dagger$ into (55) and solving for $\gamma_1^*$. The bias reduction method is summarized in Alg. 1.

7.1 Convergence Discussion

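The convergence of Alg. 1 is governed by that of the discrete nonlinear dynamical system obtained by solving (30) with respect to $\hat{d}_{n_{i+1}}$, i.e.

$$\hat{d}_{n_{i+1}} = \hat{d}_{n_0}(1 + \delta_{1_i})^n - \delta_{n_i}, \quad n = 2, \ldots, N, \tag{31}$$

and can be written as

$$\hat{d}_{i+1} = f(\hat{d}_i), \tag{32}$$

where $\hat{d}_i = [\hat{d}_{2_i} \ldots \hat{d}_{N_i}]^T$ and $f : \mathbb{R}^{N-1} \to \mathbb{R}^{N-1}$.

Algorithm 1 Reduction of the bias in $\hat{\gamma}_1$ and in $\{\hat{d}_n\}_{n=2}^{N}$ for $N \le 5$

1. Obtain the estimates $\hat{\gamma}_0$ and $\{\hat{d}_{n_0}\}_{n=2}^{N}$ using the method in Sec. 5.

2. Iterate steps 3, 4 and 5 below, for $i = 0, 1, \ldots$ until convergence, then go to step 6.

3. Solve $\hat{\gamma}_0 = \hat{d}_{2_i}\|\nu\|_2^2\sigma^2 + 3\hat{d}_{4_i}\|\nu\|_2^4\sigma^4$ with respect to $\|\nu\|_2^2$ (where $\hat{d}_{4_i} = 0$ if $N \le 3$) and set $\hat{w}_i$ to be the smallest positive solution.

4. Evaluate $\{\delta_{n_i}\}_{n=1}^{N}$ by inserting $\hat{w}_i$ and $\{\hat{d}_{n_i}\}_{n=2}^{N}$ into (24).

5. $\hat{d}_{n_{i+1}} = \hat{d}_{n_0}(1 + \delta_{1_i})^n - \delta_{n_i}$, $n = 2, \ldots, N$.

6. $\gamma_1^\dagger = \hat{\gamma}_1/(1 + \delta_{1_i})$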
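The iteration of Alg. 1 can be sketched compactly in Python. The code below is an illustrative reading of steps 3 to 6 for $N \le 5$ and not the implementation used in the experiments. The function names, the dictionary layout and the fixed iteration count (used instead of a convergence test) are assumptions made for brevity, and the final scaling uses $\delta_1$ evaluated at the last iterate.

```python
from math import comb, sqrt


def double_factorial(p):
    """p!! with the convention (-1)!! = 0!! = 1."""
    result = 1
    while p > 1:
        result *= p
        p -= 2
    return result


def deltas(d, w, sigma2):
    """delta_1..delta_N from (24); d = {2: d_2, ..., N: d_N}, w = ||nu||_2^2."""
    N, V = max(d), w * sigma2
    return {n: sum(comb(m, n) * d[m] * double_factorial(m - n - 1) * V ** ((m - n) // 2)
                   for m in range(n + 2, N + 1, 2))
            for n in range(1, N + 1)}


def truncation_norm(gamma0, d, sigma2):
    """Step 3: solve gamma0 = d2*w*sigma2 + 3*d4*w^2*sigma2^2 for w."""
    a, b = 3.0 * d.get(4, 0.0) * sigma2**2, d[2] * sigma2
    if a == 0.0:
        return gamma0 / b
    disc = b * b + 4.0 * a * gamma0
    if disc < 0.0:
        raise ValueError("complex solutions: bias reduction not applicable")
    roots = [(-b + s * sqrt(disc)) / (2.0 * a) for s in (1.0, -1.0)]
    positive = [r for r in roots if r > 0.0]
    if not positive:
        raise ValueError("no positive solution: bias reduction not applicable")
    return min(positive)


def reduce_bias(gamma0, gamma1, d0, sigma2, iterations=50):
    """Steps 2-6 of Alg. 1 with a fixed number of iterations."""
    d = dict(d0)
    for _ in range(iterations):
        w = truncation_norm(gamma0, d, sigma2)                      # step 3
        dl = deltas(d, w, sigma2)                                   # step 4
        d = {n: d0[n] * (1.0 + dl[1]) ** n - dl[n] for n in d0}     # step 5
    w = truncation_norm(gamma0, d, sigma2)
    dl = deltas(d, w, sigma2)
    gamma1_dagger = [g / (1.0 + dl[1]) for g in gamma1]             # step 6
    return d, w, gamma1_dagger


# Asymptotic (noise-free) third-order example with illustrative values:
# d2 = 0.3, d3 = -0.6, ||nu||_2^2 = 0.02, sigma^2 = 1, c = [0.25, 0.13]
scale = 1.0 + 3.0 * (-0.6) * 0.02
d_initial = {2: 0.3 / scale**2, 3: -0.6 / scale**3}     # biased estimates, cf. (28)
d_dagger, w_dagger, gamma1_dagger = reduce_bias(
    gamma0=0.3 * 0.02, gamma1=[0.25 * scale, 0.13 * scale], d0=d_initial, sigma2=1.0)
```

In this idealized setting the iteration recovers the true coefficients, the truncation norm and the unbiased first-order kernel. With finite, noisy data the inputs are themselves estimates, so the output inherits their variance.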

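By construction, the point $d = [d_2 \ldots d_N]^T$, i.e. the true polynomial coefficient values, is an equilibrium point of the system, a fact which is easily verified. Local stability of the equilibrium can be evaluated by examining the eigenvalues of the Jacobian of (32). However, the complexity of the Jacobian of $f$ quickly grows with $N$ and providing general conditions for the stability of equilibrium point $d$ is difficult.

It is instructive to study the convergence properties of (31) for the special case of $N = 3$. The involved equations are relatively simple in this case, allowing for a concise and presentable analysis.

With $N = 3$, an estimate of the Laguerre truncation error is given by $\hat{w}_i = \hat{\gamma}_0/(\hat{d}_{2_i}\sigma^2)$, which inserted into (31) for $n = 2$ and $n = 3$ yields the system for bias reduction

$$\begin{bmatrix} \hat{d}_{2_{i+1}} \\ \hat{d}_{3_{i+1}} \end{bmatrix} = \begin{bmatrix} \hat{d}_{2_0}\left(1 + 3\dfrac{\hat{d}_{3_i}}{\hat{d}_{2_i}}\hat{\gamma}_0\right)^2 \\[2ex] \hat{d}_{3_0}\left(1 + 3\dfrac{\hat{d}_{3_i}}{\hat{d}_{2_i}}\hat{\gamma}_0\right)^3 \end{bmatrix}. \tag{33}$$

Rewriting (33) with $\hat{d}_{n_0}$ given by (28) and exploiting the fact that $\hat{\gamma}_0 = d_2\|\nu\|_2^2\sigma^2$ leads to

$$\begin{bmatrix} \hat{d}_{2_{i+1}} \\ \hat{d}_{3_{i+1}} \end{bmatrix} = \begin{bmatrix} d_2\dfrac{\left(1 + 3V\frac{\hat{d}_{3_i}}{\hat{d}_{2_i}}d_2\right)^2}{(1 + 3Vd_3)^2} \\[3ex] d_3\dfrac{\left(1 + 3V\frac{\hat{d}_{3_i}}{\hat{d}_{2_i}}d_2\right)^3}{(1 + 3Vd_3)^3} \end{bmatrix}, \tag{34}$$

where $V = \|\nu\|_2^2\sigma^2$. The Jacobian of $f$ evaluated at the equilibrium point $\hat{d}_i = d = [d_2 \;\; d_3]^T$ is

$$J_f\Big|_{\hat{d}_i = d} = \frac{9Vd_3}{1 + 3Vd_3} \begin{bmatrix} \dfrac{2d_2}{3d_3} \\[1.5ex] 1 \end{bmatrix} \begin{bmatrix} -\dfrac{d_3}{d_2} & 1 \end{bmatrix}.$$

Clearly $\operatorname{rank} J_f\big|_{\hat{d}_i = d} = 1$ and the nonzero eigenvalue is $\lambda = \frac{3Vd_3}{1 + 3Vd_3}$. The equilibrium point is stable if $|\lambda| < 1$, i.e.

$$\left| \frac{3Vd_3}{1 + 3Vd_3} \right| < 1,$$

which is solved to yield the condition

$$d_3 > -\frac{1}{6V} = -\frac{1}{6\|\nu\|_2^2\sigma^2}. \tag{35}$$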
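The step from the eigenvalue condition to (35) can be spelled out as follows, writing $x = 3Vd_3$ (a shorthand used only here) and recalling that $V > 0$:

```latex
\begin{align*}
\left|\frac{x}{1+x}\right| < 1
  &\iff x^2 < (1+x)^2 \\
  &\iff 0 < 1 + 2x \\
  &\iff x > -\tfrac{1}{2} \\
  &\iff d_3 > -\frac{1}{6V}.
\end{align*}
```

The singular point $x = -1$ is excluded automatically, since it violates $x > -\tfrac{1}{2}$.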

The stability of the equilibrium point guarantees only local convergence as there may be other equilibria. However, since the initialization $\hat{d}_0$ of the algorithm is fixed and readily evaluated by (28), it is possible to prove convergence without addressing all equilibria of the system.

Lemma 7.1. System (33) converges asymptotically and uniformly to the equilibrium point $d = [d_2 \;\; d_3]^T$ if and only if

$$d_3 > -\frac{1}{6V}, \quad V = \|\nu\|_2^2\sigma^2.$$

Proof. See Appendix C.

Due to the uniform convergence of the algorithm, each iteration improves the estimate of the parameters. A conclusion to be drawn from Lemma 7.1 is that the convergence rate of Alg. 1 is determined by the Laguerre truncation error of the linear subsystem impulse response, $\|\nu\|_2^2$. The smaller the truncation error, the weaker the constraint on $d_3$ is and the faster the convergence.


8 Experiments

In this section, numerical experiments demonstrating the strengths and weaknesses of the proposed approach compared to existing methods are provided. Time index $k$ will be omitted from functions for brevity.

In all examples, the input signals were realizations of an i.i.d. zero-mean Gaussian process. Realizations of a different i.i.d. zero-mean Gaussian process were added as measurement noise. Furthermore, in what follows, $\alpha_t$ denotes the true Laguerre parameter of the underlying system whereas $\alpha_a$ denotes the assumed Laguerre parameter of the current model.

8.1 Numerical Experiment: Bias Verification

This example is meant to verify the bias derivations of Theorem 6.4 and Theorem 6.5.

For each of 20 distinct $K \in (100, 100\,000)$, 300 output signals of length $K$ were generated by simulation using a linear block with the impulse response composed of a linear combination of three Laguerre functions with Laguerre parameter $\alpha_t = 0.4$:

$$g = \phi_0 + 0.5\phi_1 + 0.2\phi_4, \tag{36}$$

and the static nonlinearity

$$y = x + d_2x^2 + d_3x^3 = x + 0.2x^2 - 0.4x^3. \tag{37}$$

The input variance was $\sigma^2 = 1$.

For each data set, the polynomial coefficients $\{d_n\}_{n=2}^{3}$ were estimated through (18) and (19) with $N = 3$ and $\alpha_a = \alpha_t = 0.4$. The assumed maximum order of the Laguerre functions was $L = 1$, leaving the fourth-order Laguerre function in impulse response (36) unaccounted for and resulting in a truncation error $\nu = 0.2\phi_4$ in the estimate of the linear block. Fig. 1 shows the identification results as a function of $K$. The variance of the measurement noise was chosen so that the SNR was 0 dB. Alg. 1 for bias reduction was not applied here.

It is apparent from Fig. 1 that the estimates of both $d_2$ and $d_3$ were biased, as predicted. With $K = 100\,000$, the obtained means of $\hat{d}_2$ and $\hat{d}_3$ were 0.2206 and −0.4641, respectively. The estimated biases were thus $\Delta d_2 = 0.0193$ and $\Delta d_3 = -0.0571$. The theoretical bias values according to Theorem 6.5 are $\Delta d_2 = 0.0187$ and $\Delta d_3 = -0.0549$.

Because $N = 3$, it is possible to estimate the Laguerre truncation error squared norm $\|\nu\|_2^2$ via (27). With $K = 100\,000$, the obtained mean value of $\hat{\gamma}_0$ was 0.0199, which inserted together with $d_2$ in (27) gives the estimate of the truncation error squared norm $\|\hat{\nu}\|_2^2 = 0.0398$. This agrees with the theoretical value $\|\nu\|_2^2 = \|0.2\phi_4\|_2^2 = 0.04$.


Figure 1: The obtained means of a) $\hat{d}_2$ and b) $\hat{d}_3$ in (37) ± one standard deviation as a function of the number of samples $K$. The linear subsystem impulse response was given by (36). $N = 3$, $L = 1$, $\alpha_a = 0.4$, SNR = 0 dB. No bias reduction was applied. The dotted lines show the true coefficient values.

8.2 Numerical Experiment: Severity of Model Mismatch

In this example, the effect of the choice of $L$ on the output-polynomial coefficient estimates is demonstrated.

For each of $L = 0, 1, \ldots, 9$, 300 output signals of length $K = 10\,000$ were generated by simulation using a linear block with an impulse response composed of a linear combination of ten Laguerre functions with Laguerre parameter $\alpha_t = 0.4$, all coefficients $c_j = 0.1$,

$$g = 0.1\sum_{j=0}^{9} \phi_j, \tag{38}$$

and the static nonlinearity

$$y = x + d_2x^2 + d_3x^3 = x + 0.3x^2 - 0.6x^3. \tag{39}$$

The input variance was $\sigma^2 = 1$.

The values $N = 3$ and $\alpha_a = 0.4$ were used in the identification of $d_2$ and $d_3$. As the assumed maximum Laguerre order $L$ increases, the truncation error in the impulse response approximation decreases. Hence, the model mismatch is the most severe for $L = 0$, whereas for $L = 9$ there is no model mismatch. Fig. 2 shows the identification results as a function of $L$. The variance of the measurement noise was chosen so that the SNR was 20 dB. Alg. 1 for bias reduction was not applied here.

Fig. 2 verifies that as the truncation error decreases, the bias goes to zero.

8.3 Numerical Experiment: Bias Reduction Performance

Here, the problem of obtaining accurate estimates of the linear and nonlinearblocks separately is considered.


Figure 2: The obtained means of a) $\hat{d}_2$ and b) $\hat{d}_3$ in (39) ± one standard deviation as a function of the assumed maximum function order in the Laguerre expansion $L$. The linear subsystem impulse response was given by (38). $K = 10\,000$, $N = 3$, $\alpha_a = 0.4$, SNR = 20 dB. No bias reduction was applied. The dotted lines show the true coefficient values.

The performance of Alg. 1 for bias reduction was evaluated on two systems: one with a third-order and one with a fifth-order polynomial nonlinearity. With the linear block impulse response

$$g = 0.25\phi_0 + 0.13\phi_1 + 0.1\phi_4 + 0.1\phi_5, \tag{40}$$

and the static nonlinearities

$$y = x + d_2x^2 + d_3x^3 = x + 0.3x^2 - 0.6x^3 \tag{41}$$

and

$$y = x + d_2x^2 + d_3x^3 + d_4x^4 + d_5x^5 = x + 0.3x^2 - 0.6x^3 + 0.1x^4 + 0.1x^5, \tag{42}$$

300 output signals of length $K = 40\,000$ were generated from each system by simulation. The input variance was $\sigma^2 = 1$. The large number of samples was chosen in order to capture asymptotic behavior.

Identification was carried out using $\alpha_a \in (0, 1)$, $L = 1$, and $N = 3$ for (41) and $N = 5$ for (42). Alg. 1 for bias reduction was then applied.

Fig. 3 shows the means ± one standard deviation of $\hat{d}_2$ and $\hat{d}_3$ in (41) as a function of $\alpha_a$. Fig. 4 shows the means ± one standard deviation of $\{\hat{d}_n\}_{n=2}^{5}$ as a function of $\alpha_a$.

The biased estimates of the polynomial coefficients were successfully debiased by Alg. 1, see Fig. 3 and Fig. 4. The method produced high-quality estimates of the polynomial coefficients even for the values of $\alpha_a$ far from the true one, $\alpha_t = 0.4$. However, there is a noticeably larger variance in the estimates obtained after application of the bias reduction algorithm. The variance in the estimates is an effect of the finite number of samples and reduces as the sample size grows.

For $\alpha_a = \alpha_t = 0.4$, the theoretical value of the squared norm of the truncation error is $\|\nu\|_2^2 = 0.1^2 + 0.1^2 = 0.02$. The truncation error accounted for about 20% of the impulse response energy. The mean of the 300 truncation error estimates, associated with $\alpha_a = 0.4$, produced by the bias reduction algorithm in Sec. 7 was $\|\hat{\nu}\|_2^2 = 0.0201$ in the case of $N = 3$ and $\|\hat{\nu}\|_2^2 = 0.0221$ in the case of $N = 5$, in close agreement with the theoretical value.

For the case of $\alpha_a = \alpha_t$, the obtained approximation of the linear subsystem was evaluated. The optimal approximation of impulse response (40) for $L = 1$ (i.e. using only the first two Laguerre functions) is in this case given by $g^* = 0.25\phi_0 + 0.13\phi_1$. Tab. 1 shows the obtained means and standard deviations of $\hat{\gamma}_1$ and $\gamma_1^\dagger$, representing the Laguerre coefficients ($\{c_j\}_{j=0}^{L}$ in (6)) in the expansion of the linear subsystem impulse response before and after the application of bias reduction. It can be seen that the bias reduction algorithm resulted in an improved approximation of the linear part of the Wiener system.
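The truncation figures quoted above follow directly from the coefficients in (40), since the Laguerre functions are orthonormal. A small check, valid only for $\alpha_a = \alpha_t$ where the model and system share the same basis (variable names are illustrative):

```python
# Laguerre coefficients of the impulse response g in (40), indexed by function order
coeffs = {0: 0.25, 1: 0.13, 4: 0.1, 5: 0.1}
L = 1   # assumed maximum Laguerre order in the model

total_energy = sum(c**2 for c in coeffs.values())
truncation_energy = sum(c**2 for j, c in coeffs.items() if j > L)   # ||nu||_2^2
fraction = truncation_energy / total_energy
```

Here `truncation_energy` evaluates to 0.02 and `fraction` to roughly 0.2, matching the stated 20% of the impulse response energy.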

Figure 3: The obtained means of $\hat{d}_2$ and $\hat{d}_3$ in (41) ± one standard deviation as a function of $\alpha_a$. The linear subsystem impulse response was given by (40). a) No bias reduction, b) Bias reduced using Alg. 1. $N = 3$, $L = 1$, $K = 40\,000$, SNR = 15 dB. The dotted lines show the true coefficient values.


Figure 4: The obtained means of $\{\hat{d}_n\}_{n=2}^{5}$ in (42) ± one standard deviation as a function of $\alpha_a$. The linear subsystem impulse response was given by (40). a) No bias reduction, b) Bias reduced using Alg. 1. $N = 5$, $L = 1$, $K = 40\,000$, SNR = 15 dB. The dotted lines show the true coefficient values.

| | $c_0$ | $c_1$ |
|---|---|---|
| $\hat{\gamma}_1$ (before bias reduction) | 0.2139 (0.0016) | 0.1111 (0.0016) |
| $\gamma_1^\dagger$ (after bias reduction) | 0.2495 (0.0048) | 0.1295 (0.0031) |
| $\gamma_1^*$ (optimal) | 0.25 | 0.13 |

Table 1: The obtained means and standard deviations of the estimates of the impulse response Laguerre coefficients for $\alpha_a = \alpha_t = 0.4$ before ($\hat{\gamma}_1$) and after ($\gamma_1^\dagger$) application of bias reduction through Alg. 1.

8.4 Numerical Experiment: Performance Comparison

This example compares the performance of the method of this paper to that of a nonlinear LS (NLS) method and that of the approach used in [5] and [21] where the linear block is modeled as a FIR system. The NLS method assumes a general black-box model for the linear subsystem and a polynomial nonlinearity. It then uses iterative search to minimize the squared error between the simulated model output and the measured output. The described NLS method is a standard approach for identifying Wiener models and is implemented in the MathWorks MATLAB® System Identification Toolbox.

Two Wiener systems were formed. The first used a linear block with transfer function

$$G(z) = \frac{B(z)}{A(z)}, \tag{43}$$

where $B(z) = 0.30z^{-1} - 0.97z^{-2} + 1.25z^{-3} - 0.78z^{-4} + 0.24z^{-5} - 0.029z^{-6}$ and $A(z) = 1 - 3.87z^{-1} + 6.12z^{-2} - 5.05z^{-3} + 2.29z^{-4} - 0.54z^{-5} + 0.052z^{-6}$. $G(z)$ thus has six real poles and five zeros. The second used the linear block

$$G(z) = \frac{0.2z^{-1}}{1 - z^{-1} + 0.4z^{-2}}, \tag{44}$$

which has two complex poles. The static nonlinearity was in both cases given by the Hill function

$$y = 1.2\,\frac{(x+1)^4}{1 + (x+1)^4} - 0.6. \tag{45}$$

Hill functions are widely used in biology to model smooth saturations [9]. The considered systems are not polynomial Wiener systems. Therefore, the resulting VL models are not in the form of (15). However, using the proposed method of this paper, the VL models are projected on the set of polynomial SISO Wiener systems and polynomial approximations of nonlinearity (45) are obtained.

By simulation of the two systems given by (43), (44) and (45), 50 output signals of length $K = 1000$ were generated. The input variance was $\sigma^2 = 1$. For the 50 data sets of the first system, models were identified using all three methods mentioned above, assuming $N = 5$. For the data sets of the second system, only the proposed method of this paper and the NLS method were evaluated. In the former, the value of $\alpha_a$ that yielded the lowest mean squared error (MSE) between the predicted output and the actual output was used. Ten different values of $\alpha_a \in (0, 1)$ were tested and $\alpha_a^* = 0.44$ was shown to give the best fit for the first system and $\alpha_a^* = 0.52$ for the second system. Fig. 5 shows the mean MSE in the models of the first system as a function of $\alpha_a$.

Note that here the focus is on maximizing model fit and therefore no bias reduction was applied.

The obtained models were validated using realizations of the input and output signals not used in the estimation. Tab. 2 summarizes the mean MSE obtained in the validation of the models of the real pole system together with computation times for different $L$ in the proposed method and different orders of the linear black-box and FIR models assumed in the NLS method and the FIR approach, respectively.

Fig. 6 shows the Hill function static nonlinearity in (45) and the polynomial approximation when linear block (43) was used. The two methods yielded nearly identical polynomial approximations, which is why only one is shown in the figure.

Tab. 3 summarizes the mean MSE obtained in the validation of the models of the complex pole system (with linear block (44)) together with computation times for different $L$ in the proposed method and different orders of the linear black-box assumed in the NLS method.

Figure 5: The mean MSE ± one standard deviation in the models of the system given by (43) and (45) obtained using the proposed method for different values of $\alpha_a$.

Figure 6: The true Hill function static nonlinearity (45) and the polynomial approximation yielded by the proposed method. The approximation obtained with the NLS method was nearly identical to the latter and is therefore not shown.

| Method | L / Order | MSE · 10² | Computation time (s) | #P |
|---|---|---|---|---|
| Proposed method | 0 | 0.88 (0.12) | 0.0011 | 1 |
| Proposed method | 1 | 0.86 (0.11) | 0.0018 | 2 |
| Proposed method | 2 | 0.51 (0.095) | 0.0042 | 3 |
| Proposed method | 3 | 0.55 (0.11) | 0.0087 | 4 |
| Nonlinear least squares | 1 | 13.9 (5.1) | 0.48 | 1 |
| Nonlinear least squares | 2 | 0.71 (0.11) | 0.41 | 4 |
| Nonlinear least squares | 3 | 0.49 (0.082) | 0.51 | 6 |
| Nonlinear least squares | 4 | 0.52 (0.089) | 0.64 | 8 |
| FIR approach | 1 | 22.1 (0.54) | 0.0012 | 1 |
| FIR approach | 2 | 10.34 (0.41) | 0.0018 | 2 |
| FIR approach | 3 | 5.65 (0.39) | 0.0043 | 3 |
| FIR approach | 4 | 3.62 (0.32) | 0.0086 | 4 |

Table 2: The mean validation MSE (standard deviation) and computation time per data set for the method proposed in this paper, the standard NLS approach and the FIR approach when identifying the system given by (43) and (45). For each model configuration, the number of unknown parameters estimated in the linear block (#P) is given. $K = 1000$.

| Method | L / Order | MSE · 10² | Computation time (s) | #P |
|---|---|---|---|---|
| Proposed method | 0 | 4.24 (0.19) | 0.0013 | 1 |
| Proposed method | 1 | 2.50 (0.15) | 0.0019 | 2 |
| Proposed method | 2 | 2.33 (0.12) | 0.0036 | 3 |
| Proposed method | 3 | 1.38 (0.12) | 0.0081 | 4 |
| Nonlinear least squares | 1 | 13.9 (3.21) | 0.46 | 1 |
| Nonlinear least squares | 2 | 0.30 (0.094) | 0.26 | 4 |
| Nonlinear least squares | 3 | 0.30 (0.079) | 0.51 | 6 |
| Nonlinear least squares | 4 | 0.31 (0.081) | 0.75 | 8 |

Table 3: The mean validation MSE (standard deviation) and computation time per data set for the method proposed in this paper and the standard NLS approach when identifying the system with complex poles, given by (44) and (45). For each model configuration, the number of unknown parameters estimated in the linear block (#P) is given.

It is evident that in the case of real poles in the underlying linear block, the models produced by the method of this paper outperformed those of the NLS method in terms of parsimony. A linear block with six parameters was required in the models of the NLS method to achieve the same performance as the proposed method delivered with three parameters. Furthermore, the proposed method was 70 to 500 times faster than the NLS method, although it should be remembered that the former required ten executions to find an appropriate $\alpha_a$. The performance of the FIR approach was significantly worse than that of the two other methods.

In the case where the underlying linear block had complex poles, the proposed method struggled and was not capable of producing results on a par with those of the NLS method. The reason is that Laguerre functions have real poles, requiring higher-order expansions to model oscillatory behavior. This is demonstrated in Tab. 3 by the fact that the obtained MSE decreased as the maximum Laguerre order $L$ increased. For oscillating nonlinear dynamics, the Kautz [31] or GOBF bases would be more suitable for representing the Volterra kernels.

8.5 Numerical Experiment: Non-White Input

This example shows the effects on the identification result caused by a colored input signal. The experiment of Sec. 8.3 was repeated using linear block (40) and static nonlinearity (39). The 300 input signals were realizations of an i.i.d. zero-mean Gaussian process with variance $\sigma^2 = 1$ filtered by a digital low-pass Butterworth filter [6] with normalized cut-off frequency $\omega_c \in (0, 1)$, resulting in a correlated signal $u(k)$, $k = 0, \ldots, K-1$. After filtering, the signals were scaled to have energy $\sum_{k=0}^{K-1} |u(k)|^2 = 1$. The inputs were $K = 10\,000$ samples long. No bias reduction was applied.

Fig. 7 shows the means ± one standard deviation of $\hat{d}_2$ and $\hat{d}_3$ in (39) as a function of $\alpha_a$ for different values of the cut-off frequency $\omega_c$. Apparently, the bias in the obtained estimates of the output-polynomial coefficients decreases as $\omega_c$ is made smaller and the same is true for the variance for values of $\alpha_a$ near $\alpha_t$. For $\omega_c = 0.003$, the estimates are of high quality for all values of $\alpha_a$. However, when the filter cut-off frequency is as low as $\omega_c = 0.0005$, the variance is drastically increased.

The results in Fig. 7 show that better estimates of the output-polynomial coefficients are attainable by manipulating the power spectral density (PSD) of the input. It is apparent from the analysis of this paper that whether the obtained parameter estimates will be biased or not depends on the properties of $\varepsilon$ in (7). In the frequency domain, $\varepsilon$ can be expressed as

$$E(n) = Y_\ell(n) - \sum_{j=0}^{L} c_j\Psi_j(n) = \left( H(n) - \sum_{j=0}^{L} c_j\Phi_j(n) \right) U(n). \tag{46}$$

A conclusion to be drawn from (46) is that the linear model only needs to be accurate for the frequencies present in the input in order for $\varepsilon$ to be zero and the method to provide unbiased estimates of the polynomial coefficients. Optimal input design thus involves finding input sequences that only contain excitation in frequency ranges over which the underlying linear block can be well approximated by a finite set of basis functions.

The fact that high quality estimates were obtained for some values of $\alpha_a$ different from the true value, as can be seen in e.g. Fig. 7b, suggests that Laguerre functions with different Laguerre parameters may exhibit similar approximation properties over certain frequency ranges.

Even though input of poor excitation may yield accurate output polynomial coefficient estimates, the quality of the linear part of the resulting Wiener model may be poor and only representative of the underlying system at frequencies present in the input signal.

A final result to note is in Fig. 7d: in the non-asymptotic case, if the PSD of the input is concentrated to a narrow band close to zero, the identification will often fail to produce satisfactory results. The reason is that in this case, the input is near constant and thus contains little amplitude excitation. Matrix B in (58) then becomes ill-conditioned. This does not happen in the asymptotic case unless the input is a pure constant.

Figure 7: The obtained means of d2 and d3 in (39) ± one standard deviation as a function of αa for different values of the input-filter normalized cut-off frequency ωc. a) ωc = 1, b) ωc = 0.013, c) ωc = 0.003, d) ωc = 0.0005. The linear subsystem impulse response was given by (40). N = 3, L = 1, K = 10 000, SNR = 10 dB. The dotted lines show the true coefficient values.


8.6 Application to SPS Identification

To examine the usability of the method in an actual application, it was applied to experimental eye-tracking data in order to obtain a model of the human SPS.

Gaze direction data of test subjects attempting to track a white moving circle against a black background on a computer monitor were recorded using a video-based eye tracker from Smart Eye AB, Sweden. Test subjects were placed 50 cm from the monitor with the monitor center at eye height. The eye-tracker output was the distance in centimeters (horizontal and vertical components separately) between the monitor center and the point where the gaze direction line intersects the plane of the monitor. Eye-tracking data were sampled at fs = 60 Hz.

Input signals were visual stimuli designed using a method presented in [14]. The method generates inputs that possess properties suitable for identification of the SPS. Specifically, the frequency content and amplitude distributions of the signals are formed to excite the essential dynamics and statics of the oculomotor system.

The estimated power spectral density (PSD) and signal amplitude distribution of the horizontal part of a typical visual stimulus are shown in Figure 8.

Figure 8: Typical a) PSD (horizontal axis: frequency in Hz) and b) amplitude distribution of a generated SPS visual stimulus input signal.

The VL parameters were estimated for 10 different data sets of three different test subjects: H1, age 27, H2, age 27, and H3, age 54. Each data set was K = 1560 samples long (∼ 26 s). Laguerre order L = 3 was used and the identification was done for 20 different values of αa. For the αa that yielded the best fit, Wiener models with polynomial order N = 3 were estimated using the proposed method of this paper.

Tab. 4 shows the mean and the standard deviation of the 10 obtained estimates of d2 and d3 in (15) per test subject. In Tab. 5, the performance on validation data (other data sets than the identification ones) of the model of each test subject is compared to that of the second-order (N = 2) VL model of the SPS presented in [15] and to the Wiener model presented in [14]. The Wiener model in [14] comprises a linear block, for which the model is presumed to be of ARX structure, and a piecewise-linear static nonlinearity. In the table, the models are compared in terms of the mean residual sum of squares (RSS) between the measured output and the model output for the 9 different data sets used for validation (the 10th data set being used for identification). Model 1 refers to the VL model in [15], Model 2 refers to the Wiener model in [14] and Model 3 refers to the polynomial Wiener model of this paper.

Coefficient   H1              H2              H3
d2            -0.02 (0.009)   -0.01 (0.007)   -0.06 (0.043)
d3            -0.09 (0.029)   -0.07 (0.031)   -0.12 (0.088)

Table 4: The mean (standard deviation) of the estimated polynomial coefficients for 10 data sets of eye-tracking data in three test subjects: H1, H2 and H3.

Model   H1            H2            H3            #Parameters
1       0.61 (0.15)   0.67 (0.19)   0.89 (0.31)   28
2       6.31 (2.01)   5.21 (2.02)   8.31 (3.77)   9
3       0.81 (0.29)   1.13 (0.30)   2.18 (0.81)   6

Table 5: The mean RSS when validating the models estimated from one data set of each of three test subjects (H1, H2 and H3) using the other 9 data sets. The standard deviation is given within parentheses after each mean RSS value. The last column gives the total number of parameters in the obtained model.

The results in Tab. 5 show that the polynomial Wiener models of this paper (Model 3) outperformed those in [14] (Model 2) and that the performance loss when projecting full VL models (Model 1) on the set of polynomial SISO Wiener models (Model 3) was small.
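The validation measure used in Tab. 5 can be outlined as follows. The Python sketch below computes the RSS between a measured and a simulated output and summarises it over several validation sets. The function names, the `simulate` callable and the toy numbers are illustrative assumptions; they are not the code or the data behind the table.

```python
from statistics import mean, stdev


def rss(measured, simulated):
    """Residual sum of squares between measured and model output."""
    return sum((y - y_hat) ** 2 for y, y_hat in zip(measured, simulated))


def validation_summary(validation_sets, simulate):
    """Mean and standard deviation of the RSS over the validation sets.

    validation_sets: list of (input_sequence, measured_output) pairs.
    simulate: hypothetical callable mapping an input sequence to model output.
    """
    scores = [rss(y, simulate(u)) for u, y in validation_sets]
    return mean(scores), stdev(scores)


# Toy usage with a made-up static model y = 0.9 * u (not an SPS model).
toy_sets = [
    ([1.0, 2.0], [1.0, 2.0]),
    ([0.0, 1.0], [0.0, 1.0]),
    ([2.0, 2.0], [2.0, 2.0]),
]
avg_rss, std_rss = validation_summary(toy_sets, lambda u: [0.9 * x for x in u])
```

With one model identified per subject, the same summary would be computed over the 9 held-out data sets of that subject.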


The analyses of this paper reveal that in the case of model mismatch, the estimated VL model of a polynomial SISO Wiener system with i.i.d. Gaussian input is also a polynomial SISO Wiener system but with the linear and the nonlinear blocks different from those of the true system. The deviation in the parameter estimates from the true values is referred to as bias due to model mismatch and explicit expressions for it are provided in Theorem 6.4 and Theorem 6.5. Furthermore, the derived bias expressions are used to design a bias reduction algorithm. The algorithm is instrumental when the goal of identification is to accurately model the linear and nonlinear blocks separately, rather than finding an overall model that gives the best fit. It is demonstrated that in the case of model mismatch, those two objectives are not equivalent.

The strength of the proposed method compared to previous similar approaches (see e.g. [5] and [21]) is its ability to yield satisfactory results even when the dynamics of the linear block are IIR. Moreover, when the underlying system has real poles, the models produced by this method are shown to outperform those of standard NLS approaches in terms of parsimony. Finally, since the method effectively splits the nonlinear problem into two linear problems, computation times are significantly reduced compared to the NLS method.

An expected weakness of the method is that its performance declines when the underlying linear subsystem has complex poles. In such cases, increasing the maximum function order in the Laguerre expansion improves the quality of the linear block model, but at the cost of a significant increase in the number of VL coefficients to be estimated in the first step of the method. This, in turn, calls for a larger amount of input-output observations to preserve identifiability. An approach to alleviate this problem would be to replace the Laguerre functions in the expansion of the impulse response with a more general basis such as the Kautz functions [31] or the GOBFs [30].

The presented method requires the selection of the Laguerre parameter α. The results indicate that the choice of the parameter is not crucial for output-polynomial estimation, but that it has a substantial effect on the overall model fit. In the examples of this paper, α was found by gridding. Owing to the low computation times of the method and the scalar nature of the parameter, such an approach is tractable. However, if the basis of choice is multi-parametric, other means of establishing the relevant time constants of the linear dynamics may be more appropriate. In [30], the GOBFs are constructed using pole estimates obtained from the Best Linear Approximation (BLA), as described in [26]. This approach would also be suitable for a Kautz basis.

Another drawback of the presented method is the fact that the bias reduction algorithm, in the form it is presented here, is only applicable if the output polynomials are of at most fifth order. Furthermore, the algorithm is shown to diverge under certain conditions. It may be possible to modify the algorithm to account for situations with higher output-polynomial orders by making some appropriate approximations, but this was deemed to be outside the scope of this paper.

Throughout the theoretical part of the paper, it is assumed that there is no model mismatch in the static nonlinearity, i.e. that the true order of the output polynomial is not higher than the assumed order. In the presented numerical example involving the Hill function nonlinearity, the assumption of a finite-order output polynomial is invalid. As a result, the obtained parameter estimates are subject to an additional bias. The effect of model mismatch in the nonlinear block on the parameter estimates is a topic for future studies.

In the numerical example in Sec. 8.3, the estimate of the truncation error square-norm produced by the bias reduction algorithm is shown to be in close agreement with the theoretical value. This is a useful result, as knowledge of the truncation error is often important.

When applying the method to experimental eye-tracking data, the produced SPS models are shown to perform better than models obtained in previous research. The reason is that the generality of the VL model removes the potentially invalid assumptions about the linear part. Further analysis not presented here shows that the properties of the polynomial static nonlinearities obtained from real eye-tracking data in this paper resemble those of the piecewise-linear nonlinearities given in [14], which strengthens the assumption that the SPS behaves as a Wiener system.

A Proof of Theorem 6.4

Proof. The output of system (14) is given by

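\[
\begin{aligned}
y &= (\varphi_1^T \gamma_1^* + \varepsilon) + d_2 (\varphi_1^T \gamma_1^* + \varepsilon)^2 + \dots + d_N (\varphi_1^T \gamma_1^* + \varepsilon)^N + \zeta \\
  &= \varphi_1^T \gamma_1^* + d_2 (\varphi_1^T \gamma_1^*)^2 + \dots + d_N (\varphi_1^T \gamma_1^*)^N + g(\varepsilon) + \zeta \\
  &= \varphi^T \gamma^* + g(\varepsilon) + \zeta,
\end{aligned} \tag{47}
\]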

where γ∗1 = c, given in (16), are the coefficients that render the best Laguerre approximation of the linear subsystem in the least-squares sense, γ∗ are the VL coefficients without model mismatch, ε is the truncation error defined in (7), ζ is measurement noise, ϕ1 is a vector of the first L + 1 Laguerre filters convolved with the input signal u, and ϕ is given in (11). The term g(ε) represents the effect on the system output due to the dynamics in the linear block that are not captured by the first L + 1 Laguerre functions and can be expressed as

\[
g(\varepsilon) = \sum_{n=0}^{N-1} \sum_{m=n+1}^{N} \binom{m}{n} d_m \varepsilon^{m-n} (\varphi_1^T \gamma_1^*)^n.
\]

Note that ϕn can be written in terms of ϕ as

\[
\varphi_n = (\varphi^T I_n)^T, \quad n = 0, 1, \dots, \tag{48}
\]

where the matrix $I_n \in \mathbb{R}^{M \times (L+1)^n}$ is given by

\[
I_n = \begin{bmatrix} s_{i_1} & s_{i_2} & \dots & s_{i_{(L+1)^n}} \end{bmatrix} \tag{49}
\]

with $s_j$ being the j:th column of the M × M identity matrix I and $i_1, \dots, i_{(L+1)^n}$ the indices of ϕn in ϕ.

With samples y(0), y(1), . . . , y(K − 1) of the output in (47) and ζ being zero-mean i.i.d. Gaussian noise, the LS estimate of the VL coefficients γ is

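\[
\begin{aligned}
\hat{\gamma} &= \Bigl( \sum_{k=0}^{K-1} \varphi \varphi^T \Bigr)^{-1} \sum_{k=0}^{K-1} \varphi y
 = \Bigl( \sum_{k=0}^{K-1} \varphi \varphi^T \Bigr)^{-1} \sum_{k=0}^{K-1} \varphi (\varphi^T \gamma^* + g(\varepsilon) + \zeta) \\
 &= \gamma^* + \Bigl( \sum_{k=0}^{K-1} \varphi \varphi^T \Bigr)^{-1} \sum_{k=0}^{K-1} \varphi (g(\varepsilon) + \zeta) = \gamma^* + \Delta\gamma.
\end{aligned}
\]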

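Letting K approach infinity gives the deviation of $\hat{\gamma}$ from γ∗ due to model mismatch:

\[
\begin{aligned}
\Delta\gamma &= A^{-1} \lim_{K\to\infty} \frac{1}{K} \sum_{k=0}^{K-1} \varphi\, g(\varepsilon) \\
 &= A^{-1} \sum_{n=0}^{N-1} \sum_{m=n+1}^{N} \lim_{K\to\infty} \frac{1}{K} \sum_{k=0}^{K-1} \varphi \binom{m}{n} d_m \varepsilon^{m-n} (\varphi_1^T \gamma_1^*)^n \\
 &= A^{-1} \sum_{n=0}^{N-1} \sum_{m=n+1}^{N} T_{m,n},
\end{aligned} \tag{50}
\]

where $A = \lim_{K\to\infty} \frac{1}{K} \sum_{k=0}^{K-1} \varphi \varphi^T$. The zero-mean i.i.d. Gaussian process ζ is independent of all elements in ϕ and is therefore averaged out.

The truncation error ε defined in (7) is independent of the elements in ϕ and ϕ1 with respect to the averaging operator, due to Lemma 6.2. Hence, the terms of the sum in (50) are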

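\[
\begin{aligned}
T_{m,n} &= \binom{m}{n} d_m \lim_{K\to\infty} \frac{1}{K} \sum_{k=0}^{K-1} \varphi\, \varepsilon^{m-n} (\varphi_1^T \gamma_1^*)^n \\
 &= \binom{m}{n} d_m \lim_{K\to\infty} \frac{1}{K} \sum_{k=0}^{K-1} \varepsilon^{m-n} \, \frac{1}{K} \sum_{k=0}^{K-1} \varphi (\varphi_1^T \gamma_1^*)^n.
\end{aligned}
\]

Applying Lemma 6.3, it can be concluded that $T_{m,n}$ is zero if m − n is odd.

For n ≠ 0 and m − n even, $T_{m,n}$ is given by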

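\[
\begin{aligned}
T_{m,n} &= \binom{m}{n} d_m (m-n-1)!! \, \|\nu\|_2^{m-n} \sigma^{m-n} \lim_{K\to\infty} \frac{1}{K} \sum_{k=0}^{K-1} \varphi (\varphi_1^T \gamma_1^*)^n \\
 &= \frac{C_{m,n}}{d_n} \lim_{K\to\infty} \frac{1}{K} \sum_{k=0}^{K-1} \varphi \varphi_n^T \gamma_n^* \\
 &= \frac{C_{m,n}}{d_n} \lim_{K\to\infty} \frac{1}{K} \sum_{k=0}^{K-1} \varphi \varphi^T I_n \gamma_n^* \\
 &= \frac{C_{m,n}}{d_n} A I_n \gamma_n^*,
\end{aligned} \tag{51}
\]

where

\[
C_{m,n} = \binom{m}{n} d_m (m-n-1)!! \, (\|\nu\|_2 \sigma)^{m-n}.
\]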

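In the simplification of (51), (48) and (17) were used. The term $T_{m,0}$, for even m, is given by

\[
T_{m,0} = C_{m,0} \lim_{K\to\infty} \frac{1}{K} \sum_{k=0}^{K-1} \varphi \varphi^T I_0, \tag{52}
\]

where it was used that $\varphi^T I_0 = 1$, with $I_0$ defined in (49).

Inserting (52) and (51) into (50) yields

\[
\Delta\gamma = I_0 \delta_0 + \sum_{n=1}^{N-2} I_n \frac{\delta_n}{d_n} \gamma_n^*, \tag{53}
\]

where

\[
\delta_n = \sum_{m=n+2}^{N} \frac{(-1)^{m-n} + 1}{2} C_{m,n}.
\]

The changes in the limits of the sums are due to zero terms.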
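As a numerical illustration of the constants appearing in (51) to (53), the following Python sketch evaluates $C_{m,n}$ and $\delta_n$ for given polynomial coefficients. The helper names, the dictionary layout of the coefficients and the numerical values are assumptions made for illustration only; `s` stands for the product $\|\nu\|_2 \sigma$.

```python
import math


def double_factorial(k):
    """k!! for k >= -1, with (-1)!! = 0!! = 1!! = 1."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result


def c_coeff(m, n, d, s):
    """C_{m,n} = binom(m, n) * d_m * (m - n - 1)!! * s**(m - n)."""
    return math.comb(m, n) * d[m] * double_factorial(m - n - 1) * s ** (m - n)


def delta(n, d, s, order):
    """delta_n: sum of C_{m,n} over m = n+2, n+4, ..., up to the order N.

    The factor ((-1)**(m-n) + 1) / 2 in the text keeps only even m - n,
    which is what the step of 2 in the range implements.
    """
    return sum(c_coeff(m, n, d, s) for m in range(n + 2, order + 1, 2))


# Hypothetical third-order example: d_1 = 1, d_2 = 0.3, d_3 = -0.6, s**2 = 0.1.
d_example = {1: 1.0, 2: 0.3, 3: -0.6}
s_example = 0.1 ** 0.5
delta_0 = delta(0, d_example, s_example, 3)  # equals d_2 * s**2
delta_1 = delta(1, d_example, s_example, 3)  # equals 3 * d_3 * s**2
```

For N = 3, only $C_{2,0}$ and $C_{3,1}$ contribute, so $\delta_0 = d_2 s^2$ and $\delta_1 = 3 d_3 s^2$, while $\delta_2 = \delta_3 = 0$.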

B Proof of Theorem 6.5

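Proof. Because γ∗0 = 0 (see (16)), the obtained estimate of the constant term is

\[
\hat{\gamma}_0 = \Delta\gamma_0 = \sum_{m=2}^{N} \frac{(-1)^m + 1}{2} C_{m,0} = \delta_0.
\]

The expression for the estimated coefficients associated with the n:th-order Volterra kernel for n ≠ 0 is obtained through (53):

\[
\hat{\gamma}_n = \gamma_n^* + \Delta\gamma_n
 = \gamma_n^* \Bigl( 1 + \frac{1}{d_n} \sum_{m=n+2}^{N} \frac{(-1)^{m-n} + 1}{2} C_{m,n} \Bigr)
 = \gamma_n^* \Bigl( 1 + \frac{\delta_n}{d_n} \Bigr). \tag{54}
\]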

Here, it should be noted that δn = 0 for n = N − 1 and n = N.

Since d1 = 1, the estimated coefficients associated with the first-order Volterra kernel are obtained as

\[
\hat{\gamma}_1 = \gamma_1^* + \Delta\gamma_1
 = \gamma_1^* \Bigl( 1 + \sum_{m=3}^{N} \frac{(-1)^{m-1} + 1}{2} C_{m,1} \Bigr)
 = \gamma_1^* (1 + \delta_1). \tag{55}
\]

With the help of (13), (17), (54) and (55), the model output can be written as

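\[
\begin{aligned}
\hat{y} &= \varphi^T \hat{\gamma} = \sum_{n=0}^{N} \varphi_n^T \hat{\gamma}_n \\
 &= \delta_0 + \varphi_1^T \gamma_1^* (1 + \delta_1) + \sum_{n=2}^{N} \varphi_n^T \gamma_n^* \Bigl( 1 + \frac{\delta_n}{d_n} \Bigr) \\
 &= \varphi_1^T \gamma_1^* + \sum_{n=2}^{N} d_n (\varphi_1^T \gamma_1^*)^n + \delta_0 + \delta_1 (\varphi_1^T \gamma_1^*) + \sum_{n=2}^{N-2} \delta_n (\varphi_1^T \gamma_1^*)^n.
\end{aligned} \tag{56}
\]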

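By means of (55), the expression in (56) can be written as

\[
\hat{y} = \delta_0 + \varphi_1^T \hat{\gamma}_1 + \sum_{n=2}^{N} \frac{d_n + \delta_n}{(1 + \delta_1)^n} (\varphi_1^T \hat{\gamma}_1)^n. \tag{57}
\]

Eq. (57) can then be rewritten as

\[
z = \sum_{n=2}^{N} \frac{d_n + \delta_n}{(1 + \delta_1)^n} (\varphi_1^T \hat{\gamma}_1)^n
  = \phi^T \begin{bmatrix} \dfrac{d_2 + \delta_2}{(1 + \delta_1)^2} & \dots & \dfrac{d_N + \delta_N}{(1 + \delta_1)^N} \end{bmatrix}^T,
\]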

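where $z = \hat{y} - \delta_0 - \varphi_1^T \hat{\gamma}_1$ and φ is given by (20).

The asymptotic estimates of the polynomial coefficients, $d = [d_2 \; d_3 \; \dots \; d_N]^T$, obtained by the method in Sec. 5 are

\[
\hat{d} = B^{-1} \lim_{K\to\infty} \frac{1}{K} \sum_{k=0}^{K-1} \phi z
 = \begin{bmatrix} \dfrac{d_2 + \delta_2}{(1 + \delta_1)^2} \\ \vdots \\ \dfrac{d_N + \delta_N}{(1 + \delta_1)^N} \end{bmatrix}, \tag{58}
\]

where $B = \lim_{K\to\infty} \frac{1}{K} \sum_{k=0}^{K-1} \phi \phi^T$. The expression in (58) can be written as

\[
\hat{d} = d + \Delta d = \begin{bmatrix} d_2 \\ \vdots \\ d_N \end{bmatrix} + \begin{bmatrix} \Delta d_2 \\ \vdots \\ \Delta d_N \end{bmatrix},
\]

where ∆d denotes the bias whose elements are

\[
\Delta d_n = d_n \frac{1 - (1 + \delta_1)^n}{(1 + \delta_1)^n} + \frac{\delta_n}{(1 + \delta_1)^n}. \tag{59}
\]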

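By virtue of the equality $(1 + \delta_1)^n = \sum_{p=0}^{n} \binom{n}{p} \delta_1^p$, (59) becomes

\[
\Delta d_n = \frac{\delta_n - d_n \sum_{p=1}^{n} \binom{n}{p} \delta_1^p}{1 + \sum_{p=1}^{n} \binom{n}{p} \delta_1^p}. \tag{60}
\]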
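The step from (59) to (60) is a binomial expansion and can be checked numerically. The Python sketch below evaluates both forms of the bias for arbitrary inputs; the function names and the example values are hypothetical and serve only to illustrate that the two expressions agree.

```python
import math


def bias_closed_form(d_n, delta_1, delta_n, n):
    """Bias of the n:th polynomial coefficient written as in (59)."""
    scale = (1.0 + delta_1) ** n
    return d_n * (1.0 - scale) / scale + delta_n / scale


def bias_binomial_form(d_n, delta_1, delta_n, n):
    """The same bias written with the binomial sum, as in (60)."""
    binomial_sum = sum(math.comb(n, p) * delta_1 ** p for p in range(1, n + 1))
    return (delta_n - d_n * binomial_sum) / (1.0 + binomial_sum)


# Hypothetical values: d_2 = 0.3, delta_1 = -0.18, delta_2 = 0.
bias_a = bias_closed_form(0.3, -0.18, 0.0, 2)
bias_b = bias_binomial_form(0.3, -0.18, 0.0, 2)
```

Both expressions require 1 + δ1 ≠ 0; in the example the two values coincide up to rounding.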

C Proof of Lemma 7.1

Proof. Consider the quotient

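\[
q_i = \frac{\hat{d}_{3,i}}{\hat{d}_{2,i}}. \tag{61}
\]

By combining (33) and (61), the dynamics of $q_i$ are

\[
q_{i+1} = \frac{\hat{d}_{3,i+1}}{\hat{d}_{2,i+1}}
 = \frac{\hat{d}_{3,0}}{\hat{d}_{2,0}} \Bigl( 1 + 3 \frac{\hat{d}_{3,i}}{\hat{d}_{2,i}} \hat{\gamma}_0 \Bigr)
 = 3 \hat{\gamma}_0 q_0 q_i + q_0, \tag{62}
\]

that is initialized at $q_0$ using (28):

\[
q_0 = \frac{\hat{d}_{3,0}}{\hat{d}_{2,0}}
 = \frac{d_3}{(1 + 3 V d_3)^3} \Big/ \frac{d_2}{(1 + 3 V d_3)^2}
 = \frac{d_3}{d_2} \cdot \frac{1}{1 + 3 V d_3}.
\]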

The closed-form solution to (62) is

\[
q_i = q_0 \sum_{m=0}^{i} (3 \hat{\gamma}_0 q_0)^m
 = q_0 \sum_{m=0}^{i} \Bigl( \frac{3 V d_3}{1 + 3 V d_3} \Bigr)^m
 = q_0 \sum_{m=0}^{i} \lambda^m, \tag{63}
\]

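where $\hat{\gamma}_0 = \|\nu\|_2^2 \sigma^2 d_2 = V d_2$ was used. Letting the number of iterations approach infinity turns (63) into a geometric series converging if and only if |λ| < 1. The convergence condition is thus equivalent to (35). The sum q of the series is

\[
q = \lim_{i\to\infty} q_i = \lim_{i\to\infty} q_0 \sum_{m=0}^{i} (3 \hat{\gamma}_0 q_0)^m = \frac{q_0}{1 - 3 \hat{\gamma}_0 q_0} = \frac{d_3}{d_2}. \tag{64}
\]

Note that since $q_i$ is a geometric series, it converges uniformly to q and that the convergence rate is determined by λ. Combining (34) and (64) hence gives

\[
\lim_{i\to\infty} \hat{d}_{i+1} =
\begin{bmatrix}
\dfrac{d_2 \bigl( 1 + 3 V \lim_{i\to\infty} \frac{\hat{d}_{3,i}}{\hat{d}_{2,i}} d_2 \bigr)^2}{(1 + 3 V d_3)^2} \\[2ex]
\dfrac{d_3 \bigl( 1 + 3 V \lim_{i\to\infty} \frac{\hat{d}_{3,i}}{\hat{d}_{2,i}} d_2 \bigr)^3}{(1 + 3 V d_3)^3}
\end{bmatrix}
=
\begin{bmatrix}
\dfrac{d_2 \bigl( 1 + 3 V \frac{d_3}{d_2} d_2 \bigr)^2}{(1 + 3 V d_3)^2} \\[2ex]
\dfrac{d_3 \bigl( 1 + 3 V \frac{d_3}{d_2} d_2 \bigr)^3}{(1 + 3 V d_3)^3}
\end{bmatrix}
=
\begin{bmatrix} d_2 \\ d_3 \end{bmatrix} = d,
\]

if and only if $d_3 > -\frac{1}{6V}$. Since $q_i$ converges uniformly, so does $\hat{d}_i$.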
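The convergence behaviour stated in the lemma can be illustrated by iterating the recursion (62) directly. The Python sketch below uses hypothetical values of d2, d3 and V; the function name and the iteration count are likewise assumptions made for the illustration.

```python
def iterate_quotient(d2, d3, v, iterations=200):
    """Iterate q_{i+1} = 3*gamma0*q0*q_i + q0 from q_0, following (62).

    d2, d3: true polynomial coefficients (hypothetical values here).
    v: the constant V = (||nu||_2 * sigma)**2.
    """
    q0 = (d3 / d2) / (1.0 + 3.0 * v * d3)
    gamma0 = v * d2
    q = q0
    for _ in range(iterations):
        q = 3.0 * gamma0 * q0 * q + q0
    return q


# d3 = -0.6 > -1/(6V) = -1.67 for V = 0.1: the iteration settles at d3/d2 = -2.
q_convergent = iterate_quotient(0.3, -0.6, 0.1)

# d3 = -2.0 < -1/(6V): here lambda = -1.5 and the iterates grow without bound.
q_divergent = iterate_quotient(0.3, -2.0, 0.1)
```

In the first case λ = 3Vd3/(1 + 3Vd3) ≈ −0.22, so the geometric series converges quickly; in the second case |λ| > 1 and the condition of the lemma is violated.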

References

[1] E. W. Bai. An Optimal Two-Stage Identification Algorithm for Hammerstein-Wiener Nonlinear Systems. Automatica, 34:3, 333–338, 1998.

[2] R. Bednarik, T. Kinnunen, A. Mihaila, P. Fränti. Eye-Movements as a Biometric. Image Analysis, Springer Berlin Heidelberg, 780–789, 2005.

[3] S. A. Billings. Identification of nonlinear systems – A survey. IEE Proceedings D (Control Theory and Applications), 127:6, 272–285, 1980.

[4] S. Boyd, L. Chua. Fading Memory and the Problem of Approximating Nonlinear Operators with Volterra Series. IEEE Transactions on Circuits and Systems, 32:11, 1150–1161, 1985.

[5] P. Celka, N. J. Bershad, J-M. Vesin. Stochastic Gradient Identification of Polynomial Wiener Systems: Analysis and Application. IEEE Transactions on Signal Processing, 49:2, 301–313, 2001.

[6] B. Giovanni, R. Sorrentino. Electronic filter simulation and design. McGraw-Hill Professional, 17–20, 2007.

[7] F. Giri, E. W. Bai. Block-oriented Nonlinear System Identification. Lecture Notes in Control and Information Sciences, Springer, 2010.

[8] G. A. Glentis, P. Koukoulas, N. Kalouptsidis. Efficient Algorithms for Volterra System Identification. IEEE Transactions on Signal Processing, 47:11, 3042–3057, 1999.

[9] S. Goutelle, M. Maurin, F. Rougier, X. Barbaut, L. Bourguignon, M. Ducher, P. Maire. The Hill equation: a review of its capabilities in pharmacological modelling. Fundamental and Clinical Pharmacology, 22:6, 633–648, 2008.

[10] W. Greblicki. Nonparametric identification of Wiener systems by orthogonal series. IEEE Transactions on Automatic Control, 39:10, 2077–2086, 1994.


[11] S. Heinen, E. Potapchuk, S. Watamaniuk. Small foveal stimuli render smooth pursuit less smooth. Journal of Vision, 14:10, article 494, 2014.

[12] P. S. C. Heuberger, P. M. J. Van den Hof, B. Wahlberg (Eds.). Modelling and identification with rational orthogonal basis functions. London: Springer, 2005.

[13] Y. Izawa, S. Hisao. Activity of Fixation Neurons in the Monkey Frontal Eye Field During Smooth Pursuit Eye Movements. Journal of Neurophysiology, 112:2, 249–262, 2014.

[14] D. Jansson, O. Rosen, A. Medvedev. Parametric and Nonparametric Analysis of Eye-Tracking Data by Anomaly Detection. IEEE Transactions on Control Systems Technology, Issue 99, 2015.

[15] D. Jansson, A. Medvedev. Volterra modeling of the Smooth Pursuit System with application to motor symptoms characterization in Parkinson’s disease. IEEE European Control Conference, Strasbourg, 2014.

[16] D. Jansson, A. Medvedev. System Identification of Wiener Systems via Volterra-Laguerre Models: Application to Human Smooth Pursuit Analysis. IEEE European Control Conference, Linz, 2015.

[17] D. Jansson, A. Medvedev. Identification of Polynomial Wiener Systems via Volterra-Laguerre Series with Model Mismatch. 1st Conference on Modelling, Identification and Control of Nonlinear Systems (MICNON), Saint Petersburg, 2015.

[18] P. Kasprowski, J. Ober. Eye Movements in Biometrics. Biometric Authentication, Springer Berlin Heidelberg, 248–258, 2004.

[19] B. V. Gnedenko, A. N. Kolmogorov. Limit distributions for sums of independent random variables. Addison-Wesley Mathematics Series, Addison-Wesley, Cambridge, MA, 1954.

[20] M. Korenberg, I. Hunter. The Identification of Nonlinear Biological Systems: Volterra Kernel Approaches. Annals of Biomedical Engineering, 24:2, 250–268, 1996.

[21] S. Lacy, D. Bernstein. Identification of FIR Wiener systems with unknown, non-invertible, polynomial non-linearities. International Journal of Control, 76:15, 1500–1507, 2003.

[22] L. Ljung. System Identification: Theory for the User. Englewood Cliffs, NJ: Prentice-Hall, 1987.

[23] S. Marino, E. Sessam, G. Di Lorenzo, P. Lanzafame, G. Scullica, A. Bramanti, F. La Rosa, G. Iannizzotto, P. Bramanti, P. Di Bella. Quantitative Analysis of Pursuit Ocular Movements in Parkinson’s Disease by Using a Video-Based Eye-Tracking System. European Neurology, 58:4, 193–197, 2007.



[25] M. Pawlak, Z. Hasiewicz, P. Wachel. On Nonparametric Identification of Wiener systems. IEEE Transactions on Signal Processing, 55:2, 482–492, 2007.

[26] R. Pintelon, J. Schoukens. System identification: a frequency domain approach (2nd ed.). Wiley-IEEE Press, 2012.

[27] L. Rayleigh. On the character of the complete radiation at a given temperature. Philosophical Magazine Series 5, 27, 460–469, 2000.

[28] W. J. Rugh. Nonlinear System Theory: The Volterra-Wiener Approach. Johns Hopkins University Press, Baltimore, 1981.

[29] M. Schetzen. The Volterra and Wiener Theories of Nonlinear Systems. Malabar, FL: Krieger, 1980.

[30] K. Tiels, J. Schoukens. Wiener system identification with generalized orthonormal basis functions. Automatica, 50, 3147–3154, 2014.

[31] P. M. J. Van Den Hof, P. S. C. Heuberger, J. Bokor. System Identification with Generalized Orthonormal Basis Functions. Automatica, 31, 1821–1834, 1995.

[32] S. Van Vaerenbergh, J. Via, I. Santamaria. Blind Identification of SIMO Wiener Systems Based on Kernel Canonical Correlation Analysis. IEEE Transactions on Signal Processing, 61:9, 2219–2230, 2013.

[33] B. Wahlberg. System Identification Using Laguerre Models. IEEE Transactions on Automatic Control, 36:5, 551–562, 1991.

[34] D. Wang, F. Ding. Least squares based and gradient based iterative identification for Wiener nonlinear systems. Signal Processing, 91:5, 1182–1189, 2011.

[35] D. Westwick, M. Verhaegen. Identifying MIMO Wiener systems using subspace model identification methods. Signal Processing, 52:2, 235–258, 1996.

[36] N. Wiener. Nonlinear problems in random theory. Wiley, New York, 1958.

[37] T. Wigren. Convergence analysis of recursive identification algorithms based on the Wiener model. IEEE Transactions on Automatic Control, 39:11, 2191–2206, 1994.

[38] A. Winkelbauer. Moments and absolute moments of the normal distribution. arXiv preprint arXiv:1209.4340, 2012.


[39] W. B. Wu. Asymptotic theory for stationary processes. Statistics and Its Interface, 0:1–20, 2011.

[40] C. Zhang, Y. Zhao, J. Triesch, B. E. Shi. Intrinsically motivated learning of visual motion perception and smooth pursuit. IEEE International Conference on Robotics and Automation, 2014.


Paper IV


Non-Parametric Analysis of Eye-Movement Data by Anomaly Detection∗

Daniel Jansson and Alexander Medvedev

Department of Information Technology, Uppsala University
Uppsala, Sweden


Abstract

A non-parametric approach for distinguishing between individuals by means of recorded eye movements is suggested. The method is based on the principles of stochastic anomaly detection and relies on measured data for probability distribution estimation and evaluation. For visual stimuli that excite the essential nonlinear dynamics of the oculomotor system, mean gaze trajectories and characterizations of their uncertainty are approximated per individual on the basis of eye-tracking data. With this information, non-parametric models called eye-tracking profiles are established, against which independently acquired data sets are statistically tested to evaluate their anomaly scores, i.e. to give the probability that they do not belong to said profiles. Both Gaussian function fitting and kernel density estimation (KDE) techniques are used for probability density estimation. It is shown that the presented method yields promising results in terms of individual classification based on eye movements and that using the KDE method for trajectory distribution estimation is favorable compared to normal distribution fitting. Furthermore, it is shown that by complementing the gaze direction measurements with estimates of gaze velocity and thereby taking the dynamical nature of the SPS into account, even better results in terms of individual classification are achieved. In particular, position/velocity profiles are shown to better distinguish between the eye movements of Parkinson patients and those of healthy controls.



1 Introduction

Research has shown that various medical conditions impair the oculomotor system to different degrees. For example, Huntington’s chorea [1], schizophrenia [10] and Parkinson’s disease (PD) [3] affect the smooth pursuit system (SPS) negatively, motivating the search for accurate quantification methods, which could then be used as diagnostic or even staging tools.

If measures can be found that effectively distinguish between healthy individuals using eye movements, they are also likely to detect the discrepancies in the oculomotor system resulting from a clinical condition.

An attempt at such a measure, mentioned in previous research [5, 9], is the smooth pursuit gain (SPG). The SPG is defined as the ratio of the angular velocity of the eye to that of a moving target during smooth pursuit and has been claimed to vary between individuals of different age. It is also said to be lower in subjects with PD [9]. Although this is an interesting finding, the SPG is not an exhaustive measure of the oculomotor system’s function as it is descriptive of only one point in the SPS frequency characteristics. In [6], it was suggested to model the SPS as a dynamical system and use eye-tracking data to identify its unknown parameters, thus introducing the dynamic SPG (DSPG) measure.

DSPG yields full dynamic models per individual, but may be unreliable when the model structure does not correctly represent the underlying system. Problems may also arise when the studied data set contains saccades or other types of eye movement not governed by the smooth pursuit mechanism and hence not properly described by the selected model. It is therefore of interest to find a non-parametric approach to use as a supplement to the model-based methods. Non-parametric methods rely entirely on data and no modeling assumptions about the studied system are made.

In this paper, such a model-free approach for distinguishing between individuals on the basis of their recorded eye movements is derived. It relies on the principles of anomaly detection and uses statistical methods to find deviating data sets. Visual stimuli consisting of a white moving circle against a black background on a computer monitor are presented to test subjects multiple times. Normal (mean) position/velocity trajectories and associated gaze trajectory distributions are estimated from the gathered eye-tracking data to establish individualized non-parametric models or eye-tracking profiles. An independently acquired data set of eye movement can then be tested against the estimated probability density functions (PDFs) to establish any significant deviation from the profile. The trajectory PDFs are estimated using kernel density estimation (KDE). In KDE nothing is presumed about the underlying distribution to be estimated. It is thus a non-parametric way of estimating an unknown PDF, in contrast to parametric methods such as normal distribution fitting.

To include gaze velocity data in the eye-tracking profiles, means for numerical differentiation of the measured gaze direction data are required. For this purpose, a simple method for signal differentiation involving the continuous-time Laguerre functions is suggested.

Finally, as a demonstration of the potential of the proposed method, its performance in terms of distinguishing between healthy individuals and individuals diagnosed with PD is evaluated.

The paper is composed as follows: Section 2 describes how normal and KDE distributions can be approximated from data. The definition of the outlier region and means to find it for normal and general distributions are given in Section 3. Section 4 presents the suggested method for signal differentiation. In Section 5 an overview of the method of generating eye-tracking profiles from data is presented. The experimental setup and conducted experiments are described in Section 6 and the corresponding results are given in Section 7. Finally, conclusions are stated and the methods and results are discussed in Section 8.

2 Probability Density Estimation

The normal (Gaussian) distribution is commonly used to approximate the statistical properties of the underlying stochastic variable of a given data set. However, normal distribution fitting may not always give satisfactory results, depending on the studied quantity.

Testing for normality can be done in different ways. A histogram of the data can be examined and verified to resemble a normal distribution PDF. A Q-Q plot may indicate that the data are not normally distributed [12]. Here, the Q-Q plot and the Lilliefors test are employed. The Lilliefors test is an adaptation of the Kolmogorov-Smirnov test for normality [8].

Should the data appear not to be normally distributed, an alternative method for density estimation is KDE. In KDE nothing is presumed about the shape of the distribution to be estimated. It is thus a non-parametric way of estimating an unknown PDF, in contrast to the parametric method of normal distribution fitting.

The details of the two approaches for PDF estimation are provided below.

2.1 Normal Distribution Estimation

Assume that the sample $(x_1, x_2, \dots, x_{N_s})$, $x_i \in \mathbb{R}^M$, is drawn from some distribution. The underlying PDF can then be estimated using normal distribution fitting.

The general expression for a multi-dimensional normal density function is

\[
\phi(x) = \frac{1}{(2\pi)^{M/2} \sqrt{\det(\Sigma)}} \, e^{-\frac{1}{2} (x - \mu)^T \Sigma^{-1} (x - \mu)}, \tag{1}
\]

where $\mu \in \mathbb{R}^M$ is the mean and $\Sigma \in \mathbb{R}^{M \times M}$ is the covariance matrix of the distribution. The matrix determinant is denoted det(·). Fitting (1) to the data is a matter of finding estimates of µ and Σ. The distribution mean, µ, can be estimated by the sample mean

\[
\hat{\mu} = \frac{1}{N_s} \sum_{i=1}^{N_s} x_i, \tag{2}
\]

and Σ by the sample covariance matrix

\[
\hat{\Sigma} = \frac{1}{N_s - 1} \sum_{i=1}^{N_s} (x_i - \hat{\mu})(x_i - \hat{\mu})^T. \tag{3}
\]

Once $\hat{\mu}$ and $\hat{\Sigma}$ are computed, they can be inserted into (1) to give the resulting PDF estimate.
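The estimates (2) and (3) are straightforward to compute. The following Python sketch does so with plain lists for a sample of M-dimensional observations; the function name and the toy sample are assumptions made for illustration.

```python
def fit_normal(samples):
    """Sample mean (2) and sample covariance matrix (3) of M-dimensional data.

    samples: sequence of N_s observations, each a sequence of M numbers.
    Returns (mu, cov) with mu a list of length M and cov an M x M nested list.
    """
    n_s = len(samples)
    dim = len(samples[0])
    mu = [sum(x[j] for x in samples) / n_s for j in range(dim)]
    cov = [
        [
            sum((x[a] - mu[a]) * (x[b] - mu[b]) for x in samples) / (n_s - 1)
            for b in range(dim)
        ]
        for a in range(dim)
    ]
    return mu, cov


# Toy two-dimensional sample (for instance position and velocity values).
mu_hat, cov_hat = fit_normal([(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)])
```

Note the divisor N_s − 1 in the covariance, which matches (3) and requires at least two observations.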

2.2 Kernel Density Estimation

KDE is a non-parametric method to obtain a smooth estimate of the PDF of an unknown distribution. Assume that the observations $(x_1, x_2, \dots, x_{N_s})$, $x_i \in \mathbb{R}^M$, are drawn from some unknown distribution with probability density $f : \mathbb{R}^M \to \mathbb{R}$. The kernel density estimator of f is given by

\[
\hat{f}(x) = \frac{1}{N_s h^M} \sum_{i=1}^{N_s} K\Bigl( \frac{x - x_i}{h} \Bigr), \tag{4}
\]

where K is the chosen kernel function and h is a user parameter called the bandwidth. Here, the kernel was chosen to be the M-dimensional standard normal density function

\[
\phi(x) = \frac{1}{(2\pi)^{M/2}} \, e^{-\frac{1}{2} x^T x}. \tag{5}
\]

Using this kernel, the kernel density estimator can be rewritten as

\[
\hat{f}(x) = \frac{1}{N_s} \sum_{i=1}^{N_s} \frac{1}{(2\pi)^{M/2} h^M} \, e^{-\frac{1}{2h^2} (x - x_i)^T (x - x_i)} = \frac{1}{N_s} \sum_{i=1}^{N_s} \psi(x - x_i), \tag{6}
\]

where ψ(x) is an M-dimensional normal density function with standard deviation h and mean 0. The estimate of the probability density f is thus the mean of Ns M-dimensional Gaussian functions with standard deviation h (note that the kernel is circularly symmetric) and means given by the observations xi.

The choice of the bandwidth h has great impact on the resulting kernel estimator. In the one-dimensional case, if the underlying distribution being estimated is Gaussian, an optimal choice of h can be derived [11]. This is not possible in the general case, and an appropriate value must be found experimentally.
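A minimal Python sketch of a Gaussian kernel density estimator of this kind is given below. It assumes an isotropic kernel with bandwidth h, normalised by h^M in M dimensions so that each kernel integrates to one; the function name and the example values are illustrative.

```python
import math


def kde_gaussian(x, observations, h):
    """Evaluate a Gaussian kernel density estimate at the point x.

    x: sequence of M numbers.
    observations: sequence of N_s observations, each of length M.
    h: bandwidth (standard deviation of each isotropic Gaussian kernel).
    """
    dim = len(x)
    norm = (2.0 * math.pi) ** (dim / 2.0) * h ** dim
    total = 0.0
    for x_i in observations:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, x_i))
        total += math.exp(-sq_dist / (2.0 * h * h)) / norm
    return total / len(observations)


# One observation at the origin, M = 1, h = 1: the standard normal density at 0.
value_1d = kde_gaussian([0.0], [[0.0]], 1.0)

# Two observations in the plane with a wider kernel.
value_2d = kde_gaussian([0.5, 0.0], [[0.0, 0.0], [1.0, 0.0]], 2.0)
```

A small h gives a spiky estimate concentrated at the observations, while a large h smooths out genuine structure, which is why the bandwidth has to be tuned on data.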

140

3 Finding the Outlier Region

Assume that an observation, x ∈ RM , is made and that it must be determinedwhether it is likely to be an observation of a given random variable X, or not.A hypothesis test with the null hypothesis:

H0 : x is an observation of X (7)

must be carried out. One way to do this is to define an outlier region S of therandom variable, being the set of all possible observations deemed unlikely tocome from the considered distribution, i.e all x for which H0 is rejected. Theprobability that an observation of X lies in S should be low. The complementof the outlier region is the confidence region corresponding to the set of allpossible observations deemed likely to come from the distribution. Define αsuch that

P (X ∈ S) =

ˆS

f(x)dx = α. (8)

Consequently, α is the probability with which an observation of the consideredrandom variable is deemed (incorrectly) to be from some other distribution.The choice of α will influence the size and shape of the outlier region S. Findingthe outlier region is done depending on the considered distribution. Methodsfor testing observations against normal distributions are well-known and widelyused.

3.1 Normal Distribution

For the case when the M -dimensional random variable X has a normal dis-tribution with mean µ and covariance Σ, the outlier region can be derived asfollows: Let Y = Σ−

12 (X − µ), so that Y ∈ N(0, I), where I is the identity

matrix. Form the variable

Z = (X − µ)TΣ−1(X − µ) = Y TY = Y 21 + Y 2

2 + . . .+ Y 2M , (9)

where Yi denotes the i:th component of Y . Thus Z is the square of the Maha-lanobis distance between X and µ and is a sum of the squares of M standardnormal random variables and will therefore have a χ2

M distribution [4]. Theoutlier region of X can thus be defined as

S1 ={x : (x− µ)TΣ−1(x− µ) ≥ χ2

M (1− α)}, (10)

where χ2M (p) is the quantile function for probability p of the χ2

M distribution.Eq. (10) shows that the outlier region S1 is the exterior of an ellipse centeredaround the distribution mean. Thus, for a given observation x, H0 is rejectedif x ∈ S1, which is easily established by inserting x into (10).
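For the two-dimensional case (for instance position and velocity), the test in (10) can be sketched in a few lines of Python. The sketch relies on the fact that the χ² distribution with two degrees of freedom has the closed-form quantile χ²₂(1 − α) = −2 ln α; for other dimensions a numerical quantile function would be needed. The function name and the example values are assumptions.

```python
import math


def is_outlier_normal_2d(x, mu, cov, alpha):
    """Return True if x lies in the outlier region (10) for M = 2.

    x, mu: sequences of two numbers.
    cov: 2 x 2 covariance matrix as nested sequences, assumed positive definite.
    alpha: probability mass assigned to the outlier region.
    """
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx0, dx1 = x[0] - mu[0], x[1] - mu[1]
    # Squared Mahalanobis distance using the explicit inverse of a 2 x 2 matrix.
    z = (d * dx0 * dx0 - (b + c) * dx0 * dx1 + a * dx1 * dx1) / det
    # Quantile of the chi-square distribution with 2 degrees of freedom.
    threshold = -2.0 * math.log(alpha)
    return z >= threshold


identity = ((1.0, 0.0), (0.0, 1.0))
far_point = is_outlier_normal_2d((3.0, 0.0), (0.0, 0.0), identity, 0.05)
near_point = is_outlier_normal_2d((1.0, 1.0), (0.0, 0.0), identity, 0.05)
```

With α = 0.05 the threshold is about 5.99, so the first point (squared distance 9) is rejected and the second (squared distance 2) is not.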


3.2 General Distribution

The kernel density estimates of the PDF are not Gaussian functions and the outlier region must be derived differently than in the normal distribution case. The PDF may have several peaks and consequently the Mahalanobis distance from the distribution mean is no longer a plausible test quantity. The outlier region may be non-convex or even disconnected.

The outlier region of a random variable X with probability density f is given by

\[ S_2 = \Big\{x : \int_{S_2} f(\tau)\,d\tau = \alpha,\ f(x) \le f(x_c),\ \forall x_c \in S_2^c \Big\}, \tag{11} \]

where $S_2^c$ is the complement set of $S_2$, i.e. the confidence region. If f is positive and continuous, $S_2$ will be uniquely defined by (11).

Determining whether a given observation x is part of $S_2$ can be done numerically as follows:

• Evaluate f for the finite set of uniformly spaced grid points $\{x_i\}_{i=1}^{M}$ to obtain $\{f_i\}_{i=1}^{M}$.

• Let $\{f_{(i)}\}_{i=1}^{M}$ be $\{f_i\}_{i=1}^{M}$ sorted in ascending order.

• Find K such that $\sum_{i=1}^{K} f_{(i)} \le \alpha / V < \sum_{i=1}^{K+1} f_{(i)}$, where $V = d^M$ is the volume of a grid element and d is the distance between adjacent grid points. (In the exponent, M is the dimension of X as in Section 3.1, whereas the index limit M is the number of grid points.)

• An approximation $\hat{S}_2$ of $S_2$ is then given by

\[ \hat{S}_2 = \{x_i : f(x_i) \le f_{(K)} = \gamma_T\}. \tag{12} \]

In simpler terms: Keep summing the smallest elements of $\{f_i\}_{i=1}^{M}$ while the sum does not exceed α/V. Denote the largest term in the sum by $\gamma_T$. Let $\hat{S}_2$ be the set of all $x_i$ corresponding to the summed $f_i$. For a given observation x, H0 is rejected if $x \in \hat{S}_2$, which is easily checked by inserting x into (12).
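The steps above can be sketched as follows. This is a minimal illustration, not the implementation used in the experiments: the function names are hypothetical, the density values are assumed to be precomputed on a uniform grid, and ties at the threshold are handled as in (12), i.e. every grid point with density at or below $\gamma_T$ is counted as an outlier.

```python
def density_threshold(f_values, cell_volume, alpha=0.05):
    """Return gamma_T of Eq. (12): the largest of the smallest grid densities
    whose running sum stays at or below alpha / V.
    Returns None if even the smallest density exceeds that budget."""
    budget = alpha / cell_volume
    total = 0.0
    gamma = None
    for f in sorted(f_values):      # ascending order: f_(1), f_(2), ...
        if total + f > budget:
            break
        total += f
        gamma = f
    return gamma

def is_outlier(f_at_x, gamma):
    """H0 is rejected when the density at the observation is at most gamma_T."""
    return gamma is not None and f_at_x <= gamma
```

For example, for a one-dimensional grid with spacing d = 0.1 and α = 0.1, the densities are summed in ascending order until the total would exceed α/V = 1.0.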

4 Signal Differentiation

The method for signal differentiation presented here makes use of the Laguerre functions. They form an orthonormal basis which is complete in the L2 function space. They can be used to approximate transfer functions or the output of linear dynamic systems, see for example [2].

The k:th continuous-time Laguerre function is given by

\[ L_k(t) = \sqrt{2p}\, e^{-pt} \sum_{n=1}^{k+1} \frac{k!\,(-2pt)^{n-1}}{((n-1)!)^2\,(k-n+1)!}, \tag{13} \]

where p is the (user-defined) Laguerre parameter determining the decay rate of the function. The set of Laguerre functions is orthonormal so that

\[ \langle L_m(t), L_n(t) \rangle = \delta_{mn}, \tag{14} \]

where $\delta_{mn}$ is the Kronecker delta and the inner product is given by

\[ \langle f(t), g(t) \rangle = \int_0^{\infty} f(t)\,g(t)\,dt. \tag{15} \]
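The orthonormality property (14) can be checked numerically. The sketch below evaluates (13) directly and approximates the inner product (15) with the mid-point rule on a truncated interval. The function names, the truncation point and the step size are illustrative assumptions.

```python
import math

def laguerre(k, t, p=1.0):
    """k:th continuous-time Laguerre function of Eq. (13), for t >= 0."""
    s = 0.0
    for n in range(1, k + 2):
        s += (math.factorial(k) * (-2.0 * p * t) ** (n - 1)
              / (math.factorial(n - 1) ** 2 * math.factorial(k - n + 1)))
    return math.sqrt(2.0 * p) * math.exp(-p * t) * s

def inner_product(f, g, t_max=40.0, dt=1e-3):
    """Mid-point rule approximation of Eq. (15), truncated at t_max."""
    steps = int(round(t_max / dt))
    return sum(f((i + 0.5) * dt) * g((i + 0.5) * dt) for i in range(steps)) * dt
```

Since the functions decay as $e^{-pt}$, truncating the integral at a moderate multiple of 1/p introduces a negligible error for low orders.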

A function f(t) ∈ L2 can be expressed in terms of Laguerre functions by

\[ f(t) = \sum_{k=0}^{\infty} a_k L_k(t), \tag{16} \]

where

\[ a_k = \langle f(t), L_k(t) \rangle, \tag{17} \]

are the Laguerre coefficients [2].

One feasible way of approximating the derivative of a noisy signal is by projecting the signal onto a finite set of Laguerre functions and then using analytically differentiated Laguerre functions to give an estimate of the signal derivative. However, if the signal variance is large, high-order Laguerre functions are required in the approximation and may cause numerical problems. A way of bypassing this is by performing Laguerre approximation within a sliding window of a given size.

Assume the signal to be differentiated is sampled at sampling frequency $f_s = 1/T_s$ and stored in a vector

\[ \mathbf{y} = [\,y_1\ \ y_2\ \ \ldots\ \ y_N\,]^T, \tag{18} \]

with N samples. Choose a suitable window length L and create M = N − L + 1 new vectors $\mathbf{y}_m$ of length L such that

\[ \mathbf{y}_m = [\,y_m\ \ y_{m+1}\ \ \ldots\ \ y_{m+L-1}\,]^T. \tag{19} \]

An approximation of the derivative of $\mathbf{y}_m$ can then be found by

\[ \hat{\dot{\mathbf{y}}}_m = \sum_{k=0}^{\ell} (\mathbf{y}_m^T \mathbf{L}_k T_s)\, \dot{\mathbf{L}}_k, \tag{20} \]

where $\mathbf{L}_k$ are sampled versions of the Laguerre functions in vector form, $\dot{\mathbf{L}}_k$ are the sampled versions of the Laguerre function derivatives in vector form and $\ell$ is the highest order of the Laguerre functions used. In (20), the mid-point rule is used to approximate integral (17).
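A minimal sketch of (20) for a single window is given below. It is an illustration under stated assumptions rather than the implementation used in this work: the function names are hypothetical, the Laguerre functions are sampled at the interval mid-points $(i + 0.5)T_s$ of the window, and the derivative of (13) is obtained with the product rule.

```python
import math

def laguerre_poly_coeffs(k, p):
    """Coefficients b_j of the polynomial factor of L_k(t) in Eq. (13)."""
    return [math.factorial(k) * (-2.0 * p) ** j
            / (math.factorial(j) ** 2 * math.factorial(k - j))
            for j in range(k + 1)]

def laguerre_and_derivative(k, t, p):
    """Return (L_k(t), dL_k/dt), using the product rule on Eq. (13)."""
    b = laguerre_poly_coeffs(k, p)
    poly = sum(bj * t ** j for j, bj in enumerate(b))
    dpoly = sum(j * bj * t ** (j - 1) for j, bj in enumerate(b) if j > 0)
    scale = math.sqrt(2.0 * p) * math.exp(-p * t)
    return scale * poly, scale * (dpoly - p * poly)

def window_derivative(window, ts, p, max_order):
    """Eq. (20) for one window: project the samples onto L_0..L_max_order,
    then recombine with the analytically differentiated functions."""
    times = [(i + 0.5) * ts for i in range(len(window))]  # assumed sampling grid
    estimate = [0.0] * len(window)
    for k in range(max_order + 1):
        basis = [laguerre_and_derivative(k, t, p) for t in times]
        a_k = sum(y * lk for y, (lk, _) in zip(window, basis)) * ts
        for i, (_, dlk) in enumerate(basis):
            estimate[i] += a_k * dlk
    return estimate
```

As a sanity check, a window containing samples of $L_0(t)$ itself should yield an estimate close to $-pL_0(t)$, provided the window is long enough for the function to have decayed.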

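Now form the N-by-M matrix

\[ \hat{\dot{Y}} = [\,\hat{\dot{\mathbf{y}}}_1\ \ \hat{\dot{\mathbf{y}}}_2\ \ \ldots\ \ \hat{\dot{\mathbf{y}}}_M\,], \tag{21} \]

in which column m holds the window estimate $\hat{\dot{\mathbf{y}}}_m$ at the sample positions m, ..., m + L − 1 and zeros elsewhere, and let $V_i$ be the number of non-zero elements in row i of $\hat{\dot{Y}}$. The estimated derivative of the signal, $\hat{\dot{\mathbf{y}}}$, is finally given by

\[ \hat{\dot{\mathbf{y}}} = [\,\hat{\dot{y}}_1\ \ \hat{\dot{y}}_2\ \ \ldots\ \ \hat{\dot{y}}_N\,], \tag{22} \]

where

\[ \hat{\dot{y}}_i = \frac{1}{V_i} \sum_{j=1}^{M} \hat{\dot{Y}}_{ij}. \tag{23} \]

Hence, the approximation of the signal derivative is the average of the approximations obtained within each window.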
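The averaging step (23) can be sketched without forming the matrix explicitly. The function name is hypothetical, and one simplification is assumed: each sample is divided by the number of windows covering it, rather than by the number of non-zero entries, so that a window estimate that happens to be exactly zero is still counted.

```python
def average_overlapping_windows(window_estimates, n_samples):
    """Average per-window estimates into one sequence of length n_samples.
    window_estimates[m] holds L values covering samples m, ..., m + L - 1
    (zero-based), as produced by a window sliding one sample at a time."""
    sums = [0.0] * n_samples
    counts = [0] * n_samples
    for m, window in enumerate(window_estimates):
        for offset, value in enumerate(window):
            sums[m + offset] += value
            counts[m + offset] += 1
    # With M = N - L + 1 windows every sample is covered at least once.
    return [s / c for s, c in zip(sums, counts)]
```

Samples near the ends of the signal are covered by fewer windows than samples in the middle, so their estimates are averaged over fewer terms.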

5 Method

5.1 Visual Stimulus

The visual stimuli are in the form of a white circle moving smoothly and seemingly randomly in a 25 cm × 25 cm black background window on a computer monitor. The stimuli were generated using the method in [7], which gives input sequences, horizontal and vertical, with the rich spectral and amplitude excitation needed to accurately identify the nonlinear Wiener-type models used to portray the SPS. This stimulus design approach is used for non-parametric modeling as well because it reveals the underlying system dynamics. The method requires the specification of the maximum frequency to be present in the input, fc. A higher maximum frequency generally results in a faster moving stimulus.

5.2 Method Overview

The method comprises two parts: The first is establishing an eye-tracking profile for a specific individual or group of individuals. The second is testing whether an independently acquired data set is associated with said profile or not.

5.2.1 Establishing the Eye-Tracking Profile

Assume that Ns data sets of eye movements are recorded from a test subject tracking the same visual stimulus trajectory multiple times on different occasions. Due to the complex nature of the oculomotor system, the response to a visual stimulus will not be the same for repeated exposures. Hence, the Ns data sets will not be equal. For each of the Nt time instances at which the gaze direction is sampled, there will be Ns position data points, one from each set of recorded eye movements. Since horizontal and vertical gaze direction coordinates are logged separately, the data points will have two components. The data points at time instance k will be seemingly random with some expected value, and can thus be seen as Ns observations of a two-dimensional stochastic variable, X(k). Note that there will be one stochastic variable per time instance, such that the entire gaze trajectory can be seen as a stochastic process. If the stochastic variables X(k) are assumed to be independent, the distribution of X(k) for each k can be estimated from data. The distribution of X(k) will depend on the trajectory of the visual stimulus, but also on the individual tracking it.

In practice, the distributions of X(k), k = 1, 2, . . . , Nt, are not known, but can be estimated from data. The simplest way is to use a histogram. However, since the data are multi-dimensional, a large number of data points is needed to achieve sufficiently small bin widths for reliable statistical testing. To acquire a large number of data points, a test subject would have to track the same visual stimulus a large number of times, which would be time-consuming and tedious. Therefore, normal distribution fitting and KDE, as described in Section 2, are used for distribution estimation. The resulting non-parametric model, constituted by a set of Nt PDF estimates, is referred to as an eye-tracking profile.
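For the normal distribution fitting variant, establishing a profile amounts to estimating one mean and one covariance per time instance. The sketch below shows this under assumptions: the data layout `trajectories[s][k] = (x, y)` and the function name are hypothetical, and the unbiased sample covariance is used.

```python
def fit_normal_profile(trajectories):
    """trajectories[s][k] = (x, y): gaze position of data set s at time instance k.
    Returns one (mean, covariance) pair per time instance, treating the
    time instances as independent."""
    n_sets = len(trajectories)
    n_times = len(trajectories[0])
    profile = []
    for k in range(n_times):
        xs = [trajectories[s][k][0] for s in range(n_sets)]
        ys = [trajectories[s][k][1] for s in range(n_sets)]
        mx, my = sum(xs) / n_sets, sum(ys) / n_sets
        # Unbiased sample (co)variances; requires at least two data sets.
        sxx = sum((x - mx) ** 2 for x in xs) / (n_sets - 1)
        syy = sum((y - my) ** 2 for y in ys) / (n_sets - 1)
        sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n_sets - 1)
        profile.append(((mx, my), ((sxx, sxy), (sxy, syy))))
    return profile
```

The resulting list has Nt entries, one per time instance, each of which can be passed to a normal distribution outlier test such as the one in Section 3.1.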

5.2.2 Testing an Independently Acquired Data Set

For an independently acquired data set, possibly from a different individual, in each of the Nt time instances, hypothesis test (7) is carried out. All detected outliers, i.e. all time instances in which H0 is rejected, are recorded. The total outlier count in a data set is used as an anomaly score, showing how often and in which parts of the trajectory the deviations from the previously established normal behavior occurred. The anomaly score determines how probable it is that the newly acquired data set belongs to the estimated eye-tracking profile. If the score is higher than some chosen threshold value T, the data set is deemed to be the gaze trajectory of a different individual.

The approach generalizes in a straightforward manner to the case of a group of test subjects sharing a property, such as e.g. healthy persons or persons of a certain age.
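The scoring step can be sketched as follows, assuming the per-instance hypothesis tests have already produced one boolean flag per time instance. The function names are hypothetical, and the score is expressed as a fraction of the number of time instances, which matches the percentages reported in the result tables.

```python
def anomaly_score(outlier_flags):
    """Fraction of time instances at which H0 was rejected."""
    return sum(1 for flag in outlier_flags if flag) / len(outlier_flags)

def outlier_instances(outlier_flags):
    """Time instances (zero-based) at which outliers were detected."""
    return [k for k, flag in enumerate(outlier_flags) if flag]

def belongs_to_profile(outlier_flags, threshold):
    """The data set is attributed to the profile unless the score exceeds T."""
    return anomaly_score(outlier_flags) <= threshold
```

Keeping the list of outlier instances alongside the score preserves the information about where along the trajectory the deviations occurred.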

5.2.3 Method Extension

The assumption that X(k) are independent greatly reduces the dimensionality of the problem and thus also the number of observations required for accurate density estimation. Unfortunately, the independence assumption violates the postulation that the SPS is a dynamical system. This may result in false positives as trajectories that are dynamically dissimilar, but point-wise similar to the considered eye-tracking profile will pass for being associated with it. However, false negatives due to the independence assumption are unlikely and reliable results may thus still be achievable.

A way of extending the method to decrease the risk of false positives is to numerically estimate the instantaneous two-coordinate gaze velocity at each time instance and include it in the data. By including statistical information about the gaze velocity in the eye-tracking profiles, some of the dynamical properties of the SPS will be accounted for. The downside is that it doubles the dimensionality of the PDFs.

Figure 1: Screen shot from the eye-tracking software.

6 Experiment

Gaze direction data of test subjects attempting to track the moving circle on a computer monitor were recorded using a video-based eye tracker from Smart Eye AB, Sweden. The eye tracker output is the distance in centimeters (horizontal and vertical components separately) between the monitor center and the point where the gaze direction line intersects the monitor. Eye-tracking data were sampled at a sampling frequency of fs = 60 Hz. Fig. 1 shows a screen shot from the eye-tracking software, illustrating how the eye tracker evaluates the gaze direction.

The conducted experiment involved four healthy test subjects:






and five test subjects diagnosed with PD:






Two stimuli of length T = 26 s were generated using two different maximum values for the frequency content, fc = 0.5 Hz and fc = 1.5 Hz. The former will be referred to as Stimulus 1 and the latter as Stimulus 2. Stimulus 1 was thus of lower average velocity than Stimulus 2. H1 was first asked to track the two stimuli 50 times each. This generated 50 data sets per stimulus, each with Nt = T fs = 26 · 60 = 1560 time samples. In the same manner, 25 data sets per stimulus were gathered from H3. Further, the remaining test subjects were asked to track the two stimuli 5 times each.

7 Results

7.1 Healthy Data

The first 20 data sets of H1 and H3 per stimulus will be referred to as the estimation sets. The remaining five sets of H1 and H3, and the five data sets from H2 and H4 per stimulus, will be referred to as the testing sets. The goal is to investigate the ability of the presented method to determine whether the testing sets come from H1, H3 or neither.

7.1.1 Testing for Normality

The distribution of the data varies between different time instances and a test for normality must be done at each instance. For illustrative purposes, the estimation sets of H1 in response to Stimulus 2 were examined at time instance k = 1200. The data at several other time instances had similar characteristics. Fig. 2 shows a Q-Q plot and a histogram of the horizontal data (x-coordinate) for time instance k = 1200. Judging from these two plots, the data may not be normally distributed. Carrying out a Lilliefors test indeed showed that the data were not from a normal distribution at the 5% significance level.


[Figure 2, panel a): Q-Q plot of Quantiles of Sampled Data against Standard Normal Quantiles. Panel b): histogram of Number of observations against x-coordinate (dm).]

Figure 2: a) The Q-Q plot and b) the histogram for the x-coordinate of the Stimulus 2 estimation sets from H1 at time instance k = 1200. The red line in the Q-Q plot indicates a normal distribution.

Performing the Lilliefors test for all of the 1560 time instances in the estimation sets of H1 and H3 showed that the data were not likely to be normally distributed in 39.7% of the cases for H1 and 44.2% for H3.

7.1.2 Establishing Eye-Tracking Profiles

Using the estimation sets of H1 in response to Stimulus 1, two-dimensional PDFs for each of the Nt = 1560 time instances were estimated using the two methods presented in Section 2: normal distribution fitting and KDE. The PDFs in each time step were thus estimated from 20 observations of the underlying stochastic variable. Together, the 1560 PDFs per estimation method constituted a 20-set eye-tracking profile for H1. In the same way, two 20-set eye-tracking profiles for H3 were established. Finally, the same was done using the data collected in response to Stimulus 2, in order to establish two new eye-tracking profiles.

For the KDE method, five different choices of the bandwidth h were used. The five choices of the bandwidth were h = {0.1σ, 0.25σ, 0.5σ, σ, 2σ}, where σ was either the sample standard deviation of the horizontal data or the vertical data, depending on which of the two was larger. The bandwidth thus varied between time instances. As an example, the estimated normal distribution and the corresponding KDE with h = 0.5σ, for H1 at time instance k = 1200 of Stimulus 2 are shown in Fig. 3.
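The bandwidth rule above can be sketched as follows for the observations at a single time instance. The function names are hypothetical. The kernel is an assumption as well: an isotropic Gaussian kernel is used for illustration, whereas the kernel actually employed is the one defined in Section 2.

```python
import math
import statistics

BANDWIDTH_FACTORS = (0.1, 0.25, 0.5, 1.0, 2.0)

def candidate_bandwidths(points, factors=BANDWIDTH_FACTORS):
    """Bandwidths h = c * sigma, where sigma is the larger of the sample
    standard deviations of the horizontal and vertical coordinates."""
    xs = [pt[0] for pt in points]
    ys = [pt[1] for pt in points]
    sigma = max(statistics.stdev(xs), statistics.stdev(ys))
    return [c * sigma for c in factors]

def gaussian_kde_2d(x, points, h):
    """Kernel density estimate at x using an isotropic Gaussian kernel
    (an assumed kernel choice) with bandwidth h."""
    norm = 1.0 / (len(points) * 2.0 * math.pi * h * h)
    return norm * sum(
        math.exp(-((x[0] - px) ** 2 + (x[1] - py) ** 2) / (2.0 * h * h))
        for px, py in points)
```

Because σ is recomputed from the observations at each time instance, the same factor c yields a different absolute bandwidth along the trajectory.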

7.1.3 Outlier Detection

The testing sets were compared to the eye-tracking profiles of H1 associated with Stimulus 2 and estimated in Section 7.1.2. The number of outliers in each testing set was computed at significance level α = 0.05. For each test subject, Hk, the mean number of outliers in its testing sets was calculated, referred to as the anomaly score, denoted by $m_{Hk}$.

The anomaly scores of the data sets when comparing to the eye-tracking profile of H1 associated with Stimulus 2 are presented in Tab. 1. The numbers are given in percent of the total number of time steps. Fig. 4 shows a heat map of the estimated KDE distribution for H1 at time instance k = 1200 of Stimulus 2 along with the boundary of the confidence region and an observation from a testing set of H2 lying outside it, i.e. in the outlier region.

The same procedure as above was repeated but this time the testing sets were compared to the eye-tracking profile of H3 associated with Stimulus 2, established in Section 7.1.2. The anomaly scores of the data sets when comparing to the eye-tracking profile of H3 are presented in Tab. 2.

[Figure 3: two surface plots, a) and b), with axes Horizontal, Vertical and Estimated PDF.]

Figure 3: a) The estimated normal distribution and b) the KDE with h = 0.5σ at time instance k = 1200 (using the estimation data of H1).

Dist. type   h       mH1 (%)   mH2 (%)   mH3 (%)   mH4 (%)
KDE          0.1σ    36        46        55        66
KDE          0.25σ   4.9       21        32        48
KDE          0.5σ    0.6       15        27        37
KDE          σ       0.1       0.9       5.6       15
KDE          2σ      0         0.5       0.7       1.8
Normal       -       0.5       0.6       5.1       11

Table 1: The means, mH1, mH2, mH3, mH4, of the number of outliers at α = 0.05 in the testing sets when comparing to the 20-set eye-tracking profiles of H1. The means are given as percent of the total number of time steps.

The entries in Tab. 1 and Tab. 2 give the amount of time during which the gaze direction of the test subjects deviated significantly from the mean trajectory of the estimation sets from H1 and H3 respectively.

It is evident from Tab. 1 that the anomaly score in the testing sets of H1 was significantly lower than in the other test subjects. In Tab. 2 the same can be said about the anomaly score in the testing sets of H3.

[Figure 4: heat map with axes x-coordinate and y-coordinate.]

Figure 4: The KDE estimate of the PDF at time instance k = 1200 from the estimation data of H1 in response to Stimulus 2. Red indicates high values. The approximated outlier region is the exterior of the dashed line. The circle shows the gaze direction of H2 at this time instance.

Dist. type   h       mH1 (%)   mH2 (%)   mH3 (%)   mH4 (%)
KDE          0.1σ    54        48        27        65
KDE          0.25σ   30        15        4.4       51
KDE          0.5σ    15        10        1.1       35
KDE          σ       2.7       1.4       0.9       7.4
KDE          2σ      0.4       0.4       0         2.5
Normal       -       8.0       5.1       2.1       15

Table 2: The means, mH1, mH2, mH3, mH4, of the number of outliers at α = 0.05 in the testing sets when comparing to the 20-set eye-tracking profiles of H3. The means are given as percent of the total number of time steps.

Figure 5: Heat map of the estimated trajectory distribution of H1. Red indicates high values. The blue line shows a trajectory of P1 attempting to track the same stimulus.

7.2 Parkinson Data

Here, as a demonstration of the potential of the proposed method, its performance in terms of distinguishing PD patients from healthy individuals is evaluated.

All 50 sets per stimulus of H1 were used to establish two eye-tracking profiles against which recorded gaze trajectories of the five PD patients were compared. Fig. 5 shows a heat map over an excerpt of the estimated trajectory distribution of H1 associated with Stimulus 2 overlaid by the gaze trajectory of P1. It is apparent that the trajectory of P1 deviates from the mean trajectory of H1 at several time instances.

Tab. 3 shows the anomaly score at the 0.05 significance level in the data sets of the different test subjects compared to the eye-tracking profiles of H1. The trajectories of the PD patients deviated significantly from the profiles of H1. However, Tab. 3 also shows that the gaze of H4 deviated as much from the profiles as did that of P5 for both stimuli. This may suggest that P5 showed fewer symptoms than the other patients, but it cannot be confirmed due to the unavailability of patient information.

It should also be noted from Tab. 3 that the difference in results between the healthy and the PD subjects was smaller when Stimulus 1 with a lower velocity was used.

            Stimulus 1                        Stimulus 2
Subject  m (%)   Subject  m (%)    Subject  m (%)   Subject  m (%)
P1       28.1    H1       0.4      P1       48.1    H1       0.6
P2       38.7    H2       10.1     P2       61.4    H2       14.3
P3       41.2    H3       21.1     P3       58.2    H3       26.3
P4       30.2    H4       27.4     P4       49.1    H4       39.6
P5       25.2                      P5       40.1

Table 3: The average number of outliers, m, at α = 0.05 in the testing sets when comparing to the 50-set eye-tracking profiles of H1. The numbers are given as percent of the total number of samples in the data set.

In an attempt to further improve the results, the eye-tracking profiles of H1 were extended by taking both gaze position and velocity data into account. The gaze velocity was estimated using the method in Section 4. Each eye-tracking profile was thus constituted by a set of Nt = 1560 four-dimensional PDFs, each estimated from Ns = 50 data points. Tab. 4 shows the anomaly score at the 0.05 significance level in the data sets of the different test subjects when comparing to the position/velocity eye-tracking profiles of H1. The data in Tab. 4 show that by also considering gaze velocity rather than just gaze position, the difference in the results between the PD patients and the healthy test subjects became more apparent, particularly in the data obtained with Stimulus 2. However, one should also note in Tab. 4 that the testing sets of H1 were now deemed less likely to belong to the H1 eye-tracking profile, hence increasing the probability of false negatives. This may be because the estimated position/velocity distributions are four-dimensional and will be more uncertain due to the relatively low number of observations.


            Stimulus 1                        Stimulus 2
Subject  m (%)   Subject  m (%)    Subject  m (%)   Subject  m (%)
P1       28.2    H1       2.0      P1       55.1    H1       5.1
P2       45.2    H2       5.2      P2       69.1    H2       17.2
P3       50.0    H3       24.1     P3       65.6    H3       28.4
P4       42.9    H4       30.9     P4       55.3    H4       41.2
P5       31.2                      P5       51.0

Table 4: The average number of outliers, m, at α = 0.05 in the testing sets when comparing to the 50-set position/velocity eye-tracking profile of H1. The numbers are given as percent of the total number of samples in the data set.

7.3 Local deviation

The deviation of gaze trajectories from eye-tracking profiles may be more apparent in certain parts of the stimulus trajectory than in others. One of the advantages of the presented method is that it enables the detection of such local deviation.

By studying the data sets obtained in the experiment, it seems that the eye-tracking profiles of different test subjects deviate most significantly in the curved parts of the stimulus trajectory, i.e. when the velocity of the target (white circle) changes direction, which is associated with an increased acceleration.

Fig. 6 shows the average of the number of outliers in a sliding 100-sample window in the data sets of H3 and P1 when comparing to the 50-set eye-tracking profile of H1 associated with Stimulus 2. Also in the figure is the average acceleration of the stimulus in the same sliding window. The acceleration is low-pass filtered for clearer visualization. The sliding averages of the number of outliers for the two test subjects are represented by the blue and red lines in the figure. The lines to some extent represent, as a function of time, how difficult it was for the test subjects to track the stimulus.

As revealed in Fig. 6, the difficulty of tracking the stimulus is correlated with the stimulus acceleration. It seems that the correlation is stronger for the P1 data than for the H3 data. In the P1 data, almost every quick increase or decrease in acceleration is followed by a quick increase or decrease of outliers, whereas the H3 data are smoother.
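The sliding average of outliers used for this kind of local analysis can be sketched as below. The function name is hypothetical, and only full windows are evaluated, which is an assumption about how the window edges are treated.

```python
def sliding_outlier_average(outlier_flags, window=100):
    """Mean of the 0/1 outlier indicator over every full window of the
    given length; returns len(outlier_flags) - window + 1 values."""
    values = [1 if flag else 0 for flag in outlier_flags]
    if len(values) < window:
        return []
    running = sum(values[:window])
    averages = [running / window]
    for i in range(window, len(values)):
        # Slide the window one sample: add the new value, drop the oldest.
        running += values[i] - values[i - window]
        averages.append(running / window)
    return averages
```

At fs = 60 Hz, a 100-sample window corresponds to roughly 1.7 s of tracking.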

[Figure 6: line plot against Time (s), 0 to 25. Left y-axis: Sliding average of outliers. Right y-axis: Normalized stimulus acceleration. Legend: H3, P1, Stim. acceleration.]

Figure 6: The average of the number of outliers in a sliding 100-sample window in the data sets of H3 and P1 when comparing to the eye-tracking profile of H1 (left y-axis) along with the normalized average stimulus acceleration in the same sliding window (right y-axis).


This paper is aimed at providing tools for distinguishing between individuals on the basis of their recorded eye movements. The suggested method relies on anomaly detection and is non-parametric in nature. The results of Section 7 highlight several properties of the method.

8.1 Choice of Bandwidth

Tab. 1 shows that H2, H3 and H4 are best distinguished from H1 when using the KDE method with h = 0.5σ. This can be argued by noting that for h = 0.5σ the difference in the number of outliers in the testing sets of H1 compared to those of the other three test subjects is the largest. Here, the number of outliers in the testing sets of H1 is expected to be low since the eye-tracking profile was indeed estimated from H1 data. In Tab. 2, it can be seen that the same bandwidth yields the best results when testing data sets against the profile of H3.


Tab. 1 and Tab. 2 also show that the choice of bandwidth for the kernel function in the KDE method has a significant effect on the results. When the bandwidth is low, the difference in the number of outliers between different test subjects is not large, making it difficult to distinguish between them. A low bandwidth in the kernel function will result in a PDF estimate with several narrow spikes, each centered at an observation. A high bandwidth will increase the smoothing and the estimated PDF will appear almost Gaussian.

8.2 Age Differences

Tab. 1 shows that, for h = 0.5σ, H3 and H4 differ more from H1, in terms of number of outliers, than does H2. Furthermore, the results in Tab. 2 indicate that H2 and H1 differ less from H3 than does H4. These two results suggest that the eye-tracking profiles of H1 and H2 are similar, but that the profiles of H3 and H4 differ from H1 and H2 and from each other. A possible explanation for this is the age of the test subjects. H1 and H2 are close in age and young relative to H3 and H4. The function of the oculomotor system deteriorates with age and may do so to different degrees.

8.3 Normal distribution approximation vs. KDE

Judging from Fig. 2 and the Lilliefors test, the data are not always well-modeled by a normal distribution. Hence, using Gaussian functions to approximate the PDFs at each time instance should impair the performance of the suggested method. Tab. 1 and Tab. 2 show that this is indeed the case. The number of outliers in the testing sets does not vary significantly between individuals when normal distribution fitting is used. If the testing sets of H1 are deemed to come from H1, so should the testing sets of H2.

8.4 Method Extension

The gaze velocities of the test subjects were obtained through the application of the suggested method for signal differentiation on the eye-tracking data. By including the gaze velocity estimates in the eye-tracking profiles, further improvements were achieved in terms of separating healthy subjects from PD patients, as shown in Tab. 4. With more data available to allow for more precise estimation of the four-dimensional eye-tracking profiles, this approach may prove to be even more efficient.

8.5 Effect of Stimulus

The fact that larger differences between healthy and PD subjects were seen for stimuli of higher frequency content suggests that PD symptoms in the SPS become more apparent for faster changes in the visual stimulus trajectory. This is backed up by the results in Fig. 6 which show that the PD subject reacted more strongly to stimulus acceleration than did the healthy subject. However, as is stated in [7], if the stimulus movements are too rapid, the smooth pursuit mechanism will transfer control to other mechanisms or shut down completely as the subject loses interest in the untrackable circle.

The method of this paper is able to reveal specific parts of stimuli trajectories that are particularly difficult to track. This is a property that may be used to help determine what stimulus behavior causes the most problems for PD patients and thereby be instrumental in the design of stimuli for PD quantification.

8.6 Conclusion

In conclusion, with the method presented in this paper, individuals were successfully distinguished from each other on the basis of their recorded eye movements. Furthermore, the eye movements of the PD patients involved in this study deviated from those of healthy controls, suggesting that further improvements of this technique may result in aids for diagnosing or staging PD. The results of this work are promising, but since the number of participating test subjects was small, the results are only indicative of the potential of the presented method.

References

[1] G. Avanzini, F. Girotti, T. Carazeni, R. Spreafico. Oculomotor disorders in Huntington's chorea. Journal of Neurology, Neurosurgery and Psychiatry, 42:581–589, 1979.

[2] C. T. Chou, M. Verhaegen, R. Johansson. Continuous-Time Identification of SISO Systems Using Laguerre Functions. IEEE Transactions on Signal Processing, 47:349–362, 1999.

[4] M. Hazewinkel. Chi-squared distribution. Encyclopedia of Mathematics, Springer, 2001.

[6] D. Jansson, A. Medvedev. Dynamic Smooth Pursuit Gain Estimation from Eye-Tracking Data. IEEE Conference on Decision and Control, Orlando, Florida, 2011.

[7] D. Jansson, A. Medvedev. Visual Stimulus Design in Parameter Estimation of the Human Smooth Pursuit System from Eye-Tracking Data. IEEE American Control Conference, Washington D.C., 2013.

[8] H. Lilliefors. On the Kolmogorov-Smirnov test for normality with mean and variance unknown. Journal of American Statistical Association, 62:399–402, 1967.

[9] S. Marino, E. Sessam, G. Di Lorenzo, P. Lanzafame, G. Scullica, A. Bramanti, F. La Rosa, G. Iannizzotto, P. Bramanti, P. Di Bella. Quantitative Analysis of Pursuit Ocular Movements in Parkinson's Disease by Using a Video-Based Eye-Tracking System. European Neurology, 58:193–197, 2007.

[10] A. B. Sereno, P. S. Holzman. Antisaccades and Smooth Pursuit Eye Movements in Schizophrenia. Biological Psychiatry, 37:394–401, 1995.

[11] B. W. Silverman. Density Estimation for Statistics and Data Analysis. London: Chapman & Hall/CRC, ISBN 0-412-24620-1, 1998.

[12] M. B. Wilk, R. Gnanadesikan. Probability plotting methods for the analysis of data. Biometrika, 55:1–17, 1968.


Paper V


Mathematical Modeling and Grey-Box

Identification of the Human Smooth

Pursuit Mechanism∗

Daniel Jansson, Alexander Medvedev, Peter Stoica
Department of Information Technology, Uppsala University

[email protected]

Hans W. Axelson
Uppsala University Hospital
Neurology
SE-751 85 Uppsala, Sweden
E-mail: [email protected]

Abstract

A mathematical model of the human eye smooth pursuit mechanism is constructed by combining a fourth-order nonlinear biomechanical model of the eye plant with a dynamic gain controller model. The biomechanical model is derived based on knowledge of the anatomical properties and characteristics of the extraocular motor system. The controller model structure is chosen empirically to agree with experimental data. With the parameters of the eye plant obtained from the literature, the controller parameters are estimated through grey-box identification. Randomly generated and smoothly moving visual stimuli projected on a computer monitor are used as input data while the output data are the resulting eye movements of test subjects tracking the stimuli. The model is evaluated in terms of accuracy in reproducing eye movements registered over time periods longer than 20 seconds. The model is found to perform better than previous state-of-the-art models for the extended time data sets used in this study.



1 Introduction

The extraocular system has been studied for hundreds of years, starting with Descartes in 1630, and several mathematical models have been proposed. Apart from being mathematical curiosities, the models have also helped to achieve a better understanding of the underlying system. The next step is to find practical use for the models. They may, for example, have numerous applications in medicine. They could be valuable tools for quantification of medical conditions such as schizophrenia [17], Huntington's chorea [1] and Parkinson's disease [6], where extraocular muscle impairment is a common symptom. Modern eye-tracking techniques have significantly simplified estimation and validation of eye models.

There are two primary ways in which humans can shift gaze during tracking: saccades and smooth pursuit [5]. Saccades are rapid movements with the purpose of centering an object on the fovea. During a saccade, the eye can reach angular velocities of up to about 600°/s [9]. Smooth pursuit movements are slower movements meant to maintain the object in the visual field. A healthy human can pursue targets moving at angular velocities of up to 80–100°/s [10]. The steady-state angular velocity gain, i.e. the ratio between the eye velocity and the stimulus velocity, of the smooth pursuit system is called the pursuit gain and is said to be about 0.8–0.9 in healthy subjects [8].

Over the years, several studies were undertaken to model the human eye, both biomechanically and neurologically. Robinson [15] was among the first to develop a complete physically motivated biomechanical model of the eye plant. He derived his model from previous results on the properties of the muscles, tendons and tissues of the extraocular system and used measured eye-movement data to validate his derivations. However, Cook and Stark [4] noted that the velocity curves of Robinson's model did not agree with experimental data and that he failed to point out what portion of the driving force was attributed to the agonistic (shortening) muscle's increase in activity and what was attributed to the antagonistic (lengthening) muscle's decrease in activity. This led Cook and Stark to incorporate the nonlinear force-velocity relationship of active muscles, as derived by Hill [7], into the model. Later, Clark and Stark [3] further improved on previous models and also introduced some simplifications without notable performance loss. The most recent significant revision to biomechanical models of the eye plant was made by McSpadden [11]. His model was the first to take not only the nonlinear force-velocity relationship into consideration, but also the force-length relationships.

Biomechanical models attempt to describe the dynamics of the eye plant and how it reacts to neural stimulation. The eye is usually modeled as a solid sphere whose rotational movements are affected by a combination of springs, dash pots, and force sources. The values of the parameters of the resulting differential equations are then either derived from physical laws, or obtained from experiment.

In order to describe how the eye responds to stimuli, the biomechanical model must be augmented with a feedback controller model simulating the interaction of the brain with the extraocular system. Research suggests that different controllers govern different types of eye movements [19]. In this study, a model of the smooth pursuit system (SPS) is sought. Research has shown that direction-selective, motion-sensitive cells in the primary visual cortex estimate target angular velocity during smooth pursuit [12]. The SPS input is thus target velocity and the smooth pursuit controller can be seen as an angular velocity servo which minimizes the angular velocity error [14]. Any stationary error in angular position will be left uncorrected by this mechanism.

Robinson [16] was the first to suggest a closed-loop model of the SPS. He derived his controller based on observations from experiment and had to include several components and feedback loops to accurately match the velocity step responses of the human eye. Young et al. [19] adopted a sampled data model of the extraocular system in which they used a simple integrator for the smooth pursuit controller. In 2008, Nuding et al. [13] proposed a smooth pursuit controller with a dynamic gain as suggested by Churchland and Lisberger [2]. The dynamic gain introduced a nonlinearity to the controller, but the biomechanical model of the eye plant was left linear. The dynamics of the eye plant have less influence on the dynamics of the entire system when a feedback controller is added, but better results may be achieved with the use of a more accurate model of the eye plant. To the best knowledge of the authors, no such closed-loop model of the SPS, adopting a nonlinear grey-box biomechanical model of the eye plant, exists in previous literature.

In this paper, the biomechanical model of the eye plant in [11] is modified and combined with a controller model inspired by the one used in [13] to yield a new closed-loop model of the SPS. The model is then evaluated on real eye movement data from two different test subjects and its performance is compared to that of previous state-of-the-art models.

The paper is composed as follows: In Section 2 the mathematical model of the extraocular system is derived based on prominent earlier models. The experimental setup and identification method are then outlined in Section 3. This is followed by experiment descriptions and validation results in Section 4, where the model performance is also evaluated and compared to the results of previous research. Finally, the results and methods are discussed in Section 5.

2 Mathematical Model

A mathematical model of the SPS, relating gaze direction of the eye to dynamical visual stimuli, is considered. The model must include both a biomechanical part, describing the dynamics of the actual eye plant, and a controller part, describing how the brain interacts with the eye via the neural pathways. The two model components are presented below.


2.1 Biomechanical Model of the Eye Plant

The model presented in [11] is used here in a modified and corrected form, as given by (1)–(6). The applied modifications are declared below.

Assuming the state-space vector

$$x = \begin{pmatrix} y & \dot{y} & F_{t1} & F_{t2} \end{pmatrix}^T, \tag{1}$$

where $y$ is the eye position angle relative to the reference normal, $\dot{y}$ is the angular velocity and $F_{t1}$ and $F_{t2}$ are the lateral rectus and medial rectus tendon forces respectively, the model equations are given by

$$\dot{x} = f(x). \tag{2}$$

The right side of (2) is given by

$$f(x) = \begin{pmatrix}
x_2 \\
\frac{1}{J_g}\left(x_3 - x_4 - B_g x_2 - K_g x_1\right) \\
K_t(x_3)\left(-x_2 - \frac{180}{\pi r}\,\dot{l}_{m1}\right) \\
K_t(x_4)\left(x_2 - \frac{180}{\pi r}\,\dot{l}_{m2}\right)
\end{pmatrix},$$

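where the rates of change of the length of the lateral rectus and medial rectus muscles respectively are

$$\dot{l}_{m1} = \begin{cases}
V_{max}\left[\dfrac{x_3 - F_{pe}(l_{m1})}{a_1 F_{max} F_l(l_{m1})} - 1\right]^3, & a_1 \ge c \\[2ex]
\dfrac{-K_t(x_3)\,x_2}{f(l_{m1}) + \frac{180}{\pi r} K_t(x_3)}, & a_1 < c
\end{cases} \tag{3}$$

$$\dot{l}_{m2} = \begin{cases}
V_{max}\left[\dfrac{x_4 - F_{pe}(l_{m2})}{a_2 F_{max} F_l(l_{m2})} - 1\right]^3, & a_2 \ge c \\[2ex]
\dfrac{K_t(x_4)\,x_2}{f(l_{m2}) + \frac{180}{\pi r} K_t(x_4)}, & a_2 < c
\end{cases} \tag{4}$$

for some $c > 0$ and

$$f(l_m) = \begin{cases}
\dfrac{180\,k_{ml}}{\pi r}\, e^{k_{me}\left(\frac{180}{\pi r}\right)(l_m - l_{ms})}, & l_{ms} \le l_m \le l_{mc} \\[1.5ex]
\dfrac{180\,k_{pm}}{\pi r}, & l_m \ge l_{mc} \\[1.5ex]
0, & \text{otherwise.}
\end{cases}$$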

The force-length relationship is given by

$$F_l(l_m) = 1 - \left(\frac{l_m / l_{opt} - 1}{w}\right)^2,$$

and the tendon elasticity is

$$K_t(F_t) = \begin{cases}
k_{te} F_t + k_{tl}, & F_t < F_{tc} \\
k_s, & F_t \ge F_{tc}.
\end{cases}$$

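The passive elasticity in the muscles produces a force

$$F_{pe}(l_m) = \begin{cases}
\dfrac{k_{ml}}{k_{me}}\left[e^{k_{me}\left(\frac{180}{\pi r}\right)(l_m - l_{ms})} - 1\right], & l_{ms} \le l_m < l_{mc} \\[1.5ex]
\dfrac{180}{\pi r}\, k_{pm}\,(l_m - l_{mc}) + F_{mc}, & l_m > l_{mc} \\[1.5ex]
0, & \text{otherwise,}
\end{cases}$$

where the muscle lengths are

$$l_{m1} = l_{mp} - \frac{\pi r}{180}\,x_1 - l_{t1},$$

$$l_{m2} = l_{mp} + \frac{\pi r}{180}\,x_1 - l_{t2},$$

and the lengths of the tendons are

$$l_{t1} = \frac{\pi r}{180}\,\frac{1}{k_{tl}}\,x_3 + l_{ts}, \tag{5}$$

$$l_{t2} = \frac{\pi r}{180}\,\frac{1}{k_{tl}}\,x_4 + l_{ts}. \tag{6}$$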

In [11], the activation levels $a_1$ and $a_2$ are governed by differential equations affected by neural inputs. Here, the activation level dependence on the neural inputs is included in the dynamics of the controller model. Thus $a_1$ and $a_2$ are simply the output signals of the controller. Equations (5) and (6) are linearized versions of the equations given in [11]. These linearizations simplify the model further and experiments show that the resulting loss of performance is negligible. Another simplification implemented here is the neglect of the mass of the extraocular muscles. This makes the model equations much less complex while keeping the model behavior intact, which was also shown in experiments. The downside of this model reduction is the need to divide (3) and (4) into piecewise functions in order for the model to still be defined for zero values of the muscle activation levels, $a_1$ and $a_2$. This necessitates the introduction of the arbitrary non-physical threshold $c$. The parameters in the model equations are parameters native to the eye and approximate values are given in [11].
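The role of $f(l_m)$ in the low-activation branches of (3) and (4) can be made explicit with a short worked derivation. The sketch below rests on an interpretation that is not stated in the text: for $a_1 < c$ the active force contribution is neglected, so the tendon force equals the passive muscle force. Under that assumption $f$ is the derivative of $F_{pe}$ with respect to muscle length, which agrees with the expressions for $f$ and $F_{pe}$ given above on the interior of each interval.

```latex
% Assumption (ours, not the source's): for a_1 < c the active force is
% neglected, so that x_3 = F_pe(l_m1).
\begin{align*}
x_3 &= F_{pe}(l_{m1})
  \;\Rightarrow\;
  \dot{x}_3 = f(l_{m1})\,\dot{l}_{m1},
  \qquad f(l_m) = \frac{dF_{pe}}{dl_m}, \\
\dot{x}_3 &= K_t(x_3)\left(-x_2 - \frac{180}{\pi r}\,\dot{l}_{m1}\right)
  \qquad \text{(third row of (2))}, \\
\dot{l}_{m1} &= \frac{-K_t(x_3)\,x_2}{f(l_{m1}) + \frac{180}{\pi r}\,K_t(x_3)} .
\end{align*}
```

Equating the two expressions for $\dot{x}_3$ and solving for $\dot{l}_{m1}$ reproduces the second branch of (3). The same steps with $x_4$ and the fourth row of (2) give the second branch of (4), with the opposite sign in the numerator.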


2.2 Controller Model

In [2], it is assessed that the smooth pursuit mechanism is most sensitive to perturbations at high velocities, which led Nuding et al. [13] to include a dynamic gain

$$G = M_k\,|\dot{y}(t)| + M_m, \tag{7}$$

in the controller. The controller model used herein consists of this gain followed by an integrating element. The role of the integrator is to filter out high frequency variations, as it has been observed in experiment that rapid but small variations in the visual stimulus are mostly ignored by the SPS. The parameters of the gain, $M_k$ and $M_m$, will be called the velocity coefficient and the base gain, respectively.

2.3 Complete Model

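The second state variable in (1) is the angular velocity of the eye. By subtracting this velocity from the angular velocity of the moving target, $s(t)$, an error signal, $e(t)$, is formed. The error signal is fed to the controller to complete the closed-loop system. A fifth state variable, which represents the control signal $n(t)$, is added to the state vector of (1) and its differential equation is added to the system of equations in (2), yielding

$$\dot{x} = \begin{pmatrix}
x_2 \\
\frac{1}{J_g}\left(x_3 - x_4 - B_g x_2 - K_g x_1\right) \\
K_{t1}\left(-x_2 - \frac{180}{\pi r}\,\dot{l}_{m1}\right) \\
K_{t2}\left(x_2 - \frac{180}{\pi r}\,\dot{l}_{m2}\right) \\
\left(M_k |x_2| + M_m\right)(s - x_2)
\end{pmatrix}. \tag{8}$$

Here $K_{t1}$ and $K_{t2}$ abbreviate the tendon elasticities $K_t(x_3)$ and $K_t(x_4)$ appearing in the right side of (2).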

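The activation levels of the muscles, $a_1$ and $a_2$, specified in (3) and (4), are then made dependent on $x_5$ according to

$$a_1 = \begin{cases} x_5, & x_5 \ge 0 \\ 0, & x_5 < 0, \end{cases} \tag{9}$$

$$a_2 = \begin{cases} 0, & x_5 \ge 0 \\ |x_5|, & x_5 < 0. \end{cases} \tag{10}$$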

These activation levels govern the rotation of the eye, so making them depend on the control signal closes the feedback loop and completes the model. Note that (9) and (10) introduce yet another nonlinearity to the system. Fig. 1 shows a block diagram of the complete model of the SPS.
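The controller side of the closed loop can be illustrated with a few lines of Python. This is a minimal sketch of the fifth row of (8) and of the split in (9) and (10) only. The function names are hypothetical, the eye plant itself is not implemented, and the numerical values in the usage note are just the Subject 1 estimates reported in Section 4.3.

```python
def control_derivative(x2, s, Mk, Mm):
    """Fifth row of (8): time derivative of the control signal x5 = n(t).

    x2 is the eye angular velocity, s the target angular velocity,
    Mk the velocity coefficient and Mm the base gain.
    """
    gain = Mk * abs(x2) + Mm      # dynamic gain, cf. (7)
    return gain * (s - x2)        # gain times velocity error e(t)


def activation_levels(x5):
    """Split the signed control signal into two activations, cf. (9)-(10)."""
    a1 = x5 if x5 >= 0 else 0.0       # (9)
    a2 = abs(x5) if x5 < 0 else 0.0   # (10)
    return a1, a2
```

For an eye at rest and a 10 degrees/s target, `control_derivative(0.0, 10.0, 0.0038, 0.1874)` evaluates the base gain times the full velocity error, and `activation_levels` then routes a positive control signal to $a_1$ and a negative one to $a_2$.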


[Block diagram with signals s(t), e(t), n(t), y(t) and blocks Mk|y| + Mm, 1/s and "Nonlinear eye plant".]

Figure 1: The complete model, where $s(t)$ is the angular velocity of the stimulus, $y(t)$ is the angular velocity of the eye, $e(t)$ is the angular velocity error and $n(t)$ is the control signal to the eye plant. $M_k$ is the velocity coefficient and $M_m$ is the base gain of the controller. The differential equation for $n(t)$ is thus $\dot{n}(t) = (M_k|y| + M_m)\,e(t)$.

3 Identification

3.1 Data Acquisition

To estimate the unknown parameters of the model, input data and corresponding output data of the system are required. The model relates gaze direction to dynamical visual stimuli. The visual stimuli consist of a moving white circle projected on a computer monitor. The gaze direction of test subjects who were asked to follow the moving circle with their eyes was registered using electrooculography (EOG).

The experimental setup consisted of a laptop computer with an external 20” monitor and an EOG eye tracking device, located at the Uppsala University Hospital, Sweden. Measured signals were logged at a sampling frequency of 62.5 Hz, i.e. with a sampling period of $T_s = 16$ ms. The acquired data were low-pass filtered to minimize undesired noise.

Discrete-time horizontal and vertical components of the visual stimuli were generated by

$$\begin{aligned}
s(t_i) &= s(t_{i-1}) + v(t_i)\,T_s, \\
v(t_i) &= \begin{cases}
v(t_{i-1}) + a(t_i)\,T_s, & |s(t_{i-1})| < L/2 \\
-v(t_{i-1}) + a(t_i)\,T_s, & |s(t_{i-1})| \ge L/2
\end{cases}
\end{aligned} \tag{11}$$

where $a(t_i)$ for $t_i = iT_s$, $i = 0, 1, \ldots$, are observations of a zero-mean i.i.d. Gaussian process, $s(t_i)$ is the horizontal or vertical part of the stimulus and $L$ is the width or height of the window. The velocity of the stimuli generated by (11) thus follows a so-called random walk, but one that changes its sign upon reaching the window edge. This effectively makes the white circle "bounce" off the window walls.

Note that in identification, $v(t_i)$ is used as the input signal, due to the velocity feedback structure of the SPS, depicted in Fig. 1.
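The recursion in (11) is straightforward to sketch in code. The following Python function is an illustration under stated assumptions rather than the generator used in the experiments: the initial position `s0`, initial velocity `v0`, acceleration standard deviation `sigma_a` and the `seed` argument are hypothetical, since the text does not report them.

```python
import random


def generate_stimulus(n, Ts, L, sigma_a, s0=0.0, v0=0.0, seed=0):
    """Generate one stimulus component following the recursion (11).

    n: number of samples, Ts: sampling period, L: window width or height,
    sigma_a: standard deviation of the Gaussian acceleration (assumed).
    Returns the position list s and the velocity list v.
    """
    rng = random.Random(seed)
    s, v = [s0], [v0]
    for i in range(1, n):
        a = rng.gauss(0.0, sigma_a)          # zero-mean i.i.d. acceleration
        if abs(s[i - 1]) < L / 2:
            v_i = v[i - 1] + a * Ts          # ordinary random-walk update
        else:
            v_i = -v[i - 1] + a * Ts         # sign change at the window edge
        v.append(v_i)
        s.append(s[i - 1] + v_i * Ts)        # integrate velocity to position
    return s, v
```

With `sigma_a = 0` the acceleration vanishes and the circle simply travels back and forth between the edges, which is a convenient sanity check of the bounce logic. A call such as `generate_stimulus(1625, 0.016, L, sigma_a)` would correspond to one 26-second stimulus at the stated sampling rate, with `L` and `sigma_a` chosen by the user.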


3.2 Identification Method

Given input and output data of the system, the parameter set that minimizes the cost function

$$V(M_k, M_m) = \frac{1}{N}\sum_{i=0}^{N-1}\left(y(t_i) - \hat{y}(t_i, M_k, M_m, I_p)\right)^2, \tag{12}$$

is sought, where $y$ is the measured system output, $\hat{y}$ is the model output, $M_k$ and $M_m$ are the unknown model parameters, $I_p$ is the initial value for the position and $t$ is time.

The model in (8) is nonlinear both in the states and in the parameters, requiring the use of a nonlinear least-squares method to find the minimum of (12). Here, the model was implemented as a nonlinear grey-box object in the System Identification Toolbox of MATLAB® using the idnlgrey function. The pem function, which applies the prediction-error minimization method, was then used to carry out the identification of the unknown parameters. The function utilizes the trust-region reflective Newton method [20].

Because of the nonlinearity of (8), the cost function in (12) is not necessarily convex, meaning it may have several local minima. Hence, it is not guaranteed that the global minimum will be found. The outcome will depend on the initial guess of the parameters. In this paper, the values of the controller parameters provided in [13] were used for initialization.

In (12), the initial value of the angular position of the eye, $I_p$, is treated as a parameter to be estimated. The initial values of all other states were set to zero. Experiments show that the initial position is the only initial condition that significantly affects the identification results and model behavior.
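The cost in (12) and its sensitivity to the initial guess can be sketched in Python. The paper itself starts the optimizer from a single initial guess taken from [13]. The multi-start wrapper below is an illustrative alternative and not the authors' procedure, and `fit` and `simulate` are hypothetical user-supplied callables standing in for a local optimizer and a model simulator.

```python
def cost(y_meas, y_model):
    """Mean squared output error, cf. (12)."""
    if len(y_meas) != len(y_model):
        raise ValueError("signals must have equal length")
    n = len(y_meas)
    return sum((ym - yh) ** 2 for ym, yh in zip(y_meas, y_model)) / n


def multi_start(fit, simulate, y_meas, starts):
    """Run a local fit from several initial guesses and keep the best.

    fit(theta0) returns a locally optimal parameter estimate,
    simulate(theta) returns the model output for those parameters.
    Returns the tuple (lowest cost, corresponding estimate).
    """
    best = None
    for theta0 in starts:
        theta = fit(theta0)
        value = cost(y_meas, simulate(theta))
        if best is None or value < best[0]:
            best = (value, theta)
    return best
```

Comparing the final cost across starting points gives a cheap indication of whether a particular run ended in a poor local minimum, at the price of one full identification per start.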

4 Results

4.1 Performance Evaluation on Experimental Data

Here, the performance of the model postulated in this paper is evaluated and compared to that of two current state-of-the-art models. The first is one proposed by Robinson [16] which incorporates a complex controller model but a very simple model of the eye plant. The second is the model suggested by Nuding et al. [13] which extends Robinson’s model by augmenting it with a nonlinear dynamic gain control. As a final comparison, the model performances are compared to that of a linear black-box model, the transfer function of which is given by

$$G(z) = \frac{b_0 + b_1 z^{-1} + b_2 z^{-2}}{1 + a_1 z^{-1} + a_2 z^{-2} + a_3 z^{-3}},$$

where $\{a_n\}_{n=1}^{3}$ and $\{b_n\}_{n=0}^{2}$ are the unknown parameters to be estimated.
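The black-box transfer function above corresponds to a linear difference equation, which can be simulated directly. The sketch below assumes zero initial conditions and uses a hypothetical function name. The coefficients $a_n$ here are the denominator coefficients of $G(z)$ and are unrelated to the muscle activation levels.

```python
def simulate_black_box(u, b, a):
    """Simulate y[k] = sum_j b[j] u[k-j] - sum_j a[j] y[k-j].

    u: input sequence, b = (b0, b1, b2), a = (a1, a2, a3).
    Samples before k = 0 are taken to be zero (an assumption).
    """
    y = []
    for k in range(len(u)):
        acc = 0.0
        for j, bj in enumerate(b):             # numerator taps: lags 0, 1, 2
            if k - j >= 0:
                acc += bj * u[k - j]
        for j, aj in enumerate(a, start=1):    # denominator taps: lags 1, 2, 3
            if k - j >= 0:
                acc -= aj * y[k - j]
        y.append(acc)
    return y
```

For instance, with `b = (1.0, 0.0, 0.0)` and `a = (-0.5, 0.0, 0.0)` the impulse response halves at every step, as expected for a single pole at $z = 0.5$.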


Ten different 26-second (1625-sample) stimuli were generated using (11) and displayed to two test subjects to obtain a total of 10 input-output data sets per test subject. For each data set, four different models were identified: the model proposed in this paper (M1), the model by Robinson (M2), the model by Nuding et al. (M3), and a third order linear black-box model (M4). M1–M3 were identified using the System Identification Toolbox of MATLAB®. M4 was identified using the standard prediction error method [18].

The obtained models of each test subject were validated using the remaining 9 data sets not used for identification. This was done by simulating the nonlinear models using the classical Runge-Kutta method and comparing the obtained model output with the corresponding measured eye movement. The comparison was done in terms of the value of the cost function (which in the case of M1 is given by (12)).

Tab. 1 shows the obtained mean value and standard deviation of the validation cost function for each model and test subject. It demonstrates that the model of this work outperformed the models of previous research. As expected, M3 yielded better model fit than M2 due to the fact that M3 is an enhanced version of M2. Furthermore, all three grey-box models outperformed the linear black-box model.

Model   #P   Subject 1, V(θ)   Subject 2, V(θ)
M1      2    2.21 (1.03)       4.11 (1.51)
M2      4    4.12 (1.91)       7.89 (2.66)
M3      3    3.14 (1.79)       6.11 (2.69)
M4      6    5.01 (2.21)       8.28 (3.11)

Table 1: The average value of the cost function V(θ) (standard deviation) on validation data for the four models of both test subjects. #P is the number of estimated parameters.

One set of horizontal eye movements of test subject 1 along with the corresponding output of the model of this paper is shown in Fig. 2. By inspection, the model seemed to perform accurately and the overall dynamics of the system were captured to a great extent.
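The validation step simulates the nonlinear models with the classical Runge-Kutta method. A minimal fixed-step version of that integrator is sketched below for a generic state derivative `f(t, x)`. The function names are hypothetical, and the step size used in the experiments is not stated in the text.

```python
def rk4_step(f, t, x, h):
    """One classical fourth-order Runge-Kutta step for dx/dt = f(t, x)."""
    k1 = f(t, x)
    k2 = f(t + h / 2, [xi + h / 2 * ki for xi, ki in zip(x, k1)])
    k3 = f(t + h / 2, [xi + h / 2 * ki for xi, ki in zip(x, k2)])
    k4 = f(t + h, [xi + h * ki for xi, ki in zip(x, k3)])
    return [xi + h / 6 * (a + 2 * b + 2 * c + d)
            for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]


def simulate(f, x0, h, n_steps):
    """Integrate from t = 0 with fixed step h and return all visited states."""
    t, x = 0.0, list(x0)
    trajectory = [list(x)]
    for _ in range(n_steps):
        x = rk4_step(f, t, x, h)
        t += h
        trajectory.append(list(x))
    return trajectory
```

To simulate the closed-loop model, `f` would return the right side of (8) for the current state and stimulus velocity, and the first state of each trajectory entry would be compared with the measured gaze angle through (12).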

4.2 Frequency Characteristics

It is difficult to study the frequency characteristics of the proposed model since it is nonlinear. In particular, the nature of its nonlinearity is such that the gain will scale with the amplitude of the input signal. This is because the maximum velocity of the eye at a certain frequency depends on the amplitude of the motion, and the gain of the controller is velocity dependent as seen in (7).

[Plot: angle (degrees) versus time (s); curves: model output and eye movement.]

Figure 2: Horizontal eye movements of Subject 1 together with the corresponding model output of the model of this paper.

Hence, there is no unique frequency response curve for the model. However, it is instructive to study approximate frequency response curves for fixed input signal amplitudes. Generally, for nonlinear models, approximate frequency characteristics can be obtained by methodically exciting the model with sinusoids of fixed amplitude, but with different frequencies. The amplitudes and phases of the main frequency components of the corresponding outputs are then registered and a non-rigorous frequency response can be constructed. In Fig. 3 and Fig. 4, typical approximate frequency characteristics of one model realization per test subject are shown. The responses shown in Fig. 3 and Fig. 4 were obtained by exciting the system with input sinusoids of amplitude 10° and 1° respectively.

For both models, the gain increases slightly as the frequency increases from zero. This behavior is explained by the fact that at low frequencies, the maximum angular velocity of the eye will be low and so the gain of (7) will be at a minimum. The gain will then increase with increasing frequency. The gains of the models peak near 1 Hz and they both exhibit approximately constant phase lags for frequencies lower than 1 Hz. This suggests that the SPS is more responsive at frequencies up to about 1 Hz for this input amplitude. At high frequencies, the velocity saturation in the biomechanical eye plant will cause the gain to rapidly decrease. For input signals of amplitude 1°, the range over which the gain is constant is wider and the phase lags are smaller. This is due to the fact that for small amplitudes, higher frequencies are required for the velocity saturation to become effective. Input-amplitude-dependent frequency characteristics are a behavior that linear models cannot capture.
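The sinusoid-based procedure described above can be sketched as follows. The code extracts the component at the excitation frequency from sampled input and output signals by projection onto a complex exponential and reports their ratio. It is an illustration under the assumption that the record spans an integer number of periods. The function names are hypothetical, and the text does not state how the main frequency components were actually extracted.

```python
import cmath
import math


def fundamental_component(x, freq, Ts):
    """Complex amplitude of the component of x at frequency freq (Hz).

    Assumes the record covers an integer number of periods of freq.
    """
    n = len(x)
    return (2.0 / n) * sum(
        xk * cmath.exp(-2j * math.pi * freq * k * Ts) for k, xk in enumerate(x)
    )


def gain_and_phase(u, y, freq, Ts):
    """Approximate gain and phase (degrees) from input u to output y."""
    ratio = fundamental_component(y, freq, Ts) / fundamental_component(u, freq, Ts)
    return abs(ratio), math.degrees(cmath.phase(ratio))
```

Repeating this over a grid of frequencies for one fixed input amplitude yields one approximate response curve. Changing the amplitude and repeating gives a second curve, which for a nonlinear model will generally differ from the first.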

4.3 Angular Velocity Step Responses

The responses to a 10°/s horizontal angular velocity step of the two model realizations given in Fig. 3 and Fig. 4 are shown in Fig. 5. The estimated controller parameters in the model pertaining to Subject 1 were $M_k = 0.0038$ and $M_m = 0.1874$ and those of Subject 2 were $M_k = 0.0041$ and $M_m = 0.1164$. The higher base gain in the first model causes the activation level $a_1$ in (3) to exceed the value of $c$ sooner, resulting in an earlier onset of ocular motion. This is just a side effect of the division of the expressions for $\dot{l}_{m1}$ and $\dot{l}_{m2}$ in (3) and (4) into piecewise-defined functions, and is not a physical property of the SPS. Thus, the delay before onset should not be considered to be the reaction time of the test subjects. However, the rise times and overshoots match experiments conducted in previous research [16]. The pursuit gain is seen in Fig. 5 to be about 0.8–0.9 for both models. These values are normal for healthy individuals [8]. The overall appearance of the responses agrees with the experimental results of previous research [13,16].

Figure 3: Approximate frequency characteristics (gain and phase in degrees versus frequency in Hz, for Subject 1 and Subject 2) of the models using input amplitudes of 10°.

Figure 4: Approximate frequency characteristics (gain and phase in degrees versus frequency in Hz, for Subject 1 and Subject 2) of the models using input amplitudes of 1°.

[Plot: angular velocity (degrees/s) versus time (s); curves: Subject 1 and Subject 2.]

Figure 5: Angular velocity step responses of the two models for a 10°/s angular velocity step.

4.4 Convergence Region

The parameters sought are those for which the loss function is at a global minimum. However, whether the algorithm of Section 3.2 converges to a global minimum or to an ambiguous local minimum is highly dependent on both the function to be minimized and the initial guess of the parameters. To assess the extent to which one can rely on the identification results, the convergence region should be studied. The convergence region is the part of the parameter space in which the initial guess must lie for the identification algorithm to converge to the global minimum, i.e. to the true system parameters. The convergence region is unique to the specific system parameters and input signal.

Establishing the convergence region is done by simulating a system with known parameters and then identifying it from the input and the produced output, once for each of a large number of initial guesses. If the estimated parameters are close enough to the actual values, the algorithm is said to have converged for that particular initial guess.

Fig. 6 shows the convergence region of the identification when the system parameters were $M_k = 0.0041$ and $M_m = 0.1164$ and the input was as in Fig. 2. The algorithm was said to converge whenever the estimated parameter values were within ±5% of the actual parameter values. Evidently, the convergence is more sensitive to the initial value of $M_k$ than to that of $M_m$. For small $M_k$, the algorithm converged for most $M_m$. However, for larger $M_k$ the convergence region becomes sparser. It is also noteworthy that even for the correct value of $M_k$, the algorithm may not converge depending on the initial value of $M_m$.
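The convergence-region procedure amounts to a grid search over initial guesses with a relative-error test. A minimal Python sketch is given below. The `identify` argument is a hypothetical callable that runs the identification from one initial guess and returns the parameter estimates, and the grids are chosen by the user, as the text does not list the values that were scanned.

```python
def converged(estimate, truth, tol=0.05):
    """True if every estimate lies within +/- tol (relative) of its true value."""
    return all(abs(e - t) <= tol * abs(t) for e, t in zip(estimate, truth))


def convergence_region(identify, truth, mk_grid, mm_grid, tol=0.05):
    """Boolean map of initial guesses (Mk, Mm) from which identify converges.

    Rows follow mm_grid and columns follow mk_grid.
    """
    return [
        [converged(identify((mk, mm)), truth, tol) for mk in mk_grid]
        for mm in mm_grid
    ]
```

Each entry of the returned map plays the role of one cell in a plot such as Fig. 6. The cost of building the map is one full identification run per grid point.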

[Plot: convergence map over the initial guesses of Mk and Mm.]

Figure 6: The convergence region (black) of the identification algorithm when the system parameters were $M_k = 0.0041$ and $M_m = 0.1164$ and the input as in Fig. 2.


Constructing a model of the human smooth pursuit mechanism by combining a nonlinear biomechanical model of the eye plant with a dynamic gain controller produced favorable results in comparison to models of previous research. Better overall model fit was achieved with the considered model, most likely due to the fact that the linear eye plants in the earlier models fail to provide proper movement restriction at larger gaze angles, resulting in both position and velocity overshoots. The velocity saturation of the model presented herein, introduced through the use of the Hill equations (not to be confused with the Hill functions), captured such movement restriction to a greater extent.

Furthermore, adopting a physically motivated grey-box model is an advantageous approach compared to using black-box type models. For the former alternative, it is possible to assign physical meaning to the identified parameters. This is not the case for black-box models, where the parameter values give little information about the modeled system. Through the identified grey-box model, a greater understanding of the physics and physiology of the system is obtained, which may be of aid in various applications within e.g. medicine.

In Fig. 2 there are some inexplicable deviations from the stimulus in the registered eye movements which are impossible for the model to reproduce. These deviations are probably best explained by the fact that a human is not a simple dynamical system. The intelligent nature of the brain will try to predict the movements of the stimulus and when the movements are random, such predictions will often fail. Glitches in the attempted tracking are therefore common. These, and several other factors, are summarized as noise, simply because they are unpredictable and difficult, if not impossible, to model. If the gaze trajectories of a test subject exhibit a significant amount of such noise-like anomalies, the estimated model quality will be poor. There are many other ways to design the controller model, some of which may give better results than those achieved in this study. Nevertheless, no model can predict the unpredictable.

The extraocular system is inherently nonlinear, but it is plausible to assume that it behaves linearly for small angles. Some of the complexities in the model may thus be superfluous and certain linearizations could allow for further simplification. Ideally, the model behavior should approach that of a linear black-box model as more of the model equations are linearized. This would imply that this model is an enhanced version of available linear models.

In this study, no effort was made to combine the information of the recorded horizontal movements and vertical movements to gain additional knowledge of the tracking abilities of a subject. Studying vertical and horizontal movements separately may overlook useful information. Future research should consider the coupling of the vertical and horizontal part of the system to reveal properties that apply to the entire ocular system and not just components of it.

The data acquired from the contact sensor-based eye tracker were not given any unit or physical magnitude. The correct angles of the visual stimuli were calculated using the knowledge of the monitor size and the distance of the test subject to it, but assessing the actual angles of the output data from the eye tracker is not straightforward. Calibration sequences had to be run in order to relate the output data to the input data. During the calibration, the white circle was sequentially positioned in all four corners of the monitor for acquisition of reference points. Unfortunately, these calibration sequences gave no simple mapping between output and input data. This was due to direct current drifts in the system, tilted head, non-centralized gaze, and asymmetrically placed contact sensors, which all resulted in distortions of the recorded data. Data normalization would require application of nonlinear transforms to account for perspective and rotational distortions. Lacking the information needed to define such transforms, the output data were scaled and de-trended to attain the best possible fit to the calibration sequence. Using eye tracking techniques more suitable for smooth pursuit, such as video-based eye tracking, would solve these problems and may lead to better results.

References

[1] G. Avanzini, F. Girotti, T. Carazeni, R. Spreafico. Oculomotor disorders in Huntington’s chorea. Journal of Neurology, Neurosurgery, and Psychiatry, Vol. 42, pp. 581-589, 1979.

[2] A. K. Churchland and S. G. Lisberger. Gain Control in Human Smooth-Pursuit Eye Movements. Journal of Neurophysiology, Vol. 87, pp. 2936-2945, 2002.

[3] M. R. Clark and L. Stark. Control of human eye movements: I. Modeling of extraocular muscles; II. A model for the extraocular plant mechanism; III. Dynamic characteristics of the eye tracking mechanism. Mathematical Biosciences, Vol. 20, pp. 91-265, 1974.

[4] G. Cook and L. Stark. Derivation of a model for the human eye-positioning mechanism. Bull. Math. Biophys., Vol. 29, pp. 153-174, 1967.

[5] R. Dodge. Five types of eye movements in the horizontal meridian plane of the field of regard. American Journal of Physiology, Vol. 8, pp. 307-329, 1903.

[6] J. M. Gibson, R. Pimlott, C. Kennard. Ocular motor and manual tracking in Parkinson’s disease and the effect of treatment. Journal of Neurology, Vol. 50, pp. 853-860, 1987.

[7] A. V. Hill. The heat of shortening and dynamic constants of muscle. Proc. Roy. Soc. Lond., B126:136-195, 1938.

[8] N. Kathmann, A. Hochrein, R. Uwer, B. Bondy. Deficits in Gain of Smooth Pursuit Eye Movements in Schizophrenia and Affective Disorder Patients and Their Unaffected Relatives. American Journal of Psychiatry, Vol. 160, pp. 696-702, 2003.

[9] H. Metz. Saccadic velocity measurements in strabismus. Trans Am Ophthalmol Soc, Vol. 81, pp. 630-692, 1983.

[10] C. H. Meyer, A. G. Lasker and D. A. Robinson. The upper limit of human smooth pursuit velocity. Vision Res, Vol. 25, pp. 561-563, 1985.

[11] A. McSpadden. A mathematical model of human saccadic eye movement. Master’s Thesis, Texas Tech University, 1998.

[12] W. T. Newsome, R. H. Wurtz, M. R. Dursteler, A. Mikami. Deficits in visual motion processing following ibotenic acid lesions of the middle temporal visual area of macaque monkey. Journal of Neuroscience, Vol. 5, pp. 825-840, 1985.

[13] U. Nuding, S. Ono, M. J. Mustari, U. Buttner and S. Glasauer. A Theory of the Dual Pathways for Smooth Pursuit Based on Dynamic Gain Control. Journal of Neurophysiology, Vol. 99, pp. 2798-2808, 2008.

[14] C. Rashbass. The relationship between saccadic and smooth tracking eye movements. Journal of Physiology, Vol. 159, pp. 326-338, 1961.

[15] D. A. Robinson. The mechanics of human saccadic eye movement. Journal of Physiology, Vol. 174, pp. 245-264, 1964.

[16] D. A. Robinson, J. L. Gordon and S. E. Gordon. A Model of the Smooth Pursuit Eye Movement System. Biological Cybernetics, Vol. 55, pp. 43-57, 1986.

[17] A. B. Sereno, P. S. Holzman. Antisaccades and Smooth Pursuit Eye Movements in Schizophrenia. Biological Psychiatry, Vol. 37, pp. 394-401, 1995.

[18] T. Soderstrom, P. Stoica. System identification. Prentice-Hall, Inc., Upper Saddle River, NJ, USA, 1988.

[19] L. R. Young and L. Stark. Variable Feedback Experiments Testing a Sampled Data Model for Eye Tracking Movements. IEEE Transactions on Human Factors in Electronics, Vol. HFE-4, pp. 38-51, 1963.

[20] Y. X. Yuan. A Review of Trust Region Algorithms for Optimization. ICIAM, Vol. 99, pp. 271-282, 2000.


