DESIGNING HIGHER PERFORMANCE
NEURAL PROSTHETIC SYSTEMS
A DISSERTATION
SUBMITTED TO THE DEPARTMENT OF ELECTRICAL ENGINEERING
AND THE COMMITTEE ON GRADUATE STUDIES
OF STANFORD UNIVERSITY
IN PARTIAL FULFILLMENT OF THE REQUIREMENTS
FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY
Gopal Santhanam
December 2006
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
UMI Number: 3242614
UMI Microform 3242614
Copyright 2007 by ProQuest Information and Learning Company.
All rights reserved. This microform edition is protected against
unauthorized copying under Title 17, United States Code.
ProQuest Information and Learning Company 300 North Zeeb Road
P.O. Box 1346 Ann Arbor, MI 48106-1346
© Copyright by Gopal Santhanam 2007
All Rights Reserved
I certify that I have read this dissertation and that, in my opinion, it is fully adequate
in scope and quality as a dissertation for the degree of Doctor of Philosophy.
(Krishna V. Shenoy) Principal Adviser
I certify that I have read this dissertation and that, in my opinion, it is fully adequate
in scope and quality as a dissertation for the degree of Doctor of Philosophy.
(Balaji Prabhakar)
I certify that I have read this dissertation and that, in my opinion, it is fully adequate
in scope and quality as a dissertation for the degree of Doctor of Philosophy.
(Teresa H. Meng)
Approved for the University Committee on Graduate Studies.
Abstract
Many individuals suffer from movement disorders, ranging from neurological deficits in the central
nervous system to limb amputations. In extreme cases, higher-level cognitive function can remain
intact but the motor output system is blocked and cannot function (e.g., ALS, brain-stem stroke,
or spinal cord injuries). It has been proposed that “neural prostheses” can be interfaced with
the human brain to read out motor intentions and create control signals to effectively bypass the
patient’s pathology. Recent studies have demonstrated that monkeys and humans can use signals
from the brain to guide computer cursors. These brain-computer interfaces (BCIs) may someday
assist patients, but relatively low system performance remains a major roadblock. In fact, the
speed and accuracy with which keys can be selected using BCIs is far lower than for systems
relying on simple eye movements.
This dissertation will first describe the design and demonstration, using electrode arrays implanted
in monkey dorsal pre-motor cortex, of a manyfold higher performance BCI than previously
reported. Our >4 times increase in system performance indicates that a fast and accurate key selection
system, capable of operating with a range of keyboard sizes, is indeed possible (up to 6.5
bits/s or ~15 words per minute). Next, an algorithm will be introduced that further increases performance
over standard neural decoding models by incorporating correlation structure between
recorded neural signals. We find that by using a probabilistic framework, we can appreciably reduce
the error rate of our prosthetic system. Lastly, as such prosthetic systems transition to the
clinical setting, there will be a need to more fully characterize electrode array sensor stability,
a feature essential for mobile humans. For this purpose, I will describe the design and preliminary
data from an embedded system for recording neural data from freely behaving monkeys and
our preliminary characterizations of long-duration, continuously-recorded data from a standard,
implanted electrode array. Taken together, these results should substantially increase the clinical
viability of BCIs in humans as well as provide opportunities for the further study of neural
prostheses. Finally, there will be a discussion of collaborative work in the context of my primary
research.
Preface
Early in my graduate student career, I was at a crossroads. I had nearly completed my Master’s
degree and had to decide whether I would continue on in pursuit of a doctorate. I investigated
the vast array of opportunities available in the Department of Electrical Engineering at Stanford
and evaluated the research groups that were aligned with my personal background and interests.
Nothing really grabbed my attention until I heard about a new research group being established
by a young Assistant Professor, Krishna Shenoy. Although I had come from a traditional electrical
engineering background and had no experience in neuroscience or bioengineering, I was hooked
after my first meeting with Krishna. The prospect of performing scientific experiments to better
understand the brain and engineering prosthetic systems for neurologically impaired patients was
immediately exciting, motivating, and intellectually challenging. The field is certainly accessible
to people who are uninitiated or non-technical. In this vein, I would like this dissertation to be
easily digestible by the reasonably intelligent reader.
However, that is not to say that the research field as a whole is light on detail or rigor. There
are a great many opportunities for solid, intellectual contributions, including (but not limited to)
developing good experimental paradigms, creating efficient experimental apparatus, conducting
thorough analyses, and deriving new computational approaches. Above all, I have felt that a methodical
approach coupled with a strong attention to detail can lead to very illuminating results. It
is my hope that this particular viewpoint will be expressed through the course of this dissertation.
Another quality that attracted me to this research area was the opportunity to engineer and
build tangible systems. Too often, academic research can be theoretical without a practical bent,
at least in the near term. While I had had an initial desire to work on theoretical problems, I
found that I subconsciously gravitated toward projects that dealt with the development of research-quality
infrastructure and the execution of laboratory experiments. This will be a common thread
that connects the various projects in this thesis.
Finally, it is important to note that all of the presented work was a team effort. Though I took
the lead for the projects described in the main body of this dissertation, I had vital assistance from
other members of the Shenoy laboratory, without whom I would not have been able to carry this
research forward. Such collaboration is inevitable for these types of projects, especially when they
span multiple years and personnel. Likewise, during my tenure in the laboratory, I supported
other research conducted by my colleagues. In recognition of that collaboration, I have included
an appendix that briefly describes a few of these projects and their impact on my area of research
as a whole.
The following outline is a road map for this dissertation.
• Chapter 1 provides a broad introduction to neural prosthetic systems, or brain-computer
interfaces, and their utility for helping patients with motor disabilities. I describe the general
schematic of a brain-computer interface and point to the pressing need to improve such
systems before they are clinically viable. I also introduce a categorization for prosthetic
systems that are targeted toward individuals with motor disabilities. This categorization
should help the reader understand the various advantages and disadvantages of the types of
systems that one might consider developing. An overview of the past literature is also given
to provide adequate perspective for the later chapters of this dissertation.
• In Chapter 2, I report some of the infrastructure that was built and used in our brain-computer
interface. Specifically we constructed a more robust system to perform “spike sorting”
(the task of discriminating between different neurons on the tip of a recording electrode)
while a laboratory experiment is in progress. This task of separating neurons as distinct
sources is an important signal processing problem; different neurons provide different views
on what the brain is doing and mixing them together will degrade our ability to accurately
extract that information. The improvement of real-time “spike sorting” was one aspect in
our quest for greater overall performance of neural prosthetic systems.
• Chapter 3 describes the brain-computer interface that we built in the laboratory using a
non-human primate animal model. This is the cornerstone of my thesis and demonstrates
an actual system, directly analogous to the types of systems that will be used for paralyzed
patients. The system is intended as a communication device (e.g., a keyboard interface for
typing out emails). It is able to achieve a more than four-fold increase in performance over
the current state-of-the-art. We performed careful controls to ensure that our results were
not confounded by certain neurophysiological factors. We also varied different parameters to
understand the impact of keyboard layout, task timing, and numbers of recorded neurons on
the overall performance of the system.
• Chapter 4 delves into the algorithmic decoding component of the brain-computer interface.
A neural prosthesis targeted for individuals with motor dysfunction must be able to read signals
from the brain, decode their meaning, and produce an output. The algorithm we used
for the system characterized in Chapter 3 was relatively simple. The purpose of the work
presented in this chapter was to explore how much improvement might be gained by trying
a more complicated decoding algorithm. Here, we used more sophisticated mathematical
techniques to better model the responses of neurons and take into account their interrelationships.
The new approach resulted in a substantial increase in the performance in certain
situations.
• In Chapter 5, I switch gears and introduce a system that we designed and built in the laboratory
to conduct portable neural recordings. As prosthetic systems transition to wider clinical
use, there will be a greater need to test laboratory systems in settings that are similar to
the usage mode of actual patients. Importantly, the patient population will not only include
completely paralyzed individuals; it may eventually encompass paraplegics and amputees
who are still ambulatory. To investigate this scenario, we built a device, dubbed HermesB,
that can record neural data continuously from a rhesus monkey, even while the monkey is
freely behaving in its home cage. We were able to investigate two interesting scientific questions
with these long-duration, nearly continuous, datasets. First, we examined the stability
of neural recordings in freely behaving animals; stability refers to either the gross presence
of neural signals recorded from a chronic electrode or the general shape of the voltage waveform
emitted by any neuron. Second, we compared the neural recordings between active
periods (when the animal is moving) versus inactive periods (when the animal is stationary
or asleep). This type of characterization will be invaluable when trying to design high-performance
systems that expand past a very limited group of disabled patients.
• The second-to-last chapter will provide some brief concluding statements.
• The final chapter contains a list of my journal and conference contributions.
• An appendix, as previously mentioned, will present a brief overview of secondary projects
with which I have been involved. These projects have important implications for the thrust
of my research to increase the overall performance of neural prosthetic systems.
- When Krishna’s laboratory was in its infancy, Mark Churchland began a project investigating
motor planning in the context of fast and slow movements. The premise was
simple, namely to better understand how neural responses differed between when a
subject was planning a fast-paced reach versus a medium-paced reach to the same target
location. The results of this experiment were powerful and the work has spawned
many future studies. I was very fortunate to have had the opportunity to modestly contribute
to this work (e.g., with infrastructure development and some amount of monkey
training) and also gain valuable experience working with Mark.
- In an effort to better understand and characterize the responses in pre-motor cortex,
Aaron Batista performed experiments that compared the influence of the current eye
position against the influence of the upcoming arm movement. The influence of the
eye and the arm were surprisingly comparable, especially when considering that this is
a brain area in which a sizable fraction of neurons travel down the spinal cord and activate
the musculature. In this project, I was involved with some experimental design,
infrastructure development, and data analysis. Coupled with the experiments of Mark
Churchland, these results have a substantial impact on how we view computation in
the motor areas of the brain.
- One of the more exciting recent research thrusts in the laboratory has been our exploration
into the mechanistic details of motor planning. There is still very little known
about the actual process of how the central nervous system takes an abstract motor goal
and produces the necessary neural responses to coordinate all of the muscles in the arm.
Initial, groundbreaking work was performed by Mark Churchland, who demonstrated
that neural responses in pre-motor cortex are indicative of a recurrent neural network
settling to a solution when planning a movement. Byron Yu and Afsheen Afshar have
extended this further by attempting to map out, in an abstract space, the planning
trajectory, or the path the motor system traverses from having no plan to having a
fully-formed plan, all prior to the start of movement. I have been able to assist with
both of these projects by helping to collect relevant experimental data and providing
scientific feedback.
- Another project started in the laboratory’s early days was the effort to combine activity
from different brain areas to create a better neural prosthesis. Different parts
of the brain can provide vastly different information about a particular movement. For
example, one brain area might signal information before the movement begins (e.g.,
specifying the endpoint of the upcoming reach). Another brain area may signal information
during a movement (e.g., coordinating the flight of the arm). The project to
combine these different types of neural activity has been an ongoing one for the past
few years. The studies have been headed up by Caleb Kemere and Byron Yu and several
publications in the literature speak to the effectiveness of their approaches. My
involvement has included performing simulations, collecting neural data, and providing
scientific feedback for their work.
Acknowledgments
First and foremost, I would like to thank Krishna Shenoy for his role as my Ph.D. adviser over
the past 5+ years. There were many times during my graduate student career where I would
stop by Krishna’s office with a problem, and often I felt the problem was rather serious. Krishna
would take time out of his busy schedule, listen to my concerns, and quickly develop a plan of
action. Within about 30 minutes, I would be leaving his office, no longer worried, and re-energized
about the direction of my research. Furthermore, his keen scientific mind, methodical nature,
peer-oriented management style, and genuine concern for his advisees have made my professional
and personal relationship with him extremely rewarding.
Krishna should also receive praise for gathering together very talented professionals to support
his group. Having joined the group at its inception, I was exposed to many administrative
issues as we were setting up the laboratory. We were all rather fortunate to have the steady hand
of Sandra Eisensee chaperoning us through the Stanford bureaucracy during this time. Additionally,
there were demands on our laboratory since we use rhesus monkeys in our research. Here,
Krishna was able to recruit two first-rate lab technicians. The first was Missy Howard who was on
staff for over half of my time in the laboratory; she was a friendly caretaker of the animals and the
researchers. After her departure, our group was able to continue efficiently and effectively due to
the capable oversight of Mackenzie Risch.
Apart from my adviser, I have grown intellectually and scientifically as a result of my interactions
with my labmates. At the time I first decided to join the laboratory, the group consisted
of only three people — Krishna, Mark Churchland, and me. Little did I know then how much of
a positive impact Mark would have on my research and overall scientific outlook. My first exposure
to experiments was through his mentorship, and his clear thinking, ability to design great
experiments, and sound intuition for new concepts have been invaluable assets throughout my
thesis work. Special thanks to Stephen Ryu and Byron Yu for the wonderful collaborative efforts
that led to the development of our very own high-performance brain-computer interface. It was
very exciting working in the lab with them for many months and I couldn’t have asked for better
research partners.
I would also like to acknowledge Caleb Kemere for his unwavering willingness to lend a hand,
Aaron Batista for his insights on animal training and knowledge of the current literature, Afsheen
Afshar for providing valuable assistance in running experiments for our brain-computer interface
project, and Michael Linderman and Vikash Gilja for their tireless efforts in development, data
collection, and analysis for HermesB. There are several newer members of the Shenoy laboratory
that have also contributed to a very intellectually stimulating environment and for that I am grateful.
Furthermore, we have all benefited from the close interactions with other research programs,
including those of Maneesh Sahani from the Gatsby Computational Neuroscience Unit, Professor
Bill Newsome in Stanford’s Neurobiology department and Professor Teresa Meng in Stanford’s
Electrical Engineering department.
Finally, I would like to thank my family who has provided me with solid support throughout
these past few years, including my father, mother, grandmother, brother, and sister-in-law. I especially
appreciate the countless home cooked meals from my mother that provided me with the
biofuel to do research, and I owe so much to my brother who has inspired me to be, among other
things, intellectually curious and precise.
Contents
Abstract
Preface
Acknowledgments
1 Introduction
1.1 Overview
1.2 Motor and Communication Prostheses
1.3 Plan and Movement Activity
1.4 Recent Advances
1.4.1 Motor Prostheses
1.4.2 Communication Prostheses
1.5 Approaches to Improving Performance
2 Real-Time Spike-Sorting
2.1 Overview
2.2 Methods
2.2.1 Spike-Sorting System Diagram
2.2.2 Basic Platform
2.2.3 RR: Second Generation Classification Infrastructure
2.2.4 Spike Clustering Algorithm
2.2.5 Hoop Design for Online Classification
2.2.6 RRR: Third Generation Classification Infrastructure
2.2.7 Data Collection and Analysis
2.3 Results and Discussion
2.3.1 Clustering and Classification
2.3.2 Target Location Estimation
2.4 Feasibility of Implantable Spike-Sorting Circuits
2.5 Summary
2.6 Credits
3 A High-Performance Brain-Computer Interface
3.1 Overview
3.2 Methods
3.2.1 Neural recordings
3.2.2 Decoding Algorithms
3.2.3 Model Training
3.3 Control Experiments
3.3.1 Selection of Skip Time (T_skip)
3.3.2 Selection of Integration Time (T_int)
3.4 BCI Experiments
3.4.1 Additional BCI Performance Aspects
3.5 Summary
3.6 Addendum: EMG measurements
3.7 Addendum: Application of Information Theory to BCIs
3.7.1 Analogy to Communication Systems
3.7.2 Computations
3.7.3 Notes
3.8 Credits
4 Factor Analysis Investigation
4.1 Overview
4.2 Methods
4.2.1 Latent Variable Models
4.2.2 Poisson Output Model
4.2.3 Extensions to Accommodate Multiple Targets
4.3 Results and Discussion
4.3.1 Data Characterization
4.3.2 Target Decoding
4.3.3 Datasets with More Shared Variability
4.4 Summary
4.5 Credits
4.6 Appendix: Mathematical Derivations for FAGO
4.6.1 E Step
4.6.2 M Step
4.6.3 Inference
5 HermesB
5.1 Overview
5.2 Background
5.3 Methods
5.3.1 System Description
5.3.2 Recordings and Analyses
5.3.3 Recording Stability Analyses
5.4 Results
5.4.1 System Verification
5.4.2 Recording Stability
5.4.3 Neural Correlates of Behavioral Contexts
5.5 Summary
5.6 Credits
6 Future Directions
7 Publications
7.1 Journal Articles
7.2 Conference Talks, Articles, Abstracts
7.2.1 2006
7.2.2 2005
7.2.3 2004
7.2.4 2003
7.2.5 2002
A Select Collaborations
A.1 Speed Tuning in PMd
A.1.1 Motivation
A.1.2 Results
A.1.3 Significance
A.2 Reference Frames in PMd
A.2.1 Motivation
A.2.2 Results
A.2.3 Significance
A.3 Mechanisms of Motor Planning
A.3.1 Motivation
A.3.2 Results
A.3.3 Significance
A.3.4 Beyond NV
A.4 Mixture of Trajectory Models .... 131
A.4.1 Motivation .... 131
A.4.2 Methods .... 132
A.4.3 Results .... 135
A.4.4 Significance .... 137
List of Tables
2.1 Decoding Performance Improvement due to Spike Sorting .... 27
2.2 Decoding Performance Improvement when Further Restricting Electrodes .... 27
3.1 BCI Experiments with Highest ITRC .... 48
3.2 Comparing Methods for Calculating Information Transfer .... 59
4.1 Factor Analysis Performance Comparison .... 75
4.2 Second Factor Analysis Performance Comparison .... 77
5.1 HermesB Parameters .... 91
List of Figures
1.1 Overview Chart of Neural Prostheses .... 2
1.2 System Diagram of a Neural Prosthetic System .... 4
1.3 Examples of Plan and Movement Activity .... 5
1.4 Control of Prostheses with Plan and Movement Activity .... 7
2.1 Extraction of Neural Signals .... 17
2.2 Screenshot of the Cerebus User Interface .... 18
2.3 RR Block Diagram .... 19
2.4 Block diagram of the Sahani algorithm .... 20
2.5 Clustering Results from the Sahani Algorithm .... 22
2.6 Threshold and Hoop Design Example .... 23
2.7 RR Block Diagram .... 24
2.8 Example of Difficult Hoop Sort .... 26
3.1 Instructed-delay and BCI Tasks .... 35
3.2 Anatomical Placement of Electrode Arrays .... 36
3.3 Multivariate Gaussian Data-fitting .... 38
3.4 Empirical Spike Count Distributions .... 39
3.5 T_skip Analyses .... 42
3.6 T_skip Analyses with Multiple Targets .... 43
3.7 T_int Effects in Control Experiments .... 45
3.8 ITRC for Multi-target Task .... 46
3.9 Effect of T_int and Target Configuration in BCI Experiments .... 49
3.10 Single-trial Accuracy as a Function of Number of Neural Units and T_int .... 50
3.11 ITRC as a Function of Number of Neural Units and .... 51
3.12 EMG Measurements for Monkey G .... 54
3.13 Schematic Diagram of a Communication System .... 56
3.14 Confusion Matrices from Two Experiments .... 58
3.15 Information Transfer as a Function of Accuracy .... 60
4.1 Illustration of Spike Count Covariance .... 63
4.2 Latent Space Example for FAGO .... 70
4.3 Choosing the Number of Latent Dimensions with Test Likelihood .... 73
4.4 Intrinsic Fano Factor .... 74
4.5 Choosing the Number of Latent Dimensions for FAGOcmb .... 77
4.6 FAGOcmb for BCI Experiments .... 79
5.1 Array Lifetime Diagram .... 88
5.2 HermesB Block Diagram .... 90
5.3 HermesB Components .... 93
5.4 Sample Protocol for HermesB Execution .... 94
5.5 Sample Neural and Accelerometer Data .... 96
5.6 Comparison between CKI and HermesB .... 97
5.7 Neural Stability over 48 Hours .... 98
5.8 Variation in Vpp and RMS .... 99
5.9 Variation in Waveform Relative to Acceleration Events .... 100
5.10 Variation in Waveform under High Acceleration .... 100
5.11 Neural and Accelerometer Data .... 103
5.12 LFP Analyses .... 104
A.1 Effect of Direction, Distance, and Instructed-Speed on Neural Firing Rate .... 118
A.2 Variety of Reference Frames in PMd .... 122
A.3 Optimal-Subspace Hypothesis .... 125
A.4 NV Time Course .... 126
A.5 Inferred Trial-by-Trial Planning Dynamics .... 129
A.6 MTM Trajectory Decoding Examples .... 136
A.7 MTM compared against RWM and STM .... 137
Chapter 1
Introduction
1.1 Overview
Each year, hundreds of thousands of people suffer from neurological injuries and disease, resulting
in the permanent loss of motor function. In many cases, the disability is so severe that it is
not even possible to feed oneself or communicate with others. Though surgical and medical
interventions have made it possible to repair peripheral nerves and promote recovery in many cases,
most central nervous system impairments still do not have effective treatments. Medical systems
that electronically interface with the nervous system, termed neural prostheses, have started to
fill some of these treatment gaps.
There have been successes in other classes of disabilities, including cochlear implants for the
profoundly deaf and deep brain stimulators to alleviate Parkinsonian tremor. In the relatively near
term, epileptic-seizure disruption systems, artificial vision systems, prosthetic arm robotics, and
basic communication systems are believed to be possible, while cognitive, memory and
language-oriented systems may be likely in the more distant future. Furthermore, while much of the initial
research in these areas has been enabled by basic discoveries in systems neuroscience over the
past three decades, applied neuroprosthetic research is also beginning to provide new views of
neural representations and processing.
The ultimate goal of any prosthesis is to restore normal functionality. Though complete restoration
is ideal, prostheses are clinically viable when the anticipated quality of life improvement
outweighs the potential risks. Since neural prostheses must often measure or perturb neurons in
the central nervous system, non-invasive techniques are particularly attractive and have been
investigated extensively. Invasive electrode-based techniques have become a major research thrust
because of their high signal quality, which could enable extremely high performance
prostheses despite the increased risk. However, the use of invasive techniques represents a
somewhat long-term approach. In the near-term, due to moderate surgical risk and the extensive
investment required per patient, clinical applications may be limited to only the most severely
disabled patients. However, if these systems can surpass what is possible with non-invasive
measurement techniques, and surgical risk and system costs can be sufficiently minimized, it is
anticipated that invasive electrode-based prostheses will find more widespread use (e.g., amputees
and moderately non-communicative cerebral palsy patients as opposed to just quadriplegics and
“locked in” amyotrophic lateral sclerosis (ALS) patients).
The primary objective of the work presented in this dissertation is to increase the performance
of neural prostheses so that these systems can tangibly increase the quality of life for patients
with motor disabilities. In this chapter, we begin by first introducing “motor” and “communication”
prostheses and their goals, which allows us to better define performance. We then review the use
of “plan” neural activity, which is beginning to complement and extend the capabilities of systems
that have relied on only “movement” activity until recently. Figure 1.1 depicts the two types of
prostheses (motor and communication), two types of neural activity (movement and plan), and two
ends of the invasiveness spectrum considered here. Then, we return to the topic of recent advances
in prosthetic performance, providing a description of the valuable research already conducted in
the field and what remains before systems can be viable in a clinical setting. At the end of this
chapter, we will provide a brief overview on our approach to improving prosthetic performance,
which is the subject of the balance of this dissertation.
[Figure 1.1 chart labels: Neural prostheses; Peripheral nervous system; Central nervous system; Cortically controlled; Motor prostheses; Communication prostheses; Invasiveness (low to high); Other activity]
Figure 1.1: Chart illustrating the relationship among the various types of neural prostheses. Prostheses that interface with the peripheral nervous system (e.g., cochlear implants) are extremely important but are not considered further here. Prostheses interfacing with the central nervous system, including the spinal cord and deep brain structures (e.g., deep brain stimulators) are also very important, but are again not considered. Only cortically-controlled systems which attempt to restore motor and communication functions are considered, along with the underlying types of neural activity (movement, plan and other) and methods of measuring this activity (invasive and non-invasive).
1.2 Motor and Communication Prostheses
Motor prostheses aim to provide natural control of the paralyzed limb, via electrical
microstimulation, or of an equivalent prosthetic replacement limb. In the case of upper-limb prostheses
considered here, natural control includes the precise three-dimensional movement of all arm and
hand segments along the desired path and with the desired speed profile. Such control is indeed
a daunting ultimate goal, with many steps along the way leading to clinically viable systems. For
example, simply being able to feed oneself, even without being able to deftly cut a steak, could still
help thousands of quadriplegics.
Communication prostheses do not aim to restore the ability to communicate in the form of
natural voice or typing. Instead they aim to provide a fast and accurate communication channel
rivaling the natural communication rate with which most people can speak or type. For example,
“locked-in” ALS patients are altogether unable to converse with the outside world; many other
neuro-degenerative diseases also severely compromise the quality of speech. Being able to reliably
type even a few words per minute on a computer would be a meaningful advance for these patients.
In fact, many of the most severely disabled patients, who are the likely recipients of first
generation systems, would benefit from a prosthesis capable of performing motor and communication
functions (Tkach et al. 2005).
Figure 1.2 illustrates the basic operating principle behind motor and communication prostheses.
Neural activity from various brain regions is electronically processed to create control signals
for enacting the desired movement. Non-invasive or minimally-invasive sensors can collect neural
signals representing the average activity of many neurons. When invasive permanently-implanted
arrays of electrodes are employed, it is possible to identify individual neurons near the tip of each
electrode through a mathematical process termed action potential (spike) sorting. Spike sorting
(discussed further in Chapter 2) uses waveform shape differences to discriminate between cells
and compress the information associated with neurons’ outputs into the times at which the cells
“spiked” (emitted an action potential). This can be even further compressed by only considering
the number of spikes in a predefined time window (e.g., 50-100 ms); this quantity is oft referred to
as the neural firing rate. After determining how each neuron responds before and during a movement
(tuning), typically accomplished by correlating arm movements made during a behavioral
task with associated neural activity, estimation (decode) algorithms can be designed to later infer
the desired movement from only the ongoing pattern of neural activity.
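
The compression from spike times to windowed spike counts can be sketched in a few lines. This is a minimal illustration only, not the processing pipeline used in this work: the function names, the 100 ms window, and the spike times are all hypothetical, and times are kept as integer milliseconds to avoid floating-point effects at window boundaries.

```python
def spike_counts(spike_times_ms, start_ms, stop_ms, window_ms):
    """Count spikes in consecutive, non-overlapping windows.

    All arguments are integer milliseconds (an assumption made for this
    sketch). Spikes outside [start_ms, stop_ms) are ignored.
    """
    n_windows = (stop_ms - start_ms) // window_ms
    counts = [0] * n_windows
    for t in spike_times_ms:
        index = (t - start_ms) // window_ms
        if 0 <= index < n_windows:
            counts[index] += 1
    return counts


def firing_rates_hz(counts, window_ms):
    """Convert per-window spike counts to rates in spikes per second."""
    return [count * 1000 / window_ms for count in counts]


# Hypothetical spike times (ms) for one sorted unit over a 300 ms epoch.
spikes = [12, 40, 95, 130, 260, 275, 290]
counts = spike_counts(spikes, start_ms=0, stop_ms=300, window_ms=100)
rates = firing_rates_hz(counts, window_ms=100)
print(counts)  # [3, 1, 3]
print(rates)   # [30.0, 10.0, 30.0]
```

A decode algorithm would then operate on one such count (or rate) vector per neural unit rather than on the raw waveforms.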
The system can then generate control signals appropriate for moving a robotic arm, or more
simply, a cursor on a screen. Motor prostheses guide prosthetic arms (robotic arms via actuators
or paralyzed limbs via microstimulation) or computer cursors continuously through space in order
to restore natural functionality. On the other hand, communication prostheses do not attempt to
reconstruct trajectory with high fidelity. Instead, these systems control prosthetic devices, such as
a computer cursor, by simply selecting among a discrete set of targets, as we do while typing on a
keyboard. Their goal is to provide a fast and accurate communication channel.
[Figure 1.2 diagram labels: Neural signals; Frontal; Parietal; Occipital; Temporal; Spike sort; Spike times; Control signals]
Figure 1.2: Concept sketch of cortically-controlled motor and communication prostheses (illustrated with intra-cortical recordings). There exist distinct regions in the cortex of a rhesus monkey (as shown) and in homologous areas in humans that participate in the preparation and execution of natural arm movements. Areas include the medial intra-parietal area (MIP) / parietal reach region (PRR) with largely plan activity, the dorsal aspect of pre-motor cortex with both plan and movement activity, and motor cortex with largely movement activity. The signal path and prosthetic operation are similar when non-invasive (e.g., EEG) signals are used.
Motor and communication prostheses are quite similar conceptually, but important differences
critically influence their design. Motor prostheses must generate movement trajectories and they
attempt to reproduce the desired movement as accurately and precisely as possible. Continuous
prosthetic guidance is a necessity, and measures of prosthetic performance must quantify the
similarity between the prosthetic movement and the desired trajectory. In contrast, communication
prostheses are concerned with information throughput from the subject to the world (e.g., the
speed and accuracy with which keys on a keyboard can be selected). Although a continuously
guided motor prosthesis could be used to convey information by moving to a key, only the key that
is eventually struck contributes to information conveyance. Thus, simple discrete prosthesis
positioning is sufficient for a communication prosthesis. For example, if it is possible to predict which
letter on a keyboard is desired, a computer cursor could be directly positioned on that key as
opposed to sliding it out to strike the key. Measures of communication prosthetic performance must
quantify the similarity between the prosthetic selections (speed and accuracy on a given task) and
the desired selections. This seemingly subtle distinction between motor and communication
prostheses has important implications that profoundly influence the type of neural activity to be used
and the overall prosthetic architecture.
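
To make the idea of combining speed and accuracy into one throughput figure concrete, the sketch below uses a bit-rate formula that is widely used in the brain-computer interface literature. It is offered only as an illustration and is not necessarily the measure adopted later in this dissertation. It assumes equally likely targets and errors spread uniformly over the wrong targets; the function names and the example numbers are hypothetical.

```python
import math


def bits_per_selection(n_targets, accuracy):
    """Bits conveyed by one selection among n_targets.

    Assumes equiprobable targets and errors distributed uniformly over
    the incorrect targets (assumptions of this sketch).
    """
    if n_targets < 2:
        raise ValueError("need at least two targets")
    p = accuracy
    bits = math.log2(n_targets)
    if p > 0:
        bits += p * math.log2(p)
    if p < 1:
        bits += (1 - p) * math.log2((1 - p) / (n_targets - 1))
    return bits


def bits_per_second(n_targets, accuracy, seconds_per_selection):
    """Throughput when each selection takes seconds_per_selection."""
    return bits_per_selection(n_targets, accuracy) / seconds_per_selection


# Perfect selection among 8 keys conveys log2(8) = 3 bits per selection.
print(bits_per_selection(8, 1.0))    # 3.0
print(bits_per_second(8, 1.0, 0.5))  # 6.0
```

The formula shows why speed and accuracy trade off: shortening each selection raises throughput only as long as the accompanying loss in accuracy does not remove more bits per selection than the faster pace adds.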
1.3 Plan and Movement Activity
Two main types of neural activity are well suited for driving prosthetic movements. Plan activity
is present before arm movements begin and is believed to reflect preparatory processing required
for the fast and accurate generation of movement. This activity is readily observed before
movement initiation in a delayed reach task. Delayed reach tasks begin by presenting a visual reach
target. After a delay period of several tenths of a second, a “go cue” indicates that a reach may
begin. Figure 1.3 illustrates that plan activity in a pre-motor cortex (PMd) neuron of a behaving
rhesus monkey is correlated to, or “tuned” for, the direction of the upcoming movement. Plan
activity is present from soon after target onset until just after the go cue is given (several hundred
milliseconds in this example). This activity typically rises at the start of, and is held during, the
delay period. Plan activity can also be tuned for movement extent (data not shown, Messier and
Kalaska 2000). Movement activity is present from just before movement initiation until just before
movement completion, correlating with the movement details of the arm. This activity is tuned
for both the direction (panel A) and speed (panel B) of arm movement (e.g., Moran and Schwartz
1999a).
[Figure 1.3 panel labels: Target Onset; Go Cue; 400 ms; Plan Activity; Movement Activity]
Figure 1.3: Plan and movement activity (illustrated with intra-cortical recordings) from a single PMd neuron. a. Spike histograms showing average plan (green) and movement (red) activity associated with center-out reaches to peripheral targets (blue circles). Fifty representative reach trajectories to the upward-right target are shown in gray (mean trajectory in black). b. Top panel, same fifty representative reach trajectories as in panel a shown as a function of time (horizontal component only). Bottom panel, spike times associated with each of these fifty reaches (one row corresponds to one trial, black tick marks indicate spike times, gray bar indicates movement onset) along with the response averaged across these trials. (Figure adapted from Kemere et al. (2004a).)
Until recently, both motor and communication prostheses have focused exclusively on movement
activity. As depicted in Fig. 1.4a, movement activity can be elicited merely by “thinking”
about moving the arm. Surprisingly, it has been found that this “movement activity” is even
present in the absence of movement or electromyographic (EMG) activity (e.g., Wolpaw and
McFarland 2004). When using animal models, such as healthy monkeys, neural activity that can
be generated when there are no muscle contractions is currently considered to be an adequate
proxy for neural signals from paralyzed subjects (Taylor et al. 2002), although this assumption
has yet to be widely tested. Hochberg et al. (2006) have recently demonstrated that algorithms
developed with healthy monkeys are directly applicable to paralyzed patients. Movement activity
is then decoded to generate instantaneous direction and speed signals, which are used to slide a
prosthetic device such as a computer cursor along the specified trajectory (e.g., Taylor et al. 2002;
Serruya et al. 2002; Carmena et al. 2003). Traditionally, motor prostheses would only incorporate
movement activity since the goal is to recreate the desired movement path and speed, and plan
activity has been thought to reflect only movement endpoint.[1] Nevertheless, plan activity can
play an important role in trajectory estimation by providing an estimate of where the movement
will end. The goal estimate serves as a probabilistic prior, which helps constrain the instantaneous
movement estimates based on movement activity and improves overall performance (Kemere et al.
2002, 2004b). Furthermore, if necessary, plan activity alone is sufficient to guide a motor prosthesis.
For example, a quick succession of discrete categorizations such as left, left, and up would be
sufficient to guide a limb largely leftward and a bit upward. Alternatively, a typical movement
trajectory (e.g., straight path with a bell-shaped speed profile) could be followed from start to finish if
the endpoint can be determined from plan activity alone (Shenoy et al. 2003; Kemere et al. 2004b;
Musallam et al. 2004). Thus, motor prostheses generally rely on movement activity but can also
benefit from plan activity.
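
Following a stereotyped trajectory once the endpoint is known can be sketched as below. The text specifies only a straight path with a bell-shaped speed profile; the minimum-jerk polynomial used here is one common choice of such a profile and is an assumption of this sketch, as are the function name and the example coordinates.

```python
def straight_bell_path(start, end, n_steps):
    """Positions along a straight line from start to end.

    Progress along the line follows the minimum-jerk polynomial
    s(tau) = 10*tau**3 - 15*tau**4 + 6*tau**5 for tau in [0, 1], whose
    time derivative is bell-shaped (zero speed at both ends, peak speed
    at the midpoint). This profile is an assumed stand-in, not one
    specified in the text.
    """
    path = []
    for k in range(n_steps + 1):
        tau = k / n_steps
        s = 10 * tau**3 - 15 * tau**4 + 6 * tau**5
        path.append(tuple(a + s * (b - a) for a, b in zip(start, end)))
    return path


# Hypothetical decoded endpoint 10 units to the right of the start.
for point in straight_bell_path((0.0, 0.0), (10.0, 0.0), n_steps=4):
    print(point)
```

Steps near the start and end of the path are short and steps near the middle are long, which is the discrete counterpart of the bell-shaped speed profile.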
In contrast, communication prostheses are not obliged to move the prosthetic device along a
continuous path in order to strike a target such as a key on a virtual keyboard (see Fig. 1.2).
Instead, if target location can be estimated directly from neural plan activity, the cursor can be
positioned immediately on the desired key. Recent reports suggest that there is considerable
performance benefit to using plan activity and direct-positional prosthesis control (Shenoy et al. 2003;
Musallam et al. 2004; Hatsopoulos et al. 2004). Figure 1.4b illustrates that plan activity can be
elicited merely by “intending” to move the arm to a target/key location, and it is well-established
that plan activity does not necessarily produce movements or EMG activity (e.g., Weinrich and
Wise 1982; Churchland et al. 2006a). Plan activity is then decoded to yield the desired key and the
prosthetic cursor immediately appears to signal the selection of this key. Additionally, movement
activity can be used to control a sliding cursor that then makes discrete selections; this would be
classified as a communication prosthesis (Kennedy and Bakay 1998; Leuthardt et al. 2004;
Wolpaw and McFarland 2004). Thus, communication prostheses can either rely upon plan activity
alone, just movement activity, or a combination of the two.
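
Decoding a discrete target from plan activity can be illustrated with a maximum-likelihood classifier. The sketch below assumes that each unit's spike count during the plan period is Poisson distributed and independent of the other units given the target; this is a standard modeling choice adopted here for illustration, and the tuning values, target names, and function names are hypothetical.

```python
import math


def poisson_log_likelihood(counts, mean_counts):
    """Log-probability of observed counts under independent Poisson units.

    mean_counts must be strictly positive (assumption of this sketch).
    """
    total = 0.0
    for k, lam in zip(counts, mean_counts):
        total += k * math.log(lam) - lam - math.lgamma(k + 1)
    return total


def decode_target(counts, tuning):
    """Return the target whose mean counts best explain the observation.

    tuning maps each target label to the mean plan-period spike count of
    every unit, as would be estimated from training trials.
    """
    return max(tuning, key=lambda t: poisson_log_likelihood(counts, tuning[t]))


# Hypothetical two-unit tuning for two targets.
tuning = {"left": [12.0, 3.0], "right": [4.0, 9.0]}
print(decode_target([11, 2], tuning))  # left
print(decode_target([3, 10], tuning))  # right
```

The decoded label can be used to place the cursor directly on the corresponding key, with no intermediate trajectory.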
[1] We have recently shown that plan activity can also reflect the upcoming speed of the movement and this finding will be discussed in Appendix A.
[Figure 1.4 panel labels: a. “Think” about actually moving arm to leftward target (but do not move arm); Movement Activity; Motor Prosthesis. b. “Plan” to move arm to leftward target (but do not move arm); Plan Activity; Communication Prosthesis]
Figure 1.4: Two types of neural activity for controlling two types of prostheses (illustrated with intra-cortical recordings). a. Movement activity can guide a cursor (red circle) along the desired path (e.g., straight or curved dashed red lines) and at the desired speed to hit the target. b. Plan activity can be decoded into a desired endpoint location which can be used to directly position a cursor (green circle) on the desired target. Though motor prostheses often rely on movement activity and communication prostheses often rely on plan activity, both types of activity are useful in both types of prostheses (see Fig. 1.1).
1.4 Recent Advances
Having introduced the basic operation and goals of motor and communication prostheses, as well
as movement and plan activity upon which these systems depend, we turn to the question of
performance. This is of utmost importance since a certain minimum level of performance is needed
before a prosthesis will be deemed clinically viable for each particular patient group. Moreover,
basic risk-benefit analysis requires a firm understanding of not only the relative risks between
non-invasive and invasive alternatives, but also a quantitative understanding of the relative levels
of performance. We begin by considering recent advances in motor prostheses, whose performance
is inherently difficult to measure as it must compare the produced trajectory with the desired
trajectory. Recent advances in communication prostheses are then described. Performance is
considerably simpler to measure in this domain, thereby allowing for a more quantitative comparison
among systems.
1.4.1 Motor Prostheses
In the late 1960s and 1970s, Olds, Fetz, and others discovered that nonhuman primates could
learn to regulate the firing rate of individual cortical neurons (Olds 1965; Fetz 1969; Fetz and
Baker 1973). These pioneering experiments relied on straightforward forms of real-time feedback
but clearly demonstrated that firing rates could be brought to requested levels without
accompanying muscle contraction, even in motor cortex (M1). In the 1970s and early 1980s, Humphrey,
Schmidt, and colleagues proposed that neural activity could be used to directly control prostheses
(Humphrey et al. 1970; Schmidt 1980). By the late 1990s, technological advances and a considerably
better understanding of how cortical neurons contribute to limb movement (e.g., Georgopoulos
et al. 1982, 1986; Schwartz 1992, 1993, 1994; Ashe and Georgopoulos 1994) sparked renewed
interest in developing clinically viable systems. This would require an ongoing series of experiments
with animal models and disabled human patients with the goal of learning fundamental design
principles and quantifying performance.
Chapin, Nicolelis, and colleagues investigated one dimensional (1D) control of a motor prosthesis
by training rats to press a lever to receive a liquid reward (Chapin et al. 1999). The apparatus
was then altered such that the lever was controlled by movement activity across a population of
cortical neurons. This was accomplished by chronically-implanting electrodes in cerebral cortex
and using various population decode algorithms to convert spike activity into movement signals.
They found that rats could still control the lever to receive a liquid reward. Animals soon learned
that actual forelimb movement was not needed and stopped movements altogether while continuing
to move the lever with brain-derived activity. Although previous studies had demonstrated
neural control of prosthetic devices, they were primarily conceived of as communication prostheses.
In contrast, this and other concurrent studies provided important experimental evidence
demonstrating the feasibility of motor prostheses.
While this work demonstrated that a lever could swing through an arc, it is difficult to assess
the quality of this 1D control. As discussed above, motor prostheses must reproduce the desired
trajectory and it is not clear how to ascertain what speed profile along this arc, including stationary
hold periods, the rat actually desired. Success was defined as having received a reward, and the
reported success rate was 60-100%.
Meanwhile, investigations of 2D and full 3D control essential for recreating natural arm movements
were underway. Often these studies controlled computer cursors on a screen visible to the
subject. While cursors are of direct utility in communication prostheses, their role in motor
prosthesis research is also well-founded. Since robotic or prosthetic limbs can operate by following a
series of desired hand locations, it is sufficient to demonstrate that the desired hand location
(cursor) can be continuously controlled. Electronic controllers could then supplement this end-point
trajectory by computing the inverse kinematics necessary for determining joint angles and forces
for guiding an arm-like prosthesis. Schwartz and colleagues demonstrated that 2D and 3D hand
location could be reconstructed with reasonable fidelity from the movement activity of a population
of simultaneously recorded M1 neurons in rhesus monkeys (Isaacs et al. 2000), and Nicolelis
and colleagues reported similarly encouraging reconstructions using simultaneous recordings from
parietal cortex, PMd and M1 (Wessberg et al. 2000). Together with similar recording studies from
Donoghue and colleagues (Maynard et al. 1999), the stage was set for the first 2D and 3D motor
prosthesis experiments.
Donoghue and colleagues pursued 2D cursor control with rhesus monkeys (Serruya et al. 2002).
A few tens of M1 neurons were recorded simultaneously with a chronically-implanted electrode
array as monkeys moved a manipulandum to guide the cursor. By recording spike activity while
monkeys tracked a continuously, pseudo-randomly moving target, a linear filter could be learned
to relate neural activity to cursor movement. This linear filter was then used in a new task, where
neural activity guided the prosthetic cursor to hit visual targets appearing at random locations.
Linear filters were re-learned once neural control was underway. Targets could be hit within
roughly one second on average, only slightly longer than during manipulandum control. This
study clearly demonstrated 2D cursor control.
Schwartz and colleagues pursued 3D cursor control with rhesus monkeys (Taylor et al. 2002).
A few tens of M1 neurons were recorded with a chronically-implanted electrode array, and
monkeys made 3D reaching movements to visual targets appearing in a 3D virtual reality
environment. Neural responses were characterized in terms of the movement direction eliciting maximal
response (preferred direction) and were combined to form a modified population vector. This vector
indicates the instantaneous movement of the hand or, in prosthesis mode, the prosthetic cursor.
Monkeys then entered “brain-control” mode wherein the 3D prosthetic cursor was controlled by
the neural population vector as opposed to the location of the hand. The prosthetic cursor hit the
visual target roughly half the time when several seconds were allowed for target acquisition.
Intriguingly, cell tuning properties were observed to change when controlling the prosthetic cursor.
By using a novel control algorithm that tracked these changes (“co-adaptive” algorithm), monkeys
were able to substantially improve prosthetic performance. The algorithm was able to bootstrap its
training process without actual arm movements by assuming that the subject’s intentions matched
the instructed targets. This would be similar to the situation of paralyzed patients. Targets could
be hit on 70-80% of trials, and within 1.5-2.0 seconds. This study clearly demonstrates 3D control
and provides tantalizing evidence that adaptation during prosthetic operation may be used to
improve performance. This group has since gone on to demonstrate that monkeys can use these 3D
control signals to feed themselves with an anthropomorphic robotic arm (Spalding et al. 2005).
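As an illustration of the population vector readout described above, the following is a minimal sketch of the basic (unmodified) population vector. It is not the modified variant or the co-adaptive algorithm of Taylor et al. (2002). Each neuron votes along its preferred direction with a weight equal to its firing rate above baseline. The function name, the baseline-subtraction weighting, and all numbers are illustrative assumptions.

```python
def population_vector(rates, baselines, preferred_dirs):
    """Sum each neuron's preferred direction, weighted by rate above baseline.

    Hypothetical helper: a basic population vector, assumed here for
    illustration only (the cited study used a modified variant).
    """
    dims = len(preferred_dirs[0])
    vector = [0.0] * dims
    for rate, baseline, direction in zip(rates, baselines, preferred_dirs):
        weight = rate - baseline  # negative when the neuron fires below baseline
        for k in range(dims):
            vector[k] += weight * direction[k]
    return vector


# Made-up example: four neurons with unit-length preferred directions in 3D.
preferred_dirs = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (-1, 0, 0)]
baselines = [10.0, 10.0, 10.0, 10.0]   # spikes/s
rates = [30.0, 10.0, 10.0, 5.0]        # spikes/s in the current time bin

velocity_estimate = population_vector(rates, baselines, preferred_dirs)
```

In this toy case the +x neuron fires above baseline and the -x neuron fires below it, so both push the estimate along +x while the other two neurons contribute nothing.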
Nicolelis and colleagues pursued 2D cursor control, along with a form of prosthetic grasping,
with rhesus monkeys (Carmena et al. 2003). Hundreds of M1, PMd, supplementary motor area
(SMA), primary sensory area (S1) and posterior parietal neurons were recorded with
chronically-implanted electrode arrays while monkeys performed each of three behavioral tasks. Monkeys
were trained to move a pole to control the position of an on-screen cursor (task 1), to grip the pole
to control the size of the cursor, which indicated grip force (task 2), or to perform the combination of
tasks 1 and 2 (task 3). During these tasks, multiple linear models were used to estimate a variety of
motor parameters including hand position, velocity, and gripping force from neural activity. After
several minutes of training, models converged to an optimal performance and their coefficients
were fixed. These models were used in “brain control” mode to translate neural activity into cursor
movement (tasks 1 & 3) or cursor size (tasks 2 & 3). Animals initially produced arm movements
in brain control mode but soon realized that these were not necessary and ceased to produce them
for periods of time. Intriguingly, performance on each of the three tasks improved substantially
over a period of days, with performance achieving the following statistics: approximately 80% of
visual targets were hit in 2 to 3 seconds (task 1), approximately 95% of the requested grip force
ranges were achieved in 1.5 to 3 seconds (task 2), and approximately 75% of the combined tasks
were successfully completed in 3 to 3.5 seconds (task 3). This study clearly demonstrates that grip
force can be controlled, which is essential once a prosthetic arm arrives at the desired location, and
that combined positioning and gripping are possible. Moreover, a robotic arm inserted into the
control loop increased the delay between neural activity and cursor movement by nearly 0.1
second, but performance recovered almost fully after training.
Taken together, these investigations provide compelling proof-of-concept demonstrations that
motor prostheses are possible. Systems are generally capable of guiding a prosthetic cursor to
the specified target within a few seconds. Although this level of performance falls short of the
speed and accuracy of natural arm movements, it is sufficiently high to motivate next-generation
experiments and technological designs. In fact, it is based on these results that an FDA-approved
pilot clinical trial of motor prostheses has begun in human patients (Hochberg et al. 2006). It
is worth noting that these studies did not quantify motor prosthetic performance in terms of the
difference between the desired trajectory and the actual cursor trajectory on a trial-by-trial basis.
It is challenging to quantify true motor prosthetic performance in animal models because it is
difficult to obtain reports of what the desired trajectory might be. Consequently, human clinical
trials may be needed to fully understand how well movement trajectories align with desired
trajectories. Nevertheless, animal experiments where trajectory “corridors” are specified should
allow for more meaningful performance quantification.
1.4.2 Communication Prostheses
The prosthetic systems described so far have relied exclusively on movement activity and were
designed primarily as motor prostheses. Alongside and even preceding this motor prosthetic
research effort, there has been active research on cortically-controlled communication prostheses.
This interest has been especially prevalent among the non-invasive community, and the
intra-cortical community has recently begun to explore this domain as well. Using non-invasive EEG
recordings, researchers have explored several approaches for engineering a brain-computer
interface that allows for discrete target selection. Some commonly studied categories of EEG signals
include slow cortical potentials (SCPs), sensorimotor (μ and β) rhythms, and evoked potentials
(P300).
Slow cortical potentials are not entirely sensorimotor-related signals. Rather, they are
representative of cognitive state and trained through operant conditioning. As the name suggests,
these signals can be controlled only over long timescales (>2 sec); thus, it is inherently difficult
to achieve high-speed communication with devices designed for this modality. Birbaumer and
colleagues have focused on building practical systems based on the SCP for many years (Birbaumer
et al. 1999), but despite various signal processing improvements, the rate of communication is 1-2
characters per minute (Hinterberger et al. 2004). This is equivalent to approximately 0.08-0.16 bits
per sec (bps), assuming a ~32 key keyboard and perfect accuracy. At these rates, it would take >10
hours to electronically communicate a message of ~100 words.
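The conversion behind these figures can be checked with a short calculation. The sketch below assumes, as the text does, a 32-key keyboard with equiprobable keys and perfect accuracy, so that each character carries log2(32) = 5 bits. The 6 characters per word (five letters plus a space) used for the message-time estimate is our own assumption, as are the function names.

```python
import math


def chars_per_min_to_bps(chars_per_min, n_keys=32):
    """Bits per second for error-free selection among n_keys equiprobable keys."""
    bits_per_char = math.log2(n_keys)  # 5 bits for a 32-key keyboard
    return chars_per_min * bits_per_char / 60.0


def hours_to_type(n_words, chars_per_min, chars_per_word=6):
    """Hours needed for a message; chars_per_word is an assumed average."""
    return n_words * chars_per_word / chars_per_min / 60.0


slow_bps = chars_per_min_to_bps(1)   # roughly 0.08 bps
fast_bps = chars_per_min_to_bps(2)   # roughly 0.17 bps
slow_hours = hours_to_type(100, 1)   # 10 hours at 1 character per minute
```

Under these assumptions the slower end of the reported range corresponds to about ten hours for a 100-word message, in line with the order of magnitude quoted above.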
Scalp recordings can also sense rhythmic activity in the μ and β frequency bands related to
sensorimotor cognition. In recent years, Wolpaw and colleagues have been training subjects to
use prosthetic systems based on these signals to slide computer cursors to predetermined targets.
The subjects often report that they use motor imagery, where they imagine moving a limb or even
the entire body through a trajectory, to first learn how to control the prosthetic system. There
are many instantiations of this type of system; most of the highest performing systems report an
accuracy on the order of 80-90% and bit rates in the range of 0.25-0.40 bps (McFarland et al.
2003). Furthermore, we can infer from the numbers reported in Wolpaw and McFarland (2004)
that the recent 8 target, 2D EEG system is able to perform at 1.25 bps for their best subject, given
their usual methods for calculating information transfer rate.
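For reference, a formula widely used in the EEG literature and commonly attributed to Wolpaw and colleagues computes the bits per selection from only the number of targets N and the accuracy P. We assume, without confirmation from the text, that this is the method referred to here. The sketch below implements that formula; the function names and the example values are illustrative.

```python
import math


def bits_per_selection(n_targets, accuracy):
    """Wolpaw-style information per selection, in bits.

    Assumes equiprobable targets and errors spread evenly over the
    N - 1 incorrect targets:
        B = log2(N) + P*log2(P) + (1 - P)*log2((1 - P)/(N - 1))
    """
    n, p = n_targets, accuracy
    bits = math.log2(n)
    if p > 0.0:
        bits += p * math.log2(p)
    if p < 1.0:
        bits += (1.0 - p) * math.log2((1.0 - p) / (n - 1))
    return bits


def bit_rate_bps(n_targets, accuracy, seconds_per_selection):
    """Bits per second given the average time spent on each selection."""
    return bits_per_selection(n_targets, accuracy) / seconds_per_selection


perfect = bits_per_selection(4, 1.0)        # 2 bits: error-free 4-target choice
chance = bits_per_selection(8, 1.0 / 8.0)   # 0 bits: chance-level 8-target choice
```

Because this formula collapses all errors into a single accuracy value, it cannot credit a system whose errors are structured.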
A final EEG communication approach is the P300 evoked potential spelling device. The
system displays a grid of possible character choices (usually a 6x6 matrix). The grid is randomly
illuminated, one row or one column at a time, during which the patient attends to the preselected
character, mentally noting when it is illuminated (this occurs when either its row or its column is
illuminated). The P300 potential is a deflection in the EEG signals occurring approximately 300
milliseconds after unexpected, or random, events. Since the patient is concentrating on a specific
row-column pair, the P300 potential will be modulated when either the row or the column is
randomly illuminated by the system. By searching for the row and column that induced the most P300
deflection, one can determine the subject’s original character selection post hoc. The initial report
of this system was by Farwell and Donchin (1988) and recently there have been some dramatic
advances in performance. Modern signal processing and statistical tools, such as support
vector machines, have now been imported from other engineering domains to improve performance.
Serby et al. (2005) provide a survey of the latest performance of these systems (reaching 1.0 to
1.5 bps for certain subjects, with 44% accuracy) as well as a report on their own adaptive online
system that achieves 0.25 bps with 80% accuracy.
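The row-and-column search described above can be sketched in a few lines. This is a simplified illustration, not the method of any cited study: it assumes that some upstream processing has already reduced the EEG to one P300 score per row and per column (for example, an average deflection across repeated illuminations). The grid layout, function name, and scores are all hypothetical.

```python
def select_character(grid, row_scores, col_scores):
    """Pick the character at the row and column with the largest P300 score."""
    best_row = max(range(len(row_scores)), key=lambda r: row_scores[r])
    best_col = max(range(len(col_scores)), key=lambda c: col_scores[c])
    return grid[best_row][best_col]


# Hypothetical 6x6 layout: letters followed by digits, six symbols per row.
symbols = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
grid = [symbols[i:i + 6] for i in range(0, 36, 6)]

# Made-up scores: the third row and fourth column evoke the largest response.
row_scores = [0.2, 0.1, 1.4, 0.3, 0.0, 0.2]
col_scores = [0.1, 0.3, 0.2, 1.1, 0.4, 0.0]

chosen = select_character(grid, row_scores, col_scores)
```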
In the intra-cortical domain, Kennedy and colleagues demonstrated that just one or two
neurons from the motor cortex of locked-in human ALS patients could be used to move a cursor across
a virtual keyboard to type out messages (Kennedy et al. 2000). Patients reported simply
“imagining” or “thinking” about moving various parts of their bodies, and eventually the computer cursor
itself, to guide the cursor. This is an example of a communication prosthesis (the goal is to select
targets) operating by converting movement activity into continuous cursor control. A severely
disabled patient was able to achieve a maximum of 3 characters/min when simultaneously
using neural and muscle activities. While one should note that this was not a fully
cortically-controlled prosthesis (it used some EMG signals from residual muscle function), the
performance was equivalent to an information rate of ~0.5 bps. There are other minimally invasive
approaches being investigated (Kennedy et al. 2004; Leuthardt et al. 2004) that will likely yield
similar information throughput, though results are still preliminary.
In another study, Taylor et al. (2003) provide a nice demonstration that the line between motor
prostheses and communication prostheses can be productively blurred; that is, a system designed for
the former purpose can be used for the latter application. In this work, the authors demonstrated
that the continuous trajectory output from a prosthetic system designed to reach to discrete targets
could be processed by an algorithm that chooses the most likely target location. If only a very early
portion of the trajectory is needed to make the discrete classification, the system can simply cut
the trial short and prepare to decode a new target. With this paradigm, the authors are able to
estimate that such a system could theoretically achieve 1.6 bps.²
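The footnoted adjustment to this estimate (add 700 ms of per-trial overhead to each classification time, divide information by time, and take the maximum) can be written out as follows. The curve values below are invented for illustration and are not the data from Taylor et al. (2003); the function and constant names are likewise our own.

```python
INTER_TRIAL_INTERVAL_S = 0.5    # inter-trial interval, from the footnote
MOVEMENT_ONSET_DELAY_S = 0.2    # target presentation to cursor movement


def peak_bit_rate(curve):
    """Return (best bits per second, classification time that achieves it).

    curve: list of (classification_time_s, bits_per_trial) points.
    """
    overhead = INTER_TRIAL_INTERVAL_S + MOVEMENT_ONSET_DELAY_S
    return max((bits / (t + overhead), t) for t, bits in curve)


# Hypothetical information-versus-time curve (not the published data).
curve = [(0.1, 0.4), (0.3, 1.2), (0.8, 2.1), (1.5, 2.4)]
best_rate, best_time = peak_bit_rate(curve)
```

The fixed overhead penalizes very short classification windows, so the peak rate falls at an intermediate window length rather than at the earliest point on the curve.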
Different factors limit non-invasive and invasive techniques. The low communication rates of
EEG systems are likely due to the inherently low information content in the non-invasive scalp
recording caused by spatial averaging across many neurons with dissimilar properties. On the
² We took the data shown in Figure 2 by Taylor et al. (2003) and adjusted the classification time (x-axis) to include an additional 700 ms, accounting for the 500 ms inter-trial interval and the 200 ms delay between target presentation and cursor movement. Then we divided the information (y-axis) by the time (x-axis) to yield bps. We selected the maximum value on that curve.
other hand, the bit rates described by Taylor et al. (2003) are most likely limited by having to
employ what is inherently a motor prosthesis design as a communication prosthesis. Recall that
communication prostheses can simply position the cursor directly on the target because target
selection is their sole function (Shenoy et al. 2003). Furthermore, intra-cortical communication
prostheses record data from a population of single neurons that contains relatively high information
content, thereby allowing such a system to select targets/keys much more quickly and accurately
than EEG-based systems. Andersen and colleagues recently reported the use of MIP/PRR and
some PMd plan activity from rhesus monkeys to select targets/keys (Musallam et al. 2004). Plan
activity was used to determine the desired movement endpoint.³ Performance comparable to the
systems described above was achieved, though the speed and accuracy with which targets/keys
on keyboards of different sizes could be selected were not directly investigated or pushed. In fact,
at this point in the literature (2004), there had been no concrete studies demonstrating how
intra-cortical designs can offer substantially higher performance than their EEG-based counterparts.
This comparison is essential if we are to justify the increased surgical risk associated with
intra-cortical electrode implantation. This dissertation helps provide answers to this critical question,
as detailed later.
As will be described in Chapter 3, information transfer rate is a natural metric for quantifying
performance of communication prostheses, but this metric must be defined and applied precisely.
Unfortunately, the precise definition and interpretation of this information transfer rate can be
rather inconsistent between the aforementioned studies. Specifically, the information transfer
rate of a system is a theoretical maximum that is asymptotically achievable. It is defined as the
information that can be conveyed with zero probability of error, and is often achieved with an
infinite length error correcting code (Shannon 1948). Depending on the nature of the communication
prosthesis, it may be important to optimize single-trial accuracy or it might be more beneficial to
optimize average information transfer rate (bit rate). In other words, if the design goal of a specific
prosthetic system is to optimize information transfer rate, there is no benefit to accepting a lower
bit rate for the sake of higher average accuracy: any theoretical bit rate computation already
implies a zero probability of error. Furthermore, using all of the error statistics when computing
the information transfer rate can lead to a more accurate assessment of prosthetic systems. For
example, if the errors associated with a particular intended symbol are always assigned to an
adjacent symbol, one can use this fact in the error correcting code. Often studies simply collapse the
error statistics to a single accuracy value before computing bit rate. It will become increasingly
important for the field to move toward a consistent interpretation of information theory in order
to meaningfully compare the performance of communication prostheses.
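To make the point about error statistics concrete, the sketch below computes the mutual information between intended and decoded symbols from a full confusion matrix. It is an illustration under our own assumptions, not the computation used in later chapters: the confusion counts are invented, the intended symbols are taken as equiprobable, and mutual information under that fixed input distribution is used as a stand-in for the channel capacity.

```python
import math


def mutual_information_bits(confusion):
    """Mutual information (bits per trial) from a confusion-count matrix.

    confusion[i][j] = number of trials with intended symbol i decoded as j.
    """
    total = sum(sum(row) for row in confusion)
    p_in = [sum(row) / total for row in confusion]
    n_out = len(confusion[0])
    p_out = [sum(row[j] for row in confusion) / total for j in range(n_out)]
    info = 0.0
    for i, row in enumerate(confusion):
        for j, count in enumerate(row):
            if count:
                p_joint = count / total
                info += p_joint * math.log2(p_joint / (p_in[i] * p_out[j]))
    return info


# Two hypothetical 4-symbol decoders, both 70% accurate.
adjacent_errors = [[70, 30, 0, 0], [0, 70, 30, 0], [0, 0, 70, 30], [30, 0, 0, 70]]
scattered_errors = [[70, 10, 10, 10], [10, 70, 10, 10], [10, 10, 70, 10], [10, 10, 10, 70]]

info_adjacent = mutual_information_bits(adjacent_errors)
info_scattered = mutual_information_bits(scattered_errors)
```

Both decoders have the same single-trial accuracy, yet the one whose errors always land on a neighboring symbol conveys more information per trial. Collapsing the matrix to a single accuracy value hides this difference.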
³ Interestingly, they also discovered that activity in MIP/PRR was modulated by reward expectancy, with neural activity more sharply differentiated relative to target direction when the subject was expecting a larger reward. Leveraging these effects could increase information transfer rates.
1.5 Approaches to Improving Performance
We have provided an overview of neural prostheses as they pertain to patients with motor disabilities
and also summarized the research that was current around the time we started our own work. We
now wish to frame the critical, outstanding next set of questions whose answers can move the field
forward substantially. First, while present systems can definitely improve the quality of life for
severely disabled patients, their performance does not come close to fully restoring lost motor
function, or even providing a substantial sense of independence. For example, a system that can attain
a communication rate of 1.5 bps can only yield a typing speed of ~3.5 words per minute (wpm) on a
limited alphanumeric keyboard. Secondly, research has not demonstrated the promise of
invasive-based systems, namely their ability to provide greater performance than non-invasive-based
systems. Recall that non-invasive-based systems record responses averaged over several million
or more neurons that might be representing different information. Consequently, performance can
suffer and patient training can be slow and tedious. Invasive-based systems presumably will not
suffer from these limitations but it is imperative to demonstrate this fact in practice, especially if
we are to justify the considerably higher surgical risk associated with such approaches.
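The typing-speed figure quoted above follows from a simple conversion, sketched here under stated assumptions: a 32-key keyboard (5 bits per character), error-free selection, and an average word length of 5 to 6 characters depending on whether the trailing space is counted. The keyboard size and word lengths are our assumptions; the text specifies only a "limited alphanumeric keyboard."

```python
import math


def bps_to_wpm(bps, n_keys=32, chars_per_word=5.0):
    """Words per minute for an error-free channel of the given bit rate."""
    chars_per_min = bps * 60.0 / math.log2(n_keys)
    return chars_per_min / chars_per_word


wpm_without_spaces = bps_to_wpm(1.5)                    # 18 chars/min over 5 chars/word
wpm_with_spaces = bps_to_wpm(1.5, chars_per_word=6.0)   # 18 chars/min over 6 chars/word
```

The two assumed word lengths bracket the ~3.5 wpm figure given in the text.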
One can address both of these concerns by attempting to raise the performance of
invasive-based systems. This is what we chose to do. We first restricted ourselves to investigating
communication prostheses because we feel they will provide the highest immediate impact for severely
disabled individuals. Again, the clinical population that will be first targeted for such systems
consists of patients who have very severe disabilities, including ALS and quadriplegia, which leave them
unable to interact easily with the outside world, if at all. To address their basic everyday needs,
these individuals require devices that improve their ability to communicate, by way of selecting
icons on a screen or typing emails. Having decided on communication prostheses, we evaluated
whether it might be better to use plan or movement activity. Here, we chose to build a system
around plan activity since it does not require decoding of trajectory information and therefore may
achieve higher performance as mentioned in Section 1.3.
With these high-level design choices, we chose to investigate the performance of each
component in our system schematic (consult Fig. 1.2). Naturally, if we improve the performance of
each individual piece, we can improve the performance of the system as a whole. We looked at
the initial signal extraction phase by improving the ability to source separate individual neurons
recorded on a single electrode tip (Chapter 2). Next, we started with a simple decoding algorithm
and optimized its key parameters for performance, as well as analyzed the impact of varying other
design parameters such as keyboard size and number of recorded neurons (Chapter 3).
Finally, we examined a more complicated decoding algorithm to see how much more performance
could be squeezed out of such a system (Chapter 4). With this approach, we have made
substantial headway in improving the performance of neural prostheses and delivering on the potential
benefits of invasive systems.
Chapter 2
Real-Time Spike-Sorting
2.1 Overview
In either basic systems neuroscience or neural prosthetic research, experiments require the
collection of data from the brain. Traditionally, this involves investigating the response properties
of single neurons. Electrodes are implanted in the brain and situated near several cells.
One can measure the voltage surrounding these neurons and attempt to interpret the information
communicated by them (Kandel et al. 2000). While modulations in the low-frequency neuronal
oscillations may contain useful information, the primary mechanism of information transmission is
the emission of a characteristic waveform (i.e., action potential or “spike”). The exact shape of the
action potential is very regular across emission events, but is otherwise generally unimportant for
the conveyance of information.¹ It is the rate of these emissions (or “firing rate”) that is thought
to convey information (Dayan and Abbott 2001).
From a signal processing perspective, the spike is the signal of interest and the remainder of
the recorded waveform is noise. If the electrode is close to the cell, the sensed action potential will
appear large relative to the background noise. This allows one to reliably detect the presence and,
importantly, the time of the action potential emission. Conversely, if the electrode is far from the cell,
it will be difficult to distinguish the cell’s spikes from the noise, or it may be difficult to discriminate
between two different cells’ spikes. The goal of spike sorting is to infer the times at which
one or more neurons emit spikes, and to assign each spike to the cell that emitted it. This is
done by determining the number of distinct voltage shapes and using those shapes to help identify
and classify further events in the recording. A good review of the challenges associated with this
blind-source separation problem has been written by Lewicki (1998).
In the context of neural prosthetic systems, spike sorting can lead to greater information
extraction from the brain. For example, in the extreme case, it would be highly detrimental to lump
¹ The exact characteristics of the action potential might be important to neural biophysicists or single-channel neuroscientists, however.
two neurons with opposite response properties together. Different neurons are communicating
different information and combining them will possibly degrade prosthetic performance. However,
there has been a tendency to shy away from spike sorting in this area of research, even though
overall decoding performance is of primary interest. One recent study recognizes the importance of
spike sorting but argues that it is impractical for large electrode counts given that sorting provides
only an incremental performance gain (Carmena et al. 2003). Some studies sort units on small
numbers of electrodes (Serruya et al. 2002; Taylor et al. 2002), but do so in a semi-automated
fashion. Not surprisingly, there can be a wide variability in the number of neurons and spikes detected
when different researchers are asked to manually spike sort an identical raw data stream (Wood
et al. 2004).
As discussed in Chapter 1, there has been a recent push for implanting large numbers of
immovable electrodes (100s) for neural prosthetic research. With these arrays, the electrodes’
locations are fixed and there is little flexibility to increase the signal-to-noise ratio after implantation.²
Hence, implantable electrodes are manufactured with only moderately high impedances (e.g.,
200-500 kΩ) to ensure recordings from at least one neuron, though in practice they typically record
from two or more. While recording from more than one neuron per electrode may sound
advantageous at first blush, it substantially increases the need for high-quality and fully-automated spike
sorting capable of distinguishing each spike’s neural origin. Sophisticated spike-sorting algorithms
exist for training and classifying multiple clusters (“units”) in low signal-to-noise situations
(Sahani 1999; Shoham et al. 2003), but none of these have been applied across high electrode counts
under real-time classification constraints.
In this chapter, we present a new infrastructure that leverages existing unsupervised,
probabilistic clustering algorithms to sort spikes from a cortical electrode array. The data were
collected from a rhesus monkey performing a delayed center-out reach task and are presented here
to demonstrate the efficacy of our system. We used both sorted and unsorted (thresholded) action
potentials from an array implanted in pre-motor cortex to “predict” the reach target, a common
operation in neuroprosthetic research. The use of sorted spikes led to an improvement in decoding
accuracy of up to 9.3% on an 8-target task. This system was used in most array recording and all
prosthetic experiments in the lab (e.g., Santhanam et al. 2006b). We conclude this chapter with a
brief discussion of whether such complex spike-sorting algorithms are feasible in a fully implantable
system that must operate in real-time.
² With single-electrode technology, electrode shanks are movable and can be situated closer to a neuron of interest during the experiment.
2.2 Methods
2.2.1 Spike-Sorting System Diagram
Certain features are common to nearly all spike-sorting algorithms, as shown in Fig. 2.1. The
signal from each electrode must be impedance converted, filtered, and amplified (Fig. 2.1a, triangle).
Then, the process requires some conversion from the analog waveform to a digital signal because
the fundamental data desired is the timing of action potentials. Since only neural spike times are
of interest (~100 Hz), but the neural signals are sampled at a high rate (~30 kHz) to quantify
action potential shapes, clearly some sort of data reduction must follow the digitization. Next,
detected spikes are classified into categories corresponding to individual neurons or inseparable
multiunit activity. The parameters for the data reduction are computed during a “training” phase
(e.g., where raw data is processed to determine the numbers of neurons detected by each electrode
and how best to separate them). Finally, the time and identity of each detected spike can be used
by a downstream algorithm to infer a paralyzed patient’s motor intentions. Again, Lewicki (1998)
provides a complementary overview of the spike-sorting problem.
[Figure 2.1 block diagram: Electrode, Digitization, Data Reduction, Telemetry. Panel labels: broadband signal; single neural spike (32 samples); data reduction into three classes and noise using orthogonal subspaces; means of class waveforms; neurons #1-3; final signal: a series of spike times.]
Figure 2.1: Extraction of neural signals. a. General block diagram of data extraction from cortical neural recordings for a prosthetic interface: Broadband signal (b: 1 s of data; c: 2 ms, showing a spike) recorded on electrode is first digitally sampled. Then, a feature extraction process reduces the dimensionality of the data (d: spike waveforms in an optimized three dimensional subspace are easy to distinguish). In this reduced signal space, the activity of individual neurons can be differentiated from each other and from background noise. Optimally, only the spiking times of neurons (e) are finally transmitted from the device to the downstream system which decodes neural activity into control signals for a prosthetic device. Figure taken from Zumsteg et al. (2005).
2.2.2 Basic Platform
A cornerstone of our system is the Cerebus 128 Channel Data Acquisition System (Cyberkinetics
Neurotechnology Systems, Foxborough, MA). We chose to use the Cerebus system because its
architecture allows for easy interfacing with our design. First, the Cerebus “front-end” amplifies the
incoming signals, applies an anti-aliasing filter, and digitally samples each channel (electrode) at
30 kHz. The digitized output is transmitted via a fiber optic link to the Cerebus Neural Signal
Processor (NSP).
The NSP can filter the incoming data stream for spike extraction. We chose a fourth order
high-pass Butterworth filter with a cut-off frequency of 250 Hz (one of the available digital filters on
the Cerebus system). The NSP compares the filtered data in real-time against a simple threshold
trigger; if the trigger is tripped, a 1.6 ms “spike snippet” is sampled. Next, the NSP compares the
spike snippet against several sets of time-amplitude window discriminators (“hoops”). Each set of
hoops can be used to classify a unit: if a spike waveform passes through all of the active hoops
for a specific unit, it is classified with that unit number. There can be up to 4 hoops per unit and
5 units per electrode channel. Snippets that do not satisfy any hoops are tagged as unclassified.
The spike snippets, with their classification numbers, are broadcast over a private UDP network.
The NSP can optionally broadcast the electrodes’ 30 kHz raw data onto the network as well.
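The threshold-and-hoops logic described above can be sketched as follows. This is a simplified illustration and not the Cerebus implementation: it assumes a negative-going threshold, starts each snippet at the crossing sample rather than slightly before it, omits the high-pass filtering stage, and uses label 0 for unclassified snippets. All function names, hoop settings, and signal values are hypothetical.

```python
SNIPPET_SAMPLES = 48          # 1.6 ms at the 30 kHz sampling rate
MAX_HOOPS_PER_UNIT = 4        # limits quoted in the text
MAX_UNITS_PER_CHANNEL = 5
UNCLASSIFIED = 0              # assumed label for snippets that match no unit


def extract_snippets(signal, threshold):
    """Return (start_index, snippet) for each negative-going threshold crossing."""
    snippets = []
    i = 1
    while i + SNIPPET_SAMPLES <= len(signal):
        if signal[i - 1] > threshold >= signal[i]:
            snippets.append((i, signal[i:i + SNIPPET_SAMPLES]))
            i += SNIPPET_SAMPLES   # skip ahead so one spike triggers only once
        else:
            i += 1
    return snippets


def classify(snippet, units):
    """Return the first unit whose hoops the snippet passes through, else 0.

    units maps a unit number to a list of hoops; each hoop is
    (sample_index, min_amplitude, max_amplitude).
    """
    assert len(units) <= MAX_UNITS_PER_CHANNEL
    for unit, hoops in units.items():
        assert len(hoops) <= MAX_HOOPS_PER_UNIT
        if all(low <= snippet[t] <= high for t, low, high in hoops):
            return unit
    return UNCLASSIFIED


# Made-up filtered signal (arbitrary units) with three threshold crossings.
signal = [0.0] * 300
signal[50:52] = [-60.0, -100.0]    # large spike
signal[120:122] = [-55.0, -30.0]   # small spike
signal[200:202] = [-70.0, -70.0]   # event that fits neither unit

units = {1: [(1, -120.0, -80.0)], 2: [(1, -50.0, -20.0)]}
labels = [classify(wave, units) for _, wave in extract_snippets(signal, -50.0)]
```

Each hoop is a single time-amplitude window, so classification costs only a handful of comparisons per snippet. That low cost is what makes this style of discriminator practical for real-time use.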
A desktop PC runs a graphical user interface (GUI) under Microsoft Windows. The GUI can
configure the NSP via the UDP network, including modifying the threshold levels and hoops for
online classification. Additionally, the GUI receives the spike snippets and plots each snippet,
color coded by classification number. A human operator would ordinarily determine the best sets of
hoops (or more generally, the sorting parameters) for each channel by examining the past history
of spike snippets. This is known as the training phase. Figure 2.2 is a screenshot of the user
interface for one particular electrode.
Figure 2.2: Screenshot of the Cerebus user interface. Two units are sorted while a third was left unsorted. The remainder of the waveforms are from noise crossing the trigger. The operator sets the trigger threshold (red horizontal line) and places hoops to classify incoming waveforms. The NSP can classify a spike with a round-trip latency of 1-1.5 ms.
Compared to the Cerebus system, other commercial online spike-sorting products offer more
advanced visualization tools during the training phase, such as principal components analysis.
Even so, these products require a great deal of human intervention. Moreover, their ability to
identify separate units on each electrode is considerably less robust than the methods of Sahani
(1999) and Shoham et al. (2003), which can allow for automated (unsupervised) training and
classification of neural data.
When we first started our array recording experiments, Dr. Stephen Ryu was responsible
for spike sorting the neural units recorded off of the array prior to each experiment. We
affectionately dubbed this the “R” system, after his surname.
2.2.3 RR: Second Generation Classification Infrastructure
We wished to develop a new system that could eliminate the tedious human involvement during
the training phase of the spike-sorting process. At the same time, we aimed to have a system that
was sufficiently repeatable and scalable to hundreds of electrodes. Our approach was to leverage
the data acquisition and classification capabilities of the Cerebus system, while automating the
training phase. A block diagram of our second-generation system, which we dubbed “RR” (as a
codename for “Ryu Replacement”), is shown in Fig. 2.3. First, the RR server configures the NSP
to broadcast the 30 kHz data stream from all active electrodes. The collection time is set so that
we capture a sufficient number of neural events for the training algorithm; this was typically 2-3
minutes while the subject was actively performing a relevant behavioral task. The RR server
buffers data from all electrodes into main memory.
Figure 2.3: System diagram of our “RR” architecture. The Cerebus “front-end” collects raw data from the set of electrodes and interfaces with the Cerebus NSP as usual. The GUI is now relegated to a monitoring role. A second PC, running RTAI Linux (a real-time variant of Linux), is also on the UDP interface; we dub this the “RR server.” It can receive data from the NSP as well as manipulate the NSP’s configuration. The RR server communicates with data processing clients on a separate network interface. These clients train on the data and manipulate the Cerebus NSP parameters by using the RR server as a proxy. We also wrote a Matlab (MathWorks, Natick, MA) MEX interface for communication with the RR server; this allows for easy integration of clustering algorithms that are written in Matlab.
After collection is finished, an RR client can request a specific electrode’s data from the RR
server through a remote procedure call (UNIX rpcgen). The client processes the data with the
algorithm of choice (as detailed in the following sections), identifying the units present on an
electrode. There are typically several computational clients communicating with the server on a
TCP/IP network. Each electrode or group of electrodes can be farmed out to one of these clients for
parallel processing. This is a key feature since parallelization can dramatically reduce the overall
time to train the spike sorter across all of the electrodes. We used generic Pentium 4, 3.0 GHz
computers with 2 GB of RAM for the RR server and three accompanying RR clients.
Once an electrode’s data is processed, the client uses the clustering information to generate
hoops for online classification by the NSP. The sorting clients relay the new threshold level and
hoops to the NSP via the RR server. The NSP subsequently classifies all incoming neural events
based on these hoops.
2.2.4 Spike Clustering Algorithm
Our architecture can support a variety of specific spike-sorting algorithms. For RR, we chose to
use methods described in Sahani (1999) to identify the shapes of action potentials associated with
different cells in the recording, and the shapes were then used to design hoops for the Cerebus
NSP classification system. This training algorithm was run in Matlab and interfaced with the RR
server using compiled Matlab MEX functions. We summarize the algorithm here, but refer the
reader to Sahani (1999) for more details. The objective of the algorithm is to estimate the number
of sources (neurons) that contribute to the observed signal and to characterize the distribution of
action-potential shapes that each source produces. Figure 2.4 provides a diagrammatic summary
of the Sahani algorithm and the individual steps are outlined below.
Figure 2.4: Block diagram of the Sahani algorithm. Processing associated with one electrode is illustrated; such processing must be performed for all electrodes. Figure taken from Zumsteg et al. (2005).
1. The data are first high-pass filtered to eliminate low-frequency neuronal oscillations that
are not directly related to the action potentials. The “spikes” of interest ride on top of this
wave.
2. A threshold is chosen relative to the RMS of the filtered signal. A snippet is sampled around
each threshold crossing, but snippets that do not match a predefined shape heuristic are
discarded. The recorded signal is also sampled at times where the RMS-derived threshold
was not exceeded, so as to build an estimate of the covariance matrix of the noise.
3. The covariance of the background noise (Xn) is computed from all of the snippets that didn’t
cross threshold.
4. Align all of the spike snippets so that their peaks appear at the same sample time.
5. Whiten the noise component of the spike snippets by linearly transforming each with the
inverse square root of Xn.
6. Estimate the principal components of the noise-whitened snippets by a fitting technique that
is robust to outliers (NWrPCA).
7. Project the snippets to the corresponding 4-dimensional principal subspace, and then fit
a mixture-of-Gaussians model to the data using maximum-likelihood. The fitting uses a
“relaxation” variant of Expectation-Maximization that reduces the chances of converging
to local maxima. The particular relaxation scheme employed allows model selection to be
integrated into the fitting procedure, thus automatically identifying the number of cells.
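Steps 3, 5, and 7 above (noise covariance, whitening, and subspace projection) can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the Sahani (1999) implementation: the snippets are assumed to be already peak-aligned, ordinary PCA is substituted for the outlier-robust NWrPCA fit, the mixture-model fit is omitted, and the function names and synthetic data are hypothetical.

```python
import numpy as np

def noise_whitening_matrix(noise_snippets, floor=1e-12):
    """Inverse square root of the noise covariance (steps 3 and 5)."""
    cov = np.cov(noise_snippets, rowvar=False)
    evals, evecs = np.linalg.eigh(cov)       # covariance is symmetric
    evals = np.maximum(evals, floor)         # guard against tiny eigenvalues
    return evecs @ np.diag(evals ** -0.5) @ evecs.T

def project_whitened(spike_snippets, whitener, n_components=4):
    """Whiten snippets, then project onto the leading principal axes.

    Ordinary PCA stands in here for the robust NWrPCA fit described in the text.
    """
    whitened = spike_snippets @ whitener
    centered = whitened - whitened.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

# Synthetic stand-in data: correlated noise and one hypothetical spike shape.
rng = np.random.default_rng(0)
mixing = rng.normal(size=(12, 12))
noise = rng.normal(size=(500, 12)) @ mixing
template = -8.0 * np.hanning(12)
spikes = template + rng.normal(size=(200, 12)) @ mixing

W = noise_whitening_matrix(noise)
scores = project_whitened(spikes, W)         # one 4-dimensional point per snippet
```

After whitening, the noise has (approximately) identity covariance, so distances in the projected space are measured in units of noise standard deviation; the resulting `scores` are what a mixture model would then be fitted to.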
Figure 2.5 shows the results on a two minute segment of neural data. The false positive and
miss rates for four clusters are each less than 5% when examining the a posteriori cluster
assignment probabilities in the training set. Note that only two of the units could have been reasonably
sorted using hand-positioned hoops. Also, the pre-processing of snippets described above is
essential to cell identification; conventional principal components estimated from unprocessed data do
not reveal the differences between the three lower-amplitude action-potential shapes.
2.2.5 Hoop Design for Online Classification
Given the mixture model derived by the spike-clustering algorithm, each action-potential snippet
can be assigned to the cell from which it is most likely to have originated. However, this operation
cannot be carried out on the standard Cerebus NSP hardware. For our second generation system
(RR), we developed a novel method that uses the probabilistic assignments from the training set
to generate hoop parameters for each cell. Then, the Cerebus NSP can classify new snippets in
real-time³:
³ Note that since the NSP does not perform any snippet alignment before classifying, all training spike snippets are locked to NSP threshold crossings for hoop design.
Figure 2.5: Clustering results from electrode G20040117.22. Projections into a 2-dimensional principal subspace (after peak alignment and noise whitening) are shown (left panel). Median waveforms for each cluster demonstrate the difference in unit shapes in the temporal domain (right panel).
1. Choose the cluster whose waveforms have the highest power about their peak.
2. Given the set of snippets for this cluster, for each time point consider a hoop whose amplitude
window encompasses a fixed multiple of the interquartile range of snippet samples at that
time point. Center the windows about the median voltage at the respective time point. This
non-parametric metric minimizes the effect of outliers in a given class.
3. Select the hoop from those considered at all time points that minimizes the false positive
rate from other neural events in the data stream. Continue this process until there are no
false positives remaining or the four available hoops are exhausted.
4. Remove all events that have been correctly classified by this set of hoops. Since the hoop
selection is non-optimal and is not as robust as the original clustering, there can be many
unclassified neural events remaining for this cluster (i.e., misses). These events continue to
remain in the training data since they need to influence the hoop selection for other clusters.
5. Repeat steps until all clusters have been assigned hoops.
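Steps 2 and 3 of the list above can be sketched for a single cluster as follows. This is a minimal illustration and not the code used in RR: it assumes the hoop's total height equals the stated multiple of the interquartile range (so half of it lies on each side of the median), uses the 3.73 multiple quoted later in the text as a default, and runs on hypothetical synthetic snippets. The function names are invented for the example.

```python
import numpy as np

MAX_HOOPS = 4  # the text states that four hoops are available

def design_hoops(unit, others, iqr_multiple=3.73):
    """Greedily pick (time index, low, high) hoops for one cluster.

    unit:   (n_unit, n_samples) snippets assigned to this cluster
    others: (n_other, n_samples) all remaining events in the training data
    """
    q1, med, q3 = np.percentile(unit, [25, 50, 75], axis=0)
    half = 0.5 * iqr_multiple * (q3 - q1)      # assumed: window height = multiple * IQR
    low, high = med - half, med + half
    inside = (others >= low) & (others <= high)  # inside[i, t]: event i passes hoop at t
    alive = np.ones(len(others), dtype=bool)     # events still counted as false positives
    hoops = []
    for _ in range(MAX_HOOPS):
        fp_counts = (inside & alive[:, None]).sum(axis=0)
        t = int(np.argmin(fp_counts))            # time point with fewest false positives
        hoops.append((t, float(low[t]), float(high[t])))
        alive &= inside[:, t]
        if not alive.any():                      # no false positives remain
            break
    return hoops

def passes_hoops(snippets, hoops):
    """True for each snippet that falls inside every hoop."""
    ok = np.ones(len(snippets), dtype=bool)
    for t, lo, hi in hoops:
        ok &= (snippets[:, t] >= lo) & (snippets[:, t] <= hi)
    return ok

# Hypothetical data: two waveform shapes that differ only at sample 5.
rng = np.random.default_rng(1)
unit = 0.1 * rng.normal(size=(100, 10))
others = 0.1 * rng.normal(size=(100, 10))
others[:, 5] += 5.0
hoops = design_hoops(unit, others)
```

In this toy case a single hoop at sample 5 removes every false positive, so the loop stops early. Applying the procedure to real data would additionally require ordering the clusters by peak power (step 1) and removing correctly classified events between clusters (step 4).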
Although our process of choosing hoops is not optimal, it is a computationally-efficient greedy
algorithm. It implements an intuitive heuristic for setting hoops from a set of tagged waveforms.
We added an extra heuristic to reduce the leakage of false positives into legitimate classifications.
We used the first set of hoops to extract mostly unsortable activity that crosses threshold. Four
hoops are placed at equispaced time points shortly after the threshold crossing. Their amplitude
windows are twice the threshold level of that channel, centered about zero volts. We call this the
“hash unit.” The NSP classifies units in a prioritized fashion and all classifications are mutually
exclusive. Hence, the hash unit will reduce the false positive rate at the expense of miscategorizing
true spikes into this hash unit.
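The prioritized, mutually exclusive classification with a hash unit can be illustrated with a short sketch. It is a toy model of the behavior described above, not the NSP firmware: the sample indices, the threshold of -50 (arbitrary units), and the single hoop for "unit 1" are all hypothetical values chosen for the example.

```python
def hash_hoops(threshold, time_indices):
    """Hoops centered on zero whose total height is twice the threshold level."""
    level = abs(threshold)
    return [(t, -level, level) for t in time_indices]

def classify(snippet, prioritized_units):
    """Return the label of the first unit whose hoops all contain the snippet."""
    for label, hoops in prioritized_units:
        if all(lo <= snippet[t] <= hi for t, lo, hi in hoops):
            return label
    return None  # event left unclassified

threshold = -50.0
units = [
    ("hash", hash_hoops(threshold, [12, 16, 20, 24])),   # checked first
    ("unit 1", [(12, -150.0, -90.0)]),                   # hypothetical sorted unit
]

noise_event = [0.0] * 32
noise_event[10] = -55.0            # barely crosses threshold, then returns to baseline

spike_event = [0.0] * 32
spike_event[10] = -60.0
spike_event[12] = -120.0           # large deflection shortly after the crossing

noise_label = classify(noise_event, units)
spike_label = classify(spike_event, units)
```

Because the hash unit is evaluated first, low-amplitude threshold crossings are absorbed before they can reach the hoops of a sorted unit, which is the trade-off the text describes: fewer false positives, at the cost of some true spikes being captured by the hash unit.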
Figure 2.6: Threshold and hoop design for three clusters on electrode G20040202.14. Shading of waveforms denotes 1.5 times the interquartile range, centered about the median. Hoop positions are graphed with a slight jitter along the x-axis to provide visibility when hoops overlap. The x-axis is time relative to threshold crossing (ms).
Fig. 2.6 shows the waveforms of each unit along with the hoop settings. This is the final result
from the clustering and hoop design process for our RR system. Features of note include: the hash
unit (gray hoops) captures most of the green multi-unit cluster; the red unit registers ~10% false
positives due to the green unit and ~20% misses due to the hash unit.
2.2.6 RRR: Third Generation Classification Infrastructure
The hoop-based classification of RR was a first attempt at automating our spike sorting procedures
in the laboratory. As will be shown subsequently, RR provided adequate sorting capabilities but
faltered in situations where the action potential shape from a particular neuron was very similar
to another on the same electrode. In order to solve this, we created a more robust infrastructure,
dubbed “RRR” (as a code name for “Revised Ryu Replacement”).
A diagram of RRR is provided in Fig. 2.7. Unlike in the RR setup, the RRR server performs the
real-time classification in lieu of the Cerebus NSP. The Cerebus NSP is relegated to act simply as a
data acquisition system. During real-time classification the NSP transmits broadband data to the
RRR server and the RRR server performs the actual spike-sorting. In RR, the NSP performed the
real-time spike sorting using hoops. In RRR, the server performs the real-time spike sorting using
the full probabilistic model afforded by the Sahani algorithm. This allows us to classify spikes in
the same principal component subspace used for training and the classification can be performed
using maximum-likelihood techniques. The RRR server then sends data to the GUI by mimicking
the communication of the NSP. Our early neural prosthetic experiments were conducted using RR
(monkey G), and we later transitioned to RRR (monkey H).
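Classification in the principal component subspace can be sketched as assigning each projected snippet to the mixture component under which it is most probable. The code below is a minimal illustration, assuming a Gaussian mixture whose weights, means, and covariances have already been fitted during training; the parameter values and function names are hypothetical, and with equal weights the rule reduces to a pure maximum-likelihood assignment.

```python
import numpy as np

def log_gaussian(x, mean, cov):
    """Log density of each row of x under a multivariate Gaussian."""
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = np.einsum("ij,ij->i", diff @ np.linalg.inv(cov), diff)
    return -0.5 * (maha + logdet + x.shape[1] * np.log(2.0 * np.pi))

def classify_spikes(scores, weights, means, covs):
    """Index of the most probable mixture component for each projected snippet."""
    log_post = np.stack(
        [np.log(w) + log_gaussian(scores, m, c)
         for w, m, c in zip(weights, means, covs)],
        axis=1,
    )
    return log_post.argmax(axis=1)

# Hypothetical two-unit model in a 4-dimensional principal subspace.
rng = np.random.default_rng(3)
means = [np.zeros(4), np.full(4, 6.0)]
covs = [np.eye(4), np.eye(4)]
weights = [0.5, 0.5]
truth = np.repeat([0, 1], 50)
scores = np.vstack([rng.normal(size=(50, 4)) + means[k] for k in (0, 1)])
labels = classify_spikes(scores, weights, means, covs)
```

Unlike a hoop test, this rule evaluates all coordinates of the projected waveform jointly, which is why it can separate shapes that overlap at every individual time point.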
Figure 2.7: System diagram of our “RRR” architecture.
2.2.7 Data Collection and Analysis
For the purpose of quantifying and illustrating spike-sorting performance, we analyzed data from
a rhesus monkey trained to perform delayed center-out reaches to visual targets as briefly outlined
in Chapter 1. This behavioral setup is detailed later in Chapter 3.
After testing and verifying the RR and RRR systems, we investigated the benefits of sorting by
running analyses to ascertain how well target location can be estimated for each trial from plan
period activity. Given a particular target location, the distribution of spike rates for each trial was
modeled as a multivariate Gaussian. We employed maximum likelihood methods to determine the
highest probability target location for a given trial (see Chapter 3 for specifics). Either sorted data
or threshold crossings were input into the estimator. We obtained classification percentages for
each day’s session through leave-one-out cross-validation.
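A leave-one-out evaluation of a Gaussian maximum-likelihood target estimator might look like the sketch below. It is an illustration under assumptions rather than the analysis code used here: each target gets its own mean and full covariance, a small ridge term (an addition not mentioned in the text) keeps the covariance invertible, and the spike counts are synthetic Poisson draws with hypothetical tuning.

```python
import numpy as np

def gaussian_log_likelihood(rate, mean, cov):
    """Log likelihood of one trial's rate vector, up to an additive constant."""
    diff = rate - mean
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (diff @ np.linalg.solve(cov, diff) + logdet)

def loo_accuracy(rates, targets, ridge=1e-3):
    """Fraction of trials decoded correctly with leave-one-out cross-validation."""
    n_trials, n_units = rates.shape
    classes = np.unique(targets)
    correct = 0
    for i in range(n_trials):
        keep = np.arange(n_trials) != i          # hold out trial i
        scores = []
        for c in classes:
            x = rates[keep & (targets == c)]
            cov = np.cov(x, rowvar=False) + ridge * np.eye(n_units)
            scores.append(gaussian_log_likelihood(rates[i], x.mean(axis=0), cov))
        correct += int(classes[int(np.argmax(scores))] == targets[i])
    return correct / n_trials

# Synthetic example: 4 targets, 20 trials each, 6 units with simple tuning.
rng = np.random.default_rng(2)
targets = np.repeat(np.arange(4), 20)
mean_counts = 10.0 + 25.0 * np.eye(4, 6)         # unit k prefers target k
rates = rng.poisson(mean_counts[targets]).astype(float)
accuracy = loo_accuracy(rates, targets)
```

The same function can be fed either per-unit rates (sorted data) or per-electrode rates (threshold crossings), which is the comparison reported in the results.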
2.3 Results and Discussion
2.3.1 Clustering and Classification
The two key parameters for our algorithm were the threshold level (RR and RRR) and hoop extent
(RR only); these were set to 3-3.5 times the RMS of the filtered data and 3.73 times the
interquartile range, respectively. The parameters were empirically determined to provide adequate results.
Also, the amount of training data collected per electrode was important since it determined how
many representative spikes were used to build our clustering models. We used 2 minutes of data
to balance between the need for sufficient training data and the overall amount of experiment time
available per day (~90-120 minutes).
Our infrastructure was highly effective in terms of training time. The clustering algorithm
took approximately 20 seconds per electrode, and we sorted 96 electrodes in 10 minutes with three
clients. This is at least as fast as human-assisted training, but the strength of our architecture is
its scalability and repeatability for very large electrode counts. Training time can be reduced by
simply adding more RR/RRR clients.
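The quoted training time is consistent with simple arithmetic on the figures above. The sketch below assumes electrodes are split evenly across clients and ignores data transfer and hoop upload overhead; the function name is hypothetical.

```python
import math

def training_minutes(n_electrodes, seconds_per_electrode, n_clients):
    """Wall-clock training time when electrodes are divided evenly among clients."""
    electrodes_per_client = math.ceil(n_electrodes / n_clients)
    return electrodes_per_client * seconds_per_electrode / 60.0

three_clients = training_minutes(96, 20, 3)   # about 10.7 minutes
six_clients = training_minutes(96, 20, 6)     # about 5.3 minutes
```

With 96 electrodes at roughly 20 seconds each, three clients process 32 electrodes apiece, or about 640 seconds, in line with the reported 10 minutes; doubling the client count would roughly halve that.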
The traditional problem with testing spike-sorting algorithms on real neural data is that there
is no measure for the ground truth. We have no independent way of knowing what the time
and identity of each spike really was. As an alternative, given that the training algorithm is
probabilistic by nature, we can compute the a posteriori probability of each spike belonging to
each of the classes (which correspond to neurons). This allows us to calculate an average false
positive and miss probability for each cluster. The cluster is said to be well-isolated if each type
of misclassification probability is under 5%. For example, with our G20040202, G20040312, and
G20040330 datasets, there were 62, 40, and 41 units that fit this criterion, respectively.
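One way to turn a posteriori probabilities into per-cluster error estimates is sketched below. The exact definitions used in this work are not spelled out in the text, so the ones here are assumptions: each spike is hard-assigned to its most probable cluster, the expected false positive rate of a cluster is the average probability that its assigned spikes belong elsewhere, and the expected miss rate is the share of the cluster's total probability mass that was assigned to other clusters.

```python
import numpy as np

def isolation_rates(posteriors):
    """Expected false positive and miss rates per cluster.

    posteriors: (n_spikes, n_clusters) array whose rows sum to one.
    """
    assigned = posteriors.argmax(axis=1)
    n_clusters = posteriors.shape[1]
    false_pos = np.full(n_clusters, np.nan)
    miss = np.full(n_clusters, np.nan)
    for k in range(n_clusters):
        mine = assigned == k
        if mine.any():
            false_pos[k] = (1.0 - posteriors[mine, k]).mean()
        total = posteriors[:, k].sum()
        if total > 0:
            miss[k] = posteriors[~mine, k].sum() / total
    return false_pos, miss

# Hypothetical posteriors for four spikes and two clusters.
posteriors = np.array([
    [0.99, 0.01],
    [0.98, 0.02],
    [0.04, 0.96],
    [0.03, 0.97],
])
false_pos, miss = isolation_rates(posteriors)
well_isolated = (false_pos < 0.05) & (miss < 0.05)
```

Under these definitions both toy clusters pass the 5% criterion described above.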
For our original RR, we asked if these units are still well-isolated when classified with hoops.
We computed the false positive and miss rates for hoop classification by comparing against the
initial clustering results. For the same three datasets, only 46, 33, and 25 units had false positive
and miss rates of less than 5% when sorting with hoops. This was a significant drop in the number
of well-discriminable units. Furthermore, this error comparison excluded noise snippets that were
misclassified as spikes. When we lifted this exemption, we found that noise heavily influences the
misclassification rates and many fewer neural units satisfied our goodness criteria.
Ultimately, the hoop-based classifier performed well but did not achieve exceptional results.
There was either extraneous noise assigned to legitimate units or a loss of spikes into the hash
unit. For example, our hoop-based system was unable to reliably sort all five units on the
electrode data shown in Fig. 2.5. Nevertheless, the overall sorting performance was assessed to be
qualitatively equivalent to human selection of the hoops; often an individual may feel he is
selecting acceptable hoops, but he is unable to fully appreciate the underlying clustering of the data.
Figure 2.8 provides an extreme limit case where hoop-based sorting breaks down.
Figure 2.8: Waveforms from two clusters with the shading height corresponding to two times the interquartile range, centered at the median. These units are easily separated by the clustering algorithm, with low false positive and miss rates. While the median shapes are distinct, a hoop-based classifier struggles with the data due to the spread of the waveforms. Hoops placed for the green unit capture 26.7% false positives from the red unit even though the clustering algorithm estimates false positives at less than 5%. Data were taken from electrode G20040312.21. The x-axis is time relative to threshold crossing (ms).
Compared to RR, spike-sorting performance was considerably improved when using RRR, as
verified by human inspection. This is because the sorts were performed in the reduced-dimension
space using the Sahani algorithm’s probabilistic model. As such, unlike hoop classification, the
overall shape of the waveform was implicitly considered rather than treating each timepoint as
independent. This allows for the distinguishing of shapes with considerable timepoint-by-timepoint
overlap (e.g., Fig. 2.8).
2.3.2 Target Location Estimation
Next, we performed a target estimation analysis to verify the hypothesis that spike sorting allows
for greater information extraction. For each day’s data, we excluded electrodes that did not have
two or more clustered units as determined by our training algorithm. This exclusion is sensible
since if an electrode had only one very large amplitude spike waveform, its signal would be de
facto spike-sorted with even just a thresholding scheme. Furthermore, for our task, the estimation
performance asymptotes as the number of electrodes is increased, even if the additional electrodes
only possess unsortable neural activity. To illustrate the benefits of spike sorting, we “biased”
the simulations by considering only sortable electrodes. We suggest that this biasing would not
be necessary for a more challenging behavioral task; see Carmena et al. (2003), where more
performance was gained by spike sorting.
Spiking rate was calculated for each unit (or electrode for the unsorted simulations) in a 150
to 350 ms window following the reach target presentation. The results of the maximum-likelihood
estimator are summarized in Table 2.1. For RR, we found a performance increase between 2.7 and
5.7% when using spike-sorted units for classification. The increase was dependent on the following
parameters: model training size, spike integration window, and the electrodes that were excluded.
Searching this entire space of parameters is intractable and unnecessary. We can however report
that in the various scenarios we tested, RR spike sorting resulted in approximately the same
performance increases relative to performance when using simple threshold crossings. On two
occasions, we compared the automated sorting architecture and hand-optimized hoop locations;
the two methods were nearly equivalent in performance.
Table 2.1: Decoding Performance Improvement due to Spike Sorting

Data Set    # of Tgts   # of Elec.   Unsorted Perf.   RR Perf.   RRR Perf.
G20040329   8           36           64.4%            70.1%      73.2%
G20040330   8           35           66.3%            71.9%      75.6%
G20040413   16          35           75.6%            80.0%      83.5%
G20040417   8           48           91.1%            93.8%      94.5%
G20040421   8           42           83.7%            89.1%      91.3%

When using RRR, there was a consistent increase in performance over even RR, up to +3.7%.
The total performance increase over unsorted data is 7.5-9.3% (when excluding G20040417).⁴ If
we are to further restrict our analyses to only electrodes that have 3 separate neural units or more,
we see that there can be an even more dramatic difference between unsorted-, RR-, and RRR-based
performances. These numbers are listed in Table 2.2. With more neural units on a given electrode,
differentiating between them becomes much more important. Hence, better sorting leads to better
overall decoding performance, which agrees with intuition. Finally, as will be shown in Chapter 3,
what appear to be small gains in decode accuracy can have large impacts on overall performance.
Table 2.2: Decoding Performance Improvement when Further Restricting Electrodes

Data Set    # of Tgts   # of Elec.   Unsorted Perf.   RR Perf.   RRR Perf.
G20040329   8           20           59.2%            65.5%      69.2%
G20040330   8           16           58.6%            65.1%      70.0%
G20040413   16          16           64.6%            69.9%      72.9%
G20040417   8           18           80.9%            84.2%      86.8%
G20040421   8           24           80.9%            85.5%      87.7%

⁴ The performance increase when including all of the electrodes is 6.1-7.8%, again omitting G20040417.
2.4 Feasibility of Implantable Spike-Sorting Circuits
Based on the success of proof-of-concept prosthetic systems in the laboratory (see Chapter 3), there
is now considerable interest in creating implantable electronics for use in clinical systems. A critical
question is whether it is possible to perform the spike-sorting operations in real-time and with low
power. Low power is essential both for power supply considerations and heat dissipation in the
brain. In order to answer this question, we performed a feasibility analysis to estimate how much
power would be theoretically consumed by an advanced real-time spike-sorting algorithm. Only a
summary of this work is provided here. For more details please refer to Zumsteg et al. (2005).
To demonstrate the feasibility of high quality real-time spike sorting in implanted hardware,
we chose the algorithm that we believe to be both one of the best and one of the most
computationally intensive spike sorting algorithms available. We intentionally sought a state-of-the-art spike
sorting algorithm, which is uncompromising in spike sorting quality and relies on principled
machine learning techniques, to help assure that our power estimates would not be overly optimistic.
As detailed earlier in Section 2.2.4, our algorithm of choice is the Sahani algorithm.
We addressed the power consumption of the two major computational elements of a spike
sorting system: analog-digital conversion (ADC) and the digital training/classification. For the ADC
component, we first turned to previous reports of low-energy ADCs (Scott et al. 2003).
However, recent developments in low-power ADC design have leveraged the extremely
power-efficient digital circuit to “aid” the analog design (Murmann and Boser 2004). As a result, using
these digital calibration and compensation techniques, the power consumption of ADCs is expected
to be reduced by an order of magnitude from the values quoted by Scott et al. (2003). Hence, a
converter consuming close to 1 µW with 8-bit resolution at 30 kHz (or 100 µW for 100 channels)
should be achievable.
We estimated the power requirements of the Sahani spike-sorting algorithm by recasting the
operations performed to simple instructions that can be implemented in integrated circuits (ICs).
A detailed analysis of the algorithms was carried out and approximate figures for the number
of operations (specifically adds and multiplies) required for each task were obtained.⁵ Operation
counts were then translated to power using the conversion factor 1 mW/GOPS (Chandrakasan
et al. 1992). This figure is used as the standard power consumption per operation for ASICs
implemented in 0.13 µm CMOS technology. Finally, to approximate power usage from memory
accesses, we simply double the power from instruction execution (Meng et al. 1998). The figures
should be taken as an “order of magnitude” indication. However, we believe that these figures are
indicative of the power consumption, and thus achieve the objective of showing that these systems
can be implemented in an implantable neural prosthesis.
⁵ Operation counts for some complex linear algebra functions used in the algorithms, like matrix decompositions, were taken from standard texts on numerical linear algebra (Golub and Van Loan 1983).
For the training portion of the Sahani algorithm, assuming a training interval of 12 hours, the
total power requirement is approximately 2.8 µW for 100 electrodes. Next, the classification
process itself contributes relatively little to the overall power consumption of real-time spike sorting,
even though it must be operated continuously. A simplified classification, using only the
minimum Euclidean distance to a cluster (i.e., the traditional technique used in concert with PCA),
requires 1.3 x 10^4 ops/sec/electrode. This corresponds to 0.026 µW/electrode or 2.6 µW for 100
electrodes. Finally, most of the real-time computational burden is dominated by the high-pass filter
and thresholding. The problem is made more difficult by the fact that the LFP is in the 0.5-100 Hz
frequency range, while much of the signal power is concentrated in the 1000-3000 Hz range. With
a sampling frequency of 30 kHz, the necessary transition band is somewhat steep. A digital filter
consumes approximately 1 µW per electrode or 100 µW per 100 electrodes. This figure is similar to
that of an analog filtering approach although it will not require large capacitors and resistors which
can be chip-area intensive.
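The conversion from operation counts to power can be checked with a short calculation, with power expressed in microwatts (µW). The sketch applies the 1 mW/GOPS factor and the doubling for memory accesses described above; the constant and function names are hypothetical.

```python
MW_PER_GOPS = 1.0      # conversion factor quoted in the text
MEMORY_FACTOR = 2.0    # instruction power doubled to account for memory accesses

def ops_to_microwatts(ops_per_second):
    """Estimated power in microwatts for a given operation rate."""
    gops = ops_per_second / 1e9
    milliwatts = gops * MW_PER_GOPS * MEMORY_FACTOR
    return milliwatts * 1e3

per_electrode_uw = ops_to_microwatts(1.3e4)     # about 0.026 µW
hundred_electrodes_uw = 100 * per_electrode_uw  # about 2.6 µW
```

The result reproduces the quoted classification cost of 0.026 µW per electrode, or 2.6 µW for 100 electrodes.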
Therefore, with 100 electrodes, an upper bound of the power consumption of our spike sorting
algorithm (without interpolation during real-time operation) is ~150 µW. Also, we have shown
that the hundred 8-bit, 30 kHz analog-to-digital converters needed for digital spike sorting are
expected to consume less than 100 µW of power. Thus, 250 µW is an achievable level of power
consumption for an implantable, 100 electrode digital spike sorting circuit. Assuming heat
dissipation over a 16 mm² chip, we have a power to area ratio of about 1.6 mW/cm², which is well
below the 80 mW/cm² chronic heat dissipation threshold believed to cause tissue damage (Seese
et al. 1998). By way of comparison, for 100 electrodes, the all-analog approach of Harrison (2003)
would require 5.7 mW, and the wavelet compression technique of Oweiss et al. (2003) 120 mW.
While these alternative approaches may benefit from some of the architectural techniques which
we leverage for our estimates, the loss of information and less than ideal data compression remain
significant when compared to our proposed implantable, spike-sorting approach.⁶
⁶ We have not considered the requisite low-noise amplifier in this report as all approaches to spike sorting require its use, and because recent reports have demonstrated low power (< 1 µW per channel) and noise (~2 µV) levels (Horiuchi et al. 2004; Harrison and Charles 2003).
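The overall budget and the resulting heat density follow from the component figures quoted in this section, taken in microwatts (µW). The sketch below simply tallies them; the dictionary keys and function name are hypothetical labels for those quoted values.

```python
# Component estimates for 100 electrodes, in microwatts, as quoted in the text.
digital_uw = {
    "training": 2.8,
    "classification": 2.6,
    "high-pass filter and threshold": 100.0,
}
DIGITAL_UPPER_BOUND_UW = 150.0   # stated upper bound for the sorting algorithm
ADC_UW = 100.0                   # one hundred 8-bit, 30 kHz converters
CHIP_AREA_MM2 = 16.0
DAMAGE_THRESHOLD_MW_PER_CM2 = 80.0

def power_density_mw_per_cm2(power_uw, area_mm2):
    """Convert a power in µW spread over an area in mm² to mW/cm²."""
    return (power_uw / 1e3) / (area_mm2 / 100.0)

digital_sum_uw = sum(digital_uw.values())             # about 105 µW
total_uw = DIGITAL_UPPER_BOUND_UW + ADC_UW            # 250 µW
density = power_density_mw_per_cm2(total_uw, CHIP_AREA_MM2)
safety_margin = DAMAGE_THRESHOLD_MW_PER_CM2 / density
```

The itemized digital components sum to roughly 105 µW, inside the ~150 µW bound, and 250 µW over 16 mm² gives about 1.6 mW/cm², roughly fifty times below the 80 mW/cm² threshold.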
2.5 Summary
We demonstrated that fully automated spike sorting for laboratory experiments involving
hundreds of neural electrodes is practical with present-day technology. Our architecture facilitates
use of unsupervised clustering algorithms for configuring existing real-time spike classifiers.
Furthermore, we also demonstrated that the performance of a target estimator was improved when
using sorted information as opposed to threshold crossings. The performance improvement was
moderate and it was gained with little expense. The infrastructure, once installed, is trivial to run
before every day’s experiment, and it is extensible past the point where rapid, consistent,
human-assisted sorting of hundreds of electrodes becomes untenable. Finally, the training stage is truly
quantifiable and can serve as a more robust daily record of the neural implant’s stability.
We have offered two alternatives. The RR architecture is designed to exploit the real-time
classification capability of the Cerebus NSP. Since it uses a very simple classifier (time-amplitude
hoops) it should extend easily to thousands of electrodes. The RRR architecture is more
computationally intensive since it must perform operations similar to the training algorithm
(peak-alignment; subspace projection) and then classify based on a maximum-likelihood computation.
However, spike shapes can be more accurately sorted using this technique. The extra performance
benefits of RRR over RR are measurable and the sorting method is more mathematically
principled and computationally tenable for the ~100 electrode systems we use in the laboratory today.
The computational complexities of RRR are also theoretically realizable in a custom, implantable
solution given our power feasibility calculations.
It is important to note that sorting units may possibly reduce decode performance. If nearby
neurons had similar tuning properties, it could be advantageous to group them together as a single
channel rather than separating each into its own unit. This would help combat the inherent
spiking variability of neurons, which is often modeled as an inhomogeneous Poisson distribution
(Dayan and Abbott 2001). The idea of designing the sorting parameters based on the optimality of
the final decoding algorithm is an exciting line of research and requires further investigation. To
properly explore this subject, one would have to first spike sort and then ask whether it is better to
fuse together the separated units. As such, there is a need for a solid spike-sorting infrastructure
regardless.
Finally, for many neuroscience experiments and future prosthetic work where single neuron
adaptation is expected, it is critical to track individual neurons. The experimenter needs to
validate that shifts in a neuron’s response are indeed legitimate and not an artifact of poor signal
separation. Robust architectures like that proposed in this Chapter can hopefully facilitate these
types of studies.
2.6 Credits
A report of RR has previously appeared in a five-page conference paper format (Santhanam et al.
2004). The starting point for our endeavors was the collection of algorithms developed by
Dr. Maneesh Sahani for his doctoral dissertation (Sahani 1999). I wrote and tested all of the real-time
software, which interfaced tightly with the algorithmic code provided by Dr. Sahani. Dr. Sahani
and I worked closely to develop the new greedy algorithm for RR and I later ported RR to RRR.
Dr. Stephen Ryu (i.e., “R”) was the primary surgeon for electrode implantation and provided the
initial inspiration for the project; he would perform daily spike sorting by hand prior to the advent
of this automated system. Caleb Kemere conducted much of the analysis regarding the feasibility
of an implantable spike-sorting solution, Stephen O’Driscoll assisted with the ADC power
computations, and Professor Teresa Meng contributed many valuable scientific discussions.
We also thank Byron Yu for assisting with data collection, Missy Howard for surgical assistance
and veterinary care, and Dr. Nicho Hatsopoulos for surgical assistance with the monkey G implant.
This study was supported by the NDSEG Fellowship (GS), NIH grant NS-10414 (MS), the
Coleman Fund (MS), the Christopher Reeve Paralysis Foundation (SIR,KVS), MARCO Center
for Circuit & System Solutions (www.c2s2.org) under contract 2003-CT-888 (THM,CK), and the
following awards to KVS: the NSF Center for Neuromorphic Systems Engineering at Caltech,
ONR, Whitaker Foundation, Center for Integrated Systems at Stanford, Sloan Foundation, and
Burroughs Wellcome Fund Career Award in the Biomedical Sciences.
Chapter 3
A High-Performance
Brain-Computer Interface
3.1 Overview
As covered in Chapter 1, brain-computer interfaces (BCIs) may someday assist patients suffering
from neurological injury or disease, but relatively low system performance remains a major
roadblock. In fact, the speed and accuracy with which keys can be selected using BCIs are still
far lower than for systems relying on simple eye movements. This is true whether BCIs employ
recordings from populations of individual neurons using invasive electrode techniques (Serruya
et al. 2002; Taylor et al. 2002; Carmena et al. 2003; Musallam et al. 2004; Kennedy et al. 2000;
Hochberg et al. 2006; Patil et al. 2004) or EEG recordings using less- or non-invasive (Leuthardt
et al. 2004; Wolpaw and McFarland 2004) techniques. In Chapter 2, we presented a front-end
approach to improving prosthetic performance, namely performing a more principled job of discriminating
neurons recorded from implanted electrodes. We now turn to improving the task of
decoding neural signals to predict the motor intentions of a subject.
Most BCIs translate neural activity into a continuous movement command, which guides a
computer cursor to a desired visual target (Kennedy et al. 2000; Serruya et al. 2002; Taylor et al.
2002; Carmena et al. 2003; Leuthardt et al. 2004; Wolpaw and McFarland 2004; Patil et al. 2004;
Hochberg et al. 2006). If the cursor is used to select targets representing discrete actions, the BCI
serves as a communication prosthesis. Examples include typing keys on a keyboard, turning on
room lights, and moving a wheelchair in specific directions. Human-operated BCIs are currently
capable of communicating only a few letters per minute (~1 bit/s sustained rate; Wolpaw and
McFarland 2004) and monkey-operated systems can only accurately select one target every 1-3
seconds (~1.6 bits/s sustained rate; Taylor et al. 2003), despite using invasive electrodes.
An alternate, potentially higher-performance approach is to translate neural activity into a
prediction of the intended target and immediately place the cursor directly on that location. This
type of control is appropriate for communication prostheses and benefits from not having to estimate
unnecessary parameters such as continuous trajectory (Shenoy et al. 2003; Musallam et al.
2004). In this chapter, we describe how we conducted an iterative series of experiments to investigate
how quickly and accurately a BCI could operate under direct endpoint control. We were able
to design and demonstrate, using electrode arrays implanted in monkey dorsal pre-motor cortex,
a manyfold higher performance BCI than previously reported (Wolpaw and McFarland 2004; Taylor
et al. 2003). These results indicate that a fast and accurate key selection system, capable of
operating with a range of keyboard sizes, is possible (up to 6.5 bits/s, or ~15 words per minute,
with 96 electrodes). The highest information throughput is achieved with unprecedentedly brief
neural recordings, even as recording quality degrades over time. These performance results and
their implications for system design should substantially increase the clinical viability of BCIs in
humans. The significance of this work has been independently assessed by Scott (2006).
3.2 Methods
We trained two rhesus monkeys (G and H) to perform a standard instructed-delay center-out reaching
task (Cisek and Kalaska 2004) to first assess neural activity in the arm representation of
monkey dorsal pre-motor cortex (PMd), as shown in Fig. 3.1a. Animal protocols were approved by the
Stanford University Institutional Animal Care and Use Committee. Hand and eye position were
tracked optically (Polaris, Northern Digital, Canada; Iscan, Burlington, MA). Stimuli were back-projected
onto a frontoparallel screen 30 cm from the monkey. Real-reach trials (consult Fig. 3.1a)
began when the monkey touched a central yellow square and fixated his eyes on a magenta cross.
Following a touch hold time (200-400 ms), a visual reach target appeared on the screen. After
a randomized (200-1000 ms) delay period, a “go” cue (central touch and fixation cues were extinguished
and reach target was slightly enlarged) indicated that a reach should be made to the
target. As previously reported, neural activity during the delay period (time from target appearance
until “go” cue) reflects the endpoint of the upcoming reach (Messier and Kalaska 2000). The
reach endpoint can be decoded from delay-period activity using maximum-likelihood techniques
(Yu et al. 2004).
Eye fixation was enforced throughout the delay period to control for eye-position-modulated
activity in PMd (Cisek and Kalaska 2002; Batista et al. 2005). This fixation requirement is appropriate
in a clinical setting if targets are near-foveal, or imagined as in a virtual keyboard setup.
The hand was also not allowed to move until the go cue was presented, providing a proxy for the
cortical function of a paralyzed subject (Serruya et al. 2002; Taylor et al. 2002; Carmena et al.
2003). Subsequent to a brief reaction time, the reach was executed, the target was held (~200 ms),
and a juice reward was delivered along with an auditory tone. An inter-trial interval (~250 ms)
was inserted before starting the next trial. We presented various target configurations (2, 4, 8 or
16 targets) on the screen, including layouts with 2, 4, or 8 directions, and 1 or 2 distances (6-12 cm
radially outward).
The aforementioned paradigm was used for control experiments that helped us design our
BCI system. When actually implementing our BCI system, we modified the system to display
targets in rapid succession, as will be detailed later. We call these our BCI experiments. This
allowed us to test the true performance of the system in a setting analogous to its usage scenario
for human patients.
3.2.1 Neural recordings.
Neural activity was simultaneously recorded from a 96-channel electrode array (Cyberkinetics
Neurotechnology Systems, Foxborough, MA) implanted in the arm representation of PMd, contralateral
to the reaching arm (left, monkey G; right, monkey H). For monkey G, we used the second-generation
spike sorting (RR) described in Chapter 2. The third-generation, more sophisticated
system was used for monkey H (RRR; also detailed in the previous chapter). The use of an automatic
spike-sorting system ensured a very fast and repeatable method for classifying neural units
each day. We recorded 20-30 single neurons and 60-100 multi-neuron units in a typical session.
Figure 3.2 shows the anatomical placement of the electrode array in both monkeys.
[Figure 3.1 image: task timelines with neural rasters. Panel a: touch hold, delay period, “go” cue, real reach. Panel b: touch hold, trials #1 to #4, delay period, “go” cue.]
Figure 3.1: Instructed-delay (real reach) and BCI (prosthetic cursor) tasks, with accompanying neural data. Large numbered ellipses draw attention to the increase in neural activity related to the peripheral reach target. a. Standard instructed-delay reach trial. Data from selected neural units are shown (gray shaded region); each row corresponds to one unit and black ticks indicate spike times. Units are ordered by angular tuning direction (preferred direction) during the delay period. For hand (H) and eye (E) traces, blue and red lines show the horizontal and vertical coordinates, respectively. Full range of scale for these data is ±15 cm from the center touch cue. b. Chain of three prosthetic cursor trials followed by a standard instructed-delay reach trial. Tskip is denoted by orange in the timeline. Neural activity was integrated (Tint) during the purple shaded interval and used to predict the reach target location. After a short processing time (Tdec+rend ≈ 40 ms), a prosthetic cursor was briefly rendered and a new target was displayed. The dotted circles represent the reach target and prosthetic cursor from the previous trial, both of which were rapidly extinguished before the start of the trial indicated. Trials from experiment H20041106.1 with monkey H.
[Figure 3.2 image: intraoperative photographs for monkey G and monkey H, with posterior and anterior directions labeled.]
Figure 3.2: Placement of electrode arrays in PMd of monkeys G and H. For both monkeys, the arrays were placed in a location that spans dorsal pre-motor and primary motor cortices. The neural signals tended to be responsive during both the delay period and the movement phase of trials. Intraoperative photographs of the array implanted in cerebral cortex are shown with sulci indicated. Overlapping diagram shows the relative array placement between monkeys. Monkey H’s sulcal pattern is reflected vertically and rotated to bring the sulci into alignment with those of monkey G. Ce.S.: central sulcus; S.Pc.D.: superior precentral dimple; Sp.A.S.: spur of the arcuate sulcus; A.S.: arcuate sulcus.
In BCI experiments, a selection process determined which units were to be used during target
prediction. For monkey G, we used 0-4 single units for each electrode, the exact number varying
from day to day, along with an optional multi-unit classification. Single units were preferentially
included by signal-to-noise ranking. We collected data from 18 separate BCI experiments from
monkey G, each experiment containing many hundreds of trials.
For monkey H, we included multi-unit activity along with 0-5 single units per electrode. An
additional ANOVA criterion was applied to include only units that were significantly modulated
by reach target direction during the delay period (p < 0.01). We collected data from 40 separate
BCI experiments from monkey H, each experiment containing many hundreds of trials. With the
aforementioned selection criteria, ~70-90 neural units were used in our highest performance BCI
experiments with monkey H.
To better understand what proportion of single units and multi-units were recorded, we examined
neural data with parameters similar to those used during our BCI experiments. For monkey
H (using dataset H20041217; 8 targets), there were 25 tuned single units and 89 tuned multi-units
(tuning assessed with ANOVA, p < 0.05). Tuned units are those units that show a statistically
significant difference in spike count as the direction of the reach is varied. For monkey G (using
dataset G20040508; 8 targets), there were 26 tuned single units and 65 tuned multi-units. We
evaluated the sort quality of a unit (single versus multi) using all spiking data in each experiment
and the discriminability between units in the modified principal components space (Sahani 1999).
3.2.2 Decoding Algorithms
Maximum-likelihood techniques (or decoding algorithms, as we refer to them) are central to our
ability to decode neural activity in order to discover a subject’s motor intentions. For each trial, we
compress the activity recorded off of the array into a vector that denotes the number of spikes from
each neuron during the delay period.1 We then model the spike counts as a random vector derived
from either a multivariate Gaussian or Poisson distribution. Taking the case of a multivariate
Gaussian distribution, we can write the following mathematical expression:
$$f(y \mid s) = \frac{1}{(2\pi)^{q/2} |\Sigma_s|^{1/2}} \exp\left( -\frac{1}{2} (y - \mu_s)^{\mathsf{T}} \Sigma_s^{-1} (y - \mu_s) \right) \tag{3.1}$$
where $y \in \mathbb{R}^q$ is the vector of spike counts for a single trial, and $\mu_s \in \mathbb{R}^q$ and $\Sigma_s \in \mathbb{R}^{q \times q}$ are the
mean vector and the covariance matrix fitted to the data for reach endpoint $s \in \{1, \ldots, M\}$. This is
illustrated with simulated data in Fig. 3.3 for q = 2 and M = 3. The mean vector will be different
for each reach endpoint s and for any given trial the observed spike counts will be perturbed by
Gaussian noise.
The parameters of the model are first fit with a set of trials dubbed “training trials.” Then the
fitted model can be used to make predictions of the reach direction given only the neural data. We
decoded reach direction for “test” trials using maximum likelihood as follows:
\begin{align}
\hat{s} &= \arg\max_{s} P(s \mid y) \tag{3.2} \\
&= \arg\max_{s} \frac{f(y \mid s) P(s)}{f(y)} \tag{3.3} \\
&= \arg\max_{s} f(y \mid s), \tag{3.4}
\end{align}
where $\hat{s}$ is the estimated reach endpoint. Equation (3.3) is obtained using Bayes’ rule and Eq. (3.4)
is a result of all reach directions being equally likely and f(y) not being dependent on s.
1 The discrimination of individual neurons from the recorded electrode voltages and the determination of their spike times was covered in Chapter 2.
[Figure 3.3 image: scatter plot of simulated spike counts with fitted covariance ellipses; x-axis: y1 (# of spikes).]
Figure 3.3: Multivariate Gaussian data-fitting. Each point corresponds to a single trial and its color corresponds to the actual reach direction on that trial. A covariance ellipsoid is fit to the set of data points for each reach direction. Only three reach directions (0° blue, 90° green, 180° red) are shown. The ellipses correspond to 50% confidence regions.
With our datasets, the number of neural units (q) is most often comparable to the number of
training trials (150-200 neurons versus 50-100 trials per reach endpoint). Therefore, we chose to
constrain the covariance matrix ($\Sigma_s \in \mathbb{R}^{q \times q}$) to be diagonal in order to avoid overfitting issues with
a full covariance matrix. We assume that the spike counts from each neuron are independent of the
counts from the other neurons once the reach direction is given.
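As a minimal sketch of the diagonal-covariance Gaussian decoder described above (Eqs. 3.1 and 3.4), the following Python code fits a per-target mean and per-unit variance and then selects the most likely target. The function names, the NumPy array layout (trials by units), and the small variance floor are illustrative assumptions and are not taken from the original real-time system.

```python
import numpy as np

def fit_diag_gaussian(counts, labels, n_targets, var_floor=1e-3):
    """Fit a mean vector and a diagonal covariance for each reach target.

    counts: (n_trials, q) array of spike counts; labels: (n_trials,) target ids.
    var_floor is a hypothetical safeguard against zero variance.
    """
    q = counts.shape[1]
    means = np.zeros((n_targets, q))
    variances = np.zeros((n_targets, q))
    for s in range(n_targets):
        trials = counts[labels == s]
        means[s] = trials.mean(axis=0)
        variances[s] = trials.var(axis=0) + var_floor
    return means, variances

def decode_diag_gaussian(y, means, variances):
    """Return argmax over s of log f(y | s), assuming equal priors (Eq. 3.4)."""
    log_lik = -0.5 * np.sum(
        np.log(2.0 * np.pi * variances) + (y - means) ** 2 / variances, axis=1
    )
    return int(np.argmax(log_lik))
```

With a diagonal covariance the log-likelihood reduces to a sum over units, so the number of fitted variance parameters per target grows linearly with q rather than quadratically.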
Another common approach for modeling the spike counts of neurons is to fit the data to
a Poisson distribution (Dayan and Abbott 2001). Fig. 3.4 shows the distribution of spike counts
from a random subset of neuron/reach-direction pairs using data collected from our standard experimental
setup. Overlaid on the plot are the theoretical distributions for a Poisson model
(matched mean; blue) and a Gaussian model (matched mean and standard deviation; red).
A Gaussian distribution can model spike count data well for high mean counts, but when there
are fewer spikes in a given delay period, the Gaussian is no longer a good fit. The Poisson
model, however, provides a better fit for lower spike rates. We have also examined the Fano factor
and found that this measure roughly agrees with the value expected from a Poisson distribution.
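For reference, the Fano factor is the variance of the spike count divided by its mean, and it equals 1 for a Poisson distribution. The short sketch below shows one way to compute it for a single unit and reach direction; the function name and the use of the unbiased sample variance are assumptions made for illustration.

```python
import numpy as np

def fano_factor(spike_counts):
    """Variance-to-mean ratio of spike counts across trials.

    A value near 1 is what a Poisson model predicts. Returns NaN when the
    unit never fired, since the ratio is undefined in that case.
    """
    counts = np.asarray(spike_counts, dtype=float)
    mean = counts.mean()
    if mean == 0:
        return float("nan")
    return counts.var(ddof=1) / mean  # ddof=1: unbiased sample variance
```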
For a Poisson-based model, the probability mass function (pmf) can be expressed as:
$$p(y_i \mid s) = \frac{e^{-\lambda_{i,s}} \, \lambda_{i,s}^{y_i}}{y_i!} \tag{3.5}$$
where $y_i \in \mathbb{N}_0$ is the spike count for neural unit i in a single trial and $\lambda_{i,s} \in \mathbb{R}_+$ is the mean spike count
fitted to the training data for reach direction $s \in \{1, \ldots, M\}$.
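The Poisson pmf in Eq. (3.5) can be turned into a decoder in the same way as the Gaussian model. The sketch below assumes that units are conditionally independent given the target, so the joint log-likelihood is a sum over units, and it drops the log(y_i!) term because that term does not depend on s. The function names and the rate floor, which avoids taking the log of zero for silent units, are illustrative assumptions.

```python
import numpy as np

def fit_poisson(counts, labels, n_targets, rate_floor=1e-2):
    """Estimate the mean spike count of each unit for each reach target.

    counts: (n_trials, q) spike counts; labels: (n_trials,) target ids.
    rate_floor is a hypothetical lower bound on the fitted means.
    """
    rates = np.zeros((n_targets, counts.shape[1]))
    for s in range(n_targets):
        rates[s] = counts[labels == s].mean(axis=0)
    return np.maximum(rates, rate_floor)

def decode_poisson(y, rates):
    """Return argmax over s of sum_i [y_i * log(rate_is) - rate_is]."""
    log_lik = np.sum(y * np.log(rates) - rates, axis=1)
    return int(np.argmax(log_lik))
```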
It is important to note that the Poisson-like noise properties of neural recordings are not a specific
feature of our recordings, but rather a generally accepted model for cortical neurons. There
is strong precedent with respect to neural decoding algorithms using the Poisson distribution to
describe the data. Such algorithms most often use either Gaussian (Maynard et al. 1999; Zhang
et al. 1998) or Poisson models (Zhang et al. 1998; Brown et al. 1998; Smith and Brown 2003; Truccolo
et al. 2004; Brockwell et al. 2004). The belief that neuronal spike counts are best modeled
with a Gaussian or Poisson distribution is quite strong, and, while not perfect (e.g., a Gamma
distribution can sometimes be better), it is considered to be a reasonable approximation and also
computationally tractable.
[Figure 3.4 image: 16 histogram panels, one per neural-unit and reach-direction pair (for example, unit12,1-dir1); x-axis: spike count; y-axis: normalized frequency.]
Figure 3.4: Histogram of normalized spike counts with overlaid Poisson distribution (red) and Gaussian distribution (blue) for a random selection of neural units. The x-axis denotes spike counts and the y-axis is the normalized frequency or probability. Specific neural-target pairs are plotted in each subpanel. Spike counts are summed over the interval [150,250] ms referenced from target presentation. There is an arbitrary y-axis scaling for the Gaussian; this degeneracy was resolved by rescaling the maximum point on the blue curve to coincide with the data histogram.
3.2.3 Model Training
We fit (trained) models based on neural activity collected starting Tskip after the target presentation
time and extending for a duration Tint. For the control experiments, we either used two separate
blocks of trials, one for training and one for prediction, or used leave-one-out cross-validation
with all the trials in a dataset. For BCI experiments, training trials were initially collected to fit
the models and during subsequent trials, the target was predicted with the model (Gaussian model
for monkey G and Poisson model for monkey H). There were many hundreds of test trials in these
BCI experiments. As such, we were able to provide sufficient repeatability in our experiment so as
to ensure a high degree of confidence in our results.
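The leave-one-out procedure mentioned above for control experiments can be sketched as follows. The cross-validation loop is generic; the nearest-mean classifier is only a stand-in used to keep the example self-contained and is not the Gaussian or Poisson decoder used in the experiments. All function names are hypothetical.

```python
import numpy as np

def leave_one_out_accuracy(counts, labels, fit, predict):
    """Hold out each trial in turn, fit on the rest, and score the prediction."""
    n_trials = counts.shape[0]
    n_correct = 0
    for k in range(n_trials):
        keep = np.arange(n_trials) != k  # boolean mask excluding trial k
        model = fit(counts[keep], labels[keep])
        n_correct += int(predict(counts[k], model) == labels[k])
    return n_correct / n_trials

def fit_nearest_mean(counts, labels):
    """Stand-in model: one mean spike-count vector per target."""
    targets = np.unique(labels)
    means = np.stack([counts[labels == s].mean(axis=0) for s in targets])
    return targets, means

def predict_nearest_mean(y, model):
    """Pick the target whose mean vector is closest to y."""
    targets, means = model
    return targets[np.argmin(np.sum((means - y) ** 2, axis=1))]
```

Any pair of fit and predict functions with the same calling convention can be passed in, so the same loop could wrap a Gaussian or Poisson decoder.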
3.3 Control Experiments
Before conducting BCI experiments, we analyzed the neural activity from instructed-delay control
experiments to set parameters essential for high-performance BCI operation. We subdivided the
delay period into two epochs: a time to skip after target onset while waiting for reach endpoint
information to become reliable and readily decodable (Tskip) and a time to integrate the neural
data that will be used to predict the desired target selection (Tint).
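The two epochs define the window over which spikes are counted for each unit. A minimal sketch, assuming spike times are stored in milliseconds relative to target onset and that the window is half-open (start included, end excluded), could look like this; the function name and data layout are hypothetical.

```python
import numpy as np

def window_spike_counts(spike_times_ms, t_skip_ms, t_int_ms):
    """Count spikes per unit in [t_skip, t_skip + t_int) after target onset.

    spike_times_ms: list with one sequence of spike times (ms) per unit.
    Returns an integer array with one count per unit.
    """
    start = t_skip_ms
    stop = t_skip_ms + t_int_ms
    counts = []
    for times in spike_times_ms:
        t = np.asarray(times, dtype=float)
        counts.append(np.count_nonzero((t >= start) & (t < stop)))
    return np.array(counts)
```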
3.3.1 Selection of Skip Time (Tskip)
The first epoch, Tskip, includes the time for visual information about the target to arrive in PMd
(50-70 ms), the time for the subject to select among targets if more than one is present, and the
time for neural activity reflecting the desired target to be generated. Despite being of considerable
scientific interest (Yu et al. 2006a), neural activity during these early periods is discarded in the
present BCI design. Some activity during this period may already be predictive of the desired
target, but it is not yet clear how best to decode this information. Choosing a short Tskip can
reduce the overall length of each trial, but may adversely affect prediction accuracy. Tskip was
chosen to be 150 ms based on control experiments including a multi-target task where the monkey
was trained to reach for one of many simultaneously-presented targets, as described directly below.
Before visual information is relayed to PMd, the measured neural activity in PMd is not target-related,
and random neural variability can inject noise into our decoding model. We computed the
average single-trial accuracy as a function of Tskip, fixing Tint to 50 ms. Figure 3.5 demonstrates
that the neural activity in PMd cannot be meaningfully decoded to predict the reach target until
~75 ms after the target is displayed. This estimate includes a ~16-33 ms delay between when
the software sends a request to show the stimulus and when it is actually displayed by the CRT
projector. Figure 3.5 also reveals that there is target-related information in PMd as early as 50-70 ms
after the target is first cued. It would not be possible to decode the target with above-chance
probability otherwise. This rough estimate of latency agrees with neural response plots from
other previous studies in PMd (Crammond and Kalaska 2000; Kalaska and Crammond 1995, etc.),
where some neurons show a change in activity very soon after stimulus onset. This exact latency
has further implications for BCI experiments where reach targets are presented in rapid succession.
Figure 3.1b shows that neurons were spiking according to the target location of a previous trial for
many tens of milliseconds after the start of a new trial (see just after ellipse #2).
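The caption of Fig. 3.5 reports a 95% confidence interval for a Bernoulli process around each accuracy estimate. The source does not state which interval construction was used; the sketch below uses the common normal approximation purely as an illustration, and the function name is hypothetical.

```python
import math

def accuracy_with_ci(n_correct, n_trials, z=1.96):
    """Single-trial accuracy with a normal-approximation confidence interval.

    z=1.96 corresponds to a two-sided 95% interval. The bounds are clipped
    to [0, 1] because accuracy is a proportion.
    """
    p = n_correct / n_trials
    half_width = z * math.sqrt(p * (1.0 - p) / n_trials)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)
```

The normal approximation becomes unreliable when accuracy is very close to 0 or 1 or when there are few trials, in which case an exact binomial interval would be a better choice.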
The previous analysis measures the latency of PMd neurons in a very specific situation,
where there is a single target displayed and the subject reaches to that target. To estimate the
time needed for the brain to select among multiple reach targets, we performed a separate control
experiment with both monkeys. We presented each monkey with a multi-target task where all of
the eight possible reach locations were shown on every trial, but only one was colored yellow while
the rest were colored green. The monkey was trained to reach for the yellow target following the
delay period. Figure 3.6 compares the performance for the conventional single-target instructed-delay
and multi-target tasks, as a function of Tskip. For the multi-target task, we require a longer
Tskip before there is a decodable reach plan. For monkey H, we used both a yellow-green and
yellow-blue color scheme.2 Comparing the two color schemes, there is a much larger (+150 ms)
latency for the yellow-green scheme. This large difference between the two color schemes demonstrates
that the difficulty of the task can greatly influence the speed at which plans are formed.
[Figure 3.5 image: decoding performance versus skip time for monkey G and monkey H; x-axis: skip time (Tskip) (ms), 0-400.]
Figure 3.5: PMd latency analysis with the single-target instructed-delay task (one reach target was shown out of a possible 8 locations and the remaining 7 locations were invisible) as a function of Tskip. Performance was calculated by training a Poisson model on all trials in a dataset and computing the leave-one-out cross-validated performance on the same data. The shaded area denotes the 95% confidence interval (Bernoulli process) around the mean performance (embedded line). Dark curves correspond to monkey G (dataset G20040603) and light curves to monkey H (dataset H20041117). Performance was calculated for a constant Tint of 50 ms with varying Tskip.
A question that frequently arises in visually cued studies such as ours is whether the neural
activity measured during the delay period is related to a reach plan, the visually cued stimulus, or
a combination of both. One such discussion of this issue can be found in Crammond and Kalaska
(1995). For example, recording from primary visual cortex could provide excellent prospects for
decoding the reach target in our single-target task, but a BCI operating on this neural activity
would not represent the motor intentions of the subject. Our multi-target task can serve as a
control experiment in this regard. Placing Tskip at the time where the performance curves in
Figs. 3.6b and 3.6d converge would provide assurance that such a BCI is decoding motor intention.
The multi-target task is also an inherently more difficult task than the single-target task. The
different time courses of these two tasks cannot be entirely due to the difference in visual stimuli,
especially since merely changing the color of the non-reach targets caused a considerable shift in
the time course for monkey H (cf. Figs. 3.6c and 3.6d). We therefore chose to be neither overly
conservative by waiting until the time at which the performance curves fully converged (250 ms),
nor overly liberal by selecting Tskip to be coincident with the early plateau in decoder accuracy
for the single-target task (75-100 ms). We chose a Tskip of 150 ms.
2 All colors were measured to be roughly isoluminant with a photometer calibrated for the primate visual system.
[Figure 3.6 image: panel a, task configurations; panels b-d, performance versus skip time (Tskip) (ms) for monkey G (b) and monkey H (c, d).]
Figure 3.6: Direct performance comparison between the single-target and multi-target tasks as a function of Tskip. a. Different task configurations. Tasks were interleaved in a pseudorandomized fashion during each experiment. Analysis is similar to that presented in Fig. 3.5. Performance is plotted with Tint fixed at 50 ms and Tskip varied. b. Performance with yellow-green color scheme converges at Tskip ≈ 250 ms (dataset G20040603). c. Performance with yellow-green color scheme converges at Tskip ≈ 400 ms (dataset H20041117). d. Performance with yellow-blue color scheme converges at Tskip ≈ 250 ms (dataset H20041201).
We used the single-target task, as opposed to the more complicated multi-target task, for our
BCI experiments because we felt that it provided the simplest analogy to a human prosthetic
system. While real patients may have to choose from several objects in their workspace, these
objects will ordinarily not be presented immediately prior to a decision to execute a prosthetic
reach. Furthermore, BCIs will typically rely on internally-generated target plans as opposed to
externally presented stimuli (Shenoy et al. 2003; Afshar et al. 2005) and it has been shown that
PMd exhibits robust motor plan activity in the absence of visual stimuli (Crammond and Kalaska
2000). BCIs tested with internally-generated plans may well achieve even greater performance
than what we have demonstrated in this report. Internally generated plans can be formed without
the added latencies of the visual system. Experiments are underway to test this hypothesis, but
these are outside the scope of this dissertation.
3.3.2 Selection of Integration Time (Tint)
The second epoch, Tint, directly follows Tskip and provides the neural data used to predict the
desired BCI cursor position. Given the Poisson-like noise in the spike timing of cortical neurons,
a longer Tint will average away more noise and result in more accurate predictions of reach endpoint.
However, a longer Tint will also reduce the total number of cursor positionings that can be
made per second. Herein lies the fundamental speed-accuracy tradeoff that we must optimize in
order to increase BCI performance.
To determine the best Tint to be used in BCI experiments, we analyzed the effect of this parameter
on two performance metrics. The first is single-trial accuracy, which is the percentage of trials,
on average, in which the target is correctly predicted. We found that accuracy rises and largely
saturates around 85-90% as Tint increases to 200-250 ms. Figure 3.7 illustrates this effect as a
function of total trial length, which is defined to be the sum of Tskip (150 ms), Tint (variable),
and a small system overhead time associated with decoding and rendering the prosthetic cursor
on the screen (Tdec+rend ≈ 40 ms). Should a minimum level of single-trial accuracy be required for
a particular application, a corresponding minimum Tint can be chosen.
The second performance metric is information transfer rate capacity (ITRC, in bits/s or bps).
This quantity measures the rate at which information is conveyed from the subject, through the
BCI, to the environment (Taylor et al. 2003; Shannon 1948). It is the information per trial, which
is closely related to single-trial accuracy, divided by the total trial length.3 As shown in Fig. 3.7,
the optimal ITRC occurs at short trial lengths, despite relatively low single-trial accuracy at these
trial lengths. The highest ITRC is 7.5 bps at a total trial time of 260 ms, which corresponds to a
Tint of 70 ms (Tskip = 150 ms, Tdec+rend = 40 ms).
3 As such, ITRC takes into account (1) task complexity, (2) the accuracy of task completion, and (3) the speed of task completion, and it is used universally to quantify performance of communication systems (Shannon 1948; Cover and Thomas 1991). For more discussion on ITRC, see Section 3.7 later in this chapter.
[Figure 3.7 image: single-trial accuracy and ITRC versus trial length; x-axis: trial length (ms), 200-400.]
Figure 3.7: Single-trial accuracy and information transfer rate capacity (ITRC) with monkey H. Performance curves investigating the dependence on Tint were calculated from control experiment H20041118 (8-target configuration). The trial length was Tskip + Tint + Tdec+rend with Tskip = 150 ms and Tdec+rend = 40 ms. Tint was varied and performance was computed. Performance metrics were very consistent day after day and between monkeys (data not shown). The theoretical maximum ITRC in bps, assuming 100% accuracy regardless of Tint, is plotted as the dotted red curve.
As further confirmation that neural responses are reflecting motor intention even at a rapid
pace, we repeated the above analysis with the multi-target task. Since the multi-target task is
a more difficult task, overall performance of a BCI using such a paradigm may not be as high
as that of the single-target-based system. Fig. 3.8 compares the ITRC between the single-target
task and the multi-target task in control experiments. In summary, there was a ~30% penalty
for a system using the more difficult multi-target task.4 Similar to the discussion for Tskip, the
difference in ITRC performance could be attributed to differences in visual stimulus presentation,
cognitive difficulty, or a combination of both.
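To make the speed-accuracy tradeoff behind ITRC concrete, the sketch below divides an estimate of the information per trial by the total trial length (Tskip + Tint + Tdec+rend). The exact ITRC computation is deferred to Section 3.7 and is not reproduced here, so this example substitutes a widely used simplification that assumes equally likely targets and errors spread uniformly over the wrong targets. That formula and the function names are assumptions and may differ from the method actually used.

```python
import math

def bits_per_trial(n_targets, accuracy):
    """Information per selection under a simplified symmetric-error model.

    Assumes equally likely targets and errors spread uniformly across the
    n_targets - 1 incorrect targets (an assumption, not the source's method).
    """
    bits = math.log2(n_targets)
    if accuracy > 0.0:
        bits += accuracy * math.log2(accuracy)
    if accuracy < 1.0:
        bits += (1.0 - accuracy) * math.log2((1.0 - accuracy) / (n_targets - 1))
    return bits

def itrc_bps(n_targets, accuracy, t_skip_ms, t_int_ms, t_overhead_ms):
    """Bits per second: information per trial divided by total trial length."""
    trial_s = (t_skip_ms + t_int_ms + t_overhead_ms) / 1000.0
    return bits_per_trial(n_targets, accuracy) / trial_s
```

For example, with 8 targets and a 260 ms trial, perfect accuracy would give log2(8) / 0.26 s, or about 11.5 bps, which is the kind of ceiling described as the theoretical maximum in the caption of Fig. 3.7. Accuracy at chance (1/8) gives zero bits per trial under this model.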
Importantly, almost all past studies, including BCIs employing continuous trajectory control,
are based on singly-presented visual stimuli. This was one reason why we chose to employ a
single-target paradigm for BCI experiments and report those results as our primary finding.
4 The maximum ITRC in this analysis differs from that found in Fig. 3.7 since different datasets were used for each analysis along with different model training methods.
[Figure 3.8 plots omitted; panels: a. Monkey H, b. Monkey G.]
Figure 3.8: ITRC comparisons between single-target (black) and multi-target (gray) tasks. An 8-target layout was used in the experiment. Both tasks were interleaved in a pseudorandomized fashion during each experiment. Trial length was taken to be T_skip + T_int + T_dec+rend with T_dec+rend set to 40 ms. T_skip was fixed at 150 ms for the single-target task and 250 ms for the multi-target task. T_int was varied and performance extrapolated. a. Data from monkey H (H20041201) with a Poisson decoding model; maximum ITRC was 8.0 bps for the single-target instructed-delay task and 5.5 bps for the multi-target task. b. Data from monkey G (G20040603) with a Gaussian decoding model; maximum ITRC was 6.8 bps and 4.6 bps, respectively.
3.4 BCI Experiments
The performance curves in Figs. 3.7 and 3.8 are extrapolations using experimental data from
individual trials that had long delay periods and long times between trials (refer to Fig. 3.1a). To
directly measure the ITRC performance when actually presenting trials at high speeds, we
conducted a series of BCI experiments using a real-time system capable of rapidly decoding neural
information. BCI experiments began with the collection of delay-period activity preceding reaches
to different target locations (Fig. 3.1a) and fitting statistical models to the activity (model
training). Then, during BCI prosthetic cursor trials (Fig. 3.1b), the intended target was decoded and a
circular cursor was rendered on the screen at the predicted location. If the prediction was correct,
the next target was displayed with very little delay. If the prediction was incorrect, the trial was
either considered a failure and aborted, or the monkey was allowed to make a real reach to the
target. In this manner, a sequence of high-speed prosthetic cursor trials could be generated.
Figure 3.1b illustrates three successful prosthetic cursor trials followed by a standard real-reach trial.
Real-reach trials were also interspersed to ensure the monkey remained engaged in the task.5
Using this paradigm, we varied the number of locations at which a target could appear on
any given trial. This allowed task difficulty to be varied, which contributes to the ITRC metric.
Performance values were calculated by averaging data from several hundred trials per condition.
Table 3.1 lists the highest ITRC results during BCI experiments with 2, 4, 8 or 16 targets. In all
cases, we were careful to avoid placing targets directly below the center touch cue since this
location would be obscured by the monkey’s hand. We also explored two annular rings (for the 8- and
16-target configurations) to demonstrate 2-dimensional target selection. The best overall
performance was achieved with the 8-target task (6.5 and 5.3 bps, monkeys H and G). This performance
corresponds to typing ~15 words per minute with a basic alphanumeric keyboard.
We took a conservative approach in computing BCI performance. Specifically, we considered
only sustained BCI trials. Not all BCI trials are equivalent in their timing characteristics. In
Fig. 3.1b, the first BCI trial contains a large center touch hold time. This period allows the
monkey to reset its behavioral state after an immediately preceding reach trial. Consequently, the
monkey is not being requested to rapidly switch its plan from a previous BCI trial. Including
this particular trial’s success or failure in our performance numbers is not a valid indication of
sustained performance and could unduly inflate performance results. For the particular chain of
trials shown in Fig. 3.1b, we only included trials #2 and #3 in our average performance results. As
mentioned before, we take the whole trial time, consisting of T_skip, T_int, and T_dec+rend, when
calculating all results that depend on the rate of target presentations.
While a sustained performance rate of 6.5 bps is manyfold greater than reported previously,
it is lower than the extrapolated result (7.5 bps; refer to Fig. 3.7). Furthermore, the ITRC peak
5 For videos demonstrating the instructed-delay (real-reach) task, and moderate-speed and high-speed BCI experiments, please see the Nature Publishing Group website and reference the supplementary materials included for Santhanam et al. (2006b).
Table 3.1: BCI experiments with highest ITRC for monkeys H and G. Each row lists the experiment with highest performance (ITRC) for a given target layout. Other experiments yielded higher single-trial accuracy or involved faster cursor rates, but did not achieve the highest ITRC for the corresponding target layout (not shown).
Monkey   # of targets   accuracy (%)   trials/s   bps
H         2             94.3           3.5        2.4
          4             94.5           2.8        4.7
          8             68.9           3.5        6.5
         16             51.1           2.9        6.4
G         2             84.2           3.6        1.3
          4             93.0           2.5        3.8
          8             76.8           2.5        5.3
         16             26.4           2.2        3.1
was expected at a total trial length of 260 ms, but our BCI experiments yielded 5 bps with this
timing. These discrepancies are due to the limitations inherent when using control experiments to
extrapolate performance for speeds at which the subject must quickly recognize new targets and
rapidly change neural activity (i.e., switch reach plans). The differences between extrapolated and
directly measured performance were present despite specific model training methods that allowed
for a fair comparison.6
3.4.1 Additional BCI Performance Aspects
Having confirmed that large BCI performance gains are possible with a direct endpoint control
strategy, we investigated two additional performance aspects. First, we varied T_int in BCI
experiments with monkey H to experimentally verify the trends seen in Fig. 3.7. Fig. 3.9 also
demonstrates an increase in single-trial accuracy with increasing trial length (black curves) as well as a
peak in each ITRC curve (red curves). These results reveal how two or four target tasks restrict
ITRC by virtue of the lower number of maximum bits per trial (1 and 2, respectively).
Furthermore, given the numbers of neural units available in these experiments, it appears that ITRC
is approaching a saturation point beyond which adding more target locations may not produce
an appreciable increase in performance (doubling targets from 8 to 16 does not increase ITRC7).
6 One possible source of decreased performance in BCI experiments is a mismatch between the data used to train the decoding models and the data used for prediction. To optimize the similarity between these two conditions, for monkey H we presented rapid sequences of reach targets during the training portion of our experiment, only commanding the monkey to reach for the last target in the sequence. Since the subject presumably planned reaches to every target, statistical models were trained from these high-speed trials that mimic the speed of trials during the prediction portion of the experiment. Overall performance was improved in comparison to decoding using models trained on slower-paced trials. Note that this training procedure can be easily adapted for paralyzed patients as well.
7 The latter layout requires distance tuning, which is known to be weaker than direction tuning (Messier and Kalaska 2000; Churchland et al. 2006a).
Additional target locations should improve ITRC when more neurons are available.
[Figure 3.9 plots omitted; panels: 2 targets, 4 targets, 8 targets, 16 targets; horizontal axis of each panel: Trial length (ms).]
Figure 3.9: Single-trial accuracy and information transfer rate capacity (ITRC) with monkey H. Performance measured during BCI experiments. Performance is plotted for each target configuration and across varying total trial lengths. Each data symbol represents performance calculated from one experiment (many hundreds of trials). Across target configurations, single-trial accuracy decreases and ITRC increases as more target locations are used.
Though each data point in Fig. 3.9 represents performance consolidated over hundreds of trials
in a given session, we would ideally replicate experimental conditions and repeat experiments
over multiple sessions. Practically, the electrode array can only provide a quasi-stable number of
neurons over a relatively short time (2-3 months). We chose instead to sample the fundamental
design parameters (target configuration and T_int). Also, in these BCI experiments with monkey
H, different T_skip times (150-250 ms) were chosen on an experiment-by-experiment basis based
on the cross-validated performance of the training trials, but the majority of experiments were
conducted with T_skip = 150 ms.
Second, a common concern for BCIs such as ours is that as the electrode implant ages the
number of recordable neurons declines, leading to a drop in overall performance (Schwartz 2004). To
investigate the impact of neuron loss, we performed analyses of single-trial accuracy and ITRC
using data from control experiments. For a single day’s experiment, we selected all neural units from
the array that were responsive to target location within a desired T_int using an ANOVA (p<0.05).
For each neural ensemble size of interest, our total set of neural units was subdivided by drawing
100 randomized subsets (without replacement). Performance was computed for each subset and
these data were averaged within a given ensemble size. This provided a single prediction accuracy
and a single ITRC for each T_int and ensemble size. We generated contour plots by using linear
interpolation across this 2-dimensional surface.
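The subset-averaging procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis code used for the experiments: `ensemble_performance`, `score_fn`, and `toy_score` are hypothetical names, and `toy_score` is a made-up stand-in for a real cross-validated decoder. The sketch covers a single T_int; the full analysis would repeat it for each T_int of interest.

```python
import random
from statistics import mean


def ensemble_performance(unit_ids, ensemble_sizes, score_fn, n_draws=100, seed=0):
    """Average a performance score over random unit subsets of each size.

    score_fn is a hypothetical callable that takes a list of unit ids and
    returns a scalar (e.g., single-trial accuracy or ITRC for one T_int).
    """
    rng = random.Random(seed)
    results = {}
    for size in ensemble_sizes:
        scores = []
        for _ in range(n_draws):
            # Each subset is drawn without replacement from the full unit set.
            subset = rng.sample(unit_ids, size)
            scores.append(score_fn(subset))
        results[size] = mean(scores)
    return results


def toy_score(subset):
    """Made-up stand-in for a decoder: score rises and saturates with size."""
    return 1.0 - 0.875 * (0.98 ** len(subset))


units = list(range(100))
curve = ensemble_performance(units, [10, 20, 40, 80], toy_score)
```

Evaluating such a function on a grid of ensemble sizes and T_int values would give the 2-dimensional surface that is then interpolated for the contour plots.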
As expected, single-trial accuracy falls as neuron ensemble size decreases. However, it is
possible to partially compensate for this performance loss by increasing T_int; BCI speed may be
compromised as a result, but single-trial accuracy can be preserved (Fig. 3.10). For example, at
a population subset of ~80 neural units, increasing the subset size improves decode performance.
Increasing the integration time also improves decode performance. For very low numbers of neural
units, the performance eventually saturates regardless of the size of T_int. This effect reflects the
inherent noise present from sampling a small subset of neurons as well as the potential mismatch
of our (or any) spiking model to the actual statistics of the neural system.
[Figure 3.10 contour plot omitted.]
Figure 3.10: Single-trial accuracy as a function of number of units and T_int. All data are from experiment H20041118 which involved an 8-target configuration. T_skip was fixed at 150 ms. Similar results were obtained for dataset G20040508 from monkey G.
Figure 3.11 plots ITRC as a function of the number of neural units and T_int. For small
ensembles (e.g., 20 neurons), the ITRC peaks at T_int ≈ 120 ms but does not decline sharply as T_int is
further increased; accuracy (and bits per trial) is increasing so as to offset the longer trial times.
For larger ensembles, the information content at small T_int is relatively high such that further
lengthening T_int has a dramatic effect on ITRC.
Furthermore, for each ensemble size tested, a cubic spline interpolation was used to estimate
the particular T_int that maximized the ITRC. Plotting this “optimal” value (Fig. 3.11 inset) for
each ensemble size illustrates that the maximum ITRC is achieved with small T_int (60-130 ms),
over a broad range of ensemble sizes. Thus, high-performance BCIs may require far shorter trials
than have been explored prior to our work.
[Figure 3.11 contour plot omitted; contours show ITRC (bits/s); inset horizontal axis: Number of neural units.]
Figure 3.11: ITRC as a function of number of neural units and T_int. All data are from experiment H20041118 which used an 8-target configuration and contained over 1300 trials. T_skip was fixed at 150 ms. Main panel shows contours of ITRC (bps) as a function of the number of neural units available and T_int. The inset shows the value of T_int that achieves the maximum ITRC for each neural ensemble size that we tested. Similar results were obtained for dataset G20040508 from monkey G.
3.5 Summary
Using a direct endpoint control strategy, we have described here an over four-fold (6.5 versus
1.6 bps) increase in BCI performance compared to recent studies. Performance is calculated in
a conservative fashion since the entire trial time (T_skip + T_int + T_dec+rend) was used; had just
T_int been used as is sometimes done, the maximum ITRC would have been 28.4 bps. However,
this is not an appropriate metric since it does not reflect an achievable selection throughput.
As described previously, our system differs from continuous BCI approaches in several ways
which may account for our performance gains. Additionally, continuous BCIs attempt to move the
cursor well enough, although at the expense of speed (1-3 seconds per selection), to avoid making
errors for a given selection. Conversely, the direct endpoint control reported here need not correct
errors within a given selection since these errors can be rectified with rapid follow-on selections.
This concept is intrinsic to our use of information theory and the capacity metric to quantify our
communication prosthesis.
Our performance results far exceed EEG-based non-invasive system performance, and help
motivate the use of invasive, electrode-based systems in clinical BCIs. Although at its fastest, this
direct endpoint control BCI demonstrates selection speeds (~3.5 trials/s) on par with saccadic eye
movements, the ITRC for saccades is much higher due to their exceptional precision and accuracy.
While eye or even speech control may be effective in specific settings, BCIs attempting to restore
lost motor function must rely on the natural neural signals if they are to avoid commandeering
and interfering with another motor modality. For example, an eye-tracking system used to control
a wheelchair can prove inconvenient if the paralyzed patient wishes to exercise free gaze without
controlling the wheelchair.
3.6 Addendum: EMG measurements
We measured EMG from monkey G to verify that the neural activity was not a byproduct of minor
limb movements during the delay period of our experiment. The aim was to ensure that the
BCI system is operating with motor planning activity as opposed to movement execution activity.
This is an important requirement if such a prosthetic system is intended for paralyzed patients.
Figure 3.12 shows the data from three different muscles in monkey G. There is no noticeable
difference in EMG activity between the periods before and after target presentation for real-reach
trials. Furthermore, there is no significant tuning in the EMG activity for target direction during
the time period 50 to 300 ms after target presentation (ANOVA, p ≫ 0.05). We also measured
EMG activity while presenting targets at a rapid pace (“rapid condition”), akin to the behavioral
conditions present in BCI experiments.8 Results were very similar to those of the real-reach trials:
there was no elevated activity after target presentation and there was no target-specific tuning
in the EMG signal during the delay period.
The lack of tuned EMG signal in the delay period was typical across other monkeys in our
laboratory (Churchland et al. 2006b,a). While we did not measure EMG from monkey H, the
endpoint position of the monkey’s fingertip did not move with respect to the target direction during
the delay period. Our hand tracking apparatus has a sub-millimeter resolution.
8 There was no movement epoch for the rapid condition, much like there was no such period for prosthetic cursor trials during BCI experiments.
[Figure 3.12 plots omitted; panels a-c; scale bar: 200 ms.]
Figure 3.12: EMG measurements for monkey G plotted in arbitrary units. Data were collected for real-reach trials (green) as well as trials with rapid presentation of targets (red). The first arrow designates the target presentation time and the second arrow marks 150 ms before the start of the movement. The delay period and movement periods are separated by a slight gap to allow for differing delay periods across trials. a. Measurements from the deltoid muscle. b. Measurements from the biceps muscle. c. Measurements from the triceps muscle. We only collected data from real-reach trials for this muscle.
3.7 Addendum: Application of Information Theory to BCIs
3.7.1 Analogy to Communication Systems
Information theory has been used previously in neuroscience to estimate the information content
in neural spike trains or other neural activity. This is not what we did; we did not attempt to
estimate the intrinsic information content of the neural signals, at least not in any direct fashion.
Instead, we calculate how much information, quantified in bits much like transmission of data
over a modem, can be extracted from the subject’s thoughts, by way of our prosthetic system (which
includes the entire signal path, from electrode recordings to target decoder).
Figure 3.13a shows a schematic of a standard communication system. The system is used as
follows:
1. An arbitrary message is chosen by the source (e.g., “Hello World”).
2. The message is encoded into a series of channel symbols (e.g., 011100...) as per a
predetermined translation scheme, or code.
3. The symbols are then sent through a noisy channel and corrupted (e.g., 010100...).
4. The receiver processes the output of the channel with a message decoder that utilizes
error-correcting features (redundancy) of the code to retrieve an estimate of the entire message.
Figure 3.13b creates an analogy between the classical communication system and a
cortically-controlled prosthetic system. The subject now must choose the message and also encode it in the
form of channel symbols. The encoding scheme and channel symbols are predetermined during
the training phase of the prosthetic system. The channel is the prosthetic system that translates
the intended symbol into an estimated symbol. The prosthetic system may not be able to perfectly
estimate the intended symbol; hence, the channel is noisy. Again, the message decoder uses
error-correcting features of the code to accurately recover the message.
Figure 3.13c illustrates the approach we used to quantify system performance. We focus on
characterizing the communication channel by presenting various sets of reach targets to the
subject. These targets are the channel symbols. The subject’s neural signals first represent the
intended target. We record these neural signals from the electrode array, spike sort, and decode the
target location. (Each of these steps can inject noise into the final predicted target: for example,
we are only recording a small population of intrinsically noisy neurons, our spike sorting, though
good, is not perfect, and our decoder assumes a parametrized model that is surely not entirely
consistent with the underlying neural representation.) We repeat our measurements many hundreds
of times to allow us to characterize the error patterns (statistics) of this noisy channel. Ultimately
we can establish bounds on information transmission using the techniques from information
theory. Particular targets (symbols) may be decoded incorrectly, but one can asymptotically achieve
perfect reconstruction of the message using error correcting techniques (Shannon 1948).
[Figure 3.13 diagrams omitted. Panel a: Message ("Hello World"), Message Encoder (011100...), Channel p(y|x) (010100...), Message Decoder, Estimate of Message ("Hello World"). Panel b: Human chooses message and thinks of channel symbols, Prosthetic System, Message Decoder, Estimate of Message. Panel c: Neural Representation; Recording Setup; Target Decoder.]
Figure 3.13: Schematic diagram of a communication system. We focus on characterizing the red components of the system. a. Classical communication system where a message is encoded and sent through a noisy channel, adapted from Cover and Thomas (1991). A decoder is able to reconstruct the original message despite individual errors in the transmission. b. An analogy to a human communication prosthesis. The subject thinks of a message and encodes it. The channel consists of the prosthetic system that estimates the intended symbol (in a potentially noisy fashion). The output of the channel is a series of symbols that are then fed into the message decoder. c. Illustration of how we characterize the communication channel. The communication channel is a black box that encapsulates the neural representation of the target, our recording setup, and our target decoder.
3.7.2 Computations
We start by testing whether the target predicted from neural activity coincides with the target
presented. If there is a match, the trial is deemed correct. Otherwise, the trial is an error trial.
For error trials, we note the normalized frequency at which particular targets are decoded given
a particular target presented. This allows us to characterize the “channel” of the communication
system.
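As a minimal sketch of this bookkeeping, the channel p(y|x) can be estimated by counting (presented, decoded) target pairs and normalizing within each presented target. The function name `channel_from_trials`, the 0-based target indices, and the toy trial lists are assumptions made for illustration only.

```python
from collections import Counter


def channel_from_trials(presented, decoded, n_targets):
    """Estimate p(y|x) as row-normalized counts of (presented, decoded) pairs."""
    counts = Counter(zip(presented, decoded))
    channel = []
    for x in range(n_targets):
        row_total = sum(counts[(x, y)] for y in range(n_targets))
        if row_total == 0:
            raise ValueError(f"target {x} was never presented")
        # Row x holds the fraction of trials decoded as y when x was presented.
        channel.append([counts[(x, y)] / row_total for y in range(n_targets)])
    return channel


# Toy data: two targets, four presentations each, one decoding error per target.
presented = [0, 0, 0, 0, 1, 1, 1, 1]
decoded = [0, 0, 0, 1, 1, 1, 0, 1]
p_y_given_x = channel_from_trials(presented, decoded, n_targets=2)
```

Each row of the result sums to one, with correct trials on the diagonal and error trials off the diagonal.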
With these measurements, there are three ways in which to assess the information transfer
(IT) per trial. For all calculations, the key quantity of interest is the mutual information metric:
$$I(X;Y) = H(X) - H(X \mid Y) = -\sum_{x} p(x)\log_2 p(x) - \Big[-\sum_{x,y} p(x,y)\log_2 p(x \mid y)\Big] \qquad (3.6)$$
This equation simply states that the mutual information (I) between the set of presented targets
(X) and estimated targets (Y) is the difference in entropy (or uncertainty) of the presented
target set (H(X)) and the entropy after making an estimation (H(X|Y)). The experimental data are
used to compute p(y|x), the fractional occurrence of each estimated target y given a
specific presented target x. The other quantities of interest are found from basic probability theory:
p(x,y) = p(x)p(y|x) and p(x|y) = p(x,y)/p(y), where p(y) is the sum of p(x,y) over x.
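The mutual information computation can be sketched directly from these definitions. This is an illustrative sketch rather than the original analysis code; the function name `mutual_information` and the list-of-rows layout for p(y|x) are assumptions.

```python
from math import log2


def mutual_information(p_x, p_y_given_x):
    """I(X;Y) = H(X) - H(X|Y) in bits, following Eq. 3.6."""
    n_x, n_y = len(p_x), len(p_y_given_x[0])
    # Joint p(x,y) = p(x) p(y|x) and marginal p(y) = sum over x of p(x,y).
    joint = [[p_x[x] * p_y_given_x[x][y] for y in range(n_y)] for x in range(n_x)]
    p_y = [sum(joint[x][y] for x in range(n_x)) for y in range(n_y)]
    h_x = -sum(p * log2(p) for p in p_x if p > 0)
    # H(X|Y) = -sum_{x,y} p(x,y) log2 p(x|y), with p(x|y) = p(x,y) / p(y).
    h_x_given_y = -sum(
        joint[x][y] * log2(joint[x][y] / p_y[y])
        for x in range(n_x)
        for y in range(n_y)
        if joint[x][y] > 0
    )
    return h_x - h_x_given_y


# Eight equally likely targets decoded perfectly: all of H(X) is transferred.
uniform = [1 / 8] * 8
perfect = [[1.0 if y == x else 0.0 for y in range(8)] for x in range(8)]
bits_perfect = mutual_information(uniform, perfect)
```

Zero-probability entries are skipped because they contribute nothing to either entropy term.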
If Y provides a perfect estimate of X, H(X|Y) = 0; hence, the information transfer is maximal
and equal to H(X), or the information contained in the presented stimuli. Taking an example of 8
targets, all presented with equal frequency, H(X) = 3 bits, and p(x,y) = 1/8 if y = x and 0 otherwise. Below
we discuss three possible ways to compute the IT per trial:
1. Level-1 IT approximation: convert the average prediction accuracy across an
experiment to bits per trial. Again, if we have 8 targets, all presented with equal frequency,
p(x,y) = p_c/8 if y = x and p(x,y) = (1 - p_c)/(7·8) otherwise, where p_c is the fraction of occurrences that
the estimated target matches the presented target averaged over all target presentations.
(The number 7 appears in the denominator to equally distribute error across the remaining
y ≠ x targets.)
2. Level-2 IT approximation: take into account error structure by computing the true
mutual information between presented targets and decoded targets. This provides a more
accurate representation of p(x,y). In other words, every element of p(x,y) is the fractional
occurrence of the presented-estimated target pair (x,y), measured from our experiments.
This can be a more accurate representation of information transfer. If, for example, errors
are always distributed adjacent to the correct target, such a pattern is useful and taking it
into account will lead to increased information transfer.
3. Information Transfer Capacity (ITC): determine the full theoretical capacity of the
communication system using the Blahut-Arimoto algorithm. The algorithm attempts to find
the “capacity” (C) of the channel, namely the bits per use of the channel (averaged over many
uses of the channel) such that there is zero probability of error.
$$C = \max_{p(x)} I(X;Y) \qquad (3.7)$$
The algorithm starts with a guess for p(x) and iteratively improves the estimate by solving
successive constrained maximization problems with Lagrange multipliers until C converges
to its global optimum (Cover and Thomas 1991). Unlike the level-2 approximation, the
system is not constrained to the relative frequencies of presented targets, p(x), used during data
collection. Importantly, this approach yields a system that utilizes certain targets (e.g., those
that can be decoded more accurately) more often than other targets.
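Two of the three measures above can be sketched compactly. The level-1 value follows in closed form from the uniform error model, and the capacity is computed here with the standard textbook form of the Blahut-Arimoto iteration, which is not necessarily the exact implementation used for the results in this chapter. The function names, tolerance, iteration limit, and the two-symbol example channel are assumptions made for illustration.

```python
from math import log2


def level1_it(accuracy, n_targets):
    """Bits per trial when errors are spread evenly over the wrong targets."""
    pc, n = accuracy, n_targets
    bits = log2(n)
    if pc > 0:
        bits += pc * log2(pc)
    if pc < 1:
        bits += (1 - pc) * log2((1 - pc) / (n - 1))
    return bits


def blahut_arimoto(p_y_given_x, tol=1e-9, max_iter=10000):
    """Return (capacity in bits per channel use, maximizing p(x))."""
    n_x, n_y = len(p_y_given_x), len(p_y_given_x[0])
    p_x = [1.0 / n_x] * n_x  # initial guess: every target used equally often
    lower = 0.0
    for _ in range(max_iter):
        p_y = [sum(p_x[x] * p_y_given_x[x][y] for x in range(n_x))
               for y in range(n_y)]
        # d[x]: divergence in bits between p(y|x) and the current p(y).
        d = [sum(p * log2(p / p_y[y])
                 for y, p in enumerate(p_y_given_x[x]) if p > 0)
             for x in range(n_x)]
        weights = [p_x[x] * 2 ** d[x] for x in range(n_x)]
        total = sum(weights)
        lower, upper = log2(total), max(d)  # bounds bracketing the capacity
        p_x = [w / total for w in weights]  # reweight toward informative inputs
        if upper - lower < tol:
            break
    return lower, p_x


# Illustrative two-symbol channel with 10% symmetric confusion.
bsc = [[0.9, 0.1], [0.1, 0.9]]
capacity, best_p_x = blahut_arimoto(bsc)
```

For a symmetric channel such as this one the maximizing p(x) is uniform; with a measured, asymmetric confusion matrix the returned p(x) would favor the targets that decode more reliably.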
Figure 3.14 illustrates the pronounced structure in error pattern by plotting two error
distributions based on experimental data. Each is a 2D histogram depicting which target was estimated
(y) for each target presented (x). If target 1 was correctly estimated on every presentation, and
likewise for all 8 targets, all eight squares along the unity diagonal line would be red, and all other
squares would be blue. If the distribution is more diffuse (probability mass more spread away
from the unity diagonal), there is more “confusion” between the presented and estimated targets.
Figure 3.14a is from one 8-target experiment and demonstrates that when a mistake does occur it
is generally only one target off. Figure 3.14b illustrates the error structure from a different
experiment where errors were more broadly spread, but still somewhat clustered around the diagonal.
The ITC of panel a is higher than that of panel b and both are higher than their respective level-2
or level-1 IT values.
[Figure 3.14 confusion matrices omitted; panels a and b; horizontal axis of each panel: Estimated Target (y), 1 to 8.]
Figure 3.14: Confusion matrices from two experiments. There can be structure in the error pattern. This structure can be exploited (see level-2 IT approximation and ITC, but not level-1 IT approximation) to allow for greater information transfer through the system.
Table 3.2 shows the values obtained when using these different methods of calculating IT for
a few representative BCI experiments. In general there was a large gain between the level-1 IT
and level-2 IT calculations but only a modest gain (~15%) between the level-2 IT and ITC
calculations. One could use any of these three methods of computing IT to then produce an information
Table 3.2: Comparing Methods for Calculating Information Transfer
Monkey   # of targets   max bits/trial (bpt)   accuracy (%)   level-1 IT (bpt)   level-2 IT (bpt)   ITC (bpt)
H         8             3                      68.9           1.2                1.6                1.9
          8             3                      71.4           1.3                1.6                1.7
         16             4                      51.1           1.1                1.9                2.2
G         4             2                      93.0           1.5                1.6                1.6
          8             3                      73.5           1.4                1.8                1.9
          8             3                      76.8           1.6                2.1                2.1
transfer rate (by dividing by the entire trial length, T_skip + T_int + T_dec+rend). The ITC is the
standard performance metric in information theory (Shannon 1948). Again, this metric represents the
maximum information per use of the channel and is asymptotically achievable with zero
transmission error by using an infinite length error-correcting code. We use the ITC to obtain the ITRC
(ITRC = ITC / total trial length) and thereby evaluated and optimized our BCI.
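The conversion from bits per trial to a rate can be sketched as below. The function name `itrc_bps` is hypothetical, and the T_int of 70 ms is an illustrative value chosen only so that the total trial length is 260 ms; T_skip = 150 ms and T_dec+rend = 40 ms are taken from the text.

```python
from math import log2


def itrc_bps(bits_per_trial, t_skip_ms, t_int_ms, t_dec_rend_ms):
    """Rate in bits/s: bits per trial divided by the full trial length."""
    trial_length_s = (t_skip_ms + t_int_ms + t_dec_rend_ms) / 1000.0
    return bits_per_trial / trial_length_s


# Ceiling for an 8-target layout decoded with 100% accuracy (3 bits per trial),
# with T_skip = 150 ms, T_dec+rend = 40 ms, and an illustrative T_int = 70 ms.
ceiling = itrc_bps(log2(8), 150, 70, 40)
```

Passing a measured ITC in place of log2(8) gives the corresponding ITRC for that experiment; the same division applies equally to the level-1 and level-2 IT values.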
Figure 3.15 shows the measured level-1 IT, level-2 IT, and ITC from all 8-target BCI
experiments with monkey H. As expected, the ITC for a given experiment is greater than its
corresponding level-2 IT, which is in turn greater than its corresponding level-1 IT. This reflects the
fact that there is structure in the decoding algorithm’s errors (i.e., when a prediction is wrong, a
nearby target is often chosen). As a result, level-2 IT and ITC use the more accurate
characterization of the communication channel and this structured error distribution allows for more efficient
error-correcting codes to be employed.
3.7.3 Notes
It is important to clarify that our prosthetic system has not actually achieved the ITC or ITRC. We
have simply measured and quantified the fundamental maximum bit rate given the channel’s error
statistics. This is standard practice in the communications literature and neuroprosthetic research
(Wolpaw and McFarland 2004; Taylor et al. 2003). Any realizable encoding scheme that is used
with this channel will achieve performance less than or equal to this bound. The use of information
transfer capacity as a metric is critical for a comparison between the channel properties of different
prosthetic systems.
While ITRC is a well-established metric, it is often useful to translate this measure into a
more tangible number, namely how many words can be typed per minute using a given prosthetic
system. With a 5 bps communication prosthesis a patient could select one key per second from a
32 key keyboard (2^5 = 32 keys; 26 letters + space bar + 5 numbers). The 5 bps communication
prosthesis would allow one 5-character word (from this 32 key keyboard) to be selected every 6
[Figure 3.15 plot omitted; title: Information per trial vs. accuracy; horizontal axis: Accuracy (%).]
Figure 3.15: Information transfer as a function of accuracy. Solid blue curves represent the theoretical relationship between accuracy and level-1 IT for different numbers of targets. Diamonds correspond to measured IT values from online experiments with monkey H using an 8-target configuration. For a given experiment, the three diamonds are plotted against the experiment’s average single-trial accuracy: blue diamonds denote the level-1 IT, green diamonds denote the level-2 IT, and red diamonds denote the complete ITC.
seconds (including the need for a space bar selection) resulting in 10 words/minute. Furthermore,
an intelligent entry scheme can increase the communication throughput by exploiting redundancy
in the language of interest (e.g., text prediction software for mobile devices).
When reporting our primary results, we first made a rough conversion of our 6.5 bps measurement
to 15 words/minute by assuming a 32-key keyboard, 5-character words (including the space bar),
and no text prediction. When making the same calculation for 6-character words, the result is
13 words/minute. There is room for an increase of at least several words per minute by using text
prediction algorithms that leverage the underlying entropy of English (or similarly French, Tamil,
etc.). This is why we arrived at the quoted value of ~15 words/minute.
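The conversion above can be reproduced with a few lines of arithmetic. The following is a minimal sketch, assuming each key selection carries $\log_2$(number of keys) bits and that no text prediction is used; the function name and its parameters are illustrative and do not come from the original analysis.

```python
import math


def words_per_minute(bits_per_second, n_keys=32, selections_per_word=6):
    """Convert an information rate (bits/s) into an approximate typing rate.

    Assumes every key selection carries log2(n_keys) bits and that no text
    prediction is used. selections_per_word counts the space bar whenever a
    space follows each word.
    """
    bits_per_key = math.log2(n_keys)              # 5 bits for a 32-key keyboard
    keys_per_second = bits_per_second / bits_per_key
    return 60.0 * keys_per_second / selections_per_word


# 5 bps, 5 letters + 1 space per word
print(words_per_minute(5.0, selections_per_word=6))
# 6.5 bps, 5-character words with the space included in the 5
print(words_per_minute(6.5, selections_per_word=5))
# 6.5 bps, 6-character words
print(words_per_minute(6.5, selections_per_word=6))
```

The three calls correspond to the three cases discussed in the text: the 5 bps example, and the 6.5 bps measurement with 5- and 6-character words.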
3.8 Credits
The work detailed in this chapter has been published in a peer-reviewed journal (Santhanam
et al. 2006b). It would not have been possible if not for the support of a number of individuals.
Dr. Stephen Ryu was responsible for the initial experimental concept and surgical implantation
of the electrode array. He also materially assisted with experimental design, animal training,
data collection, and preliminary analysis. Byron Yu and Afsheen Afshar supported this study
with animal training, data collection, and analysis. I was responsible for experimental design,
infrastructure development, animal training, data collection, and in-depth analysis.
We also thank Missy Howard for surgical assistance and veterinary care and Dr. Nicho
Hatsopoulos for surgical assistance (monkey G implant), Drs. Mark Churchland and Maneesh Sahani
for scientific discussions, and Drs. Eric Knudsen and Tirin Moore for comments on our Nature
manuscript.
This study was supported by NDSEG Fellowships (GS, BMY), NSF Graduate Research
Fellowships (GS, BMY), the Christopher Reeve Paralysis Foundation (SIR, KVS), the NIH Medical
Scientist Training Program (AA) and the following awards to KVS: a Burroughs Wellcome Fund Career
Award in the Biomedical Sciences, the Stanford Center for Integrated Systems, the NSF Center
for Neuromorphic Systems Engineering at Caltech, ONR, the Sloan Foundation, and the Whitaker
Foundation.
Chapter 4
Factor Analysis Investigation
4.1 Overview
In Chapter 3, we demonstrated that the careful design and implementation of prosthetic systems
can provide substantial increases in overall performance. At the same time, it is important to
recognize that our improvements were largely a product of our approach and careful choice of system
parameters, as opposed to the use of complex target decoding algorithms. We employed simple
Gaussian and Poisson models of neural firing rate due to their acceptance in the neuroscience field
(Dayan and Abbott 2001) and ease of computation. Now, we investigate whether a more
sophisticated decoder can be developed and thereby achieve higher prosthetic performance.
As discussed in Section 3.2.2, we assumed that the spike counts for each neuron were
independent once the reach endpoint was specified.¹ This construction implies that there are no
high-level factors (e.g., overall attentiveness to the task, reach speed of the upcoming movement,
reach curvature, etc.) that influence the recorded neural data (other than the reach target itself).
If there were, then these factors, which are uncontrolled and often unobserved, would modulate the
underlying firing rate of our observed neurons in predictable fashions, thereby inducing measurable
unit-by-unit correlations in the spike counts that we observe. This would negate the assumption
of conditional independence (conditioned on endpoint).
¹ For the Gaussian models, this assumption was made to avoid a problem of too little training data when fitting a full covariance matrix. For the Poisson models, independence is a natural consequence of the distribution that we chose.
With this in mind, our initial assumptions of conditional independence, despite being useful
for achieving a high performance system in Chapter 3, are certainly gross approximations. While
one of the primary influences on PMd activity is reach endpoint (Messier and Kalaska 2000),
there is evidence that PMd activity can depend on factors other than target location, including the
type of grasp (Godschalk et al. 1985), the required accuracy (Gomez et al. 2000), reach curvature
(Hocherman and Wise 1991), reach speed (Churchland et al. 2006a), and (to some degree) force
(Riehle et al. 1994). If a given model only describes reach endpoint, the model cannot accurately
reflect how the firing rate might change if any one of the unaccounted properties (e.g., reach speed)
perturbs the underlying firing rate. These fluctuations will appear as “noise” on the recorded
neural output, though the noise will be correlated between the observed neurons.
For example, consider the cartoon illustration in Fig. 4.1. Panel a shows the expected
spike counts of five neurons for a given reach endpoint (e.g., leftward reach). Panels b and c
show the observed spike counts on two separate trials. For panel b, we suggest that the subject
might have been planning a slightly faster than average reach. Conversely, for panel c, the subject
might have been planning a slightly slower than average reach. Note how the reach speed does
not necessarily affect all neurons with the same polarity and magnitude. Some neurons elevate
their firing rate (and hence observed counts) for faster reaches. Other neurons do the opposite, and
they do so with different amplitudes. This neuron-by-neuron difference in polarity and magnitude
is commonplace among response properties (e.g., Churchland et al. 2006a).
[Figure 4.1 panels a, b, c. x-axis: Neuron (1 to 5).]
Figure 4.1: Simple cartoon illustrating how spike counts can co-vary from trial to trial. a. Nominal mean spike counts for 5 neurons for a particular reach endpoint. b. Spike counts during a given trial for the same reach endpoint. Activity is either elevated or suppressed relative to panel a. The modulation may be due to an uncontrolled factor (e.g., speed). c. Spike counts during another trial.
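The effect sketched in the cartoon can be imitated with a small simulation. The following is an illustrative sketch only, assuming a single Gaussian latent factor (standing in for, say, reach speed) that shifts each neuron's Poisson rate through a hypothetical loading; the means and loadings are made-up values, not measurements. Neurons whose loadings share a sign become positively correlated across trials, and neurons with opposite signs become negatively correlated.

```python
import numpy as np


def simulate_counts(means, loadings, n_trials, rng):
    """Draw Poisson spike counts whose rates share one latent factor.

    rate[n, i] = means[i] + loadings[i] * x[n], with x[n] ~ N(0, 1).
    Rates are clipped at zero so they remain valid Poisson means.
    """
    x = rng.standard_normal(n_trials)                  # one factor value per trial
    rates = np.maximum(means + np.outer(x, loadings), 0.0)
    return rng.poisson(rates)


rng = np.random.default_rng(0)
means = np.array([8.0, 6.0, 10.0, 7.0, 9.0])           # hypothetical nominal counts
loadings = np.array([2.0, 1.5, -1.0, -2.0, 0.5])       # mixed polarity and magnitude
counts = simulate_counts(means, loadings, n_trials=20000, rng=rng)
corr = np.corrcoef(counts, rowvar=False)               # 5 x 5 correlation matrix

print(np.round(corr, 2))
```

A decoder that treats the five units as conditionally independent has no way to represent the off-diagonal structure printed here, which is the motivation for the latent variable models that follow.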
In reality, we may not know if it is reach speed or some other variable that is causing the
trial-by-trial modulation; many different factors can be involved and many of them are simply
unobservable (e.g., cognitive attentiveness to the task). We can instead attempt to infer a set of
abstract factors for each trial, along with the mapping between the factors and the underlying
firing rate of the recorded neurons. A good target decoding algorithm can use this knowledge to
then avoid mistaking the relatively unimportant trial-to-trial variations as being the signature for
an entirely different reach endpoint.
In this chapter, we first survey existing methods for learning these trial-by-trial abstract
factors. With these techniques, we were able to find lower-dimensional representations (e.g., 1-2
abstract dimensions) of our high-dimensional data (e.g., ~100 neural units). We found that a
small but measurable amount of the total variability in our data can be attributed to the
unobserved factors (~15%). We then extended the relevant models to handle multi-target data. With
these modifications, we built a classifier that leveraged these learned factors to perform target
decoding. The use of this decoder led to a reduction of the decode error by up to ~75% (~20%
total prediction error became ~5%). For these models, we also tested whether Poisson-based
models were a better choice over Gaussian-based models. We found no benefits to using the more
computationally complex Poisson-based models, especially if the Gaussian-based model is fitted to
square-root-transformed data.
4.2 Methods
4.2.1 Latent Variable Models
The work here is based on “latent variable models,” which have been a statistical tool for
analyzing empirical data since the early 1900s. In his brief and clear introduction to the topic, Everitt
(1984) defines latent variables as “essentially hypothetical constructs invented by a scientist for
the purpose of understanding some research area of interest, and for which there exists no
operational method for direct measurement. Although latent variables are not observable, certain of
their effects on measurable (manifest) variables are observable, and hence subject to study.” In
our case, the observable (i.e., output) variables are the neural spiking data that we record from
the electrode array. The latent variables represent the cognitive state of the subject. They
encapsulate the intended reach endpoint, as well as the uncontrolled and unobserved variables present
during the task. We can use the larger number of observed output variables to help triangulate
the smaller number of unobserved latent variables of the system.
The two classic methods to reduce dimensionality, and in essence reveal the underlying latent
variables, are Principal Components Analysis (PCA) and Factor Analysis (FA). As shown by Roweis
and Ghahramani (1999), both of these techniques posit a generative model (or probabilistic process
of creating the data of each trial) with the following form:
$$x \sim \mathcal{N}(0, I) \tag{4.1}$$
$$y \mid x \sim \mathcal{N}(Cx, R). \tag{4.2}$$
The latent state vector, $x \in \mathbb{R}^{p \times 1}$, is Gaussian distributed with mean 0 and covariance $I$. Most
often, it is unobserved. The output, $y \in \mathbb{R}^{q \times 1}$, is then generated from a Gaussian distribution. The
matrix $C \in \mathbb{R}^{q \times p}$ provides the mapping between latent state and observations, and $R \in \mathbb{R}^{q \times q}$ is a
diagonal covariance matrix of the output noise process. In classic FA literature, the parameters $C$
and $R$ are often referred to as the loading and uniqueness matrices, respectively. The variables $x_n$
and $y_n$ denote independent draws from this generative model over $N$ observations (trials), with
$n \in \{1, \ldots, N\}$. For non-zero-centered $y$, the mean across all training trials must first be subtracted
before fitting and applying the model.
Both PCA (or rather, sPCA²) and FA require that $R$ be a diagonal matrix. In other words,
the variability in the output space is independent once $x$ is specified. Without knowing the latent
variables, $x$, the individual components of the data may appear correlated, but this correlation
solely arises from their underlying dependence on the factors in $x$. The difference between sPCA
and FA lies in the form of $R$. In sPCA, $R$ is constrained to have the form $\epsilon I$. For FA, $R$ can be any
positive-definite diagonal matrix. This distinction is important. It is often quoted that the intrinsic
variability of neurons is nearly proportional to the mean (e.g., Shadlen and Newsome 1998). The
proportionality constant is approximately 1. Therefore, forcing all of the neural units (components
of $y$) to have equal variance would not capture the property that higher firing rate units tend to
have higher overall variability and lower firing rate units tend to have lower overall variability.
Using sPCA on neural data has the disadvantage that the mapping between latent space and
observation space is chosen based on the most variable output units rather than capturing a more
accurate representation of the latent variables.
² The “sensible” PCA (sPCA) model is a probabilistic approach to PCA and yields the same mapping between latent states and observations as conventional PCA. This is demonstrated by Roweis (1998).
The procedure of system identification, or “model training,” requires learning the parameters
from the observed data. The observed data includes $N$ trials of $y$, an independent and identically
distributed (i.i.d.) sequence $(y_1, y_2, \ldots, y_N)$ denoted by $\{y\}$. With the model shown in Eqs. 4.1-4.2,
we only consider a single reach endpoint. Restricting the fit to only a single endpoint allows for
the characterization of the unobserved factors that influence the observations.
The model fitting procedure is an unsupervised problem since the hidden states are unobserved
and therefore unknown: we cannot use known values of the latent variables to help fit the
parameters $C$ and $R$. The classic approach to system identification in the presence of unobserved latent
variables is the Expectation-Maximization (or EM) algorithm. The algorithm maximizes the
likelihood of the observed data over the model parameters (i.e., $\theta = \{C, R\}$). The algorithm is iterative
and each iteration is performed in two parts, the expectation (E) step and the maximization (M)
step. Iterations are performed until the likelihood converges. This results in the parameters that
correspond to a (locally) maximal data likelihood $P(\{y\} \mid \theta)$. We can then estimate the most likely $x$ for the
observed data $y$. The exact fitting procedures for sPCA and FA are described elsewhere (Roweis
and Ghahramani 1999; Ghahramani and Hinton 1997) and are omitted here for the sake of brevity.
One open question is how to select $p$, the number of latent dimensions. The objective of model
training is to best describe the training data within the constraints imposed by Eqs. 4.1-4.2.
However, with too many latent dimensions the model training procedure will explain the training data
so well through the latent space that there will be unrealistically small amounts of independent
observation noise ($R$). This is contrary to obtaining a simpler model (fewer latent dimensions) with
a more reasonable amount of observation noise. For example, when $p$ is large, the model will have
enough latent dimensions to explain a high proportion of variability (and importantly, covariance)
without using the independent observation noise. In this case, the model will not generalize well
for new (test) data. The technical term for this is “overfitting.” We used the standard approach of
partitioning data into training and test sets to assess at which choice of $p$ overfitting becomes
a problem. The choosing of $p$ is part of the process of “model selection.”
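The fitting and model-selection procedure described above can be sketched as follows. This is a minimal illustration, not the code used for the analyses in this chapter: it uses the textbook closed-form EM updates for Gaussian FA, runs on synthetic data with two true latent dimensions, and scores each candidate $p$ by the log-likelihood of held-out trials. The function names, the initialization, and the synthetic parameters are assumptions made for the example.

```python
import numpy as np


def gaussian_loglik(Y, C, R):
    """Total log-likelihood of zero-mean rows of Y under N(0, C C' + diag(R))."""
    n, q = Y.shape
    cov = C @ C.T + np.diag(R)
    _, logdet = np.linalg.slogdet(cov)
    quad = np.sum(Y * np.linalg.solve(cov, Y.T).T)     # sum_n y_n' cov^{-1} y_n
    return -0.5 * (n * (q * np.log(2.0 * np.pi) + logdet) + quad)


def fit_fa(Y, p, n_iter=200, seed=0):
    """Fit factor analysis to zero-mean data Y (n x q) with EM.

    Returns the loading matrix C (q x p), the diagonal of R (length q),
    and the training log-likelihood recorded before each iteration.
    """
    n, q = Y.shape
    rng = np.random.default_rng(seed)
    S = Y.T @ Y / n                                    # sample covariance
    C = 0.1 * rng.standard_normal((q, p))
    R = np.diag(S).copy()
    history = []
    for _ in range(n_iter):
        history.append(gaussian_loglik(Y, C, R))
        # E step: posterior moments of x given y, expressed through beta.
        beta = np.linalg.solve(C @ C.T + np.diag(R), C).T   # p x q
        Exx = np.eye(p) - beta @ C + beta @ S @ beta.T      # trial-averaged E[x x' | y]
        # M step: closed-form updates for C and the diagonal of R.
        C = S @ beta.T @ np.linalg.inv(Exx)
        R = np.diag(S - C @ beta @ S).copy()
    return C, R, history


# Synthetic stand-in for one reach target: q units, 2 true latent dimensions.
rng = np.random.default_rng(1)
q, p_true, n = 12, 2, 400
C_true = rng.standard_normal((q, p_true))
R_true = rng.uniform(0.5, 1.5, size=q)


def sample(n_trials):
    x = rng.standard_normal((n_trials, p_true))
    noise = rng.standard_normal((n_trials, q)) * np.sqrt(R_true)
    return x @ C_true.T + noise


Y_train, Y_test = sample(n), sample(n)
mean = Y_train.mean(axis=0)            # center both sets with the training mean
test_ll = {}
for p in range(1, 6):
    C, R, history = fit_fa(Y_train - mean, p)
    test_ll[p] = gaussian_loglik(Y_test - mean, C, R)
    print(p, round(test_ll[p], 1))
```

The training log-likelihood can only increase as $p$ grows, so it is the held-out value in `test_ll` that indicates where additional latent dimensions stop helping.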
4.2.2 Poisson Output Model
Standard FA uses a Gaussian noise model, but this might not be the most appropriate for our type
of data. Recall that our output variables are the spike counts from the recorded neurons and these
are naturally nonnegative integers. Furthermore, the means of these data are relatively low (e.g.,
<10). Hence, such data are not necessarily well-suited for a Gaussian distribution, since a Gaussian
has nonzero probability density for non-integer and negative values. Neural count data
are usually considered to be Poisson or Poisson-like in their distribution (Dayan and Abbott 2001).
There are two ways to contend with this issue. One approach is to modify the raw data
by first applying a square-root to the counts and then centering the data about zero. It can be
shown that the approximation error induced when using a Gaussian distribution to fit Poisson
data is diminished if the Poisson data is first square-rooted (Thacker and Bromiley 2001). The
transformed data is then input to the standard FA. This is an approach that we tested.
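One reason the square-root transform helps is that it approximately stabilizes the variance of Poisson counts. The short simulation below illustrates this; the rates and trial count are arbitrary values chosen for the example, and the helper name is hypothetical.

```python
import numpy as np


def variance_before_and_after_sqrt(rate, n_trials, rng):
    """Return (variance of raw Poisson counts, variance of their square roots)."""
    counts = rng.poisson(rate, size=n_trials)
    return counts.var(), np.sqrt(counts).var()


rng = np.random.default_rng(0)
results = {}
for rate in (5.0, 10.0, 20.0):
    results[rate] = variance_before_and_after_sqrt(rate, 200000, rng)
    raw_var, sqrt_var = results[rate]
    print(f"mean={rate:5.1f}  var(counts)={raw_var:6.2f}  var(sqrt(counts))={sqrt_var:.3f}")
```

The raw variance grows with the mean, as expected for Poisson data, whereas the variance of the square-rooted counts stays near a constant (roughly one quarter) over this range, which is closer to the fixed-variance assumption of a Gaussian noise model.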
The second option is to alter the generative model to allow for Poisson distributed noise in the
output variables. With this change, the model is now written as follows:
$$x \sim \mathcal{N}(0, I) \tag{4.3}$$
$$y^i \mid x \sim \mathrm{Poisson}\left(h({c^i}' x + d^i)\,\Delta\right) \quad \text{for } i \in 1, \ldots, q. \tag{4.4}$$
The outputs, $y^i \in \mathbb{N}_0$, are generated from a Poisson distribution where $h$ is a link function mapping
$\mathbb{R} \to \mathbb{R}_+$, $c^i \in \mathbb{R}^{p \times 1}$ and $d^i \in \mathbb{R}$ are constants, and $\Delta \in \mathbb{R}_+$ is the time bin width. The function $h$ ensures
that the mean firing rate argument to the Poisson distribution is nonnegative. We call this family of
models “Factor Analysis with Poisson Output” (FAPO). The Poisson output distribution, along with
the nonlinear mapping function $h$, makes an analytic solution to the EM algorithm intractable.
Hence, we must use a few approximations when performing the Expectation-Maximization
algorithm.
E Step
The E step requires computing the expected log joint likelihood, $E[\log P(\{x\},\{y\} \mid \theta)]$, over the
posterior distribution of the hidden state vector, $P(\{x\} \mid \{y\}, \theta_k)$, where $\theta_k$ are the parameter estimates
at the $k$th EM iteration. Since the observations are i.i.d., we can equivalently maximize the sum of
the individual expected log joint likelihoods, $E[\log P(x_n, y_n \mid \theta)]$. The posterior distribution can be
expressed as follows:
$$P(x_n \mid y_n, \theta_k) \propto P(y_n \mid x_n, \theta_k)\, P(x_n \mid \theta_k). \tag{4.5}$$
Because $P(y_n \mid x_n)$ is a product of Poisson distributions rather than a multivariate Gaussian, the
state posterior $P(x_n \mid y_n)$ will not be of a form that allows for easy computation of the log joint
likelihood. Instead, we approximated this posterior with a Gaussian centered at the mode of
$\log P(x_n \mid y_n)$ and whose covariance is given by the negative inverse Hessian of the log posterior at
that mode. Certain choices of $h$, including $h_1(z) = e^z$ and $h_2(z) = \log(1 + e^z)$, lead to a log posterior
that is strictly concave in $x_n$. In these cases, the unique mode can easily be found by Newton’s
method. We chose $h = h_2$ to avoid the problem with $h_1$; namely, $h_1(z) = e^z$ and thus small changes
in the trial-by-trial hidden state lead to very large changes in the underlying output mean if biased
in the regime of large $z$ (opposite effect in the regime of small $z$).
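The Gaussian approximation described above can be sketched for a single trial as follows. This is an illustrative implementation under stated assumptions, not the code used in this work: it takes $h(z) = \log(1 + e^z)$, a standard normal prior on $x$, and known parameters $C$ (whose rows are the ${c^i}'$), $d$, and $\Delta$. The backtracking line search is an added safeguard that is not mentioned in the text, and all numerical values are hypothetical.

```python
import numpy as np


def softplus(z):
    return np.logaddexp(0.0, z)                # log(1 + e^z), numerically stable


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def log_posterior(x, y, C, d, delta):
    """Unnormalized log P(x | y) for x ~ N(0, I), y_i ~ Poisson(h(c_i'x + d_i) * delta)."""
    h = softplus(C @ x + d)
    return -0.5 * x @ x + np.sum(y * np.log(h * delta) - h * delta)


def grad_and_hessian(x, y, C, d, delta):
    """Gradient and Hessian of log_posterior with respect to x."""
    z = C @ x + d
    h, h1 = softplus(z), sigmoid(z)            # h and its first derivative
    h2 = h1 * (1.0 - h1)                       # second derivative of h
    grad = -x + C.T @ (y * h1 / h - delta * h1)
    w = y * (h2 * h - h1 ** 2) / h ** 2 - delta * h2
    hess = -np.eye(len(x)) + C.T @ (w[:, None] * C)
    return grad, hess


def laplace_posterior(y, C, d, delta, n_iter=50, tol=1e-8):
    """Return the posterior mode and the Gaussian covariance -H^{-1} at that mode."""
    x = np.zeros(C.shape[1])
    for _ in range(n_iter):
        grad, hess = grad_and_hessian(x, y, C, d, delta)
        if np.linalg.norm(grad) < tol:
            break
        step = np.linalg.solve(hess, grad)     # Newton direction
        # Backtracking keeps each update an ascent step on the log posterior.
        t, f0 = 1.0, log_posterior(x, y, C, d, delta)
        while log_posterior(x - t * step, y, C, d, delta) < f0 and t > 1e-8:
            t *= 0.5
        x = x - t * step
    _, hess = grad_and_hessian(x, y, C, d, delta)
    return x, -np.linalg.inv(hess)


rng = np.random.default_rng(0)
q, p, delta = 20, 2, 0.2                       # units, latent dims, bin width (s)
C = 3.0 * rng.standard_normal((q, p))          # hypothetical loadings
d = rng.uniform(10.0, 40.0, size=q)            # hypothetical offsets (spikes/s)
x_true = rng.standard_normal(p)
y = rng.poisson(softplus(C @ x_true + d) * delta)

mode, cov = laplace_posterior(y, C, d, delta)
print("true x:", np.round(x_true, 2), "posterior mode:", np.round(mode, 2))
```

The returned `mode` and `cov` play the role of the mean and covariance of the Gaussian $Q_n$ used in the remainder of the E step.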
For each trial $n$, let $Q_n$ be a Gaussian distribution in $\mathbb{R}^{p \times 1}$ that approximates $P(x_n \mid y_n, \theta_k)$. The
expectation of the log joint likelihood for a given observation can be expressed as
$$\mathcal{E}_n = E_{Q_n}\left[\log P(x_n, y_n \mid \theta)\right], \tag{4.6}$$
and the expectation of the log joint likelihood over all of the $N$ trials is simply the sum of the
individual $\mathcal{E}_n$ terms:
$$\mathcal{E} = E_Q\left[\log P(\{x\},\{y\} \mid \theta)\right] = \sum_{n=1}^{N} \mathcal{E}_n.$$
M Step
The M step requires finding (learning) the $\theta_{k+1}$ that satisfies:
$$\theta_{k+1} = \underset{\theta}{\operatorname{argmax}}\; E_Q\left[\log P(\{x\},\{y\} \mid \theta)\right]. \tag{4.7}$$
This can be achieved by differentiating $\mathcal{E}$ with respect to the parameters, $\theta$. Learning the $c^i$
and $d^i$ parameters in Eq. 4.4 can be somewhat challenging. We wish to maximize the following
objective function, with respect to $c^i$ and $d^i$:
$$\sum_{n=1}^{N} E_{Q_n}\left[\sum_{i=1}^{q} -h({c^i}' x_n + d^i)\,\Delta + y_n^i \log\left(h({c^i}' x_n + d^i)\,\Delta\right)\right]. \tag{4.8}$$
This optimization can be solved by recasting the expectation over $Q_n$, applying Gaussian
quadrature approximations, and then iteratively searching for a solution using conjugate gradient
methods (Yu 2007).
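To illustrate the quadrature step, note that if $x_n$ is Gaussian under $Q_n$ with mean $\mu_n$ and covariance $\Sigma_n$, then the scalar ${c^i}' x_n + d^i$ is Gaussian with mean ${c^i}' \mu_n + d^i$ and variance ${c^i}' \Sigma_n c^i$, so each term of the objective reduces to a one-dimensional expectation. The sketch below evaluates such an expectation with Gauss-Hermite quadrature. It is an assumption about how the approximation could be organized, not a description of the implementation in Yu (2007); the names $\mu_n$, $\Sigma_n$, the function names, and the numerical values are introduced here for illustration.

```python
import numpy as np


def gaussian_expectation(f, mean, var, n_points=20):
    """Approximate E[f(z)] for z ~ N(mean, var) with Gauss-Hermite quadrature."""
    nodes, weights = np.polynomial.hermite.hermgauss(n_points)
    z = mean + np.sqrt(2.0 * var) * nodes      # change of variables to N(mean, var)
    return np.sum(weights * f(z)) / np.sqrt(np.pi)


def softplus(z):
    return np.logaddexp(0.0, z)                # h(z) = log(1 + e^z)


def expected_term(y_i, c_i, d_i, delta, mu, Sigma):
    """E over Q_n of -h(c'x + d) * delta + y * log(h(c'x + d) * delta) for one unit."""
    m = c_i @ mu + d_i                         # mean of the scalar c'x + d under Q_n
    v = c_i @ Sigma @ c_i                      # variance of the scalar c'x + d under Q_n

    def term(z):
        return -softplus(z) * delta + y_i * np.log(softplus(z) * delta)

    return gaussian_expectation(term, m, v)


# Hypothetical values for a single unit and a single trial.
mu = np.array([0.3, -0.5])
Sigma = np.array([[0.20, 0.05], [0.05, 0.10]])
c_i = np.array([2.0, -1.0])
value = expected_term(y_i=4, c_i=c_i, d_i=20.0, delta=0.2, mu=mu, Sigma=Sigma)
print(value)
```

Because the quadrature sum is a smooth function of $c^i$ and $d^i$, it can be handed to a gradient-based optimizer such as the conjugate gradient search mentioned above.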
4.2.3 Extensions to Accommodate Multiple Targets
All of the prior techniques are intended to be used for data collected while the subject is reaching
to a single target and can help quantify unobserved factors that affect the neural activity. To use
FA (or FAPO) to help decode target endpoint, we tried two different forms of the generative model.
We cover these two forms in the context of the Poisson-based framework (refer to Eq. 4.4), but the
same formulation is applicable for the Gaussian-distributed outputs, as will be shown later.
The first closely mimics the decode algorithms that we used in our BCI experiments (see
Section 3.2.2). We fit a separate FAPO model for each target and this is formally written as follows:
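$$x \sim \mathcal{N}(0, I) \tag{4.9}$$
$$y^i \mid x, s \sim \mathrm{Poisson}\left(h({c_s^i}' x + d_s^i)\,\Delta\right) \quad \text{for } i \in 1, \ldots, q. \tag{4.10}$$
The random variable $s$ is the mixture component indicator and has a discrete probability distribution
over $\{1, \ldots, M\}$ (e.g., $P(s) = \pi_s$). During model fitting, we assume $s$ is known and we take $\pi_s = 1/M$ for
all $s$. We then decode test trials by choosing the FAPO model, indexed by reach endpoint, that best
describes the data. We do so by finding $\hat{s}$ (the most likely $s$), using the following operation:
$$\hat{s} = \underset{s}{\operatorname{argmax}}\; P(s \mid y, \theta) \tag{4.11}$$
$$= \underset{s}{\operatorname{argmax}}\; \frac{P(y \mid s, \theta)\, P(s)}{P(y \mid \theta)} \tag{4.12}$$
$$= \underset{s}{\operatorname{argmax}}\; P(y \mid s, \theta) \tag{4.13}$$
$$= \underset{s}{\operatorname{argmax}} \int_x P(y, x \mid s, \theta)\, dx \tag{4.14}$$
$$= \underset{s}{\operatorname{argmax}} \int_x P(y \mid x, s, \theta)\, P(x)\, dx. \tag{4.15}$$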
The second approach is to share the same output mapping between target locations and
incorporate the effect of reach endpoint through the shared latent space. We can formalize this model
as follows:
$$x \mid s \sim \mathcal{N}(\mu_s, \Sigma_s) \tag{4.16}$$
$$y^i \mid x \sim \mathrm{Poisson}\left(h({c^i}' x + d^i)\,\Delta\right) \quad \text{for } i \in 1, \ldots, q. \tag{4.17}$$
To find $\hat{s}$ we then performed the operation:
$$\hat{s} = \underset{s}{\operatorname{argmax}} \int_x P(y \mid x, s, \theta)\, P(x \mid s)\, dx. \tag{4.18}$$
The difference between these models is subtle but important. In Eq. 4.10, there is a separate
set of $c^i$ and $d^i$ variables for each target location. Essentially this generative model defines a
different latent space for each reach endpoint. In Eq. 4.17, however, the $c^i$ and $d^i$ variables are
shared and the data for each endpoint are separated by their different means in the latent space ($\mu_s$).
Regarding $\Sigma_s$, if it were set to $I$, the noise properties in the latent space are forced to be identical
for all target locations. Alternatively, if $\Sigma_s$ is allowed to vary per target location, the model can
capture the possibility that certain targets might incur less variability in the plan activity than
other targets. In order to simplify our model, we chose $\Sigma_s = I$.
We also tried Gaussian-based output distributions with models analogous to the ones above.
We dubbed these “Factor Analysis with Gaussian Output” or FAGO for short. They are written as
$$x \sim \mathcal{N}(0, I) \tag{4.19}$$
$$y \mid x, s \sim \mathcal{N}(C_s x, R_s) \tag{4.20}$$
and
$$x \mid s \sim \mathcal{N}(\mu_s, \Sigma_s) \tag{4.21}$$
$$y \mid x \sim \mathcal{N}(Cx, R), \tag{4.22}$$
respectively. Note that the observations are no longer mean-centered about zero for the latter
model. Rather, the mean observation vector for a particular reach endpoint is mapped from the
underlying latent space mean (i.e., $C\mu_s$). An example of how these clusters might appear is shown
in Fig. 4.2. We chose the number of latent dimensions to be 3 to allow for convenient plotting of
the data.
[Figure 4.2 plot: three-dimensional scatter of inferred latent states, colored by reach target.]
Figure 4.2: Latent Space Example for FAGO. Each point corresponds to the inferred latent space variable x for a given trial. The coloring of the data points denotes the upcoming reach target. Points of similar color are clustered together since all of these trials correspond to the same reach target.
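For the Gaussian case with a separate model per target (Eqs. 4.19-4.20), the integral over $x$ in the decoding rule has a closed form: the marginal of $y$ given $s$ is Gaussian with covariance $C_s C_s' + R_s$. The sketch below decodes by comparing these marginal log-likelihoods. It is a simplified illustration, assuming each model stores the per-target mean that was subtracted before fitting, and it uses known, made-up parameters in place of EM-fitted ones; the dictionary layout and function names are hypothetical.

```python
import numpy as np


def log_marginal(y, mean, C, R):
    """log N(y; mean, C C' + diag(R)), i.e. the latent x integrated out analytically."""
    cov = C @ C.T + np.diag(R)
    diff = y - mean
    _, logdet = np.linalg.slogdet(cov)
    quad = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (len(y) * np.log(2.0 * np.pi) + logdet + quad)


def decode(y, models):
    """Return the index s of the per-target model with the highest log P(y | s)."""
    scores = [log_marginal(y, m["mean"], m["C"], m["R"]) for m in models]
    return int(np.argmax(scores))


rng = np.random.default_rng(0)
q, p, n_targets = 20, 1, 4
models = [
    {
        "mean": 5.0 + 0.7 * rng.standard_normal(q),    # per-target mean response
        "C": 0.5 * rng.standard_normal((q, p)),        # per-target loadings
        "R": rng.uniform(0.5, 1.5, size=q),            # per-target independent variance
    }
    for _ in range(n_targets)
]

n_correct, n_test = 0, 400
for _ in range(n_test):
    s = int(rng.integers(n_targets))                   # true target for this trial
    m = models[s]
    x = rng.standard_normal(p)
    y = m["mean"] + m["C"] @ x + rng.standard_normal(q) * np.sqrt(m["R"])
    n_correct += int(decode(y, models) == s)
accuracy = n_correct / n_test
print(f"decode accuracy: {accuracy:.3f}")
```

With a uniform prior over targets, picking the largest marginal likelihood is the same as picking the most probable target, which mirrors the step from Eq. 4.12 to Eq. 4.13.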
A short derivation is provided for FAGO in Section 4.6. The derivation for FAPO is omitted as it
is lengthy and does not additionally provide any important insights. While the details are similar
to FAGO on a high level, the computations are significantly more laborious due to the need for
approximations to non-linear functions during the course of the EM algorithm. Yu (2007) provides
a description of the operational details.
4.3 Results and Discussion
4.3.1 Data Characterization
We first wanted to better characterize the empirical variance in our data.³ Primarily, we asked
how much of the total variance is due to the underlying trial-to-trial variability and how much is
due to the intrinsic noise properties of the output neurons. Mathematically, in the model described
by Eqs. 4.1 and 4.2, the latent space variability manifests itself in the output space as $CC'$ (shared
variance). The independent variance is found in the matrix $R$. Once the model training is complete,
$CC' + R$ is the best fit covariance of the raw data. It does not necessarily match the empirical
covariance of the data due to the reduced rank of $C$ and the diagonal constraint on $R$.
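For completeness, the decomposition into shared and independent variance follows directly from Eqs. 4.1-4.2. Writing $y = Cx + \varepsilon$ with $\varepsilon \sim \mathcal{N}(0, R)$ independent of $x$ (the symbol $\varepsilon$ is introduced here only for this short derivation):

```latex
\begin{align}
\mathrm{Cov}(y) &= E\left[(Cx + \varepsilon)(Cx + \varepsilon)'\right]
                 = C\,E[xx']\,C' + E[\varepsilon\varepsilon']
                 = CC' + R, \\
\mathrm{Var}(y^i) &= \sum_{j=1}^{p} C_{ij}^2 + R_{ii}.
\end{align}
```

The cross terms vanish because $x$ and $\varepsilon$ are independent and zero-mean. For unit $i$, the first term of $\mathrm{Var}(y^i)$ is the shared variance and $R_{ii}$ is the independent variance.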
Let us assume, for example, that the shared variance is large compared to the independent
variance. In this situation, an FA model may be better suited to describe a trial than the simple
decoding models in Chapter 3. The simple decoding models ascribe all of the variance in spike
counts to be independent along the output dimensions (neural units). However, as previously
discussed, variations in spike counts from their mean values might actually be due to a change in
some unobserved factors. The hope is that FA can identify these trial-by-trial variations, provide
a richer description of the data, and allow for a more robust mechanism by which we can decode
target endpoint. Understanding the relative proportion of shared variance to independent variance
can help build intuition on how much improvement FA might be able to deliver when applied to
our data.
To this end, we started by segregating our data by reach target, and fit a separate model
to each endpoint (Eqs. 4.19 and 4.20). We considered the spike counts in the window [150:350] ms
after target presentation. As discussed at the end of Section 4.2.1, an appropriate $p$, the number
of latent dimensions, must be chosen. To assess this free parameter, we further split our data
into two approximately equal halves, one to serve as a training set and the other as a test set.
For each test trial, we computed the likelihood using the FA model built for that trial’s reach
endpoint, $\{C_s, R_s\}$. We summed the test trials’ likelihoods to obtain the total test likelihood. The
test likelihood as a function of $p$ is shown in Fig. 4.3 with two curves, one each for monkeys G and
H. The curves are normalized such that their peaks occur at 1.
The results show that overfitting is an issue for even relatively small values of $p$. Why might
this be? There are surely many independent factors that influence the upcoming reach: reach
direction, distance, curvature, speed, force, etc. However, for our dataset, we may have limitations
in our ability to resolve the latent space. One, our reaching task was highly stereotyped. There
was low variance in the reaches within the subset of reaches to the same endpoint. Two, the
intrinsic noise properties of our neurons may be large relative to the shared reach variability.
Given a relatively small number of neurons (~100) and training trials for each reach target (~50),
it is likely that we are unable to identify more hidden factors without misestimating the model
parameters and overfitting. We were careful to choose $p$ to be small for the remainder of our
analyses, usually equal to 1 when building a model for a single reach target, and equal to 8 when
building a multi-target model according to the description in Section 4.2.3.
³ Again, the data is simply the neural data binned into spike counts per sorted neural unit.
[Figure 4.3 plot. x-axis: Number of Latent Dimensions (p), 1 to 8; legend: Monkey G, Monkey H.]
Figure 4.3: Test likelihood reveals the ideal number of latent dimensions. As overfitting sets in, the test likelihood declines. Data from monkey G (red) and monkey H (blue) are plotted. Curves are self-normalized.
Having chosen the number of latent dimensions in our model, we returned to investigating the
partitioning of the total variance between the shared and intrinsic processes. Again, we segregated
the training trials by reach target ($s$) and fit the standard FA models (FAGO$_s$). For each FAGO$_s$,
we obtained the intrinsic variance per neural unit ($R_s^i$). We only retained the neurons that were
tuned for target location in this analysis, as per our standard tuning criteria (ANOVA; p < 0.05).
Then, for each neuron-target pair, we also computed the total raw variance from the data alone
($v_s^i$). We finally derived the fraction of the total variance attributable to the intrinsic variance
($R_s^i / v_s^i$) and took the mean of this ratio across all neuron-target pairs.
For monkey G (dataset G20040508; $p = 2$), the intrinsic variance contributed 85% of the total
variance, on average. For monkey H (dataset H20041217; $p = 2$), the result was similar, with the
intrinsic variance accounting for 89% of the total variance, on average. This indicates that there
is measurable shared variance found by the FA model fit, but this quantity is relatively modest.⁴
We therefore inferred from the intrinsic-to-total variance ratio that there should be a small to
moderate improvement in target decoding when using the FA-based target decoder for these datasets.
This is indeed the case and will be shown later.
Finally, using the same FA models as above, we computed the “intrinsic” Fano Factor (ratio
of intrinsic variance, $R_s^i$, over the mean spike counts for that neuron-target combination). This
allowed us to compare the intrinsic Fano Factor (FF) against the theoretical FF of a Poisson noise
distribution. This quantity is often of interest in the neuroscience community (Tolhurst et al.
1983; Gur et al. 1997; Bair and O’Keefe 1998; Averbeck and Lee 2003). The FF of any Poisson
distribution should be 1 since the variance is equal to the mean. Figure 4.4 shows a histogram of
the intrinsic FF for all neuron-target pairs as analyzed on two separate datasets. The red overlay
corresponds to the raw FF that is computed on the data alone, making it clear that the intrinsic FF
shifts considerably to the left when shared variance is taken into account. The average intrinsic
FF was 0.97 while the average raw FF was 1.18 for monkey G.
⁴ It is important to note that we may certainly be underestimating the amount of shared variance in this system. If p were larger, there would be greater opportunity to assign more output variability to the shared variability of the system. But as we showed before, given our data limitations, it appears that we overfit for larger p. To find the true value of p, we may need a very large number of trials and neural units.
[Figure 4.4 plots. Panels: Monkey G, Monkey H; x-axis: Fano Factor.]
Figure 4.4: Intrinsic Fano Factor. The distribution of intrinsic FF (blue histogram) was computed by taking the intrinsic variance over the mean after fitting a FA model. The distribution of FFs computed from the raw data is overlaid (red). A simulation was also performed so that the intrinsic FF distribution could be compared to the theoretical distribution (gray curve). a. Results from monkey G (dataset G20040508). b. Results from monkey H (dataset H20041217).
We then took random draws from Poisson distributions that had the same means as those
measured for the neuron-target pairs in our data. The number of draws was equal to the number of
trials per condition used to initially train the model. We then calculated the FF for this simulation,
obtaining the “theoretical” FF adjusted for the limited number of samples (i.e., trials). This is the
gray overlay in Fig. 4.4. The figure shows that the intrinsic FF is much closer to the theoretical FF
than the raw value. Nonetheless, the intrinsic FF is not a perfect match to the theoretical FF. For
monkey G, the intrinsic FF overrepresents values at both tails. For monkey H, the intrinsic FF
distribution is mostly overrepresenting for values greater than 1 (often known as “super-Poisson”).
It is difficult to determine whether this mismatch is a true property of the data or simply an
artifact of a poor FA model fit. Since we are primarily interested in improving the overall decode
performance, we will soon discuss this data fitting issue in that context.
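The finite-sample simulation described above can be sketched as follows. This is an illustration under assumed values: the means are drawn uniformly between 2 and 10 as a stand-in for the measured neuron-target means, and 50 trials per condition are used to match the approximate training set size mentioned earlier in this chapter.

```python
import numpy as np


def simulated_fano_factors(means, n_trials, rng):
    """Fano factor (sample variance / sample mean) of Poisson draws, one per mean."""
    counts = rng.poisson(means, size=(n_trials, len(means)))
    return counts.var(axis=0, ddof=1) / counts.mean(axis=0)


rng = np.random.default_rng(0)
means = rng.uniform(2.0, 10.0, size=2000)      # hypothetical neuron-target means
ff = simulated_fano_factors(means, n_trials=50, rng=rng)
print(f"mean FF = {ff.mean():.3f}, std of FF = {ff.std():.3f}")
```

Even though every simulated unit is exactly Poisson, the estimated FFs scatter around 1 because only a limited number of trials is available per condition. This spread is what the measured intrinsic FF distribution should be compared against, rather than a single spike at 1.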
4.3.2 Target Decoding
Given the encouraging results from our deconstruction of the variance into shared and intrinsic
components, we next implemented target decoding using the FA framework. We first compared 4
different decoders. One pair consisted of the simple, independent-Gaussian (G) and -Poisson (P)
models. The second pair were the FA models that fit separate output mappings per target endpoint
(Eqs. 4.19-4.20 and 4.9-4.10). We refer to these models as FAGOsep and FAPOsep, respectively.
For all of the decode analyses, the window for counting spikes started 150 ms after target
presentation (i.e., Tgjt ip=150 ms). We initially computed decode accuracy for two separate plan
lengths, ■ int =150 ms and ■ int =250 ms. The trials were first shuffled so as to remove any effects
such as reduced attention or muscle fatigue th a t systematically progress over the course of the
experiment. Then, we set aside 50 trials per condition to tra in the model. The number of latent
dimensions was chosen to be p = 1 (we found th a t increasing p did not improve the overall per
formance). The fitted model was later used to decode the reach target for the remainder of trials,
as previously described in Eq. 4.15. The average decode accuracies for each model is shown in
Table 4.1.
Table 4.1: Factor analysis performance comparison. Decode accuracy was computed using data from both monkey G (dataset G20040508) and monkey H (dataset H20041217) for various models.
Monkey   Spike window (ms)   G (%)   P (%)   FAGOsep (%)   FAPOsep (%)
H        [150:300]           82.0    87.1    90.3          87.2
H        [150:400]           87.6    91.5    96.1          96.1
G        [150:300]           88.2    91.5    93.8          91.0
G        [150:400]           90.9    93.8    95.6          95.1
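For orientation, the independent-Poisson baseline (column P) can be sketched as a maximum-likelihood classifier over per-target mean spike counts. This is a minimal sketch on synthetic data, assuming a uniform prior over targets; the function names and the synthetic tuning are hypothetical and are not taken from the experiments.

```python
import numpy as np

def fit_poisson_decoder(counts, targets, n_targets):
    """Estimate mean spike counts per target; returns (n_targets, n_units)."""
    rates = np.stack([counts[targets == s].mean(axis=0) for s in range(n_targets)])
    return np.maximum(rates, 1e-3)  # floor avoids log(0) for silent units

def decode_poisson(counts, rates):
    """Pick the target maximizing the independent-Poisson log likelihood."""
    # log P(y | s) = sum_i [y_i log(lambda_is) - lambda_is] + terms independent of s
    loglik = counts @ np.log(rates).T - rates.sum(axis=1)
    return loglik.argmax(axis=1)

rng = np.random.default_rng(1)
n_targets, n_units, n_train, n_test = 8, 40, 50, 30
# Assumption: synthetic tuning, one mean count per (target, unit).
true_rates = rng.uniform(2.0, 15.0, size=(n_targets, n_units))

def simulate(n_per_target):
    targets = np.repeat(np.arange(n_targets), n_per_target)
    return rng.poisson(true_rates[targets]), targets

train_y, train_s = simulate(n_train)   # 50 training trials per condition
test_y, test_s = simulate(n_test)      # held-out trials
rates = fit_poisson_decoder(train_y, train_s, n_targets)
accuracy = (decode_poisson(test_y, rates) == test_s).mean()
```

Because each unit is treated independently, this baseline cannot represent variability shared across units, which is the gap the FA-based decoders are meant to address.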
There are two important points to note regarding the findings in Table 4.1. First, the performance improvement when using an FA-style model is relatively small. The increase in decode accuracy was as little as 2.3% and only as high as 4.6%. The dataset for monkey H, with analysis window [150:400], is the most promising. The performance obtained with the FAGOsep decoder (96.1%) is statistically significantly different from that of the Poisson decoder (91.5%), as confirmed by checking the 95% confidence interval of each estimate.
Secondly, the Poisson-based FA (FAPOsep) does not perform as well as the Gaussian-based version (FAGOsep).5 This is somewhat surprising since a Poisson distribution often models neural data better than a Gaussian (see Fig. 3.4), especially for small time windows. It appears that this difference vanishes if the Gaussian is fit to the square-root transformed data and an FA approach is employed. The performance difference between the two models is less noticeable for longer windows. The effects we saw could be due to a variety of reasons:
1. The Poisson fitting procedure includes several approximations, which could be resulting in a sub-optimal fit, even if the training likelihood increased with every EM iteration. Often the training likelihood would even start to drop after ~100 iterations, albeit slowly.
2. The output noise process may not be exactly Poisson, as evidenced by Fig. 4.4, despite its wide popularity in the neuroscience community. Gaussian models may better capture the sub- and super-Poisson characteristics of the data.
3. The Poisson models are able to match the decode performance of the Gaussian models for longer windows. This might be due to the larger signal-to-noise ratio in this large-window situation. Since the variance is equal to the mean for a Poisson distribution, the signal-to-noise ratio increases with firing rate. For smaller windows, the spike counts for the neural units can be rather low. In a window such as [150:200] ms after target presentation, the trial-by-trial spike counts for a unit are most often only 0, 1, or 2. This would make it difficult to determine the underlying mean and correlation structure, especially if there are only a small number of units and trials. We speculate that the sub-optimal EM approximations may be easily susceptible to erroneous model fitting in this sort of regime.
5 Note that the simple Poisson model easily outperformed the simple Gaussian partially because the data for the latter were not preprocessed with a square-root transform.
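The signal-to-noise argument in the third item can be made concrete with a short worked example. The definition of signal-to-noise ratio as mean over standard deviation, and the 10 spikes/s firing rate, are assumptions chosen purely for illustration. For a Poisson spike count with firing rate $r$ and window length $T$, the mean and variance are both $\lambda = rT$, so

$$\mathrm{SNR} = \frac{\lambda}{\sqrt{\lambda}} = \sqrt{\lambda} = \sqrt{rT}.$$

At $r = 10$ spikes/s, a 50 ms window gives $\lambda = 0.5$ and $\mathrm{SNR} \approx 0.71$, whereas a 250 ms window gives $\lambda = 2.5$ and $\mathrm{SNR} \approx 1.58$. Under this definition, lengthening the window by a factor of five improves the per-unit signal-to-noise ratio by only $\sqrt{5} \approx 2.2$, and in the short window most trials yield counts of 0 or 1.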
Again, the above discussion covers FA models that employ separate output mappings for each target endpoint. We also examined the FA models that share a single combined output mapping for all of the target endpoints (Eqs. 4.21-4.22 and 4.16-4.17). These are the FAGOcmb and FAPOcmb models. Before using these models to decode, we need to choose an appropriate value of the model parameter p, the number of latent dimensions. We had done this for the single-target FA model in Section 4.3.1. For FAGOcmb and FAPOcmb, since the model must share a single output mapping for all reach targets, there exists a slight complication as detailed below.
Intuition suggests that a well-fit model of the form in Eq. 4.22 should be such that the expected observation mean for a given target (i.e., $E(y \mid s)$) closely matches the empirical mean of the data for that same target. The model states that the expected mean is $C\mu_s$, where s is the target location of interest. The observations lie in a q-dimensional space while the vector $\mu_s$ is in a lower p-dimensional space, and C has rank at most p. Thus, the vector $C\mu_s$ lies within a p-dimensional subspace spanned by the columns of C. If there are M total targets, the possible observation means lie in at most an (M - 1)-dimensional space. Hence, p must be at least (M - 1) if we are to ensure that the model can always capture the appropriate target-specific output means. Recall that for our prior analysis of the latent space, we segregated data by target location, fit the FA model, and then computed the test likelihood for increasing values of p.6 We had found the optimal p was either 1 or 2. Given this information, we posited that p = M (simply one more than p = (M - 1)) might be an ideal choice for FAGOcmb or FAPOcmb.
6 Since the data were separated by target for that computation, the mean was fit separately and does not influence p.
To test our intuition, we fit a FAGOcmb model for different values of p and decoded target endpoint on a separate set of trials. The results of this analysis using data from monkey H (dataset H20040928) are shown in Fig. 4.5. The spike count window was set to [150:400] ms and p was swept from 1 to 15. The plot indicates that performance saturates early. While some p < 8 might suffice for the number of latent dimensions, p = 8 appears to be a safe choice, as we had intuited. We set p = 8 for all further analyses with FAGOcmb and FAPOcmb.
Figure 4.5: Choosing the number of latent dimensions for FAGOcmb. We found the average decoder accuracy for each tested value of p. The performance saturates around p = 6.
Next, we compared FAGOcmb and FAPOcmb to their counterparts, FAGOsep and FAPOsep. Similar to Table 4.1, we chose the time period of [150:300] ms after the target presentation to calculate the spike counts. Table 4.2 shows the performance from the FAGOcmb and FAPOcmb techniques versus the FAGOsep and FAPOsep approaches. The most striking aspect of the comparison is that FAPOcmb does not suffer from the same performance degradation as FAPOsep. In FAPOcmb, fewer parameters are fit jointly against all of the trials. This results in a simpler model that is not as susceptible to overfitting as FAPOsep. Additionally, we noted that the performance of FAGOcmb only slightly edges out that of FAGOsep while being roughly equivalent to FAPOcmb. This trend continued over various window sizes and across two additional datasets (data not shown). Consequently, we chose to use FAGOcmb over the other alternatives due to its overall decode performance and speed of computation.
Table 4.2: Factor analysis performance comparison between separate output mapping models and combined output mapping models. Decode accuracy was computed using datasets from monkey G (dataset G20040508) and monkey H (dataset H20041217).
Monkey   Spike window (ms)   FAGOsep (%)   FAPOsep (%)   FAGOcmb (%)   FAPOcmb (%)
H        [150:300]           90.3          87.2          91.9          92.2
G        [150:300]           93.8          91.0          94.5          93.9
4.3.3 Datasets with More Shared Variability
The datasets that we used so far contain highly stereotyped reaches, and the timing of the trials was very regular. As such, it is perhaps not surprising that we were unable to benefit greatly from FA, a technique that identifies shared variability. The previous data characterization analyses showed that the shared variability accounted for approximately 15% of the total variability. Thus, the datasets G20040508 and H20041217 simply do not have a large amount of shared variability, and a significant portion of the decode error is due to the intrinsic variance of the neural units. Furthermore, the baseline performance (as computed from our simple Poisson-based decoder) was already sufficiently high. Hence, there was little room for improvement in the average decode accuracy when we employed the FA techniques.7
We attempted to locate a dataset that possessed trial-by-trial variability and one in which shared processes contribute heavily to the overall data variability. As chance would have it, we have precisely that dataset from our BCI experiments. In those experiments, we presented a mix of BCI trials (short trials, chained rapidly together) and standard reach trials. The BCI trials in G20040427 and H20040928 had total trial lengths of approximately 400 ms. For the real reaches, most trials had plan periods greater than 400 ms and we discarded any catch trials with timings shorter than this. Therefore, we could analyze neural activity up to ~400 ms after target presentation regardless of the trial type (BCI versus real reach).
We know from other related studies (Kalmar et al. 2005; Gilja et al. 2005) that there can be substantial gain modulation as a chain of BCI trials progresses and that simply normalizing single-trial responses by the average firing rate across the array can improve decode performance. Therefore, we trained on a set of data that included both BCI trials and reach trials. This resulted in an ideal type of dataset. The FA methods could potentially represent the gain modulation as an underlying factor and the target decoder could perhaps benefit from this more accurate model.
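The array-mean normalization mentioned above might look roughly like the following. The exact procedure used in the cited studies is not described here, so this is one plausible reading rather than their method; the function name and the synthetic gain model are hypothetical.

```python
import numpy as np

def normalize_by_array_mean(counts, eps=1e-9):
    """Divide each trial's spike counts by that trial's mean count across units."""
    counts = np.asarray(counts, dtype=float)
    trial_mean = counts.mean(axis=1, keepdims=True)  # one value per trial
    return counts / np.maximum(trial_mean, eps)      # eps guards all-zero trials

rng = np.random.default_rng(2)
base = rng.uniform(5.0, 20.0, size=30)        # assumed per-unit mean counts
gain = rng.uniform(0.5, 1.5, size=(200, 1))   # assumed shared gain, one per trial
scaled = gain * base                          # noiseless, gain-modulated responses
normalized = normalize_by_array_mean(scaled)
# In this noiseless toy case the shared gain cancels, so every row of
# `normalized` is the same vector base / base.mean().
```

A purely multiplicative gain shared by all units is removed by this rescaling; FA is more general in that it can learn a shared factor from the data without assuming it acts as a common multiplier.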
Figure 4.6 shows a comparison between the simple Poisson-based decoder and the FAGOcmb decoder.8 The FAGOcmb model had 8 latent dimensions (p = 8). We have plotted the decode error so as to better illustrate the difference between the two methods. A number of window lengths were tested for each monkey. The performance differential between simple Poisson-based decoding and FAGOcmb decoding was appreciable. For monkey H, however, there was a less dramatic boost in performance for 75-100 ms windows. As stated before, we suspect that the signal-to-noise ratio is too low for the neural data when measured over these small window lengths. We do not have sufficient trials to counteract this phenomenon. For long window lengths, the performance improvement can be very dramatic (up to ~15%) in both monkeys. These BCI datasets have nicely illustrated the power of using FAGOcmb for situations where there is variability in the task itself.
7 We cannot simply drop units or reduce the training set size in order to place ourselves in a lower performance regime. This is because adjusting these two parameters will then directly influence how much data we have to accurately fit the FAGOcmb model. Moreover, this approach will not increase shared variability.
8 We did not analyze the data using the other FA decoders since we had already sufficiently determined that FAGOcmb is the best option from the family of decoders described in Section 4.2.3.
[Figure: panels a and b; x-axis: Plan Window Length (ms)]
Figure 4.6: Comparison of simple Poisson-based decoder (black) with the FAGOcmb decoder (red). a. Monkey G (dataset G20040427). Models were trained on the first 75 trials per condition and tested on the remaining 76 trials per condition in the dataset. b. Monkey H (dataset H20040928). The training set consisted of 65 trials per condition and the test set had 67 trials per condition.
It is worthwhile to express the improvements in decode accuracy in terms of the ITRC (Information Transfer Rate Capacity) metric that we so vigorously espoused in Chapter 3. For these BCI datasets, the total ITRC would have increased by approximately 1-1.25 bps had we used FAGOcmb during real-time experiments. This constitutes an ITRC increase of 15-20%, which is more than the 8-15% increase in decode accuracy, since improvements in decode accuracy are amplified when expressed in terms of ITRC.
4.4 Summary
In this chapter, we have investigated the use of more sophisticated decode algorithms in the hopes that we can achieve higher prosthetic performance. While we were able to demonstrate significant breakthroughs in performance with the system previously outlined in Chapter 3, we hoped to extend these advancements even further here. Factor Analysis techniques were used to help better account for trial-by-trial variations in uncontrolled and unobserved aspects of the prosthetic task. Simple data characterization analyses showed that there was sufficient potential for performance improvement using these methods.
We applied minor extensions to the conventional FA model and adapted it for the purpose of decoding target endpoint. We found that using an entirely separate model for each reach endpoint was not as effective as fitting a single model to the entire dataset. The latter strategy requires fewer model parameters and may be less prone to estimation error and overfitting. Surprisingly, the complicated extensions to support Poisson-distributed outputs were deemed unnecessary since the Gaussian-based models did equally well, and even better in some instances, when data were square-root transformed. This allowed us to dispense with FAPOcmb and avoid the lengthy compute times associated with fitting those models.
The full utility of the FA methodology was demonstrated with our BCI datasets, where the task design had different operating modes (BCI vs. reach trials). This resulted in much more shared variability, and FAGOcmb was able to consistently and significantly outperform the conventional methods. For a clinical prosthetic setup, the situation of mixing BCI and reach trials would not be realistic since the patient would be paralyzed. However, even for a clinical BCI the set of actions available to the patient may be so heterogeneous that there may be underlying factors that significantly modulate the outputs, even though the factors are irrelevant to the task itself. If this is the case, FA can be one tool by which the system designer can combat performance degradation.
Finally, we chose to use a probabilistic framework and construct a generative model a priori that we felt reasonably describes how our neural data relate to the reaching task. A potential disadvantage, however, is that we must employ an unsupervised learning algorithm to optimally fit the model without assigning a cost for how well or how poorly the model can classify the data. The fact that our FA approach improves the performance, despite this drawback, indicates that we may be revealing something intrinsic about the system. On the other hand, another algorithmic strategy would be to train a classifier that expressly accounts for misclassifications during the fitting process. This is the approach taken by several standard algorithms in the field of machine learning, including neural network classifiers, support vector machines (SVMs), and Gaussian processes. Furthermore, there is also the possibility of using a hybrid between supervised and unsupervised methods. An eventual comparison between the FA approach and these other approaches would be fruitful and will help better frame the broader impact of what we have shown here.
4.5 Credits
This effort was greatly facilitated by initial computational work by Byron Yu and Maneesh Sahani. Mathematical techniques developed for other purposes were directly applicable to our decoding problem. Furthermore, there were many valuable conversations with Byron and Maneesh during the course of this project. Lastly, the rotation projects of Vikash Gilja and Rachel Kalmar were thought-provoking for the work here. Their studies explored the question of non-stationarities in the BCI data: Gilja et al. (2005) showed that a simple array-mean firing-rate normalization could improve decode performance, and Kalmar et al. (2005) highlighted the gain modulation occurring within a BCI chain. This helped us identify the key datasets for Section 4.3.3.
We also thank Dr. Stephen Ryu for performing the electrode implant operations for monkeys G and H, Missy Howard for surgical assistance and veterinary care, Dr. Nicho Hatsopoulos for surgical assistance (monkey G implant), and Afsheen Afshar for helping with animal training and data collection (monkey H).
This study was supported by NDSEG Fellowships (GS, BMY), NSF Graduate Research Fellowships (GS, BMY), the Gatsby Charitable Foundation Unit (MS, BMY), and the following awards to KVS: a Burroughs Wellcome Fund Career Award in the Biomedical Sciences, the Stanford Center for Integrated Systems, the NSF Center for Neuromorphic Systems Engineering at Caltech, ONR, the Sloan Foundation, the Whitaker Foundation, and the Christopher Reeve Paralysis Foundation.
4.6 Appendix: Mathematical Derivations for FAGO
The generative model is
$$x \mid s \sim \mathcal{N}(\mu_s, \Sigma_s) \tag{4.23}$$
$$y \mid x \sim \mathcal{N}(Cx, R). \tag{4.24}$$
The random variable $s$ is the mixture component indicator and has a discrete probability distribution over $\{1, \ldots, M\}$ (e.g., $P(s) = \pi_s$). Given $s$, the latent state vector, $x \in \mathbb{R}^{p \times 1}$, is Gaussian distributed with mean $\mu_s$ and covariance $\Sigma_s$. The outputs, $y \in \mathbb{R}^{q \times 1}$, are generated from a Gaussian distribution where $C \in \mathbb{R}^{q \times p}$ provides the mapping between latent state and observations and $R \in \mathbb{R}^{q \times q}$ is a diagonal covariance matrix. The variables $x_n$ and $y_n$ denote independent draws from this generative model over $N$ trials. The sets of all trials are denoted as $\{x\}$ and $\{y\}$, respectively.
4.6.1 E Step
The E step of EM requires computing the expected log joint likelihood, $E[\log P(\{x\}, \{y\}, \{s\} \mid \theta)]$, over the posterior distribution of the hidden state vector, $P(\{x\} \mid \{y\}, \{s\}, \theta_k)$, where $\theta_k$ are the parameter estimates at the $k$th EM iteration. Since the observations are i.i.d., we can equivalently maximize the sum of the individual expected log joint likelihoods, $E[\log P(x_n, y_n, s_n \mid \theta)]$.
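The latent state and output observations are jointly Gaussian given $s$:
$$P\left(\begin{bmatrix} y_n \\ x_n \end{bmatrix} \,\middle|\, s_n\right) = \mathcal{N}\left(\begin{bmatrix} C\mu_{s_n} \\ \mu_{s_n} \end{bmatrix}, \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix}\right) \tag{4.25}$$
$$= \mathcal{N}\left(\begin{bmatrix} C\mu_{s_n} \\ \mu_{s_n} \end{bmatrix}, \begin{bmatrix} C\Sigma_{s_n}C' + R & C\Sigma_{s_n} \\ \Sigma_{s_n}C' & \Sigma_{s_n} \end{bmatrix}\right). \tag{4.26}$$
And, therefore, the posterior distribution of the hidden state can be written as
$$P(x_n \mid y_n, s_n) = \mathcal{N}\left(\mu_{s_n} + \Sigma_{21}\Sigma_{11}^{-1}(y_n - C\mu_{s_n}),\ \Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}\right) \tag{4.27}$$
$$= \mathcal{N}\left(\mu_{s_n} + \beta_{s_n}(y_n - C\mu_{s_n}),\ \Sigma_{s_n} - \beta_{s_n}C\Sigma_{s_n}\right), \tag{4.28}$$
where $\beta_{s_n} = \Sigma_{s_n}C'(R + C\Sigma_{s_n}C')^{-1}$. The inverse in $\beta_{s_n}$ can be computed efficiently using the matrix inversion lemma:
$$(R + C\Sigma_{s_n}C')^{-1} = R^{-1} - R^{-1}C(\Sigma_{s_n}^{-1} + C'R^{-1}C)^{-1}C'R^{-1}. \tag{4.29}$$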
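The identity in Eq. 4.29 can be checked numerically. The sketch below uses arbitrary random matrices (the sizes q = 50 and p = 8 are assumptions for illustration) and exploits the fact that R is diagonal, so that only a p x p system needs to be solved.

```python
import numpy as np

rng = np.random.default_rng(3)
q, p = 50, 8                                  # assumed sizes: q units, p latents
C = rng.standard_normal((q, p))
R_diag = rng.uniform(0.5, 2.0, size=q)        # diagonal entries of R
A = rng.standard_normal((p, p))
Sigma = A @ A.T + np.eye(p)                   # symmetric positive definite

# Direct q x q inverse of R + C Sigma C'.
direct = np.linalg.inv(np.diag(R_diag) + C @ Sigma @ C.T)

# Matrix inversion lemma: only a p x p matrix is inverted or solved.
Rinv_C = C / R_diag[:, None]                  # R^{-1} C for diagonal R
inner = np.linalg.inv(Sigma) + C.T @ Rinv_C   # Sigma^{-1} + C' R^{-1} C
lemma = np.diag(1.0 / R_diag) - Rinv_C @ np.linalg.solve(inner, Rinv_C.T)

max_abs_diff = np.abs(direct - lemma).max()   # agreement up to round-off
```

The saving comes from p being much smaller than q: the costly step shrinks from a q x q inversion to a p x p one.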
For observation $n$, let $Q_n$ be the Gaussian posterior latent state distribution that has mean $\xi_n$ and second moment $\Xi_n$:
$$\xi_n = E[x_n \mid y_n, s_n] = \mu_{s_n} + \beta_{s_n}(y_n - C\mu_{s_n}) \tag{4.30}$$
$$\Xi_n = E[x_n x_n' \mid y_n, s_n] = \mathrm{Var}(x_n \mid y_n, s_n) + E[x_n \mid y_n, s_n]\,E[x_n \mid y_n, s_n]' = \Sigma_{s_n} - \beta_{s_n}C\Sigma_{s_n} + \xi_n\xi_n'. \tag{4.31}$$
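Equations 4.28, 4.30 and 4.31 translate fairly directly into code. The following is a minimal sketch, not the implementation used for the analyses: the function name `e_step`, the array layout, and the synthetic inputs are all assumptions, and for brevity it forms the q x q matrix directly instead of applying Eq. 4.29.

```python
import numpy as np

def e_step(y, s, C, R_diag, mu, Sigma):
    """Posterior mean xi_n (Eq. 4.30) and second moment Xi_n (Eq. 4.31) per trial.

    y: (N, q) observations, s: (N,) component indices, C: (q, p),
    R_diag: (q,) diagonal of R, mu: (M, p), Sigma: (M, p, p).
    """
    N, p = y.shape[0], C.shape[1]
    xi = np.empty((N, p))
    Xi = np.empty((N, p, p))
    for m in range(mu.shape[0]):
        idx = np.flatnonzero(s == m)
        if idx.size == 0:
            continue
        S = C @ Sigma[m] @ C.T + np.diag(R_diag)     # C Sigma C' + R
        beta = np.linalg.solve(S, C @ Sigma[m]).T    # Sigma C' (C Sigma C' + R)^{-1}
        resid = y[idx] - C @ mu[m]                   # y_n - C mu_s, one row per trial
        xi[idx] = mu[m] + resid @ beta.T
        post_cov = Sigma[m] - beta @ C @ Sigma[m]    # same for all trials with this s
        Xi[idx] = post_cov + xi[idx][:, :, None] * xi[idx][:, None, :]
    return xi, Xi

# Synthetic inputs; all sizes and values are arbitrary.
rng = np.random.default_rng(4)
N, q, p, M = 60, 12, 3, 4
C = rng.standard_normal((q, p))
R_diag = rng.uniform(0.5, 2.0, size=q)
mu = rng.standard_normal((M, p))
Sigma = np.stack([np.eye(p) * (m + 1) for m in range(M)])
s = rng.integers(0, M, size=N)
y = rng.standard_normal((N, q))
xi, Xi = e_step(y, s, C, R_diag, mu, Sigma)
```

Since the posterior covariance depends only on the component and not on the observation, it is computed once per component and reused for every trial assigned to it.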
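The expectation of the log joint likelihood for a given observation can be expressed as follows:
$$\mathcal{E}_n = E_{Q_n}[\log P(x_n, y_n, s_n \mid \theta)] \tag{4.32}$$
$$= E_{Q_n}[\log P(y_n \mid x_n) + \log P(x_n \mid s_n) + \log P(s_n)] \tag{4.33}$$
$$= E_{Q_n}\Big[-\tfrac{q}{2}\log(2\pi) - \tfrac{1}{2}\log(|R|) - \tfrac{1}{2}y_n'R^{-1}y_n + y_n'R^{-1}Cx_n - \tfrac{1}{2}x_n'C'R^{-1}Cx_n - \tfrac{p}{2}\log(2\pi) - \tfrac{1}{2}\log(|\Sigma_{s_n}|) - \tfrac{1}{2}x_n'\Sigma_{s_n}^{-1}x_n + \mu_{s_n}'\Sigma_{s_n}^{-1}x_n - \tfrac{1}{2}\mu_{s_n}'\Sigma_{s_n}^{-1}\mu_{s_n} + \log P(s_n)\Big]. \tag{4.34}$$
The terms that do not depend on $x_n$ or any component of $\theta$ can be grouped as a constant, $\mathcal{C}$, outside the expectation. Doing so, and simplifying further, we have
$$\mathcal{E}_n = E_{Q_n}\Big[y_n'R^{-1}Cx_n - \tfrac{1}{2}x_n'C'R^{-1}Cx_n + \mu_{s_n}'\Sigma_{s_n}^{-1}x_n - \tfrac{1}{2}x_n'\Sigma_{s_n}^{-1}x_n\Big] - \tfrac{1}{2}y_n'R^{-1}y_n - \tfrac{1}{2}\log(|R|) - \tfrac{1}{2}\mu_{s_n}'\Sigma_{s_n}^{-1}\mu_{s_n} - \tfrac{1}{2}\log(|\Sigma_{s_n}|) + \mathcal{C}$$
$$= y_n'R^{-1}C\,E_{Q_n}[x_n] - \tfrac{1}{2}\mathrm{Tr}\big(C'R^{-1}C\,E_{Q_n}[x_nx_n']\big) + \mu_{s_n}'\Sigma_{s_n}^{-1}E_{Q_n}[x_n] - \tfrac{1}{2}\mathrm{Tr}\big(\Sigma_{s_n}^{-1}E_{Q_n}[x_nx_n']\big) - \tfrac{1}{2}y_n'R^{-1}y_n - \tfrac{1}{2}\log(|R|) - \tfrac{1}{2}\mu_{s_n}'\Sigma_{s_n}^{-1}\mu_{s_n} - \tfrac{1}{2}\log(|\Sigma_{s_n}|) + \mathcal{C} \tag{4.35}$$
$$= y_n'R^{-1}C\xi_n - \tfrac{1}{2}\mathrm{Tr}\big(C'R^{-1}C\,\Xi_n\big) + \mu_{s_n}'\Sigma_{s_n}^{-1}\xi_n - \tfrac{1}{2}\mathrm{Tr}\big(\Sigma_{s_n}^{-1}\Xi_n\big) - \tfrac{1}{2}y_n'R^{-1}y_n - \tfrac{1}{2}\log(|R|) - \tfrac{1}{2}\mu_{s_n}'\Sigma_{s_n}^{-1}\mu_{s_n} - \tfrac{1}{2}\log(|\Sigma_{s_n}|) + \mathcal{C}. \tag{4.36}$$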
The expectation of the log joint likelihood over all of the $N$ observations is simply the sum of the individual $\mathcal{E}_n$ terms:
$$\mathcal{E} = E_Q[\log P(\{x\}, \{y\}, \{s\} \mid \theta)] = \sum_{n=1}^{N} \mathcal{E}_n.$$
4.6.2 M Step
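The M step requires finding (learning) the $\theta_{k+1}$ that satisfies:
$$\theta_{k+1} = \underset{\theta}{\operatorname{argmax}}\ E_Q[\log P(\{x\}, \{y\}, \{s\} \mid \theta)]. \tag{4.37}$$
This can be achieved by differentiating $\mathcal{E}$ with respect to the parameters, $\theta$, as shown below. The indicator function, $I(s_n = s)$, will prove useful. Also, let $N_s = \sum_{n=1}^{N} I(s_n = s)$.
• Prior probability of mixture component identification $s$:
$$\pi_s = \frac{1}{N}\sum_{n=1}^{N} I(s_n = s) \tag{4.38}$$
• State vector mean, for mixture component identification $s$:
$$\frac{\partial \mathcal{E}}{\partial \mu_s} = \sum_{n=1}^{N} I(s_n = s)\left[\Sigma_s^{-1}\xi_n - \Sigma_s^{-1}\mu_s\right] = 0$$
$$\mu_s = \frac{1}{N_s}\sum_{n=1}^{N} I(s_n = s)\,\xi_n \tag{4.39}$$
• State vector covariance, for mixture component identification $s$:
$$\frac{\partial \mathcal{E}}{\partial \Sigma_s} = \frac{\partial}{\partial \Sigma_s}\sum_{n=1}^{N} I(s_n = s)\left\{-\tfrac{1}{2}\mathrm{Tr}\big(\Sigma_s^{-1}\Xi_n\big) + \mu_s'\Sigma_s^{-1}\xi_n - \tfrac{1}{2}\mu_s'\Sigma_s^{-1}\mu_s - \tfrac{1}{2}\log(|\Sigma_s|)\right\}$$
$$= \sum_{n=1}^{N} I(s_n = s)\left(\Sigma_s^{-1}\left[\tfrac{1}{2}\Xi_n - \mu_s\xi_n' + \tfrac{1}{2}\mu_s\mu_s'\right]\Sigma_s^{-1} - \tfrac{1}{2}\Sigma_s^{-1}\right) = 0$$
$$\tfrac{1}{2}N_s\Sigma_s^{-1} = \sum_{n=1}^{N} I(s_n = s)\,\Sigma_s^{-1}\left[\tfrac{1}{2}\Xi_n - \mu_s\xi_n' + \tfrac{1}{2}\mu_s\mu_s'\right]\Sigma_s^{-1}$$
$$\Sigma_s = \frac{1}{N_s}\sum_{n=1}^{N} I(s_n = s)\,\Xi_n - \frac{2}{N_s}\mu_s\sum_{n=1}^{N} I(s_n = s)\,\xi_n' + \frac{1}{N_s}\mu_s\mu_s'\sum_{n=1}^{N} I(s_n = s)$$
$$= \frac{1}{N_s}\sum_{n=1}^{N} I(s_n = s)\,\Xi_n - \mu_s\mu_s' \tag{4.40}$$
• Loading matrix:
$$C^{\mathrm{new}} = \left(\sum_{n=1}^{N} y_n\xi_n'\right)\left(\sum_{n=1}^{N} \Xi_n\right)^{-1} \tag{4.41}$$
• Noise matrix:
$$R^{\mathrm{new}} = \frac{1}{N}\,\mathrm{diag}\left\{\sum_{n=1}^{N} y_ny_n' - C^{\mathrm{new}}\xi_ny_n'\right\} \tag{4.42}$$
where the diag operator sets all of the off-diagonal elements of a matrix to zero.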
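The closed-form updates in Eqs. 4.38-4.42 can be sketched as a single function. This is an illustrative sketch rather than the code used for the analyses: `m_step` is a hypothetical name, every mixture component is assumed to have at least one trial, and the demonstration uses noise-free synthetic data with an "exact" posterior so that the result is easy to reason about.

```python
import numpy as np

def m_step(y, s, xi, Xi, M):
    """Parameter updates given posterior moments xi (N, p) and Xi (N, p, p)."""
    N, p = xi.shape
    pi = np.empty(M)
    mu = np.empty((M, p))
    Sigma = np.empty((M, p, p))
    for m in range(M):
        mask = s == m                        # indicator I(s_n = s)
        pi[m] = mask.sum() / N               # Eq. 4.38
        mu[m] = xi[mask].mean(axis=0)        # Eq. 4.39
        Sigma[m] = Xi[mask].mean(axis=0) - np.outer(mu[m], mu[m])  # Eq. 4.40
    # Eq. 4.41: (sum_n y_n xi_n') (sum_n Xi_n)^{-1}
    C_new = (y.T @ xi) @ np.linalg.inv(Xi.sum(axis=0))
    # Eq. 4.42: only the diagonal is kept, so R is stored as a vector.
    R_diag = np.diag(y.T @ y - C_new @ (xi.T @ y)) / N
    return pi, mu, Sigma, C_new, R_diag

# Toy demonstration with arbitrary sizes.
rng = np.random.default_rng(5)
N, q, p, M = 200, 5, 3, 4
x = rng.standard_normal((N, p))
C_true = rng.standard_normal((q, p))
y = x @ C_true.T                             # noise-free observations
s = np.arange(N) % M                         # 50 trials per component
xi = x                                       # pretend the posterior mean is exact
Xi = x[:, :, None] * x[:, None, :]           # and the posterior covariance is zero
pi, mu, Sigma, C_new, R_diag = m_step(y, s, xi, Xi, M)
```

In this degenerate case the loading-matrix update recovers `C_true` and the noise variances collapse to zero up to round-off, which is a convenient sanity check before running the updates inside a full EM loop.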
4.6.3 Inference
Once the model parameters have been chosen, the generative model can be used to make inferences on the training data or new observations. For the training data, the hidden state vector $x$ is the only variable that must be inferred. The posterior distribution of $x$ is a Gaussian, exactly as described previously. This yields a distribution $Q$ with mean $\mu_s + \beta_s(y_n - C\mu_s)$ and covariance $\Sigma_s - \beta_sC\Sigma_s$. Therefore, the maximum a posteriori estimate of $x$ is simply $\mu_s + \beta_s(y_n - C\mu_s)$.
When performing inference for a new observation, the mixture component identification, $s$, is now unknown. The posterior distributions of both $s$ and $x$, given the data, $y$, are of interest. The first of these distributions can be expressed as follows:
$$P(s \mid y, \theta) \propto P(y \mid s, \theta)\,P(s \mid \theta) \propto \pi_s\,\frac{1}{|C\Sigma_sC' + R|^{1/2}}\exp\left[-\tfrac{1}{2}(y - C\mu_s)'(C\Sigma_sC' + R)^{-1}(y - C\mu_s)\right]. \tag{4.43}$$
To infer $x$ given the data, the following derivation applies:
$$P(x \mid y, \theta) \propto \sum_{s=1}^{M} P(x \mid y, s, \theta)\,P(s \mid y, \theta), \tag{4.44}$$
where the first factor in the summation is the conditional Gaussian (see 4.28) and the second is a weighting as shown above. Simply put, the distribution of $x$ given $y$ (but not conditioned on $s$) is a mixture of Gaussians.
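The posterior over the mixture component (Eq. 4.43) is what a target decoder would evaluate for a new trial. Below is a minimal sketch with hypothetical names and synthetic parameters; it works in the log domain to avoid numerical underflow, which is an implementation choice and not something specified above.

```python
import numpy as np

def target_posterior(y, C, R_diag, mu, Sigma, pi):
    """Normalized P(s | y) for one observation y of shape (q,), following Eq. 4.43."""
    M = mu.shape[0]
    log_post = np.empty(M)
    for m in range(M):
        S = C @ Sigma[m] @ C.T + np.diag(R_diag)   # marginal covariance of y given s
        resid = y - C @ mu[m]
        _, logdet = np.linalg.slogdet(S)
        log_post[m] = (np.log(pi[m]) - 0.5 * logdet
                       - 0.5 * resid @ np.linalg.solve(S, resid))
    log_post -= log_post.max()                     # stabilize before exponentiating
    post = np.exp(log_post)
    return post / post.sum()

# Synthetic parameters; all values are arbitrary.
rng = np.random.default_rng(6)
q, p, M = 20, 2, 4
C = rng.standard_normal((q, p))
R_diag = rng.uniform(0.5, 1.0, size=q)
mu = np.array([[0.0, 0.0], [5.0, 0.0], [0.0, 5.0], [5.0, 5.0]])
Sigma = np.stack([0.1 * np.eye(p)] * M)
pi = np.full(M, 1.0 / M)

# Draw one trial from component 2 of the generative model.
x = rng.multivariate_normal(mu[2], Sigma[2])
y = C @ x + rng.standard_normal(q) * np.sqrt(R_diag)
post = target_posterior(y, C, R_diag, mu, Sigma, pi)
decoded = int(post.argmax())                       # most probable component
```

Taking the argmax of `post` gives a point estimate of the target, while the full vector provides the weights for the mixture of Gaussians in Eq. 4.44.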
Chapter 5
HermesB
5.1 Overview
In the previous chapters, we have shown how chronically implanted electrode arrays have enabled a broad range of advances in basic electrophysiology and neural prostheses. Those successes motivate new experiments, particularly the development of prototype implantable prosthetic processors for continuous use in freely behaving subjects, both monkeys and humans. However, traditional experimental techniques require the subject to be restrained, limiting both the types and duration of experiments. In this chapter, we present a dual-channel, battery-powered neural recording system with integrated 3-axis accelerometer for use with chronically implanted electrode arrays in freely behaving primates. The recording system, called HermesB, is self-contained, autonomous, programmable and capable of recording broadband neural (sampled at 30 kS/s) and acceleration data to a removable CompactFlash card for up to 48 hours. We have collected long-duration datasets with HermesB from an adult macaque monkey which provide insight into timescales and free behaviors inaccessible under traditional experiments. Variations in action potential shape and RMS noise are observed across a range of timescales. The peak-to-peak voltage of action potentials varied by up to 30% over 24 hours, including step changes in waveform amplitude (up to 25%) coincident with high-acceleration movements of the head. These initial results suggest that spike-sorting algorithms can no longer assume stable neural signals and will need to transition to adaptive signal processing methodologies to maximize performance. During physically active periods (defined by the head-mounted accelerometer), we observed significantly reduced 5-25 Hz local field potential (LFP) power and increased firing rate variability. Using a threshold fit to LFP power, 93% of 403 five-minute recording blocks were correctly classified as active or inactive, potentially providing an efficient tool for identifying different behavioral contexts in prosthetic applications. These results demonstrate the utility of HermesB, and motivate using this type of system to advance neural prosthetics and electrophysiological experiments.
5.2 Background
The development of chronically implantable electrode arrays for in vivo neural recording in primates (both monkeys and humans) has enabled a range of advances in neural prostheses (Serruya et al. 2002; Taylor et al. 2002; Carmena et al. 2003; Musallam et al. 2004; Santhanam et al. 2006b; Hochberg et al. 2006) and basic electrophysiology experiments (Maynard et al. 1997, 1999; Hatsopoulos et al. 2004). However, most current state-of-the-art experimental systems require the animal to be restrained, restricting both the types and duration of experiments. As a result, there is limited data available with which to characterize both the nature and content of neural recordings over the broader range of timescales and free behaviors relevant to future prosthetic and electrophysiology experiments. To make the transition to new experimental paradigms possible, continuous, long-duration, broadband (sampled at 30 kS/s) neural recordings from freely behaving subjects are needed. These datasets will enable validation of spike discrimination and decoding algorithm performance in freely behaving subjects, multi-day plasticity and learning experiments, determination of neural correlates of free behaviors, and direct measurement of the stability of neural recordings. Here, we present results, using data collected with HermesB, addressing the latter two questions to demonstrate the utility of long-duration recording from freely behaving subjects.
Recording stability is a critical issue for neural prosthetic systems. Here we define recording stability, or more specifically, recording instability, as the change in the gross presence or absence of neural signals on an electrode, time-varying fluctuations of the observed action potential shape, and time-varying fluctuations in the background noise process on an electrode. Neural recordings during any given session are considered to be quasi-stable; there is usually very little change in the number of neurons recorded and their action potential shapes during a several-hour recording session. However, recording instability has been observed between sessions, likely resulting from the subjects freely behaving in the housing room between sessions (Suner et al. 2005). Long-duration datasets will enable us to reconcile the current assumptions of quasi-stable neural signals during a highly controlled experimental session with the variation in the neural signals observed between sessions.
Figure 5.1 summarizes the significant timescales in the life of a chronically implanted electrode array. We are normally only concerned with neural recording stability in the high-yield recording period during which most experiments are conducted (Schwartz 2004). Within this window, neural interface systems are potentially affected by recording instability at all three timescales (short, intermediate and long). However, current experiments, with their discrete daily recording periods, are only able to characterize variations on timescales less than a few hours and across days.
[Figure: Array Recording Lifetime Timescales. Lifetime phases: Fade In (3 weeks), High Yield (6 months to 1 year), Fade Out. Timescales: Short (1 ms to 1 min), Intermediate (1 min to 1 day), Long (1 day +). Legend: Current Stability Characterization; Newly Available Characterization.]
Figure 5.1: Summary of array lifetime and available data for recording from individual, identifiable neurons using a chronically implanted electrode array.
Past studies have only characterized neural recording stability on short (seconds or minutes; Fee et al. 1996; Suner et al. 2005) and long timescales (days; Williams et al. 1999; Suner et al. 2005; Liu et al. 2006). Over very short timescales, variations in action potential waveform shape are a function of the short-term spiking frequency of a neuron (Fee et al. 1996); at high frequencies the waveform is typically broader (in time) and decreased in amplitude due to depletion of ion gradients in and around a highly active neuron. At longer timescales, the variation in spike waveform is not as systematic, potentially arising from a number of mechanisms such as neural plasticity, physical movement of the electrode relative to nearby neurons, chemical degradation of the electrode tip, or immunological reactions to the implant (Lewicki 1998; Schwartz 2004). Studying neural stability at intermediate timescales will enable characterization (along with existing short and long timescale data) of the full range of timescales relevant to a neural interface system and may also provide insight into long timescale phenomena.
Experimental protocols in which the subject is restrained limit the types of behaviors that can be observed. Long-duration datasets recorded during free behavior provide neural data associated with a broader range of behaviors than traditionally possible. To maximize system performance, prostheses must be sensitive to behavioral and neural changes across the day and must react robustly in the face of variable background conditions. For example, such systems should reliably detect different behavioral contexts such as whether the user is awake or asleep, or intending to be active or not. If a neural prosthetic attempts to decode the user's intentions during sleep, it may waste battery power or cause undesired behaviors. Alternatively, if such a system does not reliably detect waking periods, the user may lose the ability to interact with the world. The ability to record neural activity across a variety of different behaviors and contexts will allow for characterization of the true neural environment in which chronic implantable systems will operate.
Long duration datasets are of considerable interest for certain multi-day electrophysiology
experiments. Chronically implanted electrode arrays can support multi-day learning or plasticity
experiments. However, because the period between traditional daily recording periods is
unobserved, there is no reliable method to track single neurons over multiple days. Recording systems
for freely behaving subjects can allow researchers to record while the animal is in its home cage,
providing continuous monitoring of neurons identified during an active experiment. Without such
monitoring, it is not possible to certify that the same neuron is being observed day-to-day and
thereby reliably state that any observed adaptation is not the result of recording instability in the system.
Recording systems have been developed for freely behaving animals (Vyssotski et al. 2006;
Mavoori et al. 2005; Obeid et al. 2004). However, these systems often have one or more of the
following limitations: 1) they cannot sample at full broadband (30 kS/s), potentially missing relevant
signal features, 2) their battery life or storage capacity is limited to a few hours or less for
broadband recording, 3) they cannot switch recording parameters, such as input channel, autonomously,
limiting the range of possible experiments, and 4) they are not designed or tested for portable use
with primates.
Here, we describe the first generation of a portable recording system, dubbed HermesB as a
moniker for “Hours of Electrophysiological Recordings in Monkey with an Extensible System,
Version B.” HermesB addresses the limitations of previous systems by providing a full broadband,
long duration, autonomous recording platform for use with chronically implanted electrode arrays
in primates. An extensible system, HermesB can easily evolve to include new components such
as experimental analog front ends (e.g., Harrison et al. 2006), making HermesB a useful
prototyping platform as well. Importantly, the system interfaces (although not exclusively so) with
the popular 96-channel electrode array manufactured by Cyberkinetics Neurotechnology Systems,
Inc. (CKI). This implant has been adopted by many electrophysiology research laboratories, is now
FDA approved, and is in clinical trials with humans (Hochberg et al. 2006). Understanding the
characteristics of this array and the stability of signals recorded from it can provide great benefit for
translating the technology to the clinical setting.
To demonstrate the utility of HermesB, we present preliminary results derived from multi-day
broadband recordings from a freely behaving macaque monkey, which provide insight into
previously unobserved timescales and behavioral contexts. A macaque was chosen as it is generally
accepted as the ideal animal model for researching neural prostheses for humans (Isaacs et al.
2000; Wessberg et al. 2000; Serruya et al. 2002; Taylor et al. 2002; Shenoy et al. 2003; Carmena
et al. 2003; Musallam et al. 2004; Santhanam et al. 2006b). In particular, we present data
quantifying the stability of neural recordings over timescales of 5 min to 54 hours. We address three
aspects of recording stability identified by Lewicki (1998): the change in mean waveform shape
over time, changes in the background noise process, and changes in the waveform shape due to
electrode movement. We illustrate the ability to identify contextual periods in our long duration
neural recordings, and we pay specific attention to identifying and understanding systematic
differences in firing rate and local field potential (LFP) during active and inactive periods.
5.3 Methods
5.3.1 System Description
HermesB is composed of three separate components, as shown in Fig. 5.2. First is the array
connector, a custom-designed, low-profile 96-pin zero insertion force (ZIF) connector.
Next, there is the analog signal conditioning pathway, consisting of a printed circuit board
(PCB) with amplifiers and filters. Finally, there is the digital signal acquisition unit, a
separate PCB that includes a microcontroller, accelerometer, and compact flash interface. Data is
stored on a high capacity non-volatile compact flash (CF) card, which is periodically removed and
downloaded to a PC. The system is powered by a pair of high efficiency, rechargeable cell phone
batteries and is entirely housed in a protective and electrically shielded casing attached to the
monkey’s skull. Table 5.1 summarizes system parameters.
[Figure 5.2 block labels. Analog board: NeuroPort, 8:1 multiplexer, high-pass and low-pass filters. Digital board: microcontroller (ARM core, ADC), accelerometer, CompactFlash.]
Figure 5.2: HermesB block diagram. The neuroport is a custom 96-channel zero insertion force connector which mates to the electrode array connector. The analog signal conditioning and digitization and storage are implemented on separate circuit boards to reduce noise and provide modularity.
HermesB is architected to be a flexible and extensible experimental platform. The modular
construction allows new components, such as experimental analog front ends (Harrison et al.
2006) or neural decoding backends, to be incorporated into the system without extensive redesign.
Additional ADC channels are available to support new analog data sources, such as chronically
implanted electromyogram (EMG) electrodes. The commercial-off-the-shelf (COTS) CF interface
leverages increasing Type I card capacity without redesign or remanufacturing.
Although capable of interfacing with any electrode array, the current HermesB was designed
to work with the 96-channel chronic electrode array manufactured by CKI. The array is wired to
a CerePort™ connector pedestal. A custom low profile ZIF connector was developed to mate to the
pedestal. The new connector comprises a mechanical component, which allows access to all 96
electrodes, and a PCB interface that provides access to a subset of 32 electrodes. Three different
PCBs were manufactured and can be interchanged manually to switch between each bank of 32
electrodes.
The analog signal conditioning path is illustrated in the upper dashed box of Fig. 5.2. The
Table 5.1: HermesB Parameters

Interface Capabilities
  Simultaneous active channels                           2
  Programmably accessible channels                       16
  Connector accessible channels                          96
  3-axis accelerometer range                             ±6 g
  Storage                                                currently 6 GB

Physical Parameters
  Enclosure size                                         60 x 70 x 45 mm
  Enclosure mass                                         127 g
  Electronics mass including batteries                   77 g
  Neuroport mass                                         16 g
  Grand total mass                                       220 g

Signal Conditioning Parameters
  High pass filter (-3 dB)                               < 0.5 Hz
  Low pass filter (-3 dB)                                7.4 kHz
  Neural sampling rate                                   30 kSamples/s
  Accel. sampling rate                                   1 kSamples/s
  ADC precision                                          12 bits

Battery Parameters
  Battery capacity                                       1600 mAh
  Typical battery life at 67% recording duty cycle       19 hrs

Measured Circuit Parameters
  Input referred noise                                   3.5 µV RMS
  Input referred precision                               1 µV per LSB
  Amplifier gain                                         ~600x
analog board has a 16-channel input connector and can be mechanically bridged to one of two
outputs of the 32-channel HermesB ZIF connector. First, all 16 input channels undergo impedance
conversion using a CMOS op-amp (Texas Instruments TLC2254) in a unity gain configuration.
The desired channels are then digitally selected using two 8:1 analog multiplexers (Analog
Devices ADG658). From here, two identical signal paths are provided to amplify and filter two of the
16 input channels. The selected signals are high-pass filtered to remove electrode DC bias, then
amplified with a differential instrumentation amplifier (Texas Instruments INA121). Three
path-matched references are provided (two reference signals and analog ground), selectable via a
jumper. Each reference signal corresponds to a platinum-iridium reference wire that accompanies
the electrode array and provides an electrical reference local to the implantation site. The
amplified signal is further amplified and low-pass filtered (Texas Instruments OPA2344) before being
passed to the digital board. The positive and negative voltage supplies are provided by the digital
board.
The digital module is depicted in the lower dashed box of Fig. 5.2. An ARM microcontroller
(Analog Devices ADUC2106) is responsible for system control, digitization of the neural and
accelerometer signals, and management of the CF card. The analog signals are digitized by a 12-bit
successive approximation ADC integrated into the microcontroller. Data packets are buffered
using the internal memory of the microcontroller and written to the CF card. The 3-axis
accelerometer (ST Microsystems STM9321) is mounted on the digital board to measure the subject’s head
movement. The digital module includes the necessary positive and negative voltage regulators for
both the digital circuitry and the analog module. The negative and positive supply voltages are
provided by separate batteries (negative: Varta EasyPack, 43.5x35.4x5.8 mm, 14 grams, 3.7 V;
positive: LG Chem ICP633450A1, 49.0x33.6x6.8 mm, 24.3 grams, 3.7 V).
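As a rough consistency check on the circuit parameters (a sketch, not a description of the actual design), the input-referred size of one ADC step follows from the ADC resolution and the amplifier gain. The 12-bit resolution and ~600x gain come from Table 5.1; the ADC full-scale span is not stated in this chapter, so the 2.5 V value below is an assumption made only for illustration.

```python
# Assumed value (not stated in the text): ADC full-scale span in volts.
ADC_SPAN_V = 2.5
ADC_BITS = 12          # from Table 5.1
AMPLIFIER_GAIN = 600   # approximate gain, from Table 5.1


def input_referred_lsb_uv(span_v, bits, gain):
    """Size of one ADC step referred back to the electrode input, in microvolts."""
    lsb_at_adc_v = span_v / (2 ** bits)
    return lsb_at_adc_v / gain * 1e6


def input_range_uv(span_v, gain):
    """Peak-to-peak input signal that fits within the ADC span, in microvolts."""
    return span_v / gain * 1e6


lsb = input_referred_lsb_uv(ADC_SPAN_V, ADC_BITS, AMPLIFIER_GAIN)
rng = input_range_uv(ADC_SPAN_V, AMPLIFIER_GAIN)
print(f"{lsb:.2f} uV per LSB, {rng:.0f} uV peak-to-peak input range")
```

Under the assumed span, one step is close to the 1 µV per LSB listed in Table 5.1, and the usable input range is a few millivolts peak-to-peak.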
The entire system, including batteries, is housed in a lightweight protective aluminum case,
shown in Fig. 5.3a,b, secured with methyl methacrylate, which was in turn secured to the skull.
The case encapsulates all of the electronics, batteries, and neuroport connector. The enclosure
was sealed with a watertight gasket and was also electrically connected to the monkey by way of
standard grounding hardware so as to provide electromagnetic (EM) shielding for the electronics
contained inside. Figure 5.3c,d,e provides photographs of the connector, analog module, and digital
module. Figure 5.3a includes a schematic of the tight packing of the components into the protective
shell. We used non-conductive foam to fill any open space; this ensured very little, if any,
vibration inside the shell. Hence, we can safely state that our accelerometer records head motion,
and not any residual board vibrations. Furthermore, the weight of our system (220 g grand total;
Table 5.1) was light enough that no behavioral differences were observed in the animal, so the
accelerometer data we collect represents natural behavior.
HermesB is controlled by custom firmware. The firmware includes a basic command
interpreter that allows the user to interact with the system in real time when tethered to a portable
laptop computer (via an RS232 serial port), as well as write simple sequencing programs for fully
[Figure 5.3 labels: screw, protective shield, gasket, digital board, analog board, batteries, methyl methacrylate, silicone elastomer, chronic electrode array, white matter, accelerometer axes. Illustrations not to scale.]
Figure 5.3: HermesB components. a. Illustration of enclosure mounted on monkey’s head along with side profile showing the stack up of the various components. The space labeled ICS denotes the intracranial space between the dura and skull. This space is larger in humans compared to monkeys and may be a source of greater recording variability when electrode-based systems transition to the clinical domain. b. Aluminum enclosure with centimeter ruler. c. Custom low-profile neuroport connector. d. Digital board. e. Analog board.
autonomous execution. A sample program is shown in Fig. 5.4. The system is highly configurable.
Parameters such as neural sampling rate and accelerometer sampling rate can be initially set to
balance sampling precision against data storage capacity. The experimenter can then specify a
sequence of epochs, each either a data sampling period or quiescent sleep period, to balance between
recording duration and battery lifetime.
5.3.2 Recordings and Analyses
Primary data for this report was collected from an adult, female macaque monkey (monkey D)
freely moving in a home cage. All experiments and procedures were approved by the Stanford
University Institutional Animal Care and Use Committee (IACUC). We performed a sterile surgery
to implant a head restraint system. At this time, we also implanted a silicon 96-electrode array.
The electrode array (Cyberkinetics, Foxborough, MA) was implanted in a region spanning the arm
representation of the dorsal aspect of pre-motor cortex (PMd) and primary motor cortex (M1), as
estimated visually from local anatomical landmarks. Surgical methods are very similar to those
described in Hatsopoulos et al. (2004).
HermesB was used to record starting in August 2005. A number of recording profiles were
used. One profile consists of recording at a 67% duty cycle (5 minutes of recording followed by
2.5 minutes of sleep). Total experiment duration is approximately 54 hours, broken up into three
18-hour sessions. The recording-sleeping duty cycling is a compromise between memory capacity
% Setup the sampling frequency to be 30 kHz
% on the neural channels and sample the
% accelerometer once every 30 neural samples.
%
% Initial 600 sec sleep period followed
% by loop of 300 sec. of recording from
% channels 4 & 6 and 150 sec. of sleep
% and loop indefinitely.
neuralfreq 30000     % Line 0
accelperiod 30       % Line 1
addsleep 600         % Line 2
addsample 4 6 300    % Line 3
addsleep 150         % Line 4
addloop 3            % Line 5
Figure 5.4: Sample program for autonomous execution. The initial sleep period is added to allow the experimenters sufficient time to close up the protective enclosure before recording commences.
and battery life constraints (footnote 1). Between each session, the monkey was transferred from the home
cage to the training chair to replace the battery and download the ~4 GB of recorded data. During
these “pit stops,” recording was continued with a second smaller CF card and a new battery to
maintain dataset continuity. Other profiles include round-robin recording of 4-8 channels over a
24-hour schedule. Two neural channels were recorded per dataset in full broadband (0.5 Hz to
7.5 kHz at 30 kSamples/s with 12-bit resolution) and a 3-axis accelerometer fixed to the monkey’s
head was sampled (1 kSamples/s with 12-bit resolution) and stored to compact flash.
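The duty cycle and data volume quoted above can be checked against the sample program of Fig. 5.4 with a short sketch. The command meanings used here are inferred from the comments in that figure, the function names are hypothetical, and the storage estimate assumes that 12-bit samples are packed with no file or packet overhead, which the text does not confirm.

```python
PROGRAM = """
neuralfreq 30000
accelperiod 30
addsleep 600
addsample 4 6 300
addsleep 150
addloop 3
"""


def parse_program(text):
    """Return (settings, epochs, loop_start) for a sequencing program.

    Assumed semantics: command lines are numbered from 0, and 'addloop N'
    jumps back to line N so the epochs from that line onward repeat.
    """
    settings, epochs, loop_start = {}, [], None
    line_to_epoch = {}
    lines = [ln.split("%")[0].split() for ln in text.strip().splitlines()]
    lines = [tokens for tokens in lines if tokens]  # drop comment-only lines
    for line_no, tokens in enumerate(lines):
        cmd, args = tokens[0], [int(a) for a in tokens[1:]]
        if cmd in ("neuralfreq", "accelperiod"):
            settings[cmd] = args[0]
        elif cmd == "addsleep":
            line_to_epoch[line_no] = len(epochs)
            epochs.append(("sleep", (), args[0]))
        elif cmd == "addsample":
            # All arguments but the last are channels; the last is seconds.
            line_to_epoch[line_no] = len(epochs)
            epochs.append(("sample", tuple(args[:-1]), args[-1]))
        elif cmd == "addloop":
            loop_start = line_to_epoch[args[0]]
    return settings, epochs, loop_start


def duty_cycle(epochs, loop_start):
    """Fraction of the repeating loop spent recording."""
    loop = epochs[loop_start:]
    recording = sum(d for kind, _, d in loop if kind == "sample")
    return recording / sum(d for _, _, d in loop)


def bytes_per_second(settings, n_channels, bits=12, accel_axes=3):
    """Raw data rate while recording, assuming tightly packed samples."""
    neural = settings["neuralfreq"] * n_channels
    accel = settings["neuralfreq"] / settings["accelperiod"] * accel_axes
    return (neural + accel) * bits / 8


settings, epochs, loop_start = parse_program(PROGRAM)
dc = duty_cycle(epochs, loop_start)
n_channels = len(epochs[loop_start][1])
session_bytes = bytes_per_second(settings, n_channels) * 18 * 3600 * dc
print(f"duty cycle {dc:.0%}, about {session_bytes / 1e9:.2f} GB per 18-hour session")
```

Under these assumptions the loop records two thirds of the time, and an 18-hour session comes to roughly 4.1 GB, in line with the ~4 GB downloaded at each pit stop.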
Accelerometer data from each five-minute data block was used to label the blocks as either
“active,” “inactive,” or “mixed.” Blocks in which the maximum accelerometer magnitude (MAM)
was greater than 1.25 g were labeled active, blocks in which the MAM was less than 1.15 g were
labeled inactive, and blocks that were within these bounds were labeled mixed. These thresholds
were selected to roughly balance the number of active and inactive blocks to a ratio similar to that
of day (lights on) versus night (lights off) blocks (as we expect low activity when the lights are off),
while retaining a 0.1 g margin between classifications.
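The block labeling rule can be sketched as follows. The thresholds are those given above; treating the MAM as the largest Euclidean norm of the three axes (so that a stationary head reads about 1 g from gravity) is an assumption, and the function names are hypothetical.

```python
import math

ACTIVE_G = 1.25    # blocks with MAM above this are "active"
INACTIVE_G = 1.15  # blocks with MAM below this are "inactive"


def max_accel_magnitude(samples):
    """Largest vector magnitude (in g) over the (x, y, z) samples of one block."""
    return max(math.sqrt(x * x + y * y + z * z) for x, y, z in samples)


def label_block(samples):
    """Label one five-minute block from its accelerometer samples."""
    mam = max_accel_magnitude(samples)
    if mam > ACTIVE_G:
        return "active"
    if mam < INACTIVE_G:
        return "inactive"
    return "mixed"


# A head at rest reads about 1 g; a brisk movement adds to the magnitude.
print(label_block([(0.0, 0.0, 1.0), (0.1, 0.0, 1.0)]))
print(label_block([(1.0, 0.0, 1.0)]))
```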
The recorded neural signals from each five-minute block were post-processed with the Sahani
spike-sorting algorithm, which is an unsupervised clustering algorithm as described in Chapter 2
and by Sahani (1999), and further analyzed by Zumsteg et al. (2005). Spike times were
identified using a threshold determined from data across the block (3σ with respect to the RMS noise
estimate from filtered data). As described earlier in Chapter 2, a spike waveform, or snippet,
comprising a 32-sample window around the threshold event, was extracted and aligned to its center
of mass. Snippets were projected into a 4-dimensional robust, noise-whitened principal
components space (NWrPCA) and clustered using a maximum a posteriori (MAP) clustering technique.
Footnote 1: When recording continuously, the current memory capacity can be quickly exhausted. At very low duty cycling, the battery is discharged by the static power consumption before the CF card is full, despite sleeping the microcontroller in between recording periods.
Well-isolated neural units were identified and cross-referenced across blocks by hand. For LFP
analyses, broadband data was filtered by applying Chebyshev Type I lowpass and bandpass filters
with a passband ripple of 1 dB. Power spectral density estimates were calculated using the Welch
periodogram method.
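The threshold-crossing and snippet-extraction step that precedes clustering can be illustrated with a minimal sketch. This is not the Sahani algorithm: the projection and clustering stages are omitted, the RMS of the whole filtered block stands in for the noise estimate, and the crossing rule (absolute value), window placement, and function names are assumptions made for illustration.

```python
import math


def rms(values):
    """Root-mean-square of a sequence of samples."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def extract_snippets(signal, n_sigma=3.0, window=32):
    """Return (threshold, snippets) for one block of filtered samples.

    A snippet is a `window`-sample slice centred on each point where
    |signal| rises through n_sigma * RMS. Crossings too close to the
    edges of the block, or inside the previous snippet, are skipped.
    """
    threshold = n_sigma * rms(signal)
    half = window // 2
    snippets = []
    i = half
    while i < len(signal) - half:
        if abs(signal[i]) >= threshold and abs(signal[i - 1]) < threshold:
            snippets.append(signal[i - half:i + half])
            i += half  # skip the remainder of this snippet
        else:
            i += 1
    return threshold, snippets


def center_of_mass(snippet):
    """Amplitude-weighted mean sample index, usable as an alignment point."""
    weights = [abs(v) for v in snippet]
    return sum(k * w for k, w in enumerate(weights)) / sum(weights)


# Toy block: silence with a single large deflection at sample 50.
block = [0.0] * 100
block[50] = 10.0
threshold, snippets = extract_snippets(block)
print(threshold, len(snippets), center_of_mass(snippets[0]))
```

In a fuller pipeline each snippet would be shifted so that its center of mass falls on a common sample index before projection into the principal components space.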
5.3.3 Recording Stability Analyses
To quantify the stability and consistency of waveforms recorded from our electrode array, we
analyzed data from our long duration recordings in several ways. First, snippets from the entire
session were extracted using a 3σ threshold and projected into a single 2-dimensional principal
components subspace. By graphing a 2D histogram of the snippets in this subspace, snippets with
similar waveform shapes are grouped into distinct clusters. Movement of these clusters across the
session indicates drift in the waveforms. The magnitude of the shifts was assessed by examining
the actual waveform shapes over these periods of interest. Second, to observe more continuous
shifts in waveform shape, we chose a feature of the average waveform shape, the peak-to-peak
voltage (Vpp), and plotted this quantity over the course of the recording session. The Vpp was
determined on a block-by-block basis by using the Sahani algorithm per block, providing local
estimates of the average waveform shapes.
Lastly, to search for potentially abrupt changes in waveform shape, the neural recordings were
analyzed in conjunction with the accelerometer data. An abrupt change in electrode array position
in the cortex would presumably manifest itself as an abrupt change in waveform amplitude, as
the neuron-electrode distance would change. If such changes do occur, we additionally presume
they are correlated with high acceleration events such as vigorous head movement. Therefore,
we examined the neural recordings straddling high acceleration events (>3 g threshold) and
examined the Vpp metric around these events. To help search for events of interest, we computed
the local change of the Vpp metric ($V_{pp}^{\mathrm{after}}/V_{pp}^{\mathrm{before}}$), constructed from 200 snippets before and 200
snippets after the acceleration event. This allowed us to narrow in on high acceleration events
that coincided with large shifts in action potential waveform shape.
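A minimal sketch of the local Vpp change around one acceleration event is given below. It assumes that the metric is the peak-to-peak voltage of the mean waveform on each side of the event (the text does not spell out the averaging order), and the function names and data layout are hypothetical.

```python
import bisect


def mean_waveform(snippets):
    """Sample-by-sample average of equal-length snippets."""
    n = len(snippets)
    return [sum(column) / n for column in zip(*snippets)]


def vpp(waveform):
    """Peak-to-peak voltage of a single waveform."""
    return max(waveform) - min(waveform)


def local_vpp_change(spike_times, snippets, event_time, n=200):
    """Ratio Vpp_after / Vpp_before around one acceleration event.

    spike_times must be sorted and parallel to snippets. Returns None when
    fewer than n snippets are available on either side of the event.
    """
    split = bisect.bisect_left(spike_times, event_time)
    before = snippets[max(0, split - n):split]
    after = snippets[split:split + n]
    if len(before) < n or len(after) < n:
        return None
    return vpp(mean_waveform(after)) / vpp(mean_waveform(before))


# Toy data: three small spikes, an event at t = 3.5, then three larger spikes.
times = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
snips = [[0.0, -1.0, 1.0, 0.0]] * 3 + [[0.0, -1.25, 1.25, 0.0]] * 3
print(local_vpp_change(times, snips, event_time=3.5, n=3))
```

A ratio near 1 indicates no local change; values well above or below 1 flag events worth inspecting by eye.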
Single neural units were used for these analyses to observe the recording stability from our
chronic implant. One important concern is that if a unit is automatically identified by the spike
sorter, large changes in the unit’s waveform shape could cause the unit to no longer be classified
correctly, thereby obscuring the analyses. Thus, the NWrPCA projections of the selected units
were examined separately by us to ensure that snippets were not ignored, or improperly included.
This was accomplished by ensuring all units included in the aforementioned stability analyses
were well-isolated, high-firing-rate single neurons, and sufficiently distinct from other signals on
their respective electrodes such that reasonably large variations would not result in a high rate of
misclassification.
5.4 Results
5.4.1 System Verification
Figure 5.5 shows example data recorded from our animal subject freely moving in her home cage.
The top traces, Fig. 5.5a, show the three-axis acceleration measurements of the monkey’s head
over a 10 second period. This data segment was recorded in the early evening during a period in
which the monkey was quite active. Figure 5.5b shows 100 ms of broadband neural data recorded
from a single channel on the electrode array. The LFP (local field potential) is easily visible, as are
a number of spikes “riding” on top of the LFP. Figure 5.5c shows the same data segment filtered
with a 250 Hz high pass IIR filter, which is the same filter used when spike sorting for our other
HermesB analyses.
[Figure 5.5 axis residue: acceleration axis from -2 g to 2 g with a 1 s scale bar; 5 ms scale bars on the neural traces.]
Figure 5.5: Sample neural and accelerometer data recorded from a freely behaving monkey. a. Accelerometer channels, x (blue), y (green), and z (red). The DC levels on the channels are due to the particular orientation of the accelerometer with respect to Earth’s gravity vector. b. Unfiltered broadband neural data taken from the middle of the recording period. c. Filtered broadband neural data.
Datasets like that shown in Fig. 5.5 were used as part of a three step verification process to
ensure the accuracy of HermesB recordings. The steps were 1) measure HermesB circuit
parameters, 2) compare recordings of the CKI Neural Simulator made with HermesB and our standard
laboratory recording system (CKI Cerebus System), and 3) compare HermesB recordings of neural
activity in a rhesus monkey to recordings made by the fixed laboratory system.
The measured circuit parameters are summarized in Table 5.1. The input referred noise,
measured with grounded inputs, is comparable to or better than current state-of-the-art commercial
(CKI Cerebus System) and research systems (Harrison et al. 2006). The CKI Neural Simulator is a
playback device that simulates 128 channels of neural signals at the amplitude of array recordings
(e.g., maximum of ~500 µV peak-to-peak) and similar output impedance to a standard electrode
array. Figure 5.6a shows a side-by-side comparison of Neural Simulator recordings made with
the CKI Cerebus system (left) and with HermesB (right). The three spike waveforms are clearly
visible, with comparable levels of noise (measured as the spread of the curves) between the two
systems. Figure 5.6b shows a similar comparison for a channel from the electrode array, recorded
from a monkey sitting quietly in a primate chair. The figure shows the 10th-90th percentile in
amplitude of action potential waveforms recorded from a single channel on the electrode array.
[Figure 5.6 panel labels: CKI Cerebus (left), HermesB (right); 0.2 ms time scale bars.]
Figure 5.6: Comparison of snippets recorded with CKI Cerebus system (left) and HermesB (right). a. Snippets recorded from CKI Neural Simulator. b. Snippets from four neurons recorded from a single electrode channel in a monkey sitting comfortably in a chair with head restrained. Snippets have been sorted and the 10th-90th percentile in amplitude is indicated by the colored region for each waveform.
A five-minute recording was sorted using the Sahani algorithm, which classified the spikes as
belonging to one of four separable units (indicated by different coloring).
The spike snippets were projected into a lower dimensional subspace to verify that they originated
from separable clusters (data not shown). The waveforms are very similar between the two
systems, indicating that HermesB is comparable to current state-of-the-art commercial laboratory
equipment. Furthermore, the ability of HermesB to distinguish between several units on a
single electrode builds confidence that this apparatus can serve to address the scientific goals posed
earlier.
5.4.2 Recording Stability
Figure 5.7 shows neural recordings made over the course of 48 hours in October 2005. Figure 5.7a
shows a time series of NWrPCA cluster plots for five-minute data segments recorded at the times
shown. Each cluster corresponds to a single neuron, and the movement (drift) of the relative
distance between these clusters is readily seen by scanning across the snapshots. The drift of
the clusters in NWrPCA space reflects changes in spike waveform shape. Figure 5.7b shows action
potential shapes (voltage vs. time) from the same recording period. The colored region indicates the
10th-90th percentile in amplitude. The lines of constant voltage provide a reference against which
one can see the large changes in waveform amplitude. These changes in action potential shape
have been previously observed across once-daily recordings (Suner et al. 2005). Here, preliminary
results from these continuous neural recordings of a freely behaving primate indicate substantial
variation in spike waveforms over intermediate timescales as well.
[Figure 5.7 panel labels. a: 17:44 Day 1, 20:24 Day 1, 01:44 Day 2, 07:04 Day 2, 12:24 Day 2, 17:44 Day 2, 23:04 Day 2, 01:44 Day 3. b: 17:44 Day 1, 01:44 Day 2, 07:04 Day 2, 23:04 Day 2; voltage axis marked at 100 µV and -150 µV.]
Figure 5.7: Neural recordings over a period of 48 hours (dataset D20051008). a. Histogram of spike waveform projections into a fixed 2D NWrPCA space. PCA space determined using 20,000 snippets uniformly selected across the time period. Each plot is the projection of 5 minutes of data recorded from a single channel at the time shown. The green and blue circles denote identifiable single neurons that are analyzed in the bottom panel. b. Spike waveforms of two neurons for selected five-minute blocks. To better isolate the selected units, spike sorting was performed strictly within a block and irrespective of the data in other blocks. Colored region indicates 10th-90th percentile in amplitude. Horizontal lines indicate maximum and minimum voltage for each unit. Waveforms shown are recorded from a single channel using the same signal conditioning path. Note that between 17:44 (day 1) and 07:04 (day 2) Vpp, the peak-to-peak voltage, of the green waveform increases, while Vpp of the blue waveform decreases, showing that waveform changes cannot be attributed to fluctuations in the signal conditioning pathway (connectorization, amplifiers, ADC, battery power, etc.).
Figure 5.8 shows a more continuous representation of the waveform changes over time. Panel
c shows the normalized peak-to-peak voltage, for the neuron identified in panel a, recorded from
a single channel over a 54-hour period. The normalized Vpp is simply the mean Vpp for each
block, normalized by Vpp of the mean waveform for that neuron across the entire 54-hour dataset.
Variability in waveform amplitude, up to 30% relative to the mean, is observed over a range of
timescales. There is a clear variation on the order of a single block (5 minutes of recording with
2.5 minutes of sleep) as well as changes on the order of several blocks, and even several hours.
[Figure 5.8 panel labels: a, Neuron 1; b, Neuron 2; c and e, normalized Vpp (axis roughly 0.8 to 1.4); d and f, RMS (axis roughly 5 to 20); time axes marked at 6 AM and 6 PM.]
Figure 5.8: Variation in Vpp and RMS. a,b. Histogram of spike waveform projections into NWrPCA space from two different electrodes recorded for different 54-hour datasets (D20060302.ch2 & D20060225.ch1). Selected neurons are indicated by arrows. These neurons were well-isolated. c. Normalized Vpp of neuron #1 recorded over the 54-hour session. d. RMS noise of recorded channels over same period. e,f. Same two plots for the second dataset. The wide light gray regions indicate night, and the thin pink regions indicate “pit stops,” when the monkey was taken from the home cage and placed in a primate chair to service the recording equipment.
Figure 5.8d shows the RMS voltage of filtered neural recordings from three channels recorded
over five-minute blocks. All spikes, identified with thresholding at 3σ of RMS noise, have been
removed from the dataset prior to the RMS calculation shown. Without the spikes, the RMS
value should offer a better measure of the true background noise process (Watkins et al. 2004).
Even after removing identifiable spikes, the RMS noise is highly correlated with neural activity (as
measured by mean firing rate). These variations (~5 µV) can partly result from distant spike
activity (i.e., neural activity is sensed by the electrode, but the signal does not rise above the spike
threshold because the spike amplitude is too small, or the neuron is too far away). Furthermore,
depending on which data block is analyzed to set the threshold, there can be differences greater
than 15 µV for a 3σ threshold.
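The spike-removed RMS estimate can be sketched as below. The text states only that spikes found with a 3σ threshold were removed before the RMS calculation; the blanking width around each supra-threshold sample and the function names are assumptions chosen for illustration.

```python
import math


def rms(values):
    """Root-mean-square of a sequence of samples."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def background_rms(signal, n_sigma=3.0, half_window=16):
    """RMS of one block after blanking samples around threshold crossings.

    The threshold is n_sigma times the RMS of the full block; every sample
    within half_window of a supra-threshold sample is excluded.
    """
    threshold = n_sigma * rms(signal)
    keep = [True] * len(signal)
    for i, v in enumerate(signal):
        if abs(v) >= threshold:
            lo = max(0, i - half_window)
            hi = min(len(signal), i + half_window)
            keep[lo:hi] = [False] * (hi - lo)
    kept = [v for v, k in zip(signal, keep) if k]
    return rms(kept) if kept else float("nan")


# Toy block: unit-amplitude background with one large spike at sample 100.
block = [1.0, -1.0] * 100
block[100] = 50.0
print(rms(block), background_rms(block))
```

Because the spike threshold scales with the RMS estimate, a ~5 µV change in RMS between blocks moves a 3σ threshold by ~15 µV, which is the sensitivity noted above.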
Figure 5.8e,f show similar results for the neuron identified in Fig. 5.8b, which was recorded
from a different electrode during a different 54-hour period. Similar characteristics have been
observed for other channels (data not shown), indicating that the changes in waveform amplitude
observed in Fig. 5.8c,e are not unique to those channels. The large change observed at 13:00 (day
1) in Fig. 5.8c is coincident with a vigorous head movement, and may have resulted from an abrupt
movement of the array, a possibility discussed below.
In our analysis of abrupt waveform changes, examination of recordings straddling high
acceleration events shows, in nearly all cases, far smaller changes in waveform amplitude than those
observed over the intermediate timescales of Fig. 5.8. For example, Fig. 5.9a,b show the local
changes in Vpp ($V_{pp}^{\mathrm{after}}/V_{pp}^{\mathrm{before}}$) for all 3+ g acceleration events for the same two neurons in
Fig. 5.8c,e. Over a recording period of ~50 hours for each session, there were ~1700 and ~800 high
acceleration events for panels a and b, respectively.
For nearly all events shown in Fig. 5.9a,b there is less than a 5% change in mean waveform
amplitude straddling the acceleration event. There are, however, two events in Fig. 5.9b that show
much larger changes (labeled event 1 and 2). For the first of these events, the NWrPCA projections
of the before (blue) and after (green) snippets are shown in Fig. 5.10a. The significant change in
[Figure 5.9 axis residue: local Vpp change axes spanning roughly 0.8 to 1.4 for panels a and b; time axes marked at 6 AM and 6 PM; arrow labeled "event 1".]
Figure 5.9: Variation in waveform amplitude straddling high acceleration events. a. Local change in mean waveform amplitude ($V_{pp}^{\mathrm{after}}/V_{pp}^{\mathrm{before}}$) for 200 snippets before and after 3+ g acceleration events (dataset D20060225.ch1u1). Note that the heights of the symbols do not correspond to the magnitudes of the acceleration events. b. Same as previous panel for dataset D20060302.ch2u1. Arrows in panel b indicate events of interest. Similar gray and pink shading as in Fig. 5.8.
waveform amplitude (1.25 x increase) is clearly reflected in the NWrPCA projection. A second unit
on this channel (the other cluster in the NWrPCA projection) shows a smaller change in amplitude
(only a l . l x increase) across the same acceleration event suggesting th a t the observed variation
does not result from changes in the signal conditioning pathway (not shown). For example, a
common shift in signal gain would result in equivalent waveform amplitude change for both units,
which was not the case here.
Figure 5.10: Variation in waveform amplitude for events identified in Fig. 5.9b. a. NWrPCA projection of 200 before (blue) and 200 after (green) snippets straddling acceleration event overlaid on NWrPCA histogram for all snippets in a five-minute block. Dataset D20060302.ch2. b. Peak-to-peak voltage of mean waveform amplitude averaged over 200 spikes centered around time point shown, for the neuron of interest in panel a. The red vertical line marks the >3 g acceleration event. c,d. Similar plots for event 2 in the same dataset.
Figure 5.10b shows a 200 spike moving average of Vpp for the block in which the event was
recorded. The close alignment between the acceleration event (indicated by the red vertical line)
and the step change in waveform amplitude strongly suggests that the relationship between the
change in waveform amplitude and the high acceleration event is not coincidental. The profile is
consistent with an abrupt change in array position. Well before and after the shift, the array was
in a stable state, evidenced by the near constant waveform amplitude, while at the time of the
large acceleration event there is a step change in the Vpp. Figure 5.10c,d show similar results for
the second event in Fig. 5.9b.
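One way to check the alignment between a step change and an acceleration event is to locate the largest step in the per-spike Vpp series and compare its position with the spike index of the event. The sketch below is an illustrative assumption, not the procedure used to produce Fig. 5.10: the function name and the use of adjacent, non-overlapping 200-spike windows are choices made for the example.

```python
from itertools import accumulate


def largest_step(vpp, window=200):
    """Find where the windowed mean of Vpp changes the most.

    For each spike index i, compare the mean Vpp of the `window` spikes
    after i with the mean of the `window` spikes before i. Returns
    (index, ratio) for the ratio that deviates most from 1, or
    (None, 1.0) if the series is shorter than two windows.
    Vpp values are assumed to be positive.
    """
    prefix = [0.0] + list(accumulate(vpp))  # prefix sums for O(1) window means
    best_index, best_ratio = None, 1.0
    for i in range(window, len(vpp) - window + 1):
        before = (prefix[i] - prefix[i - window]) / window
        after = (prefix[i + window] - prefix[i]) / window
        ratio = after / before
        if abs(ratio - 1.0) > abs(best_ratio - 1.0):
            best_index, best_ratio = i, ratio
    return best_index, best_ratio
```

If the returned index falls within a few spikes of the acceleration event, the step and the event are closely aligned in the sense described above.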
Implications of Neural Recording Instability
These analyses of neural recording stability were our most novel investigations with HermesB. It
is in this particular class of experiments that HermesB is most differentiated from other portable
recording systems currently in use. What might be the cause for these variations in waveform
amplitude? The step changes in waveform amplitude appear, in some cases, to result from abrupt
shifts caused by head movement. For the non-abrupt variation in waveform shape and RMS noise,
we believe a number of factors may play a significant role, such as changes in the cortical
environment in response to subject activity, including “brain bounce,” changes in intracranial
pressure (ICP), and other homeostatic factors. At short to intermediate timescales (i.e.,
longer than bursting periods), Lewicki (1998) suggests that array movement, or more specifically
changes in the neuron-electrode distance, might play a role in waveform shape change.
Fluctuations in the ICP could potentially move the cortex tissue relative to the array (or vice-versa).
Confirming such a relationship is beyond the scope of this work, though it may be of interest in
future studies.
Since so few of the high acceleration events were coincident with large changes in waveform
amplitude, there is the temptation to dismiss these events as rare and unimportant. However,
a practical neural prosthesis will have to operate 24 hours a day and 7 days a week. As such,
the prosthetic system must be able to recognize the 3-4 abrupt changes that might occur in a
week, especially when such systems are eventually used for more ambulatory patients. In fact, in
one stretch of ~84 hours of recording we found that there were many tens of events that showed
>10% change in average waveform amplitude coincident with a >3 g acceleration measurement.
Our results are only preliminary, however, and will require more datasets and more animals for
comprehensive characterization.
Traditional experimental protocols that utilize discrete, daily recording periods have provided
sparse information regarding neural recording stability. The day-to-day sampling restricts the
potential characterization of variations to timescales of either minutes or days. It is important
to note that similar variations were not observed within the hour long broadband recordings
described in Suner et al. (2005). However, those recordings were made under a more traditional
experimental protocol in which a restrained monkey performed a repetitive reaching task. It is
possible the more controlled and consistent environment of those recordings, in contrast to the
animal freely behaving in the home cage, produces a more consistent cortical environment (e.g.,
less “brain bounce,” smaller changes in intracranial pressure, etc.) and thus reduced variation in
waveform shape.
We have shown examples from preliminary datasets of significant waveform shape and RMS
noise variation at all three timescales. Both types of variation can have adverse effects on
spike-sorting performance, either through the use of an inappropriate threshold or outright
misclassification. The improved statistical characterization of the stability of neural recordings enabled
by these new long duration datasets will allow the principled design and evaluation of sorting
algorithms. Tolerance to some instabilities in neural recordings has already been incorporated
into sorting algorithms. The short timescale variations in spike shape can be addressed by
incorporating firing statistics into the spike-sorting algorithm (Pouzat et al. 2004) and changes in
RMS voltage (from which the threshold is typically derived) can be addressed through adaptive
thresholding (Watkins et al. 2004). Long term variation, however, may require periodic
retraining of the spike-sorting parameters. With such readjustments, experimenters report the ability
to track single neurons across months or even years (although experimenters cannot be sure the
same neurons are being observed without truly constant tracking, a capability now available with
HermesB). There does not appear to be a consensus on exactly what retraining period is required.
Current experiments that use discrete daily recording periods naturally update once per day.
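To make the idea of an RMS-derived threshold that adapts over time concrete, the sketch below recomputes the threshold block by block from a smoothed RMS estimate. It is a generic illustration and should not be read as the method of Watkins et al. (2004): the multiplier, the exponential smoothing factor, the block size, and the function names are all assumptions chosen for the example.

```python
import math


def rms(samples):
    """Root-mean-square voltage of one block of samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def adaptive_thresholds(signal, block_size, multiplier=3.0, smoothing=0.1):
    """Return one spike-detection threshold per complete block of `signal`.

    The threshold is `multiplier` times an exponentially smoothed RMS
    estimate, so it follows slow changes in background noise instead of
    staying fixed at the value measured during initial training.
    (multiplier and smoothing are hypothetical defaults.)
    """
    thresholds = []
    estimate = None
    for start in range(0, len(signal) - block_size + 1, block_size):
        block_rms = rms(signal[start:start + block_size])
        if estimate is None:
            estimate = block_rms  # initialize from the first block
        else:
            estimate = (1.0 - smoothing) * estimate + smoothing * block_rms
        thresholds.append(multiplier * estimate)
    return thresholds
```

A smaller smoothing factor gives a steadier threshold that reacts slowly to noise changes; a larger one tracks the intermediate-timescale RMS variation more closely at the cost of more jitter.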
The quality of the trained spike-sorting parameters is paramount. Poor sorting parameters,
and thus poor sorting performance, will affect all aspects of neural prosthetic system performance.
This was demonstrated in Chapter 2. This does not imply that systems should retrain arbitrarily
often. Frequent retraining can have significant costs. For advanced spike-sorting algorithms
(Sahani 1999), the training algorithm is computationally expensive. Although our recent power
feasibility study has shown that the power consumption of the algorithm in Sahani (1999) is small
relative to real-time classification, it was assumed that retraining would be required only every 12
hours (Zumsteg et al. 2005). If a much shorter training period is required, the power consumption
of training could quickly become significant.
Sorting algorithms with an adaptive training approach that continuously integrates over an
extended period, similar to the method proposed in Bar-Hillel et al. (2004), as opposed to discrete
retraining, might be the best approach in light of the instability of neural recordings. A suitable
adaptive algorithm would have an effective training interval short enough to track variations in
waveform shape and background process, without the cost of traditional discrete retraining. The
apparent sparsity of abrupt changes in waveform shape due to rapid array movement may mean
that there are fewer problem scenarios in which abrupt retraining is required. Nonetheless, the
presence of these abrupt changes in waveform shape does suggest that to maximize spike
classification accuracy, any algorithm would benefit from the ability to initiate discrete retraining when
step changes in the waveform shape are observed. As these chronic electrode arrays are implanted
in amputees (rather than tetraplegics), the head will move substantially. It is worthwhile to note
that the space between the brain and the dura is larger in humans than monkeys. Therefore “brain
bounce” and other non-stationarities may be much more of an issue. Systems like HermesB will be
critical to characterize the recording stability and provide the test data for more robust, adaptive
spike-sorting algorithms.
5.4.3 Neural Correlates of Behavioral Contexts
Figure 5.11 shows data from two 54-hour recordings. For the first dataset (Fig. 5.11a,c-e), there
were 438 data blocks. Active blocks, in which the monkey was putatively moving in its home
cage, constituted 40% of the blocks while 52% of the blocks were inactive. From the accelerometer
data (Fig. 5.11d), it is clear that the monkey was more physically active during the day, and as
expected, firing rates tend to be higher during these periods. Note that LFP power was generally
lower during these periods as shown in Fig. 5.11e. During the “pit stops” (battery swap periods)
the monkey’s head was comfortably restrained in a fixed position (the time duration indicated by the
pink bands); therefore, accelerometer magnitude remained flat at 1 g. Likewise, few movements
were made and consequently firing rates were suppressed. These trends were consistent across
two datasets collected from different electrodes and at different times. Neural activity recorded
simultaneously from a second channel shows similar patterns (Fig. 5.11b,f-h).
Figure 5.11: Neural and accelerometer data recorded from a freely behaving monkey. a,b. Histogram of spike waveform projections into NWrPCA space from two different electrodes recorded for different 54-hour datasets (D20060302.ch2 & D20060225.ch1). Selected neurons are indicated by arrows. These neurons were well-isolated. c. Firing rate of the neuron shown in panel a calculated over a 1 second interval using a Hamming window. Red and blue data points were recorded in time periods labeled as “active” and “inactive,” respectively. Green data points were recorded during unlabeled periods. d. Accelerometer magnitude over the recording period downsampled to 100 Hz. e. LFP power per block, recorded from the same electrode, calculated by integrating the power over the 5-25 Hz frequency band. f-h. Same plots for second dataset. Similar gray and pink shading as in Fig. 5.8.
As shown in Fig. 5.12a, the mean LFP power differed between “active” and “inactive” periods
in the 2-30 Hz and 50-100 Hz frequency bands. For the majority of this range the standard
deviations are large relative to the difference in the mean; this relationship makes power modulation
in these bands an unreliable classifier for per-block behavior (i.e., “active” vs. “inactive”). However,
the 5-25 Hz band was well-separated, so the power in this range can be used to develop a reliable
classifier. This differentiation in LFP power is consistent with previous results showing that
10-100 Hz LFP activity diminished during movement (Donoghue et al. 1998; Santhanam et al. 2003)
and increased during sleep (Destexhe et al. 1999).
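Band-limited LFP power of the kind discussed above can be obtained by summing a power spectrum over the band of interest. The following is a minimal sketch using a naive discrete Fourier transform on a short block. It is not the pipeline used for Fig. 5.12: the function names, the periodogram normalization, and the decibel reference (1 unit of power) are assumptions, and a practical implementation would use an FFT with windowed averaging over the full five-minute block.

```python
import cmath
import math


def band_power(samples, fs, f_lo=5.0, f_hi=25.0):
    """Sum one-sided periodogram power over [f_lo, f_hi] Hz.

    samples -- one block of LFP samples (any consistent voltage unit)
    fs      -- sampling rate in Hz
    The band is assumed to lie strictly below the Nyquist frequency.
    """
    n = len(samples)
    power = 0.0
    for k in range(1, (n + 1) // 2):  # positive-frequency bins, excluding DC and Nyquist
        freq = k * fs / n
        if f_lo <= freq <= f_hi:
            coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                        for t, x in enumerate(samples))
            power += 2.0 * abs(coeff) ** 2 / n ** 2  # factor 2 folds in negative frequencies
    return power


def to_db(power, reference=1.0):
    """Express power in decibels relative to `reference` (assumed reference)."""
    return 10.0 * math.log10(power / reference)
```

With this normalization, a sinusoid of amplitude A that falls inside the band contributes A²/2 to the band power, which makes the result easy to sanity-check on synthetic data.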
Figure 5.12: LFP analyses for dataset D20060302.ch2. a. Power spectral density (PSD) recorded during “active” (red) and “inactive” (blue) periods. The thin lines are the mean PSDs and the standard error of the mean is represented by their thickness. The thickness of the wider translucent lines represents the standard deviations. Each PSD is calculated over 5 minutes of data and their distributions were taken from data across the 54-hour dataset for neuron 1. b. Spectral power recorded during “active” (red) and “inactive” (blue) periods for the 5-25 Hz frequency band. The dotted line represents the learned classification threshold between “active” and “inactive” blocks.
Figure 5.12b plots 5-25 Hz LFP power versus MAM (maximum accelerometer magnitude) for
each five-minute block. When we classified the activity level of blocks by thresholding LFP power
at -56.5 dB, 93% (131/141) of “active” blocks and 92% (175/191) of “inactive” blocks were correctly
classified. Results were similar for a second channel from a different session: 89% (150/169) of
“active” blocks and 88% (81/92) of “inactive” blocks were correctly classified with a threshold of
-57.1 dB. These results were obtained by picking the optimal linear classification boundary using
the first 40 active and first 40 inactive blocks and testing on the remaining blocks. Head posting
during “pit stops” can create confounds since the accelerometer was held in a fixed position even if
the monkey was otherwise active during these periods. Hence, these periods were removed prior
to the aforementioned analysis.
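The train-then-test procedure described above (fit a one-dimensional boundary on the first 40 blocks of each class, then score the remaining blocks) can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the original analysis code: the function names are hypothetical, the boundary is chosen by exhaustive search over midpoints between neighboring training values, and blocks with power below the threshold are labeled “active” because LFP power was lower during activity.

```python
def fit_threshold(active_db, inactive_db):
    """Choose the boundary (in dB) that maximizes training accuracy."""
    values = sorted(set(active_db) | set(inactive_db))
    if len(values) < 2:
        raise ValueError("need at least two distinct training values")
    candidates = [(a + b) / 2.0 for a, b in zip(values, values[1:])]

    def n_correct(threshold):
        return (sum(p < threshold for p in active_db)
                + sum(p >= threshold for p in inactive_db))

    return max(candidates, key=n_correct)


def evaluate(threshold, active_db, inactive_db):
    """Return (fraction of active correct, fraction of inactive correct)."""
    active_ok = sum(p < threshold for p in active_db) / len(active_db)
    inactive_ok = sum(p >= threshold for p in inactive_db) / len(inactive_db)
    return active_ok, inactive_ok


def train_and_test(active_db, inactive_db, n_train=40):
    """Fit on the first n_train blocks of each class, test on the rest."""
    threshold = fit_threshold(active_db[:n_train], inactive_db[:n_train])
    return threshold, evaluate(threshold, active_db[n_train:], inactive_db[n_train:])
```

Reporting the two class accuracies separately, as in the text, avoids hiding a poor result on the smaller class behind a good overall average.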
A similar classification was not as successful when using the average firing rate over a five-
minute block (data not shown). The mean and variance of the MAM increased as the firing rate
increased, but the likelihood of a small MAM (i.e., an inactive period) remained relatively high
even for high firing rates (data not shown). Recall that the electrode was implanted in a region
spanning PMd and M1, which is strongly believed to be involved in the motor planning and
execution of arm movements (Tanji and Evarts 1976; Weinrich and Wise 1982; Weinrich et al. 1984;
Godschalk et al. 1985; Kurata 1989; Churchland et al. 2006b). If arm movements are made while
the head position remains fixed, firing rates could increase without large acceleration events. Also,
motor plans can be generated and subsequently canceled. Thus, absolute firing rate may not be
the best proxy for activity level.
Given that “active” and “inactive” periods tended to occur during day and night, respectively,
the variations in firing rate and LFP might be explained, in part, by circadian rhythms (or direct
modulation by light level). One might hypothesize that 5-25 Hz LFP power is increased and
firing rates are depressed in association with day-night cycles. However, for blocks within a single
activity condition (either “active” or “inactive”), the differences between day and night for both
LFP and firing rate were at least an order of magnitude smaller than the difference between
“active” and “inactive” blocks during either time period. This suggests that circadian rhythms do
not heavily influence these effects.
As shown in Fig. 5.12, LFP is a promising proxy for activity level. Furthermore, LFP power
measurement consumes less battery power than firing rate measurement (a low-power LFP power
measurement circuit is described by Harrison et al. (2004)), potentially enabling a power efficient
“sleep” mode when the user is inactive. When LFP power falls below a defined threshold, indicating
that the user is active, the prosthetic can switch out of this “sleep” mode. In addition, using
LFP thresholding could help prevent undesired movements from the prosthetic system during
“inactive” periods.
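The gating logic described above can be sketched as a small state machine driven by per-block LFP power. This is a hypothetical illustration, not part of HermesB or any existing prosthetic system: the function name, the mode labels, and the `min_blocks` debounce parameter (the number of consecutive blocks required before switching) are assumptions introduced for the example.

```python
def gate_modes(lfp_power_db, threshold_db, min_blocks=1, initial="sleep"):
    """Return the prosthetic mode ("awake" or "sleep") after each block.

    Power below threshold_db is treated as evidence of activity, since
    5-25 Hz LFP power was lower during active periods. The mode changes
    only after min_blocks consecutive blocks disagree with the current mode.
    """
    mode = initial
    streak = 0  # consecutive blocks favoring a mode change
    modes = []
    for power in lfp_power_db:
        wanted = "awake" if power < threshold_db else "sleep"
        if wanted == mode:
            streak = 0
        else:
            streak += 1
            if streak >= min_blocks:
                mode, streak = wanted, 0
        modes.append(mode)
    return modes
```

Requiring more than one consecutive block before switching trades responsiveness for robustness to isolated misclassified blocks; the appropriate setting would have to be determined empirically.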
In future studies we plan to examine subtler context changes. Some contexts may require
fewer neurons for acceptable performance; under these conditions we can conserve power by
disabling a subset of the neural channels. Under different contexts, users may require different sets
of behavioral responses (such as discrete target selection vs. continuous motion) or the underlying
dynamics of the observed cortical area may change drastically; we would like to respond to these
concerns by switching the decoding model according to context. By identifying contexts and
adjusting hardware configuration accordingly, it may be possible to boost performance in terms of
power consumption and decoding accuracy.
We were able to identify natural behavior across multiple days using accelerometer
measurements and correlating these to neural recordings. Such an ability coupled with more advanced
behavioral monitoring, such as chronically implanted EMG electrodes (Holdefer and Miller 2002;
Morrow and Miller 2003) or motion tracking, can enable the exploration of questions that have
been unapproachable until now. Mining large datasets to find the neural correlates of free
behavior may help us to develop new controlled experiments; these datasets are also necessary for
developing and testing neural prosthetics systems with the ability to operate autonomously over
extended periods of time. Similar investigations with EMG recordings are already underway by
other researchers and HermesB can serve as another tool in these types of experiments (Jackson
et al. 2006a,c).
5.5 Summary
HermesB is a new, self-contained, long duration, neural recording system for use with freely
behaving primates. It records dual-channel broadband and 3-axis head acceleration data to a high
density compact flash card. Controlled by simple sequencing programs written by the
experimenter, HermesB can autonomously change recording channel and pause recording during the
experiment. With a single battery charge, HermesB can record for up to 48 hours (at a low duty
cycle). With short breaks to replace the batteries and compact flash card, HermesB can record
nearly continuously for an indefinite period.
The high quality of the broadband recordings, despite being in the electrically noisy
environment of the home cage room (e.g., fluorescent lights), enables results from HermesB to be integrated
into experiments using the traditional laboratory rig. There are a variety of applications for such
a platform. For example, the long duration recordings, in concert with traditional experiments,
enable important multi-day learning and plasticity experiments, an application not explored in detail
in our experiments. Researchers can use HermesB to record during periods when the animal is
outside the laboratory rig to provide continuous monitoring of significant neurons identified during
active experiments. We have also already detailed how HermesB can be useful for investigating
neural stability and correlating different free-behavioral contexts to neural activity.
Recently, there have also been scientific reports involving the pairing of recording and
stimulation. Using a portable system that is worn by the subject over the period of several days, it has
been shown that subpopulations of neurons can be made to induce different behaviors, presumably
due to a reshaping of neural connectivity (Jackson et al. 2006b). HermesB can also be extended
to include stimulation and by doing so can hopefully assist in these types of experiments in the
future.
At present, HermesB is in active use supporting a number of experiments. There is also
ongoing development to increase recording capabilities. As CF technology and battery energy density
improve, recording duration will be expanded. Future generations of HermesB may also
incorporate wireless telemetry, more simultaneous recording channels, EMG recording capabilities, and
stimulation capabilities.
5.6 Credits
The work detailed in this chapter is in preparation for resubmission to a peer-reviewed journal
(Santhanam et al. 2006a). It would not have been possible if not for the support from a number
of individuals. Michael Linderman assisted with the analog front-end development from the very
inception of this project and participated in testing the system, data collection, analyses for
neural stability, and the writing of our co-first authored journal article manuscript. Vikash Gilja
was instrumental in many aspects of the development, data collection, and analyses on behavioral
correlates. Dr. Stephen Ryu was the primary surgeon for electrode implantation. Afsheen Afshar
created mechanical drawings for the sealed aluminum enclosure that houses the electronics of
HermesB. I was responsible for the initial experimental concept and was involved either directly
or in a secondary capacity with all aspects of this work except electrode implantation.
We also thank Shane Guillory of Intragraphix, who designed and laid out the analog module
and laid out the digital module, Jim McCrae of JMC Design, Karlheinz Merkle at the Stanford
Physics Machine Shop, Pascal Stang and Carter Dunn for their help designing and manufacturing
HermesB, Mackenzie Risch for expert veterinary care, Dr. Aris Mendiola for medical consultation,
and Sandra Eisensee for administrative assistance.
This study was supported by NDSEG Fellowships (VG,MDL,GS), NSF Graduate Research
Fellowships (GS), MARCO Center for Circuit & System Solutions (THM,MDL), Medical Scientist
Training Program (AA), Bio-X fellowship (AA), Christopher Reeve Paralysis Foundation (SIR,KVS)
and the following awards to KVS: NSF Center for Neuromorphic Systems Engineering at Caltech,
ONR Adaptive Neural Systems, Whitaker Foundation, Center for Integrated Systems at Stanford,
Sloan Foundation, and Burroughs Wellcome Fund Career Award in the Biomedical Sciences.
Chapter 6
Future Directions
The work presented heretofore represents valuable progress toward the realization of cortically-
controlled prostheses. Furthermore, our research has helped promote several avenues of further
investigation.
For example, one specific study that was begun as a direct result of our BCI experiments was
“optimal target placement.” In our original BCI experiments, our possible reach target locations
were arranged in the patterns shown in Fig. 3.9. Splaying targets out angularly, as opposed to
linearly, recognizes the observation that most PMd neurons modulate their firing rate more for the
upcoming reach direction than for the upcoming reach distance. However, we noticed that perturbing
targets away from high-symmetry locations resulted in minor ITRC improvements and this is
reflected in our reported results from Chapter 3. The performance improvement was partially due
to the tuning properties of the particular neural units recorded from our electrode array. Hence, a
natural question is whether, given the particular neural units at hand, one can choose the possible
target locations to optimize the single-trial accuracy and the ITRC. Cunningham et al. (2006a,b)
investigated this on quasi-simulated neural data using sophisticated convex optimization
techniques. Their work suggests that true optimization of target placement based on neural response
functions could provide further performance improvements beyond what we have accomplished in
Santhanam et al. (2006b). Also, a recent integration of this target optimization technique into a
laboratory experiment has yielded encouraging results (~7% decoding accuracy improvement on
an 8-target task). Further experiments are needed to verify this result more fully.
Prosthetic systems can also help explain how neural circuits function. This, in turn, can
result in better prosthetic algorithms. For example, in designing systems for higher performance,
research can also help shed light on the fundamental speed of neural processing. By presenting
stimuli at an increasingly faster rate, researchers can assess how well neural circuits can cope
with a demand for greater speed. We have recently begun some initial investigations by
employing a communication prosthesis with rapid target decodes. We noted that performance suffered as
the number of decodes that had occurred in rapid succession increased, and some neurons seemed
to gain-modulate their tuning curve peaks as a series of decodes progressed (Kalmar et al. 2005).
While this may be evidence of a fundamental processing speed limit of pre-motor cortex, future
experiments are needed to investigate this further. Classic confounds include potential differences
in attention or reward expectancy as the number of successive decodes increases.
In the reverse, core neuroscience research in the motor regions of the brain will ultimately
allow us to build better models for decoding neural signals for prosthetic systems. For example, in
Chapter 4, we saw how applying a more appropriate model of neural activity in the planning region
of the brain can help us develop more accurate Bayesian algorithms to decode the reach plan itself.
Likewise, the dynamical models (HNLDS) detailed in Section A.3.4 have potential for improving
system performance. HNLDS is an even more intricate framework by which we can model reach
plans. There are efforts underway to verify HNLDS; then these mathematical techniques can be
used to improve prosthetic performance.
The future of cortically-controlled prostheses is bright. Continuing research in non-invasive
technologies is encouraging and will result in practical systems that offer low surgical risk to
disabled patients. Performance of EEG systems has improved such that researchers are considering
using them for 2D motor control. Intra-cortical systems have been receiving attention in recent
years, proof-of-concept devices have been demonstrated, and our research has helped measurably
advance the field. Further research in this domain should yield even higher performance systems.
Although numerous scientific and technical challenges remain to be solved (e.g., ranging from a
better understanding of cortex to issues specific to the physical recording apparatus), we are
optimistic that continued progress is likely. Further innovation will hopefully yield prostheses that
can help debilitated patients interact with the world in effective ways.
Chapter 7
Publications
7.1 Journal Articles
SANTHANAM G, RYU SI, YU BM, AFSHAR A, AND SHENOY KV (July 2006). A high-performance
brain-computer interface. Nature, 442(7099), 195-198. doi:10.1038/nature04968.
SANTHANAM G, LINDERMAN MD, GILJA V, AFSHAR A, RYU SI, MENG TH, AND SHENOY KV
(December 2006). HermesB: A continuous neural recording system for freely behaving primates.
In preparation for resubmission to IEEE Transactions on Biomedical Engineering.
CHURCHLAND MM, SANTHANAM G, AND SHENOY KV (July 2006). Preparatory activity in premotor
and motor cortex reflects the speed of the upcoming reach. Journal of Neurophysiology.
doi:10.1152/jn.00307.2006.
CHURCHLAND MM, YU BM, RYU SI, SANTHANAM G, AND SHENOY KV (April 2006). Neural
variability in premotor cortex provides a signature of motor preparation. Journal of Neuroscience,
26(14), 3697-3712. doi:10.1523/JNEUROSCI.3762-05.2006.
YU BM, AFSHAR A, SANTHANAM G, RYU SI, SHENOY KV, AND SAHANI M (January 2006).
Extracting dynamical structure embedded in neural activity. In Y Weiss, B Schölkopf, and J Platt
(Eds.) Advances in Neural Information Processing Systems 18, pages 1545-1552. MIT Press,
Cambridge, MA.
YU BM, KEMERE C, SANTHANAM G, AFSHAR A, RYU SI, MENG TH, SAHANI M, AND SHENOY
KV (December 2006). Mixture of trajectory models for neural decoding of goal-directed
movements. In preparation for resubmission to Journal of Neurophysiology Innovative Methodology.
BATISTA AP, YU BM, SANTHANAM G, RYU SI, AFSHAR A, AND SHENOY KV (December 2006). A
direct comparison of eye-centered and limb-centered reference frames for reach planning in the
dorsal aspect of the premotor cortex. In preparation for resubmission to Journal of
Neurophysiology.
ZUMSTEG ZS, KEMERE C, O’DRISCOLL S, SANTHANAM G, AHMED RE, SHENOY KV, AND MENG
TH (2005). Power feasibility of implantable digital spike sorting circuits for neural prosthetic
systems. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 13(3), 272-279.
doi:10.1109/TNSRE.2005.854307.
7.2 Conference Talks, Articles, Abstracts
7.2.1 2006
BATISTA AP, YU BM, SANTHANAM G, RYU SI, AFSHAR A, AND SHENOY KV (October 2006).
Influence of eye position on end-point decoding accuracy in dorsal-premotor cortex. In Society
for Neuroscience Abstract Viewer and Itinerary Planner, 148.8. Atlanta, GA. Poster presentation.
CHESTER CA, BATISTA AP, YU BM, SANTHANAM G, RYU SI, AFSHAR A, AND SHENOY KV
(October 2006). The relationship between PMd neural activity and reaching behavior is stable
in highly trained macaques. In Society for Neuroscience Abstract Viewer and Itinerary Planner,
148.5. Atlanta, GA. Poster presentation.
GILJA V, LINDERMAN MD, SANTHANAM G, AFSHAR A, RYU SI, MENG TH, AND SHENOY KV
(October 2006). Multiday electrophysiological recordings from freely behaving primates using
an autonomous, multi-channel neural system. In Society for Neuroscience Abstract Viewer and
Itinerary Planner, 148.19. Atlanta, GA. Poster presentation.
LINDERMAN MD, GILJA V, SANTHANAM G, AFSHAR A, RYU SI, MENG TH, AND SHENOY KV
(October 2006). Neural recording stability of chronic electrode arrays in freely behaving
primates. In Society for Neuroscience Abstract Viewer and Itinerary Planner, 13.7. Atlanta, GA.
Slide presentation.
KEMERE C, BATISTA AP, YU BM, SANTHANAM G, RYU SI, AFSHAR A, AND SHENOY KV (October
2006). Hidden Markov models for spatial and temporal estimation for prosthetic control. In
Society for Neuroscience Abstract Viewer and Itinerary Planner, 256.17. Atlanta, GA. Poster
presentation.
SHENOY KV, SANTHANAM G, RYU SI, AFSHAR A, YU BM, GILJA V, LINDERMAN MD, KALMAR
RS, CUNNINGHAM JP, KEMERE CT, BATISTA AP, CHURCHLAND MM, AND MENG TH (September
2006). Increasing the performance of cortically-controlled prostheses. In Proceedings of the
28th Annual International Conference of the IEEE EMBS. New York, NY. Invited talk.
L in d e r m a n M D , G il j a V, Sa n t h a n a m G, A f s h a r A, R y u SI, M e n g TH, a n d S h e n o y KV
(September 2006). An autonomous, broadband, multi-channel neural recording system for freely
behaving primates. In Proceedings o f the 28th Annual International Conference o f the IEEE
EMBS, ThBP8.7. New York, NY. Poster presentation.
G il j a V, L in d e r m a n MD, S a n t h a n a m G , A f s h a r A , R y u SI, M e n g T H , a n d S h e n o y KV
(September 2006). Multiday electrophysiological recordings from freely behaving primates. In
Proceedings o f the 28th Annual International Conference o f the IEEE EM BS, SaD08.3. New York,
NY. Slide presentation.
Linderman MD, Gilja V, Santhanam G, Afshar A, Ryu SI, Meng TH, and Shenoy KV
(September 2006). Neural recording stability of chronic electrode arrays in freely behaving
primates. In Proceedings of the 28th Annual International Conference of the IEEE EMBS, SaD08.4.
New York, NY. Slide presentation.
Batista AP, Yu BM, Santhanam G, Ryu SI, Afshar A, and Shenoy KV (May 2006).
Heterogeneous reference frames for reaching in macaque PMd. In 16th Annual Meeting of the Neural
Control Movement Society, F-12. Key Biscayne, FL. Poster presentation.
7.2.2 2005
Santhanam G, Ryu SI, Yu BM, Afshar A, and Shenoy KV (November 2005). Intra-cortical
communication prosthesis design. In Society for Neuroscience Abstract Viewer and Itinerary
Planner, 519.19. Washington, DC. Poster presentation.
Santhanam G, Ryu SI, Yu BM, Afshar A, and Shenoy KV (March 2005). A high performance
neurally-controlled cursor positioning system. In Proceedings of the 2nd International
IEEE EMBS Conference on Neural Engineering, 5.1.2-6, pages 494-500. Arlington, VA. Slide
presentation.
Afshar A, Achtman N, Santhanam G, Ryu SI, Yu BM, and Shenoy KV (November 2005).
Free-paced target estimation in a delayed-reach task. In Society for Neuroscience Abstract
Viewer and Itinerary Planner, 401.13. Washington, DC. Poster presentation.
Batista AP, Yu BM, Santhanam G, Ryu SI, Afshar A, and Shenoy KV (November 2005).
Heterogeneous coordinate frames for reaching in macaque PMd. In Society for Neuroscience
Abstract Viewer and Itinerary Planner, 363.12. Washington, DC. Slide presentation.
Gilja V, Kalmar RS, Santhanam G, Ryu SI, Yu BM, Afshar A, and Shenoy KV (November
2005). Trial-by-trial mean normalization improves plan period reach target decoding. In
Society for Neuroscience Abstract Viewer and Itinerary Planner, 519.18. Washington, DC. Poster
presentation.
Kalmar RS, Gilja V, Santhanam G, Ryu SI, Yu BM, Afshar A, and Shenoy KV (November
2005). PMd delay activity during rapid sequential movement plans. In Society for Neuroscience
Abstract Viewer and Itinerary Planner, 519.17. Washington, DC. Poster presentation.
Sahani M, Yu BM, Afshar A, Santhanam G, Ryu SI, and Shenoy KV (November 2005).
Extracting dynamical structure embedded in neural activity. In Society for Neuroscience Abstract
Viewer and Itinerary Planner, 689.14. Washington, DC. Poster presentation.
Yu BM, Kemere C, Santhanam G, Afshar A, Ryu SI, Meng TH, Sahani M, and Shenoy
KV (November 2005). Mixture of trajectory models for neural decoding of goal-directed
movements. In Society for Neuroscience Abstract Viewer and Itinerary Planner, 520.18. Washington,
DC. Poster presentation.
Churchland MM, Yu BM, Ryu SI, Santhanam G, and Shenoy KV (April 2005). Motor
preparation and settling activity in PMd. In 15th Annual Meeting of the Neural Control
Movement Society, E-13. Key Biscayne, FL. Poster presentation.
Churchland MM, Yu BM, Ryu SI, Santhanam G, and Shenoy KV (March 2005). Neural
variability in premotor cortex provides a signature of motor preparation. In Computational and
Systems Neuroscience 2005, 13, page 26. Salt Lake City, UT. Oral and poster presentation.
Yu BM, Afshar A, Santhanam G, Ryu SI, Shenoy KV, and Sahani M (March 2005).
Extracting dynamical structure embedded in motor preparatory activity. In Computational and
Systems Neuroscience 2005, 290, page 303. Salt Lake City, UT. Poster presentation.
Yu BM, Santhanam G, Ryu SI, and Shenoy KV (March 2005). Feedback-directed state
transition for recursive Bayesian estimation of goal-directed trajectories. In Computational and
Systems Neuroscience 2005, 291, page 304. Salt Lake City, UT. Poster presentation.
7.2.3 2004
Santhanam G, Ryu SI, Yu BM, and Shenoy KV (October 2004). High information
transmission rates in a neural prosthetic system. In Society for Neuroscience Abstract Viewer and
Itinerary Planner, 263.2. San Diego, CA. Slide presentation.
Santhanam G, Sahani M, Ryu SI, and Shenoy KV (September 2004). An extensible
infrastructure for fully automated spike sorting during online experiments. In Proceedings of the 26th
Annual International Conference of the IEEE EMBS, volume 6, pages 4380-4384. San Francisco,
CA. doi:10.1109/IEMBS.2004.1404219.
Churchland MM, Yu BM, Ryu SI, Santhanam G, and Shenoy KV (December 2004).
Settling recurrent networks underlie motor planning in the primate brain. In PS Churchland and
T Sejnowski (Eds.) Neural Information Processing post-Conference Workshop – The Neurobiology
of Planning and Deciding: Studies from Many Levels of Brain Organization. Whistler, BC,
Canada. Invited talk.
Kemere C, Santhanam G, Ryu SI, Yu BM, Meng TH, and Shenoy KV (November 2004).
Reconstruction of arm trajectories from plan and peri-movement motor cortical activity. In Neural
Interfaces Workshop 2004. National Institutes of Health, Bethesda, MD. Poster presentation.
Batista AP, Yu BM, Santhanam G, Ryu SI, and Shenoy KV (October 2004). Coordinate
frames for reaching in macaque dorsal premotor cortex (PMd). In Society for Neuroscience
Abstract Viewer and Itinerary Planner, 191.7. San Diego, CA. Poster presentation.
Churchland MM, Yu BM, Ryu SI, Santhanam G, and Shenoy KV (October 2004). Time-
course of PMd processing predicts reaction time. In Society for Neuroscience Abstract Viewer
and Itinerary Planner, 603.5. San Diego, CA. Slide presentation.
Kemere CT, Santhanam G, Ryu SI, Yu BM, Meng TH, and Shenoy KV (October 2004).
Reconstruction of arm trajectories from plan and peri-movement motor cortical activity. In
Society for Neuroscience Abstract Viewer and Itinerary Planner, 884.12. San Diego, CA. Poster
presentation.
Ryu SI, Santhanam G, Yu BM, and Shenoy KV (October 2004). High speed neural prosthetic
icon positioning. In Society for Neuroscience Abstract Viewer and Itinerary Planner, 263.1. San
Diego, CA. Slide presentation.
Yu BM, Ryu SI, Santhanam G, Churchland MM, and Shenoy KV (October 2004).
Improving neural prosthetic system performance by combining plan and peri-movement activity. In
Society for Neuroscience Abstract Viewer and Itinerary Planner, 884.11. San Diego, CA. Poster
presentation.
Churchland MM, Yu BM, Ryu SI, Santhanam G, and Shenoy KV (October 2004). Role of
movement preparation in movement generation. In R Shadmehr and E Todorov (Eds.) Advances
in Computational Motor Control III, Symposium at the Society for Neuroscience Meeting. San
Diego, CA. Contributed Talk.
Ryu SI, Santhanam G, Yu BM, and Shenoy KV (October 2004). The speed at which reach
movement plans can be decoded from the cortex and its implications for high performance neural
prosthetic arm systems. In 54th Annual Meeting Congress of Neurological Surgeons, 785. San
Francisco, CA. Oral presentation.
Harrison RR, Santhanam G, and Shenoy KV (September 2004). Local field potential
measurement with low-power analog integrated circuit. In Proceedings of the 26th Annual
International Conference of the IEEE EMBS, volume 6, pages 4067-4070. San Francisco, CA.
Kemere C, Santhanam G, Yu BM, Ryu SI, Meng TH, and Shenoy KV (September 2004).
Model-based decoding of reaching movements for prosthetic systems. In Proceedings of the 26th
Annual International Conference of the IEEE EMBS, volume 6, pages 4524-4528. San Francisco,
CA. doi:10.1109/IEMBS.2004.1404256.
Watkins PT, Santhanam G, Shenoy KV, and Harrison RR (September 2004). Validation
of adaptive threshold spike detector for neural recording. In Proceedings of the 26th Annual
International Conference of the IEEE EMBS, volume 6, pages 4079-4082. San Francisco, CA.
Yu BM, Ryu SI, Santhanam G, Churchland MM, and Shenoy KV (September 2004).
Improving neural prosthetic system performance by combining plan and peri-movement activity.
In Proceedings of the 26th Annual International Conference of the IEEE EMBS, volume 6, pages
4516-4519. San Francisco, CA. doi:10.1109/IEMBS.2004.1404254.
Zumsteg ZS, Ahmed RE, Santhanam G, Shenoy KV, and Meng TH (September 2004).
Power feasibility of implantable digital spike-sorting circuits for neural prosthetic systems. In
Proceedings of the 26th Annual International Conference of the IEEE EMBS, volume 6, pages
4237-4240. San Francisco, CA. doi:10.1109/IEMBS.2004.1404181.
7.2.4 2003
Santhanam G, Churchland MM, Sahani M, and Shenoy KV (November 2003). Local field
potential activity varies with reach distance, direction, and speed in monkey pre-motor cortex.
In Society for Neuroscience Abstract Viewer and Itinerary Planner, 918.1. New Orleans, LA.
Poster presentation.
Santhanam G and Shenoy KV (March 2003). Methods for estimating neural step sequences in
neural prosthetic applications. In Proceedings of the 1st International IEEE EMBS Conference
on Neural Engineering, 5.3.4-7, pages 344-347. Capri, Italy. Poster presentation.
Shenoy KV, Churchland MM, Santhanam G, Yu BM, and Ryu SI (September 2003).
Influence of movement speed on plan activity in monkey pre-motor cortex and implications for high-
performance neural prosthetic systems design. In Proceedings of the 25th Annual International
Conference of the IEEE EMBS, 6.1.1-3, pages 1897-1900. Cancun, Mexico. Invited talk.
7.2.5 2002
Kemere CT, Santhanam G, Yu BM, Shenoy KV, and Meng TH (October 2002). Decoding
of plan and peri-movement neural signals in prosthetic systems. In IEEE Workshop on Signal
Processing Systems (SIPS’02), pages 276-283. San Diego, CA.
Appendix A
Select Collaborations
This appendix will briefly outline some notable developments by other researchers in the lab. I
provided a measurable amount of support for these projects. Although much of the work described
below involves basic neuroscience, these results can help us better understand the brain’s motor
systems and thereby improve neural prosthetic performance in the long-term. For fuller
descriptions of each study, please refer to the referenced literature.¹
A.1 Speed Tuning in PMd
A.1.1 Motivation
In understanding how movements are prepared, it seems important that we determine which
reference frames describe the neural responses at each temporal, anatomical, and functional stage.
(By reference frame we simply mean a low-dimensional set of variables, spatial or otherwise, upon
which neural activity is posited to depend in some straightforward fashion.) Such knowledge
should also have immediate practical significance, given recent efforts to guide motor prostheses
using preparatory activity (Musallam et al. 2004; Santhanam et al. 2006b; Shenoy et al. 2003). It
is often assumed that reach preparation occurs in a predominantly spatial reference frame (e.g.,
van Beers et al. 2004). In support, preparatory activity in PMd is tuned for target direction and
distance (Kurata 1993; Messier and Kalaska 2000; Riehle and Requin 1989), and is more closely
tethered to the visuo-spatial location of the target than to the direction of the reach (Shen and
Alexander 1997). Recent work has asked whether the relevant spatial reference frame translates
with the hand, eye, or both (Cisek and Kalaska 2002; Pesaran et al. 2006). Yet some results suggest
that PMd/M1 preparatory activity might not obey a simple spatial reference frame. PMd activity
can depend on factors other than target location, including the type of grasp (Godschalk et al.
1985), the required accuracy (Gomez et al. 2000), reach curvature (Hocherman and Wise 1991),
and (to some degree) force (Riehle et al. 1994).
¹ Most of the text that follows was taken verbatim from journal articles or manuscripts on which I share authorship, but on which I am not the primary author.
Our goal was to determine whether preparatory activity in PMd and M1 reflects a non-spatial
aspect of the upcoming reach: its speed, instructed by target color. This work has been published
in a peer-reviewed journal (Churchland et al. 2006a).
A.1.2 Results
Two monkeys were trained to reach at different speeds, with green and red targets instructing
“slow” and “fast” reaches. Monkeys performed the task well. Even “slow” reaches to green targets
were fairly swift, with durations of 150-300 ms depending on target distance. “Fast” reaches to
red targets were swifter still, with durations of 100-200 ms. Their success rates were high and
would take practice for a human to equal.
We consider the 95 neurons for which we obtained a “direction” series (7 directions × 2
distances × 2 speeds). For each neuron and each condition (i.e., target-location / instructed-speed;
28 total conditions) we computed the mean delay-period firing rate. The mean number of
trials/condition was 14. Figure A.1 plots the delay-period firing rate versus direction for several
example neurons. Red and green traces correspond to red (fast) and green (slow) targets. Dashed
and solid traces correspond to near (7 cm) and far (12 cm) targets.
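The per-condition averaging described above can be sketched as follows. This is a minimal illustration rather than the analysis code used in the study: the function name `condition_means`, the tuple layout of a trial, and the synthetic firing rates are all assumptions made for the example.

```python
import random
from collections import defaultdict
from statistics import mean

def condition_means(trials):
    """Average delay-period firing rate per condition.

    trials: iterable of (direction, distance_cm, speed, rate) tuples,
    a hypothetical layout chosen for this sketch.
    """
    grouped = defaultdict(list)
    for direction, distance_cm, speed, rate in trials:
        grouped[(direction, distance_cm, speed)].append(rate)
    return {cond: mean(rates) for cond, rates in grouped.items()}

# Synthetic trials (made-up rates, not recorded data):
# 7 directions x 2 distances x 2 speeds, 14 trials per condition.
rng = random.Random(0)
trials = [
    (direction, distance_cm, speed, max(0.0, rng.gauss(20.0, 5.0)))
    for direction in range(7)
    for distance_cm in (7, 12)
    for speed in ("slow", "fast")
    for _ in range(14)
]
means = condition_means(trials)
print(len(means))  # 28 conditions
```

Keying the result on the full (direction, distance, speed) tuple keeps each of the 28 conditions separate, so interactions between the three factors remain visible in later comparisons.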
The examples in Figure A.1 illustrate a number of features typical of recorded responses. First,
delay-period activity often showed a large influence of instructed speed, in addition to the
previously known influence of target direction and distance. Second, interactions between the effects
of direction, distance and speed were common. For example, for the neuron shown in the bottom
panels, speed had an effect primarily for near targets. Third, while direction tuning was typically
robust, it was not always invariant. Preferred directions are similar (outer arcs whose arc lengths
denote ±1 SE) but not identical across the different distances and instructed-speeds.
Our primary new finding is that the instructed speed has a large influence on delay-period
responses. Of tuned neurons, 74% showed a significant main effect of speed, while 94% showed some
effect (main or interaction) involving speed. Firing rates could be higher before instructed-fast
reaches (e.g., A19, A29), or before instructed-slow reaches (e.g., A01, B114). Considering each
direction/distance combination separately (a total of 95 × 7 × 2 comparisons), 61% (39%) of significant
effects involved a preference for fast (slow) reaches. Thus, there was an overall tendency for the
“fast” instructed-speed to evoke higher firing rates, but the opposite effect was not uncommon. It
was also not uncommon for a neuron to prefer far targets and the slower instructed speed (e.g.,
A01, A06) or to prefer near targets and the faster instructed speed (e.g., A19).
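The tally of fast versus slow preferences can be illustrated with a short sketch. The text above does not state which statistical test was applied to each direction/distance combination, so the permutation test on the difference of means used here is an assumption, as are the function names, the 0.05 threshold, and the made-up firing rates.

```python
import random
from statistics import mean

def permutation_pvalue(a, b, n_perm=2000, seed=0):
    """Two-sided permutation test on the difference of means (assumed test)."""
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value above zero.
    return (hits + 1) / (n_perm + 1)

def speed_preference(fast_rates, slow_rates, alpha=0.05):
    """Return 'fast', 'slow', or None when the difference is not significant."""
    if permutation_pvalue(fast_rates, slow_rates) >= alpha:
        return None
    return "fast" if mean(fast_rates) > mean(slow_rates) else "slow"

def tally_preferences(comparisons, alpha=0.05):
    """Fraction of significant comparisons preferring each instructed speed."""
    prefs = [speed_preference(f, s, alpha) for f, s in comparisons]
    significant = [p for p in prefs if p is not None]
    if not significant:
        return {"fast": 0.0, "slow": 0.0}
    frac_fast = sum(p == "fast" for p in significant) / len(significant)
    return {"fast": frac_fast, "slow": 1.0 - frac_fast}

# Made-up firing rates (spikes/s) for three direction/distance combinations.
high = [30, 31, 29, 32, 30, 31, 29]
low = [10, 11, 9, 12, 10, 11, 9]
comparisons = [(high, low), (low, high), (low, low)]
fractions = tally_preferences(comparisons)
```

In this toy input one comparison prefers fast, one prefers slow, and one is not significant, so only the two significant comparisons enter the fractions. The same bookkeeping applied to all 95 × 7 × 2 comparisons would yield percentages of the kind reported above.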
Figure A.1: Responses of twelve example neurons, illustrating the range of observed responses. Each subpanel shows a polar plot of delay-period firing rate versus target direction. Error bars on each symbol plot the SE across trials. Arcs at the outside of the plot show, for each condition, the preferred direction ±1 SE. The black circle at center shows baseline firing rate (mean over the 300 ms preceding target onset). Neuron identities are given at the top of each panel. Labels (in spikes/s) indicate the scale provided by the outer gray circle.
A.1.3 Significance
The current results demonstrate that delay-period activity robustly reflects non-spatial aspects of
how the reach is to be executed. (By non-spatial we mean influenced by something other than the
spatial location of the target/reach-trajectory. The influence of instructed speed could of course be
due to different activation patterns of the muscles, which are certainly distributed in space.) While
theoretical and behavioral studies have often assumed that motor planning is primarily spatial,
finding non-spatial features represented in preparatory activity is not surprising. If preparatory
activity is part of a causal chain that will eventually generate movement, then presumably all
aspects of the movement must be reflected (at least implicitly) in that activity.
The discovered mapping from preparatory activity to behavior might not conform to any simple
representational framework. This would be consistent with our experimental observations, which
revealed considerable heterogeneity in tuning across neurons, and failed to reveal a simple set
of parameters that yielded invariant tuning. Of course, we may simply not be plotting our data
against the right movement parameters. Perhaps there is a straightforward relationship between
PMd preparatory activity and pending muscle activity (certainly both show preferred direction,
or PD, rotations). Still, it is important to at least consider the possibility that no fundamental
reference frame exists: a lack of invariant tuning for any of the tested parameters, together with
a high degree of heterogeneity across neurons, calls into question the idea that preparatory activity
obeys any clear reference frame. This skepticism is also put forth in an independent focus piece (Cisek
2006) written in response to Churchland et al. (2006a).
A number of prior results also suggest the absence of a fundamental reference frame. The
principal finding of Shen and Alexander (1997) was that delay-period direction tuning in PMd was
more closely tied to the visual location of the target than to the direction of the actual impending
reach. Yet, both clearly had an effect, arguing that the operative reference frame is neither
extrinsic nor intrinsic. The findings of Scott and Kalaska (1997), Scott et al. (1997), and Kakei
et al. (1999) make a similar point regarding movement-related activity. The PDs of M1 and PMd
neurons rotated with arm posture, but not in ways adequately captured by either intrinsic or
extrinsic reference frames. From a computational standpoint, such properties are not necessarily
problematic, and may even confer advantages (Deneve et al. 2001; Pouget et al. 2002; Zipser and
Andersen 1988).
Ultimately, these aforementioned results provide novel experimental data that should inform
researchers in efforts to discover a unifying model of the motor system. Naturally, a better model
of the motor system will promote higher performance neural prosthetic systems.
A.2 Reference Frames in PMd
A.2.1 Motivation
When we reach out to grasp an object we see, our brain must rapidly determine an appropriate
pattern of muscular contractions that will bring the hand to the object. At the heart of visually-
guided reaching is a reference frame transformation, from the initial retinal representation of
an object’s location to the required pattern of muscular contractions. A network of cortical areas
between the parietal and frontal lobes is thought to subserve the reference frame transformation
for reaching (reviewed in Boussaoud and Bremmer 1999; Caminiti et al. 1996). An important
node in this network is the dorsal aspect of the pre-motor cortex (PMd). This area receives input
signals related to vision, limb posture, and motor planning from the parietal lobe (Johnson et al.
1996), and in turn, PMd projects both directly to the spinal cord (Dum and Strick 1991; Galea
and Darian-Smith 1994), and also to the primary motor cortex (Matsumura and Kubota 1979), the
cortical region thought to be chiefly involved in the control of reaching.
As we have already cited in the previous (and closely related) section, many studies have
explored the role of PMd in the planning and performance of visually-guided reaches. An important
open question is whether reach planning activity in PMd encodes reach goals in an eye-centered
or a limb-centered reference frame. The medial bank of the intraparietal sulcus (MIP) projects
monosynaptically to PMd (Tanne-Gariepy et al. 2002). Neurons in area MIP represent reach plans
in eye-centered coordinates (Baker et al. 1999; Batista et al. 1999; Medendorp et al. 2003). Limb-
centered reference frames for reaching have been reported in PMd (Caminiti et al. 1991; Cisek and
Kalaska 2002). On the strength of this evidence, it appears a complete transformation from an eye-
centered to a limb-centered reference frame might occur between MIP and PMd. In contrast, other
reports have indicated that PMd neurons are influenced by the direction of gaze (Boussaoud et al.
1998; Boussaoud 1995) or sensory location of reach targets (Shen and Alexander 1997), suggesting
that the transformation to a limb-centered reference frame may still be incomplete at the level of
PMd.
To explore this important unresolved issue, we sought to directly compare the relative degree
of eye-centered spatial coding and limb-centered spatial coding in PMd neurons. This work has
appeared as abstracts (Batista et al. 2004, 2005) and is currently in manuscript form (Batista et al.
2006).
A.2.2 Results
Two monkeys were trained over several months to perform a delayed-reach task as described in
Chapter 3. Data were then collected while monkeys performed a version of the delayed reach task
called the reference frame task. This task was designed to independently assess the effects on
neural activity of target position relative to the eyes and the hand (see Batista et al. 1999). The
reference frame task is a simple extension of the delayed reach task in which the initial eye and
hand position, and target location, are varied for each trial. Four different start conditions were
used. In two of them, the eye position is the same, while the initial hand position is different. Thus,
a given target is at the same location in eye-centered coordinates between the two conditions, but is
at different locations relative to the hand and arm. We call these the “eye-aligned configurations.”
In the other two conditions, the initial hand position is the same, but the fixation point differs. This
manipulation altered the locations of the targets in eye-centered coordinates, while maintaining
them in hand-centered coordinates. We call these the “hand-aligned configurations.” For each of
the four start configurations, reaches were instructed to targets at the same locations (relative to
the screen). All targets were presented above the initial eye and hand position.
The design of the reference frame task allowed us to independently measure the effect of
altering the position of the reach goal relative to the arm, or relative to the eyes. A neuron that is
insensitive to the hand-centered location of the reach target should show a high degree of
similarity between the firing rates observed in the two eye-aligned configurations. Such a similarity
would rule out the possibility that the neuron uses a limb-centered reference frame for encoding
reach goals, but it leaves open the possibility that the cell uses an eye-centered reference frame.
Similarly, a neuron that is insensitive to the eye-centered location of the reach target would show a
high degree of similarity between the firing rates measured in the two hand-aligned configurations.
Such a similarity would allow us to rule out the possibility that the neuron uses an eye-centered
reference frame, while still leaving open the possibility that the cell uses a limb-centered reference
frame. Our primary analysis in this study reflects this logic: for each PMd neuron, we attempted
to independently rule out the possibilities that the cell uses an eye-centered or a limb-centered
reference frame.
Figure A.2 shows four neurons with very different reference frame properties. The 5x2 grids show
the average firing rate during the delay period (computed over the 500 ms epoch extending from
250 ms after the appearance of the reach target) for each of the 10 target locations in the two
eye-aligned and two hand-aligned configurations. Panel A.2a illustrates a hand-centered cell. For
this neuron, the top two activity maps show a greater similarity than do the bottom two panels,
indicating that the cell is relatively insensitive to the eye-centered location of the targets.
Furthermore, the bottom two activity maps in each panel show that the response field of this neuron tends
to move along with the hand; this cell encodes target locations using an extrinsic limb-centered
reference frame (perhaps centered on the hand). Panel A.2b depicts a neuron that is eye-centered.
The bottom two activity maps (where the hand position changes, but the eye position is the same)
are more similar to each other than are the top two activity maps (where the hand position is the
same, but the eye position changes). Hence, we posit that this cell encodes target locations in a
retinotopic reference frame.
Panels A.2a and A.2b illustrate that at least some PMd neurons employ a reference frame
for reach planning that can be reasonably well-characterized as eye-centered or hand-centered.
[Figure A.2, panels a-d; unit label shown in the figure: H20041231.40.2]
Figure A.2: Each panel depicts one neuron. Within each panel are four activity maps, corresponding to the four different start configurations (indicated by hand and eye icons). Each activity map shows averaged neural response during the delay epoch for reaches to the ten targets. White indicates the highest firing rate for that neuron, while black is the lowest.
However, many of the neurons we observed in PMd encode locations in more complex reference
frames. For example, the neuron in Panel A.2c is more active when the eyes are directed to the left
of the hand (the second and third activity maps). Furthermore, the response field of the cell moves
such that it remains at a fixed location relative to the combined position of the eyes and hand.
There were also a few neurons that did not move when either the direction of gaze or the initial
hand position was varied. Panel A.2d is the clearest example of such a cell. This neuron is active
preceding reaches to the rightward set of targets, no matter where on the retina these targets fall,
or the trajectory of the reach needed to acquire them. Surprisingly, the neurons that are shown in
Fig. A.2 were somewhat exceptional in our PMd population in how they appear to use a reference
frame that can be described easily. Many other neurons resisted categorization. They exhibited
complex spatial timing, with no discernible regularities across the different configurations of the
eyes and hand in which we tested the cells.
Finally, when considering the PMd population as a whole, we first compared the degree of
influence on PMd neurons of the eye-centered location of the target and the hand-centered location
of the target, using an ANOVA. Changes in the eye-centered location of the targets (induced by
changing the starting eye position) significantly affected 58% of PMd neurons, almost as many
as did changing the target location relative to the arm (by changing the starting hand posture;
76% of cells). We also compared the spatial coding schemes used by PMd neurons: do cells code
reach goals in a more eye-centered or more hand-centered reference frame? We used a simple
distance metric to compare the dissimilarity between the two eye-aligned configurations and the
dissimilarity between the two hand-aligned configurations. Out of 79 neurons across two monkeys,
51 cells (65%) use a more limb-centered than eye-centered reference frame. This leaves 35% of
these PMd neurons to be classified as apparently more eye-centered than hand-centered.
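The comparison of dissimilarities can be sketched in a few lines. The text says only that a "simple distance metric" was used, so the Euclidean distance below is an assumption, as are the function names, the tie-breaking rule, and the made-up activity maps (ten targets each).

```python
import math

def dissimilarity(map_a, map_b):
    """Euclidean distance between two activity maps (assumed metric)."""
    if len(map_a) != len(map_b):
        raise ValueError("activity maps must cover the same targets")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(map_a, map_b)))

def classify_reference_frame(eye_aligned, hand_aligned):
    """Label a neuron by which pair of configurations yields more similar maps.

    eye_aligned: two maps with the same eye position but different hand positions.
    hand_aligned: two maps with the same hand position but different eye positions.
    """
    d_eye = dissimilarity(*eye_aligned)
    d_hand = dissimilarity(*hand_aligned)
    # Similar hand-aligned maps mean the response follows the hand, not the eye.
    # Ties are labeled eye-centered here, an arbitrary choice for this sketch.
    return "limb-centered" if d_hand < d_eye else "eye-centered"

# Made-up mean firing rates (spikes/s) for the ten targets in each configuration.
base_map = [5, 9, 14, 9, 5, 3, 6, 9, 6, 3]
shifted_map = [14, 9, 5, 3, 2, 9, 6, 3, 2, 1]
near_copy = [5, 10, 13, 9, 5, 3, 6, 9, 7, 3]

label = classify_reference_frame(
    eye_aligned=(base_map, shifted_map),   # maps change when the hand moves
    hand_aligned=(base_map, near_copy),    # maps barely change when the eyes move
)
```

Under these assumptions the example neuron is labeled limb-centered, because its maps stay nearly constant across the two hand-aligned configurations. Counting such labels across a population gives proportions analogous to the 65% and 35% reported above.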
A.2.3 Significance
The chief finding of this study is that the eye-centered location of the reach goal strongly
influences motor planning activity in the dorsal aspect of the pre-motor cortex (PMd). It is perhaps
surprising to find a strong influence of the retinal location of the reach target this far along the
pathway for the processing of visually-guided reaching. PMd projects to the spinal cord, and to the
primary motor cortex, and is intimately involved in controlling arm movements (Churchland and
Shenoy 2006). Why should neurons this integral to motor planning still carry a signal of the target
location in sensory coordinates? Two categories of explanation exist. It could be that the retinal
information about reach endpoint is still important, even at the advanced stage of movement
planning occupied by PMd; our finding of retinal location information in PMd might be evidence for
a rich, multipotential spatial coding strategy in the area. Alternatively, of course, this retinal
information could be unimportant: a residue from the initial cortical representation of the reach
endpoint. Perhaps all cortical output stages are influenced by the retinal locations of reach goals,
with the final conversion to limb coordinates actualized only just prior to the reach itself (Zipser
and Andersen 1988). Another possibility is that residual eye influences simply average away in
the spinal cord or at the motoneurons and there is no need for cortex to fully eradicate the eye
signals.
Nonetheless, the presence of eye-influenced neural activity in PMd (and perhaps even M1 if one
actually rigorously pursued that question) has large implications for neural prosthetic systems.
These eye-position-related correlations may cause confounds in the decoding of motor intention.
For example, in the BCI experiments of Chapter 3, we controlled for eye-modulation by fixing the
gaze for each trial. It is quite possible that we may not have achieved so high a performance
if the animal were allowed to freely gaze during trials, as this could add unexplained “noise” to
our neural models. Moreover, should eye position also influence peri-movement activity, which
appears to be the case in our data (unpublished observations), then current performance of peri-
movement-based, continuous cursor prostheses may be artificially limited as well. Eventually,
more robust systems will have to contend with free gaze.
A.3 Mechanisms of Motor Planning
A.3.1 Motivation
In the past two sections, we have described experiments that illustrate how the neurons in the
motor planning region of the brain exhibit complex patterns. These neural responses do not fit
into a simple framework. This raises the question of how we might try to understand the more
mechanistic aspects of motor planning.
A first step in this direction is to focus on the time course of motor preparation. Reaction
times (RTs) (from the go cue until movement onset) are shorter when delays are longer, suggesting
that some time-consuming preparatory process is given a head start by the delay (Riehle and
Requin 1989; Crammond and Kalaska 2000). Thus the progress of a developing motor plan is
likely reflected in the neural activity. Perhaps activity must rise above a threshold to trigger the
movement, as seems likely for saccadic eye movements (Hanes and Schall 1996). An instructed
delay could allow activity to approach threshold and shorten the subsequent RT. Supporting this
“rise-to-threshold” hypothesis, higher firing rates are often associated with shorter RTs (Riehle
and Requin 1993; Bastian et al. 2003), although Crammond and Kalaska (2000) found that peak
firing rates following the go cue (when the movement is presumably triggered) were on average
lower following an instructed delay.
An alternate hypothesis, illustrated in Fig. A.3, assumes that the movement produced is a
function of the state of preparatory activity at the time some trigger is applied. For each possible
movement, there would be an “optimal” subspace of firing rates, appropriate to generate a
sufficiently accurate movement. Motor preparation might therefore be an optimization: bringing firing
rates from their initial state to the appropriate subspace. Activity might drift somewhat while
waiting to execute, but motor preparation would remain “complete” so long as firing rates remain
within the optimal subspace. Is there evidence that the brain actively attempts to bring firing
rates to that subspace? Is some penalty paid, perhaps a longer RT, if firing rates are elsewhere?
We show that these questions can be addressed by measuring the variability of firing rates. This
work has been published in a peer-reviewed journal (Churchland et al. 2006b).
A.3.2 Results
Many of our analyses rely on the measurement of neural variability, across trials of the same type,
made as a function of time. A central assumption of this approach is that the measured variability
is attributable to both cell-intrinsic variability in spike production and to “true” variability in the
underlying firing rate on each trial. Our goal was to isolate the latter, as best as possible, by
normalizing with respect to the estimated contribution of the former. To do so, we compute the
variance of firing rate across trials and normalize by the mean firing rate, all as a function of time.
We term the resulting measurement the normalized variance (NV). The logic behind this metric is
[Figure A.3: state-space diagram. Axes: firing rate of neuron 1, neuron 2, and neuron 3. Shaded regions: left reach, right reach. Arrows: trial 1, trial 2.]
Figure A.3: Illustration of the optimal-subspace hypothesis. The configuration of firing rates is represented in a state space, with the firing rate of each neuron contributing an axis, only three of which are drawn. For each possible movement, we hypothesize that there exists a subspace of states that are optimal in the sense that they will produce the desired result when the movement is triggered. Different movements will have different optimal subspaces (shaded areas). The goal of motor preparation would be to optimize the configuration of firing rates so that it lies within the optimal subspace for the desired movement. For different trials (arrows), this process may take place at different rates, along different paths, and from different starting points.
as follows. Intrinsic spiking variability is thought to be near Poisson for cortical neurons, so that its
variance scales linearly with mean firing rate. Thus, if the measured across-trial variability were
attributable solely to intrinsic spiking variability (i.e., the underlying firing rate were identical
on each trial), the NV should be unity. In the presence of variability in underlying firing rate,
the NV should be greater than unity. In particular, we were interested in whether variability in
underlying firing rate declined during the course of the trial (see Fig. A.3) since the underlying
firing rate is taken from an uncontrolled initial condition to a consistent pre-movement subspace.
In this case, the NV should decline from above one to near one.
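The logic of the NV can be illustrated with a minimal sketch. It assumes spike counts taken in fixed, non-overlapping bins, so that the variance-to-mean ratio reduces to the Fano factor; the function name and data layout are hypothetical, and the smoothing and scaling used in the published analysis (Churchland et al. 2006b) are not reproduced here.

```python
from statistics import mean, variance


def normalized_variance(counts_by_trial):
    """Across-trial variance divided by across-trial mean, per time bin.

    counts_by_trial: list of trials, each a list of spike counts (one per bin).
    Assumption: raw binned counts, so a Poisson neuron with an identical
    underlying rate on every trial gives a ratio near one.
    Returns None for bins whose mean count is zero.
    """
    n_bins = len(counts_by_trial[0])
    nv = []
    for b in range(n_bins):
        column = [trial[b] for trial in counts_by_trial]
        m = mean(column)
        nv.append(variance(column) / m if m > 0 else None)
    return nv
```

Values above one in early bins that fall toward one in later bins would correspond to the predicted decline in underlying firing-rate variability.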
As predicted, Fig. A.4a shows that the NV (±SE computed across isolations/target locations)
declined after target onset (see arrow), remained at a rough plateau during the delay, and fell
again after the go cue. Figure A.4a includes three different datasets that had different delay
period lengths. This general pattern was also consistent across other datasets and across different
monkeys.
The initial decline in the NV consumed 98-198 ms depending on the monkey and dataset.
This is consistent with the idea that the magnitude of the NV indicates the approximate degree of
motor preparation yet to be accomplished. Admittedly, this interpretation rests on some
assumptions. First of all, it assumes that the increasing consistency of firing rates with time reflects their
increasing accuracy (i.e., their increasing tendency to occupy the optimal subspace, whose
boundaries cannot be easily inferred using current methods). Second, it assumes that there is a limit on
the rate at which firing rates approach their putatively optimal values, such that progress before
the go cue shortens the subsequent RT. The first assumption is difficult to test directly. The second
assumption can be tested directly, by comparing the rate of decline in the NV for trials with
different delay durations. To do so, we used the experimental data that had three discrete delay-period
durations (30, 130, and 230 ms, randomly interleaved). As previously stated, Fig. A.4a shows the
NV computed for the three delays, aligned to the onset of the go cue. The rate of decline in the
NV is similar for the three delay durations. As a consequence, at the time of the go cue the NV
for the 230 ms delay has dropped to a plateau, while the NV for the 30 ms delay does not reach
the same point until ~80 ms later, potentially explaining why mean RT is longer. Figure A.4b
plots RT versus the NV at the time of the go cue for the three delays. The relationship increases
monotonically. Thus, the height of the NV at the time of the go cue is predictive of RT, as would be
expected if it reflected the average degree of motor preparation yet to be accomplished.
[Figure A.4: panel a, firing rate and NV traces for the 30, 130, and 230 ms delays, with target and movement onset marked (scale bar: 100 ms); panel b, x-axis: NV at go cue; panel c, x-axis: change in rate (spikes/s) by go cue.]
Figure A.4: NV results of an experiment using three discrete delay-period durations: 30, 130 and 230 ms. Data are from one day’s recording using monkey G (39 isolations, 957 trials). a. Traces at top show the change in mean firing rate from baseline (SE), across all isolations and target locations. Traces below show the NV (SE). Analysis was performed with data aligned to the go cue. This means that for each delay duration, analysis was also aligned to target onset, although that occurred at different times prior to the go cue. b. Mean RT versus the NV, measured at the time of the go cue for the three delay-period durations. Bars show standard errors. c. Mean RT versus the change in firing rate from baseline, measured at the time of the go cue for the three delay-period durations. Black symbols plot the mean change averaged across all neurons and conditions. Gray symbols plot the same analysis but including only each neuron’s preferred condition. Note that the x-axis has been rescaled in the latter case.
In contrast, Fig. A.4c shows that there was no simple relationship between RT and mean firing
rate at the time of the go cue. This was true whether we considered all conditions (black) or just
preferred conditions (gray). Note that this would also have been true had we considered firing rate
at some fixed time (e.g., 100 ms) after the go cue. At that point, the 30 ms delay (which produced
the longest RTs) produced the highest firing rates (see Fig. A.4a, top). At no time after the go cue
were firing rates highest for the 230 ms delay duration, although it produced the shortest RT.
A.3.3 Significance
The NV reveals a previously unknown degree of temporal structure in the variability of neural
activity during a delayed reach task. Variability declines rather dramatically after target onset,
and more modestly after the go cue. Because the NV is a measurement of across-trial firing-rate
variability, the most natural interpretation is that there is a decline in the across-trial variability
of the underlying firing rates. Alternatively, the decline in the NV might reflect a change in the
within-trial cell-intrinsic process of spike production (e.g., from cortex-like statistics to
vestibular-afferent-like statistics). A number of controls (see Churchland et al. 2006b) exclude the most
obvious ways this might happen (most trivially, with increasing firing rate), but it is difficult to
completely exclude this possibility given extra-cellular recordings alone. Still, the proposal that
cell-intrinsic spiking statistics change would be quite radical.
If a movement is in whole or in part a consequence of the preparatory activity present at the
time it is triggered, then it would seem critical that such activity be optimized before triggering. We
hypothesize that such optimization is the behaviorally-inferred process of motor preparation. Our
experiments and analyses were designed to test two central predictions of this hypothesis. First,
if the brain is actively “trying” to bring firing rates to a particular state, then this should produce
a decline in variability. Second, if the brain can sense when preparatory activity is accurate, and if
activity is on average roughly accurate, then RTs should be shortest when variability is low, that
is, when firing rates are closest to their mean (we are not suggesting that the brain cares about
variability per se, but rather that reduced variability is a correlate of increased accuracy). That
these two predictions were borne out lends support to the optimal-subspace hypothesis.
Measurements of variability have been extensively employed in the analysis of neural data
(Tolhurst et al. 1983; Gur et al. 1997; Bair and O’Keefe 1998; Averbeck and Lee 2003). Yet the
present study is, to our knowledge, the first to use a measurement of variability in an attempt
to track the time-course of internal processing (although this interpretation was anticipated by
Horwitz and Newsome 2001). Given present results, it seems plausible that the measured
increase in consistency reflects an increase in accuracy: an increasing likelihood that firing rates
have reached their appropriate values. This highlights an advantage of measuring firing rate
variability. Even when little is known regarding the “representation” used by an area of interest (so
that the experimenter cannot know which firing-rate vectors count as “accurate” or “appropriate”)
an index of variability can potentially allow one to infer the time-course with which firing rates
become accurate.
A.3.4 Beyond NV
The NV results suggest that the network underlying motor preparation exhibits rich dynamics.
However, NV provides little insight into the course of motor planning on a single trial. A gradual
fall in trial-to-trial variance might reflect a gradual convergence on each trial, or might reflect
rapid transitions that occur at different times on different trials. All the NV tells us about the
dynamic properties of the underlying network is the basic fact of convergence from uncontrolled
initial conditions to a consistent pre-movement preparatory state. The structure of any underlying
attractors and corresponding basins of attraction is unobserved.
To better understand the underlying mechanism of motor planning, one can adopt latent
variable methods. These methods can identify a hidden dynamical system that summarizes and
explains the simultaneously-recorded spike trains. The central idea is that the responses of different
neurons reflect different views of a common dynamical process in the network, whose effective
dimensionality is much smaller than the total number of neurons in the network. While the
underlying state trajectory may be slightly different on each trial, the commonalities among these
trajectories can be captured by the network’s parameters, which are shared across trials. These
parameters define how the network evolves over time, as well as how the observed spike trains
relate to the network’s state at each time point.
Recall that the NV results inform us that neural activity is initially variable across trials, but
appears to settle during the delay period. A dynamical system model capable of expressing these
types of behaviors of neural systems is a fully-connected recurrent network with Gaussian noise:
$$\mathbf{x}_t \mid \mathbf{x}_{t-1} \sim \mathcal{N}\left(f(\mathbf{x}_{t-1}),\, Q\right) \tag{A.1}$$
$$f(\mathbf{x}) = (1 - k)\,\mathbf{x} + k\, W g(\mathbf{x}),$$
where the state $\mathbf{x}_t \in \mathbb{R}^{p \times 1}$ is a vector of the node values in the recurrent network at time $t = 1, \ldots, T$,
$k \in \mathbb{R}$ is related to the time constant of the network, $W \in \mathbb{R}^{p \times p}$ is a connection weight matrix, and
$Q \in \mathbb{R}^{p \times p}$ is a covariance matrix. The function $f : \mathbb{R}^{p \times 1} \to \mathbb{R}^{p \times 1}$ defines the non-linear state dynamics
and $g$ is a non-linear activation function that acts element-by-element on its vector argument. We
took $g$ to be the error function defined by
$$\operatorname{erf}(z) = \frac{2}{\sqrt{\pi}} \int_0^z e^{-t^2}\, dt. \tag{A.2}$$
We chose the error function because it made the fitting algorithm analytically tractable. The
initial state is Gaussian-distributed:
$$\mathbf{x}_1 \sim \mathcal{N}(\mathbf{p}_1, V_1), \tag{A.3}$$
where $\mathbf{p}_1 \in \mathbb{R}^{p \times 1}$ and $V_1 \in \mathbb{R}^{p \times p}$ are the mean vector and covariance matrix, respectively.
The output distribution is a generalized linear model that defines the relationship between all
nodes in the state $\mathbf{x}_t$ and the spike count $y_t^i \in \{0, 1, 2, \ldots\}$ of neuron $i = 1, \ldots, q$ in the $t$th time bin
$$y_t^i \mid \mathbf{x}_t \sim \mathrm{Poisson}\left(h(\mathbf{c}_i' \mathbf{x}_t + d_i) \cdot \Delta\right), \tag{A.4}$$
where $\mathbf{c}_i \in \mathbb{R}^{p \times 1}$ and $d_i \in \mathbb{R}$ define a linear function of the state and $\Delta \in \mathbb{R}_+$ is the time bin width. For
notational compactness, the spike counts for all $q$ simultaneously-recorded neurons are assembled
into a $q \times 1$ vector $\mathbf{y}_t$, whose $i$th element is $y_t^i$. The link function $h : \mathbb{R} \to \mathbb{R}_+$ is chosen to be
$h(z) = \log(1 + e^z)$ so as to ensure that the mean rate parameter of each Poisson distribution is
non-negative.
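To make the generative model concrete, the following is a minimal simulation sketch of the recurrent state dynamics with erf activation and the Poisson output with a log(1 + e^z) link. It is illustrative only: the function names are hypothetical, the state noise is assumed to be isotropic (a diagonal Q with standard deviation q_std), the initial state is passed in directly rather than drawn from its Gaussian prior, and no fitting is performed.

```python
import math
import random


def softplus(z):
    """Numerically stable log(1 + e^z), the link function h."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def f(x, W, k):
    """State update mean: (1 - k) * x + k * W * erf(x), with erf applied element-wise."""
    g = [math.erf(v) for v in x]
    p = len(x)
    return [(1.0 - k) * x[i] + k * sum(W[i][j] * g[j] for j in range(p))
            for i in range(p)]


def sample_poisson(lam, rng):
    """Knuth's multiplication method; adequate for the small rates used here."""
    limit = math.exp(-lam)
    n, prod = 0, 1.0
    while True:
        prod *= rng.random()
        if prod <= limit:
            return n
        n += 1


def simulate(W, k, q_std, x1, C, d, delta, T, seed=0):
    """Simulate T time bins; returns (states, spike_counts).

    Assumptions: isotropic state noise with std q_std; x1 is the realized
    initial state; C[i] and d[i] are the output weights and offset of neuron i.
    """
    rng = random.Random(seed)
    x = list(x1)
    states, counts = [], []
    for t in range(T):
        if t > 0:
            x = [m + rng.gauss(0.0, q_std) for m in f(x, W, k)]
        rates = [softplus(sum(cj * xj for cj, xj in zip(c, x)) + di) * delta
                 for c, di in zip(C, d)]
        states.append(list(x))
        counts.append([sample_poisson(r, rng) for r in rates])
    return states, counts
```

Running `simulate` repeatedly with different seeds gives a separate latent trajectory and spike-count sequence per simulated trial, all generated from the same shared parameters.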
The computational details and data simulations are skipped here and the interested reader is
referred to Yu et al. (2006a) and Yu (2007). Applying this latent variable method to
delayed-reach neural data, we can try to reveal the otherwise hidden, cognitive state of the monkey, while
he is in the midst of planning a reach to the presented targets. Figure A.5 shows the means of
the marginal state posteriors $P(\mathbf{x}_t \mid \{\mathbf{y}\}_1^T)$ (black traces) for 100 test trials based on the dynamical
model with recurrent state dynamics; note that a separate trajectory is inferred for each trial.
The blue and green dots correspond to 50 ms after target presentation and 50 ms after the go
cue, respectively. Despite the trial-to-trial variability in the delay period neural responses, the
state evolves along a characteristic path on each trial, presumably from an idle state to a fully
formed reach plan. Even with this characteristic structure, however, the state trajectories are not
all identical. This presumably reflects the fact that the motor planning process is
internally-regulated, and its time course may differ from trial to trial, even when the presented stimulus (in
this case, the reach target) is identical.
Figure A.5: Inferred state trajectories (black) in latent x space for 100 test trials, based on the model with recurrent state dynamics. Dots indicate 50 ms after target onset (blue) and 50 ms after the go cue (green). The radius of the green dots is logarithmically-related to delay period duration (200, 750, or 1000 ms).
While these results are promising, they are still somewhat preliminary. The trajectories agree
with intuition, but it is necessary to relate some aspect of the trajectory (e.g., closeness to the
convergence region) with behavior (e.g., reaction time). Since we cannot measure directly the
hidden cognitive state of the animal during the planning process, we must use indirect behavioral
correlates to help confirm the validity of these inferred single-trial hidden trajectories. This is a
subject of ongoing efforts in the Shenoy laboratory.
A.4 Mixture of Trajectory Models
A.4.1 Motivation
One of the key components of a prosthetic device is its decoding algorithm, which translates neural
activity into arm reaches. Examples of decoding algorithms that translate neural activity around
the time of the movement (termed peri-movement activity) into continuous arm trajectories include
population vectors (Taylor et al. 2002) and linear filters (Serruya et al. 2002; Carmena et al. 2003).
Both of these decoding algorithms assume a linear relationship between the neural activity and
arm state. In general, the arm state may include, but is not limited to, arm position, velocity, and
acceleration.
While these linear decoding algorithms are effective, recursive Bayesian decoders have been
shown to provide more accurate trajectory estimates (Brown et al. 1998; Brockwell et al. 2004; Wu
et al. 2004, 2006). Recursive Bayesian decoders are based on the specification of a probabilistic
model comprising (1) a trajectory model, which describes how the arm state changes from one time
step to the next, and (2) an observation model, which describes how the observed neural activity
relates to the time-evolving arm state. If the modeling assumptions are satisfied, then Bayesian
estimation makes optimal use of the observed data. These decoders also provide confidence regions for the
arm state estimates and allow for non-linear relationships between the neural activity and arm
state.
The role of the trajectory model is to build prior knowledge about the form of the reaches into
the recursive Bayesian decoder. The degree to which the trajectory model reflects the
dynamics of the actual reaches directly affects the accuracy with which trajectories can be decoded
from neural data. A commonly-used trajectory model is the random walk (Brown et al. 1998;
Brockwell et al. 2004), which captures the fact that arm trajectories tend to be smooth. In other
words, small changes in arm state from one time step to the next are more likely than large
changes. An alternative trajectory model is based on linear dynamics perturbed by Gaussian
noise, termed a linear-Gaussian model (Wu et al. 2004; Shoham et al. 2005; Wu et al. 2006).
It is often the case that there are a finite number of distinct objects that a disabled patient may
wish to reach for in his/her workspace. Examples include reaching for the lighting, bed, or
temperature controls; typing on a keyboard; or picking up the phone.[2] Natural reaching movements in
such settings exhibit the following three properties. First, many, though clearly not all, reaching
movements in the workspace will be directed to this set of discrete goals. Second, multiple reaches
to the same goal are not all identical. For example, there may be variability in reach speed or
curvature. Third, the trajectories generally start at rest, proceed out to the reach goal, and end at
rest. Current trajectory models, such as the random walk or linear-Gaussian models, are limited
in their ability to capture all three aforementioned properties. In particular, it is not possible to
specify multiple discrete reach goals at which the trajectories are likely to come to rest. Thus, we
seek a trajectory model that better captures the dynamics of goal-directed reaches, which should
in turn yield more accurate trajectory estimates.
[2] See Hochberg et al. (2006) for additional descriptions and videos of a spinal-cord-injured patient operating a neural prosthesis.
In addition, on a given trial, there can be information available about the identity of the
upcoming reach goal before the reach begins. For example, as we have discussed throughout, information
about the goal of an upcoming reaching movement can often be deduced before the reach begins
from neural activity related to motor preparation (delay activity). It should be possible to use this
goal information, when available, to improve the accuracy of the decoded trajectory.
We present a mixture of trajectory models (MTM) framework that provides (1) a suitable
trajectory model for goal-directed reaches, and (2) a principled way to incorporate information about
the identity of the upcoming reach goal. This work has partially appeared in conferences and
peer-reviewed journals (Kemere et al. 2003, 2004a,b) and the latest incarnation is presently in
manuscript form (Yu et al. 2006b).
A.4.2 Methods
Ideally, we would like to construct a complete model of neural motor control that captures the
hard, physical constraints of the limb, the soft constraints imposed by neural mechanisms, as well
as the physical surroundings and context. One way to approximate such a complete model is to
build a separate trajectory model for each group of movements with similar objectives. Here, we
group the movements by reach goal. At the onset of a new movement the desired reach goal is
unknown, or imperfectly known, and so the full trajectory model is composed of a mixture of the
individual, goal-specific trajectory models. We develop here a recursive Bayesian decoder based on
a mixture of trajectory models (MTM).
The decoding of a continuous arm trajectory involves finding the likely sequences of arm states
corresponding to the observed neural activity. At each time step $t$, we seek to compute the
distribution of the arm state $\mathbf{x}_t$ given the peri-movement neural activity $\mathbf{y}_1, \mathbf{y}_2, \ldots, \mathbf{y}_t$ (or $\{\mathbf{y}\}_1^t$) observed up
to that time. This distribution is $P(\mathbf{x}_t \mid \{\mathbf{y}\}_1^t)$ and is termed the state posterior. Here, $\mathbf{y}_t$ is a vector of
binned spike counts across the neural population at time step $t$, and $t = 1$ corresponds to the time
at which we begin to decode movement. If the desired reach goal $m^*$ is perfectly known before the
reach begins, then we can compute the state posterior based on the individual trajectory model
corresponding to that reach goal. This distribution is $P(\mathbf{x}_t \mid \{\mathbf{y}\}_1^t, m^*)$ and is termed the conditional
state posterior. In general, the desired reach goal is unknown or imperfectly known, so we need to
compute $P(\mathbf{x}_t \mid \{\mathbf{y}\}_1^t, m)$ for each $m \in \{1, \ldots, M\}$, where $M$ is the number of possible reach goals.
To combine the $M$ conditional state posteriors, we can simply expand $P(\mathbf{x}_t \mid \{\mathbf{y}\}_1^t)$ by
conditioning on the reach goal $m$
$$P(\mathbf{x}_t \mid \{\mathbf{y}\}_1^t) = \sum_{m=1}^{M} P(\mathbf{x}_t \mid \{\mathbf{y}\}_1^t, m)\, P(m \mid \{\mathbf{y}\}_1^t). \tag{A.5}$$
In other words, the state posterior is a weighted sum of the conditional state posteriors. The
weights $P(m \mid \{\mathbf{y}\}_1^t)$ represent the probability that the desired reach goal is $m$, given the observed
spike counts up to time $t$. Bayes’ rule can then be applied to these weights in Eq. (A.5), yielding
the key equation for the MTM framework
$$P(\mathbf{x}_t \mid \{\mathbf{y}\}_1^t) = \sum_{m=1}^{M} P(\mathbf{x}_t \mid \{\mathbf{y}\}_1^t, m)\, \frac{P(\{\mathbf{y}\}_1^t \mid m)\, P(m)}{P(\{\mathbf{y}\}_1^t)}. \tag{A.6}$$
The conditional state posteriors $P(\mathbf{x}_t \mid \{\mathbf{y}\}_1^t, m)$ and data likelihoods $P(\{\mathbf{y}\}_1^t \mid m)$ in Eq. (A.6) can
be computed or approximated using any of a number of different recursive Bayesian decoding
techniques, including Bayes’ filter (Brown et al. 1998), particle filters (Brockwell et al. 2004; Shoham
et al. 2005), and Kalman filter variants (Wu et al. 2004, 2006). If available, information about
the identity of the upcoming reach goal can be incorporated naturally into the MTM framework
via $P(m)$ in Eq. (A.6). This information must be available before the reach begins and may differ
from trial-to-trial. If no such information is available, a uniform distribution ($P(m) = 1/M$) can be
used across all trials. Alternatively, we can use the maximum-likelihood methods described in
Section 3.2.2 (see Eqs. (3.4) and (3.5)) to find $P(m)$ from the delay-period activity.
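The weighting step of the MTM framework can be sketched as follows. This is a minimal illustration, assuming that the per-goal log data likelihoods and conditional posterior means have already been produced by some recursive filter; the function names are hypothetical, and only the mean of the mixture is combined here, not the full posterior distribution.

```python
import math


def goal_weights(log_liks, priors):
    """Posterior goal probabilities P(m | y) from log P(y | m) and P(m).

    Uses the log-sum-exp trick so that very small likelihoods do not
    underflow. Assumes at least one goal has a non-zero prior.
    """
    log_post = [ll + (math.log(p) if p > 0 else -math.inf)
                for ll, p in zip(log_liks, priors)]
    top = max(log_post)
    unnorm = [math.exp(v - top) for v in log_post]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def mixture_mean(cond_means, weights):
    """Weighted sum of the per-goal conditional state estimates E[x_t | y, m]."""
    dim = len(cond_means[0])
    return [sum(w * mu[j] for w, mu in zip(weights, cond_means))
            for j in range(dim)]
```

With a uniform prior over M goals, `priors` would be `[1 / M] * M`; a prior derived from delay-period activity can be substituted without changing the rest of the computation.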
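MTM model
The particular probabilistic model explored in this work is
$$\mathbf{x}_t \mid \mathbf{x}_{t-1}, m \sim \mathcal{N}\left(A_m \mathbf{x}_{t-1} + \mathbf{b}_m,\, Q_m\right) \tag{A.7}$$
$$\mathbf{x}_1 \mid m \sim \mathcal{N}\left(\boldsymbol{\pi}_m,\, V_m\right) \tag{A.8}$$
$$s^i_{t-\mathrm{lag}_i} \mid \mathbf{x}_t \sim \mathrm{Poisson}\left(e^{\mathbf{c}_i' \mathbf{x}_t + d_i}\, \Delta\right), \tag{A.9}$$
where $m \in \{1, \ldots, M\}$ indexes reach goal and $M$ is the number of reach goals. The dynamical arm
state at time step $t \in \{1, \ldots, T\}$ is $\mathbf{x}_t \in \mathbb{R}^{p \times 1}$, which includes position, velocity, and acceleration
terms. The corresponding observation, $s^i_{t-\mathrm{lag}_i} \in \{0, 1, 2, \ldots\}$, is a peri-movement spike count for unit
$i \in \{1, \ldots, q\}$ taken in a time bin of width $\Delta$, where $\mathrm{lag}_i$ is the time lag (in time steps) between the
neural firing of the $i$th unit and the associated arm state. For notational convenience, the spike
counts across the $q$ simultaneously-recorded units are assembled into a $q \times 1$ vector $\mathbf{y}_t$, whose $i$th
element is $s^i_{t-\mathrm{lag}_i}$. This is the $\mathbf{y}_t$ that appears in Eqs. (A.5) and (A.6). The parameters $A_m \in \mathbb{R}^{p \times p}$,
$\mathbf{b}_m \in \mathbb{R}^{p \times 1}$, $Q_m \in \mathbb{R}^{p \times p}$, $\boldsymbol{\pi}_m \in \mathbb{R}^{p \times 1}$, $V_m \in \mathbb{R}^{p \times p}$, $\mathrm{lag}_i \in \mathbb{Z}$, $\mathbf{c}_i \in \mathbb{R}^{p \times 1}$, $d_i \in \mathbb{R}$ do not depend on time and
are fit to training data.
Equations (A.7) and (A.8) define the trajectory model, which describes how the arm state $\mathbf{x}_t$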
changes from one time step to the next. In this case, the full trajectory model is a mixture of
standard linear-Gaussian trajectory models, each describing the trajectories toward a particular
reach goal indexed by m.
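As an illustration of what a mixture of linear-Gaussian trajectory models generates, the sketch below draws a reach goal and then rolls the corresponding linear dynamics forward. It is not the fitted model: the dictionary layout and function names are hypothetical, and both the initial-state and state noise are assumed to be independent across state dimensions (diagonal covariances given as standard deviations).

```python
import random


def mat_vec(A, x):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]


def sample_trajectory(models, goal_probs, T, seed=0):
    """Draw a goal m ~ P(m), then x_1 ~ N(pi_m, V_m) and x_t ~ N(A_m x_{t-1} + b_m, Q_m).

    models[m] is a dict with keys "A", "b", "q_std", "pi", "v_std"
    (hypothetical layout; *_std entries are per-dimension standard deviations).
    Returns (m, states) where states has length T.
    """
    rng = random.Random(seed)
    m = rng.choices(range(len(models)), weights=goal_probs)[0]
    mod = models[m]
    x = [mu + rng.gauss(0.0, s) for mu, s in zip(mod["pi"], mod["v_std"])]
    states = [x]
    for _ in range(T - 1):
        mean = [ax + b for ax, b in zip(mat_vec(mod["A"], x), mod["b"])]
        x = [mu + rng.gauss(0.0, s) for mu, s in zip(mean, mod["q_std"])]
        states.append(x)
    return m, states
```

Because each goal has its own parameters, trajectories drawn for different goals can start, evolve, and settle differently, which a single shared linear-Gaussian model cannot express.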
Equation (A.9) defines the observation model, which describes how the recorded peri-movement
spike counts $s^i_{t-\mathrm{lag}_i}$ relate to the arm state $\mathbf{x}_t$. In Eq. (A.9), the linear mapping $\mathbf{c}_i' \mathbf{x}_t + d_i$ is a cosine
tuning model (Georgopoulos et al. 1982), where $\mathbf{c}_i$ is the “preferred state vector.” This linear
mapping is then passed through an exponential to ensure that the mean firing rate of the $i$th unit
at time $t - \mathrm{lag}_i$, $e^{\mathbf{c}_i' \mathbf{x}_t + d_i}$, is non-negative. Note that, whereas each mixture component indexed by
$m$ in the trajectory model (Eqs. (A.7) and (A.8)) can have different parameters leading to different
arm state dynamics, the observation model (Eq. (A.9)) is the same for all $m$.
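The observation model can be evaluated directly for a candidate arm state. The sketch below computes the expected spike count under the exponential link and the resulting Poisson log-likelihood of a population of counts. The function names are hypothetical, and it is assumed that the caller has already shifted each unit's spike counts by its lag so that they line up with the arm state being evaluated.

```python
import math


def expected_count(c, x, d, delta):
    """Mean spike count exp(c' x + d) * delta for one unit in one bin."""
    return math.exp(sum(ci * xi for ci, xi in zip(c, x)) + d) * delta


def poisson_log_lik(count, lam):
    """log P(count | lam) for a Poisson distribution with mean lam > 0."""
    return count * math.log(lam) - lam - math.lgamma(count + 1)


def population_log_lik(counts, C, d, x, delta):
    """Sum of per-unit Poisson log-likelihoods for arm state x.

    counts[i] is the lag-aligned spike count of unit i (assumption: alignment
    by lag_i is done by the caller); C[i] and d[i] are that unit's parameters.
    """
    return sum(poisson_log_lik(s, expected_count(c, x, di, delta))
               for s, c, di in zip(counts, C, d))
```

Since the same parameters are used for every reach goal, this likelihood depends on the goal only through the arm state supplied to it.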
Arm trajectories can be decoded from neural activity by applying Bayes’ rule to the statistical
relationships Eqs. (A.7)-(A.9). Having observed the neural data, we seek the likely sequences
of arm states that could have led to those neural observations. When the trajectory and
observation models are both linear-Gaussian, all of the relevant distributions are Gaussian and the
appropriate integrals can be computed exactly. In this case, the solution is identical to applying
the standard Kalman filter. For our model, however, given that the observation model in Eq. (A.9)
is Poisson, approximations are required to develop the appropriate estimation filter. These
approximations are omitted here for the sake of brevity, but the interested reader can refer to Brown
et al. (1998) and Yu et al. (2006b).
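For reference, the exactly solvable linear-Gaussian case can be sketched in one dimension. The following is the standard scalar Kalman filter recursion, not the approximate filter used for the Poisson observation model; the function name and the scalar parameterization (dynamics a, b, q and observation c, d, r) are assumptions made for illustration.

```python
def kalman_step(m, P, y, a, b, q, c, d, r):
    """One predict-update step of a scalar Kalman filter.

    Model (assumed for illustration):
        x_t = a * x_{t-1} + b + noise,  noise variance q
        y_t = c * x_t + d + noise,      noise variance r
    m, P: posterior mean and variance at the previous step.
    Returns the posterior mean and variance after observing y.
    """
    # Predict: push the previous posterior through the dynamics.
    m_pred = a * m + b
    P_pred = a * a * P + q
    # Update: correct the prediction using the new observation.
    gain = P_pred * c / (c * c * P_pred + r)
    m_new = m_pred + gain * (y - (c * m_pred + d))
    P_new = (1.0 - gain * c) * P_pred
    return m_new, P_new
```

With a Poisson observation model the update step no longer has this closed form, which is why an approximation is needed.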
Random Walk Trajectory Model
For comparison, we also implemented the random walk trajectory model with Poisson observations
presented by Brockwell et al. (2004):
$$\mathbf{v}_t - \mathbf{v}_{t-1} = \mathbf{v}_{t-1} - \mathbf{v}_{t-2} + \boldsymbol{\epsilon}_t \tag{A.10}$$
$$[\mathbf{v}_1' \;\; \mathbf{v}_0']' \sim \mathcal{N}(\boldsymbol{\pi}, V) \tag{A.11}$$
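$$s^i_{t-\mathrm{lag}_i} \mid \mathbf{v}_t \sim \mathrm{Poisson}\left(e^{\mathbf{c}_i' \tilde{\mathbf{v}}_t + d_i}\, \Delta\right), \tag{A.12}$$
where $\boldsymbol{\epsilon}_t \sim \mathcal{N}(\mathbf{0}, Q)$ in Eq. (A.10), $\mathbf{v}_t \in \mathbb{R}^{p \times 1}$ is the arm velocity at time $t$, $\tilde{\mathbf{v}}_t$ is defined to be $[\mathbf{v}_t' \;\; \|\mathbf{v}_t\|]'$
in Eq. (A.12), and $\|\mathbf{v}_t\|$ is the arm speed at time $t$. As in Eq. (A.9), $s^i_{t-\mathrm{lag}_i}$ is the peri-movement
spike count of the $i$th unit at time $t - \mathrm{lag}_i$, where $\mathrm{lag}_i$ is the time lag between the neural firing of
unit $i \in \{1, \ldots, q\}$ and the associated arm velocity. Spike counts are taken in time bins of width $\Delta$.
The parameters $Q \in \mathbb{R}^{p \times p}$, $\boldsymbol{\pi} \in \mathbb{R}^{2p \times 1}$, $V \in \mathbb{R}^{2p \times 2p}$, $\mathrm{lag}_i \in \mathbb{Z}$, $\mathbf{c}_i \in \mathbb{R}^{(p+1) \times 1}$, $d_i \in \mathbb{R}$ are fit to training
data, as described below. Note that the random walk trajectory model is a special case of the
linear-Gaussian trajectory model with appropriately chosen parameters in Eqs. (A.7) and (A.8).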
Equations (A.10) and (A.11) define the random walk trajectory model that imposes smoothness
in acceleration; Eq. (A.12) defines the Poisson observation model. To decode arm trajectories using
this probabilistic model, we followed Brockwell et al. (2004) and implemented particle filtering
with 2500 particles at each time step. This yielded a velocity estimate at each time step. To obtain
a single decoded position trajectory, the means of these velocity estimates were integrated over
time. Because the arm state does not include positional variables in this model, we assumed the
actual initial arm position was known. Thus, the decoder based on the random walk trajectory
model was given a slight advantage over the other decoders.
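The conversion from decoded velocities to a position trajectory can be sketched as a cumulative sum. This is a minimal illustration that assumes forward-Euler integration with a fixed time step and a known starting position; the function name is hypothetical and the exact integration scheme used in the experiments is not specified here.

```python
def integrate_velocity(v_means, p0, dt):
    """Integrate mean velocity estimates into a position trajectory.

    v_means: list of velocity vectors, one per time step.
    p0: known initial position (same dimensionality as each velocity).
    dt: time step in the same time units as the velocities.
    Assumption: forward-Euler integration, p_{t+1} = p_t + v_t * dt.
    Returns a list of positions starting with p0.
    """
    positions = [list(p0)]
    for v in v_means:
        positions.append([p + vi * dt for p, vi in zip(positions[-1], v)])
    return positions
```

Because errors in the velocity estimates accumulate under integration, supplying the true initial position removes one source of position error for this decoder.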
A.4.3 Results
Two monkeys were trained to perform a delayed-reach task as described in Chapter 3, which
provides for both plan and peri-movement activity. The reach goal was presented at one of eight
possible radial locations (30, 70, 110, 150, 190, 230, 310, 350°) 10 cm away. We considered three
trajectory models: a random walk model (RWM, Eqs. (A.10) and (A.11)) in acceleration, a single
linear-Gaussian trajectory model (STM, Eqs. (A.7) and (A.8) for the special case of M = 1), and a
mixture of linear-Gaussian trajectory models (MTM, Eqs. (A.7) and (A.8)). Each of the trajectory
models was fitted to the arm data with a time step of dt = 10 ms.
For the STM and MTM, the following physical quantities were included in the arm state vector
$\mathbf{x}_t$: position, velocity, acceleration, position magnitude, and velocity magnitude. The parameters of
all three trajectory models were fit using least squares. For the STM, a single linear-Gaussian
trajectory model was shared across all goal locations. The STM is similar to the trajectory model used
by Donoghue and colleagues (Wu et al. 2004, 2006), where it was applied to pursuit-tracking and
“pinball” tasks. In contrast, for the MTM, a separate linear-Gaussian trajectory model was trained
for each reach goal, based only on reaches to that goal. The trajectory model can be viewed, in the
space of all possible trajectories, as a specification of which trajectories are more likely than others
and by how much. This information is encoded in the parametric form of the trajectory model (e.g.,
random walk or linear-Gaussian), as well as in the fitted values of the model parameters.
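A least-squares fit of linear-Gaussian dynamics can be sketched in the scalar case. This is a deliberate simplification for illustration: the arm state here is a single number rather than the multi-dimensional state vector, the function name is hypothetical, and the noise variance is estimated as the mean squared residual.

```python
def fit_scalar_dynamics(trajectories):
    """Least-squares fit of x_t = a * x_{t-1} + b + noise from training reaches.

    trajectories: list of reaches, each a list of scalar states over time.
    Returns (a, b, q), where q is the mean squared residual (noise variance).
    Assumption: scalar state; the training reaches are not all constant.
    """
    prev, curr = [], []
    for traj in trajectories:
        prev.extend(traj[:-1])
        curr.extend(traj[1:])
    n = len(prev)
    mean_prev = sum(prev) / n
    mean_curr = sum(curr) / n
    sxx = sum((x - mean_prev) ** 2 for x in prev)
    sxy = sum((x - mean_prev) * (y - mean_curr) for x, y in zip(prev, curr))
    a = sxy / sxx
    b = mean_curr - a * mean_prev
    q = sum((y - (a * x + b)) ** 2 for x, y in zip(prev, curr)) / n
    return a, b, q
```

Under this sketch, a single shared model would be fit by passing reaches to all goals together, whereas a per-goal model would be fit by calling the function once per goal with only the reaches to that goal.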
For each observation model (Eqs. (A.9) and (A.12)), we sought the optimal lag for each unit and
the parameters {c_i, d_i}, where i indexes the unit. The optimal lag refers to the temporal relationship
between the activity of a neural unit and the arm trajectory (Moran and Schwartz 1999b). Here,
we obtained the optimal lags using Bayesian model selection (MacKay 2003), the details of which can
be found elsewhere (Yu 2007).
For all decoders, we first fit the model parameters to training data. The test data for a single
trial consisted of (1) the arm trajectory, taken from 50 ms before movement onset to 50 ms after
movement end at dt = 10 ms time steps, (2) the peri-movement spike counts, taken in overlapping
Δ = 20 ms bins and temporally offset from the arm trajectory by the optimal lag found for each
unit, and (3) the delay period spike counts, taken in a single 200 ms bin starting 150 ms after the
appearance of the reach goal. We quantify the trajectory error as the root-mean-square position
error between the decoded trajectory and the actual trajectory for each test trial (E_rms).
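The error metric and the lagged, overlapping spike-count bins can be illustrated with a short sketch. It assumes that E_rms is the square root of the mean squared Euclidean distance between decoded and actual 2-D positions, and that the bin for each trajectory time step starts at that step plus the unit's lag; the bin alignment and the function names are assumptions of this sketch, since the text does not state them.

```python
import math


def rms_position_error(decoded, actual):
    """Root-mean-square Euclidean position error (same units as the inputs)."""
    if len(decoded) != len(actual):
        raise ValueError("trajectories must have the same number of time steps")
    squared = [(dx - ax) ** 2 + (dy - ay) ** 2
               for (dx, dy), (ax, ay) in zip(decoded, actual)]
    return math.sqrt(sum(squared) / len(squared))


def lagged_bin_counts(spike_times_ms, step_times_ms, lag_ms, width_ms=20):
    """Spike counts in width_ms bins, one bin per trajectory time step.

    The bin for step time t covers [t + lag_ms, t + lag_ms + width_ms).
    With 10 ms steps and 20 ms bins, consecutive bins overlap by 10 ms.
    """
    counts = []
    for t in step_times_ms:
        start = t + lag_ms
        end = start + width_ms
        counts.append(sum(1 for s in spike_times_ms if start <= s < end))
    return counts
```

Times are kept in integer milliseconds here so that bin edges are compared exactly.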
Figure A.6a demonstrates how the MTM framework was used to decode arm trajectories for two
particular test trials. The upper subpanel compares the actual position trajectory (thick black)
with those decoded using the STM (thick green) and MTM (thick orange). For the purposes of
this plot, only the state elements corresponding to arm position are shown. The MTM decoded
trajectory is a weighted sum of component trajectory estimates E[x_t | {y}, m], one for each reach
goal indexed by m ∈ {1, ..., 8}. The three component trajectory estimates with the largest weights
for this trial are plotted in the upper subpanel (cyan, blue, magenta). The lower subpanel shows
how the corresponding weights P(m | {y}) evolved during the course of the trial. The values of
these weights at time zero (t = 0) represent the probability that the upcoming reach goal is m,
before any peri-movement neural activity had been observed. This is set from the plan activity,
as previously discussed. As time proceeded, these weights were updated as more and more
peri-movement activity was observed. The weight for the actual reach goal (cyan) in the lower subpanel
was higher at every timepoint, with the clearest effect seen during the first 200 ms.³ The weighted sum
of the eight component trajectory estimates (of which three are plotted in the upper-right panel)
using the weights shown in the lower subpanel yields the MTM decoded trajectory (thick orange,
E_rms: 7.4 mm) in the upper subpanel.
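The two operations described in this paragraph, updating the goal weights and forming the weighted sum, can be sketched as follows. This is a simplified illustration, not the decoder itself: the per-goal likelihoods of the newly observed peri-movement activity are supplied as plain numbers, and the function names and example values are hypothetical.

```python
def update_weights(weights, likelihoods):
    """Bayes update of goal weights: posterior[m] is proportional to
    prior[m] * likelihood[m], renormalized to sum to one."""
    unnormalized = [w * lik for w, lik in zip(weights, likelihoods)]
    total = sum(unnormalized)
    if total <= 0:
        raise ValueError("at least one goal must have nonzero posterior mass")
    return [u / total for u in unnormalized]


def mixture_estimate(weights, component_estimates):
    """Weighted sum of per-goal position estimates, each an (x, y) pair."""
    x = sum(w * cx for w, (cx, _) in zip(weights, component_estimates))
    y = sum(w * cy for w, (_, cy) in zip(weights, component_estimates))
    return (x, y)


# Hypothetical three-goal example: the prior (from delay activity) favors
# goal 1, but the new peri-movement observation is far more likely under goal 0.
prior = [0.2, 0.7, 0.1]
posterior = update_weights(prior, likelihoods=[0.9, 0.1, 0.1])
estimate = mixture_estimate(posterior, [(10.0, 0.0), (0.0, 10.0), (-10.0, 0.0)])
```

In this example the dominant weight shifts from goal 1 to goal 0 after a single update, the same kind of correction by peri-movement activity that the weights in the lower subpanels undergo over the course of a trial.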
[Figure A.6 plots omitted. Left and right panels are each titled "With delay activity". Upper panels: arm position, horizontal axis "Horz pos (mm)". Lower panels: component weights versus "Time (ms)".]
Figure A.6: Two representative test trials in which the use of delay activity improved the MTM decoded trajectory. Upper panels: actual trajectory (thick black), STM decoded trajectory (thick green), MTM decoded trajectories with delay activity (thick orange). Lower panels: the three corresponding MTM component weights as they evolve during the trial. Time zero corresponds to 60 ms before movement onset (i.e., one time step before we begin to decode movement). For the left trial, E_rms was 17.4 and 7.4 mm for STM and MTM with delay activity, respectively. For the right trial, E_rms was 16.7 and 13.4 mm for STM and MTM with delay activity, respectively. (Experiment G20040508, trial IDs 686 and 676.)
³ Without the delay-period activity, there was competition between the actual reach goal (cyan) and the neighboring goals. This particular scenario is not shown here.
Figure A.6b shows a different test trial. In this case, the dominant weight at t = 0 (blue)
did not correspond to the actual reach goal (cyan). In other words, the delay activity incorrectly
indicated the identity of the upcoming reach goal. However, as these weights were updated by
the observation of peri-movement activity, this “error” was soon corrected (within approximately
80 ms). From that point on, the weight corresponding to the actual reach goal dominated. Despite
this error at the beginning of the trial, the MTM decoded trajectory (thick orange, E_rms: 13.4 mm)
in the upper-right panel remained nearly identical to that in the upper-left panel. The reason is
that the error occurred early in the trial, when all eight component trajectory estimates were
still near the origin of the workspace; the weighted sum of these component estimates lies near
the origin no matter how they are weighted.
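The last point can be made explicit with a short bound. The notation here is introduced only for illustration: w_m denotes the weight of goal m, \hat{x}_{t,m} the position part of the corresponding component estimate at time t, and r a radius around the origin that contains all component estimates at that time.

```latex
% Weights are nonnegative and sum to one, so the mixture is a convex combination:
\left\| \sum_{m=1}^{M} w_m \hat{x}_{t,m} \right\|
  \le \sum_{m=1}^{M} w_m \left\| \hat{x}_{t,m} \right\|
  \le \max_{m} \left\| \hat{x}_{t,m} \right\|
  \le r,
\qquad w_m \ge 0, \quad \sum_{m=1}^{M} w_m = 1 .

% For any two weightings w and w', the decoded positions differ by at most 2r:
\left\| \sum_{m=1}^{M} (w_m - w'_m)\, \hat{x}_{t,m} \right\|
  \le \sum_{m=1}^{M} \left| w_m - w'_m \right| \left\| \hat{x}_{t,m} \right\|
  \le 2r .
```

Early in the reach r is small, so even a badly wrong set of weights moves the decoded position by only a small amount.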
Having demonstrated how the MTM framework produces trajectory estimates, we can
quantify and compare the performance of decoders based on different trajectory models. Figure A.7
compares the trial-averaged decoding performance using the RWM, STM, MTM without delay
activity (labeled MTM_M, since only peri-movement activity is used), and MTM with delay activity
(labeled MTM_DM, since both delay and peri-movement activity are used). For each monkey, the
trend was the same: E_rms decreased when going from RWM to STM, from STM to MTM_M, and
from MTM_M to MTM_DM (Wilcoxon paired-sample test, p < 0.01). The superior performance of the
MTM_M compared to the RWM and STM can be attributed to the fact that the MTM better
captures the dynamics of goal-directed reaches. If delay activity is available, this additional source
of information can be naturally incorporated in the MTM framework to further improve decoding
performance (MTM_DM). The RWM can be seen as a restricted form of the STM, which explains the
higher E_rms of the RWM compared to the STM in Fig. A.7.
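A paired comparison of per-trial errors of the kind reported above can be sketched with a Wilcoxon signed-rank test. The version below uses the large-sample normal approximation without tie or continuity corrections, so its p-value is only approximate; the text does not say which implementation or variant was used, and the per-trial error values in the example are hypothetical.

```python
import math


def wilcoxon_signed_rank(a, b):
    """Two-sided Wilcoxon signed-rank test (normal approximation).

    a, b: paired samples, e.g. per-trial E_rms for two decoders.
    Returns (w_plus, p_value), where w_plus is the sum of the ranks of the
    positive differences a[i] - b[i]. Zero differences are dropped and tied
    absolute differences receive average ranks.
    """
    diffs = [x - y for x, y in zip(a, b) if x != y]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean) / sd
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return w_plus, p_value


# Hypothetical per-trial E_rms values (mm) for two decoders on the same trials.
stm_errors = [21.0, 17.5, 25.2, 14.1, 19.8, 22.3, 16.4, 18.9, 23.7, 20.6]
mtm_errors = [15.2, 13.1, 17.9, 12.6, 13.0, 18.8, 11.5, 16.1, 15.3, 14.9]
w_plus, p_value = wilcoxon_signed_rank(stm_errors, mtm_errors)
```

For small samples an exact test is preferable to the normal approximation; a statistics library would normally be used for that.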
[Figure A.7 plots omitted. Panels: Monkey G and Monkey H. Bars in each panel: RWM, STM, MTM_M, MTM_DM.]
Figure A.7: E_rms (mean ± SE) comparison for decoders using the RWM, STM, MTM without delay activity (MTM_M), and MTM with delay activity (MTM_DM). a. Monkey G (98 units). b. Monkey H (99 units).
A.4.4 Significance
The mixture of trajectory models framework provides (1) a suitable trajectory model for
goal-directed reaches, and (2) a principled way to incorporate information about the identity of the
upcoming reach goal. In contrast to current trajectory models, a mixture of linear-Gaussian
trajectory models (MTM) can capture the notion of goal-directed control, whereby trajectories start
at rest, proceed out to one of M discrete reach goals, and end at rest. Because the MTM better
describes the dynamics of goal-directed reaches, its decoded trajectories were on average more
accurate than those based on the random walk and linear-Gaussian (STM) trajectory models.
As detailed in Chapter 1, the field of cortical prosthetics has largely been split based on which
of the two types of information should be used: plan activity to decode the intended reach goal
or peri-movement activity to decode the moment-by-moment details of a trajectory. By combining
the two types of information, the MTM decoder can be viewed as a way to bridge differences in
the design approach of cortical prosthetics. Also, the work outlined in Section A.1 and detailed in
Churchland et al. (2006a) suggests that delay period activity can provide a probabilistic prior for
peak movement speed as well.
While devising a complete model of neural motor control would be ideal, the MTM framework
provides an effective and general discrete approximation. In this work, we grouped trajectories by
reach goal. In other contexts, the trajectories can be grouped by other criteria such as reach speed,
reach curvature, etc. Extensions to this work include applying the MTM framework to settings
with (1) novel reach goals, as well as (2) larger numbers of reach goals. We are also interested in
extending the MTM framework from M discrete reach goals to a continuum of goal locations.
Bibliography
Afshar A, Achtman N, Santhanam G, Ryu SI, Yu BM, and Shenoy KV (2005). Free-paced target estimation in a delayed-reach task. In Society for Neuroscience Abstract Viewer and Itinerary Planner, 401.13. Washington, DC. Poster presentation.
Ashe J and Georgopoulos AP (1994). Movement parameters and neural activity in motor cortex and area 5. Cerebral Cortex, 4(6), 590-600.
Averbeck BB and Lee D (2003). Neural noise and movement-related codes in the macaque supplementary motor area. Journal of Neuroscience, 23(20), 7630-7641.
Bair W and O'Keefe LP (1998). The influence of fixational eye movements on the response of neurons in area MT of the macaque. Visual Neuroscience, 15(4), 779-786.
Baker JT, Donoghue JP, and Sanes JN (1999). Gaze direction modulates finger movement activation patterns in human cerebral cortex. Journal of Neuroscience, 19(22), 10044-10052.
Bar-Hillel A, Spiro A, and Stark E (2004). Spike sorting: Bayesian clustering of non-stationary data. In LK Saul, Y Weiss, and L Bottou (Eds.) Advances in Neural Information Processing Systems 17, pages 105-112. MIT Press, Cambridge, MA.
Bastian A, Schoner G, and Riehle A (2003). Preshaping and continuous evolution of motor cortical representations during movement preparation. European Journal of Neuroscience, 18(7), 2047-2058.
Batista AP, Buneo CA, Snyder LH, and Andersen RA (1999). Reach plans in eye-centered coordinates. Science, 285(5425), 257-260.
Batista AP, Yu BM, Santhanam G, Ryu SI, Afshar A, and Shenoy KV (2005). Heterogeneous coordinate frames for reaching in macaque PMd. In Society for Neuroscience Abstract Viewer and Itinerary Planner, 363.12. Washington, DC. Slide presentation.
Batista AP, Yu BM, Santhanam G, Ryu SI, Afshar A, and Shenoy KV (2006). A direct comparison of eye-centered and limb-centered reference frames for reach planning in the dorsal aspect of the premotor cortex. In preparation for resubmission to Journal of Neurophysiology.
Batista AP, Yu BM, Santhanam G, Ryu SI, and Shenoy KV (2004). Coordinate frames for reaching in macaque dorsal premotor cortex (PMd). In Society for Neuroscience Abstract Viewer and Itinerary Planner, 191.7. San Diego, CA. Poster presentation.
Birbaumer N, Ghanayim N, Hinterberger T, Iversen I, Kotchoubey B, Kubler A, Perelmouter J, Taub E, and Flor H (1999). A spelling device for the paralysed. Nature, 398(6725), 297-298.
Boussaoud D (1995). Primate premotor cortex: modulation of preparatory neuronal activity by gaze angle. Journal of Neurophysiology, 73(2), 886-890.
Boussaoud D and Bremmer F (1999). Gaze effects in the cerebral cortex: reference frames for space coding and action. Experimental Brain Research, 128(1-2), 170-180.
Boussaoud D, Jouffrais C, and Bremmer F (1998). Eye position effects on the neuronal activity of dorsal premotor cortex in the macaque monkey. Journal of Neurophysiology, 80(3), 1132-1150.
Brockwell AE, Rojas AL, and Kass RE (2004). Recursive Bayesian decoding of motor cortical signals by particle filtering. Journal of Neurophysiology, 91(4), 1899-1907.
Brown EN, Frank LM, Tang D, Quirk MC, and Wilson MA (1998). A statistical paradigm for neural spike train decoding applied to position prediction from the ensemble firing patterns of rat hippocampal place cells. Journal of Neuroscience, 18(18), 7411-7425.
Caminiti R, Ferraina S, and Johnson PB (1996). The sources of visual information to the primate frontal lobe: a novel role for the superior parietal lobule. Cerebral Cortex, 6(3), 319-328.
Caminiti R, Johnson PB, Galli C, Ferraina S, and Burnod Y (1991). Making arm movements within different parts of space: the premotor and motor cortical representation of a coordinate system for reaching to visual targets. Journal of Neuroscience, 11(5), 1182-1197.
Carmena JM, Lebedev MA, Crist RE, O'Doherty JE, Santucci DM, Dimitrov DF, Patil PG, Henriquez CS, and Nicolelis MAL (2003). Learning to control a brain-machine interface for reaching and grasping by primates. PLoS Biology, 1(2), 193-208.
Chandrakasan AP, Sheng S, and Broderson RW (1992). Low-power CMOS digital design. IEEE Journal of Solid-State Circuits, 27(4), 473-484.
Chapin JK, Moxon KA, Markowitz RS, and Nicolelis MAL (1999). Real-time control of a robot arm using simultaneously recorded neurons in the motor cortex. Nature Neuroscience, 2(1), 664-670.
Churchland MM, Santhanam G, and Shenoy KV (2006a). Preparatory activity in premotor and motor cortex reflects the speed of the upcoming reach. Journal of Neurophysiology. doi:10.1152/jn.00307.2006.
Churchland MM and Shenoy KV (2006). Delay of movement caused by disruption of cortical preparatory activity. Journal of Neurophysiology. doi:10.1152/jn.00808.2006.
Churchland MM, Yu BM, Ryu SI, Santhanam G, and Shenoy KV (2006b). Neural variability in premotor cortex provides a signature of motor preparation. Journal of Neuroscience, 26(14), 3697-3712. doi:10.1523/JNEUROSCI.3762-05.2006.
Cisek P (2006). Preparing for speed. Focus on: "Preparatory activity in premotor and motor cortex reflects the speed of the upcoming reach". Journal of Neurophysiology. doi:10.1152/jn.00857.2006.
Cisek P and Kalaska JF (2002). Modest gaze-related discharge modulation in monkey dorsal premotor cortex during a reaching task performed with free fixation. Journal of Neurophysiology, 88(2), 1064-1072.
Cisek P and Kalaska JF (2004). Neural correlates of mental rehearsal in dorsal premotor cortex. Nature, 431(1011), 993-996.
Cover TM and Thomas JA (1991). Elements of Information Theory. John Wiley and Sons, Inc., New York, NY.
Crammond DJ and Kalaska JF (1995). Modulation of preparatory neuronal activity in dorsal premotor cortex due to stimulus-response compatibility. Journal of Neurophysiology, 71(3), 1281-1284.
Crammond DJ and Kalaska JF (2000). Prior information in motor and premotor cortex: activity during the delay period and effect on pre-movement activity. Journal of Neurophysiology, 84(2), 986-1005.
Cunningham JP, Yu BM, and Shenoy KV (2006a). Optimal target placement for neural communication prostheses. In Proceedings of the 28th Annual International Conference of the IEEE EMBS, FrBP10.3. New York, NY. Poster presentation.
Cunningham JP, Yu BM, and Shenoy KV (2006b). Optimal target placement for neural communication prostheses. In Society for Neuroscience Abstract Viewer and Itinerary Planner, 256.21. Atlanta, GA.
Dayan P and Abbott LF (2001). Theoretical Neuroscience: Computational and Mathematical Modeling of Neural Systems. MIT Press, Cambridge, MA.
Deneve S, Latham PE, and Pouget A (2001). Efficient computation and cue integration with noisy population codes. Nature Neuroscience, 4(8), 826-831.
Destexhe A, Contreras D, and Steriade M (1999). Spatiotemporal analysis of local field potentials and unit discharges in cat cerebral cortex during natural wake and sleep states. Journal of Neuroscience, 19(11), 4595-4608.
Donoghue JP, Sanes JN, Hatsopoulos NG, and Gaal G (1998). Neural discharge and local field potential oscillations in primate motor cortex during voluntary movements. Journal of Neurophysiology, 79(1), 159-173.
Dum RP and Strick PL (1991). The origin of corticospinal projections from the premotor areas in the frontal lobe. Journal of Neuroscience, 11(3), 667-689.
Everitt BS (1984). An Introduction to Latent Variable Models. Chapman and Hall, London.
Farwell LA and Donchin E (1988). Talking off the top of your head: toward a mental prosthesis utilizing event-related brain potentials. Electroencephalography Clinical Neurophysiology, 70(6), 510-523.
Fee MS, Mitra PP, and Kleinfeld D (1996). Variability of extracellular spike waveforms of cortical neurons. Journal of Neurophysiology, 76(6), 3823-3833.
Fetz EE (1969). Operant conditioning of cortical unit activity. Science, 163(870), 955-957.
Fetz EE and Baker MA (1973). Operantly conditioned patterns of precentral unit activity and correlated responses in adjacent cells and contralateral muscles. Journal of Neurophysiology, 36(2), 179-204.
Galea MP and Darian-Smith I (1994). Multiple corticospinal neuron populations in the macaque monkey are specified by their unique cortical origins, spinal terminations, and connections. Cerebral Cortex, 4(2), 166-194.
Georgopoulos AP, Kalaska JF, Caminiti R, and Massey JT (1982). On the relations between the direction of two-dimensional arm movements and cell discharge in primate motor cortex. Journal of Neuroscience, 2(11), 1527-1537.
Georgopoulos AP, Schwartz AB, and Kettner RE (1986). Neuronal population coding of movement direction. Science, 233(4771), 1416-1419.
Ghahramani Z and Hinton G (1997). The EM algorithm for mixtures of factor analyzers. Technical Report CRG-TR-96-1.
Gilja V, Kalmar RS, Santhanam G, Ryu SI, Yu BM, Afshar A, and Shenoy KV (2005). Trial-by-trial mean normalization improves plan period reach target decoding. In Society for Neuroscience Abstract Viewer and Itinerary Planner, 519.18. Washington, DC. Poster presentation.
Godschalk M, Lemon RN, Kuypers HG, and van der Steen J (1985). The involvement of monkey premotor cortex neurones in preparation of visually cued arm movements. Behavioral Brain Research, 18(2), 143-157.
Golub G and Van Loan CF (1983). Matrix Computations. Johns Hopkins University Press, Baltimore, MD, 3rd edition.
Gomez JE, Fu Q, Flament D, and Ebner TJ (2000). Representation of accuracy in the dorsal premotor cortex. European Journal of Neuroscience, 12(10), 3748-3760.
Gur M, Beylin A, and Snodderly DM (1997). Response variability of neurons in primary visual cortex (V1) of alert monkeys. Journal of Neuroscience, 17(8), 2914-2920.
Hanes DP and Schall JD (1996). Neural control of voluntary movement initiation. Science, 274(5286), 427-430.
Harrison R, Watkins P, Kier R, Black D, Normann R, and Solzbacher F (2006). A low-power integrated circuit for a wireless 100 electrode neural recording system. In 2006 IEEE International Conference on Solid-State Circuits Digest of Technical Papers, pages 554-555.
Harrison RR (2003). A low-power integrated circuit for adaptive detection of action potentials in noisy signals. In Proceedings of the 25th Annual International Conference of the IEEE EMBS, pages 3325-3328. Cancun, Mexico.
Harrison RR and Charles C (2003). A low-power low-noise CMOS amplifier for neural recording applications. IEEE Journal of Solid-State Circuits, 38(6), 958-965.
Harrison RR, Santhanam G, and Shenoy KV (2004). Local field potential measurement with low-power analog integrated circuit. In Proceedings of the 26th Annual International Conference of the IEEE EMBS, volume 6, pages 4067-4070. San Francisco, CA.
Hatsopoulos N, Joshi J, and O'Leary JG (2004). Decoding continuous and discrete motor behaviors using motor and premotor cortical ensembles. Journal of Neurophysiology, 92, 1165-1174.
Hinterberger T, Schmidt S, Neumann N, Mellinger J, Blankertz B, Curio G, and Birbaumer N (2004). Brain-computer communication and slow cortical potentials. IEEE Transactions on Biomedical Engineering, 51(6), 1011-1018.
Hochberg LR, Serruya MD, Friehs GM, Mukand JA, Saleh M, Caplan AH, Branner A, Chen D, Penn RD, and Donoghue JP (2006). Neuronal ensemble control of prosthetic devices by a human with tetraplegia. Nature, 442(7099), 164-171.
Hocherman S and Wise SP (1991). Effects of hand movement path on motor cortical activity in awake, behaving rhesus monkeys. Experimental Brain Research, 83(January), 285-302.
Holdefer RN and Miller LE (2002). Primary motor cortical neurons encode functional muscle synergies. Experimental Brain Research, 146(2), 233-243.
Horiuchi T, Swindell T, Sander D, and Abshire P (2004). A low-power CMOS neural amplifier with amplitude measurements for spike sorting. In Proceedings of the 2004 IEEE International Symposium on Circuits and Systems (ISCAS '04), volume 4, pages 29-32. Vancouver, Canada.
Horwitz GD and Newsome WT (2001). Target selection for saccadic eye movements: direction-selective visual responses in the superior colliculus. Journal of Neurophysiology, 86(5), 2527-2542.
Humphrey DR, Schmidt EM, and Thompson WD (1970). Predicting measures of motor performance from multiple cortical spike trains. Science, 170(3959), 758-762.
Isaacs RE, Weber DJ, and Schwartz AB (2000). Work toward real-time control of a cortical neural prosthesis. IEEE Transactions on Rehabilitation Engineering, 8(2), 196-198.
Jackson A, Mavoori J, and Fetz EE (2006a). Correlations between the same motor cortex cells and arm muscles during a trained task, free behavior and natural sleep in the macaque monkey. Journal of Neurophysiology, Epub. doi:10.1152/jn.00710.2006.
Jackson A, Mavoori J, and Fetz EE (2006b). Long-term motor cortex plasticity induced by an electronic neural implant. Nature, 444(7115), 56-60.
Jackson A, Moritz CT, Mavoori J, Lucas TH, and Fetz EE (2006c). The neurochip BCI: towards a neural prosthesis for upper limb function. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 14(2), 187-190.
Johnson PB, Ferraina S, Bianchi L, and Caminiti R (1996). Cortical networks for visual reaching: physiological and anatomical organization of frontal and parietal lobe arm regions. Cerebral Cortex, 6(2), 102-119.
Kakei S, Hoffman DS, and Strick PL (1999). Muscle movement representations in the primary motor cortex. Science, 285(5436), 2136-2139.
Kalaska JF and Crammond DJ (1995). Deciding not to GO: neuronal correlates of response selection in a GO/NOGO task in primate premotor and parietal cortex. Cerebral Cortex, 5(5), 410-428.
Kalmar RS, Gilja V, Santhanam G, Ryu SI, Yu BM, Afshar A, and Shenoy KV (2005). PMd delay activity during rapid sequential movement plans. In Society for Neuroscience Abstract Viewer and Itinerary Planner, 519.17. Washington, DC. Poster presentation.
Kandel ER, Schwartz JH, and Jessell TM (2000). Principles of Neural Science. McGraw-Hill Medical, 4th edition.
Kemere C, Sahani M, and Meng TH (2003). Robust neural decoding of reaching movements for prosthetic systems. In Proceedings of the 25th Annual International Conference of the IEEE EMBS, 6.4.2-3, pages 2079-2082. Cancun, Mexico.
Kemere C, Santhanam G, Yu BM, Ryu SI, Meng TH, and Shenoy KV (2004a). Model-based decoding of reaching movements for prosthetic systems. In Proceedings of the 26th Annual International Conference of the IEEE EMBS, volume 6, pages 4524-4528. San Francisco, CA. doi:10.1109/IEMBS.2004.1404256.
Kemere C, Shenoy KV, and Meng TH (2004b). Model-based neural decoding of reaching movements: a maximum likelihood approach. IEEE Transactions on Biomedical Engineering - Special Issue on Brain-Machine Interfaces, 51(6), 925-932.
Kemere CT, Santhanam G, Yu BM, Shenoy KV, and Meng TH (2002). Decoding of plan and peri-movement neural signals in prosthetic systems. In IEEE Workshop on Signal Processing Systems (SIPS'02), pages 276-283. San Diego, CA.
Kennedy P, Andreasen D, Ehirim P, King B, Kirby T, Mao H, and Moore M (2004). Using human extra-cortical local field potentials to control a switch. Journal of Neural Engineering, 1(2), 72-77.
Kennedy PR and Bakay RAE (1998). Restoration of neural output from a paralyzed patient by a direct brain connection. NeuroReport, 9(8), 1707-1711.
Kennedy PR, Bakay RAE, Moore MM, Adams K, and Goldwaithe J (2000). Direct control of a computer from the human central nervous system. IEEE Transactions on Rehabilitation Engineering, 8, 198-202.
Kurata K (1989). Distribution of neurons with set- and movement-related activity before hand and foot movements in the premotor cortex of rhesus monkeys. Experimental Brain Research, 77(2), 245-256.
Kurata K (1993). Premotor cortex of monkeys: set- and movement-related activity reflecting amplitude and direction of wrist movements. Journal of Neurophysiology, 69(1), 187-200.
Leuthardt EC, Schalk G, Wolpaw JR, Ojemann JG, and Moran DW (2004). A brain-computer interface using electrocorticographic signals in humans. Journal of Neural Engineering, 1(2), 63-71.
Lewicki MS (1998). A review of methods for spike sorting: the detection and classification of neural action potentials. Network: Computation in Neural Systems, 9(4), R53-R78.
Liu X, McCreery DB, Bullara LA, and Agnew WF (2006). Evaluation of the stability of intracortical microelectrode arrays. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 14(1), 91-100.
MacKay DJC (2003). Information Theory, Inference, and Learning Algorithms. Cambridge University Press, Cambridge, UK.
Matsumura M and Kubota K (1979). Cortical projection to hand-arm motor area from post-arcuate area in macaque monkeys: a histological study of retrograde transport of horseradish peroxidase. Neuroscience Letters, 11(8), 241-246.
Mavoori J, Jackson A, Diorio C, and Fetz E (2005). An autonomous implantable computer for neural recording and stimulation in unrestrained primates. Journal of Neuroscience Methods, 148(1), 71-77.
Maynard EM, Hatsopoulos NG, Ojakangas CL, Acuna BD, Sanes JN, Normann RA, and Donoghue JP (1999). Neuronal interactions improve cortical population coding of movement direction. Journal of Neuroscience, 19(18), 8083-8093.
Maynard EM, Nordhausen CT, and Normann RA (1997). The Utah intracortical electrode array: a recording structure for potential brain-computer interfaces. Electroencephalography Clinical Neurophysiology, 102(8), 228-239.
McFarland DJ, Sarnacki WA, and Wolpaw JR (2003). Brain-computer interface (BCI) operation: optimizing information transfer rates. Biological Psychology, 63(8), 237-251.
Medendorp WP, Goltz HC, Vilis T, and Crawford JD (2003). Gaze-centered updating of visual space in human parietal cortex. Journal of Neuroscience, 23(15), 6209-6214.
Meng TH, Hung AC, Tsern EK, and Gordon BM (1998). Low-power signal processing system design for wireless applications. IEEE Personal Communications, 5(3), 20-31.
Messier J and Kalaska JF (2000). Covariation of primate dorsal premotor cell activity with direction and amplitude during a memorized-delay reaching task. Journal of Neurophysiology, 84(1), 152-165.
Moran DW and Schwartz AB (1999a). Motor cortical activity during drawing movements: population representation during spiral tracing. Journal of Neurophysiology, 82(5), 2693-2704.
Moran DW and Schwartz AB (1999b). Motor cortical representation of speed and direction during reaching. Journal of Neurophysiology, 82(5), 2676-2692.
Morrow MM and Miller LE (2003). Prediction of muscle activity by populations of sequentially recorded primary motor cortex neurons. Journal of Neurophysiology, 89(4), 2279-2288.
Murmann B and Boser BE (2004). Digitally Assisted Pipeline ADCs. Kluwer Academic Publishers, The Netherlands.
Musallam S, Corneil BD, Greger B, Scherberger H, and Andersen RA (2004). Cognitive control signals for neural prosthetics. Science, 305(5681), 258-262.
Obeid I, Nicolelis ML, and Wolf PD (2004). A multichannel telemetry system for single unit neural recordings. Journal of Neuroscience Methods, 133(1-2), 33-38.
Olds J (1965). Operant conditioning of single unit responses. In 23rd International Congress of Physiological Sciences 1965, pages 372-380. Excerpta Medica Foundation, Tokyo, Japan.
Oweiss KG, Anderson DJ, and Papaefthymiou MM (2003). Optimizing signal coding in neural interface system-on-a-chip modules. In Proceedings of the 25th Annual International Conference of the IEEE EMBS, pages 2216-2219. Cancun, Mexico.
Patil PG, Carmena JM, Nicolelis MAL, and Turner DA (2004). Ensemble recordings of human subcortical neurons as a source of motor control signals for a brain-machine interface. Neurosurgery, 55(1), 27-38.
Pesaran B, Nelson M, and Andersen R (2006). Dorsal premotor neurons encode the relative position of the hand, eye, and goal during reach planning. Neuron, 51(1), 125-134.
Pouget A, Deneve S, and Duhamel JR (2002). A computational perspective on the neural basis of multisensory spatial representations. Nature Reviews Neuroscience, 3(9), 741-747.
Pouzat C, Delescluse M, Viot P, and Diebolt J (2004). Improved spike-sorting by modeling firing statistics and burst-dependent spike amplitude attenuation: a Markov chain Monte Carlo approach. Journal of Neurophysiology, 91(6), 2910-2928.
Riehle A, MacKay WA, and Requin J (1994). Are extent and force independent movement parameters? Preparation- and movement-related neuronal activity in the monkey cortex. Experimental Brain Research, 99(1), 56-74.
Riehle A and Requin J (1989). Monkey primary motor and premotor cortex: single-cell activity related to prior information about direction and extent of an intended movement. Journal of Neurophysiology, 61(3), 534-549.
Riehle A and Requin J (1993). The predictive value for performance speed of preparatory changes in neuronal activity of the monkey motor and premotor cortex. Behavioral Brain Research, 53(1-2), 35-49.
ROWEIS S AND GHAHRAMANI Z (1999). A unifying review of linear gau ssian models. Neural Computation, 11(2), 305-345.
ROWEIS ST (1998). EM algorithms for PCA and SPCA. In MI Jordan, M J Kearns, and SA Solla (Eds.) Advances in Neural Information Processing Systems 10. MIT Press, Cambridge, MA.
SAHANI M (1999). Latent Variable Models for Neural Data Analysis. Ph.D. thesis, Computational and Neural Systems, California Institute of Technology, Pasadena, CA.
SANTHANAM G, CHURCHLAND MM, SAHANI M, AND SHENOY KV (2003). Local field potential activity varies with reach distance, direction, and speed in monkey pre-motor cortex. In Society for Neuroscience Abstract Viewer and Itinerary Planner, 918.1. New Orleans, LA. Poster presentation.
SANTHANAM G, LINDERMAN MD, GILJA V, AFSHAR A, RYU SI, MENG TH, AND SHENOY KV (2006a). HermesB: A continuous neural recording system for freely behaving primates. In preparation for resubmission to IEEE Transactions on Biomedical Engineering.
SANTHANAM G, RYU SI, YU BM, AFSHAR A, AND SHENOY KV (2006b). A high-performance brain-computer interface. Nature, 442(7099), 195-198. doi:10.1038/nature04968.
SANTHANAM G, SAHANI M, RYU SI, AND SHENOY KV (2004). An extensible infrastructure for fully automated spike sorting during online experiments. In Proceedings of the 26th Annual International Conference of the IEEE EMBS, volume 6, pages 4380-4384. San Francisco, CA. doi:10.1109/IEMBS.2004.1404219.
SCHMIDT EM (1980). Single neuron recording from motor cortex as a possible source of signals for control of external devices. Annals of Biomedical Engineering, 8(4-6), 339-349.
SCHWARTZ AB (1992). Motor cortical activity during drawing movements: single-unit activity during sinusoidal tracing. Journal of Neurophysiology, 68(2), 528-541.
SCHWARTZ AB (1993). Motor cortical activity during drawing movements: population representation during sinusoidal tracing. Journal of Neurophysiology, 70(1), 28-36.
SCHWARTZ AB (1994). Direct cortical representation of drawing. Science, 265(5171), 540-542.
SCHWARTZ AB (2004). Cortical neural prosthetics. Annual Review of Neuroscience, 27, 487-507.
SCOTT D, BOSER BE, AND PISTER KSJ (2003). An ultra low-energy ADC for Smart Dust. IEEE Journal of Solid-State Circuits, 38(7), 1123-1129.
SCOTT SH (2006). Neuroscience: converting thoughts into action. Nature, 442(7099), 141-142.
SCOTT SH AND KALASKA JF (1997). Reaching movements with similar hand paths but different arm orientations. I. Activity of individual cells in motor cortex. Journal of Neurophysiology, 77(2), 826-852.
SCOTT SH, SERGIO LE, AND KALASKA JF (1997). Reaching movements with similar hand paths but different arm orientations. II. Activity of individual cells in dorsal premotor cortex and parietal area 5. Journal of Neurophysiology, 78(5), 2413-2426.
SEESE TM, HARASAKI H, SAIDEL GM, AND DAVIES CR (1998). Characterization of tissue morphology, angiogenesis, and temperature in adaptive response of muscle tissue to chronic heating. Lab Investigation, 78(12), 1553-1562.
SERBY H, YOM-TOV E, AND INBAR GF (2005). An improved P300-based brain-computer interface. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 13(1), 89-98.
SERRUYA MD, HATSOPOULOS NG, PANINSKI L, FELLOWS MR, AND DONOGHUE JP (2002). Instant neural control of a movement signal. Nature, 416(6877), 141-142.
SHADLEN MN AND NEWSOME WT (1998). The variable discharge of cortical neurons: implications for connectivity, computation, and information coding. Journal of Neuroscience, 18(10), 3870-3896.
SHANNON CE (1948). A mathematical theory of communication. Bell System Technical Journal, 27, 379-423 and 623-656.
SHEN L AND ALEXANDER GE (1997). Preferential representation of instructed target location versus limb trajectory in dorsal premotor area. Journal of Neurophysiology, 77(3), 1195-1212.
SHENOY KV, MEEKER D, CAO S, KURESHI SA, PESARAN B, MITRA P, BUNEO CA, BATISTA AP, BURDICK JW, AND ANDERSEN RA (2003). Neural prosthetic control signals from plan activity. NeuroReport, 14(4), 591-596.
SHOHAM S, FELLOWS M, AND NORMANN R (2003). Robust, automatic spike sorting using mixtures of multivariate t-distributions. Journal of Neuroscience Methods, 127(2), 111-122.
SHOHAM S, PANINSKI LM, FELLOWS MR, HATSOPOULOS NG, DONOGHUE JP, AND NORMANN RA (2005). Statistical encoding model for a primary motor cortical brain-machine interface. IEEE Transactions on Biomedical Engineering, 52(7), 1313-1322.
SMITH AC AND BROWN EN (2003). Estimating a state-space model from point process observations. Neural Computation, 15(5), 965-991.
SPALDING MC, VELLISTE M, JAROSIEWICZ B, AND SCHWARTZ A (2005). 3-D cortical control of an anthropomorphic robotic arm for reaching and retrieving. In Society for Neuroscience Abstract Viewer and Itinerary Planner, 401.3. Washington, DC.
SUNER S, FELLOWS MR, VARGAS-IRWIN C, NAKATA GK, AND DONOGHUE JP (2005). Reliability of signals from a chronically implanted, silicon-based electrode array in non-human primate primary motor cortex. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 13(4), 524-541.
TANJI J AND EVARTS EV (1976). Anticipatory activity of motor cortex neurons in relation to direction of an intended movement. Journal of Neurophysiology, 39(5), 1062-1068.
TANNE-GARIEPY J, ROUILLER EM, AND BOUSSAOUD D (2002). Parietal inputs to dorsal versus ventral premotor areas in the macaque monkey: evidence for largely segregated visuomotor pathways. Experimental Brain Research, 145(1), 91-103.
TAYLOR DM, HELMS TILLERY SI, AND SCHWARTZ AB (2002). Direct cortical control of 3D neuroprosthetic devices. Science, 296, 1829-1832.
TAYLOR DM, HELMS TILLERY SI, AND SCHWARTZ AB (2003). Information conveyed through brain-control: cursor vs. robot. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 11(2), 195-199.
THACKER NA AND BROMILEY PA (2001). The effects of a square root transform on a Poisson distributed quantity. Technical Report 2001-010.
TKACH DC, REMIER J, AND HATSOPOULOS NG (2005). A hybrid neuromotor brain-machine interface using trajectory and goal state control modes. In Society for Neuroscience Abstract Viewer and Itinerary Planner, 707.11. Washington, DC.
TOLHURST DJ, MOVSHON JA, AND DEAN AF (1983). The statistical reliability of signals in single neurons in cat and monkey visual cortex. Vision Research, 23(8), 775-785.
TRUCCOLO W, EDEN UT, FELLOWS MR, DONOGHUE JP, AND BROWN EN (2004). A point process framework for relating neural spiking activity to spiking history, neural ensemble and extrinsic covariate effects. Journal of Neurophysiology, Epub. doi:10.1152/jn.00697.2004.
VAN BEERS RJ, HAGGARD P, AND WOLPERT DM (2004). The role of execution noise in movement variability. Journal of Neurophysiology, 91(2), 1050-1063.
VYSSOTSKI AL, SERKOV AN, ITSKOV PM, DELL’OMO G, LATANOV AV, WOLFER DP, AND LIPP HP (2006). Miniature neurologgers for flying pigeons: multichannel EEG and action and field potentials in combination with GPS recording. Journal of Neurophysiology, 95(2), 1263-1273.
WATKINS PT, SANTHANAM G, SHENOY KV, AND HARRISON RR (2004). Validation of adaptive threshold spike detector for neural recording. In Proceedings of the 26th Annual International Conference of the IEEE EMBS, volume 6, pages 4079-4082. San Francisco, CA.
WEINRICH M AND WISE SP (1982). The premotor cortex of the monkey. Journal of Neuroscience, 2(9), 1329-1345.
WEINRICH M, WISE SP, AND MAURITZ KH (1984). A neurophysiological study of the premotor cortex in the rhesus monkey. Brain, 107(2), 385-414.
WESSBERG J, STAMBAUGH CR, KRALIK JD, BECK PD, LAUBACH M, CHAPIN JK, KIM J, BIGGS SJ, SRINIVASAN MA, AND NICOLELIS MAL (2000). Real-time prediction of hand trajectory by ensembles of cortical neurons in primates. Nature, 408(6810), 361-365.
WILLIAMS JC, RENNAKER RL, AND KIPKE DR (1999). Stability of chronic multichannel neural recordings: implications for a long term neural interface. Neurocomputing, 26-27, 1069-1076.
WOLPAW JR AND MCFARLAND DJ (2004). Control of a two-dimensional movement signal by a noninvasive brain-computer interface in humans. Proceedings of the National Academy of Sciences of the USA, 101(51), 17849-17854.
WOOD F, BLACK MJ, VARGAS-IRWIN C, FELLOWS M, AND DONOGHUE JP (2004). On the variability of manual spike sorting. IEEE Transactions on Biomedical Engineering, 51(6), 912-918.
WU W, BLACK MJ, MUMFORD D, GAO Y, BIENENSTOCK E, AND DONOGHUE JP (2004). Modeling and decoding motor cortical activity using a switching Kalman filter. IEEE Transactions on Biomedical Engineering, 51(6), 933-942.
WU W, GAO Y, BIENENSTOCK E, DONOGHUE JP, AND BLACK MJ (2006). Bayesian population decoding of motor cortical activity using a Kalman filter. Neural Computation, 18(1), 80-118.
YU BM (2007). Neural Dynamics of Motor Preparation and Execution. Ph.D. thesis, Department of Electrical Engineering, Stanford University, Stanford, CA.
YU BM, AFSHAR A, SANTHANAM G, RYU SI, SHENOY KV, AND SAHANI M (2006a). Extracting dynamical structure embedded in neural activity. In Y Weiss, B Scholkopf, and J Platt (Eds.) Advances in Neural Information Processing Systems 18, pages 1545-1552. MIT Press, Cambridge, MA.
YU BM, KEMERE C, SANTHANAM G, AFSHAR A, RYU SI, MENG TH, SAHANI M, AND SHENOY KV (2006b). Mixture of trajectory models for neural decoding of goal-directed movements. In preparation for resubmission to Journal of Neurophysiology Innovative Methodology.
YU BM, RYU SI, SANTHANAM G, CHURCHLAND MM, AND SHENOY KV (2004). Improving neural prosthetic system performance by combining plan and peri-movement activity. In Proceedings of the 26th Annual International Conference of the IEEE EMBS, volume 6, pages 4516-4519. San Francisco, CA. doi:10.1109/IEMBS.2004.1404254.
ZHANG K, GINZBURG I, MCNAUGHTON B, AND SEJNOWSKI TJ (1998). Interpreting neuronal population activity by reconstruction: unified framework with application to hippocampal place cells. Journal of Neurophysiology, 79(2), 1017-1044.
ZIPSER D AND ANDERSEN RA (1988). A back-propagation programmed network that simulates response properties of a subset of posterior parietal neurons. Nature, 331(6158), 679-684.
ZUMSTEG ZS, KEMERE C, O’DRISCOLL S, SANTHANAM G, AHMED RE, SHENOY KV, AND MENG TH (2005). Power feasibility of implantable digital spike sorting circuits for neural prosthetic systems. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 13(3), 272-279. doi:10.1109/TNSRE.2005.854307.