Upload
etienne
View
213
Download
0
Embed Size (px)
Citation preview
Effect of sensory experience on motor learningstrategyShou-Han Zhou ∗, Denny Oetomo ∗ , Ying Tan ∗ , Iven Mareels ∗ and Etienne Burdet †
∗Melbourne School of Engineering, The University of Melbourne, Melbourne, Victoria Australia, and †Department of Bioengineering, Imperial College of Science, Technology
and Medicine, London, United Kingdom
Submitted to Journal of Neurophysiology
Shou-Han Zhou, Denny Oetomo, Ying Tan, Iven Mareels, Etienne1
Burdet It is well known that the central nervous system automati-2
cally reduces a mismatch in the visuo-motor coordination. Can the3
underlying learning strategy be modified by environmental factors4
or a subject’s learning experiences? To elucidate this matter, two5
groups of subjects learned to execute reaching arm movements in6
environments with task-irrelevant visual cues. However, one group7
had previous experience of learning these movements using task-8
relevant visual cues. The results demonstrate that the two groups9
used different learning strategies for the same visual environment,10
and that the learning strategy was influenced by prior learning ex-11
perience.12
Human Motor Learning; Task relevant and irrelevant sensory feedback; Sensorimotor13
learning strategies14
Introduction15
Humans have the ability to learn with visual deformations16
effectively, as was demonstrated through the use of prismatic17
glasses (Helmholtz and Southall, 1925; Harris, 1963; Redding18
and Wallace, 1996; Pisella et al., 2006; Michel et al., 2007).19
To systematically analyze visuo-motor coordination learning,20
recent works have observed modifications to arm reaching21
movements when visual feedback is affected during the move-22
ment (Flanagan et al., 1999; Krakauer et al., 2000; Scheidt23
et al., 2005). This learning was interpreted by processes in-24
volving sensory prediction (Tseng et al., 2007; Sarlegna and25
Sainburg, 2009; Wei and Kording, 2010; Marko et al., 2012;26
Schaefer et al., 2012) and adjustment to feed-forward muscle27
inputs (Wang, 2005; Shabbott and Sainburg, 2010). In all28
cases, compensation for the mismatch between the hand and29
the cursor movements was modeled as the gradual modifica-30
tion of the planned trajectory across trials (Wolpert et al.,31
2011; Haith and Krakauer, 2013; Seidler et al., 2013). This32
minimization of the error between the hand and the visual33
target can be interpreted as the optimization of a particular34
set of factors associated with the human and with the environ-35
ment (Todorov and Jordan, 2002; Criscimagna-Hemminger36
et al., 2010).37
However, a mismatch of the visuo-motor coordination is38
not always gradually reduced over trials. In particular, sub-39
jects have been observed to learn through switching of the40
planned movements, as is evident from observation that hu-41
mans are able to learn multiple visuo-motor rotations at one42
time (Ganesh and Burdet, 2013), or through the strategy43
of ignoring the visual disturbances that do not affect the44
task outcome, (i.e. task-irrelevant disturbances) (Franklin45
and Wolpert, 2008; van Beers et al., 2013).46
While the works discussed above show how the mismatch47
of hand and visual target are reduced by learning, this paper48
investigates whether this mismatch can affect the learning49
strategy itself. The above learning strategies/processes may50
rely on visual reflexes as proposed in recent works (Day and51
Lyon, 2000; Saijo et al., 2005; Franklin and Wolpert, 2008;52
Franklin et al., 2012), i.e. involuntary motor responses oppos-53
ing the mismatch between the hand and the visual cursor. In-54
terestingly, these visual reflexes can be inhibited by the CNS55
in carefully designed environments (Franklin and Wolpert,56
2008)). Therefore, could the choice of a strategy, e.g. gradual57
adaptation of planned movement or switching between distinct58
planed movements, be affected by the type of visual environ-59
ment provided? To address this question, we designed an60
experiment in which two groups of subjects performed reach-61
ing movements in a visual environment with a task-irrelevant62
deformation, where one group was previously trained in an-63
other visual environment producing a task-relevant deforma-64
tion. We analyzed the resulting behavior and adaptation.65
The results demonstrate that the task-relevant errors affect66
the subjects’ learning strategy, yielding different learning be-67
haviors.68
Experiment69
Eight right-handed subjects (aged 21 − 42, with 4 females)70
with no reported neurological disorders participated in the71
study as the first group (G1 group). A group of six subjects72
(aged 23 − 40, with 3 females) participated as the second73
group (G2 group). The study was approved by the Impe-74
rial College ethics committee and the subjects gave written75
consent prior to performing the experiment.76
Setup.The apparatus setup for the experiment is shown in77
Figure 1. The robot is a stiff four-bar linkage offering little78
resistance to motion. It is equipped with optical encoders to79
measure the joints angle at a sampling rate of 1 kHz. Each80
human subject is required to sit on a chair while his/her hand81
is strapped to a cuff attached to the robot end effector, which82
prevents any wrist movement and provides support to the83
arm against gravity. The subject’s arm is therefore restricted84
to planar movements and can be modeled as a two bar serial85
linkage with revolute joints at the end of each link. To pre-86
vent movement of the upper body, each subject is required to87
rest against a head-rest which is fixed onto the robot frame.88
The hand movement is recorded in Cartesian coordinates89
[xH yH ]T ∈ R2 relative to the shoulder. The cursor position90
[xC yC ]T ∈ R2 on the computer screen is reflected from a91
mirror which removes the subject’s hand from his/her field of92
vision, enabling the experimenter to generate any computer-93
controlled visual distortions by modifying the cursor position94
from the actual hand position. Both the cursor and the hand95
movements are recorded at 200 Hz.96
Protocol.The experiment task consists of performing target97
reaching movements with the right arm from the start posi-98
tion located at (−15, 15) cm (in front of the subject’s chest)99
to a 1 cm radius target 15 cm away in the y direction (Figure100
1). The arm motion is performed on a plane approximately101
10 cm below the subject’s shoulder level.102
Address for reprint requests and other correspondence: Etienne Burdet, Department of Bioengi-neering, Imperial College of Science, Technology and Medicine (Email: [email protected])
Journal of Neurophysiology 1
Articles in PresS. J Neurophysiol (November 26, 2014). doi:10.1152/jn.00470.2014
Copyright © 2014 by the American Physiological Society.
Before each trial, the target and the cursor appear and103
the robot ceases to apply any force, enabling the subject to104
perform free movement. After each trial, the cursor disap-105
pears and the robot moves the subject’s hand back to the106
starting position for the next trial. In this way, no visual107
feedback of the hand is provided to the subject at the end of108
each movement, preventing him or her from easily noticing109
the discrepancies between the hand and the cursor positions110
(Franklin et al., 2008).111
If the hand reaches the target in 700 ± 100 ms, the tar-112
get displays a ripple and the movement is rewarded with one113
point, as shown in Figure 2A. If the movement is too fast or114
too slow, there is no reward and the target’s color is modified115
accordingly (Figure 2A). The subjects are required to obtain116
a score of 100 points (G1 group) or 180 points (G2 group) in117
order to complete the experiment. The subjects are informed118
that their movements may be affected during the experiment.119
Two types of visual environments are provided to the sub-jects (Figure 2B). In environment 1 (VE1) the cursor position(xC yC)T is related to the hand position (xH yH)T as(
xC
yC
)=
(0.1yH + xC(0)
yH
)[1]
where yH represents the hand’s velocity in the y direction and120
xC(0) = xH(0) is the starting cursor position in the x direc-121
tion. Under this environment, the cursor and the hand start122
at the same position. When the subject begins to move to-123
wards the target, the cursor deviates from the subject’s hand124
trajectory in the x direction proportionally to the speed of125
the hand movement in the y direction. At the end of the trial,126
the subject’s hand stops moving and the cursor settles on the127
line joining the centers of the starting point and the target.128
Therefore, there is no error between the cursor’s final posi-129
tion and the target in the x direction, i.e. this environment130
does not induce any end-point error in the x direction.131
Environment 2 (VE2) is implemented as(xC
yC
)=
(xH + 0.1yH
yH
)[2]
In this environment, the cursor deviates from the hand as132
the subject reaches for the target. The cursor then settles133
at the subject’s hand at the end of the trial when the sub-134
ject’s hand stops moving. In this environment, deviations of135
the hand from the target are therefore reflected on the screen136
which the subject is required to minimize. This is in contrast137
to the first environment (VE1).138
Probe trials are used to observe changes in the planned139
movement. In these trials, the visual cursor is turned off, so140
that subjects have no visual feedback during the movement,141
but can see the position reached by their hand position when142
they have completed the movement.143
Before the experiment, the subjects are informed that144
changes will occur during the experiment but are not in-145
formed of the form of the changes nor when the changes would146
take place. The subjects of both groups progress through147
different phases of the experiment according to the protocols148
given in Figures 2C and 2D, respectively. As a simple reward,149
the subjects progress through different phases of the exper-150
iment by completing a given number of successful trials i.e.151
trials in which the target is reached in the suitable duration.152
This way, the subjects are required to have fully learned to153
perform the task in the given environment before they can154
progress to the next phase.155
The subjects in the G1 group perform the arm reaching156
movements in VE1 according to the protocol given in Figure157
2C. The starting phase consists of trials without visual de-158
formation, during which the subjects can experience the task159
and the robot dynamics. After 30 successful trials, VE1 is ac-160
tivated for the unsuspecting subject. In the subsequent learn-161
ing phase, the subjects carry out the trials until they have162
produced 25 successful trials. This is followed by a learned163
phase with a 25 successful trials target during which probe164
trials are randomly integrated. Finally, a washout phase is165
applied with a target of 20 successful trials in which the en-166
vironment is turned off. Learning effects can be observed by167
comparing the trials of the washout phase with the trials of168
the starting phase.169
The subjects in the G2 group are required to perform arm170
reaching movements in VE2 before completing movements in171
VE1 (Figure 2D). The subjects of the G2 group are required172
to learn VE2 with the same protocol as the subjects of the173
G1 group. The subsequent learning of VE1 occurs immedi-174
ately after the washout phase of VE2. The washout phase of175
VE1 is therefore used to observe the learning of VE2 and is176
also used as the starting phase for the subsequent learning of177
VE1.178
Data Analysis .The data of the hand position is collected179
during the experiment. The hand velocity is computed using180
numerical differentiation followed by a fifth-order zero phase181
Butterworth low pass filter with a 30 Hz cut-off frequency.182
To filter out any movement due to motor noise, the start183
and the end of the recorded movement are determined from184
a velocity threshold of 0.03 m/s as in (Tseng et al., 2007).185
Any movement below this threshold is removed, enabling a186
comprehensive analysis of the subjects’ movement.187
Four measures are used to analyze learning:188
1. The absolute hand path area and the absolute cursor path189
area of each trial are defined as the area delimited by the190
hand path (Burdet et al., 2001), as shown in Figure 3A:191
SH =
N∑i=1
∣∣∣xH(i)− xH(0)∣∣∣ ∣∣∣yH(i)
∣∣∣ [3]
SC =N∑i=1
∣∣∣xC(i)− xC(0)∣∣∣ ∣∣∣yC(i)
∣∣∣ [4]
where N is the total number of points collected during the192
trial.193
2. The hand initial direction αH and the cursor initial direc-tion αC are computed as the direction of the hand and thecursor for the first quarter of the trajectory (Figure 3B)
αCk = arctan
(yC(N/4)− yC(0)
xC(N/4)− xC(0)
)αCk = −|αC
k | [5]
αHk = arctan
(yH(N/4)− yH(0)
xH(N/4)− xH(0)
)αHk = −(sign(αC
k )αHk ) [6]
With this definition, the cursor is always in the negative194
direction while a positive hand direction indicates that the195
hand is moving away from the cursor and a negative hand196
direction indicates that it is moving towards the cursor.197
3. Similarly, the hand final direction βH and the cursor finaldirection βC are defined as the directions of the hand andcursor positions at the end of the movement relative to
2 Journal of Neurophysiology
the start position (Figure 3C)
βCk = arctan
(yC(N)− yC(0)
xC(N)− xC(0)
)βCk = −|βC
k | [7]
βHk = arctan
(yH(N)− yH(0)
xH(N)− xH(0)
)βHk = −|sign(βC
k )βHk | [8]
4. Finally, the difference between the absolute hand-path error198
and the absolute cursor error is defined as199
SE = SH − SC [9]
where SH and SC are the absolute hand and cursor error200
defined in (3) and (4) respectively.201
All trials are considered in the results analysis. In order to202
compare the performances between different subjects, a spline203
is first fit to the data of each subject across trials. This spline204
is then used to interpolate between trials in order to generate205
200 trials per phase.206
Results207
The results of typical subjects in each group are first de-208
scribed and then systematically analyzed to identify the rel-209
evant learning patterns.210
Evolution of hand and cursor movement paths.The subjects211
of the G2 group performed more trials than that of the G1212
group (a mean of 274 for the G1 group and 388 for the G2213
group). However, the two groups have large standard devia-214
tion (std = 140 for the G1 group and std = 123 for the the215
G2 group). To compare the results of the two groups, the216
movements of one representative subject of each group are217
shown in Figure 4. The subjects performed a similar number218
of trials (364 and 387 trials, respectively).219
In the Null Field, the hand paths made by subject 1 of the220
G1 group and by subject 2 of the G2 group join the starting221
point and the target in approximately a straight line (Figure222
4, Null Field row).223
In the task-relevant environment VE2, subject 2 initially224
moves in the opposite direction to the deformation (Figure225
4B, Initial 10 Movements), before learning to move in the226
same direction as the cursor movement (Figure 4B, Final 10227
Movements). This behavior is maintained in the probe trials228
(Figure 4B, Probe Trial). When the visual deformation of229
VE2 is turned off, the subject quickly reverts to the straight-230
line trajectory (Figure 4B, Washout).231
In the task-irrelevant environment VE1, it was observed232
that subject 1’s hand immediately deviates from the straight233
line trajectory (Figure 4A, Initial 10 Movements). The hand234
path continues to drift with consecutive trials in the oppo-235
site direction to the visual deformation (Figure 4A, Final 10236
Movements). This behavior is not modified in the probe tri-237
als, during which the subjects made the same movements as238
observed when the visual field is turned on (Figure 4A, Probe239
Trials).240
In the same environment VE1, subject 2’s hand path241
moves away from the visual deformation on the very first242
trials (Figure 4C, Initial 10 Movements). However, unlike243
the results of subject 1 in Figure 4A, subject 2 settles on244
moving along the same curve as the visual cursor after suffi-245
ciently many trials, similar to his/her movements in the VE2246
environment (Figure 4C, Final 10 Movements). The subject247
maintains the curved movement in the probe trials, result-248
ing in the hand reaching the actual target (Figure 4C, Probe249
Trials).250
In the washout of VE1, significant adjustments are made251
by subject 1, with the subject returning to the straight line252
trajectory (Figure 4A, Washout). However, the washout tri-253
als of subject 2 in VE1 are not adjusted and the subject con-254
tinues to move along the curved path (Figure 4C, Washout).255
Population behavior.To examine whether the results of sub-256
jects 1 and 2 can be generalized across the population, the257
mean and standard deviation of the first six measures (3)-(8)258
defined in the Data Analysis section (i.e. the absolute hand259
path area, the hand and cursor initial directions and the hand260
and cursor final directions) are plotted against the normal-261
ized trials for groups G1 and G2 in Figure 5A. The bar plots262
of Figure 5B and Figure 5C and the associated t-tests are263
used to examine the significance of the behaviors observed264
between the two groups.265
In VE2, the hand of the G2 group moves away from the266
straight line toward the cursor (Figure 5 A.4). The subjects267
consistently maintain their path with similar curvature in the268
subsequent trials during learning. Furthermore, their hand269
path drifts towards the cursor direction as they learn the270
field while their final hand position is maintained at the tar-271
get (Figure 5 A.5, A.6). This behavior is reflected in the272
movements of subject 2 in Figure 4B.273
In VE1, the G2 group subjects move in the same direction274
as the cursor, as they have done previously in VE2 (Figure275
5 A.7-A.9). This behavior persists until the last few trials,276
where the G2 group’s initial direction begins to move towards277
the straight line (Figure 5 A.8), while the final direction drifts278
slightly away from the cursor trajectory (Figure 5 A.9). The279
G2 group eventually returns to the straight line trajectory in280
the washout trials (Figure 5 A.7-A.9).281
Compared to the respective behaviors in VE2, the perfor-282
mance of the G2 group in VE1 possesses similar hand path283
area as seen in Figure 5B (p > 0.65 for early learning and284
p > 0.39 for late learning). Furthermore, the subjects learn285
similar trajectories in the two environments, with similar ini-286
tial and final directions during late learning (p > 0.2 for ini-287
tial direction and p > 0.3 for final direction). There exists288
little change in the performance measures during the probe289
trials during which the cursor position is removed (Figure290
5C, p > 0.6, p > 0.73, p > 0.5 for the three measures respec-291
tively), suggesting that the learning occurs in a feed-forward292
manner.293
In VE1, the G1 group is observed to move away from the294
straight line in the direction opposite to the cursor (Figure295
5 A.1-A.3). The G1 group’s initial direction increases in the296
direction opposite of the cursor and the direction is main-297
tained throughout the trials (Figure 5 A.2). Similarly, the298
G1 group’s final direction moves away from the straight line299
in the environment, resulting in the subject’s hand failing to300
reach the target (Figure 5 A.3, hand). In this task-irrelevant301
environment, the G1 group is still able to bring the cursor to302
the target and complete the task (Figure 5 A.3, cursor).303
In the washout trials, the G1 group’s movements return304
to the straight line. The fast decrease of the final direction305
(Figure 5 A.3) compared to the slow decrease of the initial306
direction (Figure 5 A.2) is reflected in subject 1’s movements307
in the field VE1 (Figure 4A).308
Compared to the G2 group’s behavior in VE1, the G1309
group demonstrated similar hand path area and final direc-310
tion for early learning (Figure 5B, p > 0.3 and p > 0.05 re-311
spectively). However, in late learning, significant changes are312
observed between the two groups (p < 0.01 and p < 1e − 5313
Journal of Neurophysiology 3
for hand path area and final direction, respectively). The314
hand initial directions of the two groups in the field are con-315
sistently different across trials (p < 0.01 for early learning316
and p < 1e− 5 for late learning).317
Similar to the G2 group, the movement of the G1 group is318
of feed-forward nature, resulting in insignificant change in the319
behavior during the probe trials (Figure 5C, p > 0.7, p > 0.6320
and p > 0.34 for each measure).321
During early washout after learning in VE1, the hand322
path areas are similar between subjects of the G1 and G2323
groups (Figure 5B, p > 0.1), while the initial direction for324
the G1 group is higher than that of the G2 group (p < 0.05).325
However, the final directions are similar between the two326
groups (p > 0.1). This implies that the subjects of group327
G1 made significantly more corrections to their movements328
(through means such as online feedback or motor planning329
adjustments) compared to subjects of the G2 group. In the330
late washout trials, the G1 group’s hand path is adjusted331
such that it is similar to that of the G2 group for all three332
measures (p > 0.05, p > 0.1 and p > 0.9). The initial and333
final directions of the hand paths of both groups are not sig-334
nificantly different from zero (p > 0.05, p > 0.46 for the G1335
group and p > 0.25 p > 0.2 for the G2 group), suggesting336
that the subjects return to the straight line.337
Significance of Error Compensation.To examine error com-338
pensation in human learning, the difference between the hand339
path area and the cursor path area SE is analyzed in Figure 6.340
The G2 group exhibits a nearly instantaneous change when341
either VE1 or VE2 is introduced. The change is maintained342
throughout the trials within the respective environment. On343
the other hand, the G1 group exhibits a gradual change in vi-344
sual environment VE1 in the direction opposite to the change345
made by the G2 group in the same environment.346
To further analyze this behavior, an exponential model347
y = aebk + c [10]
is fitted to the data, where y represents the difference between348
the hand and cursor path area (SE), k is the generated trial349
number and a, b, and c are the coefficients of the fit. The350
initial learning behavior of the subject is reflected in the sum351
of coefficients a and c. The late learning behavior is reflected352
in the coefficient c. Finally, the learning rate of the subjects353
is reflected in the coefficient b. The three coefficients of inter-354
est a + c, b and c are significantly different between the two355
groups.356
The coefficient a+ c is low for the G1 group in VE1 and357
is not significantly different from zero (p > 0.8). For the G2358
group in VE1 and VE2, the coefficient is significantly positive359
(with p < 0.01 and p < 0.05, respectively). This reflects the360
instantaneous increase observed for the G2 group’s learning361
behavior, which is not observed in the G1 group.362
The learning rate b of the G1 group is significantly neg-363
ative (p < 0.05), while the G2 group’s learning rate in VE2364
and VE1 after the increase is not significantly different from365
zero (p > 0.6 and p > 0.1, respectively).366
The coefficient c, which reflects the steady-state of the367
subjects’ learning behavior, is significantly negative for the368
G1 group (p < 0.05) and is significantly positive for the G2369
group in VE2 (p < 0.05).370
Finally, the coefficients a + c and c for the G2 group in371
VE1 are significantly different from that of the G1 group372
(p < 0.01 and p < 0.01, respectively), but are similar to their373
values in VE2 (p > 0.2 and p > 0.05, respectively).374
These results reflect the observation that subjects in the375
G2 group learn to move both in VE2 and VE1 by switching376
to another movement, and subsequently maintain the move-377
ment in the environment. By contrast, subjects in the G1378
group gradually change their movements in VE1, resulting379
in a gradual convergence to the final movement trajectory in380
VE1.381
Discussion382
This study examined whether prior training with task-383
relevant feedback affects the learning strategy used in task-384
irrelevant environment. Two groups of subjects (G1 and G2)385
learned target reaching movements in the task-irrelevant en-386
vironment VE1, where the G2 group had trained previously387
in the task-relevant environment VE2. The results exhibited388
two distinctly different behaviors for the two groups, suggest-389
ing that the learning strategy was affected by previous train-390
ing. In particular, the learning strategy in the task-irrelevant391
environment VE1 was affected by previous training in the392
task-relevant environment VE2. In addition, the large varia-393
tions in the number of trials necessary to complete a learning394
phase observed in the different subjects suggested that the395
observed behavior was independent of the total number of396
trials made by the subjects. The following sections further397
discuss the observed behavior.398
Humans use different learning strategies for the same task.399
When presented with the lateral visual deformation of VE1,400
the G1 group moved in the direction opposite the deforma-401
tion and tended to drift further away from the target over402
trials. In contrast, the G2 group tended to move either in a403
straight line or in the same direction as the cursor in VE1,404
with the hand movements following the cursor movements405
(Figure 5 A.4-A.6).406
The different learning behaviors can be explained by the407
different strategies employed by the two groups to learn the408
task. The G1 group may use visual reflexes to gradually409
compensate for the observed mismatch between the cursor410
and the straight line joining the starting point and the target411
trial after trial, as was proposed in previous works (Franklin412
et al., 2008; Krakauer et al., 2000), resulting in hand moving413
opposite to the direction of the cursor deformation.414
On the other hand, the G2 group’s learning strategy415
seems to consist of switching their planned movements416
(Ganesh and Burdet, 2013) (Figure 6), resulting in their417
hand either ignoring or following the cursor (Figure 5 A.4-418
A.9). This strategy results in the hand either moving along419
a straight line or moving in the same direction as the cursor,420
enabling limited deviation of the hand from the two trajec-421
tories.422
Comparison of learning strategies.The different learning423
strategies are further evidenced by the different learning be-424
haviors made by the two groups in VE1. The G1 group’s425
learning strategy results in the difference between the hand426
and the cursor path area to decay exponentially after an ini-427
tial jump. This is shown from the negative coefficient “b”428
of the G1 group in environment VE1 (Figure 6). Conse-429
quently, the hand gradually drifts from the straight line tra-430
jectory. Drifting was also observed when visual feedback was431
prevented during the movement (Brown et al., 2003; Salaun432
et al., 2009). However in that case no exponential decay was433
observed, implying that the behavior was not associated with434
learning.435
The G2 group’s learning strategy results in the difference436
between the hand and the cursor path area to change almost437
instantaneously when the subject is presented with the novel438
visual environment VE1 (Figure 6, right column). The fact439
4 Journal of Neurophysiology
that the coefficient “b” found for G2 group’s movements in440
VE1 is not significantly different from zero in Figure 6 sug-441
gests that no significant adjustment of the hand or the cursor442
is made in the subsequent trials.443
Learning strategies affected by task-relevant errorsAlthough444
the G2 group’s learning behavior was different from that of445
the G1 group in VE1, it was similar to the G2 group’s learn-446
ing behavior in VE2 (Figure 6). This suggests that the G2447
group used the same learning strategy in VE1 as that used448
in VE2, which was a different strategy than that of the G1449
group in VE1. It is therefore possible that the human sub-450
jects changed their strategy while learning to move in VE2.451
A possible reason for the change is that if subjects use the452
strategy of the G1 group to correct for the visual discrepan-453
cies in VE2, then task-relevant errors are produced (Franklin454
et al., 2008), resulting in the endpoint cursor deviating from455
the target. The subjects are therefore forced to use visual456
reflexes and voluntary visual corrections to adjust the move-457
ment online in order to ensure that the hand reaches the458
target, which is evident in the final direction observed in the459
G2 group in the field (Figure 5 A.6).460
This reliance on visual reflexes caused the subjects to461
switch their planned movement, which allowed them to462
choose either to ignore or to follow the visual disturbances463
in VE2 in order to maintain a relatively straight trajectory464
and succeed in the task, as was observed in Figure 5 A.4-A.6.465
In this light, reaching the target has a higher priority (the466
primary task) than compensating for the visual deformation467
(the secondary task), which is ignored if it conflicts with the468
primary task.469
Over trials, it seemed that the subjects became famil-470
iarized with the new plan such that they relied less on on-471
line adjustments. This is supported by observations that the472
movements of the G2 group are invariant in probe trials, in473
which no cursor is provided to the subjects (Figure 5C), which474
implies that the movements in late learning are feed-forward475
in nature.476
Overall, these observations suggest that task-relevant er-477
rors not only affect subjects’ movement and visual reflexes,478
but also change the learning strategy employed, resulting in479
subjects using different feed-forward commands to achieve480
the task in the same visual field (Figures 5 and 6, VE1 envi-481
ronment).482
Previous investigations have found that human subjects483
change their reliance on feed-forward or feedback informa-484
tion for learning and for motion depending on their previ-485
ous experience (Kagerer et al., 1997; Saijo and Gomi, 2012).486
Current results show that experience can further change the487
strategy which humans use for learning. In particular, sub-488
jects use different learning strategies depending on whether489
they have previously been trained in environments involving490
task-relevant errors.491
Preference of gradual learning strategy In VE1, it was ob-492
served that both learning strategies enabled the subjects to493
succeed in target reaching. Why did the subjects prefer the494
gradual adaptation strategy over the plan switching strategy,495
which they used only if they had been trained in the task-496
relevant environment VE2? Gradual change of the movement497
can be performed automatically trial after trial, for example498
by incorporating visual reflexes experienced in previous trial.499
This control strategy does not require much online control. In500
contrast, the strategy used in the task-relevant environment501
requires switching to another motor plan and coordinating502
the motor command with the ongoing movement and thus503
heavily relies on sensory feedback and online computation504
and control. Therefore, humans may use an Occam’s razor505
approach for learning (MacKay, 2003) in that they do not at-506
tempt to make abrupt switches between plans for learning an507
environment, unless it is necessary to perform the task, since508
such a strategy is more difficult than gradually updating the509
feed-forward command using visual errors.510
ACKNOWLEDGMENTS. We thank Wayne Dailey for his assistance in proof read-511
ing the manuscript.512
References513
Brown, L. E., Rosenbaum, D. A., and Sainburg, R. L. (2003).514
Movement speed effects on limb position drift. Experimen-515
tal Brain Research, 153(2):266–74.516
Burdet, E., Osu, R., Franklin, D. W., Milner, T. E., and517
Kawato, M. (2001). The central nervous system stabilizes518
unstable dynamics by learning optimal impedance. Nature,519
414(6862):446–9.520
Criscimagna-Hemminger, S. E., Bastian, A. J., and Shad-521
mehr, R. (2010). Size of error affects cerebellar contri-522
butions to motor learning. Journal of Neurophysiology,523
103(4):2275–2284.524
Day, B. and Lyon, I. (2000). Voluntary modification of auto-525
matic arm movements evoked by motion of a visual target.526
Experimental Brain Research, 130(2):159–168.527
Flanagan, J. R., Nakano, E., Imamizu, H., Osu, R., Yoshioka,528
T., and Kawato, M. (1999). Composition and decompo-529
sition of internal models in motor learning under altered530
kinematic and dynamic environments. Journal of Neuro-531
science, 19:34–1.532
Franklin, D. W., Burdet, E., Peng Tee, K., Osu, R., Chew,533
C.-M., Milner, T. E., and Kawato, M. (2008). Cns learns534
stable, accurate, and efficient movements using a simple535
algorithm. Journal of Neuroscience, 28(44):11165–11173.536
Franklin, D. W. and Wolpert, D. M. (2008). Specificity of537
reflex adaptation for task-relevant variability. Journal of538
Neuroscience, 28(52):14165–14175.539
Franklin, S., Wolpert, D. M., and Franklin, D. W. (2012). Vi-540
suomotor feedback gains upregulate during the learning of541
novel dynamics. Journal of Neurophysiology, 108(2):467–542
478.543
Ganesh, G. and Burdet, E. (2013). Motor planning explains544
human behaviour in tasks with multiple solutions. Robotics545
and Autonomous Systems, 61(4):362–368.546
Haith, A. M. and Krakauer, J. W. (2013). Model-based547
and model-free mechanisms of human motor learning. In548
Progress in Motor Control, pages 1–21. Springer.549
Harris, C. S. (1963). Adaptation to displaced vision: visual,550
motor, or proprioceptive change? Science, 140(3568):812–551
813.552
Helmholtz, H. v. and Southall, J. (1925). Treatise on physi-553
ological optics. III. The perceptions of vision. New York.554
Kagerer, F. A., Contreras-Vidal, J., and Stelmach, G. E.555
(1997). Adaptation to gradual as compared with sud-556
den visuo-motor distortions. Experimental Brain Research,557
115(3):557–561.558
Krakauer, J. W., Pine, Z. M., Ghilardi, M. F., and Ghez, C.559
(2000). Learning of visuomotor transformations for vecto-560
rial planning of reaching trajectories. Journal of Neuro-561
science, 20(23):8916–8924.562
MacKay, D. J. (2003). Information Theory, Inference and563
Learning Algorithms, volume 1. Cambridge University564
Press, Cambridge, UK.565
Marko, M. K., Haith, A. M., Harran, M. D., and Shadmehr,566
R. (2012). Sensitivity to prediction error in reach adapta-567
tion. Journal of Neurophysiology, 108(6):1752–1763.568
Journal of Neurophysiology 5
Michel, C., Pisella, L., Prablanc, C., Rode, G., and Rossetti,569
Y. (2007). Enhancing visuomotor adaptation by reducing570
error signals: single-step (aware) versus multiple-step (un-571
aware) exposure to wedge prisms. Journal of Cognitive572
Neuroscience, 19(2):341–350.573
Pisella, L., Rode, G., Farne, A., Tilikete, C., and Rossetti, Y.574
(2006). Prism adaptation in the rehabilitation of patients575
with visuo-spatial cognitive disorders. Current Opinion in576
Neurology, 19(6):534–542.577
Redding, G. M. and Wallace, B. (1996). Adaptive spatial578
alignment and strategic perceptual-motor control. Journal579
of Experimental Psychology: Human Perception and Per-580
formance, 22(2):379.581
Saijo, N. and Gomi, H. (2012). Effect of visuomotor-map582
uncertainty on visuomotor adaptation. Journal of Neuro-583
physiology, 107(6):1576–1585.584
Saijo, N., Murakami, I., Nishida, S., and Gomi, H. (2005).585
Large-field visual motion directly induces an involuntary586
rapid manual following response. The Journal of Neuro-587
science, 25(20):4941–4951.588
Salaun, C., Padois, V., and Sigaud, O. (2009). A two-level589
model of anticipation-based motor learning for whole body590
motion. In Anticipatory Behavior in Adaptive Learning591
Systems, pages 229–246. Springer.592
Sarlegna, F. R. and Sainburg, R. L. (2009). The roles of vision593
and proprioception in the planning of reaching movements.594
In Progress in motor control. Springer.595
Schaefer, S. Y., Shelly, I. L., and Thoroughman, K. A. (2012).596
Beside the point: motor adaptation without feedback-597
based error correction in task-irrelevant conditions. Jour-598
nal of Neurophysiology, 107(4):1247–1256.599
Scheidt, R. A., Conditt, M. A., Secco, E. L., Mussa-ivaldi,600
F. A., and Robert, A. (2005). Interaction of visual and pro-601
prioceptive feedback during adaptation of human reaching602
movements. Journal of Neurophysiology, 93(6):3200–3213.603
Seidler, R. D., Kwak, Y., Fling, B. W., and Bernard, J. A.604
(2013). Neurocognitive mechanisms of error-based motor605
learning. Advanced Experiment Medical Biology, 782:39–60.606
Shabbott, B. A. and Sainburg, R. L. (2010). Learning a vi-607
suomotor rotation: simultaneous visual and proprioceptive608
information is crucial for visuomotor remapping. Experi-609
mental Brain Research, 203(1):75–87.610
Todorov, E. and Jordan, M. I. (2002). Optimal feedback611
control as a theory of motor coordination. Nature Neuro-612
science, 5(11):1226–35.613
Tseng, Y.-W., Diedrichsen, J., Krakauer, J. W., Shadmehr,614
R., and Bastian, A. J. (2007). Sensory prediction errors615
drive cerebellum-dependent adaptation of reaching. Jour-616
nal of Neurophysiology, 98(1):54–62.617
van Beers, R. J., Brenner, E., and Smeets, J. B. (2013). Ran-618
dom walk of motor planning in task-irrelevant dimensions.619
Journal of Neurophysiology, 109(4):969–977.620
Wang, J. (2005). Adaptation to visuomotor rotations remaps621
movement vectors, not final positions. Journal of Neuro-622
science, 25(16):4024–4030.623
Wei, K. and Kording, K. (2010). Uncertainty of feedback and624
state estimation determines the speed of motor adaptation.625
Frontiers in Computational Neuroscience.626
Wolpert, D. M., Diedrichsen, J., and Flanagan, J. R. (2011).627
Principles of sensorimotor learning. Nature reviews. Neu-628
roscience, 12(12):739–51.629
6 Journal of Neurophysiology
Group 2
2 cm
15 cm
Hand Path
Cursor Path
Null Field Task Irrelevant
Environment (VE1)Probe Trials
Visual Cues
Missed the target
Too Fast Too SlowGood Time
Award a Point
A
B
Group1
Null Field VE1 Null Field
Null Field Null FieldVE2 VE1
C
D
Null Field
Learning Phase 2
Max Score: 25
Learned Phase 2
Max Score: 25
One probe trialevery 10 trials
Washout Phase 2
Max Score: 30
x
y
Task Relevant Environment (VE2)
Starting Phase
Max Score: 30
One probe trialevery 10 trials
Learning Phase
Max Score: 25
Learned Phase
Max Score: 25
One probe trialevery 10 trials
Washout Phase
Max Score: 20
Starting Phase 1
Max Score: 30
One probe trialevery 10 trials
Learning Phase 1
Max Score: 25
Learned Phase 1
Max Score: 25
One probe trialevery 10 trials
Washout Phase 1and
Starting Phase 2
Max Score: 20
Figure 2
Cursor
Hand
αH βCαC βH
SH SC
(A) (B) (C)
Figure 3
Nul
l Fie
ldBA
Initi
al 1
0 m
ovem
ents
Fina
l 10
mov
emen
tsTask Irrelevant
Environment (VE1)Task Relevant
Environment (VE2)
Group 1 (Subject 1) Group 2 (Subject 2)
Task IrrelevantEnvironment (VE1)
CTask RelevantEnvironment (VE2)
Was
hout
(fi
rst 1
0 m
ovem
ents
)
Cursor Path
Hand Path
1 10Trial
Prob
e Tr
ial
mov
emen
ts
0
5
10
15
− 5 0 5x position (cm)
y p
osi
tio
n (c
m)
Figure 4
−
Generated Trial No
G1 Task Irrelevant Environment (VE1)
A.8
A.7
A.9
800 1000 1200 1400 1600
G2 Task Relevant Environment (VE1)
A.5
A.4
A.6
Cursor
Hand
800 1000 1200 1400 1600
G2 Task Irrelevant Environment (VE2)
0 200 400 600 800
0
0
0
αβ
A.1
A.2
A.3
(deg
rees
) (d
egre
es)
S (c
m2 )
−10
10
20
30
100
−50
50
100
−100
−50
50
100
A
(a)
0
10
20
30
1 2 3 4a1 a2 a3 a4b1 b2 b3 b4
−50
0
50
20
0
20
40
60
Group 2 in VE2
Group 2 in VE1
Group 1 in VE1
* p<0.05** p<0.01*** p<10-3
* **
***
***
***
********* * ***
**
** *** *
***
** * ** *
*
****
***
*** *
***
***
Early Learning Late Learning Early Washout Late Washout
B
SH (c
m2 )
αΗ (
de
gre
es)
βΗ (
de
gre
es)
(b)
0
10
20
30
SH (c
m2 )
−50
0
50
α (
de
gre
es)
−20
0
20
40
60
β (
de
gre
es)
Late Learning Probe Trials
** *
*** ***
*** ***
C
(c)
Figure 5
Gro
up 2
VE2
Generated Trial No
Gro
up 2
VE1
Gro
up 1
VE1
G1VE1
Difference between hand and cursor path area SE=SH-SC
Fitting
* p<0.05** p<0.01*** p<10-3
−20
−10
0
10
20
−20
−10
0
10
20
−20
−10
0
10
20
200 400 600 8000
G2VE2 G2VE1
a+c
−16
−12
−8
−4
02
x 10− 3
b
−40
−20
20
40
c 0
−8
−4
0
4
8
12
*
* **
4
Fitted Coefficients
* ***
*
*** ***
Figure 6
Figure Legends
Figure 1. Setup of the experiment: Subjects perform target reaching movementswhile their hand is attached to the robot, which supports the arm against gravityand can measure hand position.
Figure 2. Experiment protocol.
Figure 3. Definition of the hand and cursor path area (SH and SC), the initialhand and cursor direction (αH , αC) and the final hand and cursor direction(βH , βC), as defined in Equations (3) through (8).
Figure 4. Evolution of cursor trajectories (solid yellow to red lines) and handpath (solid blue to purple lines) for two representative subjects in groups G1and G2. The different effects of VE1 and VE2 are observed in the first twocolumns, while the influence of learning of VE2 on the learning in VE1 is ob-served by comparing the first and third columns. Because the cursor is alignedwith the hand in the null, probe and washout trials, only the hand paths areplotted. Since the robot stops recording when the hand paths velocity falls be-low 0.03 m/s, there are minute discrepancies between the cursor and the targetat the end of the movement in VE1.
Figure 5 Comparison of behaviors across population
Figure 5a. Mean and standard deviation of the three measures for the hand(blue solid line) and the cursor (red solid line) across the normalized trials forall subjects. The different phases are shown in the alternating green and blueshades. The trials made by each subject are spline-fitted and 200 trials aregenerated in each phase, thereby enabling comparison of the performances ofdifferent subjects.
Figure 5b. Comparison of learning performances during the first and last 10movements made by the two groups in visual environment VE1. The asterisks‘*’ indicate the significance level over all subjects. The graph compares theaverage measure made by the subjects of different groups across early and latelearning trials and washout trials. The bars in the graph show the mean ofthe average measures of each group while the error bars provide the standarddeviation.
Figure 5c. Comparison of learning performances during late learning and probetrials for the two groups in visual environment VE1. The asterisks ‘*’ indicatethe significance level over all subjects. The mean and error bars are calculatedin the same way as described in Figure 5b.
Figure 6. SE = SH − SC measure for the G1 and G2 groups in VE1 and VE2.An exponential fit of the measure against the normalized trial number is usedto determine the coefficients a, b, and c shown on the right. For the fit, y istaken as the measure SE while k is the generated trial number