12
Effect of sensory experience on motor learning strategy Shou-Han Zhou * , Denny Oetomo * , Ying Tan * , Iven Mareels * and Etienne Burdet * Melbourne School of Engineering, The University of Melbourne, Melbourne, Victoria Australia, and Department of Bioengineering, Imperial College of Science, Technology and Medicine, London, United Kingdom Submitted to Journal of Neurophysiology Shou-Han Zhou, Denny Oetomo, Ying Tan, Iven Mareels, Etienne 1 Burdet It is well known that the central nervous system automati- 2 cally reduces a mismatch in the visuo-motor coordination. Can the 3 underlying learning strategy be modified by environmental factors 4 or a subject’s learning experiences? To elucidate this matter, two 5 groups of subjects learned to execute reaching arm movements in 6 environments with task-irrelevant visual cues. However, one group 7 had previous experience of learning these movements using task- 8 relevant visual cues. The results demonstrate that the two groups 9 used different learning strategies for the same visual environment, 10 and that the learning strategy was influenced by prior learning ex- 11 perience. 12 Human Motor Learning; Task relevant and irrelevant sensory feedback; Sensorimotor 13 learning strategies 14 Introduction 15 Humans have the ability to learn with visual deformations 16 effectively, as was demonstrated through the use of prismatic 17 glasses (Helmholtz and Southall, 1925; Harris, 1963; Redding 18 and Wallace, 1996; Pisella et al., 2006; Michel et al., 2007). 19 To systematically analyze visuo-motor coordination learning, 20 recent works have observed modifications to arm reaching 21 movements when visual feedback is affected during the move- 22 ment (Flanagan et al., 1999; Krakauer et al., 2000; Scheidt 23 et al., 2005). This learning was interpreted by processes in- 24 volving sensory prediction (Tseng et al., 2007; Sarlegna and 25 Sainburg, 2009; Wei and Kording, 2010; Marko et al., 2012; 26 Schaefer et al., 2012) and adjustment to feed-forward muscle 27 inputs (Wang, 2005; Shabbott and Sainburg, 2010). In all 28 cases, compensation for the mismatch between the hand and 29 the cursor movements was modeled as the gradual modifica- 30 tion of the planned trajectory across trials (Wolpert et al., 31 2011; Haith and Krakauer, 2013; Seidler et al., 2013). This 32 minimization of the error between the hand and the visual 33 target can be interpreted as the optimization of a particular 34 set of factors associated with the human and with the environ- 35 ment (Todorov and Jordan, 2002; Criscimagna-Hemminger 36 et al., 2010). 37 However, a mismatch of the visuo-motor coordination is 38 not always gradually reduced over trials. In particular, sub- 39 jects have been observed to learn through switching of the 40 planned movements, as is evident from observation that hu- 41 mans are able to learn multiple visuo-motor rotations at one 42 time (Ganesh and Burdet, 2013), or through the strategy 43 of ignoring the visual disturbances that do not affect the 44 task outcome, (i.e. task-irrelevant disturbances) (Franklin 45 and Wolpert, 2008; van Beers et al., 2013). 46 While the works discussed above show how the mismatch 47 of hand and visual target are reduced by learning, this paper 48 investigates whether this mismatch can affect the learning 49 strategy itself. The above learning strategies/processes may 50 rely on visual reflexes as proposed in recent works (Day and 51 Lyon, 2000; Saijo et al., 2005; Franklin and Wolpert, 2008; 52 Franklin et al., 2012), i.e. involuntary motor responses oppos- 53 ing the mismatch between the hand and the visual cursor. In- 54 terestingly, these visual reflexes can be inhibited by the CNS 55 in carefully designed environments (Franklin and Wolpert, 56 2008)). Therefore, could the choice of a strategy, e.g. gradual 57 adaptation of planned movement or switching between distinct 58 planed movements, be affected by the type of visual environ- 59 ment provided? To address this question, we designed an 60 experiment in which two groups of subjects performed reach- 61 ing movements in a visual environment with a task-irrelevant 62 deformation, where one group was previously trained in an- 63 other visual environment producing a task-relevant deforma- 64 tion. We analyzed the resulting behavior and adaptation. 65 The results demonstrate that the task-relevant errors affect 66 the subjects’ learning strategy, yielding different learning be- 67 haviors. 68 Experiment 69 Eight right-handed subjects (aged 21 - 42, with 4 females) 70 with no reported neurological disorders participated in the 71 study as the first group (G1 group). A group of six subjects 72 (aged 23 - 40, with 3 females) participated as the second 73 group (G2 group). The study was approved by the Impe- 74 rial College ethics committee and the subjects gave written 75 consent prior to performing the experiment. 76 Setup. The apparatus setup for the experiment is shown in 77 Figure 1. The robot is a stiff four-bar linkage offering little 78 resistance to motion. It is equipped with optical encoders to 79 measure the joints angle at a sampling rate of 1 kHz. Each 80 human subject is required to sit on a chair while his/her hand 81 is strapped to a cuff attached to the robot end effector, which 82 prevents any wrist movement and provides support to the 83 arm against gravity. The subject’s arm is therefore restricted 84 to planar movements and can be modeled as a two bar serial 85 linkage with revolute joints at the end of each link. To pre- 86 vent movement of the upper body, each subject is required to 87 rest against a head-rest which is fixed onto the robot frame. 88 The hand movement is recorded in Cartesian coordinates 89 [x H y H ] T R 2 relative to the shoulder. The cursor position 90 [x C y C ] T R 2 on the computer screen is reflected from a 91 mirror which removes the subject’s hand from his/her field of 92 vision, enabling the experimenter to generate any computer- 93 controlled visual distortions by modifying the cursor position 94 from the actual hand position. Both the cursor and the hand 95 movements are recorded at 200 Hz. 96 Protocol. The experiment task consists of performing target 97 reaching movements with the right arm from the start posi- 98 tion located at (-15, 15) cm (in front of the subject’s chest) 99 to a 1 cm radius target 15 cm away in the y direction (Figure 100 1). The arm motion is performed on a plane approximately 101 10 cm below the subject’s shoulder level. 102 Address for reprint requests and other correspondence: Etienne Burdet, Department of Bioengi- neering, Imperial College of Science, Technology and Medicine (Email: [email protected]) Journal of Neurophysiology 1 Articles in PresS. J Neurophysiol (November 26, 2014). doi:10.1152/jn.00470.2014 Copyright © 2014 by the American Physiological Society.

Effect of sensory experience on motor learning strategy

  • Upload
    etienne

  • View
    213

  • Download
    0

Embed Size (px)

Citation preview

Effect of sensory experience on motor learningstrategyShou-Han Zhou ∗, Denny Oetomo ∗ , Ying Tan ∗ , Iven Mareels ∗ and Etienne Burdet †

∗Melbourne School of Engineering, The University of Melbourne, Melbourne, Victoria Australia, and †Department of Bioengineering, Imperial College of Science, Technology

and Medicine, London, United Kingdom

Submitted to Journal of Neurophysiology

Shou-Han Zhou, Denny Oetomo, Ying Tan, Iven Mareels, Etienne1

Burdet It is well known that the central nervous system automati-2

cally reduces a mismatch in the visuo-motor coordination. Can the3

underlying learning strategy be modified by environmental factors4

or a subject’s learning experiences? To elucidate this matter, two5

groups of subjects learned to execute reaching arm movements in6

environments with task-irrelevant visual cues. However, one group7

had previous experience of learning these movements using task-8

relevant visual cues. The results demonstrate that the two groups9

used different learning strategies for the same visual environment,10

and that the learning strategy was influenced by prior learning ex-11

perience.12

Human Motor Learning; Task relevant and irrelevant sensory feedback; Sensorimotor13

learning strategies14

Introduction15

Humans have the ability to learn with visual deformations16

effectively, as was demonstrated through the use of prismatic17

glasses (Helmholtz and Southall, 1925; Harris, 1963; Redding18

and Wallace, 1996; Pisella et al., 2006; Michel et al., 2007).19

To systematically analyze visuo-motor coordination learning,20

recent works have observed modifications to arm reaching21

movements when visual feedback is affected during the move-22

ment (Flanagan et al., 1999; Krakauer et al., 2000; Scheidt23

et al., 2005). This learning was interpreted by processes in-24

volving sensory prediction (Tseng et al., 2007; Sarlegna and25

Sainburg, 2009; Wei and Kording, 2010; Marko et al., 2012;26

Schaefer et al., 2012) and adjustment to feed-forward muscle27

inputs (Wang, 2005; Shabbott and Sainburg, 2010). In all28

cases, compensation for the mismatch between the hand and29

the cursor movements was modeled as the gradual modifica-30

tion of the planned trajectory across trials (Wolpert et al.,31

2011; Haith and Krakauer, 2013; Seidler et al., 2013). This32

minimization of the error between the hand and the visual33

target can be interpreted as the optimization of a particular34

set of factors associated with the human and with the environ-35

ment (Todorov and Jordan, 2002; Criscimagna-Hemminger36

et al., 2010).37

However, a mismatch of the visuo-motor coordination is38

not always gradually reduced over trials. In particular, sub-39

jects have been observed to learn through switching of the40

planned movements, as is evident from observation that hu-41

mans are able to learn multiple visuo-motor rotations at one42

time (Ganesh and Burdet, 2013), or through the strategy43

of ignoring the visual disturbances that do not affect the44

task outcome, (i.e. task-irrelevant disturbances) (Franklin45

and Wolpert, 2008; van Beers et al., 2013).46

While the works discussed above show how the mismatch47

of hand and visual target are reduced by learning, this paper48

investigates whether this mismatch can affect the learning49

strategy itself. The above learning strategies/processes may50

rely on visual reflexes as proposed in recent works (Day and51

Lyon, 2000; Saijo et al., 2005; Franklin and Wolpert, 2008;52

Franklin et al., 2012), i.e. involuntary motor responses oppos-53

ing the mismatch between the hand and the visual cursor. In-54

terestingly, these visual reflexes can be inhibited by the CNS55

in carefully designed environments (Franklin and Wolpert,56

2008)). Therefore, could the choice of a strategy, e.g. gradual57

adaptation of planned movement or switching between distinct58

planed movements, be affected by the type of visual environ-59

ment provided? To address this question, we designed an60

experiment in which two groups of subjects performed reach-61

ing movements in a visual environment with a task-irrelevant62

deformation, where one group was previously trained in an-63

other visual environment producing a task-relevant deforma-64

tion. We analyzed the resulting behavior and adaptation.65

The results demonstrate that the task-relevant errors affect66

the subjects’ learning strategy, yielding different learning be-67

haviors.68

Experiment69

Eight right-handed subjects (aged 21 − 42, with 4 females)70

with no reported neurological disorders participated in the71

study as the first group (G1 group). A group of six subjects72

(aged 23 − 40, with 3 females) participated as the second73

group (G2 group). The study was approved by the Impe-74

rial College ethics committee and the subjects gave written75

consent prior to performing the experiment.76

Setup.The apparatus setup for the experiment is shown in77

Figure 1. The robot is a stiff four-bar linkage offering little78

resistance to motion. It is equipped with optical encoders to79

measure the joints angle at a sampling rate of 1 kHz. Each80

human subject is required to sit on a chair while his/her hand81

is strapped to a cuff attached to the robot end effector, which82

prevents any wrist movement and provides support to the83

arm against gravity. The subject’s arm is therefore restricted84

to planar movements and can be modeled as a two bar serial85

linkage with revolute joints at the end of each link. To pre-86

vent movement of the upper body, each subject is required to87

rest against a head-rest which is fixed onto the robot frame.88

The hand movement is recorded in Cartesian coordinates89

[xH yH ]T ∈ R2 relative to the shoulder. The cursor position90

[xC yC ]T ∈ R2 on the computer screen is reflected from a91

mirror which removes the subject’s hand from his/her field of92

vision, enabling the experimenter to generate any computer-93

controlled visual distortions by modifying the cursor position94

from the actual hand position. Both the cursor and the hand95

movements are recorded at 200 Hz.96

Protocol.The experiment task consists of performing target97

reaching movements with the right arm from the start posi-98

tion located at (−15, 15) cm (in front of the subject’s chest)99

to a 1 cm radius target 15 cm away in the y direction (Figure100

1). The arm motion is performed on a plane approximately101

10 cm below the subject’s shoulder level.102

Address for reprint requests and other correspondence: Etienne Burdet, Department of Bioengi-neering, Imperial College of Science, Technology and Medicine (Email: [email protected])

Journal of Neurophysiology 1

Articles in PresS. J Neurophysiol (November 26, 2014). doi:10.1152/jn.00470.2014

Copyright © 2014 by the American Physiological Society.

Before each trial, the target and the cursor appear and103

the robot ceases to apply any force, enabling the subject to104

perform free movement. After each trial, the cursor disap-105

pears and the robot moves the subject’s hand back to the106

starting position for the next trial. In this way, no visual107

feedback of the hand is provided to the subject at the end of108

each movement, preventing him or her from easily noticing109

the discrepancies between the hand and the cursor positions110

(Franklin et al., 2008).111

If the hand reaches the target in 700 ± 100 ms, the tar-112

get displays a ripple and the movement is rewarded with one113

point, as shown in Figure 2A. If the movement is too fast or114

too slow, there is no reward and the target’s color is modified115

accordingly (Figure 2A). The subjects are required to obtain116

a score of 100 points (G1 group) or 180 points (G2 group) in117

order to complete the experiment. The subjects are informed118

that their movements may be affected during the experiment.119

Two types of visual environments are provided to the sub-jects (Figure 2B). In environment 1 (VE1) the cursor position(xC yC)T is related to the hand position (xH yH)T as(

xC

yC

)=

(0.1yH + xC(0)

yH

)[1]

where yH represents the hand’s velocity in the y direction and120

xC(0) = xH(0) is the starting cursor position in the x direc-121

tion. Under this environment, the cursor and the hand start122

at the same position. When the subject begins to move to-123

wards the target, the cursor deviates from the subject’s hand124

trajectory in the x direction proportionally to the speed of125

the hand movement in the y direction. At the end of the trial,126

the subject’s hand stops moving and the cursor settles on the127

line joining the centers of the starting point and the target.128

Therefore, there is no error between the cursor’s final posi-129

tion and the target in the x direction, i.e. this environment130

does not induce any end-point error in the x direction.131

Environment 2 (VE2) is implemented as(xC

yC

)=

(xH + 0.1yH

yH

)[2]

In this environment, the cursor deviates from the hand as132

the subject reaches for the target. The cursor then settles133

at the subject’s hand at the end of the trial when the sub-134

ject’s hand stops moving. In this environment, deviations of135

the hand from the target are therefore reflected on the screen136

which the subject is required to minimize. This is in contrast137

to the first environment (VE1).138

Probe trials are used to observe changes in the planned139

movement. In these trials, the visual cursor is turned off, so140

that subjects have no visual feedback during the movement,141

but can see the position reached by their hand position when142

they have completed the movement.143

Before the experiment, the subjects are informed that144

changes will occur during the experiment but are not in-145

formed of the form of the changes nor when the changes would146

take place. The subjects of both groups progress through147

different phases of the experiment according to the protocols148

given in Figures 2C and 2D, respectively. As a simple reward,149

the subjects progress through different phases of the exper-150

iment by completing a given number of successful trials i.e.151

trials in which the target is reached in the suitable duration.152

This way, the subjects are required to have fully learned to153

perform the task in the given environment before they can154

progress to the next phase.155

The subjects in the G1 group perform the arm reaching156

movements in VE1 according to the protocol given in Figure157

2C. The starting phase consists of trials without visual de-158

formation, during which the subjects can experience the task159

and the robot dynamics. After 30 successful trials, VE1 is ac-160

tivated for the unsuspecting subject. In the subsequent learn-161

ing phase, the subjects carry out the trials until they have162

produced 25 successful trials. This is followed by a learned163

phase with a 25 successful trials target during which probe164

trials are randomly integrated. Finally, a washout phase is165

applied with a target of 20 successful trials in which the en-166

vironment is turned off. Learning effects can be observed by167

comparing the trials of the washout phase with the trials of168

the starting phase.169

The subjects in the G2 group are required to perform arm170

reaching movements in VE2 before completing movements in171

VE1 (Figure 2D). The subjects of the G2 group are required172

to learn VE2 with the same protocol as the subjects of the173

G1 group. The subsequent learning of VE1 occurs immedi-174

ately after the washout phase of VE2. The washout phase of175

VE1 is therefore used to observe the learning of VE2 and is176

also used as the starting phase for the subsequent learning of177

VE1.178

Data Analysis .The data of the hand position is collected179

during the experiment. The hand velocity is computed using180

numerical differentiation followed by a fifth-order zero phase181

Butterworth low pass filter with a 30 Hz cut-off frequency.182

To filter out any movement due to motor noise, the start183

and the end of the recorded movement are determined from184

a velocity threshold of 0.03 m/s as in (Tseng et al., 2007).185

Any movement below this threshold is removed, enabling a186

comprehensive analysis of the subjects’ movement.187

Four measures are used to analyze learning:188

1. The absolute hand path area and the absolute cursor path189

area of each trial are defined as the area delimited by the190

hand path (Burdet et al., 2001), as shown in Figure 3A:191

SH =

N∑i=1

∣∣∣xH(i)− xH(0)∣∣∣ ∣∣∣yH(i)

∣∣∣ [3]

SC =N∑i=1

∣∣∣xC(i)− xC(0)∣∣∣ ∣∣∣yC(i)

∣∣∣ [4]

where N is the total number of points collected during the192

trial.193

2. The hand initial direction αH and the cursor initial direc-tion αC are computed as the direction of the hand and thecursor for the first quarter of the trajectory (Figure 3B)

αCk = arctan

(yC(N/4)− yC(0)

xC(N/4)− xC(0)

)αCk = −|αC

k | [5]

αHk = arctan

(yH(N/4)− yH(0)

xH(N/4)− xH(0)

)αHk = −(sign(αC

k )αHk ) [6]

With this definition, the cursor is always in the negative194

direction while a positive hand direction indicates that the195

hand is moving away from the cursor and a negative hand196

direction indicates that it is moving towards the cursor.197

3. Similarly, the hand final direction βH and the cursor finaldirection βC are defined as the directions of the hand andcursor positions at the end of the movement relative to

2 Journal of Neurophysiology

the start position (Figure 3C)

βCk = arctan

(yC(N)− yC(0)

xC(N)− xC(0)

)βCk = −|βC

k | [7]

βHk = arctan

(yH(N)− yH(0)

xH(N)− xH(0)

)βHk = −|sign(βC

k )βHk | [8]

4. Finally, the difference between the absolute hand-path error198

and the absolute cursor error is defined as199

SE = SH − SC [9]

where SH and SC are the absolute hand and cursor error200

defined in (3) and (4) respectively.201

All trials are considered in the results analysis. In order to202

compare the performances between different subjects, a spline203

is first fit to the data of each subject across trials. This spline204

is then used to interpolate between trials in order to generate205

200 trials per phase.206

Results207

The results of typical subjects in each group are first de-208

scribed and then systematically analyzed to identify the rel-209

evant learning patterns.210

Evolution of hand and cursor movement paths.The subjects211

of the G2 group performed more trials than that of the G1212

group (a mean of 274 for the G1 group and 388 for the G2213

group). However, the two groups have large standard devia-214

tion (std = 140 for the G1 group and std = 123 for the the215

G2 group). To compare the results of the two groups, the216

movements of one representative subject of each group are217

shown in Figure 4. The subjects performed a similar number218

of trials (364 and 387 trials, respectively).219

In the Null Field, the hand paths made by subject 1 of the220

G1 group and by subject 2 of the G2 group join the starting221

point and the target in approximately a straight line (Figure222

4, Null Field row).223

In the task-relevant environment VE2, subject 2 initially224

moves in the opposite direction to the deformation (Figure225

4B, Initial 10 Movements), before learning to move in the226

same direction as the cursor movement (Figure 4B, Final 10227

Movements). This behavior is maintained in the probe trials228

(Figure 4B, Probe Trial). When the visual deformation of229

VE2 is turned off, the subject quickly reverts to the straight-230

line trajectory (Figure 4B, Washout).231

In the task-irrelevant environment VE1, it was observed232

that subject 1’s hand immediately deviates from the straight233

line trajectory (Figure 4A, Initial 10 Movements). The hand234

path continues to drift with consecutive trials in the oppo-235

site direction to the visual deformation (Figure 4A, Final 10236

Movements). This behavior is not modified in the probe tri-237

als, during which the subjects made the same movements as238

observed when the visual field is turned on (Figure 4A, Probe239

Trials).240

In the same environment VE1, subject 2’s hand path241

moves away from the visual deformation on the very first242

trials (Figure 4C, Initial 10 Movements). However, unlike243

the results of subject 1 in Figure 4A, subject 2 settles on244

moving along the same curve as the visual cursor after suffi-245

ciently many trials, similar to his/her movements in the VE2246

environment (Figure 4C, Final 10 Movements). The subject247

maintains the curved movement in the probe trials, result-248

ing in the hand reaching the actual target (Figure 4C, Probe249

Trials).250

In the washout of VE1, significant adjustments are made251

by subject 1, with the subject returning to the straight line252

trajectory (Figure 4A, Washout). However, the washout tri-253

als of subject 2 in VE1 are not adjusted and the subject con-254

tinues to move along the curved path (Figure 4C, Washout).255

Population behavior.To examine whether the results of sub-256

jects 1 and 2 can be generalized across the population, the257

mean and standard deviation of the first six measures (3)-(8)258

defined in the Data Analysis section (i.e. the absolute hand259

path area, the hand and cursor initial directions and the hand260

and cursor final directions) are plotted against the normal-261

ized trials for groups G1 and G2 in Figure 5A. The bar plots262

of Figure 5B and Figure 5C and the associated t-tests are263

used to examine the significance of the behaviors observed264

between the two groups.265

In VE2, the hand of the G2 group moves away from the266

straight line toward the cursor (Figure 5 A.4). The subjects267

consistently maintain their path with similar curvature in the268

subsequent trials during learning. Furthermore, their hand269

path drifts towards the cursor direction as they learn the270

field while their final hand position is maintained at the tar-271

get (Figure 5 A.5, A.6). This behavior is reflected in the272

movements of subject 2 in Figure 4B.273

In VE1, the G2 group subjects move in the same direction274

as the cursor, as they have done previously in VE2 (Figure275

5 A.7-A.9). This behavior persists until the last few trials,276

where the G2 group’s initial direction begins to move towards277

the straight line (Figure 5 A.8), while the final direction drifts278

slightly away from the cursor trajectory (Figure 5 A.9). The279

G2 group eventually returns to the straight line trajectory in280

the washout trials (Figure 5 A.7-A.9).281

Compared to the respective behaviors in VE2, the perfor-282

mance of the G2 group in VE1 possesses similar hand path283

area as seen in Figure 5B (p > 0.65 for early learning and284

p > 0.39 for late learning). Furthermore, the subjects learn285

similar trajectories in the two environments, with similar ini-286

tial and final directions during late learning (p > 0.2 for ini-287

tial direction and p > 0.3 for final direction). There exists288

little change in the performance measures during the probe289

trials during which the cursor position is removed (Figure290

5C, p > 0.6, p > 0.73, p > 0.5 for the three measures respec-291

tively), suggesting that the learning occurs in a feed-forward292

manner.293

In VE1, the G1 group is observed to move away from the294

straight line in the direction opposite to the cursor (Figure295

5 A.1-A.3). The G1 group’s initial direction increases in the296

direction opposite of the cursor and the direction is main-297

tained throughout the trials (Figure 5 A.2). Similarly, the298

G1 group’s final direction moves away from the straight line299

in the environment, resulting in the subject’s hand failing to300

reach the target (Figure 5 A.3, hand). In this task-irrelevant301

environment, the G1 group is still able to bring the cursor to302

the target and complete the task (Figure 5 A.3, cursor).303

In the washout trials, the G1 group’s movements return304

to the straight line. The fast decrease of the final direction305

(Figure 5 A.3) compared to the slow decrease of the initial306

direction (Figure 5 A.2) is reflected in subject 1’s movements307

in the field VE1 (Figure 4A).308

Compared to the G2 group’s behavior in VE1, the G1309

group demonstrated similar hand path area and final direc-310

tion for early learning (Figure 5B, p > 0.3 and p > 0.05 re-311

spectively). However, in late learning, significant changes are312

observed between the two groups (p < 0.01 and p < 1e − 5313

Journal of Neurophysiology 3

for hand path area and final direction, respectively). The314

hand initial directions of the two groups in the field are con-315

sistently different across trials (p < 0.01 for early learning316

and p < 1e− 5 for late learning).317

Similar to the G2 group, the movement of the G1 group is318

of feed-forward nature, resulting in insignificant change in the319

behavior during the probe trials (Figure 5C, p > 0.7, p > 0.6320

and p > 0.34 for each measure).321

During early washout after learning in VE1, the hand322

path areas are similar between subjects of the G1 and G2323

groups (Figure 5B, p > 0.1), while the initial direction for324

the G1 group is higher than that of the G2 group (p < 0.05).325

However, the final directions are similar between the two326

groups (p > 0.1). This implies that the subjects of group327

G1 made significantly more corrections to their movements328

(through means such as online feedback or motor planning329

adjustments) compared to subjects of the G2 group. In the330

late washout trials, the G1 group’s hand path is adjusted331

such that it is similar to that of the G2 group for all three332

measures (p > 0.05, p > 0.1 and p > 0.9). The initial and333

final directions of the hand paths of both groups are not sig-334

nificantly different from zero (p > 0.05, p > 0.46 for the G1335

group and p > 0.25 p > 0.2 for the G2 group), suggesting336

that the subjects return to the straight line.337

Significance of Error Compensation.To examine error com-338

pensation in human learning, the difference between the hand339

path area and the cursor path area SE is analyzed in Figure 6.340

The G2 group exhibits a nearly instantaneous change when341

either VE1 or VE2 is introduced. The change is maintained342

throughout the trials within the respective environment. On343

the other hand, the G1 group exhibits a gradual change in vi-344

sual environment VE1 in the direction opposite to the change345

made by the G2 group in the same environment.346

To further analyze this behavior, an exponential model347

y = aebk + c [10]

is fitted to the data, where y represents the difference between348

the hand and cursor path area (SE), k is the generated trial349

number and a, b, and c are the coefficients of the fit. The350

initial learning behavior of the subject is reflected in the sum351

of coefficients a and c. The late learning behavior is reflected352

in the coefficient c. Finally, the learning rate of the subjects353

is reflected in the coefficient b. The three coefficients of inter-354

est a + c, b and c are significantly different between the two355

groups.356

The coefficient a+ c is low for the G1 group in VE1 and357

is not significantly different from zero (p > 0.8). For the G2358

group in VE1 and VE2, the coefficient is significantly positive359

(with p < 0.01 and p < 0.05, respectively). This reflects the360

instantaneous increase observed for the G2 group’s learning361

behavior, which is not observed in the G1 group.362

The learning rate b of the G1 group is significantly neg-363

ative (p < 0.05), while the G2 group’s learning rate in VE2364

and VE1 after the increase is not significantly different from365

zero (p > 0.6 and p > 0.1, respectively).366

The coefficient c, which reflects the steady-state of the367

subjects’ learning behavior, is significantly negative for the368

G1 group (p < 0.05) and is significantly positive for the G2369

group in VE2 (p < 0.05).370

Finally, the coefficients a + c and c for the G2 group in371

VE1 are significantly different from that of the G1 group372

(p < 0.01 and p < 0.01, respectively), but are similar to their373

values in VE2 (p > 0.2 and p > 0.05, respectively).374

These results reflect the observation that subjects in the375

G2 group learn to move both in VE2 and VE1 by switching376

to another movement, and subsequently maintain the move-377

ment in the environment. By contrast, subjects in the G1378

group gradually change their movements in VE1, resulting379

in a gradual convergence to the final movement trajectory in380

VE1.381

Discussion382

This study examined whether prior training with task-383

relevant feedback affects the learning strategy used in task-384

irrelevant environment. Two groups of subjects (G1 and G2)385

learned target reaching movements in the task-irrelevant en-386

vironment VE1, where the G2 group had trained previously387

in the task-relevant environment VE2. The results exhibited388

two distinctly different behaviors for the two groups, suggest-389

ing that the learning strategy was affected by previous train-390

ing. In particular, the learning strategy in the task-irrelevant391

environment VE1 was affected by previous training in the392

task-relevant environment VE2. In addition, the large varia-393

tions in the number of trials necessary to complete a learning394

phase observed in the different subjects suggested that the395

observed behavior was independent of the total number of396

trials made by the subjects. The following sections further397

discuss the observed behavior.398

Humans use different learning strategies for the same task.399

When presented with the lateral visual deformation of VE1,400

the G1 group moved in the direction opposite the deforma-401

tion and tended to drift further away from the target over402

trials. In contrast, the G2 group tended to move either in a403

straight line or in the same direction as the cursor in VE1,404

with the hand movements following the cursor movements405

(Figure 5 A.4-A.6).406

The different learning behaviors can be explained by the407

different strategies employed by the two groups to learn the408

task. The G1 group may use visual reflexes to gradually409

compensate for the observed mismatch between the cursor410

and the straight line joining the starting point and the target411

trial after trial, as was proposed in previous works (Franklin412

et al., 2008; Krakauer et al., 2000), resulting in hand moving413

opposite to the direction of the cursor deformation.414

On the other hand, the G2 group’s learning strategy415

seems to consist of switching their planned movements416

(Ganesh and Burdet, 2013) (Figure 6), resulting in their417

hand either ignoring or following the cursor (Figure 5 A.4-418

A.9). This strategy results in the hand either moving along419

a straight line or moving in the same direction as the cursor,420

enabling limited deviation of the hand from the two trajec-421

tories.422

Comparison of learning strategies.The different learning423

strategies are further evidenced by the different learning be-424

haviors made by the two groups in VE1. The G1 group’s425

learning strategy results in the difference between the hand426

and the cursor path area to decay exponentially after an ini-427

tial jump. This is shown from the negative coefficient “b”428

of the G1 group in environment VE1 (Figure 6). Conse-429

quently, the hand gradually drifts from the straight line tra-430

jectory. Drifting was also observed when visual feedback was431

prevented during the movement (Brown et al., 2003; Salaun432

et al., 2009). However in that case no exponential decay was433

observed, implying that the behavior was not associated with434

learning.435

The G2 group’s learning strategy results in the difference436

between the hand and the cursor path area to change almost437

instantaneously when the subject is presented with the novel438

visual environment VE1 (Figure 6, right column). The fact439

4 Journal of Neurophysiology

that the coefficient “b” found for G2 group’s movements in440

VE1 is not significantly different from zero in Figure 6 sug-441

gests that no significant adjustment of the hand or the cursor442

is made in the subsequent trials.443

Learning strategies affected by task-relevant errorsAlthough444

the G2 group’s learning behavior was different from that of445

the G1 group in VE1, it was similar to the G2 group’s learn-446

ing behavior in VE2 (Figure 6). This suggests that the G2447

group used the same learning strategy in VE1 as that used448

in VE2, which was a different strategy than that of the G1449

group in VE1. It is therefore possible that the human sub-450

jects changed their strategy while learning to move in VE2.451

A possible reason for the change is that if subjects use the452

strategy of the G1 group to correct for the visual discrepan-453

cies in VE2, then task-relevant errors are produced (Franklin454

et al., 2008), resulting in the endpoint cursor deviating from455

the target. The subjects are therefore forced to use visual456

reflexes and voluntary visual corrections to adjust the move-457

ment online in order to ensure that the hand reaches the458

target, which is evident in the final direction observed in the459

G2 group in the field (Figure 5 A.6).460

This reliance on visual reflexes caused the subjects to461

switch their planned movement, which allowed them to462

choose either to ignore or to follow the visual disturbances463

in VE2 in order to maintain a relatively straight trajectory464

and succeed in the task, as was observed in Figure 5 A.4-A.6.465

In this light, reaching the target has a higher priority (the466

primary task) than compensating for the visual deformation467

(the secondary task), which is ignored if it conflicts with the468

primary task.469

Over trials, it seemed that the subjects became famil-470

iarized with the new plan such that they relied less on on-471

line adjustments. This is supported by observations that the472

movements of the G2 group are invariant in probe trials, in473

which no cursor is provided to the subjects (Figure 5C), which474

implies that the movements in late learning are feed-forward475

in nature.476

Overall, these observations suggest that task-relevant er-477

rors not only affect subjects’ movement and visual reflexes,478

but also change the learning strategy employed, resulting in479

subjects using different feed-forward commands to achieve480

the task in the same visual field (Figures 5 and 6, VE1 envi-481

ronment).482

Previous investigations have found that human subjects483

change their reliance on feed-forward or feedback informa-484

tion for learning and for motion depending on their previ-485

ous experience (Kagerer et al., 1997; Saijo and Gomi, 2012).486

Current results show that experience can further change the487

strategy which humans use for learning. In particular, sub-488

jects use different learning strategies depending on whether489

they have previously been trained in environments involving490

task-relevant errors.491

Preference of gradual learning strategy In VE1, it was ob-492

served that both learning strategies enabled the subjects to493

succeed in target reaching. Why did the subjects prefer the494

gradual adaptation strategy over the plan switching strategy,495

which they used only if they had been trained in the task-496

relevant environment VE2? Gradual change of the movement497

can be performed automatically trial after trial, for example498

by incorporating visual reflexes experienced in previous trial.499

This control strategy does not require much online control. In500

contrast, the strategy used in the task-relevant environment501

requires switching to another motor plan and coordinating502

the motor command with the ongoing movement and thus503

heavily relies on sensory feedback and online computation504

and control. Therefore, humans may use an Occam’s razor505

approach for learning (MacKay, 2003) in that they do not at-506

tempt to make abrupt switches between plans for learning an507

environment, unless it is necessary to perform the task, since508

such a strategy is more difficult than gradually updating the509

feed-forward command using visual errors.510

ACKNOWLEDGMENTS. We thank Wayne Dailey for his assistance in proof read-511

ing the manuscript.512

References513

Brown, L. E., Rosenbaum, D. A., and Sainburg, R. L. (2003).514

Movement speed effects on limb position drift. Experimen-515

tal Brain Research, 153(2):266–74.516

Burdet, E., Osu, R., Franklin, D. W., Milner, T. E., and517

Kawato, M. (2001). The central nervous system stabilizes518

unstable dynamics by learning optimal impedance. Nature,519

414(6862):446–9.520

Criscimagna-Hemminger, S. E., Bastian, A. J., and Shad-521

mehr, R. (2010). Size of error affects cerebellar contri-522

butions to motor learning. Journal of Neurophysiology,523

103(4):2275–2284.524

Day, B. and Lyon, I. (2000). Voluntary modification of auto-525

matic arm movements evoked by motion of a visual target.526

Experimental Brain Research, 130(2):159–168.527

Flanagan, J. R., Nakano, E., Imamizu, H., Osu, R., Yoshioka,528

T., and Kawato, M. (1999). Composition and decompo-529

sition of internal models in motor learning under altered530

kinematic and dynamic environments. Journal of Neuro-531

science, 19:34–1.532

Franklin, D. W., Burdet, E., Peng Tee, K., Osu, R., Chew,533

C.-M., Milner, T. E., and Kawato, M. (2008). Cns learns534

stable, accurate, and efficient movements using a simple535

algorithm. Journal of Neuroscience, 28(44):11165–11173.536

Franklin, D. W. and Wolpert, D. M. (2008). Specificity of537

reflex adaptation for task-relevant variability. Journal of538

Neuroscience, 28(52):14165–14175.539

Franklin, S., Wolpert, D. M., and Franklin, D. W. (2012). Vi-540

suomotor feedback gains upregulate during the learning of541

novel dynamics. Journal of Neurophysiology, 108(2):467–542

478.543

Ganesh, G. and Burdet, E. (2013). Motor planning explains544

human behaviour in tasks with multiple solutions. Robotics545

and Autonomous Systems, 61(4):362–368.546

Haith, A. M. and Krakauer, J. W. (2013). Model-based547

and model-free mechanisms of human motor learning. In548

Progress in Motor Control, pages 1–21. Springer.549

Harris, C. S. (1963). Adaptation to displaced vision: visual,550

motor, or proprioceptive change? Science, 140(3568):812–551

813.552

Helmholtz, H. v. and Southall, J. (1925). Treatise on physi-553

ological optics. III. The perceptions of vision. New York.554

Kagerer, F. A., Contreras-Vidal, J., and Stelmach, G. E.555

(1997). Adaptation to gradual as compared with sud-556

den visuo-motor distortions. Experimental Brain Research,557

115(3):557–561.558

Krakauer, J. W., Pine, Z. M., Ghilardi, M. F., and Ghez, C.559

(2000). Learning of visuomotor transformations for vecto-560

rial planning of reaching trajectories. Journal of Neuro-561

science, 20(23):8916–8924.562

MacKay, D. J. (2003). Information Theory, Inference and563

Learning Algorithms, volume 1. Cambridge University564

Press, Cambridge, UK.565

Marko, M. K., Haith, A. M., Harran, M. D., and Shadmehr,566

R. (2012). Sensitivity to prediction error in reach adapta-567

tion. Journal of Neurophysiology, 108(6):1752–1763.568

Journal of Neurophysiology 5

Michel, C., Pisella, L., Prablanc, C., Rode, G., and Rossetti,569

Y. (2007). Enhancing visuomotor adaptation by reducing570

error signals: single-step (aware) versus multiple-step (un-571

aware) exposure to wedge prisms. Journal of Cognitive572

Neuroscience, 19(2):341–350.573

Pisella, L., Rode, G., Farne, A., Tilikete, C., and Rossetti, Y.574

(2006). Prism adaptation in the rehabilitation of patients575

with visuo-spatial cognitive disorders. Current Opinion in576

Neurology, 19(6):534–542.577

Redding, G. M. and Wallace, B. (1996). Adaptive spatial578

alignment and strategic perceptual-motor control. Journal579

of Experimental Psychology: Human Perception and Per-580

formance, 22(2):379.581

Saijo, N. and Gomi, H. (2012). Effect of visuomotor-map582

uncertainty on visuomotor adaptation. Journal of Neuro-583

physiology, 107(6):1576–1585.584

Saijo, N., Murakami, I., Nishida, S., and Gomi, H. (2005).585

Large-field visual motion directly induces an involuntary586

rapid manual following response. The Journal of Neuro-587

science, 25(20):4941–4951.588

Salaun, C., Padois, V., and Sigaud, O. (2009). A two-level589

model of anticipation-based motor learning for whole body590

motion. In Anticipatory Behavior in Adaptive Learning591

Systems, pages 229–246. Springer.592

Sarlegna, F. R. and Sainburg, R. L. (2009). The roles of vision593

and proprioception in the planning of reaching movements.594

In Progress in motor control. Springer.595

Schaefer, S. Y., Shelly, I. L., and Thoroughman, K. A. (2012).596

Beside the point: motor adaptation without feedback-597

based error correction in task-irrelevant conditions. Jour-598

nal of Neurophysiology, 107(4):1247–1256.599

Scheidt, R. A., Conditt, M. A., Secco, E. L., Mussa-ivaldi,600

F. A., and Robert, A. (2005). Interaction of visual and pro-601

prioceptive feedback during adaptation of human reaching602

movements. Journal of Neurophysiology, 93(6):3200–3213.603

Seidler, R. D., Kwak, Y., Fling, B. W., and Bernard, J. A.604

(2013). Neurocognitive mechanisms of error-based motor605

learning. Advanced Experiment Medical Biology, 782:39–60.606

Shabbott, B. A. and Sainburg, R. L. (2010). Learning a vi-607

suomotor rotation: simultaneous visual and proprioceptive608

information is crucial for visuomotor remapping. Experi-609

mental Brain Research, 203(1):75–87.610

Todorov, E. and Jordan, M. I. (2002). Optimal feedback611

control as a theory of motor coordination. Nature Neuro-612

science, 5(11):1226–35.613

Tseng, Y.-W., Diedrichsen, J., Krakauer, J. W., Shadmehr,614

R., and Bastian, A. J. (2007). Sensory prediction errors615

drive cerebellum-dependent adaptation of reaching. Jour-616

nal of Neurophysiology, 98(1):54–62.617

van Beers, R. J., Brenner, E., and Smeets, J. B. (2013). Ran-618

dom walk of motor planning in task-irrelevant dimensions.619

Journal of Neurophysiology, 109(4):969–977.620

Wang, J. (2005). Adaptation to visuomotor rotations remaps621

movement vectors, not final positions. Journal of Neuro-622

science, 25(16):4024–4030.623

Wei, K. and Kording, K. (2010). Uncertainty of feedback and624

state estimation determines the speed of motor adaptation.625

Frontiers in Computational Neuroscience.626

Wolpert, D. M., Diedrichsen, J., and Flanagan, J. R. (2011).627

Principles of sensorimotor learning. Nature reviews. Neu-628

roscience, 12(12):739–51.629

6 Journal of Neurophysiology

3-dom

3-dom

mirror

monitor

(-0.15,0.15)

θ2

θ1

(0,0)

x

Figure 1

Group 2

2 cm

15 cm

Hand Path

Cursor Path

Null Field Task Irrelevant

Environment (VE1)Probe Trials

Visual Cues

Missed the target

Too Fast Too SlowGood Time

Award a Point

A

B

Group1

Null Field VE1 Null Field

Null Field Null FieldVE2 VE1

C

D

Null Field

Learning Phase 2

Max Score: 25

Learned Phase 2

Max Score: 25

One probe trialevery 10 trials

Washout Phase 2

Max Score: 30

x

y

Task Relevant Environment (VE2)

Starting Phase

Max Score: 30

One probe trialevery 10 trials

Learning Phase

Max Score: 25

Learned Phase

Max Score: 25

One probe trialevery 10 trials

Washout Phase

Max Score: 20

Starting Phase 1

Max Score: 30

One probe trialevery 10 trials

Learning Phase 1

Max Score: 25

Learned Phase 1

Max Score: 25

One probe trialevery 10 trials

Washout Phase 1and

Starting Phase 2

Max Score: 20

Figure 2

Cursor

Hand

αH βCαC βH

SH SC

(A) (B) (C)

Figure 3

Nul

l Fie

ldBA

Initi

al 1

0 m

ovem

ents

Fina

l 10

mov

emen

tsTask Irrelevant

Environment (VE1)Task Relevant

Environment (VE2)

Group 1 (Subject 1) Group 2 (Subject 2)

Task IrrelevantEnvironment (VE1)

CTask RelevantEnvironment (VE2)

Was

hout

(fi

rst 1

0 m

ovem

ents

)

Cursor Path

Hand Path

1 10Trial

Prob

e Tr

ial

mov

emen

ts

0

5

10

15

− 5 0 5x position (cm)

y p

osi

tio

n (c

m)

Figure 4

Generated Trial No

G1 Task Irrelevant Environment (VE1)

A.8

A.7

A.9

800 1000 1200 1400 1600

G2 Task Relevant Environment (VE1)

A.5

A.4

A.6

Cursor

Hand

800 1000 1200 1400 1600

G2 Task Irrelevant Environment (VE2)

0 200 400 600 800

0

0

0

αβ

A.1

A.2

A.3

(deg

rees

) (d

egre

es)

S (c

m2 )

−10

10

20

30

100

−50

50

100

−100

−50

50

100

A

(a)

0

10

20

30

1 2 3 4a1 a2 a3 a4b1 b2 b3 b4

−50

0

50

20

0

20

40

60

Group 2 in VE2

Group 2 in VE1

Group 1 in VE1

* p<0.05** p<0.01*** p<10-3

* **

***

***

***

********* * ***

**

** *** *

***

** * ** *

*

****

***

*** *

***

***

Early Learning Late Learning Early Washout Late Washout

B

SH (c

m2 )

αΗ (

de

gre

es)

βΗ (

de

gre

es)

(b)

0

10

20

30

SH (c

m2 )

−50

0

50

α (

de

gre

es)

−20

0

20

40

60

β (

de

gre

es)

Late Learning Probe Trials

** *

*** ***

*** ***

C

(c)

Figure 5

Gro

up 2

VE2

Generated Trial No

Gro

up 2

VE1

Gro

up 1

VE1

G1VE1

Difference between hand and cursor path area SE=SH-SC

Fitting

* p<0.05** p<0.01*** p<10-3

−20

−10

0

10

20

−20

−10

0

10

20

−20

−10

0

10

20

200 400 600 8000

G2VE2 G2VE1

a+c

−16

−12

−8

−4

02

x 10− 3

b

−40

−20

20

40

c 0

−8

−4

0

4

8

12

*

* **

4

Fitted Coefficients

* ***

*

*** ***

Figure 6

Figure Legends

Figure 1. Setup of the experiment: Subjects perform target reaching movementswhile their hand is attached to the robot, which supports the arm against gravityand can measure hand position.

Figure 2. Experiment protocol.

Figure 3. Definition of the hand and cursor path area (SH and SC), the initialhand and cursor direction (αH , αC) and the final hand and cursor direction(βH , βC), as defined in Equations (3) through (8).

Figure 4. Evolution of cursor trajectories (solid yellow to red lines) and handpath (solid blue to purple lines) for two representative subjects in groups G1and G2. The different effects of VE1 and VE2 are observed in the first twocolumns, while the influence of learning of VE2 on the learning in VE1 is ob-served by comparing the first and third columns. Because the cursor is alignedwith the hand in the null, probe and washout trials, only the hand paths areplotted. Since the robot stops recording when the hand paths velocity falls be-low 0.03 m/s, there are minute discrepancies between the cursor and the targetat the end of the movement in VE1.

Figure 5 Comparison of behaviors across population

Figure 5a. Mean and standard deviation of the three measures for the hand(blue solid line) and the cursor (red solid line) across the normalized trials forall subjects. The different phases are shown in the alternating green and blueshades. The trials made by each subject are spline-fitted and 200 trials aregenerated in each phase, thereby enabling comparison of the performances ofdifferent subjects.

Figure 5b. Comparison of learning performances during the first and last 10movements made by the two groups in visual environment VE1. The asterisks‘*’ indicate the significance level over all subjects. The graph compares theaverage measure made by the subjects of different groups across early and latelearning trials and washout trials. The bars in the graph show the mean ofthe average measures of each group while the error bars provide the standarddeviation.

Figure 5c. Comparison of learning performances during late learning and probetrials for the two groups in visual environment VE1. The asterisks ‘*’ indicatethe significance level over all subjects. The mean and error bars are calculatedin the same way as described in Figure 5b.

Figure 6. SE = SH − SC measure for the G1 and G2 groups in VE1 and VE2.An exponential fit of the measure against the normalized trial number is usedto determine the coefficients a, b, and c shown on the right. For the fit, y istaken as the measure SE while k is the generated trial number