16
K2 PROPERTY OF MIT PRESS: FOR PROOFREADING AND INDEXING PURPOSES ONLY crevecoeur, cluff and scott: goal-directed movement planning and execution 435 39 Computational Approaches for Goal-Directed Movement Planning and Execution frédéric crevecoeur,* tyler cluff,* and stephen h. scott abstract The apparent ease of skilled motor behavior masks the complex neural processes involved in the control of movement. To handle the complexity of most motor tasks, optimal feedback control suggests that the brain continuously processes sensory feedback and selectively compensates for noise disturbances as well as external perturbations. Moti- vated by this powerful prediction, recent studies have used perturbation paradigms to investigate the neural control of movement. After introducing the basic mathematical con- cepts of optimal feedback control, we review evidence that the brain generates flexible feedback strategies following pertur- bations perceived from visual feedback (e.g., a sudden change in the target location), as well as mechanical perturbations applied to the limb. Importantly, we highlight evidence that the motor system can generate goal-directed responses in as little as 50–60 ms following a mechanical perturbation. A transcortical feedback pathway through primary motor cortex appears to play an important role in these rapid corrective responses. Elite athletes push the limits of the sensorimotor system, integrating sensory information to generate rapid yet remarkably precise movements. A good example is a hockey player breaking away from his defenders. In a split second, the player has to read the goaltender’s movement and decide whether to shoot the puck or fake the goaltender and wait for an opening. Even the simplest movements that we perform in daily life, such as reaching for a cup of coffee, also involve complex sensorimotor coordination. The ability to use sensory information to flexibly guide, correct, or modify our actions is the hallmark of skilled biological control. Recently, optimal feedback control (OFC) has been used as a model of how the brain processes sensory information to control movement. A powerful feature of this model is that it describes how the motor system should handle performance errors caused by neural variability or environmental disturbances. This feature has renewed interest in feedback response strategies, in particular because they may provide a window into the voluntary control of movement. In this chapter, we introduce the basic notions underlying optimal control theory and discuss how this approach may help us iden- tify problems that the brain must solve to perform even the simplest movements. Then, we review recent find- ings that emphasize the similarity between biological control and the sophistication of optimal control models. In particular, we focus on perturbation para- digms showing that many aspects of optimal control models are observed in corrective responses generated by humans: (1) continuous processing of sensory feed- back underlies the control of movement, and (2) feed- back control is tailored to the constraints imposed by the task at hand. We finish our chapter by briefly dis- cussing how sophisticated motor behavior may be linked to processing in distributed brain circuits, high- lighting recent evidence that neural processing in primary motor cortex (M1) may possess some of the attributes required for flexible feedback control. Optimal control: Definitions and applications in neuroscience Flash and Hogan (1985) introduced optimal control principles to movement neuroscience almost 30 years ago. Intuitively, this approach is based on the assump- tion that the brain selects motor commands that maxi- mize or minimize a performance criterion. In the context of reaching movements, Flash and Hogan sug- gested that the brain selects motor plans that minimize the derivative of the hand’s acceleration (or the jerk). This hypothesis was justified by the fact that point-to- point movements tend to be smooth with a bell-shaped velocity profile, which is reproduced nicely by minimiz- ing the derivative of the hand’s acceleration during reaching movements. Following Flash and Hogan’s work, a wealth of studies have proposed biological cost functions that incorporate kinetic parameters (e.g., * These authors contributed equally to this work.

New 39 Computational Approaches for Goal-Directed Movement … · 2015. 10. 2. · available sensory data to produce successful behavior. Harris and Wolpert (1998) considered the

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: New 39 Computational Approaches for Goal-Directed Movement … · 2015. 10. 2. · available sensory data to produce successful behavior. Harris and Wolpert (1998) considered the

K2

PROPERTY OF MIT PRESS: FOR PROOFREADING AND INDEXING PURPOSES ONLY

crevecoeur, cluff and scott: goal-directed movement planning and execution 435

39 Computational Approaches for Goal-Directed Movement Planning and Executionfrédéric crevecoeur,* tyler cluff,* and stephen h. scott

abstract The apparent ease of skilled motor behavior masks the complex neural processes involved in the control of movement. To handle the complexity of most motor tasks, optimal feedback control suggests that the brain continuously processes sensory feedback and selectively compensates for noise disturbances as well as external perturbations. Moti-vated by this powerful prediction, recent studies have used perturbation paradigms to investigate the neural control of movement. After introducing the basic mathematical con-cepts of optimal feedback control, we review evidence that the brain generates flexible feedback strategies following pertur-bations perceived from visual feedback (e.g., a sudden change in the target location), as well as mechanical perturbations applied to the limb. Importantly, we highlight evidence that the motor system can generate goal-directed responses in as little as 50–60 ms following a mechanical perturbation. A transcortical feedback pathway through primary motor cortex appears to play an important role in these rapid corrective responses.

Elite athletes push the limits of the sensorimotor system, integrating sensory information to generate rapid yet remarkably precise movements. A good example is a hockey player breaking away from his defenders. In a split second, the player has to read the goaltender’s movement and decide whether to shoot the puck or fake the goaltender and wait for an opening. Even the simplest movements that we perform in daily life, such as reaching for a cup of coffee, also involve complex sensorimotor coordination. The ability to use sensory information to flexibly guide, correct, or modify our actions is the hallmark of skilled biological control.

Recently, optimal feedback control (OFC) has been used as a model of how the brain processes sensory information to control movement. A powerful feature of this model is that it describes how the motor system should handle performance errors caused by neural variability or environmental disturbances. This feature has renewed interest in feedback response strategies, in

particular because they may provide a window into the voluntary control of movement. In this chapter, we introduce the basic notions underlying optimal control theory and discuss how this approach may help us iden-tify problems that the brain must solve to perform even the simplest movements. Then, we review recent find-ings that emphasize the similarity between biological control and the sophistication of optimal control models. In particular, we focus on perturbation para-digms showing that many aspects of optimal control models are observed in corrective responses generated by humans: (1) continuous processing of sensory feed-back underlies the control of movement, and (2) feed-back control is tailored to the constraints imposed by the task at hand. We finish our chapter by briefly dis-cussing how sophisticated motor behavior may be linked to processing in distributed brain circuits, high-lighting recent evidence that neural processing in primary motor cortex (M1) may possess some of the attributes required for flexible feedback control.

Optimal control: Definitions and applications in neuroscience

Flash and Hogan (1985) introduced optimal control principles to movement neuroscience almost 30 years ago. Intuitively, this approach is based on the assump-tion that the brain selects motor commands that maxi-mize or minimize a performance criterion. In the context of reaching movements, Flash and Hogan sug-gested that the brain selects motor plans that minimize the derivative of the hand’s acceleration (or the jerk). This hypothesis was justified by the fact that point-to-point movements tend to be smooth with a bell-shaped velocity profile, which is reproduced nicely by minimiz-ing the derivative of the hand’s acceleration during reaching movements. Following Flash and Hogan’s work, a wealth of studies have proposed biological cost functions that incorporate kinetic parameters (e.g., *These authors contributed equally to this work.

Page 2: New 39 Computational Approaches for Goal-Directed Movement … · 2015. 10. 2. · available sensory data to produce successful behavior. Harris and Wolpert (1998) considered the

K2

PROPERTY OF MIT PRESS: FOR PROOFREADING AND INDEXING PURPOSES ONLY

436 motor systems and action

minimum torque change; Nakano et al., 1999; Uno, Kawato, & Suzuki, 1989) and energy consumption (Biess, Liebermann, & Flash, 2007; Berret, Chiovetto, Nori, & Pozzo, 2011) to identify movement variables that the motor system may control and optimize.

This approach is grounded in the theory of optimal control, a formalism that applies optimization princi-ples to describe what the motor system should do according to how our limbs move, as well as the performance criteria (e.g., energy consumption) and constraints (e.g., time constraint) associated with move -ment. The following section outlines the basic notions of control theory and defines (optimal) control prob-lems that are often encountered in movement neuroscience.

Dynamical Systems and Control Problems At the basis of control engineering is the notion of a dynamical system: a set of variables evolving as a function of time. For instance, the angular motion of a body segment can be seen as a dynamical system described by state vari-ables, such as the joint angle and velocity. In general, the evolving state of a dynamical system can be described by a differential equation of the form:

!x f x u= ( , ), (1)

where x is the vector of state variables and the dot expresses its time derivative. This derivative is a func-tion of the state vector itself and an additional vari-able, u, called the control vector. It is assumed that the controller influences the state of the system by changing the value of the control vector. For example, varying muscle activity changes the torque acting on a joint and alters segmental motion through the dynamical properties captured in the function f. With these definitions, we can define a control problem as follows: find a time-varying control vector that steers the state variables to a desired location. Assuming that the initial state x0 is known, the problem is to find a control function u(t), t0 <= t <= tf, such that the solution of Eq. 1:

x t x f x s u s dst

t

( ) ( ( ), ( )) ,= + ∫0

0

(2)

meets the constraints of the control problem. A reach-ing movement, for example, may be described as a control problem where the brain must steer the hand to the spatial location of a goal target. Expressing a simple task such as reaching as a control problem is a powerful approach to identify the challenges that motor control presents for the brain. As we will see, even the simplest movements impose complex sensorimotor transformations.

Deterministic Optimal Control In general, the solution of a particular control problem is not unique. This also applies to biological motor control, where reaching movements may follow distinct paths to the same target or even have different velocities along the same movement path. Given that each of these move-ments satisfy the goal of reaching the target (i.e., motor equivalence), extensive research has been conducted to identify how the brain selects one control solution among infinitely many alternatives. One way to reduce the set of possible movement solutions is to constrain the problem using a cost function. For instance, we may be interested in applying a sequence of joint torques that allows us to reach a target while minimizing the intensity of muscle activity to avoid fatigue. In this example, the cost is directly related to the motor command, and the problem is to find the reaching path that minimizes muscle energy expenditure. This approach uses optimization principles to determine the best way to reach the target among all possible movement solutions.

A typical cost function contains a final cost, g(x), and a running cost, L(x,u), that accumulates along the tra-jectory followed by the state variables:

J x u g x t L x t u t dtf

t

t f

( , ) ( ( )) ( ( ), ( )) .= + ∫0

(3)

With these definitions, an optimal control problem can be defined as follows: find a control function that mini-mizes J(x,u) (Eq. 3), subject to the initial condition x(t0) = x0, and to the system dynamics (Eq. 1). It can be shown that, under the optimal solution, J(x,u) satisfies the Hamilton-Jacobi-Bellman equation (U represents the set of admissible control actions):

− ∂∂

= + ∂∂{ }∈

J x ut

L x uJ x u

xf x u

u U

( , )min ( , )

( , )( , ) . (4)

The function J(x,u) is the cost to accumulate from the present time until the end of the problem horizon, tf. This quantity is called the cost-to-go and plays a central role in the derivation of numerical solutions (Todorov, 2006). An intuitive interpretation of Eq. 4 is that the control vector can vary the orientation of the instanta-neous direction of the state trajectory (f(x,u)), which should ideally follow the direction that is opposite to the gradient of the cost-to-go (∂J/∂x). However, the opti-mization also takes into account the instantaneous cost (L(x,u)), and as a result, Eq. 4 achieves the best com-promise between the gradient of the cost-to-go and the instantaneous cost (L(x,u)). This compromise deter-mines the instantaneous variation of the cost-to-go (∂J/∂t).

Page 3: New 39 Computational Approaches for Goal-Directed Movement … · 2015. 10. 2. · available sensory data to produce successful behavior. Harris and Wolpert (1998) considered the

K2

PROPERTY OF MIT PRESS: FOR PROOFREADING AND INDEXING PURPOSES ONLY

crevecoeur, cluff and scott: goal-directed movement planning and execution 437

The many studies that have used this approach have agreed on the general conclusion that healthy motor systems favor smooth and efficient movements. The shortcoming of this approach, however, is that because the laws of physics relate all movement parameters, virtually every meaningful cost function in the form of Eq. 3 partially captures the smoothness and efficiency of biological motor control. Also, this approach does not systematically account for the continuous update of motor commands that is necessary to correct for motor errors.

Stochastic Optimal Control In general, factors that can induce motor errors and trial-to-trial variability fall into two broad categories. First, the variable activa-tion of neural circuits induces variable motor behavior. Neural noise can be found in sensory systems, move-ment preparatory activity, and the activation of muscles in the motor periphery (Churchland, Afshar, & Shenoy, 2006; Faisal, Selen, & Wolpert, 2008; Osborne, Lis-berger, & Bialek, 2005; Scott & Loeb, 1994; van Beers, Haggard, & Wolpert, 2004). Additionally, motor errors can be produced by external disturbances resulting from our interaction with the environment. Both neural variability and external disturbances require that the brain continuously update motor commands based on the available sensory data to produce successful behavior.

Harris and Wolpert (1998) considered the influence of neural variability on movement planning and sug-gested that motor commands are selected to minimize the variance of movement end points, assuming the intensity of motor noise scales with the size of the motor command. While this model explicitly considers the effect of motor noise, it does not address the online adjustment of motor commands required when

disturbances alter performance. The control of stochas-tic processes was introduced to address this limitation. In this framework, feedback is essential to update motor commands and compensate for neural variability. Online monitoring and control are often described in terms of a state estimator combined with a controller (figure 39.1). The state estimator typically combines sensory and motor signals to compute the present state of the body (figure 39.1; optimal state estimator), and the controller uses the estimated state of the body to select control actions that best reflect the goal and con-straints of the task (figure 39.1; optimal feedback control policy). This section presents the basic formal-ism of the problems of estimation and control of sto-chastic processes.

Because random (Brownian) motion does not have finite instantaneous variations, the control problem is formulated in discrete rather than continuous time (Arnold, 1974). To begin, the analog of Eq. 1 becomes:

dX F X u dt G X u dW= +( , ) ( , ) . (5)

In Eq. 5, the capital X signifies that the state vector is now a stochastic variable. Eq. 5 expresses that small changes in the state (dX) follow a deterministic law described by the function F that captures the system dynamics as in Eq. 1, and a stochastic term that captures random disturbances in the process (dW). We now convert this equation to discrete time by considering changes in the process over a time step of δt. From the definition of Brownian motion (Arnold, 1974), the accumulation of random noise over δt follows a Gauss-ian distribution with zero mean and variance equal to δt. We use ξ(t) to designate these Gaussian disturbances at each time step. The formulation in discrete time becomes

Figure 39.1 Illustration of basic processes expected under the OFC framework. The selection of the behavioral task determines the feedback control policy (task selection). The purpose of the feedback control policy is to continuously process and convert sensory data into motor commands that best satisfy the task demand (optimal feedback control policy). Once the sensory feedback is processed, motor commands are sent to the peripheral motor system (biomechanical plant or

Optimal feedback

control policy

Optimal state

estimation

Taskselection

Motorcommands

Efferencecopy

Sensoryfeedback

Musculoskeletal system

musculoskeletal system). An efference copy of the descending motor command is used internally to predict the conse-quences of motor actions. These internal predictions are com-bined with feedback from sensory receptors to compute the posterior estimate of the state of the body (optimal state esti-mator). This state estimate is used to adjust the motor com-mands during the ongoing motor action.

Page 4: New 39 Computational Approaches for Goal-Directed Movement … · 2015. 10. 2. · available sensory data to produce successful behavior. Harris and Wolpert (1998) considered the

K2

PROPERTY OF MIT PRESS: FOR PROOFREADING AND INDEXING PURPOSES ONLY

438 motor systems and action

X t t M X t u t N X t u t t( ) ( ( ), ( )) ( ( ), ( )) ( ),+ = +δ ξ (6)

where M(.) and N(.) are the discrete versions of the functions F and G introduced in Eq. 5. The analog of Eq. 4 also becomes a discrete equation. The optimiza-tion can no longer be computed over the true state variables, because independent simulations of the same process will lead to distinct sample paths. Instead, for stochastic control problems, the controller should opti-mize the expected outcome, which ensures that perfor-mance will be optimal in the most likely scenario. The analogue of Eq. 4 becomes

J X u L X u E J X u X ut t tu

t t t t t t t( , ) min ( , ) ( , )| , ,= + [ ]{ }+ + +1 1 1 (7)

where E[.] denotes the expected value of the argument and the explicit dependency on time was replaced by the subscript t. Eq. 7 means that the best control action minimizes the present running cost, as well as the expected value of the future cost-to-go given the selected control action (figure 39.1; optimal feedback control policy).

A clear difficulty encountered with stochastic pro-cesses is that the state of the system (Xt) may not be known exactly, because sensory feedback is also cor-rupted by noise. Filtering techniques can be used to estimate the state of a stochastic process (figure 39.1; optimal state estimator). These techniques combine imperfect sensory data with internal assumptions about the state of the system (or internal priors). Prior assump-tions are often formulated as the output of a forward model, making a prediction of the present state of the body given the available information, including the motor command sent to the muscles (Wolpert & Flana-gan, 2001). Let Yt denote the information available at time t. We assume that the conditional distribution of Yt given the state variable Xt is known. First, we can predict the distribution of Xt given the distribution of Xt-1 by using Eq. 6 (forward model). This prediction step gives a prior belief about the system state. Second, the feedback data, Yt, can be used to correct the prior dis-tribution by applying Bayes’ theorem, which gives the posterior distribution of the state (Xt) given the sensory feedback (Yt). In other words, the best estimate of the system state is obtained by performing Bayesian integra-tion of sensory signals and forward predictions at each time step. The resulting estimate is optimal in the sense that the variance of the posterior distribution is mini-mized. In the particular case of linear dynamics affected by additive Gaussian disturbances, this procedure is known as a Kalman filter (Kalman, 1960).

Linear-Quadratic-Gaussian Regulator Let us examine the simplest case where an analytic solution of the

optimal control and filtering problems is available. The main assumptions are that the system follows linear dynamics, coupled with a quadratic cost function and subject to additive Gaussian noise (linear quadratic Gaussian, or LQG). The function M(.) introduced in Eq. 6 becomes a linear function of the state and control vectors (represented by the state-space matrices A and B), and the noise disturbance is captured by additive Gaussian noise (ξt). With these definitions, the system dynamics become

X AX But t t t+ = + +1 ξ . (8)

The sensory feedback available at time t is of the form:

Y HXt t t= + ω , (9)

where ωt is Gaussian random variable and H is the mapping between the system state and sensory data. This matrix may express that information about some state variables is not available, such as an external per-turbation that can only be indirectly measured through its effect on the motion of the body. The matrix H may also express that some sensory signals provide informa-tion about a combination of state variables, as for instance muscle spindles provide sensory feedback that is a mixture of the muscle’s length, velocity, and higher derivatives. The cost function is a quadratic form in the state and control variables, defined by weight matrices Qt and Rt, respectively:

L X u X Q X u R ut t tT

t t tT

t t( , ) .= + (10)

The optimal control problem consists in finding a sequence of control variables that minimize the total expected cost, that is:

J E L X ut tt

N

= ⎡⎣⎢

⎤⎦⎥=

∑ ( , ) .1

(11)

For this class of control problems, the optimal control policy turns out to be a linear function of the estimated state, denoted x̂t :

u C xt t t= ˆ . (12)

The sequence of optimal feedback gains, Ct, is deter-mined by the cost matrices (Qt and Rt in Eq. 10) and by the system dynamics (A and B in Eq. 8). The result is a time-varying feedback gain that tells us how the motor system should transform the estimated state of the system into motor commands.

The optimal estimate of the state is calculated in two steps (Kalman filter). First, the prediction of the next system state is obtained by simulating the dynamics over one time step given the current motor command, ut, and taking the expected value of the outcome:

ˆ ˆ .x Ax Butp

t t= +− −1 1 (13)

Page 5: New 39 Computational Approaches for Goal-Directed Movement … · 2015. 10. 2. · available sensory data to produce successful behavior. Harris and Wolpert (1998) considered the

K2

PROPERTY OF MIT PRESS: FOR PROOFREADING AND INDEXING PURPOSES ONLY

crevecoeur, cluff and scott: goal-directed movement planning and execution 439

This prediction (or prior belief) is then corrected with the difference between actual and expected feedback, weighted by the Kalman gain:

ˆ ˆ ˆ .x x K Y Hxt tp

t t tp= + −( ) (14)

The Kalman gains are determined by the system dynam-ics (Eq. 8), feedback (Eq. 9), and the covariance matri-ces of the motor and feedback noise. The Kalman gain (Kt) specifies how much the prior estimate (Eq. 13) should be changed according to the available sensory data. Recent studies have extended the LQG framework to take properties of biological motor control into account, such as the scaling of noise variability with the intensity of the neural signal (Crevecoeur, Sepulchre, Thonnard, & Lefèvre, 2011; Qian, Jiang, Jiang, & Mazzoni, 2013; Todorov, 2005), as well as the nonlinear-ity of the musculoskeletal system (Li & Todorov, 2007). A complete derivation of the optimal feedback gains and Kalman gains is beyond the scope of the present review, but can be found elsewhere (Åström, 1970; Brown, 1983; Bryson & Ho, 1975).

Figure 39.1 recapitulates the different components of the full control algorithm (estimation and control) and illustrates how they may describe some aspects of motor control. In the framework of optimal control, task selec-tion specifies the movement goal and constraints and can be seen as the definition of the cost function (Eqs. 3 or 10). Indeed, the cost function determines which state variables will be constrained, and how much motor commands will be penalized to attain that goal. Once the cost function is defined, an optimal control policy can be derived as the solution of a well-defined problem. In this respect, the approach based on optimal control makes a theoretical link between motor behavior and the control of a biomechanical plant, which are two fundamental aspects of movement neuroscience (Scott, 2004). Optimal control provides a normative tool to describe how the biomechanical plant should be con-trolled according to a given behavioral objective. The online control of movement is then realized by applying the feedback control policy (figure 39.1 and Eq. 12) to the estimated state of the body (figure 39.1 and Eqs. 13 and 14).

Optimal Control and Internal Models The notion of an internal model is an important conceptual framework in motor neuroscience that is often defined as a group of neurons or circuits that mimic the input-output relationship of the peripheral motor system and environment. A traditional view was that inverse models transform a desired movement into a sequence of motor commands, and forward models predict the con-sequences of these motor commands (for a review, see

Kawato, 1999). Pairing inverse and forward models according to the intended movement was proposed as a model for sensorimotor coordination (Wolpert & Kawato, 1998).

Optimal feedback control generalizes the neural computations identified within the framework of inter-nal models and bridges the gap between movement planning and execution by providing a goal-related feedback control policy (Eq. 12 for linear systems). In the optimal control framework, the controller must transform the movement goal expressed by the cost function (Eqs. 3 or 11) into a control policy. This opera-tion is similar to the one performed by inverse models, in the sense that they both use knowledge of the body dynamics to map intended movements into motor com-mands. The notion of a forward model is also present in the optimal control framework. A common perspec-tive is that forward models predict the consequences of motor actions, thereby allowing the brain to compen-sate for their effect ahead of sensory information. The computation of a prior belief about the state of the body can be seen as a forward prediction of the future state of the body (Eq. 13), based on the current motor commands and internal knowledge of the system dynamics (represented by the matrices A and B; figure 39.1, musculoskeletal system). Motor prediction from forward models is therefore a critical component of optimal feedback control models.

Aside from the problems related to the derivation of the control policy, an important challenge for the motor system is the computation of the state estimate. Indeed, we have only partially addressed the problem of state estimation with the Kalman filter, dealing with the variability of prediction and sensory signals. Addi-tionally, the brain must cope with time delays resulting from the transmission of neural signals along the nerves. Given the presence of sensory delays, the hypothesis that the brain uses state estimation requires converting delayed sensory feedback into present estimates of the state of the body, which involve prediction based on sensory signals that is independent from the motor pre-diction (Ariff, Donchin, Nanayakkara, & Shadmehr, 2002; Mehta & Schaal, 2002). Compatible with this prediction, we recently showed that rapid sensory predictions are performed following a perturbation (Crevecoeur & Scott, 2013).

Perspective on Alternative Control Approaches: Trading Efficiency for Robustness A central assumption of optimal (feedback) control approaches in movement neuroscience is that the brain knows the system’s dynamics exactly. In other words, OFC models give us the best solution, assuming that a correct

Page 6: New 39 Computational Approaches for Goal-Directed Movement … · 2015. 10. 2. · available sensory data to produce successful behavior. Harris and Wolpert (1998) considered the

K2

PROPERTY OF MIT PRESS: FOR PROOFREADING AND INDEXING PURPOSES ONLY

440 motor systems and action

internal representation of the body and environmental dynamics is available, while uncertainty is handled by considering the presence of random noise in the system. However, uncertainty in the internal model of dynamics can also alter the control of a movement. For instance, the inertia of the limb differs slightly according to the weight of a watch, clothing, or hand-held objects. In addition, muscle dynamics may vary depending on the level of background activity, biochemical factors, or fatigue (Zahalak, 1981). This class of model distur-bances does not fall under those disturbances modeled by random noise, as they potentially introduce system-atic biases during movement.

In general, researchers have approached this problem with learning or adaptation studies (Shadmehr, Smith, & Krakauer, 2010; Wolpert, Diedrichsen, & Flanagan, 2011). In this framework, changes in motor commands reflect adaptive adjustments of how the brain repre-sents environmental dynamics. While tremendous progress has been made with this approach, a clear shortcoming is that the body and environment can change more rapidly than the adaptation processes typi-cally investigated in motor learning studies (e.g., learn-ing curves varying over tens to hundreds of trials). For instance, muscle dynamics rapidly change during effort without giving us the chance to practice tens of trials to adapt to those changes.

Engineers have developed an approach based on the concept of robustness to deal with these internal model uncertainties. The idea is to make the control design as insensitive to model errors as possible (Bhattacharyya, Chapellat, & Keel, 1995; Doyle, Francis, & Tannen-baum, 1992). This approach typically focuses on prop-erties of the controller rather than on the actual system trajectories emphasized in the classical optimal control approach. An important theoretical result is that the controllers that are the most robust against model errors do not always correspond to the controllers that are the most efficient (Boulet & Duan, 2007; Michiels & Niculescu, 2007). In other words, improving the robustness of control may degrade performance, whereas optimizing a performance criterion can make the control design more fragile to model errors. Com-patible with these principles, previous studies have sug-gested that motor performance is altered to maintain performance or preserve movement smoothness in conditions of higher uncertainty (Crevecoeur, McIntyre, Thonnard, & Lefèvre, 2010; Ronsse, Thonnard, Lefèvre, & Sepulchre, 2008). However, to our knowledge, the trade-off between the efficiency and robustness of biological motor control and its influence on motor planning and execution has not been thoroughly investigated. Given that internal models of body and

environmental dynamics can never be known exactly, robustness may be an important consideration in motor neuroscience.

Application to biological control: Flexible sensorimotor control strategies

The motor system has a remarkable ability to perform successfully while never reproducing exactly the same movement. This consistent success in the presence of variability suggests the central nervous system is well aware of the constraints of the task at hand and is less concerned about errors that do not affect performance. This tendency is often referred to as the minimum inter-vention principle (Todorov & Jordan, 2002) and is cap-tured by the ability to ignore limb deviations that do not interfere with task completion. The same idea is reflected in the notion of structured variability or an uncontrolled manifold, where limb and whole-body motion are more variable along dimensions that are irrelevant for the task (Balasubramaniam, Riley, & Turvey, 2000; Scholz & Schoner, 1999; Valero-Cuevas, Venkadesan, & Todorov, 2009; Cluff et al., 2011).

Optimal feedback control provides a framework for us to understand task-related error corrections. Because motor commands have a cost, there is no need to control movement errors that do not interfere with the intended goal. This trade-off between behavioral per-formance and motor costs, expressed in a straightfor-ward way by the quadratic cost function in Eq. 10, is only possible if the brain continuously processes sensory data to select control actions that are appropriate for the goal and constraints of the task. In agreement with this principle, we review several studies emphasizing that this type of flexible, task-dependent feedback control underlies both voluntary motor behavior and responses to external perturbations.

Visuomotor Feedback Responses Liu and Todorov (2007) provided compelling evidence for flexible bio-logical control strategies by demonstrating that feed-back responses depend on the hand’s position when a goal target changes location during reaching. In this experiment, visual target perturbations were intro-duced at the start (early) or near the end of a reaching movement (late; figure 39.1A) in a task where subjects were instructed to stop at a peripheral target. When the target location was perturbed early in the movement, the participants corrected their hand path smoothly (figure 39.1B). In contrast, hand path corrections were incomplete when the same perturbation was introduced late in the movement (figure 39.1B). Liu and Todorov suggested that the dependency of the correction upon

Page 7: New 39 Computational Approaches for Goal-Directed Movement … · 2015. 10. 2. · available sensory data to produce successful behavior. Harris and Wolpert (1998) considered the

K2

PROPERTY OF MIT PRESS: FOR PROOFREADING AND INDEXING PURPOSES ONLY

crevecoeur, cluff and scott: goal-directed movement planning and execution 441

the timing of the target jump was caused by the inher-ent trade-off between end point accuracy and motor costs. In order to stop near the target, the controller became more sensitive to movement velocity than end point accuracy, leading to consistent under-compensa-tion for target errors introduced at the end of the reaching movement. This systematic under-compensa-tion was reduced when the task instructions were to hit rather than stop at the target (figure 39.1B). Hitting the target removes the constraint on final hand velocity and allows participants to fully correct for positional errors introduced near the target.

Several other attributes of visuomotor control have been addressed in the context of target or cursor (i.e., hand feedback) jumps. In agreement with the princi-ples of stochastic optimal control, several studies have shown that visual perturbation responses are modu-lated by the reliability of the cursor or target location (Izawa & Shadmehr, 2008; Körding & Wolpert, 2004), by the shape of the goal target (Knill, Bondada, & Chhabra, 2011), and by the relevance of the visual cursor jump relative to the reaching target (Franklin & Wolpert, 2008). As we mentioned earlier, the aim of OFC models is not to eliminate all variability, but rather, allow it to accumulate in dimensions that do not inter-fere with performance (Todorov, 2004) while minimiz-ing it in dimensions that are relevant for task completion. Knill and colleagues (2011) outlined the same type of selective motor corrections during visuomotor control. Indeed, Knill and colleagues showed that motor correc-tions following lateral cursor jumps were nearly twice as large for rectangular targets oriented parallel to the movement path compared to when the target was per-pendicular to the reach.

Another clear example of flexible visuomotor control is when subjects reach in the presence of visual pertur-bations that may or may not affect reaching perfor-mance (Franklin & Wolpert, 2008). Franklin and Wolpert characterized these selective corrections by unexpectedly shifting hand feedback during point-to-point reaching movements. The hand cursor distur-bance either persisted until the end of movement (relevant), or returned to the veridical hand location before the end of the movement (irrelevant, figure 39.2D). By shifting the visual cursor location, it was shown that the motor system selectively corrects hand feedback perturbations that affect the outcome of the task (figure 39.2E), while ignoring perturbations that do not affect performance (figure 39.2F).

In summary, visuomotor perturbation studies have shown that the brain produces distinct feedback responses when the same perturbation is encountered in different behavioral contexts. Consistent with optimal

control models, these results reveal that the motor system selectively corrects for visual perturbations that jeopardize task performance while taking into consid-eration the constraints of the task. The latency of visuomotor corrections was consistently observed in 150–230 ms (Franklin & Wolpert, 2008; Knill et al., 2011), which, after removing delays associated with signal transmission, suggests the brain can rapidly implement flexible feedback responses after a perturba-tion. In the following section, we present results from mechanical perturbation studies suggesting that flexi-ble feedback responses can be implemented in as little as ∼50 ms when they are mediated by rapid changes in limb afferent feedback.

Mechanical Perturbations Investigating how quickly the motor system implements flexible control strategies can provide important insight into the neural pathways involved in feedback control. Mechanical perturbations offer a powerful means to address this problem, because muscle afferent feedback evokes motor responses on a time scale of tens of milliseconds. In fact, when the limb is displaced by a mechanical perturbation, the motor system produces a stereotyped sequence of muscle activity, beginning with the short-latency stretch reflex (response epoch called R1: 20–50 ms post-perturbation) and ending with a volun-tary response (>100 ms). The short-latency stretch reflex is the earliest of these responses, and due to its timing can be attributed to spinal processing. These spinal stretch responses are sensitive to joint motion, but show little of the functional complexity expressed during voluntary behavior (Pruszynski, Kurtzer, & Scott, 2008).

Between the short-latency and voluntary responses is the long-latency stretch response (R2/R3: 50–105 ms post-perturbation), which includes responses gener-ated from multiple neural substrates (Pruszynski, Kurtzer, & Scott, 2011), including spinal (Ghez & Shinoda, 1978; Matthews, 1984; Schuurmans et al., 2009) and supraspinal pathways (Evarts, 1973; Phillips, 1969). A robust observation in motor physiology studies is that feedback corrections in the long-latency time window exhibit remarkable flexibility and can be modi-fied by the subject’s voluntary intent (Crago, Houk, & Hasan, 1976; Hammond, 1956; Rothwell, Traub, & Marsden, 1980) or the demands of an ongoing motor action (see Hasan, 2005; Matthews, 1991; Pruszynski & Scott, 2012, for a comprehensive review).

As we mentioned earlier, OFC models suggest the ability to alter feedback responses to a mechanical per-turbation is a direct consequence of task-dependent sensorimotor processing (Scott, 2004). Nashed and

Page 8: New 39 Computational Approaches for Goal-Directed Movement … · 2015. 10. 2. · available sensory data to produce successful behavior. Harris and Wolpert (1998) considered the

K2

PROPERTY OF MIT PRESS: FOR PROOFREADING AND INDEXING PURPOSES ONLY

442 motor systems and action

Figure 39.2 Visuomotor responses account for the goal of the ongoing task. (A) Subjects were instructed to reach and stop at a target (“stop” condition) that either stayed in the central location or jumped in the lateral direction after the start of the movement. Data are the population average tra-jectories when participants were instructed to stop at the target. Color code: black, baseline; red, early perturbation; blue, late perturbation. (B) Hand path deviation in the direc-tion of the displaced target. Note that subjects were unable to compensate when the target jumped late in the movement, demonstrating that feedback responses depend on the task constraints. Color scheme is the same as in A. Dashed lines, “hit” condition; solid lines, “stop” condition. (C) Results from target intercept experiment. In this experiment, the target jumped laterally and then moved downward at a fixed rate. The subjects were instructed to “stop” or “hit” the target before it stopped moving. (Adapted from Liu & Todorov, 2007.) (D) Visuomotor responses only correct for errors that

0

5

10

15

20

25

Pos

ition

[cm

]

-4 0 4 -4 0 4

Position [cm]-4 0 4

NormalTask-relevant

feedbackTask-irrelevant

feedback

0

1

2

3

0

1

2

3M

ean

forc

e [N

]M

ean

forc

e [N

]

Pre-normal Task-relevantfeedback

1 14 1 14 1 14

Trial blocks

1 14 1 14 1 14

Trial blocks

Post-normal

Pre-normal Task-irrelevantfeedback

Post-normal

D

E

F

0

5

10

0 200 400 600 800

0

5

10

0 200 400 600 800 Early Late0

1

Early Late0

1

Late

ral p

ositi

on [c

m]

Und

ersh

oot [

cm]

Und

ersh

oot [

cm]

Time [ms]

Target

Hit

Stop

40 cm

20 cm

35 c

m

Late

ral p

ositi

on [c

m]

Time [ms] Perturbation

Perturbation

Stop

Hit

Target

A

B

C

are relevant to the ongoing task. In the normal condition, the hand feedback cursor reproduced the hand trajectory. In the task-relevant feedback condition (orange traces), the visual cursor moved away from the hand trajectory and remained as this point for the rest of the movement. In contrast, in the task-irrelevant feedback condition (blue traces), the hand feedback cursor moved away from the hand trajectory but returned to the true hand position by the end of the move-ment. (E) Time course of adaptation to task-relevant feedback perturbations. Data are the mean (solid line) and standard deviation (shaded region) force difference across subjects between right and left visual perturbations (180–230 ms after cursor jump). Note that subjects produced larger corrections when the cursor displacements were relevant to the ongoing task. (F) Time course of adaptation to task-irrelevant feedback perturbations. Note that subjects did not adapt when the perturbations were irrelevant to the ongoing task. (Adapted from Franklin & Wolpert, 2008.) (See color plate 35.)

Page 9: New 39 Computational Approaches for Goal-Directed Movement … · 2015. 10. 2. · available sensory data to produce successful behavior. Harris and Wolpert (1998) considered the

K2

PROPERTY OF MIT PRESS: FOR PROOFREADING AND INDEXING PURPOSES ONLY

crevecoeur, cluff and scott: goal-directed movement planning and execution 443

Figure 39.3 Corrective responses to mechanical perturba-tions applied during reaching and postural control. (A) Dif-ferences in corrective responses while reaching to a circle target (left panel) or rectangular bar (middle panel). Black dotted lines are unperturbed reaching movements to each target shape. Extensor perturbations were applied at the shoulder and elbow on random trials to elicit elbow motion just after movement onset. Corrective responses were rapidly directed back to the circular target (black solid lines), but in contrast, were directed to new locations on the rectangular bar (gray lines). Note that differences between corrections can be observed in brachioradialis stretch responses (popula-tion data, elbow flexor) within ∼60 ms of the perturbation (R2 and R3; right panel). Vertical lines denote perturbation onset, and dashed lines denote separation of the different phases of the stretch response. (B) Differences in corrective responses while reaching to a rectangular bar occluded by environmen-tal objects (left panel) and an unoccluded rectangular bar (middle panel). Corrective responses directed the hand between the objects to the target (black solid lines), but directed the hand to new locations on the unoccluded rect-angular bar (gray lines). Note that the stretch response begins to reflect the target shape ∼60 ms after the

∆EM

G [a

u]

2 cmx

y

Perturbed reaching

Unperturbed reaching

Time [ms]

∆EM

G [a

u]

–50 0 50 100 1500

6 R1R2 R3A

B

C

D E

0

3

Time [ms]–50 0 50 100 150

R1R2 R3

Tim

e [m

s]

One-cursor task:matching perturbations

One-cursor task:mirror perturbations

–100

0

100

200

300

400

Hand X position (cm)Right hand Left hand

–10 –5 0 5 10

500

−50 0 50 100 150 2002500

2

4

6

8

10

−1

0

1

2

3

∆EM

G [a

u]

−50 0 50 100 150 200250

EM

G [a

u]

Time [ms]

MirrorMatchingR1 R2R3

CursorActualposition

Pert.

perturbation. (Adapted from Nashed et al., 2012.) (C) Cor-rective responses in a bimanual postural control task. In the mirror perturbation condition (left panel), the hands are perturbed in opposite directions (black arrows). In the match-ing perturbations condition (right panel), the hands are perturbed in the same direction (gray arrows), causing dis-placement of the feedback cursor that is displayed at the spatial average position of the two hands. (D) Lateral hand path kinematics. Thin lines correspond to individual subject data; thick lines denote the population average for each condition. Color scheme is the same as in C. Shaded box is the acceptable target region. (E) Pectoralis major stretch responses in the mirror (black lines) and matching conditions (gray lines). Data are the population average response, and shaded region corresponds to the SEM. Bottom panel is the difference between stretch responses in the matching and mirror conditions (Matching – Mirror). Dotted line denotes average difference across participants, shaded region is the SEM. Note that differences in muscle stretch responses emerge ∼75 ms after the perturbation, demonstrating that right limb responses depend on feedback about the left arm’s motion. (Adapted from Omrani et al., 2013.)

colleagues (2012) recently investigated whether long-latency responses selectively correct for task-relevant errors while subjects made reaching movements to a circular target or rectangular bar oriented perpendicu-lar to the reach (figure 39.3A). On certain trials, a mechanical perturbation was applied to displace the hand in the lateral direction. When the perturbation pushed the hand away from the circular target, the participants performed rapid corrective responses to direct their hand back to the target. In contrast, when the same perturbation was applied while subjects reached to the rectangular bar, the participants

redirected their hand to new locations on the bar. Similar context-dependent responses were evoked when obstacles in the environment required that the participants navigate to the target through a narrow channel (figure 39.3B). These behavioral results were reproduced by an optimal feedback control model with differing sensitivity to lateral hand errors. Further, the model predicted that the shape of the goal target should influence the feedback response as early as sensory feedback about the perturbation became available. In agreement with these model predictions, differences in muscle responses between tasks were observed in as

Page 10: New 39 Computational Approaches for Goal-Directed Movement … · 2015. 10. 2. · available sensory data to produce successful behavior. Harris and Wolpert (1998) considered the

K2

PROPERTY OF MIT PRESS: FOR PROOFREADING AND INDEXING PURPOSES ONLY

444 motor systems and action

little as 70 ms following the perturbation, establishing that rapid motor corrections (i.e., long-latency responses) integrate muscle stretch information with knowledge about the behavioral goal and spatial features of the environment (Nashed et al., 2012).

Flexible Bimanual Feedback Control In the context of sudden mechanical perturbations, Marsden and colleagues (Marsden, Merton, & Morton, 1981) first outlined the task-dependency of interlimb responses, noting that long-latency muscle stretch responses in the right arm after left arm perturbations reflected the task the right arm was performing. If the right arm held a table for support, the extensors were activated to stabilize the participant following the left arm perturbation. Remarkably, subjects even reversed their corrective responses and activated the flexor muscles of the right arm if they had to stabilize a cup of tea. Perhaps the most powerful example of how the task modulates interlimb feedback responses is that muscle responses were absent if the subject grasped a loose handle and benefitted little from right arm responses. The coordinated responses observed in the muscles of the unperturbed arm clearly emphasize that online feedback control is not hard-wired but can be engaged at will, depending on the context and on the intended behavior.

Bimanual control therefore provides a remarkable tool to address how sophisticated feedback control can be distributed across different body parts. Motivated by the capacity of OFC models to exploit many different ways to attain the same movement goal (i.e., task redun-dancy), recent studies have examined feedback correc-tions in bimanual tasks by comparing motor responses when the two arms act independently or are coupled by the task demand. A compelling example of this flex-ibility is when one hand is gradually perturbed in a task that requires bimanual reaching movements (Diedrich-sen, 2007). If each hand controls its own cursor while reaching to separate targets (two-cursor task), only the perturbed hand shows a corrective response. However, when the two arms control a single cursor (one-cursor task), displayed as the spatial average position of the two hands, perturbations applied to one hand elicit bilateral responses to correct the cursor’s trajectory (Diedrichsen, 2007). These flexible responses were reproduced by expressing the task constraints (Qt in Eq. 7) for each hand independently, or as the average of the two hands. The flexibility of bilateral corrections is not restricted to changes in the size of the response, as even the direction of these coordinated responses can be reversed if required by the task (Diedrichsen & Gush, 2009).

Studies have since confirmed that sensory informa-tion from one limb in both reaching (Mutha & Sain-burg, 2009) and posture (Dimitriou, Franklin, & Wolpert, 2012; Omrani, Diedrichsen, & Scott, 2013) rapidly modifies corrective responses in the other limb during the long-latency time window. When the two hands control independent cursors and are perturbed in opposite directions, robust long-latency responses are observed in both arms to counter the perturbation (Omrani et al., 2013). In contrast, long-latency stretch responses are substantially smaller when the two hands are perturbed in opposite directions while controlling a single cursor displayed at the spatial average position of the two hands. In this context, corrective responses are unnecessary because the position of the single feed-back cursor is not disturbed by the perturbation. When the direction of the left arm perturbation was not pre-dictable, stretch responses in the right arm depended on proprioceptive input from the left arm in the one-cursor condition, and were larger if both arms were perturbed in the same direction (figure 39.3C, D, and E). These differential feedback responses were not observed when each hand controlled its own indepen-dent cursor. Why should the brain distribute corrective responses when several effectors are involved in the task? Optimal feedback control predicts this behavioral pattern, since dividing the response across effectors reduces the effort and variability of motor corrections (Diedrichsen & Dowling, 2009).

Internal Models of Multijoint Dynamics We have so far focused on evidence that the motor system continuously processes sensory data to generate task-dependent feedback responses. It is important to recognize, however, that task-dependent feedback responses can only be achieved if the motor system has knowledge of how the body should move in response to external forces that arise from our environmental interactions or forces generated by muscles. An impor-tant feature of body dynamics is the presence of interac-tion torques between joints that require coordinated multijoint responses to control the motion at each joint. Extensive evidence has shown that the voluntary motor system compensates for these interaction torques to produce straight reaching movements (Gribble & Ostry, 1999; Hollerbach & Flash, 1982), but an interest-ing question is whether corrective responses also reflect knowledge of limb mechanics.

Previous work emphasized that mechanical perturba-tions evoked rapid responses in muscles that are not directly stretched by the perturbation, suggesting that coordinated motor responses occur in the long-latency time window (Gielen, Ramaekers, & van Zuylen, 1988;

Page 11: New 39 Computational Approaches for Goal-Directed Movement … · 2015. 10. 2. · available sensory data to produce successful behavior. Harris and Wolpert (1998) considered the

K2

PROPERTY OF MIT PRESS: FOR PROOFREADING AND INDEXING PURPOSES ONLY

crevecoeur, cluff and scott: goal-directed movement planning and execution 445

Figure 39.4 Long-latency responses express knowledge of multijoint limb mechanics. (A) Subjects were instructed to maintain their hand at a central target, and step-torque per-turbations were applied to the shoulder and elbow joints (flexor torques at both joints, dark gray shading and denoted by (F), extensor torques at both joints, light gray shading and denoted by (E). Stretch responses were recorded from the posterior deltoid muscle (PD) (B) Limb configuration and applied multijoint torques were selected to cause substantial elbow motion (dashed lines) but minimal shoulder motion (solid lines). (C) Posterior deltoid (shoulder extensor) muscle activity aligned on perturbation onset. Note that

–50 0 50 100 150

1

3

5

7

Pos

t. D

elt.

EM

G [a

u]

A

B

C

DShoulderElbow

–8

–4

0

4

8

∆Joi

nt A

ngle

[deg

]

Time [ms]–25 0 50 100

τe

τs

τe

τs

R1 R3R2

75ο

45ο

PD

Responding to underlying torque

~50~20

Firin

g R

ate

[Hz]

10

20

30

40

50

60

Nonspecific response

Time [ms]

Perturbationonset

0 100 200

Applied multijoint torques

Flexor (F) Extensor (E)

(F)

(E)

(E)

(F)

(E)

(F)

although there was no change in the length of the posterior deltoid muscle, there is still a robust long-latency response (excitatory and inhibitory). Data are the population-level muscle response, and shaded region corresponds to SEM. Same color scheme as in A and B. (Adapted from Kurtzer et al., 2008.) (D) Population-level response of shoulder-like M1 neurons. Shoulder-like neurons respond to the underlying shoulder torque even though local information from the shoulder is ambiguous about the underlying torque. Note that the response to the underlying torque begins about 50 ms after the onset of the shoulder and elbow perturbations. (Adapted from Pruszynski et al., 2012.)

Soechting & Lacquaniti, 1988). The question of whether these responses relate to limb dynamics was recently addressed by applying different combinations of pertur-bations to the shoulder and elbow. In one experiment, multijoint loads applied to the shoulder and elbow did not produce motion at the shoulder but led to either flexion or extension motion at the elbow (Kurtzer, Pruszynski, & Scott, 2008, 2009; figure 39.4A, B, and C). The authors found that there was no short-latency muscle stretch response in the posterior deltoid, a shoulder extensor, highlighting that this spinal reflex is not elicited without overt motion at the joint (figure 39.4B, C). In contrast, the long-latency response (50–105 ms post-perturbation) integrated motion infor-mation from both joints and generated a large response in the posterior deltoid to counter the shoulder flexor torque, even though there was no motion at the joint (figure 39.4C). These results clearly showed that rapid motor corrections map the sensed motion of

the shoulder and elbow joints onto a response that is appropriate for the actual underlying torque rather than the observed motion pattern.

It is important to emphasize that OFC as a theory of motor behavior can only tell us what the optimal motor solution should look like. The studies presented above highlight feedback responses that possess an impressive degree of flexibility and can be modified to suit the needs of many behavioral tasks. It has been demon-strated that long-latency stretch responses are modified in different dynamic environments (Ahmadi-Pajouh, Towhidkhah, & Shadmehr, 2012; Kimura & Gomi, 2009; Krutky, Ravichandran, Trumbower, & Perreault, 2010), and a direct prediction is that corrective responses should be modulated by the novel dynamical context. Changes in corrective responses to a gradual perturba-tion have been observed over longer time scales (Wagner & Smith, 2008). A recent study shows that long-latency responses also express knowledge of

Page 12: New 39 Computational Approaches for Goal-Directed Movement … · 2015. 10. 2. · available sensory data to produce successful behavior. Harris and Wolpert (1998) considered the

K2

PROPERTY OF MIT PRESS: FOR PROOFREADING AND INDEXING PURPOSES ONLY

446 motor systems and action

internal models acquired during motor learning (Cluff & Scott, 2013). This result emphasizes that adaptive changes to novel dynamics alter voluntary behavior and rapid feedback responses.

In summary, the studies outlined above emphasized the following principles: (1) upper-limb postural control and reaching involve continuous sensory processing; (2) feedback control processes selectively compensate for errors that interfere with the ongoing task, produc-ing motor strategies appropriate for the movement goal; (3) these principles describe voluntary motor behavior as well as responses to visual and mechanical perturbations; and (4) sophisticated feedback responses emerge ∼50–60 ms following a mechanical perturba-tion, coinciding with the long-latency stretch response.

M1 as part of a flexible feedback controller

A robust observation across the studies outlined above is that short-latency responses (∼20–50 ms) are pre-dominantly sensitive to muscle stretch (Pruszynski & Scott, 2012), whereas task-dependent responses consis-tently emerge in the long-latency time window (∼50–100 ms). Long-latency responses coincide with the contribution of long-loop mechanisms, including a transcortical pathway through M1 (Cheney & Fetz, 1984; Desmedt, 1978; Matthews, 1991). Indeed, single-unit recordings in monkeys indicate that M1 receives somatosensory feedback (Evarts & Fromm, 1977; Scott & Kalaska, 1997) or mechanical perturbations (Herter, Korbel, & Scott, 2009; Picard & Smith, 1992). More-over, human studies using transcranial magnetic stimu-lation emphasize a causal link between M1 processing and long-latency stretch responses, since motor cortex stimulation disrupts the task-dependent features of long-latency responses (Capaday, Forget, Fraser, & Lamarre, 1991; Day, Riescher, Struppler, Rothwell, & Marsden, 1991; Kimura, Haggard, & Gomi, 2006, but see also Shemmell, An, & Perreault, 2009). Given the involvement of M1 in the generation of voluntary behavior (Porter & Lemon, 1993) and rapid feedback pathways, this brain region is a clear candidate to implement flexible feedback control strategies (Scott, 2004).

A number of neurophysiological studies in nonhu-man primates have highlighted that a transcortical feed-back through M1 provides important task-dependent processing following mechanical perturbations. In a seminal study, Evarts and Tanji (1976) found that within 40 ms of an upper limb perturbation, the response of pyramidal tract neurons differ depending on whether the monkey was instructed to push or pull a handle. The authors suggested that these task-dependent neural

responses contribute to volitional control of the limb during the long-latency time window. This idea was recently tested in a study examining whether the trans-cortical feedback pathway through M1 exhibits knowl-edge of the limb’s biomechanical properties (Pruszynski et al., 2011) using the paradigm developed by Kurtzer et al. (2008). The authors applied different combina-tions of shoulder and elbow torques evoking flexor or extensor motion at the elbow and no motion at the shoulder. As a result of these multijoint perturbation loads, shoulder motion (none, in this case) is ambigu-ous about the applied perturbation, and the motor system can only appropriately counter the underlying torque by taking elbow motion into consideration. The responses of shoulder-related neurons in MI were found to appropriately respond to the applied load at ∼50 ms, about 15 ms before long-latency responses were recorded in shoulder muscles (figure 39.4D).

How this transcortical feedback pathway resolves this multijoint integration problem is unclear. One clue is that the earliest activity in MI, from 20 to 50 ms after a perturbation, does not reflect specific features of the perturbation (figure 39.4D). That is, regardless of per-turbation direction, all neurons sensitive to shoulder or elbow motion display similar responses until 50 ms after the perturbation is applied. This may suggest that M1 requires ∼30 ms to identify the appropriate response, or that knowledge of limb mechanics is computed else-where in sensorimotor circuits. Several brain regions that receive sensory feedback from the limb project to M1, including primary somatosensory cortex, parietal area 5, and cerebellum (Fromm & Evarts, 1982; Martin, Cooper, Hacking, & Ghez, 2000; Mason, Miller, Baker, & Houk, 1998). There was considerable interest in examining how brain regions including M1 responded to sensory feedback in the 1970s (for a review, see Desmedt, 1978), but since then the role of sensory feed-back in M1 processing has received little attention from the scientific community. Given the tight link between voluntary control and sensory feedback processing, the use of OFC as a framework to understand voluntary motor control has led to a renewed interest in how dif-ferent cortical and subcortical circuits participate in feedback processing for motor control (Scott, 2012).

Conclusion

We have argued that OFC is a powerful tool for under-standing biological motor control and that it has shed light on many of the complexities the brain must con-sider to move successfully in the presence of neural variability or environmental disturbances. Perhaps the most important contribution of OFC in movement

Page 13: New 39 Computational Approaches for Goal-Directed Movement … · 2015. 10. 2. · available sensory data to produce successful behavior. Harris and Wolpert (1998) considered the

K2

PROPERTY OF MIT PRESS: FOR PROOFREADING AND INDEXING PURPOSES ONLY

crevecoeur, cluff and scott: goal-directed movement planning and execution 447

neuroscience has been to unify motor planning and feedback responses to perturbations in a common framework. These two aspects of motor control have been almost relegated to distinct fields of investigation (Scott, 2008). The major conceptual advance of OFC is the idea that movement planning and execution are two sides of the same story: a goal-directed feedback control policy. We expect that future research will address how the computations underlying flexible feedback control are distributed across different brain regions.

REFERENCES

Ahmadi-Pajouh, M. A., Towhidkhah, F., & Shadmehr, R. (2012). Preparing to reach: Selecting an adaptive long-latency feedback controller. J Neurosci, 32(28), 9537–9545.

Ariff, G., Donchin, O., Nanayakkara, T., & Shadmehr, R. (2002). A real-time state predictor in motor control: Study of saccadic eye movements during unseen reaching move-ments. J Neurosci, 22(17), 7721–7729.

Arnold, L. (1974). Stochastic differential equations: Theory and applications. New York, NY: Wiley.

Åström, K. (1970). Introduction to stochastic control theory. New York, NY: Academic Press.

Balasubramaniam, R., Riley, M. A., & Turvey, M. (2000). Specificity of postural sway to the demands of a precision task. Gait Posture, 11(1), 12–24.

Berret, B., Chiovetto, E., Nori, F., & Pozzo, T. (2011). Evidence for composite cost functions in arm movement planning: An inverse optimal control approach. PLoS Comput Biol, 7(10), e1002183.

Bhattacharyya, S. P., Chapellat, H., & Keel, L. H. (1995). Robust control: The parametric approach. Englewood Cliffs, NJ: Prentice Hall.

Biess, A., Liebermann, D. G., & Flash, T. (2007). A compu-tational model for redundant human three-dimensional pointing movements: Integration of independent spatial and temporal motor plans simplifies movement dynamics. J Neurosci, 27(48), 13045–13064.

Boulet, B., & Duan, X. (2007). The fundamental tradeoff between performance and robustness: a new perspective on loop shaping. IEEE Contr Syst Mag, 27(3), 30–44.

Brown, R. (1983). Introduction to random signal analysis and Kalman filtering. New York, NY: Wiley.

Bryson, A., & Ho, Y.-C. (1975). Applied optimal control: Optimi-zation, estimation, and control. New York, NY: Hemisphere Publishing.

Capaday, C., Forget, R., Fraser, R., & Lamarre, Y. (1991). Evidence for a contribution of the motor cortex to the long-latency stretch reflex of the human thumb. J Physiol, 440(1), 243–255.

Cheney, P. D., & Fetz, E. E. (1984). Corticomotoneuronal cells contribute to long-latency stretch reflexes in the rhesus monkey. J Physiol, 349, 249–272.

Churchland, M. M., Afshar, A., & Shenoy, K. V. (2006). A central source of movement variability. Neuron, 52(6), 1085–1096.

Cluff, T., & Scott, S. H. (2013). Rapid feedback responses correlate with reach adaptation and properties of novel upper limb loads. J Neurosci, 33(40), 15903–15914.

Cluff, T., Aspasia, M., Lee, T. D., & Balasubramaniam, R. (2011). Multijoint error compensation mediates unstable object control. J Neurophysiol, 108, 1167–1175.

Crago, P. E., Houk, J. C., & Hasan, Z. (1976). Regulatory actions of human stretch reflex. J Neurophysiol, 39(5), 925–935.

Crevecoeur, F., & Scott, S. H. (2013). Priors engaged in long-latency responses to mechanical perturbations suggest a rapid updated in state estimation. PLoS Comput Biol, 9(8), e1003177.

Crevecoeur, F., McIntyre, J., Thonnard, J.-L., & Lefèvre, P. (2010). Movement stability under uncertain internal models of dynamics. J Neurophysiol, 104, 1301–1313.

Crevecoeur, F., Sepulchre, R. J., Thonnard, J.-L., & Lefèvre, P. (2011). Improving the state estimation for optimal control of stochastic processes subject to multipli-cative noise. Automatica, 47(3), 591–596.

Day, B. L., Riescher, H., Struppler, A., Rothwell, J. C., & Marsden, C. D. (1991). Changes in the response to mag-netic and electrical stimulation of the motor cortex following muscle stretch in man. J Physiol, 433(1), 41–57.

Desmedt, J. (1978). Cerebral motor control in man: Long loop mechanisms. New York, NY: Karger.

Diedrichsen, J. (2007). Optimal task-dependent changes of bimanual feedback control and adaptation. Curr Biol, 17(19), 1675–1679.

Diedrichsen, J., & Dowling, N. (2009). Bimanual coordina-tion as task-dependent linear control policies. Hum Move-ment Sci, 28(3), 334–347.

Diedrichsen, J., & Gush, S. (2009). Reversal of bimanual feedback responses with changes in task goal. J Neurophysiol, 101(1), 283–288.

Dimitriou, M., Franklin, D. W., & Wolpert, D. M. (2012). Task-dependent coordination of rapid bimanual motor responses. J Neurophysiol, 107(3), 890–901.

Doyle, J. C., Francis, B. A., & Tannenbaum, A. R. (1992). Feedback control theory. New York, NY: Macmillan.

Evarts, E. V. (1973). Motor cortex reflexes associated with learned movement. Science, 179(4072), 501–503.

Evarts, E. V., & Fromm, C. (1977). Sensory responses in motor cortex neurons during precise motor control. Neuro-sci Lett, 5(5), 267–272.

Evarts, E. V., & Tanji, J. (1976). Reflex and intended responses in motor cortex pyramidal tract neurons of monkey. J Neurophysiol, 39, 1069–1080.

Faisal, A. A., Selen, L. P. J., & Wolpert, D. M. (2008). Noise in the nervous system. Nat Rev Neurosci, 9(4), 292–303.

Flash, T., & Hogan, N. (1985). The coordination of arm movements: An experimentally confirmed mathematical model. J Neurosci, 5(7), 1688–1703.

Franklin, D. W., & Wolpert, D. M. (2008). Specificity of reflex adaptation for task-relevant variability. J Neurosci, 28(52), 14165–14175.

Franklin, S., Wolpert, D. M., & Franklin, D. W. (2012). Visuomotor feedback gains upregulate during the learning of novel dynamics. J Neurophysiol, 108(2), 467–478.

Fromm, C., & Evarts, E. V. (1982). Pyramidal tract neurons in somatosensory cortex: Central and peripheral inputs during voluntary movement. Brain Res, 238(1), 186–191.

1

Page 14: New 39 Computational Approaches for Goal-Directed Movement … · 2015. 10. 2. · available sensory data to produce successful behavior. Harris and Wolpert (1998) considered the

K2

PROPERTY OF MIT PRESS: FOR PROOFREADING AND INDEXING PURPOSES ONLY

448 motor systems and action

Ghez, C., & Shinoda, Y. (1978). Spinal mechanisms of the functional stretch reflex. Exp Brain Res, 32, 55–68.

Gielen, C. C., Ramaekers, L., & van Zuylen, E. J. (1988). Long-latency stretch reflexes as co-ordinated functional responses in man. J Physiol, 407(1), 275–292.

Gribble, P. L., & Ostry, D. J. (1999). Compensation for interaction torques during single- and multijoint limb movement. J Neurophysiol, 82(5), 2310–2326.

Hammond, P. H. (1956). The influence of prior instruction to the subject on an apparently involuntary neuro-muscular response. J Physiol, 132(1), 17–18P.

Harris, C. M., & Wolpert, D. M. (1998). Signal-dependent noise determines motor planning. Nature, 394(6695), 780–784.

Hasan, Z. (2005). The human motor control system’s response to mechanical perturbation: Should it, can it and does it ensure stability? J Motor Behav, 37(6), 484–493.

Herter, T. M., Korbel, T., & Scott, S. H. (2009). Compari-son of neural responses in primary motor cortex to tran-sient and continuous loads during posture. J Neurophysiol, 101(1), 150–163.

Hollerbach, J., & Flash, T. (1982). Dynamic interactions between limb segments during planar arm movement. Biol Cybern, 44, 67–77.

Izawa, J., & Shadmehr, R. (2008). On-line processing of uncertain information in visuomotor control. J Neurosci, 28(44), 11360–11368.

Kalman, R. (1960). A new approach to linear filtering and prediction problems. J Basic Eng-T ASME, 82, 35–45.

Kawato, M. (1999). Internal models for motor control and trajectory planning. Curr Opin Neurobiol, 9(6), 718–727.

Kimura, T., & Gomi, H. (2009). Temporal development of anticipatory reflex modulation to dynamical interactions during arm movement. J Neurophysiol, 102(4), 2220–2231.

Kimura, T., Haggard, P., & Gomi, H. (2006). Transcranial magnetic stimulation over sensorimotor cortex disrupts anticipatory reflex gain modulation for skilled action. J Neurosci, 26(36), 9272–9281.

Knill, D. C., Bondada, A., & Chhabra, M. (2011). Flexible, task-dependent use of sensory feedback to control hand movements. J Neurosci, 31(4), 1219–1237.

Körding, K. P., & Wolpert, D. M. (2004). Bayesian integra-tion in sensorimotor learning. Nature, 427(6971), 244–247.

Krutky, M. A., Ravichandran, V. J., Trumbower, R. D., & Perreault, E. J. (2010). Interactions between limb and environmental mechanics influence stretch reflex sensitiv-ity in the human arm. J Neurophysiol, 103(1), 429–440.

Kurtzer, I. L., Pruszynski, J. A., & Scott, S. H. (2008). Long-latency reflexes of the human arm reflect an internal model of limb dynamics. Curr Biol, 18(6), 449–453.

Kurtzer, I., Pruszynski, J. A., & Scott, S. H. (2009). Long-latency responses during reaching account for the mechan-ical interaction between the shoulder and elbow joints. J Neurophysiol, 102(5), 3004–3015.

Li, W., & Todorov, E. (2007). Iterative linearization methods for approximately optimal control and estimation of non-linear stochastic system. Int J Contr, 80(9), 1439–1453.

Liu, D., & Todorov, E. (2007). Evidence for the flexible sensorimotor strategies predicted by optimal feedback control. J Neurosci, 27(35), 9354–9368.

Marsden, C. D., Merton, P. A., & Morton, H. B. (1981). Human postural responses. Brain, 104(3), 513–534.

Martin, J. H., Cooper, S. E., Hacking, A., & Ghez, C. (2000). Differential effects of deep cerebellar nuclei inactivation on reaching and adaptive control. J Neurophysiol, 83(4), 1886–1899.

Mason, C. R., Miller, L. E., Baker, J. F., & Houk, J. C. (1998). Organization of reaching and grasping movements in the primate cerebellar nuclei as revealed by focal musci-mol inactivations. J Neurophysiol, 79(2), 537–554.

Matthews, P. B. (1984). Evidence from the use of vibration that the human long-latency stretch reflex depends upon spindle secondary afferents. J Physiol, 348(1), 383–415.

Matthews, P. B. C. (1991). The human stretch reflex and the motor cortex. Trends Neurosci, 14(3), 87–91.

Mehta, B., & Schaal, S. (2002). Forward models in visuomo-tor control. J Neurophysiol, 88(2), 942–953.

Michiels, W., & Niculescu, S.-I. (2007). Stability and stabi-lization of time-delay systems. An eigenvalue-based approach. In Advances in design and control, 12. Philadelphia, PA: Society for Industrial and Applied Mathematics (SIAM) Publications.

Mutha, P. K., & Sainburg, R. L. (2009). Shared bimanual tasks elicit bimanual reflexes during movement. J Neuro-physiol, 102(6), 3142–3155.

Nakano, E., Imamizu, H., Osu, R., Uno, Y., Gomi, H., Yoshioka, T., & Kawato, M. (1999). Quantitative exami-nations of internal representations for arm trajectory plan-ning: Minimum commanded torque change model. J Neurophysiol, 81(5), 2140–2155.

Nashed, J. Y., Crevecoeur, F., & Scott, S. H. (2012). Influ-ence of the behavioral goal and environmental obstacles on rapid feedback responses. J Neurophysiol, 108(4), 999–1009.

Omrani, M., Diedrichsen, J., & Scott, S. H. (2013). Rapid feedback corrections during a bimanual postural task. J Neurophysiol, 109(1), 147–161.

Osborne, L. C., Lisberger, S. G., & Bialek, W. (2005). A sensory source for motor variation. Nature, 437(7057), 412–416.

Phillips, C. G. (1969). The 1968 Ferrier lecture: Motor appa-ratus of the baboon’s hand. Proc R Soc Lond B Biol Sci, 173(1031), 141–174.

Picard, N., & Smith, A. M. (1992). Primary motor cortical responses to perturbations of prehension in the monkey. J Neurophysiol, 68(5), 1882–1894

Porter, R., & Lemon, R. N. (1993). Corticospinal function and voluntary movement. New York, NY: Clarendon Press.

Pruszynski, J. A., Kurtzer, I., Nashed, J. Y., Omrani, M., Brouwer, B., & Scott, S. H. (2011). Primary motor cortex underlies multi-joint integration for fast feedback control. Nature, 478(7369), 387–390.

Pruszynski, J. A., Kurtzer, I., & Scott, S. H. (2008). Rapid motor responses are appropriately tuned to the metrics of a visuospatial task. J Neurophysiol, 100(1), 224–238.

Pruszynski, J. A., Kurtzer, I., & Scott, S. H. (2011). The long-latency reflex is composed of at least two functionally independent processes. J Neurophysiol, 106(1), 449–459.

Pruszynski, J. A., & Scott, S. H. (2012). Optimal feedback control and the long-latency stretch response. Exp Brain Res, 218, 341–359.

Qian, N., Jiang, Y., Jiang, J.-P., & Mazzoni, P. (2013). Move-ment duration, Fitts’s law, and an infinite-horizon optimal feedback control model for biological motor systems. Neural Comput, 25, 697–724.

Page 15: New 39 Computational Approaches for Goal-Directed Movement … · 2015. 10. 2. · available sensory data to produce successful behavior. Harris and Wolpert (1998) considered the

K2

PROPERTY OF MIT PRESS: FOR PROOFREADING AND INDEXING PURPOSES ONLY

crevecoeur, cluff and scott: goal-directed movement planning and execution 449

Ronsse, R., Thonnard, J.-L., Lefèvre, P., & Sepulchre, R. (2008). Control of bimanual rhythmic movements: Trading efficiency for robustness depending on the context. Exp Brain Res, 187, 193–205.

Rothwell, J. C., Traub, M. M., & Marsden, C. D. (1980). Influence of voluntary intent on the human long-latency stretch reflex. Nature, 286(5772), 496–498.

Scholz, J., & Schoner, G. (1999). The uncontrolled mani-fold concept: Identifying control variables for a functional task. Exp Brain Res, 126, 289–306.

Schuurmans, J., de Vlugt, E., Schouten, A., Meskers, C., De Groot, J., & van der Helm, F. (2009). The monosyn-aptic Ia afferent pathway can largely explain the stretch duration effect of the long latency M2 response. Exp Brain Res, 193, 491–500.

Scott, S. H. (2012). The computational and neural basis of voluntary motor control and planning. Trends Cogn Sci, 16(11), 541–549.

Scott, S. H. (2004). Optimal feedback control and the neural basis of volitional motor control. Nat Rev Neurosci, 5(7), 532–546.

Scott, S. H. (2008). Inconvenient truths about neural pro-cessing in primary motor cortex. J Physiol, 586(5), 1217–1224.

Scott, S. H., & Kalaska, J. F. (1997). Reaching movements with similar hand paths but different arm orientation. I. Activity of individual cells in motor cortex. J Neurophysiol, 77, 826–853.

Scott, S. H., & Loeb, G. E. (1994). The computation of posi-tion sense from spindles in mono- and multiarticular muscles. J Neurosci, 14(12), 7529–7540.

Shadmehr, R., Smith, M. A., & Krakauer, J. W. (2010). Error correction, sensory prediction, and adaptation in motor control. Annu Rev Neurosci, 33, 89–108.

Shemmell, J., An, J. H., & Perreault, E. J. (2009). The dif-ferential role of motor cortex in stretch reflex modulation induced by changes in environmental mechanics and verbal instruction. J Neurosci, 29(42), 13255–13263.

Soechting, J. F., & Lacquaniti, F. (1988). Quantitative evalu-ation of the electromyographic responses to multidirectional

load perturbations of the human arm. J Neurophysiol, 59(4), 1296–1313.

Todorov, E. (2004). Optimality principles in sensorimotor control. Nat Neurosci, 7(9), 907–915.

Todorov, E. (2005). Stochastic optimal control and estima-tion methods adapted to the noise characteristics of the sensorimotor system. Neural Comput, 17(5), 1084–1108.

Todorov, E. (2006). Optimal control theory. In K. Doya (Ed.), Bayesian brain: Probabilistic approaches to neural coding (pp. 269–298). Cambridge, MA: MIT Press.

Todorov, E., & Jordan, M. I. (2002). Optimal feedback control as a theory of motor coordination. Nat Neurosci, 5(11), 1226–1235.

Uno, Y., Kawato, M., & Suzuki, R. (1989). Formation and control of optimal trajectory in human multijoint arm movement. Biol Cybern, 61, 89–101.

Valero-Cuevas, F. J., Venkadesan, M., & Todorov, E. (2009). Structured variability of muscle activations supports the minimal intervention principle of motor control. J Neuro-physiol, 102(1), 59–68.

Van Beers, R. J., Haggard, P., & Wolpert, D. M. (2004). The role of execution noise in movement variability. J Neuro-physiol, 91(2), 1050–1063.

Wagner, M. J., & Smith, M. A. (2008). Shared internal models for feedforward and feedback control. J Neurosci, 28(42), 10663–10673.

Wolpert, D. M., & Flanagan, J. R. (2001). Motor prediction. Curr Biol, 11(18), R729–R732.

Wolpert, D. M., & Kawato, M. (1998). Multiple paired forward and inverse models for motor control. Neural Netw, 11(7–8), 1317–1329.

Wolpert, D. M., Diedrichsen, J., & Flanagan, J. R. (2011). Principles of sensorimotor learning. Nat Rev Neurosci, 12, 739–751.

Zahalak, G. I. (1981). A distribution-moment approximation for kinetic theories of muscular contraction. Math Biosci, 55, 89–114.

Page 16: New 39 Computational Approaches for Goal-Directed Movement … · 2015. 10. 2. · available sensory data to produce successful behavior. Harris and Wolpert (1998) considered the

K2

PROPERTY OF MIT PRESS: FOR PROOFREADING AND INDEXING PURPOSES ONLY