
Optimal Control of Inverted Pendulum

Degree Project in Engineering Physics, First Cycle, SA104X
Department of Mathematics, Section of Numerical Analysis

Royal Institute of Technology

Author: Oskar Stattin, [email protected]

Supervisor: Mattias Sandberg
Examiner: Mårten Olsson


Abstract

This report investigates two methods of finding the optimal control of an inverted pendulum with a quadratic cost functional.

In the first method, a discretisation of the Hamiltonian system is taken as a symplectic Euler scheme, and Newton's method is used to find the optimal control from an initial guess. According to the Pontryagin principle this yields the optimal control, since the solution to the Hamiltonian system gives the optimum of the control problem. The second method uses the matrix Riccati differential equation to find the optimal control for a linearised model of the pendulum.

The result was two programs that find the optimal control. The first method's program demands clever initial guesses in order to converge. The linearised model's solutions are only valid in a limited region, which turned out to be surprisingly large.


Sammanfattning

This report implements two methods for finding the optimal control of an inverted pendulum with a quadratic cost functional.

In the first method, a discretised Hamiltonian system is used as a symplectic Euler scheme to iterate towards a solution with Newton's method from an initial guess. According to Pontryagin's principle this yields an optimal solution, since the solution of a Hamiltonian system gives the optimum of the control problem. The second method applies the matrix Riccati differential equation to a linearised model of the pendulum to find the optimal control.

The result was two programs that find the optimal control. The first method's program requires clever initial guesses in order to converge. The linearised model's solutions have a limited region of validity, which turned out to be surprisingly large.


Contents

1 Introduction
  1.1 Problem formulation
2 Background
  2.1 Introduction to optimal control
  2.2 Pontryagin's principle
  2.3 The Riccati equation
  2.4 Newton's method
  2.5 Model of the inverted pendulum
3 Method and implementation
  3.1 Construction of the non-linear model
  3.2 Construction of the linear model
  3.3 Implementations
4 Discussion and results


1 Introduction

As automated control technologies such as autonomous vehicles, robotics and process control spread, the need for optimal control grows. Whether the aim is to optimise the energy used, the time elapsed, or some other resource, using optimal control to enhance performance has become essential to stay competitive.

Optimal control has been a field of study since its formulation in the mid-20th century, and has been much aided by the increase in computing power and the development of efficient algorithms. Its areas of application are diverse, ranging from simple dynamical systems such as the inverted pendulum to more complex ones such as aircraft.

The inverted pendulum is one of the most studied examples of a dynamical system with an unstable equilibrium. Its variations are as numerous as the control laws used to stabilise it.

1.1 Problem formulation

The main concern of this report is finding the optimal control of an inverted pendulum. One way of doing this is by linearising the differential equation for the pendulum and using the matrix Riccati equation to find an optimal control. Another is the symplectic Euler method, as described in [3].

The goals are thus:

• Implement a program finding the optimal control of an inverted pendulum, using

  1. the symplectic Euler method,
  2. the matrix Riccati differential equation.


2 Background

In this section the necessary theoretical groundwork for the report is laid, consisting of a brief introduction to optimal control theory, the Riccati equation, Newton's method, and our model of the inverted pendulum.

2.1 Introduction to optimal control

Optimal control is a branch of control theory that aims to optimise the control of a system. The goal is to steer the system's dynamics, modelled by differential equations, using a control function called u. When this control is optimal, it is denoted u*. Below follows the mathematical framework of an optimal control problem.

Given a system with state trajectory X(s), modelled by some differential equation with initial value X₀, the problem can be written

    X′(s) = f(X(s), u(s)),    X(0) = X₀.    (1)

The dimensions of the control and the state space are given by

    u : [0, T] → B ⊂ ℝᵐ,
    X : [0, T] → ℝᵈ.    (2)

B is the set of accepted controls, and T is the end time. The objective is to minimise the functional

    g(X(T)) + ∫₀ᵀ h(X(s), u(s)) ds,    (3)

where g is called the terminal cost, the positional goal of our state trajectory, and h is called the running cost. The running cost is the "resource" we want to minimise. Minimising this functional is done by adjusting the control function, and this is how the control is made optimal.

The value function v(x, t) is introduced,

    v(x, t) := inf_{X(t)=x, u∈B} { g(X(T)) + ∫ₜᵀ h(X(s), u(s)) ds }.    (4)

This function is the viscosity solution of the Hamilton–Jacobi–Bellman equation [2],


    v_t(x, t) + H(v_x(x, t), x) = 0,    (x, t) ∈ ℝᵈ × (0, T),
    v(x, T) = g(x),    x ∈ ℝᵈ.    (5)

The gradient with respect to the spatial variable x is denoted v_x, and similarly v_t is the partial derivative with respect to t. The function H is the Hamiltonian, defined as

    H(λ, x) = min_{u∈B} { λ · f(x, u) + h(x, u) },
    λ = v_x(X(t), t).    (6)

A solution to the HJB equation gives us the optimal control. A thorough proof of the optimality granted by the value function solving the HJB equation can be found in Theorem 38 in Mathematical Control Theory by Sontag [5].

The strength of this approach is that the HJB equation turns the problem into an optimisation at each stage: for every pair X and t, a control u is found that minimises the Hamiltonian and solves the HJB equation.

2.2 Pontryagin’s principle

To arrive at the symplectic Euler method mentioned in the problem formulation, something more is needed, and the Pontryagin principle gives us this something. The principle states the following: if we have the optimal control u* (which strictly should be denoted u*(s)) and the optimal trajectory X(s) of a problem, and find a λ solving the equation

    −λ′(s) = f_x(X(s), u*) · λ(s) + h_x(X(s), u*),
    λ(T) = g_x(X(T)),

then this λ satisfies the inequality

    −f(X(s), u*) · λ(s) − h(X(s), u*) ≥ −f(X(s), u) · λ(s) − h(X(s), u),    (7)

where u is any control other than the optimal control u*. Our λ also fulfills

    λ(s) = v_x(X(s), s)


when v is differentiable at (s, X(s)). If the Hamiltonian is differentiable and v_x is smooth, then the pair λ(s) and X(s) will be a solution to the bi-characteristic Hamiltonian system

    X′(s) = H_λ(λ(s), X(s)),    X(0) = X₀,    0 < s < T,
    −λ′(s) = H_x(λ(s), X(s)),    λ(T) = g_x(X(T)),    0 < s < T.    (8)

The gradients are denoted in the same way as before, so H_x and H_λ mean the gradients with respect to x and λ. For a more rigorous derivation, see Theorem 7.4.17 and its Corollary 7.4.18 with proofs in [4].

This Hamiltonian system can be discretised to give the following equations,

    X̄_{n+1} − X̄_n − Δt H_λ(λ̄_{n+1}, X̄_n) = 0,    X̄₀ = X₀,
    λ̄_n − λ̄_{n+1} − Δt H_x(λ̄_{n+1}, X̄_n) = 0,    λ̄_N = g_x(X̄_N),    (9)

which is called a symplectic Euler scheme.
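As an illustrative sketch (in Python rather than the report's MATLAB), scheme (9) can be viewed as a root-finding problem: the residual below is zero exactly when the discrete state, the dual variables and the boundary conditions satisfy the scheme. Here `H_lam`, `H_x` and `g_x` are placeholders for the gradients of the Hamiltonian and of the terminal cost.

```python
import numpy as np

def symplectic_euler_residual(X, lam, dt, H_lam, H_x, X0, g_x):
    """Residual of the symplectic Euler scheme (9); zero at a solution.

    X, lam: arrays of shape (N+1, d) with the discrete state and dual variable.
    H_lam, H_x: gradients of the Hamiltonian; g_x: gradient of the terminal cost.
    """
    N = len(X) - 1
    # State update rows: X_{n+1} - X_n - dt * H_lam(lam_{n+1}, X_n) = 0.
    r_state = X[1:] - X[:-1] - dt * np.array(
        [H_lam(lam[n + 1], X[n]) for n in range(N)])
    # Dual update rows: lam_n - lam_{n+1} - dt * H_x(lam_{n+1}, X_n) = 0.
    r_dual = lam[:-1] - lam[1:] - dt * np.array(
        [H_x(lam[n + 1], X[n]) for n in range(N)])
    # Boundary conditions: X_0 = X0 and lam_N = g_x(X_N).
    r_bc = np.concatenate([X[0] - X0, lam[-1] - g_x(X[-1])])
    return np.concatenate([r_state.ravel(), r_dual.ravel(), r_bc])
```

A root of this residual, found for instance with Newton's method as in section 2.4, is a discrete solution of the Hamiltonian system (8).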

2.3 The Riccati equation

If a system is linear and has a quadratic cost, the optimal control can be found using the Riccati equation. The system can then be described in compact form with matrices,

    ẋ(t) = Ax(t) + Bu(t).    (10)

The cost functional can be formed using matrices R, Q and S, where R is a symmetric m×m matrix, Q is a symmetric n×n matrix and S is a constant n×n matrix,

    x(T)′S x(T) + ∫₀ᵀ ( x(t)′Q(t)x(t) + u(t)′R(t)u(t) ) dt.    (11)

This translates the HJB equation into an equation which can be solved more easily than the non-linear one discussed in the previous section,

    0 = min_{u∈B} { x′Qx + u′Ru + v_t(t, x) + v_x(t, x)′(Ax + Bu) },
    v(T, x) = x′Sx.    (12)


An ansatz for the value function is made by introducing the function P, an n×n symmetric matrix of functions,

    v(t, x) = x′P(t)x.    (13)

This function, together with the terminal-value condition v(T, x), gives us the Riccati differential equation for P after some calculations; see Example 3.2.2 in [1] or chapter 8.2, Linear Systems with Quadratic Control, in [5],

    Ṗ(t) = −P(t)A − A′P(t) + P(t)BR⁻¹(t)B′P(t) − Q,
    P(T) = S.    (14)

The optimal control is found by using the solution to the Riccati equation,

    u*(t) = −R⁻¹(t)B′P(t)x(t).    (15)

2.4 Newton’s method

The scalar version of Newton's method solves equations of the form f(y) = 0. With an initial guess of y, the equation is solved using the iterations y_{n+1} = y_n + Δy_n, where Δy_n = −f(y_n)/f′(y_n), updating the derivative at every new position y. This scalar version can be generalised by exchanging y for a vector of variables y, and the function f for a vector-valued function F. Here f′(y_n) is replaced by the Jacobian, defined as J_ij = ∂F_i/∂y_j, turning the update equation into Δy_i = −J⁻¹(y_i)F(y_i). The equation can be solved by iterating

    y_{i+1} = y_i − J⁻¹(y_i) F(y_i),    (16)

for i = 1, 2, …, M. The asymptotic order of convergence of Newton's method is quadratic, although it can be difficult to find an initial guess inside the region of convergence.

Other methods that rely on approximations of the derivatives to produce a Jacobian are called quasi-Newton methods and are typically of a lower order of convergence. An example is the one-dimensional secant method, with an order of convergence of about 1.618, recognisable as the golden ratio. MATLAB's integrated solver fsolve is an example of a method that creates its Jacobian using numerical approximations of the derivatives.
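A minimal sketch of iteration (16) in Python (the report's implementation is in MATLAB); the example system and its hand-computed Jacobian are illustrative only, not from the report.

```python
import numpy as np

def newton(F, J, y, tol=1e-12, max_iter=50):
    """Newton's method (16): y_{i+1} = y_i - J(y_i)^{-1} F(y_i)."""
    for _ in range(max_iter):
        step = np.linalg.solve(J(y), F(y))   # solve J*step = F instead of inverting J
        y = y - step
        if np.linalg.norm(step) < tol:
            break
    return y

# Example: intersect the circle x^2 + y^2 = 4 with the line x = y.
F = lambda v: np.array([v[0]**2 + v[1]**2 - 4.0, v[0] - v[1]])
J = lambda v: np.array([[2.0 * v[0], 2.0 * v[1]], [1.0, -1.0]])
root = newton(F, J, np.array([1.0, 0.5]))    # tends to (sqrt(2), sqrt(2))
```

MATLAB's fsolve, mentioned above, plays the same role but builds the Jacobian from numerical approximations of the derivatives.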


Figure 1: Inverted pendulum

2.5 Model of the inverted pendulum

To derive the model of the pendulum, the system is first regarded as a conservative system: external and frictional forces are neglected. The quantities are treated as dimensionless. This may be somewhat unphysical, but for this report the resulting general equation serves our purpose.

Viewing the pendulum as a conservative system allows us to use the Euler–Lagrange equations in our derivation of the equations of motion,

    d/dt ( ∂L/∂q̇_j ) = ∂L/∂q_j,
    L = E_k − E_p.    (17)

Here q is a generalised coordinate, namely the x-coordinate or the angle θ between the pendulum and the base, and L is the Lagrangian, defined as the difference between the kinetic and potential energy, E_k and E_p, of the system.

The model of the inverted pendulum is depicted in Figure 1. The pendulum is considered a point mass, m₁, at a distance l from the base, which has the mass m₂. Their velocities are v₁ for the pendulum and v₂ for the base. Both the base and the pendulum have one degree of freedom each: translational and rotational, respectively.

Forming the Lagrangian, we get the expression

    L = ½ m₁v₁² + ½ m₂v₂² − m₁ g l sin θ,    (18)


where the constant g = 9.82 is the gravitational acceleration. The velocities are derived from the change of position, v = √(ẋ² + ẏ²),

    v₂² = ẋ²,
    v₁² = ( d/dt (x + l cos θ) )² + ( d/dt (l sin θ) )²
        = ( ẋ − lθ̇ sin θ )² + ( lθ̇ cos θ )²
        = ẋ² − 2lθ̇ẋ sin θ + l²θ̇² ( sin²θ + cos²θ )
        = ẋ² − 2lθ̇ẋ sin θ + l²θ̇².    (19)

Inserting these expressions into the Lagrangian gives us

    L = ½ m₁ ( ẋ² − 2lθ̇ẋ sin θ + l²θ̇² ) + ½ m₂ẋ² − m₁ g l sin θ.    (20)

Now the Euler–Lagrange equations can be used to determine the equations of motion of the pendulum. The expressions

    d/dt ( ∂L/∂θ̇ ) = ∂L/∂θ,
    d/dt ( ∂L/∂ẋ ) = ∂L/∂x,    (21)

give us the equations of motion after some simplification. These are determined to be

    0 = g cos θ − ẍ sin θ + lθ̈,    (22)
    0 = (m₁ + m₂)ẍ − m₁lθ̈ sin θ − m₁lθ̇² cos θ.    (23)

Setting all masses and lengths to one, l = m₁ = m₂ = 1, neglecting (23), and treating the acceleration of the base as the control, ẍ = u, gives a controlled ODE in (22).
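The resulting controlled ODE, θ̈ = u sin θ − g cos θ, can be sketched numerically as below (Python rather than the report's MATLAB; the horizon, step count and test control are assumed values, not taken from the report).

```python
import numpy as np

# Euler-step simulation of the controlled pendulum ODE from (22) with
# l = m1 = m2 = 1 and x'' = u:  theta'' = u*sin(theta) - g*cos(theta).
g = 9.82

def simulate(theta0, omega0, u, T=1.0, N=1000):
    """Integrate the pendulum angle under a control signal u(t)."""
    dt = T / N
    theta, omega = theta0, omega0
    for n in range(N):
        theta = theta + dt * omega
        omega = omega + dt * (u(n * dt) * np.sin(theta) - g * np.cos(theta))
    return theta, omega

# Without control the pendulum falls away from the upright target theta = pi/2:
theta_T, omega_T = simulate(np.pi / 2 - 0.1, 0.0, u=lambda t: 0.0)
```

This illustrates the instability of the equilibrium: with u = 0 the angle drifts away from π/2, which is why a control law is needed at all.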


3 Method and implementation

An overview of how the optimal control problem is approached is given in this section. Both the design and the implementation of the algorithms described in the problem formulation are presented.

3.1 Construction of the non-linear model

The goal for the inverted pendulum is to reach the unstable equilibrium at θ = π/2 using as little force as possible. The model described in section 2.5 is used, with equation (22) as the differential equation for θ and ẍ = u.

The state space is of dimension two, since the differential equation is of order two, and the control is scalar, making its dimension one. This makes d = 2 and m = 1 in (2). A new variable is introduced to reduce the order of the differential equation, θ̇(t) = ω(t), making the two state variables the angle from the horizontal plane and the angular velocity. Counter-clockwise is defined as positive.

Our problem is then

    X(s) = ( θ(s), ω(s) )′,    X′(s) = f(X(s), u(s)) = ( ω, u sin θ − g cos θ )′.    (24)

The two parts of the cost functional are both defined as quadratic costs,

    h(X(s), u(s)) = u²/2,
    g(X(T)) = K ( θ(T) − π/2 )².    (25)

Here K is a constant which can be adjusted so that the penalty for accelerating the base is in proportion to the penalty for not reaching the goal.

These two functions are motivated by our wish to minimise the force used on the base, force being proportional to acceleration. This choice also simplifies the comparison between the two methods, since the second one demands a quadratic cost.

The Hamiltonian can then be calculated as

    H(λ, X) = min_{u∈B} { λ · f + h } = min_{u∈B} { λ₁ω + λ₂(u sin θ − g cos θ) + u²/2 }.    (26)


To minimise this expression with respect to u, it is differentiated and set equal to zero,

    ∂/∂u ( λ · f + h ) = λ₂ sin θ + u = 0,
    u = −λ₂ sin θ.    (27)

Plugging this back into the Hamiltonian, it can be written as

    H(λ, X) = λ₁ω − (λ₂ sin θ)²/2 − gλ₂ cos θ.    (28)

Now the gradients of the Hamiltonian can be calculated as

    H_λ = ( ω, −λ₂ sin²θ − g cos θ )′,    (29)

and

    H_X = ( λ₂ sin θ (g − λ₂ cos θ), λ₁ )′.    (30)

These will be used in the Hamiltonian system's discrete counterpart, (9). Lastly, the gradient of the function g is calculated,

    g_x(X(t)) = ( 2K(θ(t) − π/2), 0 )′.    (31)

This gradient is needed to determine the end value of the dual variable, λ(T) = g_x(X(T)).
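As a consistency check (a Python sketch, not part of the report), the hand-derived gradients (29) and (30) can be compared with central finite differences of the reduced Hamiltonian (28).

```python
import numpy as np

g = 9.82

def H(lam, X):
    """Reduced Hamiltonian (28), with u* = -lam2*sin(theta) inserted."""
    theta, omega = X
    lam1, lam2 = lam
    return lam1 * omega - (lam2 * np.sin(theta))**2 / 2 - g * lam2 * np.cos(theta)

def H_lam(lam, X):
    """Gradient (29) of H with respect to lambda."""
    theta, omega = X
    return np.array([omega, -lam[1] * np.sin(theta)**2 - g * np.cos(theta)])

def H_X(lam, X):
    """Gradient (30) of H with respect to X = (theta, omega)."""
    theta = X[0]
    return np.array([lam[1] * np.sin(theta) * (g - lam[1] * np.cos(theta)), lam[0]])

def fd_grad(f, z, eps=1e-6):
    """Central finite-difference gradient of a scalar function f at z."""
    z = np.asarray(z, dtype=float)
    grad = np.zeros_like(z)
    for i in range(z.size):
        zp, zm = z.copy(), z.copy()
        zp[i] += eps
        zm[i] -= eps
        grad[i] = (f(zp) - f(zm)) / (2 * eps)
    return grad

lam = np.array([0.3, -1.2])     # arbitrary test point (assumed values)
X = np.array([0.7, 2.0])
```

Checks like this catch exactly the kind of sign error in the analytic Jacobian that the results section later suspects.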

3.2 Construction of the linear model

To simplify the construction of the linear model, a new angle is introduced, θ̂ = θ − π/2. Using this angle in the expression (24), the second component becomes u cos θ̂ + g sin θ̂. With a Taylor expansion of the two terms around θ̂ = 0, a new expression is found:

    f(X(s), u(s)) = ( ω̂, u + gθ̂ + O(θ̂²) )′.    (32)


The higher-order terms in θ̂ can be omitted, since the expression is evaluated around θ̂ = 0, where they become very small. This equation can be expressed as a linear system on the form (10) with the matrices

    A = [ 0 1 ; g 0 ],    X(s) = ( θ̂(s), ω̂(s) )′,    B = [ 0 ; 1 ],    u(s) = u(s).    (33)

One can also define the cost functional in the new linearised model,

    g(X(T)) = K θ̂(T)²,
    h(X(s), u(s)) = u²/2,    (34)

which, using the notation found in (11), gives the two matrices

    R = [ 1/2 ],    S = [ K 0 ; 0 0 ],    (35)

while Q is the zero matrix, since the running cost contains no state term.

3.3 Implementations

A discretisation of the state trajectory is made: we introduce N discretisation points for the numerical approximation over the time interval [0, T].

Method using symplectic Euler

To investigate how to solve (9) with the Hamiltonian (28), an algorithm described in [3] is implemented in MATLAB. The solution to F(Y) = 0 is wanted, and to determine it Newton's method is used,

    Y_{i+1} = Y_i − J⁻¹(Y_i) F(Y_i).

A discrete version of the Hamiltonian boundary system, the symplectic Euler scheme (9), is used as the function F(Y).

Together with the initial values and the end conditions on the dual variable, the following function is acquired,

    F(Y) = ( θ₀ − y₁ ;  ω₀ − y₂ ;
             X̄₂ − X̄₁ − Δt H_λ(λ̄₂, X̄₁) ;  λ̄₁ − λ̄₂ − Δt H_x(λ̄₂, X̄₁) ;
             ⋮
             X̄_N − X̄_{N−1} − Δt H_λ(λ̄_N, X̄_{N−1}) ;  λ̄_{N−1} − λ̄_N − Δt H_x(λ̄_N, X̄_{N−1}) ;
             λ_{1,N} − g_θ(θ_N, ω_N) ;  λ_{2,N} − g_ω(θ_N, ω_N) )′,    (36)

where g_θ and g_ω denote the two components of g_x in (31).

The solution vector Y is ordered in the following way,

    Y = ( y₁, y₂, y₃, …, y_{4N−1}, y_{4N} )′
      = ( X̄₀, λ̄₀, X̄₁, λ̄₁, …, X̄_N, λ̄_N )′
      = ( θ₀, ω₀, λ_{1,0}, λ_{2,0}, …, θ_N, ω_N, λ_{1,N}, λ_{2,N} )′.    (37)

When the function F is determined, the Jacobian can be calculated. As mentioned before, there are ways of approximating the Jacobian with quasi-Newton methods, but we opt to calculate it analytically to achieve a higher order of convergence.

The Jacobian is determined by taking partial derivatives with respect to the discretisation variables. Each row is the gradient of one of the functions F₁, F₂, …, F_{4N}


with respect to the variables y₁, y₂, …, y_{4N},

    J = ( ∂F₁/∂y₁      ∂F₁/∂y₂    ⋯    ∂F₁/∂y_{4N}
          ∂F₂/∂y₁      ⋱
          ⋮
          ∂F_{4N}/∂y₁             ⋯    ∂F_{4N}/∂y_{4N} ),    (38)

so that, for instance, the first row consists of the derivatives ∂(θ₀ − y₁)/∂θ₀, ∂(θ₀ − y₁)/∂ω₀, …, ∂(θ₀ − y₁)/∂λ_{2,N}, and the last row of ∂(λ_{2,N} − g_ω(θ_N, ω_N))/∂θ₀, …, ∂(λ_{2,N} − g_ω(θ_N, ω_N))/∂λ_{2,N}.

The Jacobians will be sparse, and consequently MATLAB's function sparse is used to make the calculations faster.

The initial guess of Y used is a linear progression from the initial value of the pendulum to the goal of the state trajectory, θ = π/2. From this initial guess of the angle, the angular velocities are calculated numerically. The dual variables λ₁ and λ₂ are calculated via forward Euler from the end value and the differential equation given in (9).
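The whole first method can be sketched in a few dozen lines (Python rather than the report's MATLAB). The horizon T, the number of steps N, the weight K and the starting state are assumed values, and the Jacobian is approximated by finite differences instead of the report's analytic sparse Jacobian.

```python
import numpy as np

g = 9.82
K = 5.0                      # terminal-cost weight (assumed value)
T, N = 1.0, 40
dt = T / N
theta0, omega0 = 1.4, 0.0    # start fairly close to the upright angle pi/2

def H_lam(lam, X):
    """Gradient (29) of the Hamiltonian with respect to lambda."""
    return np.array([X[1], -lam[1] * np.sin(X[0])**2 - g * np.cos(X[0])])

def H_x(lam, X):
    """Gradient (30) of the Hamiltonian with respect to X = (theta, omega)."""
    return np.array([lam[1] * np.sin(X[0]) * (g - lam[1] * np.cos(X[0])), lam[0]])

def F(Y):
    """Residual (36): boundary conditions plus the scheme rows of (9)."""
    Z = Y.reshape(N + 1, 4)              # rows: (theta_n, omega_n, lam1_n, lam2_n)
    X, lam = Z[:, :2], Z[:, 2:]
    r = [np.array([theta0, omega0]) - X[0]]
    for n in range(N):
        r.append(X[n + 1] - X[n] - dt * H_lam(lam[n + 1], X[n]))
        r.append(lam[n] - lam[n + 1] - dt * H_x(lam[n + 1], X[n]))
    r.append(lam[N] - np.array([2 * K * (X[N, 0] - np.pi / 2), 0.0]))
    return np.concatenate(r)

def jacobian_fd(Y, eps=1e-7):
    """Dense finite-difference Jacobian of F (the report derives it analytically)."""
    F0, J = F(Y), np.zeros((Y.size, Y.size))
    for j in range(Y.size):
        Yp = Y.copy()
        Yp[j] += eps
        J[:, j] = (F(Yp) - F0) / eps
    return J

# Initial guess: linear progression of the angle towards pi/2, zeros elsewhere.
Y = np.zeros((N + 1, 4))
Y[:, 0] = np.linspace(theta0, np.pi / 2, N + 1)
Y = Y.ravel()
for _ in range(50):                      # Newton iteration (16)
    step = np.linalg.solve(jacobian_fd(Y), F(Y))
    Y -= step
    if np.linalg.norm(step) < 1e-10:
        break

Z = Y.reshape(N + 1, 4)
u_opt = -Z[:, 3] * np.sin(Z[:, 0])       # optimal control (27) on the grid
```

With the starting angle this close to the equilibrium, the linear-progression guess lies inside the region of convergence, mirroring the behaviour reported in section 4.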

Method using the Riccati equation

To determine the optimal control of the linearised model, (32), the matrix Riccati differential equation, (14), is used. From the time evolution of the matrix P, the optimal control is found.

Determining the elements of P for every time s allows us to calculate the optimal control using (15). P has four elements, but only three need to be solved for, because it is symmetric. A simple Euler method, stepping from the terminal condition P(T) = S, solves this,

    P_{i+1} = P_i + Δt Ṗ_i,    i = 1, 2, …, N.    (39)

Once P is known, the state trajectory can be found using forward Euler once again, now on the differential equation

    X′(s) = ( A − BR⁻¹B′P(s) ) X(s).    (40)

The optimal control u* is found using (15).
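A sketch of this second method (Python rather than the report's MATLAB; K, the horizon, the step count and the initial deviation are assumed values, not taken from the report):

```python
import numpy as np

g = 9.82
K = 10.0                                   # terminal-cost weight (assumed)
T, N = 1.0, 2000
dt = T / N

A = np.array([[0.0, 1.0], [g, 0.0]])       # linearised dynamics, see (33)
B = np.array([[0.0], [1.0]])
R = np.array([[0.5]])                      # running cost u^2/2, see (35)
Q = np.zeros((2, 2))                       # no state term in the running cost
S = np.array([[K, 0.0], [0.0, 0.0]])

# Backward pass: Euler steps of the Riccati equation (14) from P(T) = S.
P = [np.zeros((2, 2))] * (N + 1)
P[N] = S
for i in range(N, 0, -1):
    Pdot = -P[i] @ A - A.T @ P[i] + P[i] @ B @ np.linalg.inv(R) @ B.T @ P[i] - Q
    P[i - 1] = P[i] - dt * Pdot            # stepping backwards in time

# Forward pass: the closed-loop dynamics (40) under the feedback law (15).
x = np.array([0.3, 0.0])                   # initial deviation from upright
for i in range(N):
    u = -np.linalg.inv(R) @ B.T @ P[i] @ x     # optimal control (15)
    x = x + dt * (A @ x + B @ u)
```

The feedback form of (15) is what makes this method cheap: once the backward pass is done, the control at any state is a matrix-vector product.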


4 Discussion and results

Results

The two programs were implemented successfully. Some solutions for the two methods are presented in Figures 6 and 7. Comparisons between them can be found in Figures 2, 3, and 4.

Figure 2: Comparison of the solutions of the two methods for a small startingdeviation from the target angle.

The program solving the linear model behaved as expected. Its solutions' deviation from the other method's solutions turned out to be surprisingly small, and became evident only for rather large angles, as seen in the three comparison figures; only the last one differs clearly. When the starting angle exceeded a certain deviation from the equilibrium, slightly short of θ = π/2, the linearised model and its solutions became unrealistic.

The second program also behaved as expected. Given realistic initial guesses, the solutions converged, although the program did have problems achieving the quadratic convergence expected of Newton's method. This is probably due to some error among the analytically derived functions generating the elements of the Jacobian.

The importance of the initial guesses was known from theory, and quickly


Figure 3: Comparison of the solutions of the two methods for a mediumstarting deviation from the target angle.

Figure 4: Comparison of the solutions of the two methods for a large startingdeviation from the target angle.


became central when analysing the program's solutions. The standard initial guess used, a linear progression from the starting angle to the equilibrium, gave convergent results for angles close to the equilibrium, but divergent results for all starting angles 0 ≥ θ ≥ −π. For these starting angles the initial guess needs more finesse.

One example studied was the starting angle θ = 0. With the standard initial guess the results diverged. This divergence seems intuitive: accelerating the base cannot make the pendulum swing up if the pendulum is already parallel to the movement of the base. The region of convergence for the method could not be reached. When the initial guess was changed to a linear progression from θ = 0 to θ = −3π/2, which is equivalent to θ = π/2, a solution could be found; see Figure 5.

Figure 5: Reaching the equilibrium through the means of a different initialguess for the non-linear model.

Using this solution as an initial guess for angles close to θ = 0 gave convergent solutions. Although not rigorously tested, this technique of reusing old solutions as initial guesses seems a good way of extending our region of solvable starting angles.

Worth mentioning is that when using this technique for starting angles slightly bigger than θ = 0, a solution different from the old one was found. It becomes obvious that the old solution is but a local minimum. This is an apparent weakness of our method: we cannot know whether our solutions are global or merely local.


Figure 6: Solutions for the linearised method with several starting values.

Figure 7: Solutions for the non-linearised method with several starting values.


Discussion

This short investigation of optimal control of an inverted pendulum is in many ways incomplete. Many aspects have been left unexplored, and the two implementations presented remain a drop in the ocean.

Many extensions of this project can be imagined. Firstly, the simple model of the pendulum could be reworked into a more refined one, or switched to that of another type of pendulum, such as a double pendulum. Secondly, different kinds of Hamiltonians could be investigated. One could, for instance, introduce functions penalising the translational and angular velocities of the base and the pendulum, as well as a terminal demand on the angular velocity.

On a final note, one could also question the usefulness of this approach to optimal control and its applicability to real-life implementations. Since we assume perfect state knowledge, no friction, and zero disturbances to the control and the system, the model is obviously idealised. One could argue that optimising a feedback law might have been more useful in this sense.


Bibliography

[1] D. P. Bertsekas. Dynamic Programming and Optimal Control. Athena Scientific, 2000.

[2] L. C. Evans. Partial Differential Equations. American Mathematical Society, 1998.

[3] M. Sandberg and A. Szepessy. Convergence rates of symplectic Pontryagin approximations in optimal control theory. ESAIM: Mathematical Modelling and Numerical Analysis, 40(1), 2006.

[4] P. Cannarsa and C. Sinestrari. Semiconcave Functions, Hamilton–Jacobi Equations, and Optimal Control. Birkhäuser, 2004.

[5] E. D. Sontag. Mathematical Control Theory. Springer, 1998.
