Dynamic Programming and Optimal Control
Final Recitation
17.12.2014
Dario Brescianini, Robin Ritz


Outline
- Overview
- Dynamic Programming Algorithm (DPA)
- Deterministic Systems and the Shortest Path (SP)
- Infinite Horizon Problems, Stochastic Shortest Path
- Deterministic Continuous-Time Optimal Control

Dynamic Programming Algorithm (DPA)
- Basic Problem
- Alternative Problem Formulation
- Reformulations: time lag, correlated disturbances, forecasts, ...
- Basic idea: Principle of Optimality.
- Algorithm (backward recursion):
    J_N(x_N) = g_N(x_N)
    J_k(x_k) = min_{u_k in U_k(x_k)} E_{w_k}[ g_k(x_k, u_k, w_k) + J_{k+1}(f_k(x_k, u_k, w_k)) ],   k = N-1, ..., 0
- Minimizing the recursion equation for each x_k and k gives us the optimal policy:
    mu_k*(x_k) = argmin_{u_k in U_k(x_k)} E_{w_k}[ g_k(x_k, u_k, w_k) + J_{k+1}(f_k(x_k, u_k, w_k)) ]

Deterministic Systems and the Shortest Path
- Consider now problems where the state space is a finite set and there is no disturbance.
- Such a deterministic finite-state DP problem can be converted to a shortest path (SP) problem, and vice versa.
- Viterbi algorithm.
- DP finds all optimal paths to the end node; sometimes this is not needed. Exploit the structure of these problems to come up with efficient algorithms for solving shortest path problems:

Label Correcting Algorithm
- Step 1: Remove a node i from OPEN and for each child j of i, execute Step 2.
- Step 2: If d_i + a_ij < min{d_j, UPPER}, set d_j = d_i + a_ij and set i to be the parent of j. In addition, if j ≠ t, place j in OPEN if it is not already in OPEN; if j = t, set UPPER to the new value d_i + a_it of d_t.
- Step 3: If OPEN is empty, terminate; else go to Step 1.

Infinite Horizon Problems
- Consider a time-invariant system with infinite horizon:
    x_{k+1} = f(x_k, u_k, w_k),   k = 0, 1, ...
- The optimal policy is stationary:
    pi* = {mu*, mu*, ...}
- The optimal cost solves Bellman's equation:
    J*(x) = min_{u in U(x)} E_w[ g(x, u, w) + J*(f(x, u, w)) ]

Infinite Horizon Problems: Stochastic Shortest Path
- Finite state space {1, ..., n} with transition probabilities P_ij(u) and expected stage costs q(i, u), plus a cost-free, absorbing termination state t.
- Assumption: there exist a policy pi and an integer m such that, from every initial state i,
    P(x_m = t | x_0 = i, pi) > 0,
  i.e., the termination state can be reached under some policy.

Value iteration:
- Step 1: Choose an initial guess J_0(i) for all states i.
- Step 2: Update the cost values with the value iteration formula:
    J_{l+1}(i) = min_{u in U(i)} [ q(i, u) + sum_{j=1}^{n} P_ij(u) J_l(j) ],   i = 1, ..., n
- Step 3: If J_{l+1}(i) = J_l(i) (to within a tolerance) for all i, terminate. Else go to Step 2.

Policy iteration:
- Step 1: Choose an initial stationary policy mu^0.
- Step 2: Policy evaluation (compute the cost of the current policy mu^k by solving a linear system of equations):
    J_{mu^k}(i) = q(i, mu^k(i)) + sum_{j=1}^{n} P_ij(mu^k(i)) J_{mu^k}(j),   i = 1, ..., n
- Step 3: Policy improvement (find a better policy):
    mu^{k+1}(i) = argmin_{u in U(i)} [ q(i, u) + sum_{j=1}^{n} P_ij(u) J_{mu^k}(j) ],   i = 1, ..., n
- Step 4: If J_{mu^{k+1}}(i) = J_{mu^k}(i) for all i, terminate. Else go to Step 2.
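
To make the value iteration and policy iteration steps above concrete, here is a minimal NumPy sketch on a small, made-up stochastic shortest path instance. The three-state/two-control setup, the transition probabilities P, and the stage costs q are hypothetical (chosen so that every policy is proper); this is an illustration of the update formulas, not code from the recitation.

```python
import numpy as np

# Hypothetical stochastic shortest path problem: states 0, 1, 2 plus a
# cost-free termination state (index 3), and two controls per state.
n, m = 3, 2

# P[u, i, j]: probability of moving from state i to state j under control u
# (last column is the termination state). Every row sums to 1 and reaches
# termination with positive probability, so all policies are proper.
P = np.array([
    [[0.5, 0.3, 0.0, 0.2],
     [0.0, 0.6, 0.2, 0.2],
     [0.1, 0.0, 0.5, 0.4]],
    [[0.2, 0.2, 0.2, 0.4],
     [0.3, 0.1, 0.3, 0.3],
     [0.0, 0.2, 0.2, 0.6]],
])

# q[i, u]: expected stage cost of applying control u in state i.
q = np.array([[2.0, 4.0],
              [1.0, 3.0],
              [3.0, 1.0]])


def value_iteration(tol=1e-9, max_iter=100_000):
    """J_{l+1}(i) = min_u [ q(i,u) + sum_j P_ij(u) J_l(j) ]."""
    J = np.zeros(n)                                        # Step 1: initial guess
    for _ in range(max_iter):
        Q = q + np.einsum('uij,j->iu', P[:, :, :n], J)     # cost of each (i, u) pair
        J_new = Q.min(axis=1)                              # Step 2: update
        if np.max(np.abs(J_new - J)) < tol:                # Step 3: converged?
            break
        J = J_new
    return J_new, Q.argmin(axis=1)


def policy_iteration():
    """Alternate policy evaluation (a linear solve) and policy improvement."""
    mu = np.zeros(n, dtype=int)                            # Step 1: initial policy
    while True:
        # Step 2: policy evaluation -- solve J = q_mu + P_mu J for J_mu.
        P_mu = P[mu, np.arange(n), :n]                     # transition matrix under mu
        q_mu = q[np.arange(n), mu]
        J_mu = np.linalg.solve(np.eye(n) - P_mu, q_mu)
        # Step 3: policy improvement.
        mu_new = (q + np.einsum('uij,j->iu', P[:, :, :n], J_mu)).argmin(axis=1)
        # Step 4: terminate when the policy (and hence its cost) stops changing.
        if np.array_equal(mu_new, mu):
            return J_mu, mu
        mu = mu_new


J_vi, mu_vi = value_iteration()
J_pi, mu_pi = policy_iteration()
print("value iteration:  J =", np.round(J_vi, 4), " policy =", mu_vi)
print("policy iteration: J =", np.round(J_pi, 4), " policy =", mu_pi)
```

Both methods should return the same optimal cost vector and the same policy; policy iteration typically needs only a few iterations, each built around the linear solve from the policy evaluation step.
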
Linear programming:
- The optimal cost J* solves the following linear program:
    maximize    sum_{i=1}^{n} J(i)
    subject to  J(i) <= q(i, u) + sum_{j=1}^{n} P_ij(u) J(j)   for all i and all u in U(i)
- For each admissible pair (i, u) we get one linear constraint.

Infinite Horizon Problems: Discounted Problems
- Discounted cost (discount factor 0 < alpha < 1):
    J_pi(x_0) = lim_{N -> infinity} E[ sum_{k=0}^{N-1} alpha^k g(x_k, mu_k(x_k), w_k) ]

Deterministic Continuous-Time Optimal Control
- Basic Problem:
    dx(t)/dt = f(x(t), u(t)),   0 <= t <= T,   x(0) = x_0
    cost:  h(x(T)) + integral_0^T g(x(t), u(t)) dt
- No noise: the problem is deterministic.
- Goal: Find an admissible control trajectory u(t), t in [0, T], and the corresponding state trajectory x(t) that minimize the cost.
- The solution is found via the HJB equation or the Minimum Principle.

Hamilton-Jacobi-Bellman (HJB) Equation
- Continuous-time analog of the DPA; derived by discretizing the problem in time and taking limits of the DPA.
- It is a partial differential equation:
    0 = min_{u in U} [ g(x, u) + ∂V/∂t(t, x) + (∂V/∂x(t, x))^T f(x, u) ]   for all t, x,
    with boundary condition V(T, x) = h(x).
- Very hard to solve! Usually one guesses a solution and proves that it satisfies the HJB equation.
- Sufficient condition for optimality.
- Optimal policy: the u that minimizes the right-hand side of the HJB equation,
    mu*(t, x) = argmin_{u in U} [ g(x, u) + (∂V/∂x(t, x))^T f(x, u) ]

Minimum Principle
- Only finds the optimal solution for a specific initial condition.
- Define the Hamiltonian:
    H(x, u, p) = g(x, u) + p^T f(x, u)
- Then, along the optimal state trajectory x*(t) with optimal control u*(t) and adjoint (costate) p(t):
    dx*(t)/dt = f(x*(t), u*(t)),   x*(0) = x_0
    dp(t)/dt = -∂H/∂x(x*(t), u*(t), p(t)),   p(T) = ∂h/∂x(x*(T))
    u*(t) = argmin_{u in U} H(x*(t), u, p(t))   for all t in [0, T]
- These are only necessary conditions.
- Various extensions exist (e.g., fixed terminal state, ...).
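
As a small worked illustration of how the Minimum Principle is applied (the scalar system and cost below are hypothetical, chosen so everything can be solved by hand; they do not come from the slides): take dx/dt = u(t) with x(0) = x_0, stage cost g(x, u) = u^2/2, and terminal cost h(x) = (q/2) x^2 for some q > 0.

```latex
\begin{align*}
  &\text{Hamiltonian:} && H(x,u,p) = \tfrac{1}{2}u^2 + p\,u \\[2pt]
  &\text{Adjoint equation:} && \dot{p}(t) = -\tfrac{\partial H}{\partial x} = 0
      \;\Rightarrow\; p(t) \equiv p(T) = \nabla h\bigl(x^*(T)\bigr) = q\,x^*(T) \\[2pt]
  &\text{Hamiltonian minimization:} && u^*(t) = \arg\min_{u}\bigl[\tfrac{1}{2}u^2 + p(t)\,u\bigr]
      = -p(t) = -q\,x^*(T) \\[2pt]
  &\text{State equation:} && x^*(T) = x_0 + T\,u^* = x_0 - qT\,x^*(T)
      \;\Rightarrow\; x^*(T) = \frac{x_0}{1+qT}, \quad
      u^*(t) \equiv -\frac{q\,x_0}{1+qT}.
\end{align*}
```

The constant optimal control moves the state a fraction qT/(1+qT) of the way toward the origin, trading control effort against the terminal penalty; the same answer can be recovered via the HJB route by guessing and verifying the quadratic value function V(t, x) = q x^2 / (2 (1 + q (T - t))).
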