MITSUBISHI ELECTRIC RESEARCH LABORATORIES
http://www.merl.com
Value-Aware Loss Function for Model Learning in Reinforcement Learning
Farahmand, A.-M.; Barreto, A.M.S.; Nikovski, D.N.
TR2016-153 December 2016
Abstract

We consider the problem of estimating the transition probability kernel to be used by a model-based reinforcement learning (RL) algorithm. We argue that estimating a generative model that minimizes a probabilistic loss, such as the log-loss, might be overkill because such a probabilistic loss does not take into account the underlying structure of the decision problem and the RL algorithm that intends to solve it. We introduce a loss function that takes the structure of the value function into account. We provide a finite-sample upper bound for the loss function showing the dependence of the error on the model approximation error and the number of samples.
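The abstract's core point can be illustrated with a small sketch. This is not the paper's exact formulation; it is a hypothetical example contrasting a probabilistic criterion (KL divergence, the population analogue of the log-loss) with a value-aware criterion that penalizes a model only through the error it induces in the expected next-state value. The distributions `p`, `p_hat` and the value function `V` below are illustrative assumptions for a single state-action pair over three discrete next states.

```python
import numpy as np

def kl_divergence(p, q):
    """KL(p || q): the probabilistic criterion, blind to the value function."""
    return float(np.sum(p * np.log(p / q)))

def value_aware_loss(p_hat, p, V):
    """Squared error in the expected next-state value, |<p - p_hat, V>|^2."""
    return float(np.dot(p - p_hat, V) ** 2)

p = np.array([0.5, 0.3, 0.2])      # true next-state distribution (assumed)
V = np.array([1.0, 1.0, 5.0])      # value function: states 0 and 1 are alike
p_hat = np.array([0.3, 0.5, 0.2])  # model that confuses states 0 and 1 only

print(kl_divergence(p, p_hat))        # > 0: log-loss penalizes the model
print(value_aware_loss(p_hat, p, V))  # 0.0: the model's error is value-irrelevant
```

The model `p_hat` is wrong in the probabilistic sense, yet it yields exactly the expected value that the true kernel does, so a planner using it would be unharmed for this `V`. This is the sense in which a purely probabilistic loss can be "overkill" relative to the decision problem.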
European Workshop on Reinforcement Learning (EWRL)
This work may not be copied or reproduced in whole or in part for any commercial purpose. Permission to copy in whole or in part without payment of fee is granted for nonprofit educational and research purposes provided that all such whole or partial copies include the following: a notice that such copying is by permission of Mitsubishi Electric Research Laboratories, Inc.; an acknowledgment of the authors and individual contributions to the work; and all applicable portions of the copyright notice. Copying, reproduction, or republishing for any other purpose shall require a license with payment of fee to Mitsubishi Electric Research Laboratories, Inc. All rights reserved.
Copyright © Mitsubishi Electric Research Laboratories, Inc., 2016
201 Broadway, Cambridge, Massachusetts 02139