Lab 6-2: Q Network for Cart Pole
Reinforcement Learning with TensorFlow & OpenAI Gym
Sung Kim <hunkim+ml@gmail.com>
Cart Pole
https://gym.openai.com/docs
Random trials
Rewards
Cart Pole Q-network
[Figure: single-layer network. (1) The state s is the input; (2) the output is Ws.]
Q-Network training (Network construction)
Q-Network training (linear regression)
$y = r + \gamma \max Q(s')$
$\mathrm{cost}(W) = (Ws - y)^2$
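A worked instance of the target and the cost may make the regression view concrete. The numbers are made up for illustration and do not come from the lab: reward $r = 1$, discount $\gamma = 0.9$, best next-state estimate $\max Q(s') = 0.5$, and current prediction $Ws = 1.2$ for the action taken.

```latex
\begin{align*}
y &= r + \gamma \max Q(s') = 1 + 0.9 \times 0.5 = 1.45 \\
\mathrm{cost}(W) &= (Ws - y)^2 = (1.2 - 1.45)^2 = 0.0625
\end{align*}
```

Training adjusts $W$ so that the prediction $Ws$ moves toward the target $y$, exactly as in linear regression.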
Code: Network and setup
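As a rough illustration of the setup step, the sketch below builds the single linear layer Q(s) = Ws in plain Python. It is a dependency-free stand-in, not the lab's TensorFlow code. The helper names `init_weights` and `q_values`, the initialization range, and the sample observation are assumptions made for this example. The sizes reflect CartPole's 4-component observation and 2 discrete actions.

```python
import random

# CartPole observations have 4 components and there are 2 discrete actions
# (push left, push right).
INPUT_SIZE = 4
OUTPUT_SIZE = 2


def init_weights(input_size, output_size, scale=0.01, seed=0):
    """Return an output_size x input_size matrix of small random weights.

    The uniform range [0, scale) is an assumption chosen for illustration.
    """
    rng = random.Random(seed)
    return [[rng.uniform(0.0, scale) for _ in range(input_size)]
            for _ in range(output_size)]


def q_values(W, state):
    """Forward pass of the linear Q-network: one Q-value per action, Q(s) = W s."""
    return [sum(w * x for w, x in zip(row, state)) for row in W]


W = init_weights(INPUT_SIZE, OUTPUT_SIZE)
state = [0.02, -0.01, 0.03, 0.04]  # made-up observation, for illustration only
print(q_values(W, state))          # two numbers, one per action
```

With a single linear layer, every Q-value is a weighted sum of the four observation components, which limits what the network can represent.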
Code: Training
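The training step can be sketched as one gradient-descent update on cost(W) = (Ws - y)^2 with y = r + γ max Q(s'). The version below is a plain-Python sketch under stated assumptions, not the lab's TensorFlow code. The function name `train_step`, the values of `gamma` and `learning_rate`, and the sample transition are hypothetical. The target y is treated as a constant during the update, and terminal states are not handled.

```python
def q_values(W, state):
    """Linear Q-network forward pass: Q(s) = W s, one value per action."""
    return [sum(w * x for w, x in zip(row, state)) for row in W]


def train_step(W, state, action, reward, next_state,
               gamma=0.9, learning_rate=0.1):
    """Apply one gradient step for the action taken; return the cost before it.

    gamma and learning_rate are illustrative defaults, not values from the lab.
    """
    # Target: y = r + gamma * max Q(s'), held fixed while differentiating.
    y = reward + gamma * max(q_values(W, next_state))
    error = q_values(W, state)[action] - y
    # d/dW[action][i] of (Ws - y)^2 is 2 * error * state[i].
    for i, x in enumerate(state):
        W[action][i] -= learning_rate * 2.0 * error * x
    return error ** 2


# Made-up transition, used only to show that the cost shrinks.
W = [[0.0] * 4 for _ in range(2)]
state = [0.1, 0.0, -0.2, 0.0]
next_state = [0.1, 0.1, -0.2, -0.1]
first = train_step(W, state, 0, 1.0, next_state)
second = train_step(W, state, 0, 1.0, next_state)
print(first, second)
```

Only the row of W for the chosen action changes, because the cost depends only on that action's prediction.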
Code: apply
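Applying a trained network amounts to picking, in each state, the action with the largest predicted Q-value. The sketch below shows only that selection step in plain Python. The name `greedy_action` and the example weights are assumptions for illustration, and the environment loop from the lab is not reproduced.

```python
def q_values(W, state):
    """Linear Q-network forward pass: Q(s) = W s, one value per action."""
    return [sum(w * x for w, x in zip(row, state)) for row in W]


def greedy_action(W, state):
    """Return the index of the action with the largest Q-value (no exploration)."""
    q = q_values(W, state)
    return max(range(len(q)), key=q.__getitem__)


# Hand-picked weights, for illustration only (not trained values).
W = [[1.0, 0.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0]]
action_a = greedy_action(W, [0.1, 0.0, 0.5, 0.0])   # Q = [0.1, 0.5]
action_b = greedy_action(W, [0.3, 0.0, -0.5, 0.0])  # Q = [0.3, -0.5]
print(action_a, action_b)
```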
Results: really poor!
Why does it not work? Too shallow?
Exercise
• Why does it not work?
• Hint: DQN