Lab 6-2: Q Network for Cart Pole

Reinforcement Learning with TensorFlow & OpenAI Gym

Sung Kim <[email protected]>

Source: hunkim.github.io/ml/RL/rl06-l2.pdf · 2017-10-02
Code excerpt (truncated in the source): import numpy as np; import tensorflow as tf; import gym; gym.make(CartPole-v0 …

Cart Pole

https://gym.openai.com/docs

Random trials
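
A random trial can be sketched as below. The sketch assumes the classic Gym interface, in which `env.reset()` returns an observation and `env.step(action)` returns `(observation, reward, done, info)`; newer Gym and Gymnasium releases return different tuples. `StandInEnv` is a hypothetical placeholder (not part of Gym) so that the loop runs without Gym installed; with Gym available, an environment created by `gym.make("CartPole-v0")` would take its place. The 200-step cap mirrors the CartPole-v0 episode limit.

```python
import random


class StandInEnv:
    """Hypothetical stand-in for a Gym environment (not part of Gym).

    Each step pays reward 1.0 and ends the episode with probability 0.1,
    loosely mimicking CartPole's "+1 per step until the pole falls".
    """

    def __init__(self, seed=0):
        self.rng = random.Random(seed)

    def reset(self):
        return [0.0, 0.0, 0.0, 0.0]  # placeholder 4-value observation

    def step(self, action):
        done = self.rng.random() < 0.1
        return [0.0, 0.0, 0.0, 0.0], 1.0, done, {}


def run_random_episode(env, rng, n_actions=2, max_steps=200):
    """Play one episode with uniformly random actions; return total reward."""
    env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        action = rng.randrange(n_actions)  # 0 = push left, 1 = push right
        _, reward, done, _ = env.step(action)
        total_reward += reward
        if done:
            break
    return total_reward


if __name__ == "__main__":
    rng = random.Random(42)
    env = StandInEnv(seed=1)
    for episode in range(3):
        print("episode", episode, "reward", run_random_episode(env, rng))
```

Because the reward is +1 per step, the total reward of an episode equals the number of steps the episode lasted.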

Rewards

Cart Pole Q-network

[Figure: network diagram. (1) input: state s; (2) output: Ws]
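
The mapping from a state s to the output Ws can be sketched in plain Python as follows. This is a minimal illustration, not the slide's TensorFlow code: it assumes a single linear layer with a 4x2 weight matrix (four CartPole observations in, one Q value per action out), and the names `init_weights`, `q_values`, and `greedy_action` are hypothetical.

```python
import random

N_INPUTS = 4   # cart position, cart velocity, pole angle, pole angular velocity
N_ACTIONS = 2  # 0 = push left, 1 = push right


def init_weights(rng, scale=0.01):
    """Return a 4x2 weight matrix W with small random entries."""
    return [[rng.uniform(-scale, scale) for _ in range(N_ACTIONS)]
            for _ in range(N_INPUTS)]


def q_values(state, weights):
    """Compute Q(s) = sW: one estimated value per action."""
    return [sum(state[i] * weights[i][a] for i in range(N_INPUTS))
            for a in range(N_ACTIONS)]


def greedy_action(state, weights):
    """Pick the action whose estimated Q value is largest."""
    q = q_values(state, weights)
    return max(range(N_ACTIONS), key=lambda a: q[a])


if __name__ == "__main__":
    W = init_weights(random.Random(0))
    s = [0.02, -0.01, 0.03, 0.04]
    print(q_values(s, W), greedy_action(s, W))
```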

Q-Network training (Network construction)

[Figure: network diagram. (1) input: state s; (2) output: Ws]

Q-Network training (linear regression)

[Figure: network diagram. (1) input: state s; (2) output: Ws]

$y = r + \gamma \max Q(s')$

$\text{cost}(W) = (Ws - y)^2$
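
One way to read the target and cost above is as a single gradient-descent step on a linear Q function. The sketch below is an illustration under stated assumptions rather than the slide's TensorFlow code: the discount factor and learning rate are assumed values, only the weight column of the action actually taken is updated, the target y is treated as a constant, and a terminal step uses y = r because no next state exists. All function names are hypothetical.

```python
GAMMA = 0.9          # discount factor (assumed value)
LEARNING_RATE = 0.1  # gradient step size (assumed value)


def q_values(state, weights):
    """Q(s) = sW for a weight matrix with one column per action."""
    n_actions = len(weights[0])
    return [sum(s_i * row[a] for s_i, row in zip(state, weights))
            for a in range(n_actions)]


def td_target(reward, next_state, done, weights):
    """y = r + gamma * max Q(s'); on a terminal step, y = r (assumption)."""
    if done:
        return reward
    return reward + GAMMA * max(q_values(next_state, weights))


def train_step(state, action, reward, next_state, done, weights):
    """One gradient step on cost(W) = (Ws - y)^2 for the taken action.

    Updates `weights` in place and returns the cost before the update.
    """
    y = td_target(reward, next_state, done, weights)  # held constant
    error = q_values(state, weights)[action] - y
    for i, s_i in enumerate(state):
        # d/dW[i][action] of (Ws - y)^2 is 2 * (Ws - y) * s_i
        weights[i][action] -= LEARNING_RATE * 2.0 * error * s_i
    return error ** 2


if __name__ == "__main__":
    W = [[0.0, 0.0] for _ in range(4)]
    s = [1.0, 0.0, 0.0, 0.0]
    for step in range(3):
        print("cost", train_step(s, 0, 1.0, s, True, W))
```

Repeating the step on the same transition shrinks the cost, which is the linear-regression behaviour the slide title refers to.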

Code: Network and setup

Code: Training

Code: apply

Results: really poor!

Why does it not work? Too shallow?

Exercise

• Why does it not work?

• Hint: DQN
