Save the princess!

Simon Belak

@sbelak

simon@metabase.com

We will build an AI to play a silly little game. Its policy network is defined using Cortex and trained with a hot new algorithm that we will implement from the paper, first using Neanderthal and then made massively parallel using Onyx.

The game

• Find the shortest path to the princess

• Moves: up, down, left, right

• Don’t fall off the edge of the world

Computers playing computer games

Reinforcement learning

• Interact with the environment [embodied cognition]

• Not a single solution, but an action to take given the state of the environment [model of the world + model of self, consciousness?]

• Learns via positive/negative feedback

Reinforcement learning: how it’s usually done

Train a deep neural network on raw sensor data, usually pixels (i.e. no feature engineering)

… but there is another way

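Classic evolutionary algorithm vs. evolution strategies

[Diagram, classic evolutionary algorithm: population, mutate, crossover, next generation; labelled "sample weighted"]

[Diagram, evolution strategies: solution, jitter (repeated), populate, update; labelled "combine weighted"]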

Using ES to train a neural network

Benefits

• highly parallelizable

• more robust (fewer hyperparameters, more stable, doesn’t care about the properties of the reward function)

• can exploit structure

• less computationally expensive

Downsides

• takes longer to converge

• noise must lead to different outcomes

Instead of backpropagation use ES on weights
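The jitter, evaluate, combine, update loop can be sketched in a few lines. This is a dependency-free Python illustration, not the Neanderthal implementation from the talk. The function name `es_step`, its parameters, the reward standardisation and the toy reward are all assumptions made for the example.

```python
import random

def es_step(weights, reward_fn, rng, population=50, sigma=0.1, learning_rate=0.01):
    """One evolution-strategies update: jitter, evaluate, combine weighted by reward."""
    # Jitter: one Gaussian noise vector per population member.
    noises = [[rng.gauss(0.0, 1.0) for _ in weights] for _ in range(population)]
    # Evaluate every jittered copy of the weights.
    rewards = [reward_fn([w + sigma * e for w, e in zip(weights, noise)])
               for noise in noises]
    # Standardise rewards (an assumption of this sketch) so scale does not matter.
    mean = sum(rewards) / population
    std = (sum((r - mean) ** 2 for r in rewards) / population) ** 0.5
    if std == 0.0:
        return list(weights)  # noise led to identical outcomes: no signal
    advantages = [(r - mean) / std for r in rewards]
    # Combine: move along the reward-weighted sum of the noise vectors.
    scale = learning_rate / (population * sigma)
    return [w + scale * sum(a * noise[i] for a, noise in zip(advantages, noises))
            for i, w in enumerate(weights)]

# Toy stand-in for "play the game": reward peaks when weights hit a target.
target = [0.5, 0.1, -0.3]

def toy_reward(weights):
    return -sum((w - t) ** 2 for w, t in zip(weights, target))

rng = random.Random(0)
weights = [0.0, 0.0, 0.0]
for _ in range(200):
    weights = es_step(weights, toy_reward, rng)
```

No gradient of the reward is ever computed, which is why the properties of the reward function do not matter, and each evaluation is independent of the others, which is what makes the loop easy to parallelise.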

Let’s build it!

Neanderthal

• Blazing fast matrix and linear algebra library

• Based on ATLAS and LAPACK

• Runs on CPUs and GPUs

• A study in writing efficient code

• Somewhat terse API (fluokitten helps)

x+y ax+y ax+by
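The three expressions are the vector operations an ES update is built from, for instance weights plus sigma times noise is an ax+y. A plain-Python sketch of what each one computes follows. The function names mirror the BLAS-style naming (axpy is a BLAS routine), but these are stand-ins written for illustration, not Neanderthal calls.

```python
def xpy(x, y):
    """x + y, element by element."""
    return [xi + yi for xi, yi in zip(x, y)]

def axpy(a, x, y):
    """a*x + y: scale x by the scalar a, then add y."""
    return [a * xi + yi for xi, yi in zip(x, y)]

def axpby(a, x, b, y):
    """a*x + b*y: scale both vectors, then add them."""
    return [a * xi + b * yi for xi, yi in zip(x, y)]

# Jittering weights by sigma * noise is a single axpy.
weights = [1.0, 2.0]
noise = [0.5, -0.5]
jittered = axpy(0.1, noise, weights)
```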

1.1 ES parallelized

Onyx

A masterless, cloud scale, fault tolerant, high performance distributed computation system

workflow + flow conditions + catalogue

[[:input :processing-1]
 [:input :processing-2]
 [:processing-1 :output-1]
 [:processing-2 :output-2]]

[{:flow/from :input-stream
  :flow/to [:process-adults]
  :flow/predicate :my.ns/adult?
  :flow/doc "Emits segment if an adult."}]

[{:onyx/name :add-5
  :onyx/fn :my/adder
  :onyx/type :function
  :my/n 5
  :onyx/params [:my/n]}

 {:onyx/name :in
  :onyx/plugin :onyx.plugin.core-async/input
  :onyx/type :input
  :onyx/medium :core.async
  :onyx/batch-size batch-size
  :onyx/max-peers 1
  :onyx/doc "Reads segments from a core.async channel"}

 {:onyx/name :out
  :onyx/plugin :onyx.plugin.core-async/output
  :onyx/type :output
  :onyx/medium :core.async
  :onyx/doc "Writes segments to a core.async channel"}]

Describing computation with data
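To make the idea concrete, here is a toy single-process interpreter for a job described purely as data: a workflow (list of edges), a catalogue (task name to function) and flow conditions (predicates on edges). It is a Python sketch of the concept only. It is not how Onyx works internally, and `run_workflow`, the dictionary keys and the task names are made up for the example.

```python
from collections import defaultdict

def run_workflow(workflow, catalog, flow_conditions, inputs):
    """Push every input segment through the task graph; collect segments at the leaves."""
    downstream = defaultdict(list)
    for src, dst in workflow:
        downstream[src].append(dst)
    # (from, to) -> predicate deciding whether a segment may travel that edge.
    predicates = {(c["from"], to): c["predicate"]
                  for c in flow_conditions for to in c["to"]}
    results = defaultdict(list)
    queue = [(task, seg) for task, segs in inputs.items() for seg in segs]
    while queue:
        task, segment = queue.pop(0)
        fn = catalog.get(task)
        if fn is not None:
            segment = fn(segment)  # apply the task's function, if it has one
        if not downstream[task]:
            results[task].append(segment)  # leaf task: this is an output
        for nxt in downstream[task]:
            allow = predicates.get((task, nxt), lambda seg: True)
            if allow(segment):
                queue.append((nxt, segment))
    return dict(results)

workflow = [("in", "add-5"), ("add-5", "adults"), ("add-5", "everyone")]
catalog = {"add-5": lambda seg: {**seg, "age": seg["age"] + 5}}
flow_conditions = [{"from": "add-5", "to": ["adults"],
                    "predicate": lambda seg: seg["age"] >= 18}]
results = run_workflow(workflow, catalog, flow_conditions,
                       {"in": [{"age": 10}, {"age": 15}]})
```

Because the job is plain data, it can be inspected, generated or rewritten by ordinary code before it is ever run.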

[Diagram: the ES job as a task graph with the tasks populate, update, monitor and out. Annotations across the builds: "same channel" and "accumulates state :("]

Resilience and handling state

• Activity log

• Window and trigger states checkpointed

• Resume points (transfer state from job to job)

• Configurable flux policies (continue/kill/recover)

Computation graphs are a great way to structure data processing code

2. Policy network

Cortex

• Neural networks, regression and feature learning

• Clean idiomatic Clojure API

• Computation encoded as data (and makes good use of it)

• Uses core.matrix for heavy lifting

Encode princess = 1, hero = -1
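One way to read this encoding is as the input vector of the policy network: the board is flattened into a list with 1 at the princess, -1 at the hero and 0 everywhere else. The sketch below is in Python. The function name, the (row, column) convention and the use of 0 for empty squares are assumptions, since the slide only gives the two values.

```python
def encode_board(size, hero, princess):
    """Flatten a size x size board: hero = -1, princess = 1, empty = 0 (assumed)."""
    board = [0.0] * (size * size)
    hero_row, hero_col = hero
    princess_row, princess_col = princess
    board[hero_row * size + hero_col] = -1.0
    # Written last, so the princess wins if both share a square.
    board[princess_row * size + princess_col] = 1.0
    return board

state = encode_board(3, hero=(0, 0), princess=(2, 1))
```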

3. Game

Simulation

• Find the shortest path to the princess

Reward function

• Play the entire game (planning)

• Collect multiple playthroughs to lessen the effects of randomness
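A minimal sketch of such a reward function, in Python: each playthrough runs a whole game from a random start, and the reward is the average over several of them. The reward values (higher for shorter paths, -1 for falling off the edge, 0 for running out of moves), the board size and the two hand-written policies are assumptions for illustration. A trained policy network would take the place of `greedy`.

```python
import random

MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}

def play(policy, size, hero, princess, max_steps):
    """Play one entire game and return its reward."""
    steps = 0
    while hero != princess:
        if steps == max_steps:
            return 0.0  # ran out of moves
        d_row, d_col = MOVES[policy(hero, princess)]
        hero = (hero[0] + d_row, hero[1] + d_col)
        steps += 1
        if not (0 <= hero[0] < size and 0 <= hero[1] < size):
            return -1.0  # fell off the edge of the world
    return 1.0 - steps / (max_steps + 1)  # shorter path, higher reward

def average_reward(policy, rng, size=5, playthroughs=20, max_steps=20):
    """Average over several random starts to lessen the effects of randomness."""
    total = 0.0
    for _ in range(playthroughs):
        hero = (rng.randrange(size), rng.randrange(size))
        princess = (rng.randrange(size), rng.randrange(size))
        total += play(policy, size, hero, princess, max_steps)
    return total / playthroughs

def greedy(hero, princess):
    """Hand-written policy: fix the row first, then the column."""
    if hero[0] != princess[0]:
        return "down" if hero[0] < princess[0] else "up"
    return "right" if hero[1] < princess[1] else "left"

def always_up(hero, princess):
    """A bad policy that walks off the top edge."""
    return "up"

good = average_reward(greedy, random.Random(0))
bad = average_reward(always_up, random.Random(0))
```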

Takeaways

Explore

Have fun

Go on an adventure!

Questions

Simon Belak

@sbelak

simon@metabase.com
