
Advancements in Neural Program Synthesis

Yufan Zhuang (yz3453)¹  Saikat Chakraborty (sc4537)¹

¹Columbia University

April 4, 2019


Overview

1 Problem Overview

2 Grammar- and RL-based NPS (Overview, Model Architecture, Evaluation)

3 NPS with Inferred Execution Traces (Overview, Model Details, Evaluation)

4 Summary & Observations


Problem Overview

The Goal

Given a set of inputs, generate the corresponding outputs. There are two ways to address this:

Program Induction: model the relation between input and output directly with a neural network.

Difficult to model due to the (often) discrete nature of the I/O, and because the I/O relation is non-differentiable.

Program Synthesis: generate (or synthesize) a program that takes in the input and produces the output.

No need to model the input/output behavior directly; complex relations between I/O are often already implemented as the functionality of code elements.


Problem Overview (contd.)

Given a set of inputs and outputs, the model is asked to produce the correct program.

Existing Major Approaches and Their Drawbacks

Neural-guided Search (Exponential Search Space)

Direct Synthesis (Could be Syntactically Wrong)

Differentiable Program with Gradient Methods (Difficult to Model)


Shortcomings of Current Direct Synthesis Methods

The model is typically a neural machine translation model (inputs/outputs → programs).

Two primary drawbacks

The output may not be a program: syntactic correctness cannot be guaranteed (unlike NLP tasks evaluated with BLEU, where near-misses still earn credit).

Program Aliasing: multiple different programs can be semantically correct, but the supervised model is rewarded only for reproducing the ground truth.


Domain: Karel

Karel is an educational programming language in which an agent navigates a grid world. Note that it has control flow but no data flow.


Leveraging Grammar and Reinforcement Learning for Neural Program Synthesis

Rudy Bunel¹, Matthew Hausknecht², Jacob Devlin³, Rishabh Singh², Pushmeet Kohli⁴

¹University of Oxford

²Microsoft Research

³Google

⁴DeepMind

April 4, 2019


Goal of this Paper

Deep Learning Model to Generate Tokens Directly

Set up a Reinforcement Learning Framework to Overcome the Program Aliasing Problem

Introduce a syntax checker to prune the space of possible programs


Problem Formulation

At training time, we assume access to N training samples, each with K input/output states and a ground-truth program λ.

$$\mathcal{D} = \{(\{IO^k_i\}_{k=1\ldots K},\ \lambda_i)\}_{i=1\ldots N}, \quad \text{s.t. } \lambda_i(I^k_i) = O^k_i,\ \forall i \in 1\ldots N,\ \forall k \in 1\ldots K$$

The goal is to learn a synthesizer $\sigma : \{IO^k\}_{k=1\ldots K} \to \lambda$.

At test time, the data consists of both specification examples and held-out examples.

$$\mathcal{D}_{test} = \{\{IO^{k_{spec}}_j\}_{k_{spec}=1\ldots K},\ \{IO^{k_{test}}_j\}_{k_{test}=K+1\ldots K'}\}_{j=1\ldots N_{test}}$$

At test time, the program λ is synthesized from the specification examples, then evaluated on the entire test set, including the held-out examples.


Model Architecture

Each program is represented by a sequence of tokens

$$\lambda = [s_1, s_2, \ldots, s_L]$$

where each token belongs to an alphabet Σ (dictionary).

The form of the model:

$$p_\theta(\lambda_i \mid \{IO^k_i\}_{k=1\ldots K}) = \prod_{t=1}^{L_i} p_\theta(s_t \mid s_1, s_2, \ldots, s_{t-1}, \{IO^k_i\}_{k=1\ldots K})$$
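To make the factorization concrete, here is a minimal sketch (ours, not the paper's) of evaluating $\log p_\theta(\lambda \mid \{IO^k\})$ from per-step decoder logits; the vocabulary size, length, and random stand-in values are assumptions:

```python
import torch
import torch.nn.functional as F

# Evaluate log p(lambda | IO) = sum_t log p(s_t | s_1..s_{t-1}, IO)
# from per-step decoder logits. All values are random stand-ins.
vocab_size, length = 50, 6
logits = torch.randn(length, vocab_size)        # one row per time step t
program = torch.randint(vocab_size, (length,))  # lambda = [s_1, ..., s_L]

log_probs = F.log_softmax(logits, dim=-1)       # log p(. | s_<t, {IO^k})
log_p_lambda = log_probs[torch.arange(length), program].sum()
print(log_p_lambda)                             # log p_theta(lambda | {IO^k})
```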


Model Architecture

1 Each pair of IO is embedded jointly by a CNN (shared weights)

2 One LSTM is run for each example (shared weights)

3 At each time step, the IO pair embedding and the previous token are fed into the LSTM

4 The output of the LSTMs goes through a maxpool layer and a linear layer, and a softmax generates the next-token prediction

5 The syntax checker produces a mask over the output of the linear layer to filter out syntactically invalid tokens (a minimal sketch follows)
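A minimal PyTorch sketch of one decoding step as described above; the layer sizes, grid shape, and invalid-token set are assumptions, not the paper's exact values:

```python
import torch
import torch.nn as nn

K, vocab, hid, emb = 5, 50, 128, 32
io_cnn = nn.Sequential(nn.Conv2d(2, 16, 3, padding=1), nn.ReLU(),
                       nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                       nn.Linear(16, hid))        # joint embedding of an IO pair
tok_emb = nn.Embedding(vocab, emb)
lstm = nn.LSTM(hid + emb, hid, batch_first=True)  # shared across the K examples
out = nn.Linear(hid, vocab)

io_pairs = torch.randn(K, 2, 18, 18)              # input grid + output grid, stacked
prev_tok = torch.zeros(K, dtype=torch.long)       # previous token, fed to all K

io_vec = io_cnn(io_pairs)                             # (K, hid)
step_in = torch.cat([io_vec, tok_emb(prev_tok)], -1)  # (K, hid + emb)
h, _ = lstm(step_in.unsqueeze(1))                     # one LSTM step per example
pooled = h.squeeze(1).max(dim=0).values               # maxpool over the K examples
logits = out(pooled)                                  # scores over the alphabet

syntax_mask = torch.zeros(vocab)                  # 0 = valid, -inf = invalid
syntax_mask[10:] = float('-inf')                  # hypothetical invalid tokens
next_token_dist = torch.softmax(logits + syntax_mask, dim=-1)
```

Sharing the CNN and LSTM weights across the K examples is what lets the maxpool aggregate evidence from a variable number of IO pairs.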


Objective Function

Maximum Likelihood Estimation. The default solution is MLE, since this is a supervised learning problem.

$$\theta^* = \arg\max_\theta \prod_i p_\theta(\lambda_i \mid \{IO^k_i\}_{k=1\ldots K})$$
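In practice MLE reduces to token-level cross-entropy against the ground-truth program; a small sketch with random stand-in decoder logits:

```python
import torch
import torch.nn.functional as F

# MLE on token sequences = cross-entropy against the ground-truth program.
L, vocab = 8, 50
logits = torch.randn(L, vocab, requires_grad=True)  # p_theta(s_t | s_<t, IO)
gold = torch.randint(vocab, (L,))                   # ground-truth tokens

nll = F.cross_entropy(logits, gold)   # mean of -log p_theta(s_t | ...)
nll.backward()                        # gradient step on the likelihood
```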

Drawbacks

Still has the program aliasing problem.
Exposure bias: the model does not explore enough sample paths, since it only ever sees the data distribution.


Objective Function

Reinforcement Learning. A potentially better objective is to optimize the expected reward:

$$\theta^* = \arg\max_\theta L_R(\theta), \quad \text{where } L_R(\theta) = \sum_i \sum_\lambda p_\theta(\lambda \mid \{IO^k_i\}_{k=1\ldots K})\, R_i(\lambda)$$

where $R_i(\lambda)$ denotes the reward for a sampled program. If a simulator is accessible, R can be designed to optimize for generalization and to prevent over-fitting. Other criteria, such as conciseness, can also be taken into account in R.


Objective Function

REINFORCE Trick / Score-Function Estimator. Notice that the sum over all possible λ is intractable:

$$\theta^* = \arg\max_\theta L_R(\theta), \quad \text{where } L_R(\theta) = \sum_i \sum_\lambda p_\theta(\lambda \mid \{IO^k_i\}_{k=1\ldots K})\, R_i(\lambda)$$

We can still estimate the gradient with Monte Carlo sampling:

$$\begin{aligned}
\nabla_\theta L_R(\theta) &= \sum_i \sum_\lambda \nabla_\theta\, p_\theta(\lambda \mid \{IO^k_i\}_{k=1\ldots K})\, R_i(\lambda) \\
&= \sum_i \sum_\lambda \nabla_\theta \log p_\theta(\lambda \mid \{IO^k_i\}_{k=1\ldots K})\, R_i(\lambda)\, p_\theta(\lambda \mid \{IO^k_i\}_{k=1\ldots K}) \\
&\approx \sum_i \sum_{r=1}^{S} \frac{1}{S}\, \nabla_\theta \log p_\theta(\lambda_r \mid \{IO^k_i\}_{k=1\ldots K})\, R_i(\lambda_r), \quad \lambda_r \sim p_\theta(\lambda \mid \{IO^k_i\}_{k=1\ldots K})
\end{aligned}$$
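A sketch of the resulting estimator, with `sample_program` and `reward` as hypothetical stand-ins for decoding from $p_\theta$ and running the Karel simulator:

```python
import torch

# Score-function (REINFORCE) estimate: sample S programs and weight
# each sample's log-probability by its reward.
S = 16

def sample_program():
    # would return (log p_theta(lambda_r | IO), lambda_r); stubbed here
    return torch.randn((), requires_grad=True), None

def reward(lam):
    return 1.0  # e.g., 1 if lambda reproduces all K outputs, else 0

loss = torch.zeros(())
for _ in range(S):
    log_p, lam = sample_program()
    loss = loss - reward(lam) * log_p / S   # minimizing -L_R(theta)
loss.backward()
```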


Objective Function

Approximate $p_\theta$ with $q_\theta$. Since the domain is discrete and there is little training data per program, it is difficult to sample diverse programs according to the learned distribution. The model therefore would not explore much, which is undesirable.

At every step, the S most likely samples returned by a beam search are used to construct the approximate distribution $q_\theta$:

$$q_\theta(\lambda_r \mid \{IO^k_i\}_{k=1\ldots K}) = \frac{p_\theta(\lambda_r \mid \{IO^k_i\}_{k=1\ldots K})}{\sum_{\lambda \in BS(p_\theta, S)} p_\theta(\lambda \mid \{IO^k_i\}_{k=1\ldots K})} \ \text{if } \lambda_r \in BS(p_\theta, S), \text{ otherwise } 0$$
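Since $q_\theta$ is just $p_\theta$ renormalized over the beam, it can be computed with a softmax over the beam's log-probabilities; the scores below are stand-ins:

```python
import torch

# Renormalize the beam: q_theta puts all mass on the S beam candidates,
# proportional to p_theta, and zero elsewhere.
beam_log_p = torch.tensor([-1.2, -2.0, -3.5])  # log p_theta of BS(p_theta, S)
q = torch.softmax(beam_log_p, dim=0)           # exactly p / sum(p) over the beam
print(q, q.sum())                              # sums to 1 on the beam
```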


Objective Function

Approximate $p_\theta$ with $q_\theta$. The objective then becomes tractable:

$$L_R(\theta) \approx \sum_i \sum_\lambda q_\theta(\lambda \mid \{IO^k_i\}_{k=1\ldots K})\, R_i(\lambda)$$

$q_\theta$ approximates $p_\theta$ at its modes.


Objective Function

Approximate $p_\theta$ with $q_\theta$. Using $q_\theta$, the objective function can be enriched for better exploration. Instead of optimizing the expected reward of a single sample, a bag of C programs is sampled and only the best one is kept. This assigns probability mass to several candidate programs, resulting in a higher diversity of outputs.

$$\theta^* = \arg\max_\theta \sum_i \sum_{(\lambda_1,\ldots,\lambda_C) \in BS(p_\theta,S)^C} \Big[\max_{j \in \{1,\ldots,C\}} R_i(\lambda_j)\Big] \prod_{r=1}^{C} q_\theta(\lambda_r \mid \{IO^k_i\}_{k=1\ldots K})$$
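For a tiny beam this expectation can be evaluated exactly by enumerating all C-tuples; a sketch with stand-in rewards and beam scores:

```python
import itertools

import torch

# Expected best-of-C reward under q_theta: enumerate every C-tuple of
# beam candidates, weight it by prod_r q(lambda_r), and keep only the
# best reward in the bag. Rewards and beam scores are stand-ins.
beam_log_p = torch.tensor([-1.2, -2.0, -3.5])
q = torch.softmax(beam_log_p, dim=0)
rewards = torch.tensor([0.0, 1.0, 0.0])   # R_i per beam candidate
C = 2

objective = sum(
    max(rewards[j] for j in tup) * torch.prod(q[list(tup)])
    for tup in itertools.product(range(len(q)), repeat=C)
)
print(objective)  # expected best-of-C reward under q_theta
```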


Syntax Checker

Conditioning on syntax. If a syntax checker is available (it serves like IDE auto-complete), include it in the likelihood. Denote by stx the event that the sampled program is syntactically correct.

$$\begin{aligned}
p(\lambda \mid \{IO^k_i\}_{k=1\ldots K}, stx) &= \frac{p(stx \mid \lambda, \{IO^k_i\}_{k=1\ldots K})\, p(\lambda \mid \{IO^k_i\}_{k=1\ldots K})}{p(stx \mid \{IO^k_i\}_{k=1\ldots K})} \\
&\propto p(stx \mid \lambda, \{IO^k_i\}_{k=1\ldots K})\, p(\lambda \mid \{IO^k_i\}_{k=1\ldots K}) \\
&\propto p(stx \mid \lambda)\, p(\lambda \mid \{IO^k_i\}_{k=1\ldots K})
\end{aligned}$$

At the token level, it is the same

$$p(s_t \mid s_1, \ldots, s_{t-1}, \{IO^k_i\}_{k=1\ldots K}, stx_{1\ldots t}) \propto p(stx_{1\ldots t} \mid s_1, \ldots, s_t)\, p(s_t \mid s_1, \ldots, s_{t-1}, \{IO^k_i\}_{k=1\ldots K})$$
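Because $p(stx_{1\ldots t} \mid s_1, \ldots, s_t)$ is either 0 or 1, conditioning amounts to masking the decoder logits before the softmax; a sketch with a hypothetical 10-token alphabet:

```python
import torch

# Token-level conditioning as a mask: multiplying by {0,1} and
# renormalizing is just masking the logits with -inf.
logits = torch.randn(10)                                     # decoder scores for s_t
valid = torch.tensor([1, 1, 0, 0, 1, 0, 1, 1, 0, 0]).bool()  # checker output

p_next = torch.softmax(logits.masked_fill(~valid, float('-inf')), dim=-1)
print(p_next)  # zero probability on syntactically invalid tokens
```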


Syntax Checker

Jointly Learned Syntax. If no syntax checker is available, a network can be trained to learn one. Denote the checker network (syntaxLSTM) by $g_\phi$; its final activation is $x \to -\exp(x)$, so its outputs are always negative.

In training, one additional loss term is added:

$$L_{syntax} = -\sum_i \sum_t g_\phi(s^i_t \mid s^i_1, \ldots, s^i_{t-1}), \quad \text{where } \lambda_i = [s^i_1, s^i_2, \ldots, s^i_L]$$

This penalizes the syntaxLSTM for assigning strongly negative mask values ($-\exp g_\phi$) to ground-truth tokens, which are known to be syntactically valid.
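A sketch of the learned mask and this loss, with random stand-ins for the syntaxLSTM outputs:

```python
import torch

# g_phi ends in x -> -exp(x), so its outputs are strictly negative;
# added to the decoder logits they suppress tokens deemed invalid.
# L_syntax drives the mask toward zero on ground-truth tokens.
T, vocab = 6, 50
raw = torch.randn(T, vocab, requires_grad=True)  # stand-in syntaxLSTM output
mask = -torch.exp(raw)                           # always < 0
gold = torch.randint(vocab, (T,))                # tokens of lambda_i

l_syntax = -mask[torch.arange(T), gold].sum()    # -sum_t g_phi(s_t | s_<t)
l_syntax.backward()
```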


Evaluation

Dataset: a synthetic dataset built by sampling programs and then sampling IO pairs. In the testing phase, 5,000 programs are held out, each with 6 IO pairs split into a specification (5 pairs) and one held-out pair.

Top-1 Generalization Accuracy: the fraction of cases where the most likely synthesized program behaves correctly on all input-output examples.

Exact Match Accuracy: the fraction of cases where the most likely synthesized program is exactly the same as the reference program.


Experiment Results

Full dataset denotes the 1-million-program dataset; small dataset contains only 10,000 examples. RL beam div employs the richer objective function; RL beam div opt adds a penalty for long programs on top of RL beam div.


Experiment Results

Top-k accuracy

MLE shows a greater relative accuracy increase with growing k than RL does. Methods employing beam search and diversity objectives reduce this accuracy gap by encouraging diversity in the beam of partial programs.


Experiment Results

Impact of Syntax

MLE learned denotes the model using the syntaxLSTM; MLE handwritten uses a handwritten syntax checker; MLE large simply increases the parameter count to the same number as MLE learned.

Takeaway: simply using a larger network performs comparably to learning the syntax.


Experiment Results

Learned Syntax (Program A). Black indicates valid, white indicates invalid. Blue indicates the syntaxLSTM predicts a valid token to be invalid; red indicates it predicts an invalid token to be valid.


Improving Neural Program Synthesis with Inferred Execution Traces

Richard Shin¹, Illia Polosukhin², Dawn Song¹

¹UC Berkeley

²NEAR Protocol

April 4, 2019


Motivation

How humans think about problem solving:

Given an input and the corresponding output, we think of a sequence of steps.


Motivation Contd.

turnRight

turnRight

move


Motivation Contd.

Given the sequence of actions (i.e., the execution trace), it is much easier to reason about the program.

For instance,

Given ⟨turnRight, turnRight, move⟩, it is much easier to recover the actual program:

repeat(2) { turnRight() } move()


Overview of the Method

Figure: Traditional Program Synthesis approach

Figure: Program Synthesis With Inferred Execution Trace


Step 1: Trace Inference

Figure: Inference of the Execution Trace from Each of the Examples


Step 1: Trace Inference (Contd.)

For each IO pair, a trace sequence is generated.

A convolutional network is used to encode the initial and intermediate states.

An LSTM is used to generate the sequence of actions.

Ideally, the LSTM should track the change in state internally, but experiments show it performs better when the actual state after every action is provided as an auxiliary input (a minimal sketch follows).
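A minimal sketch of such a trace decoder, feeding the observed grid state after each action back in as an auxiliary input; the layer sizes, grid shape, and 6-action alphabet are assumptions:

```python
import torch
import torch.nn as nn

state_cnn = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
                          nn.AdaptiveAvgPool2d(1), nn.Flatten())  # grid -> vector
act_emb = nn.Embedding(6, 16)          # hypothetical action alphabet (move, ...)
cell = nn.LSTMCell(8 + 16, 64)
head = nn.Linear(64, 6)

h = torch.zeros(1, 64)
c = torch.zeros(1, 64)
prev_act = torch.zeros(1, dtype=torch.long)                  # <start> action
for state in [torch.randn(1, 1, 18, 18) for _ in range(3)]:  # observed states
    x = torch.cat([state_cnn(state), act_emb(prev_act)], dim=-1)
    h, c = cell(x, (h, c))
    prev_act = head(h).argmax(dim=-1)                        # next predicted action
```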


Step 2: Code Synthesis

Figure: Synthesis of the actual code from the IO and also the inferred trace.


Step 2: Code Synthesis (Contd.)

An LSTM is used to generate the final program as a token sequence.

Inputs to the LSTM are: 1 the previously generated code token; 2 an attention context over the inferred action sequence; 3 an embedding of the IO pair from a convolutional network.

The output of the LSTM at time $i$ for the $j$'th IO pair is $o_{i,j}$.

The final token is generated as $\mathrm{softmax}(W \cdot \mathrm{MaxPool}(o_{i,1}, o_{i,2}, \ldots, o_{i,N}))$, where $N$ is the number of IO pairs (sketched below).
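A sketch of this pooling step with stand-in decoder states; sizes are assumptions:

```python
import torch
import torch.nn as nn

# Pool the per-example decoder states into one token prediction:
# softmax(W . MaxPool(o_{i,1}, ..., o_{i,N})).
N, hid, vocab = 5, 64, 40
o_i = torch.randn(N, hid)                 # LSTM outputs at step i, one per IO pair
W = nn.Linear(hid, vocab)

pooled = o_i.max(dim=0).values            # elementwise MaxPool over the N pairs
token_dist = torch.softmax(W(pooled), dim=-1)  # distribution over next code token
```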


Evaluation : Overall performance

Exact match is the most stringent metric.

Gen. (generalization) is when the generated program passes all the test cases, i.e., generates the expected outputs.

Guided search generates K code sequences using guided beam search with the softmax probabilities from the previous step (very similar to DeepCoder [Balog et al., ICLR 2017]).


Evaluation : On different data stripe

Observations:

With no control flow there are no branches or loops, so the code is sequential; the sequential models (i.e., LSTMs) used in Steps 1 and 2 generate 100% correct code.

For programs with branches (and also loops), other models (tree-based; graph-based: Allamanis et al., ICLR'18) are worth investigating.

For longer code sequences (30+ tokens), the LSTM loses information due to long-term dependencies. Self-attention models (Vaswani et al., NIPS'17) may be worth exploring here.


Neural Program Synthesis : Summary

Neural Program Synthesis is an interesting field that has gained focus in the past few years.

Many papers have been published in prestigious venues (e.g., NeurIPS, ICLR, ICML, POPL).


Neural Program Synthesis : Summary

Caveats:

The datasets these papers work on have no concept of variables.

All programs depend only on a global state, so there is no sophisticated data dependency.

Synthesizing code for general-purpose programming languages (e.g., Java, Python) with complex syntactic and semantic structure remains an open challenge.

Opinion

For generating general-purpose programs, some form of concrete program analysis is needed to augment ML-based techniques.

The well-defined grammars and structural information of general programs should be utilized.


The End
