Social Media & Text Analysis
lecture 9 - Deep Learning for NLP

CSE 5539-0010 Ohio State University
Instructor: Wei Xu
Website: socialmedia-class.org

some slides are adapted from Richard Socher, Greg Durrett, Chris Dyer, Dan Jurafsky, Chris Manning

socialmedia-class.org/slides/lecture9_deeplearning.pdf · 2020-03-24

A Neuron

• If you know Logistic Regression, then you already understand a basic neural network neuron!

A single neuron: a computational unit with n (3) inputs and 1 output and parameters W, b

[Figure: a single neuron, with labels: Inputs; Bias unit (corresponds to the intercept term); Activation function; Output]

A Neuron is essentially a binary logistic regression unit

$h_{w,b}(x) = f(w^{T}x + b)$

$f(z) = \frac{1}{1 + e^{-z}}$

w, b are the parameters of this neuron, i.e., this logistic regression model

b: We can have an “always on” feature, which gives a class prior, or separate it out, as a bias term
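The neuron above can be sketched in a few lines of plain Python. This is only an illustration: the weights, bias, and input values are made up, and the names `neuron` and `logistic` are not from the slides. The last line shows the remark about b, folding the bias into the weight vector through an extra input that is always 1.

```python
import math

def logistic(z):
    """Logistic (sigmoid) function f(z) = 1 / (1 + e^{-z})."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(x, w, b):
    """h_{w,b}(x) = f(w^T x + b) for a single neuron."""
    z = sum(w_i * x_i for w_i, x_i in zip(w, x)) + b
    return logistic(z)

# Hypothetical parameters for a neuron with n = 3 inputs.
w = [0.5, -1.0, 2.0]
b = 0.1
x = [1.0, 0.0, 1.0]

h = neuron(x, w, b)

# Same neuron with the bias treated as the weight of an "always on" feature.
h_always_on = neuron(x + [1.0], w + [b], 0.0)
```

Both calls should give the same value (up to floating-point rounding), which is why the bias can be read as the intercept term of a logistic regression.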

A Neural Network = running several logistic regressions at the same time

If we feed a vector of inputs through a bunch of logistic regression functions, then we get a vector of outputs…

A Neural Network = running several logistic regressions at the same time

…which we can feed into another logistic regression function

It is the loss function that will direct what the intermediate hidden variables should be, so as to do a good job at predicting the targets for the next layer, etc.

A Neural Network = running several logistic regressions at the same time

Before we know it, we have a multilayer neural network….

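f : Activation Function

We have

$a_1 = f(W_{11}x_1 + W_{12}x_2 + W_{13}x_3 + b_1)$
$a_2 = f(W_{21}x_1 + W_{22}x_2 + W_{23}x_3 + b_2)$
etc.

In matrix notation

$z = Wx + b$
$a = f(z)$

where f is applied element-wise:

$f([z_1, z_2, z_3]) = [f(z_1), f(z_2), f(z_3)]$

[Figure: network diagram with hidden units a1, a2, a3; weight W12 and bias b3 are labeled]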
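As a rough sketch of the matrix notation, the layer below computes z = Wx + b and applies f element-wise, then compares the first entry with the written-out formula for a1. The 3x3 weights, biases, and inputs are hypothetical, and the helper name `layer` is not from the slides.

```python
import math

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer(W, x, b, f=logistic):
    """Compute a = f(Wx + b), with f applied element-wise."""
    z = [sum(W_ij * x_j for W_ij, x_j in zip(row, x)) + b_i
         for row, b_i in zip(W, b)]
    return [f(z_i) for z_i in z]

# Hypothetical weights (one row per neuron), biases, and inputs.
W = [[0.2, -0.4, 0.1],
     [0.7, 0.3, -0.5],
     [-0.6, 0.8, 0.9]]
b = [0.0, 0.1, -0.2]
x = [1.0, 2.0, 3.0]

a = layer(W, x, b)

# The first activation, written out term by term as on the slide.
a1 = logistic(W[0][0] * x[0] + W[0][1] * x[1] + W[0][2] * x[2] + b[0])
```

Each row of W plays the role of one logistic regression's weight vector, so the whole layer is the "several logistic regressions at the same time" view in one expression.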

Activation Function

[Figure: plots of logistic (“sigmoid”) and tanh]

tanh is just a rescaled and shifted sigmoid:

$\tanh(z) = 2\,\mathrm{logistic}(2z) - 1$
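The identity can be checked numerically. The sketch below compares `math.tanh` with the rescaled logistic at a few arbitrarily chosen points; the function name `tanh_via_logistic` is only illustrative.

```python
import math

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def tanh_via_logistic(z):
    """tanh written as a rescaled and shifted logistic function."""
    return 2.0 * logistic(2.0 * z) - 1.0

# Arbitrary test points; any real z should work.
points = [-3.0, -1.0, 0.0, 0.5, 2.0]
max_gap = max(abs(math.tanh(z) - tanh_via_logistic(z)) for z in points)
```

`max_gap` should be at the level of floating-point rounding error.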

Activation Function

[Figure: plots of hard tanh, softsign, and rectified linear (ReLU)]

• hard tanh is similar to tanh but computationally cheaper, and saturates hard.
• Glorot and Bengio, AISTATS 2011 discuss softsign and rectifier

$\mathrm{rect}(z) = \max(z, 0)$

$\mathrm{softsign}(z) = \frac{z}{1 + |z|}$
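A minimal sketch of the three activations follows. `rect` is the formula given on the slide; `softsign` uses the usual definition z / (1 + |z|); the slide gives no formula for hard tanh, so `hard_tanh` here assumes the common definition that clips z to the interval [-1, 1].

```python
def rect(z):
    """Rectified linear unit: max(z, 0)."""
    return max(z, 0.0)

def softsign(z):
    """Softsign: z / (1 + |z|), which approaches +/-1 slowly."""
    return z / (1.0 + abs(z))

def hard_tanh(z):
    """Assumed definition: z clipped to [-1, 1], so it saturates exactly."""
    return max(-1.0, min(1.0, z))

# Evaluate each activation on a few sample inputs.
samples = [-5.0, -0.5, 0.0, 0.5, 5.0]
table = [(z, rect(z), softsign(z), hard_tanh(z)) for z in samples]
```

None of these needs an exponential, which is the sense in which hard tanh is cheaper than tanh.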

Non-linearity

• Logistic (Softmax) Regression only gives linear decision boundaries

Non-linearity

• Neural networks can learn much more complex functions and nonlinear decision boundaries!

Non-linearity

Deep Neural Networks

$z = g(V\,g(Wx + b) + c)$, where $g(Wx + b)$ is the output of the first layer

With no nonlinearity:

$z = VWx + Vb + c$

Equivalent to $z = Ux + d$

[Figure: Input → Hidden Layer → Output]

Adopted from Chris Dyer
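To make the collapse concrete, here is a small sketch with g removed (the identity). It builds U = VW and d = Vb + c and checks that the single linear layer gives the same output as the two stacked ones. All matrices and vectors are hypothetical values, and the helper functions are written out in plain Python only for illustration.

```python
def matvec(M, v):
    """Matrix-vector product."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def matmul(A, B):
    """Matrix-matrix product."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def add(u, v):
    """Element-wise vector sum."""
    return [a + b for a, b in zip(u, v)]

# Hypothetical parameters: W is 3x2, V is 2x3.
W = [[1.0, 2.0], [0.0, -1.0], [3.0, 1.0]]
b = [0.5, -0.5, 1.0]
V = [[2.0, 0.0, 1.0], [-1.0, 1.0, 0.0]]
c = [0.0, 1.0]
x = [1.0, -2.0]

# Two layers with no nonlinearity: z = V(Wx + b) + c.
z_two_layers = add(matvec(V, add(matvec(W, x), b)), c)

# The equivalent single layer: z = Ux + d with U = VW, d = Vb + c.
U = matmul(V, W)
d = add(matvec(V, b), c)
z_one_layer = add(matvec(U, x), d)
```

Because the two results agree for every x, stacking linear layers adds no expressive power; the nonlinearity g is what prevents this collapse.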

Non-linearity

Deep Neural Networks

‣ Nodes in the hidden layer can learn interactions or conjunctions of features

$y = -2x_1 - x_2 + 2\tanh(x_1 + x_2)$

[Figure: Input → Hidden Layer → Output, with labels [good], [not], and “not OR good”]
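The function on this slide can be evaluated on the four binary input combinations. The sketch below assumes, as a reading of the figure labels that the extracted text does not confirm, that x1 is an indicator for the word "not" and x2 an indicator for the word "good".

```python
import math

def y(x1, x2):
    """y = -2*x1 - x2 + 2*tanh(x1 + x2), as on the slide."""
    return -2 * x1 - x2 + 2 * math.tanh(x1 + x2)

# Assumption: x1 = 1 if "not" is present, x2 = 1 if "good" is present.
outputs = {(x1, x2): y(x1, x2) for x1 in (0, 1) for x2 in (0, 1)}

for (x1, x2), value in outputs.items():
    print(f"not={x1} good={x2} y={value:+.3f}")
```

Under that assumption, "good" alone gives a positive output while "not" alone and "not good" both give negative outputs. The tanh term responds to either word being present, which is an interaction that a purely linear function of x1 and x2 cannot express.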

What about Word2vec (Skip-gram and CBOW)?

So, what about Word2vec (Skip-gram and CBOW)?

It is not deep learning, but a “shallow” neural network. It is, in fact, a log-linear model (softmax regression).

So it trains faster on larger datasets, which yields better embeddings.
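To illustrate the log-linear view, the sketch below scores every candidate context word by a dot product with the center word's vector and normalizes with a softmax, as in skip-gram with a full softmax. The three-word vocabulary and the 2-dimensional vectors are invented for the example, and practical word2vec training usually replaces the full softmax with hierarchical softmax or negative sampling.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical toy vocabulary with made-up 2-d embeddings.
vocab = ["good", "not", "movie"]
center_vecs = {"good": [0.9, 0.1], "not": [-0.8, 0.3], "movie": [0.2, 0.7]}
context_vecs = {"good": [0.5, -0.2], "not": [-0.4, 0.6], "movie": [0.3, 0.8]}

def context_distribution(center):
    """p(context | center): softmax over dot products, with no hidden nonlinearity."""
    v = center_vecs[center]
    scores = [sum(u_i * v_i for u_i, v_i in zip(context_vecs[w], v)) for w in vocab]
    return dict(zip(vocab, softmax(scores)))

p = context_distribution("good")
```

The scores are linear in the embeddings and only the final softmax is nonlinear, which is what makes the model log-linear rather than deep.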

Learning Neural Networks

[Equation labels, chain rule: change in output w.r.t. input = (change in output w.r.t. hidden) × (change in hidden w.r.t. input)]

‣ Computing these looks like running this network in reverse (backpropagation)

‣ I’ve omitted some details about how we get the gradients

[Figure: Input → Hidden Layer → Output]
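As a sketch of the chain rule behind backpropagation, consider a tiny scalar network chosen only for illustration (it is not the network on the slide): hidden h = tanh(w·x + b), output y = v·h + c. The three quantities named above are computed explicitly, with made-up parameter values.

```python
import math

def forward(x, w, b, v, c):
    """Scalar network: hidden h = tanh(w*x + b), output y = v*h + c."""
    h = math.tanh(w * x + b)
    y = v * h + c
    return h, y

def backward(x, w, b, v, c):
    """Chain rule, applied from the output back towards the input."""
    h, _ = forward(x, w, b, v, c)
    dy_dh = v                    # change in output w.r.t. hidden
    dh_dx = (1.0 - h * h) * w    # change in hidden w.r.t. input
    dy_dx = dy_dh * dh_dx        # change in output w.r.t. input
    return dy_dh, dh_dx, dy_dx

# Hypothetical parameter values and input.
params = dict(w=0.8, b=-0.3, v=1.5, c=0.2)
dy_dh, dh_dx, dy_dx = backward(0.5, **params)
```

The backward pass reuses the hidden value h from the forward pass and multiplies local derivatives in reverse order, which is the sense in which it looks like running the network backwards.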

Strategy for Successful NNs

• Select network structure appropriate for problem
  - Structure: Single words, fixed windows, sentence based, document level; bag of words, recursive vs. recurrent, CNN, …
  - Nonlinearity
• Check for implementation bugs with gradient checks
• Parameter initialization
• Optimization tricks
• Check if the model is powerful enough to overfit
  - If not, change model structure or make model “larger”
  - If you can overfit: regularize
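A gradient check compares a hand-derived gradient against a finite-difference estimate. The sketch below does this for the negative log-likelihood of a single example under logistic regression; the model choice, the example values, and the step size `eps` are assumptions made for illustration, not a procedure given in the slides.

```python
import math

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss(w, x, y):
    """Negative log-likelihood of one example (x, y) under logistic regression."""
    p = logistic(sum(w_i * x_i for w_i, x_i in zip(w, x)))
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def analytic_grad(w, x, y):
    """Hand-derived gradient of the loss: (p - y) * x."""
    p = logistic(sum(w_i * x_i for w_i, x_i in zip(w, x)))
    return [(p - y) * x_i for x_i in x]

def numeric_grad(f, w, eps=1e-5):
    """Central finite differences, perturbing one parameter at a time."""
    grad = []
    for i in range(len(w)):
        w_plus = w[:i] + [w[i] + eps] + w[i + 1:]
        w_minus = w[:i] + [w[i] - eps] + w[i + 1:]
        grad.append((f(w_plus) - f(w_minus)) / (2 * eps))
    return grad

# Hypothetical parameters and a single training example.
w = [0.3, -0.7, 1.2]
x = [1.0, 2.0, -1.0]
y = 1

g_analytic = analytic_grad(w, x, y)
g_numeric = numeric_grad(lambda w_: loss(w_, x, y), w)
max_diff = max(abs(a - n) for a, n in zip(g_analytic, g_numeric))
```

A large `max_diff` would point to a bug in the analytic gradient (or in the loss); a very small one suggests the two agree.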

Neural Machine Translation

Recurrent Neural Network (RNN)

Source: Colah’s Blog

Long Short-Term Memory Networks (LSTM)

Source: Colah’s Blog

(Hochreiter & Schmidhuber, 1997)

Gated Recurrent Unit (GRU)

(Cho et al., 2014)

Source: Colah’s Blog

Twitter Conversation Data

Neural Conversation

Source: Google’s blog

Neural Conversation

Source: A Persona-Based Neural Conversation Model, Li et al. (ACL 2016)

modeling speakers

Neural Network Toolkits

★ PyTorch: http://pytorch.org/
  - Facebook AI Research and many others
• TensorFlow: https://www.tensorflow.org/
  - By Google, actively maintained, bindings for many languages
• DyNet: https://github.com/clab/dynet
  - CMU and other individual researchers, dynamic structures that change for every training instance
• Caffe: http://caffe.berkeleyvision.org/
  - UC Berkeley, for vision
• Theano: http://deeplearning.net/software/theano
  - University of Montreal, less and less maintained

Happy Turkey Day!

Source: http://www.butterball.com/how-tos/stuffing-vs-dressing

Instructor: Wei Xu
www.cis.upenn.edu/~xwe/

Course Website: socialmedia-class.org