Escaping the local minimum By Kenny Friedman MAS.S63 Integrative Theories of Mind and Cognition May 9, 2016

Agenda

Background / The Problem

Vision

3 Examples

Contributions

For each example:

Define the Problem

Classic Method

Potential Modern Method


Background

Electrical Engineering & Computer Science

Undergraduate, Junior

Taking 2 classes in AI this semester:

MAS.S63 & 6.803

[Figure: papers read, on a timeline from 1950 to 2016, with labels 6.803 and MAS.S63]


The Vision

AI has been performing gradient descent for decades

Then the field gets caught in a local minimum

Rarely looks around to see if it is stuck

Take methods from recent successes

Apply them to classic models & problems

Neural networks (NNs) as a substrate for all previous work in the field
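
The gradient descent metaphor in the vision above can be made concrete with a small numeric sketch. The toy loss `f`, the step size, and the list of restart points are all assumptions chosen for illustration; none of them come from the slides. Plain descent from one starting point settles in the nearer, shallower basin, while "looking around" (here, restarting from several points and keeping the best result) finds the deeper one.

```python
def f(x):
    # Hypothetical toy loss with two basins:
    # a shallow minimum near x = 1.13 and a deeper one near x = -1.30.
    return x**4 - 3 * x**2 + x


def grad_f(x):
    # Derivative of f.
    return 4 * x**3 - 6 * x + 1


def gradient_descent(x0, lr=0.01, steps=2000):
    # Follow the negative gradient from x0; no attempt to leave a basin.
    x = x0
    for _ in range(steps):
        x -= lr * grad_f(x)
    return x


def descend_with_restarts(starts, lr=0.01, steps=2000):
    # Run descent from every start and keep the lowest-loss result.
    candidates = [gradient_descent(x0, lr, steps) for x0 in starts]
    return min(candidates, key=f)


stuck = gradient_descent(2.0)
best = descend_with_restarts([-2.0, -1.0, 0.0, 1.0, 2.0])
print(f"single run: x = {stuck:.3f}, loss = {f(stuck):.3f}")
print(f"restarts:   x = {best:.3f}, loss = {f(best):.3f}")
```

Restarts are only one of several ways to escape a basin; the point of the sketch is that a single descent run has no way of noticing it is stuck.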


1960s

SAINT (Symbolic automatic integrator)

ELIZA (chatterbot)

SHRDLU (NLP, Terry Winograd)

Gradient Descent

"Within a generation ... the problem of creating 'artificial intelligence' will substantially be solved." Minsky, 1967

"In from three to eight years we will have a machine with the general intelligence of an average human being." Minsky, 1970


AI Winter Strikes


1980s

Expert Systems

Knowledge Based Systems (Cyc)

“Intelligent Agents”

Gradient Descent


AI Winter Strikes


2010s

Deep Learning

Statistical Models

Voice Recognition Paired with NLP

Gradient Descent


Future?

A Neural Net would predict a third winter any moment

Escape the current local minimum (of superhuman perception)

Can neural networks become the substrate of all AI systems? (What If… Poggio)


Three Examples

1. Generic Symbol Manipulation / Logic Systems

2. Minsky’s Multiplicity

3. Language / Story Understanding


Generic Symbol Manipulation / Logic Systems

Symbolic Integral Calculus (1960)

Recent Examples of Progress: NTM (Neural Turing Machine)

Logic as vectors of real numbers
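
One way to read "logic as vectors of real numbers" is to let each truth value be a number in [0, 1] and replace the Boolean connectives with smooth arithmetic. The sketch below assumes the product form of AND and the matching probabilistic-sum form of OR; this is one common choice, not necessarily the one the talk had in mind, and the function names are hypothetical.

```python
def soft_not(a):
    # Element-wise negation: 1 - a.
    return [1.0 - x for x in a]


def soft_and(a, b):
    # Element-wise product; equals Boolean AND when inputs are exactly 0 or 1.
    return [x * y for x, y in zip(a, b)]


def soft_or(a, b):
    # Probabilistic sum; equals Boolean OR when inputs are exactly 0 or 1.
    return [x + y - x * y for x, y in zip(a, b)]


def soft_implies(a, b):
    # a -> b rewritten as (not a) or b.
    return soft_or(soft_not(a), b)


# The four rows of a truth table, stored as two vectors.
a = [0.0, 0.0, 1.0, 1.0]
b = [0.0, 1.0, 0.0, 1.0]
print("and    :", soft_and(a, b))
print("or     :", soft_or(a, b))
print("implies:", soft_implies(a, b))

# Graded inputs give graded outputs, which is what makes the
# operations usable inside a model trained by gradient descent.
print("graded and:", soft_and([0.9], [0.8]))
```

On crisp 0/1 inputs these functions reproduce the ordinary truth tables; between 0 and 1 they vary smoothly, so gradients can flow through them.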


Minsky’s Multiplicity

1960: 5 Areas: Search, Pattern-Recognition, Learning, Planning, and Induction

2006

Architectures: Leabra, MicroPsi.

How much can emerge solely from NN?


Language / Story Understanding

Chomsky, Winston.

Merge operator.

Inner Language Hypothesis, Strong Story Hypothesis

reflection, cultural bias, personality understanding, question answering, onset detection, trouble anticipation, similarity measurement, similarity-based retrieval, question-driven interpretation, analogical interpretation, reader-aware storytelling, persuasion, and summary.

Word2Vec and RNNs show promise, but remain distant from these capabilities
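
The Merge operator named on this slide is usually described as a binary operation that combines two syntactic objects into an unordered set, and can be reapplied to its own output to build hierarchy. The slide does not define it, so the sketch below rests on that assumed reading; the function names and the example words are hypothetical.

```python
def merge(a, b):
    # Combine two syntactic objects into the unordered set {a, b}.
    return frozenset((a, b))


def depth(obj):
    # Nesting depth: 0 for a lexical item, 1 + deepest member for a merged set.
    if isinstance(obj, frozenset):
        return 1 + max(depth(member) for member in obj)
    return 0


# Build a small hierarchy by applying merge to its own output.
noun_phrase = merge("the", "story")
verb_phrase = merge("understood", noun_phrase)
clause = merge("she", verb_phrase)

print(depth(noun_phrase))  # one application of merge
print(depth(clause))       # three nested applications
```

The result is a nested structure with no inherent left-to-right order, which is the kind of symbolic object a vector-based model would need to represent to cover this example.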


Contributions

Reviewed AI’s tendency to fall into a local minimum

Provided three examples of prior systems & models that cannot be done with deep learning

Suggested that research should focus on enabling deep learning to prove that neural nets can be the right level of abstraction


Thanks
