Action-Perception-Learning Cycles 2012 Fall Graduate Course
Byoung-Tak Zhang Department of Computer Science and Engineering
& Cognitive Science and Brain Science Programs Seoul National
University http://bi.snu.ac.kr/
Slide 2
What is a Learning System?
Learning is the improvement of performance in some environment through the acquisition of knowledge resulting from experience in that environment.
Alternatively, learning is the improvement of behavior on some performance task through the acquisition of knowledge based on partial task experience.
2012 (c) SNU Biointelligence Lab, http://bi.snu.ac.kr/
Slide 3
Machine Learning: An Example
[Figure: feedforward neural network with an input layer (inputs x1, x2, x3), a hidden layer, and an output layer producing one output. Labels: Weights, Activation Function, Scaling Function, Information Propagation, Output Comparison, Error Backpropagation]
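The network on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the course's own code. It assumes sigmoid activations, a squared-error loss, no bias terms, four hidden units, and a single made-up training example; none of these choices come from the slide.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w_hidden, w_out):
    """Information propagation: inputs -> hidden layer -> single output."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    output = sigmoid(sum(w * h for w, h in zip(w_out, hidden)))
    return hidden, output

def train_step(x, target, w_hidden, w_out, lr=0.5):
    """One step of error backpropagation for the loss 0.5 * (output - target)^2."""
    hidden, output = forward(x, w_hidden, w_out)
    # Output comparison: error signal at the output unit.
    delta_out = (output - target) * output * (1.0 - output)
    # Backpropagate the error to each hidden unit (uses the pre-update weights).
    delta_hidden = [delta_out * w_out[j] * hidden[j] * (1.0 - hidden[j])
                    for j in range(len(hidden))]
    # Gradient-descent weight updates.
    for j in range(len(hidden)):
        w_out[j] -= lr * delta_out * hidden[j]
        for i in range(len(x)):
            w_hidden[j][i] -= lr * delta_hidden[j] * x[i]
    return 0.5 * (output - target) ** 2

random.seed(0)
n_inputs, n_hidden = 3, 4  # assumed sizes: three inputs as drawn, four hidden units
w_hidden = [[random.uniform(-1, 1) for _ in range(n_inputs)] for _ in range(n_hidden)]
w_out = [random.uniform(-1, 1) for _ in range(n_hidden)]

x, target = [1.0, 0.0, 1.0], 1.0  # hypothetical training example
losses = [train_step(x, target, w_hidden, w_out) for _ in range(200)]
print(losses[0], losses[-1])  # the loss should shrink over the 200 steps
```

Repeating `train_step` on the same example drives the output toward the target; a real application would cycle over a whole training set.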
Slide 4
Application Example: Autonomous Land Vehicle (ALV)
ALVINN System: a neural network learns to steer an autonomous vehicle.
960 input units, 4 hidden units, 30 output units
Driving at speeds up to 70 miles per hour
Slide 5
Google Self-Driving Car
DARPA Grand Challenge (2005)
DARPA Urban Challenge (2007)
Google Self-Driving Car (2009)
Slide 6
Machine Learning (ML): Three Tasks
Supervised Learning
  Estimate an unknown mapping from known input and target output pairs
  Learn f_w from training set D = {(x, y)}
  Classification: y is discrete
  Regression: y is continuous
Unsupervised Learning
  Only input values are provided
  Learn f_w from D = {(x)}
  Compression
  Clustering
Reinforcement Learning
  No targets are given; rewards (critiques) are provided sequentially
  Learn a heuristic function f_w from D_t = {(s_t, a_t, r_t) | t = 1, 2, ...}
  Action selection
  Policy learning
Zhang, B.-T., Next-Generation Machine Learning Technologies, Communications of KIISE, 25(3), 2007
From Machine Learning to Brain-Like Cognitive Learning
Slide 9
Machine Learning vs. Human Learning
Machine Learning
  Clear separation of learning and inference
  Examples are assumed to be statistically independent
  Mainly numerical, quantitative change
  One-shot learning is difficult
  Requires uniquely labeled examples (supervised classification)
  Good at discrimination and classification (discriminative)
Human Learning
  Learning and inference interleaved
  Previous learning affects the next learning (dynamic)
  Relational, qualitative change possible
  One-shot learning is frequent
  Learns from unlabeled or self-labeled examples (self-supervised)
  Can generate prototypes and instances (generative)
Humans and Computers
[Figure labels: Current Computers, What Kind of Computers?, Human Computers, The Entire Problem Space]
Slide 12
Cognitive Systems
[Figure labels: Openness, Perception, Action, Cognitive System, Cognitive Computing, Real-Time Dynamics, Multisensory Integration, Sequential Generation]
Cognitive Systems Require Cognitive Computing or Cognitive Information Processing
Zhang, B.-T., Communications of KIISE, 30(1):75-111, 2012
Slide 13
TU Munich Rosie the Cognitive Robot
Slide 14
Apple Siri Personal Assistant
Slide 15
Toward Human-Level Computational Intelligence: A Perspective of the SNU Biointelligence Lab
Q1: What capability is fundamentally missing for achieving human-level computational intelligence?
A1: Human-level machine learning that enables rapid, flexible, and robust decisions and actions in dynamic and uncertain environments.
Q2: What aspect is the most essential to study human-level machine learning?
A2: Lifelong learning with perception-action cycles, i.e. the circular flow of information that takes place between the organism and its environment in the course of a sensory-guided sequence of behavior towards a goal (Fuster, 2004).
Q3: What capabilities are required for lifelong learning in perception-action cycle systems?
A3:
  Dynamic, incremental, online, and predictive learning
  Flexible representation and fast reorganization
  Multisensory integration, sensorimotor imagery, and sequential decision making
  Active, selective attention
  Balancing exploration and exploitation
  Self-awareness, motivation, self-sustainability
Slide 16
Course Introduction
  From machine learning to brain-like cognitive learning
  Brain as a physical, thermodynamic computer
  Perception-action cycles and Carnot cycles
  Models of action-perception-learning cycles
Slide 17
Brain as a Physical, Thermodynamic Computer
Slide 18
The brain is an open, dissipative system operating far from thermodynamic equilibrium.
It must exchange energy and matter with its environment to maintain stability.
It can be excited internally, by chemical (enzymes) and electrical (action potentials) means, as well as externally.
It continuously senses the external and internal worlds.
It continuously acts on the external and internal worlds.
Slide 19
Mapping the World
Free Energy and the Perception-Action Cycle
[Friston, Trends in Cognitive Sciences, 2009]
Slide 39
Reinforcement Learning and the Perception-Action Cycles
[Tishby & Polani, 2010]
= (information-to-go) (value-to-go)
Slide 40
Brain Mechanisms for the Perception-Action-Learning Cycle
Slide 41
Brain Computation: Speed, Flexibility, Robustness
How can brain computation be so fast, flexible, and robust in a changing environment?
Fast
  Object recognition: within 100 ms
  Anomaly detection: N400, P600
  Instant decision-making
Flexible
  Invariant to shift, scale, and rotation
  Various utterances for the same meaning
  Art, music, literature, and dancing
Robust
  Cluttered image
  Noisy speech
  Intention reading under complex situations
What brain mechanisms for information processing and organization allow this?
Slide 42
Language Processing in the Brain
N400: a brain wave related to linguistic processes. It increases when a word is semantically mismatched with its context.
Fig. 9.30: ERP waveforms differentiate between congruent words at the end of sentences ("work") and anomalous last words that do not fit the semantic specifications of the preceding context ("socks").
Slide 43
Syntactic Processing in the Brain
LAN (left anterior negativity): a negative wave over the left frontal areas when words violate the required word category in a sentence (syntactic violation), e.g. "the red eats", "he mow".
[Figure: ERPs related to semantic and syntactic processing. Panels: Semantic, Syntactic]
Population Coding (Representation)
  Rate Coding
  Gain Coding
Slide 48
Probabilistic Inference with Population Codes
[Knill and Pouget, Trends in Neurosciences, 2004]
Slide 49
Dynamics in Sensory Cue Integration
[Deneve et al., Nature Neuroscience, 2001, from Knill and Pouget, Trends in Neurosciences, 2004]
Slide 50
Models of Perception-Action-Learning Cycles
Slide 51
Markov Models (Markov Chains)
  First-order Markov Model (Markov Chain)
  Second-order Markov Model
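In a first-order Markov chain, each state depends only on the state immediately before it, so the probability of a sequence is the initial-state probability multiplied by one transition probability per step. The sketch below illustrates that factorization. The two weather states and all the numbers are made up for illustration and do not come from the slides.

```python
def sequence_probability(states, initial, transition):
    """Probability of a state sequence under a first-order Markov chain:
    p(x_1) * product over t of p(x_t | x_{t-1})."""
    p = initial[states[0]]
    for prev, cur in zip(states, states[1:]):
        p *= transition[prev][cur]
    return p

# Hypothetical two-state chain.
initial = {"sunny": 0.6, "rainy": 0.4}
transition = {
    "sunny": {"sunny": 0.8, "rainy": 0.2},
    "rainy": {"sunny": 0.3, "rainy": 0.7},
}

p = sequence_probability(["sunny", "sunny", "rainy"], initial, transition)
print(p)  # 0.6 * 0.8 * 0.2, up to floating-point rounding
```

A second-order model differs only in that each transition table is indexed by the two preceding states instead of one.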
Filtering / Tracking
We want to track the unknown state x of a system as it evolves over time, based on the (noisy) observations y that arrive sequentially.
[Figure: state-space model with states x_{t-1}, x_t, x_{t+1} and observations y_{t-1}, y_t, y_{t+1}. Transition: p(x_t | x_{t-1}). Observation: p(y_t | x_t)]
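For a discrete state, filtering reduces to a two-stage recursion: predict the next state with the transition model p(x_t | x_{t-1}), then reweight by the observation model p(y_t | x_t) and normalize. The following is a minimal sketch of that recursion. The state names, observation symbols, and probabilities are hypothetical.

```python
def filter_step(belief, observation, transition, emission):
    """One step of discrete Bayesian filtering.
    belief: dict state -> p(x_{t-1} | y_1..y_{t-1})
    returns: dict state -> p(x_t | y_1..y_t)
    """
    states = list(belief)
    # Predict: push the belief through the transition model p(x_t | x_{t-1}).
    predicted = {s: sum(belief[r] * transition[r][s] for r in states) for s in states}
    # Update: weight by the observation model p(y_t | x_t), then normalize.
    unnormalized = {s: emission[s][observation] * predicted[s] for s in states}
    z = sum(unnormalized.values())
    return {s: v / z for s, v in unnormalized.items()}

# Hypothetical two-state system with two observation symbols.
transition = {"A": {"A": 0.9, "B": 0.1}, "B": {"A": 0.1, "B": 0.9}}
emission = {"A": {"a": 0.8, "b": 0.2}, "B": {"a": 0.2, "b": 0.8}}

belief = {"A": 0.5, "B": 0.5}  # uniform prior over the initial state
history = []
for obs in ["a", "a", "b"]:
    belief = filter_step(belief, obs, transition, emission)
    history.append(belief)
    print(obs, belief)
```

Each call uses only the previous belief and the newest observation, which is what allows the tracker to run online as observations arrive.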
Slide 54
Linear Dynamical Systems (Kalman Filters)
Slide 55
Kalman Filter
Process to be estimated:
  $y_k = A y_{k-1} + B u_k + w_{k-1}$
  $z_k = H y_k + v_k$
Process noise $w$ with covariance $Q$; measurement noise $v$ with covariance $R$.
Predicted: $\hat{y}^-_k$ is the estimate based on measurements at previous time steps.
  $\hat{y}^-_k = A \hat{y}_{k-1} + B u_k$
  $P^-_k = A P_{k-1} A^T + Q$
Corrected: $\hat{y}_k$ has additional information, the measurement at time $k$.
  $K = P^-_k H^T (H P^-_k H^T + R)^{-1}$
  $\hat{y}_k = \hat{y}^-_k + K (z_k - H \hat{y}^-_k)$
  $P_k = (I - K H) P^-_k$
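The predict and correct steps on this slide can be tried out in the scalar case, where A, B, H, Q, R, P, and K are all plain numbers. This is a sketch under assumed settings: the default parameter values, the constant true state of 1.0, and the simulated measurements are illustrative choices, not values from the course.

```python
import math
import random

def kalman_step(y_prev, P_prev, z, A=1.0, B=0.0, u=0.0, H=1.0, Q=1e-4, R=0.1):
    """One predict/correct cycle of a scalar Kalman filter."""
    # Predict: a priori estimate and its error covariance.
    y_pred = A * y_prev + B * u
    P_pred = A * P_prev * A + Q
    # Correct: Kalman gain, then blend in the measurement z.
    K = P_pred * H / (H * P_pred * H + R)
    y_new = y_pred + K * (z - H * y_pred)
    P_new = (1.0 - K * H) * P_pred
    return y_new, P_new

random.seed(1)
true_state = 1.0          # assumed constant for this illustration
y_hat, P = 0.0, 1.0       # deliberately poor initial guess, large uncertainty
for _ in range(50):
    z = true_state + random.gauss(0.0, math.sqrt(0.1))  # noisy measurement, variance R
    y_hat, P = kalman_step(y_hat, P, z)
print(y_hat, P)
```

The gain K starts near 1, so early measurements are trusted heavily, and shrinks as P falls, so later measurements only nudge the estimate.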
Slide 56
Filtering
  Discrete x
  Continuous x
[Barber et al., 2011]
Slide 57
Smoothing
  Parallel Smoothing
  Sequential Smoothing
  Discrete x
  Continuous x
[Barber et al., 2011]
Sequential Importance Resampling or Particle Filter
[Barber et al., 2011]
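A sequential importance resampling (SIR) filter approximates the filtering distribution with a set of samples: propagate each particle through the transition model, weight it by the likelihood of the new observation, and resample in proportion to the weights. The sketch below assumes a one-dimensional random-walk state with Gaussian observation noise. The model, the noise levels, and the particle count are illustrative assumptions, not taken from Barber et al.

```python
import math
import random

def gaussian_pdf(x, mean, std):
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))

def sir_step(particles, observation, process_std, obs_std):
    """One SIR update for an assumed random-walk state with Gaussian noise."""
    # 1. Propagate: sample each particle from the transition model.
    proposed = [p + random.gauss(0.0, process_std) for p in particles]
    # 2. Weight: likelihood of the observation under each proposed particle.
    weights = [gaussian_pdf(observation, p, obs_std) for p in proposed]
    if sum(weights) <= 0.0:
        # Degenerate case (all likelihoods underflow): fall back to uniform weights.
        weights = [1.0] * len(proposed)
    # 3. Resample with replacement in proportion to the weights.
    return random.choices(proposed, weights=weights, k=len(proposed))

random.seed(0)
n_particles = 500
process_std, obs_std = 0.2, 0.5
true_x = 0.0
particles = [random.gauss(0.0, 1.0) for _ in range(n_particles)]  # samples from a broad prior
for _ in range(30):
    true_x += random.gauss(0.0, process_std)    # simulate the hidden state
    y = true_x + random.gauss(0.0, obs_std)     # simulate a noisy observation
    particles = sir_step(particles, y, process_std, obs_std)
estimate = sum(particles) / n_particles         # posterior mean estimate
print(true_x, estimate)
```

Unlike the Kalman filter, nothing here relies on linearity or Gaussianity; swapping in a different transition sampler or likelihood function changes the model without changing the algorithm.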
Slide 61
Example: PF with N=4
[Barber et al., 2011]
Slide 62
Course Overview Action-Perception-Learning Cycles
Slide 63
Course Description How can the brain learn so fast, flexibly,
and robustly? What representational mechanisms and organizational
principles does the brain use? How can we apply these principles to
constructing intelligent cognitive machines that learn like humans?
To address these questions, it is important to observe that the
brain is embodied with sensors and actuators, and interacts with
its environment in a continuous perception-action cycle. Living in
a dynamic environment under uncertainty requires the brain to learn
moment by moment in real time and incrementally in this continuous,
rapid perception-action cycle. In this course we review recent
experimental and theoretical work on perception-action cycles and
neural coding principles in the brain. We also study mathematical
tools developed in information theory, control theory, and Bayesian
statistics that may be useful to model the biological information
processing in the brain. The goal is to develop computational
models of sequential learning processes, i.e.
action-perception-learning cycle machines, that enable rapid,
continuous, and reliable action and decision-making in a changing
environment over an extended period of time or an entire lifetime.
Slide 64
Plan
Part I: Neurocognitive Models
  Cortical Models
  Language Models
  Thermodynamic Models
  Free Energy Models
  Decision-Theoretic Models
  Information-Theoretic Models
  Exam 1: Thursday, Oct. 18, 2012
Part II: Computational Models
  Markov Models
  Dynamical Systems
  Kalman Filters
  Probabilistic Population Codes
  Particle Filters
  Exam 2: Thursday, Nov. 29, 2012