Information Geometry and Neural Networks
Shun-ichi Amari, RIKEN Brain Science Institute
Orthogonal decomposition of rates and (higher-order) correlations
Synchronous firing and higher correlations
Algebraic singularities caused by multiple stimuli
Dynamics of learning in multilayer perceptrons
Information Geometry
Related fields: Systems Theory, Information Theory, Statistics, Neural Networks, Combinatorics, Physics, Information Sciences, Math., AI
Riemannian Manifold, Dual Affine Connections
Manifold of Probability Distributions
Information Geometry ?
$S = \{p(x; \mu, \sigma)\}$, $p(x; \mu, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left\{-\frac{(x - \mu)^2}{2\sigma^2}\right\}$
$S = \{p(x; \boldsymbol{\theta})\}$, $\boldsymbol{\theta} = (\mu, \sigma)$
Riemannian metric
Dual affine connections
Manifold of Probability Distributions
$x = 1, 2, 3$, $\{p(x)\}$
$\boldsymbol{p} = (p_1, p_2, p_3)$, $p_1 + p_2 + p_3 = 1$
$M = \{p(x; \boldsymbol{\theta})\}$
[Figure: probability simplex with axes $p_1$, $p_2$, $p_3$ and a point $p$]
Two Structures
Riemannian metric and affine connection
$ds^2 = \sum g_{ij}\, d\theta^i d\theta^j$
$D[p:q] = E_p\left[\log \frac{p(x)}{q(x)}\right]$
$D[p(x, \theta) : p(x, \theta + d\theta)] = \frac{1}{2}\, ds^2$
Fisher information
$g_{ij} = E[\partial_i \log p\; \partial_j \log p]$
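The relation between the divergence and the metric can be checked numerically. The sketch below assumes the one-dimensional Gaussian family with coordinates (mu, sigma), for which the Fisher matrix is diag(1/sigma^2, 2/sigma^2); the function names are illustrative and not from the slides.

```python
import math

def kl_gauss(mu1, s1, mu2, s2):
    """KL divergence D[p:q] for p = N(mu1, s1^2) and q = N(mu2, s2^2)."""
    return math.log(s2 / s1) + (s1**2 + (mu1 - mu2)**2) / (2 * s2**2) - 0.5

def fisher_quadratic(s, dmu, ds):
    """ds^2 = sum g_ij dtheta_i dtheta_j for theta = (mu, sigma).

    Assumes the Gaussian Fisher matrix G = diag(1/sigma^2, 2/sigma^2).
    """
    return (dmu**2 + 2 * ds**2) / s**2

mu, sigma = 0.0, 2.0
dmu, dsigma = 1e-3, 1e-3

# Divergence between two nearby members of the family.
d = kl_gauss(mu, sigma, mu + dmu, sigma + dsigma)
# Half the squared Riemannian length of the same small step.
half_ds2 = 0.5 * fisher_quadratic(sigma, dmu, dsigma)

ratio = d / half_ds2  # approaches 1 as the step shrinks
```

For larger steps the ratio drifts away from 1, since the quadratic form only captures the leading term of the divergence.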
Riemannian Structure
$ds^2 = \sum g_{ij}(\theta)\, d\theta^i d\theta^j = d\theta^T G(\theta)\, d\theta$
$G(\theta) = (g_{ij}(\theta))$
Euclidean: $G = E$ (identity matrix)
Affine Connection
covariant derivative: $\nabla_X Y$
geodesic: $\nabla_{\dot{X}} \dot{X} = c(t)\, \dot{X}$, $X = X(t)$
$s = \int \sqrt{g_{ij}\, d\theta^i d\theta^j}$
minimal distance
straight line
$S = \{p(x_1, x_2)\}$, $x_1, x_2 = 0, 1$
$M = \{q(x_1)\, q(x_2)\}$
Independent Distributions
Neural Firing
$x_1, x_2, x_3, \ldots, x_n$
higher-order correlations
orthogonal decomposition
$p(\boldsymbol{x}) = p(x_1, x_2, \ldots, x_n)$
$\eta_i = E[x_i]$ ---- firing rate
$v_{ij} = \mathrm{Cov}[x_i, x_j]$ ---- covariance
Information Geometry of Higher-Order Correlations ---- orthogonal decomposition
Riemannian metric
dual affine connections
Pythagoras theorem
Dual geodesics
$S = \{p(\boldsymbol{x}, \boldsymbol{\theta})\}$
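Correlations of Neural Firing
$p(x_1, x_2)$: $\{p_{00}, p_{10}, p_{01}, p_{11}\}$
$\eta_1 = p_{1\cdot}$, $\eta_2 = p_{\cdot 1}$
$\theta = \log \frac{p_{11}\, p_{00}}{p_{10}\, p_{01}}$
$\{(\eta_1, \eta_2), \theta\}$: orthogonal coordinates
firing rates: $(\eta_1, \eta_2)$; correlations: $\theta$
[Figure: two neurons $x_1$, $x_2$]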
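As a small illustration of these coordinates, the sketch below computes the two firing rates and the log-odds-ratio correlation term from a 2x2 joint table. The function name and the index convention (p_ab means x1 = a, x2 = b) are assumptions made for this example.

```python
import math

def mixed_coordinates(p00, p01, p10, p11):
    """Return (eta1, eta2, theta) for a joint distribution of two binary neurons.

    Assumed convention: p_ab = Pr{x1 = a, x2 = b}.
    """
    eta1 = p10 + p11  # firing rate of neuron 1, E[x1]
    eta2 = p01 + p11  # firing rate of neuron 2, E[x2]
    theta = math.log((p11 * p00) / (p10 * p01))  # log odds ratio
    return eta1, eta2, theta

# Positively correlated pair: both rates are 0.5 and theta = log 16 > 0.
correlated = mixed_coordinates(0.4, 0.1, 0.1, 0.4)

# Independent pair with rates 0.3 and 0.6: theta vanishes.
independent = mixed_coordinates(0.7 * 0.4, 0.7 * 0.6, 0.3 * 0.4, 0.3 * 0.6)
```

Changing the firing rates while keeping the odds ratio fixed leaves theta unchanged, which is the sense in which the correlation coordinate is separated from the rates.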
[Figure: spike trains of neurons $x_1$, $x_2$, $x_3$]
firing rates; correlation: covariance?
$\{p_{00}, p_{01}, p_{10}, p_{11}\}$
$\{\eta_1, \eta_2; \theta_{12}\}$
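Pythagoras Theorem
[Figure: right triangle with vertices $p$, $q$, $r$; $q$ and $r$ lie on the independent distributions]
$D[p:r] = D[p:q] + D[q:r]$
$p, q$: same marginals $\eta_1, \eta_2$
$r, q$: same correlations $\theta$
$D[p:r] = \sum_x p(x) \log \frac{p(x)}{r(x)}$
estimation, correlation testing
invariant under firing rates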
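The decomposition can be verified numerically for two binary neurons. In this sketch q is the independent distribution with the same marginals as p, and r is another independent distribution; the particular numbers and helper names are illustrative assumptions.

```python
import math
from itertools import product

def kl(p, r):
    """KL divergence D[p:r] = sum_x p(x) log(p(x) / r(x))."""
    return sum(p[x] * math.log(p[x] / r[x]) for x in p)

def independent(rate1, rate2):
    """Product distribution q(x1) q(x2) with the given firing rates."""
    probs = {}
    for x1, x2 in product((0, 1), repeat=2):
        m1 = rate1 if x1 else 1.0 - rate1
        m2 = rate2 if x2 else 1.0 - rate2
        probs[(x1, x2)] = m1 * m2
    return probs

# A correlated joint distribution p(x1, x2).
p = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

# q: same marginals as p, no correlation.
eta1 = p[(1, 0)] + p[(1, 1)]
eta2 = p[(0, 1)] + p[(1, 1)]
q = independent(eta1, eta2)

# r: no correlation (same theta = 0 as q), different firing rates.
r = independent(0.2, 0.7)

gap = kl(p, r) - (kl(p, q) + kl(q, r))  # zero up to rounding
```

Here D[p:q] measures the contribution of the correlation and D[q:r] the contribution of the firing rates.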
[Figure: spike trains of neurons $x_1$, $x_2$, $x_3$]
No pairwise correlations, Triplewise correlation
$p(x_1, x_2, x_3) \neq p(x_1)\, p(x_2)\, p(x_3)$
$p(x_1, x_2) = p(x_1)\, p(x_2)$
Pythagoras Decomposition of KL Divergence
$p(\boldsymbol{x})$
$p_{\text{pairwise corr}}(\boldsymbol{x})$: only pairwise
$p_{\text{ind}}(\boldsymbol{x})$: independent
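Higher-Order Correlations
$p(\boldsymbol{x}) = \exp\left\{\sum \theta_i x_i + \sum \theta_{ij} x_i x_j + \sum \theta_{ijk} x_i x_j x_k + \cdots - \psi\right\}$
$\boldsymbol{x} = (x_1, x_2, \ldots, x_n)$
$M_0$, $M_1$
$\eta_i = E[x_i]$, $\eta_{ij} = E[x_i x_j]$
$\boldsymbol{\theta} = (\theta_i, \theta_{ij}, \theta_{ijk}, \ldots)$
$\boldsymbol{\eta} = (\eta_i, \eta_{ij}, \eta_{ijk}, \ldots)$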
Synfiring and Higher-Order Correlations
Amari, Nakahara, Wu, Sakai
Neurons: $x_1, x_2, \ldots, x_n$
$x_i = \mathbf{1}(u_i)$
Gaussian $u_i$: $E[u_i u_j] = \alpha$
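Population and Synfire
[Figure: common inputs $s$ feeding neurons $x_1, \ldots, x_n$]
$u_i = \sum_j w_{ij} s_j - h$, $x_i = \mathbf{1}(u_i)$
$u_i = \sqrt{1 - \alpha}\, v_i + \sqrt{\alpha}\, \varepsilon - h$
$v_i, \varepsilon \sim N(0, 1)$
$\mathrm{Cov}[u_i, u_j] = \alpha$, $\mathrm{Var}[u_i] = 1$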
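$p_i = \mathrm{Prob}\{i \text{ neurons fire at the same time}\}$
$p_i = {}_n C_i\, E_\varepsilon\left[F^i (1 - F)^{n - i}\right]$
$F = \Pr\{x_i = 1 \mid \varepsilon\} = \Pr\{u_i > 0 \mid \varepsilon\} = \Pr\left\{v_i > \frac{h - \sqrt{\alpha}\, \varepsilon}{\sqrt{1 - \alpha}}\right\}$
$r = \frac{i}{n}$, $P_r = \Pr\{nr \text{ neurons fire}\}$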
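The distribution of the number of simultaneously firing neurons can be sketched numerically. The code assumes the common-input Gaussian model u_i = sqrt(1 - alpha) v_i + sqrt(alpha) eps - h with v_i and eps standard normal, so that neurons are independent given eps; the function names, the trapezoid integration over eps, and the parameter values are choices made for this example only.

```python
import math

def normal_cdf(t):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def firing_prob_given_eps(eps, alpha, h):
    """F = Pr{u_i > 0 | eps} under the assumed model for u_i."""
    threshold = (h - math.sqrt(alpha) * eps) / math.sqrt(1.0 - alpha)
    return 1.0 - normal_cdf(threshold)

def count_distribution(n, alpha, h, lo=-8.0, hi=8.0, steps=3200):
    """p_i = C(n, i) E_eps[F^i (1 - F)^(n - i)] for i = 0..n.

    The expectation over eps ~ N(0, 1) uses the trapezoid rule on [lo, hi].
    """
    step = (hi - lo) / steps
    probs = [0.0] * (n + 1)
    for k in range(steps + 1):
        eps = lo + k * step
        weight = step * math.exp(-0.5 * eps * eps) / math.sqrt(2.0 * math.pi)
        if k == 0 or k == steps:
            weight *= 0.5  # trapezoid end points
        f = firing_prob_given_eps(eps, alpha, h)
        for i in range(n + 1):
            probs[i] += weight * math.comb(n, i) * f**i * (1.0 - f) ** (n - i)
    return probs

n = 20
probs = count_distribution(n=n, alpha=0.3, h=0.5)
mean_rate = sum(i * p for i, p in enumerate(probs)) / n
```

With alpha = 0 the counts are binomial; increasing alpha spreads the distribution of r = i/n, which is the effect the following slides describe.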
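$P_r \approx E_\varepsilon\left[e^{nH(r)}\, e^{nz}\right]$, with $H(r)$ the binary entropy of $r$
$z = r \log F + (1 - r) \log(1 - F)$
[Equations: $F$ as a Gaussian integral $\frac{1}{\sqrt{2\pi}} \int e^{-t^2/2}\, dt$, and the limiting density $q(r, \varepsilon) = c \exp[\cdots]$ expressed through $F^{-1}(r)$ and $h$]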
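$p(\boldsymbol{x}, \boldsymbol{\theta}) = \exp\left\{\sum \theta_i x_i + \sum \theta_{ij} x_i x_j + \sum \theta_{ijk} x_i x_j x_k + \cdots\right\}$
$\theta_{i_1 \cdots i_k} = O(1/n^{k-1})$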
Synfiring
$p(\boldsymbol{x}) = p(x_1, \ldots, x_n)$
$r = \frac{1}{n} \sum x_i$, $q(r)$
[Figure: $q(r)$ plotted against $r$]
Bifurcation
[Figure: $P_r$ plotted against $r$]
$x_i$ independent: single delta peak
pairwise correlated
higher-order correlation!
Field Theory of Population Coding
Shun-ichi Amari, RIKEN Brain Science Institute
amari@brain.riken.go.jp
Collaborators: Si Wu, Hiro Nakahara
* *|x r z x
$r(z) = f(z - x^*) + \varepsilon(z)$
$f(z) = \exp\left\{-\frac{z^2}{2a^2}\right\}$
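Population Coding and Neural Field
Population Encoding: $r(z) = f(z - x) + \varepsilon(z)$
decoding: $r(z) \to \hat{x}$
[Figure: tuning curve $f(z - x)$ and noisy response $r(z)$ over the field coordinate $z$, for stimulus $x$]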
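Noise
$\langle \varepsilon(z) \rangle = 0$
$\langle \varepsilon(z)\, \varepsilon(z') \rangle = \sigma^2 h(z, z')$
$h(z, z')$: a $\delta(z - z')$ term plus a Gaussian term in $z - z'$ of width $b$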
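Probability Model
$Q(r(z) \mid x) = c \exp\left\{-\frac{1}{2\sigma^2} \left(r(z) - f(z - x)\right) h^{-1} \left(r(z) - f(z - x)\right)\right\}$
$r(z)\, h^{-1}\, r(z) = \iint r(z)\, h^{-1}(z, z')\, r(z')\, dz\, dz'$
$\int h^{-1}(z, z')\, h(z', z'')\, dz' = \delta(z - z'')$
$r(z) = f(z - x) + \varepsilon(z)$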
Fisher information
$I(x^*) = E\left[\left(\frac{d \log Q(r \mid x^*)}{dx^*}\right)^2\right]$
Cramer-Rao
$E\left[(\hat{x} - x^*)^2\right] \geq \frac{1}{I(x^*)}$
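Fourier Analysis
$F(\omega) = \frac{1}{\sqrt{2\pi}} \int f(z)\, e^{-i\omega z}\, dz$
$H(\omega) = \frac{1}{\sqrt{2\pi}} \int h(z)\, e^{-i\omega z}\, dz$, where $h(z, z') = h(z - z')$
$I \propto \int \frac{\omega^2\, |F(\omega)|^2}{H(\omega)}\, d\omega$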
Fisher Information
[Equation: $I$ as an integral over $\omega$ for the Gaussian tuning function of width $a$ and the noise correlation of width $b$]
1) No correlation
2) Uniform correlations
3) Limited range correlations
4) Wide range correlations
5) Special case
[Equations: the resulting $I$ in each case, in terms of $n$, $a$, $b$, $c$]
Dynamics of Neural Fields
[Equation: field dynamics for $u(z, t)$ with connection weights $w(z, z')$, involving $c\, r(z)$]
Shaping, Detecting, Decoding
How the Brain Solves Singularity in Population Coding
S. Amari and H. Nakahara
RIKEN Brain Science Institute
[Figure: two stimuli $x_1$, $x_2$ on the field coordinate $z$]
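Neural Activity
$r(z) = (1 - v)\, \varphi(z - x_1) + v\, \varphi(z - x_2) + \varepsilon(z)$
$Q(r(z); v, x_1, x_2) = c \exp\left\{-\frac{1}{2} (r - f)\, h^{-1}\, (r - f)\right\}$, with $f = E[r]$
$I_{ij} = E[\partial_i \log Q\; \partial_j \log Q]$
$I = (I_{ij})$: Fisher information matrix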
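Parameter Space
[Figure: parameter space with axes $v$, $x_1$, $x_2$]
$u = x_2 - x_1$: difference
$w = (1 - v)\, x_1 + v\, x_2$: center of gravity
new parameters: $(w, u, v)$
Fisher information $I$ degenerates as $u \to 0$
Cramer-Rao paradigm: error bounded by $I^{-1}$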
[Equations: expansion of $f(z)$ for small $u$, and the resulting metric $g$]
$J$: Jacobian, singular
$I = J^T \tilde{I} J$
[Equations: orders of the estimation errors of $w$, $u$, $v$ and $x_i$ as functions of $u$]
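synfiring resolves singularity
[Equations: the mean responses $f(z)$ in phase 1 and phase 2, each a mixture of terms in $z - x_1$ and $z - x_2$ weighted by $v$]
$I$: regular as $u \to 0$
[Figure: two stimuli $x_1$, $x_2$ on the field coordinate $z$]
synfiring mechanism
[Figure: $z_1$, $z_2$]
common multiplicative noise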
S. Amari and H. Nagaoka, Methods of Information Geometry, AMS & Oxford Univ. Press, 2000
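Mathematical Neurons
$y = \varphi\left(\sum w_i x_i - h\right) = \varphi(\boldsymbol{w} \cdot \boldsymbol{x} - h)$
[Figure: neuron with input $x$, output $y$ and activation $\varphi(u)$ plotted against $u$]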
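Multilayer Perceptrons
$y = \sum v_i\, \varphi(\boldsymbol{w}_i \cdot \boldsymbol{x}) + n$
$p(y \mid \boldsymbol{x}; \boldsymbol{\theta}) = c \exp\left\{-\frac{1}{2} \left(y - f(\boldsymbol{x}, \boldsymbol{\theta})\right)^2\right\}$
$f(\boldsymbol{x}, \boldsymbol{\theta}) = \sum v_i\, \varphi(\boldsymbol{w}_i \cdot \boldsymbol{x})$
$\boldsymbol{x} = (x_1, x_2, \ldots, x_n)$
$\boldsymbol{\theta} = (\boldsymbol{w}_1, \ldots, \boldsymbol{w}_m; v_1, \ldots, v_m)$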
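Multilayer Perceptron
$y = f(\boldsymbol{x}, \boldsymbol{\theta}) = \sum v_i\, \varphi(\boldsymbol{w}_i \cdot \boldsymbol{x})$
$\boldsymbol{\theta} = (\boldsymbol{w}_1, \ldots, \boldsymbol{w}_m; v_1, \ldots, v_m)$
neuromanifold
space of functions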
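Neuromanifold
• Metrical structure
• Topological structure
Riemannian manifold
$ds^2 = \|d\boldsymbol{\theta}\|^2 = \sum g_{ij}\, d\theta_i d\theta_j = d\boldsymbol{\theta}^T G\, d\boldsymbol{\theta}$
$g_{ij}(\boldsymbol{\theta}) = E\left[\frac{\partial \log p(y \mid \boldsymbol{x}; \boldsymbol{\theta})}{\partial \theta_i}\, \frac{\partial \log p(y \mid \boldsymbol{x}; \boldsymbol{\theta})}{\partial \theta_j}\right]$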
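Geometry of singular model
$y = v\, \varphi(\boldsymbol{w} \cdot \boldsymbol{x}) + n$
singular: $v\, |\boldsymbol{w}| = 0$
[Figure: parameter space with axes $\boldsymbol{w}$ and $v$]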
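Gaussian mixture
$p(x; v, w_1, w_2) = v\, \psi(x - w_1) + (1 - v)\, \psi(x - w_2)$
$\psi(x) = \frac{1}{\sqrt{2\pi}} \exp\left\{-\frac{1}{2} x^2\right\}$
singular: $w_1 = w_2$, $v(1 - v) = 0$
[Figure: parameter space with axes $w_1$, $w_2$, $v$]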
Topological Singularities
[Figure: manifolds $S$ and $M$ with singularities marked]
Singularity of MLP---example
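Backpropagation ---gradient learning
examples: $(y_1, \boldsymbol{x}_1), \ldots, (y_t, \boldsymbol{x}_t)$: training set
$E(y, \boldsymbol{x}; \boldsymbol{\theta}) = \frac{1}{2} |y - f(\boldsymbol{x}, \boldsymbol{\theta})|^2 = -\log p(y, \boldsymbol{x}; \boldsymbol{\theta})$
$\Delta \boldsymbol{\theta}_t = -\eta_t\, \frac{\partial E}{\partial \boldsymbol{\theta}}$
$f(\boldsymbol{x}, \boldsymbol{\theta}) = \sum v_i\, \varphi(\boldsymbol{w}_i \cdot \boldsymbol{x})$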
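Information Geometry of MLP
Natural Gradient Learning: S. Amari; H.Y. Park
$\Delta \boldsymbol{\theta}_t = -\eta_t\, G^{-1}\, \frac{\partial E}{\partial \boldsymbol{\theta}}$
$G_{t+1}^{-1} = (1 + \varepsilon_t)\, G_t^{-1} - \varepsilon_t\, G_t^{-1}\, \nabla f\, (\nabla f)^T\, G_t^{-1}$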
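A natural gradient step multiplies the ordinary gradient by the inverse Fisher matrix, step = -eta * G^{-1} * grad E. The sketch below is not the perceptron case from the slides: as a simpler stand-in it fits a one-dimensional Gaussian with parameters (mu, sigma), where the Fisher matrix diag(1/sigma^2, 2/sigma^2) is known in closed form. The function name, data, and step size are illustrative assumptions.

```python
import math

def natural_gradient_fit(data, mu=0.0, sigma=1.0, eta=0.5, steps=100):
    """Fit N(mu, sigma^2) by natural gradient descent on the mean negative log-likelihood."""
    n = len(data)
    mean = sum(data) / n
    for _ in range(steps):
        # Mean squared deviation from the current mu.
        m2 = sum((x - mu) ** 2 for x in data) / n
        # Ordinary gradient of log(sigma) + m2 / (2 sigma^2).
        grad_mu = -(mean - mu) / sigma**2
        grad_sigma = 1.0 / sigma - m2 / sigma**3
        # Multiply by G^{-1} = diag(sigma^2, sigma^2 / 2) (assumed Gaussian Fisher matrix).
        nat_mu = sigma**2 * grad_mu
        nat_sigma = 0.5 * sigma**2 * grad_sigma
        mu -= eta * nat_mu
        sigma -= eta * nat_sigma
    return mu, sigma

data = [1.0, 2.0, 4.0, 5.0, 3.0]
mu_hat, sigma_hat = natural_gradient_fit(data)
```

In these coordinates the update for mu becomes mu + eta * (mean - mu), independent of sigma, which illustrates how the inverse metric rescales the raw gradient.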
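$y = v_1\, \varphi(\boldsymbol{w}_1 \cdot \boldsymbol{x}) + v_2\, \varphi(\boldsymbol{w}_2 \cdot \boldsymbol{x}) + n$
2 hidden-units
[Figure: network with input $x$, output $y$, hidden weights $w_1$, $w_2$ and output weights $v_1$, $v_2$]
[Equations: change of coordinates from $(w_1, w_2, v_1, v_2)$ to $(w, v, u, z)$, with $u$ formed from $w_2$, $w_1$ and $z$ formed from $v_2$, $v_1$]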
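Dynamics of Learning
[Equation: learning dynamics of the loss $l$ with $G^{-1}$]
$\frac{du}{dt} = f(u, z)$, $\frac{dz}{dt} = k(u, z)$
$\frac{du}{dz} = \frac{f(u, z)}{k(u, z)}$
[Equation: trajectories relating $u^2$, $z^2$ and $\log z^2$, with constant $c$]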
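The teacher is on singularity
[Equations: $\frac{du}{dt}$ and $\frac{dz}{dt}$ near the singularity, with coefficient $A$, and the ratio $\frac{dz}{du}$]
[Equation: trajectories relating $u^2$, $z^2$ and $\log z^2$, with constant $c$]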