
Neural Nets: Something you can use and something to think about


Page 1: Neural Nets: Something you can use and something to think about

Neural Nets: Something you can use and something to think about

Cris Koutsougeras

• What are Neural Nets

• What are they good for

• Pointers to some models and formulations

• Methods that you can use

• Fun and open issues

Page 2: Neural Nets: Something you can use and something to think about

What are they good for

[Figure: the system of interest viewed as a black box, with observed inputs and outputs]

If I have a finite set of observations (samples) of input-output behavior, can I figure out what function the box performs?

• System identification

• Prediction-forecasting

• Controls; auto-generating a simulator

• Dealing with cases where only samples are known about the function of interest

Page 3: Neural Nets: Something you can use and something to think about

Why Neural Nets

• Target function is unknown except from samples

• Target is known but is very hard to describe in finite terms (e.g. a closed-form expression)

• Target function is non-deterministic

Something to think about

Page 4: Neural Nets: Something you can use and something to think about

General Structure of a NN

[Figure: layered network of nodes mapping inputs to outputs]

Page 5: Neural Nets: Something you can use and something to think about

Function Approximation

• Output is a composition of the various node functions.

• Output is a parametric function of the inputs; the weights and thresholds are the parameters

• Bottom line: neural nets are function approximators

Page 6: Neural Nets: Something you can use and something to think about

Nets perform functional compositions

Net output $= f_5\big(W_{53}\, f_3(W_{31} O_1,\, W_{32} O_2),\; W_{54}\, f_4(W_{41} O_1,\, W_{42} O_2)\big)$

Page 7: Neural Nets: Something you can use and something to think about

Output is a complex function of the inputs. The complexity comes from the deep nesting of the typical neuron functions.

[Figure: feedforward net with inputs $x_1, x_2, \dots, x_n$, layers of nodes computing $f$, and output $Y$]

$Y = f\big(\,f(\dots f(x_1, x_2, \dots, x_n)\dots),\ f(\dots f(x_1, x_2, \dots, x_n)\dots),\ \dots\big)$
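To make the nesting concrete, here is a minimal sketch in Python/NumPy of such a composition, assuming a sigmoidal node function and two random weight layers (all names and sizes are illustrative, not from the talk):

```python
import numpy as np

def f(z):
    # A typical sigmoidal node function
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, weights):
    """Propagate x through successive layers. Each layer wraps
    another application of f around the previous layer's output,
    so the net output is the deep nesting f(W2 @ f(W1 @ x))."""
    out = x
    for W in weights:
        out = f(W @ out)
    return out

# Tiny example: 3 inputs -> 4 hidden nodes -> 1 output
rng = np.random.default_rng(0)
weights = [rng.normal(size=(4, 3)), rng.normal(size=(1, 4))]
y = forward(np.array([0.2, -0.7, 1.0]), weights)
```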

Page 8: Neural Nets: Something you can use and something to think about

Net’s function is an “elastic” curve

By adjusting the weights (the curve's parameters), the curve is made to fit the samples

Adjusting the weights is the key issue

Page 9: Neural Nets: Something you can use and something to think about

How does it work

• We have a sample set $S = \{(x_1, t_1), (x_2, t_2), \dots, (x_n, t_n)\}$

• We have the net producing $y_i = f(x_i, W)$

• We define a quality measure $Q(W)$ that involves $f$ and the targets $t_i$

• We adjust $W$ iteratively, $\Delta W = -a\,\nabla_W Q$, until $Q$ is optimized

• A convenient $Q$ is usually the mean square error (a sketch follows below)
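As a concrete illustration of this loop, a minimal sketch assuming a one-hidden-layer net with a tanh hidden layer, a linear output, and mean square error as $Q$ (the sample data, sizes, and learning rate are illustrative assumptions):

```python
import numpy as np

def f(z):
    return np.tanh(z)           # hidden-node nonlinearity

def df(z):
    return 1.0 - np.tanh(z)**2  # its derivative

# Sample set S = {(x_i, t_i)}: target known only through samples
rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(50, 1))
T = np.sin(3 * X)

W1 = rng.normal(size=(1, 8))    # input -> hidden weights
W2 = rng.normal(size=(8, 1))    # hidden -> output weights
a = 0.05                        # learning rate

for step in range(5000):
    H = f(X @ W1)               # hidden outputs
    Y = H @ W2                  # net output y_i = f(x_i, W)
    E = Y - T
    Q = np.mean(E**2)           # quality: mean square error
    # Gradients of Q w.r.t. the weights (backpropagation)
    gW2 = 2 * H.T @ E / len(X)
    gW1 = 2 * X.T @ ((E @ W2.T) * df(X @ W1)) / len(X)
    # Delta W = -a * grad_W Q
    W2 -= a * gW2
    W1 -= a * gW1
```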

Page 10: Neural Nets: Something you can use and something to think about

How does it work

[Figure: the net's function being pulled toward the target curve by repeated updates $\Delta W = -a\,\nabla_W Q$]

Something you can use

Page 11: Neural Nets: Something you can use and something to think about

Nonlinear regression

Quality function is the sum of individual errors. Minimizing the error is like stretching the curve to fit the samples.

Problem: How do we know that we are done?

Page 12: Neural Nets: Something you can use and something to think about

Nonlinear regression

Page 13: Neural Nets: Something you can use and something to think about

Nonlinear regression

Page 14: Neural Nets: Something you can use and something to think about

Problems

• Not enough nonlinearity to fit, or

• Overfitting

• Need for minimal nonlinearity that can accomplish fitting

Page 15: Neural Nets: Something you can use and something to think about

Gradient Descent can get stuck

[Figure: total error $Q$ as a surface over weight space, with local minima that can trap gradient descent]

Page 16: Neural Nets: Something you can use and something to think about

Simulated Annealing

$\Delta W = -a(t)\,\nabla_W Q$

Turn $a$ into a function of time: start with very large values and gradually reduce them.

Theorem: If $a$ is reduced at a slow enough rate, the probability of landing at the global minimum asymptotically tends to 1.

Something you can use

Page 17: Neural Nets: Something you can use and something to think about

Simulated Annealing

• By starting with lots of energy and reducing it slowly enough, the probe will at some stage have enough energy to jump out of local minima but not out of the global one. If it remains long enough in that energy range, it gets trapped in the global minimum's basin.
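A minimal sketch of the idea, using a Metropolis-style acceptance rule and a geometric cooling schedule as stand-ins for the talk's $a(t)$ (the quality function and all constants are illustrative assumptions):

```python
import numpy as np

def Q(w):
    # Illustrative error surface over weight space with many local minima
    return np.sum(w**2) + 2.0 * np.sum(np.cos(3 * w))

rng = np.random.default_rng(2)
w = rng.normal(size=4)
temp = 10.0        # start with lots of "energy"
cooling = 0.999    # reduce it slowly

for step in range(20000):
    cand = w + rng.normal(scale=0.1, size=w.shape)
    dQ = Q(cand) - Q(w)
    # Always accept downhill moves; accept uphill moves with
    # probability exp(-dQ/temp), so the probe can escape local
    # minima while the temperature is still high.
    if dQ < 0 or rng.random() < np.exp(-dQ / temp):
        w = cand
    temp *= cooling  # anneal: slowly remove energy
```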

Page 18: Neural Nets: Something you can use and something to think about

Let’s have some fun

• What network structure do we need?

• In particular, how many nodes?

Page 19: Neural Nets: Something you can use and something to think about

Let’s have some fun

Page 20: Neural Nets: Something you can use and something to think about

[Figure: two-layer net with inputs $X_{ij}$, hidden-layer outputs $V_{rj}$, output weights $W_{rj}$, and outputs $Y_j$]

$Y_j = F\big(\sum_r W_{rj} V_{rj}\big)$

So $\sum_r W_{rj} V_{rj} = F^{-1}(Y_j)$: linear equations in the $r$ output weights (sketched below).
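Because $F^{-1}(Y_j)$ is linear in the output weights, those weights can be obtained by ordinary least squares instead of gradient descent. A minimal sketch, assuming a logistic $F$ and random hidden outputs standing in for a trained hidden layer (all names are mine):

```python
import numpy as np

def F_inv(y):
    # Inverse of the logistic node function F(u) = 1/(1+exp(-u))
    return np.log(y / (1.0 - y))

rng = np.random.default_rng(3)
V = rng.uniform(0.1, 0.9, size=(100, 6))  # hidden outputs V_rj: 100 samples, 6 units
Y = rng.uniform(0.2, 0.8, size=100)       # desired outputs Y_j in (0, 1)

# Solve sum_r W_r V_rj = F_inv(Y_j) for the output weights by least squares
W, *_ = np.linalg.lstsq(V, F_inv(Y), rcond=None)
```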

Page 21: Neural Nets: Something you can use and something to think about

Our Framework

Page 22: Neural Nets: Something you can use and something to think about

New Class of Training Algorithms

• We conclude that after proper training (by any method) all intermediate normalized vectors Y project to the same point in the direction of W.

• Thus all Y's are aligned on a plane that is perpendicular to W.

• New class of algorithms:
  – Find weights for the hidden layer that align all the Y's on a plane
  – W for the output layer is the normal to that plane (a sketch follows below)
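A sketch of that last step under stated assumptions: given already-aligned hidden-layer vectors $Y_i$ (random stand-ins here), the normal of the best-fitting plane through the point cloud is the right singular vector with the smallest singular value:

```python
import numpy as np

rng = np.random.default_rng(4)
Y = rng.normal(size=(50, 5))   # normalized hidden-layer vectors Y_i

C = Y - Y.mean(axis=0)         # center the point cloud
# Rows of Vt are the principal directions of the cloud; the last
# one (smallest singular value) is the normal of the best plane.
_, _, Vt = np.linalg.svd(C, full_matrices=False)
W = Vt[-1]                     # output-layer weight vector: the plane normal
```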

Page 23: Neural Nets: Something you can use and something to think about

One such Algorithm

[Figure: hidden-layer vectors $Y_i$ at distances $d_i$ from the plane with normal $W$]

Minimize $\sum_i d_i^2$, which is parametric on all the weights. Thus use $Q = \sum_i d_i^2$ as the quality function and perform a gradient descent: $\Delta W = -a\,\nabla_W Q$.

Something you can use
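A minimal sketch of this quality function for the output weight vector alone, with distances measured from the centroid plane of random stand-in $Y_i$'s (hidden weights held fixed; names and constants are assumptions):

```python
import numpy as np

rng = np.random.default_rng(5)
Y = rng.normal(size=(50, 5))   # hidden-layer vectors Y_i
C = Y - Y.mean(axis=0)         # deviations from the centroid
W = rng.normal(size=5)
a = 0.1

for step in range(2000):
    Wn = W / np.linalg.norm(W)
    d = C @ Wn                 # signed distances d_i from the plane
    Q = np.mean(d**2)          # quality function: (mean of) d_i^2
    # Gradient of Q w.r.t. W, accounting for the normalization of W
    g = 2 * (C.T @ d / len(C) - Q * Wn) / np.linalg.norm(W)
    W -= a * g                 # Delta W = -a grad_W Q
```

Since $Q$ is scale-invariant in $W$, only the direction of $W$ matters; the descent converges toward the same plane normal that the SVD sketch above finds in closed form.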

Page 24: Neural Nets: Something you can use and something to think about

Open Questions:

• What is the minimum number of neurons needed?

• What is the minimum nontrivial rank that the system can assume? This determines the number of neurons in the intermediate layer.

Page 25: Neural Nets: Something you can use and something to think about

Interesting Results

• The local activation functions must be nonlinear for the hidden layer but not for the output layer. We thus arrive at the same result as Kolmogorov's theorem.

• The solvability of $W^{t} Y_{ji} = F^{-1}(t_j)$ proves universal approximation with only one hidden layer necessary.

• The minimum nontrivial rank of the matrix $[Y_{ji}]$ provides the number of hidden-layer neurons necessary for proper fitting (a numerical illustration follows this list).

• Problem: the matrix is parametric, and we have no effective method for computing its lowest (nontrivial) rank.

• We came up with other characterizations based on the Vapnik-Chervonenkis dimension and PAC learning.

• However, the problem of a precise optimum number for the hidden layer is by and large still open (something to think about).
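The rank criterion can at least be probed numerically. A small sketch (hypothetical weights, with half the hidden units deliberately duplicated) showing that the rank of the activation matrix exposes how many hidden neurons do independent work:

```python
import numpy as np

rng = np.random.default_rng(6)
X = rng.uniform(-1, 1, size=(200, 3))
W1 = rng.normal(size=(3, 10))
W1[:, 5:] = W1[:, :5]            # make half the hidden units redundant
Y = np.tanh(X @ W1)              # hidden-layer outputs [Y_ji]

# Numerical rank of [Y_ji]: how many hidden neurons contribute
# independent directions on this sample set.
print(np.linalg.matrix_rank(Y))  # -> 5, not 10
```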

Page 26: Neural Nets: Something you can use and something to think about

Clustering Models (pattern recognition / classification)

Neuron functions represent “discriminant functions” that can be used to construct borders among classes.

Page 27: Neural Nets: Something you can use and something to think about

Clustering Models (pattern recognition / classification)

Neuron functions represent “discriminant functions” that can be used to construct borders among classes.

Page 28: Neural Nets: Something you can use and something to think about

Linear neurons (thresholding)

Output $= 1$ if $F(w_1 x_1 + w_2 x_2 + \dots + w_n x_n) > T$

Output $= 0$ if $F(w_1 x_1 + w_2 x_2 + \dots + w_n x_n) < T$

[Figure: threshold unit with inputs $x_1, \dots, x_n$, weights $W_1, \dots, W_n$, and threshold $T$; the resulting linear boundary separates the + samples from the - samples]
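A minimal sketch of such a unit as a two-class discriminant, taking $F$ to be the identity (the weights and threshold are illustrative):

```python
import numpy as np

def threshold_neuron(x, w, T):
    # Output 1 if the weighted sum exceeds the threshold T, else 0;
    # the boundary w1*x1 + ... + wn*xn = T is a hyperplane.
    return 1 if np.dot(w, x) > T else 0

w, T = np.array([1.0, -2.0]), 0.5
print(threshold_neuron(np.array([2.0, 0.3]), w, T))  # -> 1 (above the line)
print(threshold_neuron(np.array([0.0, 1.0]), w, T))  # -> 0 (below the line)
```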

Page 29: Neural Nets: Something you can use and something to think about

Radial Basis

Output $= 1$ if $(w_1 - x_1)^2 + (w_2 - x_2)^2 + \dots + (w_n - x_n)^2 > R^2$

Output $= 0$ if $(w_1 - x_1)^2 + (w_2 - x_2)^2 + \dots + (w_n - x_n)^2 < R^2$

[Figure: radial-basis unit with inputs $x_1, \dots, x_n$, center weights $W_1, \dots, W_n$, and radius $R$; the resulting circular boundary separates the + samples from the - samples]
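The analogous sketch for a radial-basis unit, with an illustrative center and radius:

```python
import numpy as np

def rbf_neuron(x, w, R):
    # Output 1 if x lies outside the sphere of radius R centered
    # at w, i.e. sum_i (w_i - x_i)^2 > R^2; else 0.
    return 1 if np.sum((w - x)**2) > R**2 else 0

w, R = np.array([0.0, 0.0]), 1.0
print(rbf_neuron(np.array([2.0, 2.0]), w, R))  # -> 1 (outside the circle)
print(rbf_neuron(np.array([0.3, 0.2]), w, R))  # -> 0 (inside the circle)
```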