Industrial Applications of Neural
Networks – Path to the Future
Heikki Koivo
Aalto University
ESPOO, FINLAND
Outline
• Background
• Examples of Neural Networks in prediction,
classification and control
• Applications of neural networks
• Future Directions
Machine learning – Section 1 ebook
Lotfi Zadeh & Company
IEEE Workshop on Neuro-Fuzzy Control
Muroran, Japan 1993
Biological neural cells
Artificial Neural Networks –
Beginning of Artificial Intelligence
• Model of nerve cell firing - Perceptron
[Figure: perceptron. Inputs x1, x2, x3 to the cell are multiplied by constant weights w1, w2, w3, summed (+), and passed through the activation function f (threshold).]
McCulloch – Pitts, 1943
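As a rough illustration of the threshold unit described above, the following Python sketch computes the weighted sum of the inputs and compares it with a threshold. The function name, the weights and the threshold value are assumptions chosen for the example; they are not taken from the slides.

```python
import numpy as np

def perceptron(x, w, threshold=0.0):
    """Threshold unit: returns 1 if the weighted input sum reaches the threshold, else 0."""
    s = float(np.dot(w, x))  # weighted sum of the inputs
    return 1 if s >= threshold else 0

# Hypothetical weights and threshold, chosen so the unit behaves like a 3-input AND gate.
w = np.array([1.0, 1.0, 1.0])
theta = 2.5
outputs = [perceptron(np.array(x), w, theta) for x in [(1, 1, 1), (1, 1, 0), (0, 0, 0)]]
print(outputs)  # [1, 0, 0]
```

With other weights the same unit realises other linearly separable decisions; finding suitable weights from data is the training problem.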
Biological neural networks
• Many nerve cells
Neural networks
• Multilayer perceptron network (modelling many nerve cells)
MATLAB
Neural Network
Toolbox
Need input-output data (measurements).
Then find the best weights, wi to fit the data.
History
• Many kinds of neural networks have been proposed
– Multilayer Perceptron Networks (Feedforward, Back Propagation)
– Radial Basis Function Networks
– Support Vector Machines
– Recurrent Neural Networks
– Self-Organizing Maps (SOM, Teuvo Kohonen)
– ANFIS (Adaptive Neuro-Fuzzy Inference Systems)
– Convolution Networks
Numerical optimization methods
(needed to determine weights w_i in MLP)
• All ’gradient’ methods can be written in the form
$$x_{i+1} = x_i + \eta_i s_i$$
where $\eta_i$ is the step size and $s_i$ is the search direction.
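The general iteration (new point = old point + step size × search direction) can be sketched in Python for the simplest choice, steepest descent with a constant step size. The test function, the step size and the iteration count below are assumptions made only for this illustration.

```python
import numpy as np

def gradient_descent(grad, x0, step=0.1, iters=200):
    """Iterate x <- x + step * s with the steepest-descent direction s = -grad(x)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        s = -grad(x)        # search direction
        x = x + step * s    # step along the search direction
    return x

# Hypothetical test problem: f(x) = ||x - c||^2, whose minimum is at c.
c = np.array([1.0, -2.0])
x_min = gradient_descent(lambda x: 2.0 * (x - c), x0=[0.0, 0.0])
print(x_min)
```

Other gradient methods (conjugate gradient, quasi-Newton, Levenberg-Marquardt) differ in how the search direction and the step size are chosen, not in the form of the update.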
COMPETITIVE LEARNING (Clustering technique)
Blue points = old data, normalized – unit ball
Form cluster centers. Have to decide how many.
Call them weight vectors (marked with red x):
$$\mathbf{w}_j = [w_{1j}, w_{2j}, w_{3j}]^T \in \mathbb{R}^3, \quad j = 1, \dots, 4$$
[Figure: data points and weight vectors w1, w2, w3, w4 in R3]
COMPETITIVE LEARNING (Clustering technique)
• Input vector x, weight vectors w_j, 4 output units
Another activation value (Euclidean distance):
$$a_j = \left( \sum_{i=1}^{3} \left( x_i^{new} - w_{ij} \right)^2 \right)^{1/2} = \lVert \mathbf{x}^{new} - \mathbf{w}_j \rVert, \quad j = 1, \dots, 4$$
The most competitive w_j giving the smallest activation value?
’Winner (cluster k) takes all’ updating as before.
η = learning rate
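A minimal Python sketch of one winner-takes-all step with the Euclidean-distance activation follows. The update rule used here, which moves only the winning weight vector a fraction of the way toward the input, is the commonly used form and is an assumption, since the slide only refers to updating "as before". The example data are hypothetical, and renormalisation onto the unit ball is left out.

```python
import numpy as np

def competitive_step(x, W, eta):
    """One winner-takes-all update. Rows of W are the weight vectors w_j."""
    a = np.linalg.norm(W - x, axis=1)  # activation a_j: Euclidean distance to each w_j
    k = int(np.argmin(a))              # winner: the unit with the smallest activation
    W = W.copy()
    W[k] += eta * (x - W[k])           # assumed rule: move only the winner toward x
    return W, k

# Four hypothetical weight vectors in R^3 and one new unit-length input.
W0 = np.array([[1.0, 0.0, 0.0],
               [0.0, 1.0, 0.0],
               [0.0, 0.0, 1.0],
               [-1.0, 0.0, 0.0]])
x_new = np.array([0.0, 0.8, 0.6])
W1, winner = competitive_step(x_new, W0, eta=0.5)
print(winner, W1[winner])
```

Repeating the step over many inputs pulls each weight vector toward the centre of the data it wins, which is what makes this a clustering technique.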
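Different learning rate functions are used:
$$\eta_1(t) = \eta_0 e^{-\alpha t}, \quad \alpha > 0,$$
$$\eta_2(t) = \eta_0 t^{-1}, \quad t \ge 1,$$
$$\eta_3(t) = \eta_0 \left( 1 - \frac{t}{t_{\max}} \right), \quad 0 < t \le t_{\max}.$$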
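Decaying learning-rate functions of this kind (exponential, inverse-time and linear decay) can be written as small Python functions. The exact functional forms, the parameter names and the default constants below are assumptions for illustration, not values from the slides.

```python
import math

def eta_exp(t, eta0=0.5, alpha=0.1):
    """Exponential decay of the learning rate (alpha > 0 assumed)."""
    return eta0 * math.exp(-alpha * t)

def eta_inv(t, eta0=0.5):
    """Inverse-time decay, intended for t >= 1."""
    return eta0 / t

def eta_lin(t, eta0=0.5, t_max=100):
    """Linear decay that reaches zero at t = t_max."""
    return eta0 * (1.0 - t / t_max)

# Compare the three schedules at a few iteration counts.
rates = [(eta_exp(t), eta_inv(t), eta_lin(t)) for t in (1, 10, 100)]
for t, row in zip((1, 10, 100), rates):
    print(t, row)
```

All three start near the initial rate and shrink over time, so early updates move the weight vectors a lot and later updates only fine-tune them.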
Kohonen self-organizing map
Spread a net and let it reshape itself according to competitive learning,
while keeping its (topological) form.
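One way to picture how the net reshapes itself while keeping its topology: the best-matching node is updated as in competitive learning, and its neighbours on the net are dragged along with smaller steps. The sketch below uses a one-dimensional chain of nodes and a Gaussian neighbourhood function; both choices, and all names and numbers, are assumptions made for brevity.

```python
import numpy as np

def som_step(x, W, eta, radius):
    """One SOM update for a 1-D chain of nodes; rows of W are node weight vectors."""
    k = int(np.argmin(np.linalg.norm(W - x, axis=1)))    # best-matching node
    idx = np.arange(len(W))
    h = np.exp(-((idx - k) ** 2) / (2.0 * radius ** 2))  # neighbourhood weight along the chain
    return W + eta * h[:, None] * (x - W), k

# Hypothetical chain of five nodes on a line and one input above the middle node.
W = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0], [4.0, 0.0]])
W_new, k = som_step(np.array([2.0, 1.0]), W, eta=0.5, radius=1.0)
print(k)
print(W_new)
```

The winner moves the most and nodes further along the chain move progressively less, which is what preserves the ordering of the net.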
Support Vector Machine
Early 2000s: Deep Learning
• HOT TOPIC
• Speech recognition
• Image classification
• The layer closest to the data vectors learns simple
features, while the higher layers learn higher level
features
• Google, Microsoft, Facebook very active after 2013
Convolution network
Simon Haykin: Neural Networks and Learning Machines, Third Edition, Pearson, 2009
Google: Largest neural networks have over a billion connections.
In my group – Systems engineering
• Modeling dynamical systems
• Prediction with neural network time series models
• Monitoring and fault diagnosis
• Control
• Theory of neural networks
• Soft computing
$$\hat{y}(t+k) = f\big(y(t), \dots, y(t-n), u(t), \dots, u(t-m)\big)$$
where f represents the neural network
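To train such a time-series model, the measured signals first have to be arranged into regressor rows (past outputs and inputs) and targets (the output k steps ahead). The helper below is a sketch of that bookkeeping only; the function name and the toy signals are hypothetical, and the network f itself is not shown.

```python
import numpy as np

def narx_regressors(y, u, n, m, k=1):
    """Build rows [y(t), ..., y(t-n), u(t), ..., u(t-m)] with target y(t+k)."""
    y = np.asarray(y, dtype=float)
    u = np.asarray(u, dtype=float)
    start = max(n, m)
    X, target = [], []
    for t in range(start, len(y) - k):
        past_y = y[t - n:t + 1][::-1]  # y(t), y(t-1), ..., y(t-n)
        past_u = u[t - m:t + 1][::-1]  # u(t), u(t-1), ..., u(t-m)
        X.append(np.concatenate([past_y, past_u]))
        target.append(y[t + k])
    return np.array(X), np.array(target)

# Toy signals, used only to show the indexing.
y = np.arange(10.0)
u = 10.0 * np.arange(10.0)
X, target = narx_regressors(y, u, n=2, m=1, k=1)
print(X[0], target[0])
```

Any regression model, such as a multilayer perceptron, can then be fitted to the pairs (X, target) and validated on data that was not used for fitting.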
Main steps in applying neural networks
• Design of experiments, very important (or simulation
model)
• Performing the experiments
(rich enough data should be collected)
• Preprocessing of data
• Choice of neural network and its structure, training it
• Validation using independent data
Chernobyl 1986 – Unsuccessful experiment
Applications: Forest machines
• Cut-to-length harvesting method
• Two machines
– Harvester processes trees to logs
  • Felling, delimbing, measuring, cutting to logs, etc.
– Forwarder transports the logs from forest to the roadside
• Machines have CAN buses from which measurements are gathered
– Processing information
– Diagnostic information
– Joystick signals and button presses of the operator
Forwarder (top) and harvester (bottom).
Multiobjective optimization
• Maximal productivity
• Minimal fuel consumption
• Minimal time used
• High quality (logs have to be cut to measure)
• Etc.
• Implication: development has to take place in a real environment; university research alone cannot cover many of the above aspects
Motivation: Work shift-wise productivity
of forest harvesters
• In forest science the typical
sample sizes are ~1000
processed stems
• Nowadays it is possible to
gather data during normal
work!
• Example: ~1.7 Million
measurement points (stems
processed)
Statistically significant
amount of data!
Apply Hidden Markov
Model to the data to
discover operator’s
subtasks
Forwarder (loader)
Forest machine
Example: average productivity curves of 13 harvester operators
• Figure shows the average productivity of 13 harvester operators
– Data recorded during normal operation using the Timberlink software
• Up to 50 % differences between the best and the worst operator
– Conforms with the earlier results reported by forest scientists
– For stem volume size 1 m3 the productivity difference between the best and the worst operator is over 30 m3/h of produced timber!
• Better operators reach
– Better productivity
– Better fuel economy
– Better quality
• Potential monetary savings even more than 50 %!
• Can we increase the performance level of the worst operators? (Yes we can!)
• Has been used in Finnish Forestry Practice and Management schools
[Figure: processing productivity [m3/h] versus stem volume [m3], one curve per operator, op 1 to op 13]
Figure: Difference between the
best and the worst operator up to
50 %!
Pyhäsalmi mine and its transportation
system
Online payload determination of a moving loader in a mine (cannot stop, no scales available)
Use secondary measurements: pressure, position, upper pressure, inclination angle
Neural network (MLP) + Kalman filter
Data fusion (should be ready in 3 s)
[Diagram: Measurements → Neural network → Kalman filter → Weight estimation]
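The idea of passing the neural network's payload estimates through a Kalman filter can be illustrated with a scalar filter for a constant state. This is a simplified sketch under stated assumptions: the payload is treated as constant during the measurement window, the network outputs are replaced by synthetic noisy values, and the noise variances are invented for the example.

```python
import numpy as np

def kalman_constant(z, r, q=0.0, x0=0.0, p0=1e6):
    """Scalar Kalman filter for a (nearly) constant state observed through noisy estimates z."""
    x, p = x0, p0
    history = []
    for zk in z:
        p = p + q             # predict: state assumed constant, uncertainty grows by q
        g = p / (p + r)       # Kalman gain
        x = x + g * (zk - x)  # correct the state with the new estimate
        p = (1.0 - g) * p     # updated uncertainty
        history.append(x)
    return np.array(history)

rng = np.random.default_rng(0)
true_payload = 12.0  # hypothetical payload in tonnes
nn_estimates = true_payload + rng.normal(0.0, 1.0, size=200)  # stand-in for MLP outputs
filtered = kalman_constant(nn_estimates, r=1.0)
print(filtered[-1])
```

With q = 0 and a very uncertain initial state the filter behaves like a running average, so individual noisy network outputs are smoothed into one estimate that can be updated sample by sample within the time limit.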
Foam Enrichment
[Process diagram labels: Mine, Ore, Crushing, Grinding, Foaming, Enrichment, Thickening, Water removal, Enriched minerals, Smelter]
Foam Enrichment (Pyhäsalmi)
XRF analyzer, X-Ray Fluorescence (XRF)
[Diagram labels: Feed; Copper circuit (copper enrichment, waste); Zinc circuit (zinc enrichment, waste). Flotation cell: feed, air, mixture, foam, enriched]
Principles of flotation
• Raw ore is ground into fine powder
• Grain size typically 50-100 µm
• Valuable minerals are made hydrophobic with surface-active chemicals, so that the minerals rise to the surface with air bubbles
• Froth is skimmed off and dried, leaving a ”clean” concentrate
Main parts of a flotation cell
Principles of flotation (cont.)
• The visual appearance of the froth gives information about the state of the process
• Delays are short compared to X-ray analysers
• Operators typically use this information in control decision making
Variables chosen
• The following 5 were chosen:
FROTH COLOUR,
BUBBLE SIZE DISTRIBUTION,
FROTH SPEED,
BUBBLE COLLAPSE SPEED,
BUBBLE LOAD
On-line analyzer (the original single-camera system)
Multi-Camera Analysis - PLS
• Can cleaner cell zinc grade be predicted by using only image variables as predictors?
• 3 methods were tested:
– Unsupervised Principal Component Analysis (PCA)
– Principal Component Regression (PCR)
– Partial Least Squares (PLS)
• In PCA, X data is used for training, Y for validation
• In PCR, both X and Y data are used in training
• In PLS, both X and Y data are used in training
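The difference in how the methods use X and Y can be made concrete with a small principal component regression sketch in NumPy: the component directions are computed from X alone, and Y enters only in the final least-squares step. The data below are synthetic and the function names are hypothetical; PLS, by contrast, also uses Y when choosing the directions and is not implemented here.

```python
import numpy as np

def pcr_fit(X, y, n_components):
    """Principal component regression: PCA on X, then least squares on the scores."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)  # PCA step uses X only
    P = Vt[:n_components].T                            # loadings (features x components)
    T = Xc @ P                                         # scores
    b, *_ = np.linalg.lstsq(T, y - y_mean, rcond=None) # regression step uses y
    return x_mean, y_mean, P, b

def pcr_predict(model, X):
    """Project new data on the loadings and apply the regression coefficients."""
    x_mean, y_mean, P, b = model
    return (X - x_mean) @ P @ b + y_mean

# Synthetic stand-in for image variables (X) and a grade measurement (y).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + 3.0
model = pcr_fit(X, y, n_components=5)
pred = pcr_predict(model, X)
print(np.max(np.abs(pred - y)))
```

With all five components kept, the result coincides with ordinary least squares; keeping fewer components trades some fit for robustness against correlated predictors.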
Results – PLS
• Results do not show very much improvement compared
with PCA.
Slurry analysis
• Instead of froth, could apply spectral analysis to slurry
and combine it with XRF analysis to obtain almost
continuous measurement
• XRF is an accurate measurement, but the measurement interval is long, 15-20 min. Disturbances cannot be observed if they happen between the samples
• XRF analyzers are bulky and expensive. Therefore several different flows are processed sequentially
• Partial Least Squares (PLS) was used as the fitting technique
Initial laboratory tests
• Laboratory test set-up
Integration with automation system –
Operators can compare with XRF
Results
• Figure shows
possibilities of
spectrum approach:
Almost continuous
measurement is
achieved (blue).
red dots are XRF
measurements
Results
• Oscillation is
revealed with the
developed method,
but not with XRF
• VNIR = Visible and Near Infra-Red
Fault diagnosis of electric machines
• Support Vector Machine
[Diagram: inputs and outputs → Signature generation → signatures → Classification → fault decision]
Data generated by simulation, FEM model
Also bars were broken
Measured data from a faulted
machine
Measurement set-up
[Diagram: Grid, frequency converter, IM (induction motor), DCG (DC generator); measured quantities: current, voltage, power; vibration sensors; search coils]
THE DATA – FEM MODEL
• The magnetic field of the core is assumed to be two dimensional
• The three dimensional end region fields are modelled by constant end winding impedances in the circuit equations of the windings
• Current density is assumed to be constant in the stator conductors
• The laminated iron core is assumed to be non-conducting and magnetically non-linear; the non-linearity is modelled using a single-valued magnetisation curve
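$$\nabla \times (\nu \nabla \times \mathbf{A}) = -\sigma \frac{\partial \mathbf{A}}{\partial t} + \frac{\sigma}{l} u \, \mathbf{e}_z$$
$$u = R i + R \int_S \sigma \frac{\partial \mathbf{A}}{\partial t} \cdot d\mathbf{S}$$
$$\frac{1}{2}(u_{k+1} + u_k) = \frac{1}{2} R (i_{k+1} + i_k) + R \int_S \sigma \frac{\mathbf{A}_{k+1} - \mathbf{A}_k}{\Delta t} \cdot d\mathbf{S}$$
$$\nabla \times (\nu_{k+1} \nabla \times \mathbf{A}_{k+1}) + \frac{2\sigma}{\Delta t} \mathbf{A}_{k+1} - \frac{\sigma}{l} u_{k+1} \mathbf{e}_z = -\nabla \times (\nu_k \nabla \times \mathbf{A}_k) + \frac{2\sigma}{\Delta t} \mathbf{A}_k + \frac{\sigma}{l} u_k \mathbf{e}_z$$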
THE DATA
• There are two different sets of FEM-data, each with slightly different input voltages
• The models are created for the following conditions (each load condition separately):
– No fault
– Three broken rotor bars and an end-ring
– Turn-to-turn stator fault
• The models are validated using the same set of data, but with the half that was not used for model creation
BAYESIAN CLASSIFIER
• The classifier used for fault classification is the Bayesian
classifier
• It gives the conditional probability that the data from the model represent the data to which they are compared
$$P(m_i \mid e, Z) = \frac{P(e \mid m_i, Z) \, P(m_i \mid Z)}{p(e)}$$
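A possible numerical reading of this rule: each candidate model m_i produces a residual sequence e, a likelihood is assigned to the residuals, and the posteriors are obtained by weighting with the priors and normalising. The sketch below assumes independent zero-mean Gaussian residuals with a common standard deviation; that likelihood model, the function name and the numbers are assumptions, not details given on the slides.

```python
import numpy as np

def model_posteriors(residuals, sigma, priors):
    """Posterior probability of each model, assuming zero-mean Gaussian residuals."""
    residuals = np.asarray(residuals, dtype=float)  # shape (models, samples)
    # Log-likelihood of the residuals under each model; the constant term shared by
    # all models is omitted because it cancels in the normalisation.
    log_lik = -0.5 * np.sum((residuals / sigma) ** 2, axis=1)
    log_post = log_lik + np.log(priors)
    log_post -= log_post.max()                      # stabilise before exponentiating
    post = np.exp(log_post)
    return post / post.sum()                        # normalisation plays the role of p(e)

# Hypothetical residuals of three models against the same measured data.
residuals = [[0.1, -0.1, 0.05],
             [1.0, 0.9, 1.1],
             [2.0, 2.1, 1.9]]
post = model_posteriors(residuals, sigma=0.5, priors=[1 / 3, 1 / 3, 1 / 3])
print(post)
```

The model with the smallest residuals receives nearly all of the probability mass, and the fault decision is the condition that model represents.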
DATA GENERATION
- 35 kW cage induction motor
- Inverter, with switching frequency fixed at 3 kHz
- a DC generator is the motor load.
Input
voltage
Converter
output
voltage
HEALTHY MOTOR
• Stator current i1 in a healthy motor
no load
half load
full load
Clearly, need to deal with different load situations separately
Zoom
MOTOR WITH ROTOR FAULT
• Stator current i1 in a motor with rotor fault
no load
half load
full load
Zoom
Shape is now different
MOTOR WITH STATOR TURN FAULT
no load
half load
full load
Sampling frequency = 40 kHz
MODEL FIT
NN model output (blue) and the testing data set (red) for a healthy motor under half load
MODEL BANK FOR MODEL-BASED FDI
[Diagram: the input u(t) drives the System, with output i(t), and NN Models 0 to 3, with outputs i1(t) to i4(t). Residual generation feeds Fault classification & Decision making (Bayesian classifier).]
MODEL STRUCTURE USED
TOOL: NNSYSID
Different n were tested; n=10 gave good results
Using a Bayesian classifier
• Full load
USING A BAYESIAN CLASSIFIER
• Half load
USING A BAYESIAN CLASSIFIER
• No load
FORECASTING DISTRICT HEATING LOAD (CO-OPERATION WITH FORTUM)
Power plant Residential area
Nonlinear model, e.g. NARX model
$$\hat{y}(t+k) = f\big(y(t), \dots, y(t-n), u(t), \dots, u(t-m)\big)$$
Nordpool electricity market
17.12.2018
Wind farm, Kemi, Finland – connected to main grid
• 3 MW wind turbines
• Total power 30 MW
Spot market or day-ahead market
• Producers and distributors (buyers) of electricity agree
to sell or buy electricity at a certain price and volume
• Equilibrium point trading – carried out once a day
for every hour of the next day, usually at noon
• The price for tomorrow is decided today
• Bids are sent in - auction will follow
Very important to predict 24 hours ahead
Use neural networks in prediction of wind speed
Currently used NWP timeline
Measured wind, NWP prediction and
measured power
Intelligent Machine
Terminator
(neural chip)
Deep learning – why successful
• Big Data (Internet)
• Computing power (can use parallel computing)
• Convolution networks
SELF-DRIVING CARS
INTELLIGENT HIGHWAY
Intelligent ships (Autonomous ships)
AUTOPILOT
Oldest adaptive controller for ships and airplanes
Lawrence Sperry flying in Paris 1914
Father Elmer Sperry, inventor of gyro compass
For ships, the gyro controlled the steering engine to hold the ship on a predetermined course
Extension: Gyro stabilizer system to prevent the ship from rolling
AUTOPILOT – To control the trajectory of an aircraft using gyrocompass
• Electric Ships - future trend in ship building
• Electrical power train is simpler than mechanical
power train
• Saves environment (no coal, no oil)
• Easier to apply computer control, automation
• Better situation awareness using Sensor Fusion,
Artificial Intelligence (AI) and Augmented Reality
(AR)
• Safety improvement
• Overall Optimization including On Shore
Operation Center
Port Liner, Holland
CHINA LAUNCHED THE FIRST ELECTRIC CARGO SHIP IN 2017 – Hangzhou Modern Ship Design & Research Co
Situational Awareness
• Having a good perception of your surroundings at all times
• Comprehending what's happening around you
• Predicting how this will affect your boat
• Intelligent ships have sensors similar to those of self-driving cars:
– Lidars
– Radars
– RGB cameras
– NIR cameras
– PTZ camera
– Navigation radar
– RTK-GPS
– AIS
– IMU
System of Systems
On Shore Operations Center - Performance and condition analysis
Satellite
Data collection
Real-time optimization
Passenger Ship
Intelligent Ship (future)
Ferry boat
Cargo ship
In harbour
• Mathworks Research Summit, Boston, June 2018.
Neural networks + applic
• http://en.worldrobotconference.com/
• Beijing, August, 2018. Intelligent ships,Waste Sorting
Thank You