Neural-Adaptive Control of Dynamic Systems
Robert Stengel, Princeton University
School of Engineering and Applied Science
presented at George Washington University, February 23, 2007
• Cognitive and Biological Paradigms for Intelligent Agents
• Intelligent Vehicle/Highway Systems
• Advanced Vehicle Control Systems
– Control of a Fuel-Cell Preferential Oxidizer
– Adaptive Critic Neural Control of an Aircraft
• Autonomous Vehicles
– Intelligent Guidance for Headway and Lane Control
Intelligent Agents
• Perform useful functions driven by objectives and current knowledge
– Emulate biological and cognitive processes
– Process information to achieve goals
– Learn by example or from experience
– Adapt functions to a changing environment
Cognitive and Biological Paradigms for Intelligent Agents
Thinking
– Syntax (form) and Semantics (meaning)
– Algorithmic vs. Non-Algorithmic Behavior
– Consistency, Emotion, "The Collective Subconscious"
– Generating Alternatives
– Randomized Search
Consciousness
– Self-Awareness and Perception
– Creativity, Wisdom, and Imagination
– Common Sense, Understanding, and Judgment of Truth
– Learning by Example
• Recognizing Aliases and Objects
• Handling Emergencies
• Focusing on Things that are "Out of Focus"
• Richness of Sensory Information
• Hierarchical and Redundant Structures
• Genetic Reproduction of Elements
Learning Requires Error or Incompleteness
Biological Adaptation is a Slow Process
REM (Rapid Eye Movement) Sleep is a Time of Learning, Consolidating, and Pruning Knowledge
Cells Undergo Birth-Life-Death Cycle
Short-Term Memory Recedes into Long-Term Memory or is Forgotten
Humans Form Chords of Actions
Knowledge Acquisition, Behavior, and Control
Conscious Thought
– Awareness
– Focus
– Reflection
– Rehearsal
– Declarative Processing of Knowledge or Beliefs
Unconscious Thought
– Subconscious Thought
  > Procedural Processing
  > Communication
  > Learned Skills
  > Subliminal Knowledge Acquisition
– Preconscious Thought
  > Pre-attentive Declarative Processing
  > Subject Selection for Conscious Thought
  > Concept Development
  > Information Pathway to Memory
  > Intuition
Hierarchy of Declarative, Procedural, and Reflexive Actions
Reflexive Behavior
– Instantaneous Response to Stimuli
– Elementary, Forceful Actions
– Stabilizing Influence
– Simple Goals
Intelligent Agent Possessing Declarative, Procedural, and Reflexive Traits
• Declarative Functions: Expert Systems, Decision Trees
• Procedural Functions: modeled by Estimation and Control "Circuits"
• Reflexive Functions: Neural Networks
Natural Neurons
• Neurons are biological cells with significant electrochemical activity
• ~10 billion neurons in the brain
• Neuron activity is complex, but output is scalar
• Single neuron
– receives many inputs
– produces a single output
Computational Neural Networks
• Functional structure
– Algebraic processing of inputs to produce continuous outputs
  • Flow-through network
  • Convergence to desired solution as output sequence evolves
– Search of input space to identify discrete outputs
  • Recursive network
  • Convergence to distinct solution before output occurs
• Training categories
– Supervised learning
  • Define input-output relationship from examples, e.g., backpropagation
– Unsupervised learning
  • Identify inputs that are similar or close
– Non-adaptive network
  • Training before functional application
– Adaptive network
  • Training during functional application
Computational (Artificial) Neuron Complex
• Synapse effects represented by weights, gains, or multipliers
• Neuron firing frequency is modeled by linear gain or nonlinear element
[Figure: neuron complex with synaptic weights w11, w12, w13, w21, w22, w23]
Natural Neural Networks
• Dendrites receive signals from other neurons
• Axons transmit signals to other neurons and end effectors
• Synapses reflect connection strength
– Excite or inhibit neuron activity
– Are the learning parameters of the nervous system
Layout of an Algebraic Neural Network
Layered, parallel structure for computation
Intelligent Vehicle/Highway Systems (IVHS)
• Rationale
– Congestion
– Highway Throughput and Trip Time
– Traveler Safety
– Industrial and Social Productivity
– Convenience
– Handicapped and Elderly
• Functional Areas
– Vehicle Control Systems
– Traffic Management Systems
– Traveler Information Systems
– Public Transportation Systems
– Cargo Transportation Systems
– Rural Transportation Systems
• Issues
– Smart Cars and Smart Highways
– Autonomy and System Architecture
– Cost and Resource Allocation
– Benefits and Privacy
– Rights and Responsibilities
– Regulation and Liability
Predicted Market Penetration Dates for IVHS Control Technologies (1991)
Feature                                    5%      50%
Vehicle Probes Producing Traffic Data      2000    2016
Real-Time Optimal Route Guidance           2000    2020
Frontal Collision Warning                  2002    2013
Backup and Blind-Spot Detection            2002    2015
Roadway Imaging                            2010    Never
GPS Navigation                             2000    2012
Map-Matching/Dead-Reckoning Navigation     2000    2020
Adaptive Cruise Control                    2004    2015
Automatic Backup Braking                   2008    2020
Autonomous Lane-Keeping                    2012    2032
Platooning                                 2035    Never
Automatic Chauffeuring                     2040    Never
compiled by Steven Underwood, “Delphi Forecast and Analysis of Intelligent Vehicle-Highway Systems,” U. Mich, 1992.
Elements of an Intelligent Vehicle/Highway System
[Diagram: Regional Traffic Management Organization; three Cellular Traffic Management Centers; Road/Highway/Communication Infrastructure]
Roadway Systems
• Traffic Lights
• Changeable Message Signs
• Ramp Metering
• Tolling Systems
• Law Enforcement
• Maintenance
• Radio Links
• Wire/Optic Lines
• Loops, Video Detectors
• Weather Information
Client Vehicles
• Regular Traffic
– Passenger Cars
– Buses
– Trucks & Vans
– Motorcycles
– Bicycles
– Pedestrians
• Transient Traffic
– ......
– ......
Traffic Obstructions
• Accidents
• Road Maintenance
• Roadside Construction
• Sports/Social Events
• Natural Calamities
A Network of Intelligent Agents
[Diagram labels: Regional Traffic Management Intelligent Agent; Cellular Traffic Management Intelligent Agents; Police/Emergency Intelligent Agent; Roadside Intelligent Agent; Client Vehicle Intelligent Agent; Sensing Devices and Controlling Devices]
Functions of Two Agent Types in an IVHS
Cellular Traffic Management Agent
• Declarative Functions
– Area traffic monitoring
– Nominal traffic routing plan
– Area emergency planning
– Accident strategic response
• Procedural Functions
– Traffic flow assessment
– Driver information services
– Accident detection and tactical response
• Reflexive Functions
– Normal traffic signaling
– Traffic volume and speed logging
– Communication with vehicles, adjacent cells, and regional center
Driver/Automobile Agent
• Declarative Functions
– Destination and route selection
– Choice and timing of waypoints
– Strategy selection
• Procedural Functions
– Neighboring traffic assessment
– Roadway, obstacle, and hazard assessment
– Obeying traffic rules and regulations
• Reflexive Functions
– Steering and accelerating
– Normal and emergency braking
– Internal systems control
– Communication with adjacent vehicles and traffic management system
I. Control of a Fuel-Cell Preferential Oxidizer
[Block diagram: Fuel Storage; Fuel Processor (Reformer or Partial Oxidation Reactor, Shift, and PrOx stages, with H2O and Air inputs); Fuel Cell Stack; Batteries; Power Conditioning and Motor Control; Motor/Gen.; Gear]
• Control logic mimics functions of the brain’s cerebellum
Preferential Oxidizer
• Proton-Exchange Membrane Fuel Cell converts hydrogen and oxygen to water and electrical power
• Steam Reformer/Partial Oxidizer-Shift Reactor converts fuel (e.g., alcohol or gasoline) to H2, CO2, H2O, and CO. Fuel flow rate ∝ power demand
• CO “poisons” the fuel cell and must be removed from the reformate
• Catalyst promotes oxidation of CO to CO2 over oxidation of H2 in a Preferential Oxidizer (PrOx)
• PrOx reactions are nonlinear functions of catalyst, reformate composition, temperature, and air flow
The Cerebellum
• Cerebellum integrates sensory input and motor output
Cerebellar Model Articulation Controller (CMAC)
• CMAC: Two-stage mapping of a vector input to a scalar output
• First mapping: Input space to association space
– f is fixed
– a is binary
• Second mapping: Association space to output space
– g contains learned weights
[Figure: CMAC association memory with c = 3 layers (Layer 1, Layer 2, Layer 3) over a two-dimensional input space (n = 2; axes: input 1, input 2), showing the quantization width of input 2]
$f : x \to a$ (Input → Selector vector)
$g : a \to y$ (Selector → Output)
Single-Input CMAC Example
• $x$ is in $(x_{min}, x_{max})$
• Selector vector is binary and has N elements
• Receptive regions of association space map $x$ to $a$
• $N_A$ = Number of receptive regions = $N + C - 1$ = dim($a$)
• $C$ = Generalization parameter = # of overlapping regions
• Input quantization = $(x_{max} - x_{min}) / N$
$$a = \begin{bmatrix} 0 & 0 & 0 & 1 & 1 & 1 & 0 & 0 \end{bmatrix}^T$$
CMAC Output and Training
• CMAC output from activated cells of c Associative Memory layers:
$$y_{CMAC} = w^T a = \sum_{i=j}^{j+C-1} w_{i,\,activated}, \qquad j = \text{index of first activated region}$$
• Least-squares training of CMAC weights:
$$w_{j,new} = w_{j,old} + \frac{\beta}{c}\left( y_{desired} - \sum_{i=1}^{c} w_{i,old} \right)$$
where $\beta$ is the learning rate and $w_j$ is an activated cell weight
• Localized generalization and training
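The single-input mapping, output sum, and weight update above can be sketched in a few lines. This is a minimal illustration, not the controller used in the talk: the function names (`selector`, `cmac_output`, `cmac_train`) are hypothetical, and it assumes that N counts the quantization intervals of the input, so that the selector vector has N + C - 1 entries with C consecutive ones.

```python
def selector(x, x_min, x_max, n, c):
    """Map scalar x to a binary selector vector with N + C - 1 entries."""
    width = (x_max - x_min) / n          # input quantization
    j = int((x - x_min) / width)         # quantization interval containing x
    j = max(0, min(j, n - 1))            # clamp to the valid range
    a = [0] * (n + c - 1)                # one entry per receptive region
    for i in range(j, j + c):            # C overlapping regions are activated
        a[i] = 1
    return a, j                          # j = index of first activated region


def cmac_output(w, j, c):
    """y = w^T a, i.e. the sum of the C activated weights."""
    return sum(w[j:j + c])


def cmac_train(w, j, c, y_desired, beta):
    """Spread the output error equally over the C activated weights."""
    correction = (beta / c) * (y_desired - cmac_output(w, j, c))
    for i in range(j, j + c):
        w[i] += correction


# Illustrative values only: N = 6 intervals on (0, 6), C = 3.
a, j = selector(3.5, 0.0, 6.0, n=6, c=3)
w = [0.0] * len(a)
for _ in range(200):
    cmac_train(w, j, 3, y_desired=2.0, beta=0.1)
y = cmac_output(w, j, 3)
```

With these illustrative numbers the selector vector reproduces the eight-element example on the slide, and only the three activated weights change during training, which is the localized generalization the slide refers to.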
CMAC Output and Training
• In higher dimensions, association space has dimension dim(x): a plane, cube, or hypercube
• Potentially large memory requirements
• Granularity (quantization) of output
• Variable generalization and granularity
CMAC/PID* Control System for Preferential Oxidizer
* Proportional-Integral-Derivative
[Block diagram: hybrid control system. The H2 conversion error (desired H2 conversion minus actual H2 conversion) feeds a conventional PID controller with gains = f(flow rate); a CMAC (ANN) receives a training signal and takes PrOx reformate flow rate, PrOx inlet [CO], and inlet coolant temperature as inputs. air_TOTAL = air_CMAC + air_PID is applied to the PrOx, which has inlet and outlet reformate streams. H2 Conversion Calc.: H2 conv. = f(air_TOTAL, [H2]in, [H2]out, flow rate, sensor dynamics)]
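The hybrid structure adds a conventional PID air command to the CMAC air command. A minimal sketch of that summation, assuming a textbook discrete PID law; the class `PID`, the gain values, and `total_air` are hypothetical names and numbers chosen for illustration, and the 0.1 s step mirrors the 100 ms sampling interval quoted for the controller.

```python
class PID:
    """Textbook discrete PID law acting on the H2 conversion error."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, error):
        self.integral += error * self.dt                  # rectangular integration
        derivative = (error - self.prev_error) / self.dt  # backward difference
        self.prev_error = error
        return (self.kp * error
                + self.ki * self.integral
                + self.kd * derivative)


def total_air(air_cmac, air_pid):
    """air_TOTAL = air_CMAC + air_PID, as in the block diagram."""
    return air_cmac + air_pid


# Illustrative gains only; the talk schedules the gains on flow rate.
pid = PID(kp=2.0, ki=1.0, kd=0.0, dt=0.1)
air_pid = pid.update(0.5)            # error = desired minus actual H2 conversion
air_total = total_air(3.0, air_pid)  # 3.0 stands in for a CMAC output
```

In this arrangement the PID term supplies feedback correction while the CMAC term can supply a learned command; how the training signal is formed is not specified here.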
Feedback and Adaptation
• Feedback
– Learn the current dynamic state of the system and adjust the control commands
• Adaptation
– Learn the current status of the system’s dynamic model and adjust the control law
Summary of CMAC Characteristics
• Inputs and Number of Divisions:
– PrOx inlet reformate flow rate (95)
– PrOx inlet cooling temperature (80)
– PrOx inlet CO concentration (100)
• Output: PrOx air injection rate
• Associative Layers, C: 24
• Number of Associative Memory Cells/Weights and Layer Offsets: 1,276 and [1,5,7]
• Learning Rate, $\beta$: ~0.01
• Sampling Interval: 100 ms
Flow Rate and Hydrogen Conversion of CMAC/PID Controller
• H2 conversion command (across PrOx only): 1.5%
• Novel data, with (—-) and without pre-training (––)
• Federal Urban Driving Cycle (= FUDS)
Comparison of PrOx Controllers on FUDS
Controller       Mean H2 error (%)   Max. H2 error (%)   Mean CO out (ppm)   Max. CO out (ppm)   Net H2 output (%)
Fixed-Air        0.68                0.87                6.3                 28                  57.2
Table Look-up    0.13                1.43                6.5                 26                  57.8
PID              0.05                0.51                7.7                 30                  58.1
CMAC/PID         0.02                0.16                7.3                 26                  58.1
[Plot: flow rate (g/s) versus time (seconds), 0 to 1400 s]
II. Adaptive Critic Neural Control of an Aircraft
• Adaptive critic controller
– Estimates cost function
– “Criticizes” non-optimal performance
– Adapts control gains to improve performance
– Adapts cost model to improve estimate
Design Philosophy for Adaptive Critic Neural Control
• Define an acceptable linear control structure
• Design linear controllers that satisfy requirements at n operating points
• Train neural networks
– Off-line to replicate control response at n operating points (~ “Gain Scheduling”)
– On-line to optimize performance
Linear-Quadratic Proportional-Integral (LQ-PI) Control System
• LQ-PI regulator provides:
– Multi-input/multi-output control
– Damping and stabilization
– Command response
– Disturbance rejection
– Implicitly accounts for system modeling errors
• Gains chosen to minimize a cost (or value) function
$$\min_{\Delta u(t_k)} V[\Delta x(t_k)] = \min_{\Delta u(t_k)} \left\{ L[\Delta x(t_k), \Delta u(t_k)] + V[\Delta x(t_{k+1})] \right\}$$
$$L[\Delta x(t_k), \Delta u(t_k)] = \frac{1}{2} \begin{bmatrix} \Delta x^T(t_k) & \Delta u^T(t_k) \end{bmatrix} \begin{bmatrix} Q & M \\ M^T & R \end{bmatrix} \begin{bmatrix} \Delta x(t_k) \\ \Delta u(t_k) \end{bmatrix}$$
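Expanding the block-matrix form gives L = ½(Δxᵀ Q Δx + 2 Δxᵀ M Δu + Δuᵀ R Δu). The sketch below evaluates that one-step cost for small, made-up matrices; `quad_form`, `lq_cost`, and all numerical values are hypothetical and serve only to illustrate the formula.

```python
def quad_form(a, mat, b):
    """Return a^T * mat * b for list vectors and a nested-list matrix."""
    return sum(a[i] * mat[i][j] * b[j]
               for i in range(len(a))
               for j in range(len(b)))


def lq_cost(dx, du, Q, M, R):
    """One-step cost L = 1/2 (dx^T Q dx + 2 dx^T M du + du^T R du)."""
    return 0.5 * (quad_form(dx, Q, dx)
                  + 2.0 * quad_form(dx, M, du)
                  + quad_form(du, R, du))


# Illustrative weights: two states, one control.
Q = [[2.0, 0.0],
     [0.0, 1.0]]      # state weighting
M = [[0.0],
     [0.5]]           # state-control cross weighting
R = [[4.0]]           # control weighting
cost = lq_cost([1.0, -2.0], [0.5], Q, M, R)
```

Here the state term contributes 6, the cross term contributes -1, and the control term contributes 1, so the cost is half of 6.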
Structure of Equivalent Proportional-Integral Neural Controller
$$u_k = c\left[x_k, y_{c_k}, \int \Delta y_k\,dt, a_k\right] = NN_F\left[y_{c_k}, a_k\right] + NN_B\left[x_k, a_k\right] + NN_I\left[\int \Delta y_{c_k}\,dt, a_k\right]$$
Off-Line Initialization of Neural Networks
$$u_k = c\left[x_k, y_{c_k}, \int \Delta y_k\,dt, a_k\right]$$
$$\Delta u_k = \Delta c[\bullet] = \frac{\partial c}{\partial y_c}\,\Delta y_{c_k} + \frac{\partial c}{\partial \left(\int y_c\,dt\right)} \int \Delta y_{c_k}\,dt + \frac{\partial c}{\partial x}\,\Delta x_k = C_F\,\Delta y_{c_k} + C_I \int \Delta y_{c_k}\,dt + C_B\,\Delta x_k$$
[Figure: control surface $\Delta u$ over $\Delta x$ and $a$, with operating points A and B marked]
• Pre-training paradigm
– Nonlinear optimal control hypersurfaces (unknown)
– Optimal linear control gain matrices and trim settings computed at operating points (known)
– Gain matrices define slopes of nonlinear control hypersurfaces
– Algebraic training of neural networks fits control hypersurfaces and gradients exactly at operating points
• Interpolation and gain scheduling via neural networks
• One node/operating point in each neural network
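On-Line Training
• Dual Heuristic Programming Adaptive Critic for infinite-horizon optimization problem ($t_f \to \infty$)
• Critic and Action (i.e., Control) networks adapted concurrently
• LQ-PI cost function
• Modified resilient backpropagation for neural network training
$$V[x(t_k)] = L[x(t_k), u(t_k)] + V[x(t_{k+1})]$$
$$\frac{\partial V}{\partial u} = \frac{\partial L}{\partial u} + \frac{\partial V}{\partial x}\frac{\partial x}{\partial u} = 0$$
$$\frac{\partial V}{\partial x_k} = NN_C\left[y_{c_k}, x_k, a_k\right]$$
$$u_k = c\left[x_k, y_{c_k}, \int \Delta y_k\,dt, a_k\right] = NN_F\left[y_{c_k}, a_k\right] + NN_B\left[x_k, a_k\right] + NN_I\left[\int \Delta y_{c_k}\,dt, a_k\right]$$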
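Backpropagation Training of a Single Sigmoid Neuron
• Training error and cost function
$$\varepsilon = \hat{y} - y_T, \qquad J = \frac{1}{2}\varepsilon^2 = \frac{1}{2}\left(\hat{y} - y_T\right)^2 = \frac{1}{2}\left(\hat{y}^2 - 2\hat{y}\,y_T + y_T^2\right)$$
• Neuron parameters
$$p = \begin{bmatrix} p_1 \\ p_2 \\ \vdots \\ p_{n+1} \end{bmatrix} = \begin{bmatrix} w \\ b \end{bmatrix} = \begin{bmatrix} \text{Input Weights} \\ \text{Bias} \end{bmatrix}$$
• Cost function gradient
$$\frac{\partial J}{\partial p} = \left(\hat{y} - y_T\right)\frac{\partial \hat{y}}{\partial p} = \left(\hat{y} - y_T\right)\frac{\partial \hat{y}}{\partial r}\frac{\partial r}{\partial p}$$
where
$$r = w^T x + b, \qquad \frac{d\hat{y}}{dr} = \left(1 - \hat{y}\right)\hat{y}, \qquad \frac{\partial r}{\partial p} = \begin{bmatrix} x^T & 1 \end{bmatrix}$$
• Backpropagation algorithm
$$p_{k+1} = p_k - \eta\left(\frac{\partial J}{\partial p}\right)_k^T = p_k - \eta\,\varepsilon_k\left(\frac{d\hat{y}}{dr}\right)_k \begin{bmatrix} x_k \\ 1 \end{bmatrix}$$
or
$$\begin{bmatrix} w \\ b \end{bmatrix}_{k+1} = \begin{bmatrix} w \\ b \end{bmatrix}_k - \eta\left(\hat{y}_k - y_T\right)\left(1 - \hat{y}_k\right)\hat{y}_k \begin{bmatrix} x_k \\ 1 \end{bmatrix}$$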
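The update for a single sigmoid neuron can be exercised numerically. The sketch below is a minimal illustration under stated assumptions: a logistic sigmoid (consistent with the derivative (1 − ŷ)ŷ), one fixed training pair, and a learning rate of 0.5. The names `neuron` and `backprop_step` and all numbers are hypothetical.

```python
import math


def neuron(w, b, x):
    """Logistic sigmoid neuron: y_hat = 1 / (1 + exp(-r)), r = w^T x + b."""
    r = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-r))


def backprop_step(w, b, x, y_target, eta):
    """One gradient step on J = 1/2 (y_hat - y_target)^2."""
    y_hat = neuron(w, b, x)
    delta = (y_hat - y_target) * (1.0 - y_hat) * y_hat   # dJ/dr
    w_new = [wi - eta * delta * xi for wi, xi in zip(w, x)]  # dr/dw = x
    b_new = b - eta * delta                                   # dr/db = 1
    return w_new, b_new


# Illustrative training pair and learning rate.
w, b = [0.0, 0.0], 0.0
x, y_target = [1.0, 2.0], 0.9
j_start = 0.5 * (neuron(w, b, x) - y_target) ** 2
for _ in range(2000):
    w, b = backprop_step(w, b, x, y_target, eta=0.5)
y_final = neuron(w, b, x)
j_end = 0.5 * (y_final - y_target) ** 2
```

Repeating the step drives the cost toward zero for this single pair; with several training pairs the same step would be applied to each pair in turn.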
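Backpropagation Training of a Sigmoid Network
• Training error and cost function
$$\varepsilon = \hat{y} - y_T, \qquad J = \frac{1}{2}\varepsilon^T\varepsilon = \frac{1}{2}\left(\hat{y} - y_T\right)^T\left(\hat{y} - y_T\right) = \frac{1}{2}\left(\hat{y}^T\hat{y} - 2\hat{y}^T y_T + y_T^T y_T\right)$$
• Neuron parameters
$$p_{1,2} = \begin{bmatrix} \mathrm{Vec}(W) \\ b \end{bmatrix}_{1,2} = \begin{bmatrix} p_1 \\ p_2 \\ \vdots \\ p_{n+1} \end{bmatrix}_{1,2}$$
• Backpropagation algorithm
$$p_{1,2_{k+1}} = p_{1,2_k} - \eta\left(\frac{\partial J}{\partial p_{1,2}}\right)_k^T$$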
Adaptation of Action (Control) and Critic (Optimizing) Networks
Train action network, at time t, holding the critic parameters fixed
Train critic network, at time t, holding the action parameters fixed
Effect of Adaptive Critic in Steep Turn
• 70-deg banking turn
– Outside normal flight envelope of jet transport
– Pre-trained neural network ignores longitudinal-lateral-directional coupling
[Plot: uncoupled control versus adaptive critic control]
Effect of Adaptive Critic with Control Failures
• 50% thrust reduction
• 15-deg rudder jam
Effect of Adaptive Critic with System Parameter Variations
• 50% reduction in control effectiveness
• 20% reduction in longitudinal stability
• 30% reduction in directional stability
III. Intelligent Guidance for Headway and Lane Control (IGHLC)
Movie from California PATH, 1997
Typical Equipage for IGHLC (or Automatic Chauffeuring)
Illustration from Ohio State University, 1997
Functions of an Automatic Chauffeur
• Control logic mimics functions of the brain’s cerebrum
The Cerebrum
• Language and communication
• Movement
• Olfaction
• Memory
• Emotion
• Cerebrum integrates declarative thought and action
An Expert System for Guidance and Control
• Automated inference or reasoning
• Subject-specific rules and data
• Knowledge representation and acquisition
• Higher-order control of side effects
• Explanation
• User interface
Expert System for Highway Driving
Functions of the IGHLC Expert System
• Top Level Executive
– Guide other functions to determine controller parameters
• Situation Assessment
– Determine if current situation is safe or unsafe
– Invoke normal or emergency expert
• Normal Expert
– Select option and issue command that is safe and satisfies driver’s goal
• Emergency Expert
– Select option and issue command that is safe
• Projected Action
– Assess outcome of guidance command
• Lane-Change Indications
– Identify desirable lane-change option
• Default Strategies
– Backup driver-selected values
Normal Expert System Flow
• Identify Own Vehicle’s
– Speed goal
– Lane goal
– Aggressiveness factor
– Security factor
• Worst-Plausible-Case Decision-Making (WPCDM)
– Probabilistic evaluation of current state and uncertainty of Own Vehicle
  • Known characteristics of Own Vehicle
– Probabilistic evaluation of current state and uncertainty of all neighboring vehicles
  • Distinct plausible strategies and corresponding control actions of neighboring vehicles
  • Worst plausible strategy and hazard function identified for each vehicle
IGHLC Rules, Parameters, and Structure
• Elements of a Rule
– Type, Name, and Status
– Parameters tested by rule
– Parameters set by rule
– Premise: Logical statement of proposition or predicates
– Action: Logical consequence of premise being true
– Description of premise and action (for explanation)
• Elements of a Parameter
– Type, Name, and Current value
– Rules that test the parameter
– Rules that set the parameter
– Allowable values of the parameter
– Description of parameter (for explanation)
Size of the IGHLC Knowledge Base
Lane Change Plausibility Scores
• Plausibility function reflects likely lateral action for each vehicle
• Large negative value effectively rules out the option
$$P_{ik} = \sum_{j=1}^{\text{Observation Number}} \left(\text{Score Increment}_{jk}\right)_i, \qquad i = 1, \ldots, \text{Number of Vehicles}, \qquad k = \text{Left, Same, Right}$$
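For one vehicle, the plausibility score of each lateral option is a running sum of score increments over the observations. A minimal sketch of that accumulation follows; the increment values, the dictionary layout, and the function name `plausibility_scores` are hypothetical, and the rules that generate increments in IGHLC are not reproduced here.

```python
OPTIONS = ("Left", "Same", "Right")


def plausibility_scores(increments):
    """Sum the score increments per option over all observations of one vehicle."""
    scores = {k: 0.0 for k in OPTIONS}
    for observation in increments:
        for k in OPTIONS:
            scores[k] += observation.get(k, 0.0)  # missing option adds nothing
    return scores


# Hypothetical increments for a single neighboring vehicle.
observations = [
    {"Left": 1.0, "Same": 0.5, "Right": -100.0},  # large negative rules out "Right"
    {"Left": 0.5, "Same": 1.0},
    {"Left": 1.0, "Same": 0.0, "Right": 0.5},
]
scores = plausibility_scores(observations)
most_plausible = max(scores, key=scores.get)
```

A single large negative increment keeps the total for that option far below the others, which is how an option is effectively ruled out.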
IGHLC (All Cars) – Low Uncertainty
[Diagrams: positions of Own vehicle and vehicles A, B, and C at t = 0 s, t = 4 s, and t = 8.4 s]
Vehicle                        Own     A       B       C
Initial Lane                   2       3       1       1
Distance, ft                   0       10      30      130
Velocity, ft/s                 90      100     70      65
Maximum Acceleration, ft/s2    10      10      10      10
Maximum Deceleration, ft/s2    10      10      10      10
Aggressiveness Factor          0.5     0.5     0.5     0.5
Security Factor                0.5     0.5     0.5     0.5
Desired Separation Time, s     2       2       2       2
Desired Velocity, ft/s         100     100     100     65
Vehicle Length, ft             13.44   13.44   13.44   13.44

Vehicle                        A       B       C
Distance Std. Dev, ft          0.5     1       3
Velocity Std. Dev, ft/s        1       1       3

Couple          Sep. Time / Desired Sep. Time   Req. Decel. / Max. Decel.   Safety
Worst B/C       0.56                            0.09                        Safe
Best Own/B      0.1                             0.92                        Emergency
Best Own/A      -0.01                           0                           Collision
IGHLC (All Cars) – High Uncertainty
[Diagrams: positions of Own vehicle and vehicles A, B, and C at t = 0 s, t = 0.2 s, and t = 2.6 s]
Vehicle                        A       B       C
Distance Std. Dev, ft          0.5     1       3
Velocity Std. Dev, ft/s        1       1       15

Couple          Sep. Time / Desired Sep. Time   Req. Decel. / Max. Decel.   Cost    Safety
Worst B/C       0.56                            0.63                        1.07    Emergency
Best Own/B      0.1                             0.92                        1.82    Emergency
Best Own/A      -0.01                           0                           0.32    Safe
Conclusions and Future Work
• Adaptation
• Vehicle characteristics
• Driver preferences
• Data transfer and communication
• Panel displays
• Cellular traffic management
• Neighboring vehicles
• Current state
• Intent
• Biological and Cognitive Models
• Cerebellar Model Articulation Controller
• Simple, adaptive control structure
• Adaptive Critic Neural Control
• Algebraic pre-training for initialization
• On-line optimization
• Failure tolerance
• Rule-Based Expert System for Control
• Declarative decision making
• Significance of probabilistic approach
Acknowledgments
• Intelligent Vehicle/Highway Systems
– Timothy Chao, ’94
– Alexander Maravas, ’94
• Control of a Fuel-Cell Preferential Oxidizer
– Laura Iwan, *97
• Adaptive Critic Neural Control of an Aircraft
– Silvia Ferrari, *02
• Intelligent Guidance for Headway and Lane Control
– Axel Niehaus, *95
Addendum: DARPA Grand Challenge, Oct 6, 2005
• 132-mile course through the desert
• Winner: Stanford racing team
• Winning time: 6 hr, 54 min
Princeton’s 2005 Entry: Prospect Eleven
• Alternate semi-finalist
• 10th seed in National Qualifying Event
• 10/8/2005: Disabled at 9.6 miles, farther than any entrant in the 2004 Grand Challenge
• Bug in one line of code
• Post-Grand Challenge
– Bug in code fixed
– Successfully navigated 2005 course, with manual diversions for mud and new ruts
– Successfully navigated 2004 course, with manual diversions for mud and new ruts
• Team of undergraduates advised by Prof. Alain Kornhauser
Princeton Autonomous Vehicle Engineering Program
The Princeton Team:
40 undergraduates
8 faculty advisors
http://pave.princeton.edu/main/team/
• DARPA Urban Challenge, 2007, stresses the difficulties of making an autonomous vehicle drive within a complex urban network, including loss of GPS coverage, intersections, lane changing, merging, and parking, while obeying traffic laws.