Selection of Behavioral Parameters: Integration of Case-Based Reasoning
with Learning Momentum
Brian Lee, Maxim Likhachev, and Ronald C. Arkin
Mobile Robot Laboratory, Georgia Tech, Atlanta, GA
This research was funded under the DARPA MARS program.
Integrated Multi-layered Learning

THE LEARNING CONTINUUM:

Deliberative (pre-mission)
• CBR Wizardry – Guide the operator
• Probabilistic Planning – Manage complexity for the operator

Behavioral switching
• RL for Behavioral Assemblage Selection – Learn what works for the robot
• CBR for Behavior Transitions – Adapt to situations the robot can recognize

Reactive (online adaptation)
• Learning Momentum – Vary robot parameters in real time
Motivation
• It’s hard to manually derive behavioral controller parameters.
 – The parameter space grows exponentially with the number of parameters.
• You don’t always have a priori knowledge of the environment.
 – Without prior knowledge, a user can’t confidently derive appropriate parameter values, so the robot must adapt on its own to what it finds.
• Obstacle densities and layout in the environment may be heterogeneous.
 – Parameters that work well for one type of environment may not work well for another.
• A solution is to provide adaptability to the system while remaining fully reactive.
Context for Case-based Reasoning (CBR)
• Spatial and temporal features are used to select stored cases from a case library.
• Cases contain parameters for a behavior-based reactive controller.
• Selected parameters are adapted for the current situation.
• The controller is updated with new parameters that should be more appropriate to the current environment.
CBR Module

Sensors → Feature Identification → Spatial Feature Matching → Temporal Feature Matching → Random Selection Process (drawing on the Case Library) → Case Switching Decision → Case Adaptation → Case Application
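The pipeline above can be sketched in code. This is a minimal, hypothetical illustration: the slides do not specify the feature vectors, distance metrics, or case format, so every name, threshold, and parameter here is an assumption, not the authors' implementation.

```python
# Hypothetical sketch of the CBR pipeline: match cases spatially, then
# temporally, pick randomly among near-ties, and adapt the stored parameters.
import random

class Case:
    def __init__(self, name, spatial, temporal, params):
        self.name = name          # label, e.g. "dense obstacles"
        self.spatial = spatial    # spatial feature vector (assumed form)
        self.temporal = temporal  # temporal feature vector (assumed form)
        self.params = params      # behavioral parameters stored in the case

def distance(a, b):
    # Euclidean distance between feature vectors.
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def select_case(library, spatial, temporal, k=2):
    # Spatial feature matching: keep the k spatially closest cases.
    by_spatial = sorted(library, key=lambda c: distance(c.spatial, spatial))[:k]
    # Temporal feature matching: order survivors by temporal similarity.
    by_temporal = sorted(by_spatial, key=lambda c: distance(c.temporal, temporal))
    # Random selection among near-ties avoids oscillating between cases.
    best = distance(by_temporal[0].temporal, temporal)
    ties = [c for c in by_temporal if distance(c.temporal, temporal) <= best + 0.1]
    return random.choice(ties)

def adapt(case, progress):
    # Case adaptation: fine-tune stored parameters to the current situation.
    params = dict(case.params)
    if progress < 0.2:                       # little progress toward the goal
        params["Noise_Gain"] = params.get("Noise_Gain", 0.1) * 2
    return params

library = [
    Case("open field",      [0.1, 0.9], [0.8], {"MoveToGoal_Gain": 1.0, "Noise_Gain": 0.1}),
    Case("dense obstacles", [0.8, 0.2], [0.3], {"MoveToGoal_Gain": 0.5, "Noise_Gain": 0.6}),
]
chosen = select_case(library, spatial=[0.7, 0.3], temporal=[0.2])
controller_params = adapt(chosen, progress=0.1)  # case application
```

The two-stage matching mirrors the diagram: spatial features narrow the candidates first, temporal features break the remaining ties.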
Context for Learning Momentum (LM)
• A crude form of reinforcement learning.
 – If the robot is doing well, keep doing what it’s doing a little more; otherwise, try something different.
• Behavior parameters are continually changed in response to progress and obstacles.
• Static rules for pre-defined situations are used to update behavior parameters.
• Different sets of rules for parameter changes can be used (ballooning versus squeezing).
LM Strategies
• Ballooning
 – Alter parameters so the robot reacts to obstacles at larger distances than normal, pushing it out of box-canyon situations.
• Squeezing
 – Alter parameters so the robot reacts to obstacles only at shorter distances than normal, so it can move between closely spaced obstacles.
• Example ballooning rule:
    if ( situation == NO_PROGRESS_WITH_OBSTACLES )
        obstacle_sphere_of_influence += 0.5 meters
    else
        obstacle_sphere_of_influence -= 0.5 meters
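The rule above can be made executable. The 0.5 m step and the situation name follow the slide; the bounds on the sphere of influence are assumed for illustration.

```python
# Executable version of the ballooning rule (step size from the slide;
# the MIN/MAX clamp values are assumptions, not from the original work).
NO_PROGRESS_WITH_OBSTACLES = "no_progress_with_obstacles"
MAKING_PROGRESS = "making_progress"

MIN_SPHERE, MAX_SPHERE = 0.5, 5.0   # assumed bounds, in meters

def balloon_step(sphere_of_influence, situation):
    """One learning-momentum update of the avoid-obstacle sphere of influence."""
    if situation == NO_PROGRESS_WITH_OBSTACLES:
        sphere_of_influence += 0.5   # react to obstacles farther away
    else:
        sphere_of_influence -= 0.5   # relax back toward normal
    return min(MAX_SPHERE, max(MIN_SPHERE, sphere_of_influence))

soi = 1.5
soi = balloon_step(soi, NO_PROGRESS_WITH_OBSTACLES)   # -> 2.0
soi = balloon_step(soi, MAKING_PROGRESS)              # -> 1.5
```

Clamping keeps the ballooned radius from growing without bound when the robot stays stuck for many cycles.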
LM Module

Sensors → Short Sensor History → Situation Matching → Parameter Deltas → Parameter Adaptation (old parameters in, adapted parameters out) → Behavioral Parameters
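The situation-matching step above can be sketched as follows. The history length, thresholds, and situation labels are assumptions for illustration; the slides only state that a short sensor history is matched against pre-defined situations.

```python
# Hypothetical sketch of LM situation matching: classify a short window of
# recent sensor readings into one of a few pre-defined situations.
from collections import deque

HISTORY_LEN = 10    # assumed window length

def classify(history):
    """history: samples of (progress_toward_goal, obstacles_nearby)."""
    progress = sum(p for p, _ in history) / len(history)
    obstacles = any(o for _, o in history)
    if progress > 0.1:
        return "PROGRESS"
    return "NO_PROGRESS_WITH_OBSTACLES" if obstacles else "NO_PROGRESS_NO_OBSTACLES"

history = deque(maxlen=HISTORY_LEN)
for _ in range(HISTORY_LEN):
    history.append((0.0, True))      # robot stuck next to obstacles
situation = classify(history)        # -> "NO_PROGRESS_WITH_OBSTACLES"
```

The matched situation then indexes into the table of parameter deltas (as in the ballooning rule earlier).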
Effects of CBR and LM When Used Separately
• Reported in ICRA 2001
• Effects of CBR
 – Distances traversed were shorter
 – Time taken was shorter
• Effects of LM
 – Completion rates were much higher for dense obstacles
 – Completion times were higher than those for successful non-adaptive robots
Why Integrate?
• Want discontinuous switching plus continuous searching in the parameter space.
• CBR is not continuous
 – Parameter changes are triggered by environment changes or case time-outs.
 – The case library is manually built to provide only ballpark solutions for different environment types.
• LM does not make large, discontinuous changes
 – LM may take a while to adapt to large environmental changes.
• LM cannot change strategies at run time
 – The LM strategies of ballooning and squeezing are tuned for different environments.
Currently Used Behaviors
• Move to Goal
 – Always returns a vector pointing toward the goal position.
• Avoid Obstacles
 – Returns a sum of weighted vectors pointing away from obstacles.
• Wander
 – Returns vectors pointing in random directions.
• Bias Move
 – Returns a vector biasing the robot’s movement in a certain direction (e.g., away from high obstacle densities); set by the CBR module.
 – Only used when CBR is present.
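The behaviors above compose by weighted vector summation, the standard blending scheme in behavior-based control. This sketch assumes 2-D point obstacles and made-up gains; the actual behavior implementations are not given in the slides.

```python
# Sketch of behavior blending: each behavior returns a vector, and the
# controller sums them weighted by their gains (all numeric values assumed).
import math, random

def move_to_goal(pos, goal):
    dx, dy = goal[0] - pos[0], goal[1] - pos[1]
    n = math.hypot(dx, dy) or 1.0
    return (dx / n, dy / n)                      # unit vector toward the goal

def avoid_obstacles(pos, obstacles, sphere=2.0):
    vx = vy = 0.0
    for ox, oy in obstacles:
        dx, dy = pos[0] - ox, pos[1] - oy
        d = math.hypot(dx, dy)
        if 0 < d < sphere:                       # only inside the sphere of influence
            w = (sphere - d) / sphere            # closer obstacles push harder
            vx += w * dx / d
            vy += w * dy / d
    return (vx, vy)

def wander():
    a = random.uniform(0, 2 * math.pi)
    return (math.cos(a), math.sin(a))            # random unit vector

def blend(pos, goal, obstacles, g_goal=1.0, g_avoid=1.5, g_wander=0.0):
    vecs = [(g_goal, move_to_goal(pos, goal)),
            (g_avoid, avoid_obstacles(pos, obstacles)),
            (g_wander, wander())]
    return (sum(g * v[0] for g, v in vecs),
            sum(g * v[1] for g, v in vecs))

# An obstacle just right of the goal line deflects the robot downward.
v = blend(pos=(0, 0), goal=(10, 0), obstacles=[(1, 0.5)])
```

This is exactly why the gains and the sphere of influence are worth adapting: they set the relative pull of each behavior on the final motion vector.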
Adjustable Behavioral Parameters
• Move to Goal vector gain
• Avoid Obstacles vector gain
• Avoid Obstacles sphere of influence
 – Radius around the robot inside which obstacles are reacted to
• Wander vector gain
• Wander persistence
 – The number of consecutive steps the wander vector points in the same direction
• Bias Move vector gain
• Bias Move X, Bias Move Y
 – The components of the vector returned by Bias Move
Integration

Base System: Sensors → Core Behavior-Based Controller (with Behavioral Parameters) → Actuators

Addition of CBR Module: the CBR Module reads the Sensors and writes Updated Parameters into the controller’s Behavioral Parameters.

Addition of LM Module: the CBR Module sends Updated Deltas and Parameter Bounds to the LM Module, which writes Updated Parameters into the controller’s Behavioral Parameters.
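The integrated data flow can be sketched as a control loop: CBR supplies coarse parameters plus the step sizes and bounds for LM, and LM nudges the parameters every cycle within that envelope. All names and numbers here are illustrative assumptions; the real system's update rules are richer than this.

```python
# Hypothetical sketch of the integrated loop: CBR sets the envelope
# (initial parameters, per-parameter deltas, bounds); LM adjusts each cycle.
def control_cycle(params, deltas, bounds, making_progress):
    """One LM cycle within the envelope set by the current CBR case."""
    new = {}
    for name, value in params.items():
        # Crude reinforcement: push further while progressing, back off otherwise.
        step = deltas[name] if making_progress else -deltas[name]
        lo, hi = bounds[name]
        new[name] = min(hi, max(lo, value + step))
    return new

# Example CBR output for the current case (values assumed).
params = {"MoveToGoal_Gain": 1.0, "Sphere_Of_Influence": 2.0}
deltas = {"MoveToGoal_Gain": 0.05, "Sphere_Of_Influence": 0.25}
bounds = {"MoveToGoal_Gain": (0.2, 2.0), "Sphere_Of_Influence": (0.5, 5.0)}

params = control_cycle(params, deltas, bounds, making_progress=True)
```

When the environment changes enough for CBR to switch cases, `params`, `deltas`, and `bounds` are all replaced at once, giving the large discontinuous jump that LM alone cannot make.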
Simulation Setup
• Heterogeneous environments
 – varying obstacle density, order, and size
 – 350 x 350 meters
• Homogeneous environments
 – even obstacle distribution
 – random obstacle placement and size
 – two environments at 15% density and two at 20% density
 – 150 x 150 meters
CBR-LM in Simulation

Simulation Results
For a Heterogeneous Environment
[Chart: Completed Runs (percent complete, 0-100%) by adaptation algorithm: non-adaptive, LM, CBR, CBR-LM]
Simulation Results
For a Heterogeneous Environment
[Chart: Average Steps to Completion (0-4000 steps) by adaptation algorithm: non-adaptive, LM, CBR, CBR-LM]
Simulation Results
For a Homogeneous Environment
[Chart: Completed Runs (percent complete, 0-100%) at 15% and 20% obstacle coverage, for non-adaptive, LM, CBR, and CBR-LM]
Simulation Results
For a Homogeneous Environment
[Chart: Average Steps to Completion (0-14000 steps) at 15% and 20% obstacle coverage, for non-adaptive, LM, CBR, and CBR-LM]
Simulation Observations
• Beneficial attributes of CBR are preserved.
 – We see quick, radical changes in behavior.
 – Time taken is about the same as CBR alone.
• Beneficial attributes of LM are not always apparent.
 – Results can probably be attributed to a well-tuned case library.
 – If the case library is good enough, LM should not be needed.
Physical Robot Experiments

• RWI ATRV-Jr robot
• Forward and rear SICK LMS laser scanners
• Odometry, compass, and gyroscope for localization
• Straight-line start-to-goal distance of about 46 meters
• Outdoor environment with trees and man-made obstacles
• CBR-LM, CBR, LM, and non-adaptive systems were compared
• The squeezing strategy was used in the LM-only experiments
• Data was averaged over 10 runs per adaptation algorithm
Outdoor Run
Physical Experiments Results
• All valid runs were able to reach the goal.
• Both CBR and LM beat the non-adaptive system.
• The CBR-LM integrated system gave the best performance.
[Chart: Average Steps to Completion (0-1400 steps) on the physical robot, by adaptation algorithm: non-adaptive, LM, CBR, CBR-LM]
Difference From Simulation
• CBR-LM outperformed CBR on the physical robot by a wider margin than in simulation.
 – The case library for the real robot may not have been as well tuned as the simulation library.
Time Improvement of CBR-LM Over CBR
[Chart: percent time improvement (axis -5% to 35%) for the Heterogeneous Env., Homogeneous Env. (15% coverage), Homogeneous Env. (20% coverage), and the Physical Robot]
Conclusions
• A performance increase is not guaranteed.
 – With a well-tuned case library, there may be little for LM to do.
• Integration of CBR and LM can yield a performance increase.
 – Up to a 29% improvement in steps over CBR was observed.
• Benefits of LM are more likely to be apparent when the CBR case library is not well tuned (which is likely for real robots).
• LM could be used to dynamically update the case library with better sets of parameters.