Evolving the goal priorities of autonomous agents
Adam Campbell*
Advisor: Dr. Annie S. Wu*
Collaborator: Dr. Randall Shumaker**
School of Electrical Engineering and Computer Science*
Institute for Simulation and Training**

Page 1: Title slide

Page 2:

Goal
- Develop a controller for a team of collaborating autonomous vehicles
  - Simple implementation
  - Allows new goals (behaviors) to be added easily
  - We would like to add social interactions between the agents in the future
- Evolve the parameters of this controller to determine how the goal weights correlate with different environments
- These simple tests will give us a better idea of how the goals interact with one another
- Having the goal priorities evolve will make it easier to hand-code the parameters for future experiments

Page 3:

Motivation
- Prioritizing conflicting, parallel goals in a robot controller is a difficult, open problem in artificial intelligence, known as action selection
- This research examines an evolutionary approach to the action selection problem
- Imagine an insect with two goals: get food and avoid the predator. When should the "get food" goal have a higher priority than the "avoid predator" goal?
- Two general methods:
  - Take one action
  - Combine actions (used in this research)

Page 4:

Genetic algorithm
- Survival of the fittest among problem solutions
- General algorithm:
  1) Initialize a random population
  2) Evaluate the population
  3) Select individuals
  4) Recombine/mutate the selected individuals
  5) If the stopping condition is not satisfied, go to step 2
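The loop above can be sketched in Python. This is a minimal illustrative version, not the implementation from the talk: the bitstring representation, tournament selection, and the toy fitness function are all assumptions made for the example.

```python
import random

random.seed(0)  # reproducible demo run

def run_ga(fitness, length=8, pop_size=20, generations=50,
           crossover_rate=0.9, mutation_rate=0.01):
    """Minimal generational GA over bitstrings (illustrative sketch)."""
    # 1) Initialize a random population.
    pop = [[random.randint(0, 1) for _ in range(length)]
           for _ in range(pop_size)]
    for _ in range(generations):
        # 2) Evaluate the population.
        scores = [fitness(ind) for ind in pop]

        # 3) Select individuals (binary tournament, an assumed scheme).
        def select():
            a, b = random.randrange(pop_size), random.randrange(pop_size)
            return pop[a] if scores[a] >= scores[b] else pop[b]

        # 4) Recombine/mutate the selected individuals.
        next_pop = []
        while len(next_pop) < pop_size:
            p1, p2 = select()[:], select()[:]
            if random.random() < crossover_rate:
                cut = random.randrange(1, length)  # one-point crossover
                p1, p2 = p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]
            next_pop.append([b ^ (random.random() < mutation_rate) for b in p1])
            next_pop.append([b ^ (random.random() < mutation_rate) for b in p2])
        pop = next_pop[:pop_size]
        # 5) The stopping condition here is simply a fixed generation budget.
    return max(pop, key=fitness)

# Toy usage: maximize the number of 1-bits in the string.
best = run_ga(fitness=sum)
```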

Page 5:

Genetic algorithm example

Problem: find all black squares

[Figure: a random population of six candidate solutions with fitness values 4, 3, 3, 4, 5, and 2.]

Page 6:

GA example continued

[Figure: the selected population after crossover and mutation, with fitness values 5, 4, 2, 6, 4, and 5; the legend marks the crossover points.]
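The toy problem on these slides can be made concrete in a few lines. Treat each candidate as a bitstring over the grid cells and score it by how many cells match a target pattern; the 6-cell target below is a made-up stand-in for the grid in the figure, and the fitness definition is an assumption.

```python
# Hypothetical target pattern: 1 marks a black square in a 6-cell grid.
TARGET = [1, 0, 1, 1, 0, 1]

def fitness(candidate):
    """Score = number of cells where the candidate matches the target."""
    return sum(c == t for c, t in zip(candidate, TARGET))

def one_point_crossover(p1, p2, cut):
    """Swap the tails of two parents at the crossover point."""
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]

c1, c2 = one_point_crossover([1, 1, 1, 0, 0, 0], [0, 0, 0, 1, 1, 1], cut=3)
# c1 == [1, 1, 1, 1, 1, 1] with fitness 4; c2 == [0, 0, 0, 0, 0, 0] with fitness 2
```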

Page 7:

How is the GA used?
- Immediate goal functions
  - Each produces a vector indicating where the agent should move in order to best satisfy the goal
  - Each immediate goal has a weight associated with it
- Five immediate goal functions:
  - Avoid agent
  - Avoid obstacle
  - Momentum
  - Go to area of interest (AOI)
  - Follow obstacle
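One way to read this controller: each immediate goal function returns a desired movement vector, and the agent moves along the weighted sum of those vectors. The sketch below assumes 2-D vectors; the weight values are invented, and the goal function bodies are placeholders that only mirror the five names above.

```python
import math

# Evolved weights, one per immediate goal function (values are illustrative).
weights = {
    "avoid_agent":     0.8,
    "avoid_obstacle":  0.9,
    "momentum":        0.3,
    "go_to_aoi":       0.6,
    "follow_obstacle": 0.4,
}

# Placeholder goal functions: each returns a preferred (dx, dy) direction.
goal_functions = {
    "avoid_agent":     lambda state: (-1.0, 0.0),
    "avoid_obstacle":  lambda state: (0.0, 1.0),
    "momentum":        lambda state: state["heading"],
    "go_to_aoi":       lambda state: (1.0, 0.0),
    "follow_obstacle": lambda state: (0.0, -1.0),
}

def combined_move(state):
    """Weighted linear combination of the goal vectors, then normalized."""
    dx = sum(weights[g] * goal_functions[g](state)[0] for g in weights)
    dy = sum(weights[g] * goal_functions[g](state)[1] for g in weights)
    norm = math.hypot(dx, dy) or 1.0  # avoid division by zero
    return (dx / norm, dy / norm)

move = combined_move({"heading": (1.0, 0.0)})
```

Because the combination is linear, adding a sixth goal is just one more entry in each dict, which matches the claim that new behaviors are easy to add.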

Page 8:

Additional parameters
- Randomness
- Comfort: allows obstacle following to occur

[Figure: agent behavior for parameter values 0.00, 0.01, and 0.04.]

Page 9:

Parameters

Simulation parameters:
  Test cases           2
  Simulation ticks     10000
  Agents               25
  Runs per test case   30
  Weight range         [0.0, 1.0]

Genetic algorithm parameters:
  Population size              50
  Generations                  50
  Crossover rate               0.9
  Mutation rate (per weight)   0.005
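For reference, the two parameter tables map directly onto a configuration record. This dict simply restates the table; the key names are assumed, not taken from the original code.

```python
# Experiment configuration, restating the parameter tables above.
CONFIG = {
    # Simulation parameters
    "test_cases": 2,
    "simulation_ticks": 10_000,
    "agents": 25,
    "runs_per_test_case": 30,
    "weight_range": (0.0, 1.0),   # each evolved goal weight lies in this range
    # Genetic algorithm parameters
    "population_size": 50,
    "generations": 50,
    "crossover_rate": 0.9,
    "mutation_rate_per_weight": 0.005,
}
```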

Page 10:

Two scenarios

[Figure: Environment 1 and Environment 2.]

Page 11:

Average fitness
- Agents must survive and see as many AOIs as possible
- Not much difference in fitness between the two scenarios

Page 12:

Evolved parameters

Page 13:

Evolved agents in action

Page 14:

Summary and conclusion
- Discussed the action selection problem in artificial intelligence and presented an evolutionary approach to solving it
- The method combines the actions of multiple goals
- Tested the approach on simple problem scenarios; it performed well on both
- New behaviors (goals) can easily be added to the system
- The evolved parameters are specific to the environment they were learned in

Page 15:

Future work
- Social interactions between agents
  - Allow communication of data between agents
  - New immediate goal functions will be needed
- Allow agents to have more than one set of goal weights
  - Depending on the agent's state (hungry, low on fuel, in danger, etc.), use a different set of goal weights
- Other ways to combine the vectors from the immediate goal functions
  - Non-linear combination of vectors
  - Genetic programming (currently being worked on at George Mason University)
- Better test scenarios
  - Evolve parameters that generalize well to unseen environments
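The "more than one set of goal weights" idea could look like the following sketch, where the agent's current state selects which evolved weight set drives the vector combination. The state names, goal names, and numbers are invented for illustration.

```python
# Hypothetical evolved weight sets, indexed by agent state.
WEIGHTS_BY_STATE = {
    "normal":    {"avoid_obstacle": 0.7, "go_to_aoi": 0.8, "momentum": 0.3},
    "low_fuel":  {"avoid_obstacle": 0.9, "go_to_aoi": 0.1, "momentum": 0.2},
    "in_danger": {"avoid_obstacle": 1.0, "go_to_aoi": 0.0, "momentum": 0.1},
}

def goal_weights(agent_state):
    """Pick the weight set matching the agent's state, defaulting to normal."""
    return WEIGHTS_BY_STATE.get(agent_state, WEIGHTS_BY_STATE["normal"])

# An endangered agent stops pursuing AOIs entirely in this made-up setting.
danger_weights = goal_weights("in_danger")
```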

Page 16:

Related work

Action selection:
- M. Humphrys. Action selection in a hypothetical house robot: Using those RL numbers. In Proceedings of the First International ICSC Symposia on Intelligent Industrial Automation (IIA-96) and Soft Computing (SOCO-96), 1996.
- M. Humphrys. Action selection methods using reinforcement learning. In From Animals to Animats 4: Proceedings of the Fourth International Conference on Simulation of Adaptive Behavior, Cambridge, MA, pages 135-144. MIT Press, Bradford Books, 1996.

Robot control:
- R. C. Arkin. Motor schema based navigation for a mobile robot. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), Raleigh, NC, pages 264-271, May 1987.
- O. Buffet, A. Dutech, and F. Charpillet. Automatic generation of an agent's basic behaviors. In Proceedings of the 2nd International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS'03), 2003.
- J. Casper, M. Micire, and R. R. Murphy. Issues in intelligent robots for search and rescue. In Proceedings SPIE Volume 4024, Unmanned Ground Vehicle Technology II, pages 292-302, July 2000.
- S. Koenig and M. Likhachev. Improved fast replanning for robot navigation in unknown terrain. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), pages 968-975, 2002.
- J. Rosenblatt. DAMN: A distributed architecture for mobile navigation. In Proceedings of the 1995 AAAI Spring Symposium on Lessons Learned from Implemented Software Architectures for Physical Agents. AAAI Press, March 1995.
- S. P. Singh, T. Jaakkola, and M. I. Jordan. Learning without state-estimation in partially observable Markovian decision processes. In International Conference on Machine Learning, pages 284-292, 1994.