Learning through Interactive Behavior Specifications

Page 1: Learning through Interactive Behavior Specifications

1

Learning through Interactive Behavior Specifications

Tolga Konik, CSLI, Stanford University

Douglas Pearson, Three Penny Software

John Laird, University of Michigan

Page 2: Learning through Interactive Behavior Specifications

2

Goal

Automatically generate cognitive agents

  • Reduce the cost of agent development
  • Reduce the expertise required to develop agents

Page 3: Learning through Interactive Behavior Specifications

3

Domains

  • Autonomous cognitive agents
  • Dynamic virtual worlds
  • Real-time decisions based on knowledge and sensed data
  • Soar agent architecture

Page 4: Learning through Interactive Behavior Specifications

4

Learning by Observation

Approach:

  • Observe expert behavior
  • Learn to replicate it

Why?

  • We may want human-like agents
  • In complex domains, imitating humans may be easier than learning from scratch

Page 5: Learning through Interactive Behavior Specifications

5

Bottleneck in pure Learning by Observation

PROBLEM: You cannot observe the internal reasoning of the expert

SOLUTION:

  • Ask the expert for additional information: goal annotations
  • Use additional knowledge sources: task & domain knowledge

Page 6: Learning through Interactive Behavior Specifications

6

Learning by Observation

[Diagram. Components: Expert, Interface, Environment, Learner, Agent. Labeled flows: Actions, Percepts, Goal annotations, Additional Task Knowledge.]

Page 7: Learning through Interactive Behavior Specifications

7

Learning by Observation

[Diagram. Components: Agent, Interface, Environment.]

  • ILP 2004
  • Machine Learning Journal (forthcoming)

Page 8: Learning through Interactive Behavior Specifications

8

Learning by Observation: Critic Mode

[Diagram. Components: Agent, Interface, Environment, Expert (critic), Learner.]

Page 9: Learning through Interactive Behavior Specifications

9

One Body, Two Minds

Open questions:

  • How and when to switch control
  • How the expert and the agent program communicate

[Diagram. Components: Expert, Agent, Interface, Environment.]

Page 10: Learning through Interactive Behavior Specifications

10

Diagrammatic Behavior Specification

[Diagram. Components: Expert, Redux, Agent, Environment, Learner.]

Page 11: Learning through Interactive Behavior Specifications

11

Diagrammatic Behavior Specification

  • Redux
  • Visual rule editing

Page 12: Learning through Interactive Behavior Specifications

12

Goal Hierarchy

Task-performance knowledge is represented with a hierarchy of durative goals.

[Diagram: goal hierarchy. Goals shown: Get-item(Item), Get-item-in-room(Item), Get-item-different-room(Item), Go-to(Door), Go-through(Door). Other labels: Goto-next-room, Go-to-door(D).]

[Diagram: map with rooms r1 to r4, doors d1 to d6, and items i3 and i4.]
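A hierarchy of durative goals like the one on this slide could be represented as a small tree of parameterized goals. The sketch below is only an illustration: the `Goal` class, the `is_valid_path` helper, and the exact parent/child arrangement are assumptions reconstructed from the goal names on the slides, not the authors' Soar representation.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Goal:
    """A parameterized goal with optional subgoals (hypothetical structure)."""
    name: str
    params: List[str]
    subgoals: List["Goal"] = field(default_factory=list)

    def instantiate(self, bindings: Dict[str, str]) -> str:
        # Unbound parameters keep their variable name, e.g. "Door".
        args = ", ".join(bindings.get(p, p) for p in self.params)
        return f"{self.name}({args})"


def is_valid_path(path: List[Goal]) -> bool:
    """True if each goal in the path is a direct subgoal of the previous one."""
    return all(child in parent.subgoals for parent, child in zip(path, path[1:]))


# Assumed arrangement of the goals named on the slide.
go_to = Goal("Go-to", ["Door"])
go_through = Goal("Go-through", ["Door"])
in_room = Goal("Get-item-in-room", ["Item"])
different_room = Goal("Get-item-different-room", ["Item"], [go_to, go_through])
get_item = Goal("Get-item", ["Item"], [in_room, different_room])

# Active goal stack for the situation with Item=i3 and Door=d1.
path = [get_item, different_room, go_to]
bindings = {"Item": "i3", "Door": "d1"}
stack = [g.instantiate(bindings) for g in path]
print(" > ".join(stack))
```

Because the goals are durative, several of them are active at once; the printed stack lists the goals that would be active together, from the top-level goal down to the current subgoal.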

Page 13: Learning through Interactive Behavior Specifications

13

Goal Hierarchy

[Diagram: goal hierarchy with active goals Get-item(i3) and Get-item-in-room(i3); binding Item=i3. Other goals shown: Get-item-different-room(Item), Go-to(Door), Go-through(Door).]

[Diagram: map with rooms r1 to r4, doors d1 to d6, and items i3 and i4.]

Page 14: Learning through Interactive Behavior Specifications

14

Goal Hierarchy

[Diagram: goal hierarchy with active goals Get-item(i3), Get-item-different-room(i3), and Go-to(d1); bindings Item=i3, Door=d1. Other goals shown: Get-item-in-room(Item), Go-through(Door).]

[Diagram: map with rooms r1 to r4, doors d1 to d6, and items i3 and i4.]

Page 15: Learning through Interactive Behavior Specifications

15

Goal Hierarchy

[Diagram: goal hierarchy with active goals Get-item(i3), Get-item-different-room(i3), and Go-through(d1); binding Door=d1. Other goals shown: Get-item-in-room(Item), Go-to(Door).]

[Diagram: map with rooms r1 to r4, doors d1 to d6, and items i3 and i4.]

Page 16: Learning through Interactive Behavior Specifications

17

Behavior Specification

  • Expert draws the initial abstract situation
  • Expert creates a scenario by selecting actions

[Diagram labels: Agent, Expert]

Page 17: Learning through Interactive Behavior Specifications

18

Goal Specification

  • Goals are explicitly selected
  • The agent contributes based on the current situation, current goal, and its knowledge

[Diagram labels: Agent, Expert]

Page 18: Learning through Interactive Behavior Specifications

20

Goal Hierarchy

  • Learning by Observation perspective: unobservable mental reasoning of the expert
  • Learning perspective: biases the hypothesis space; the “learn agent” problem is reduced to “learn goal selection and termination”
  • Mixed-initiative (MI) perspective: information exchange between the expert and the agent

Page 19: Learning through Interactive Behavior Specifications

21

Relevant Knowledge Specification

  • Expert can mark important objects in a decision

[Diagram labels: Agent, Expert, Prepare food]

Page 20: Learning through Interactive Behavior Specifications

22

Rich Behavior Trace

  • Expert-specified undesired actions and goals
  • Expert-rejected actions and goals of the approximately learned agent program

[Diagram label: Watch TV]

Page 21: Learning through Interactive Behavior Specifications

23

Rich Behavior Trace

  • Hypothetical actions and goals
  • Situation history: a tree structure of possible behaviors
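A situation history that branches over possible behaviors can be pictured as a tree whose edges are goal or action choices, some accepted and some rejected by the expert. The following is a minimal sketch under that assumption; `TraceNode`, `examples`, and the situation labels `s0`, `s1`, `s1'` are hypothetical names, and treating "Watch TV" as the rejected choice is an illustrative guess based on the slide labels.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class TraceNode:
    situation: str                    # label of the situation at this node
    decision: Optional[str] = None    # goal or action that led to this node
    accepted: bool = True             # False for choices the expert rejected
    children: List["TraceNode"] = field(default_factory=list)

    def add(self, situation: str, decision: str, accepted: bool = True) -> "TraceNode":
        child = TraceNode(situation, decision, accepted)
        self.children.append(child)
        return child


def examples(node: TraceNode) -> List[Tuple[str, str, str]]:
    """Collect (situation, decision, label) triples from the whole tree."""
    out = []
    for child in node.children:
        label = "positive" if child.accepted else "negative"
        out.append((node.situation, child.decision, label))
        out.extend(examples(child))
    return out


root = TraceNode("s0")
root.add("s1", "Prepare food")               # behavior the expert selected
root.add("s1'", "Watch TV", accepted=False)  # hypothetical branch the expert rejected

for triple in examples(root):
    print(triple)
```

Keeping rejected branches in the same tree means a learner could read both positive and negative decision examples from one structure.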

Page 22: Learning through Interactive Behavior Specifications

24

Relational Learning by Observation

Input:

  • Relational situations
  • Goal and action selections and rejections
  • Additional annotations (i.e., important objects)
  • Background knowledge

Output:

  • Rule-based agent program

Method:

  • Learn goal/action selection/termination by generalizing over multiple examples
  • Inductive Logic Programming to combine rich knowledge structures

Page 23: Learning through Interactive Behavior Specifications

25

Relational Learning by Observation

Page 24: Learning through Interactive Behavior Specifications

26

Relational Learning by Observation

Find the common structures in the decision examples
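One standard way to find common structure in relational examples is anti-unification, also called least general generalization (LGG). The slides do not say which ILP algorithm is used, so the sketch below is a swapped-in textbook technique, not the authors' learner. The predicates (`in`, `door_in`, `leads_to`, `contains`, `wants`, `selected`) and the two example situations are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Atom = Tuple[str, Tuple[str, ...]]  # (predicate, arguments)


def lgg(facts_a: List[Atom], facts_b: List[Atom]) -> List[Atom]:
    """Least general generalization of two lists of ground facts.

    Differing constants are replaced by variables; the same pair of
    constants always maps to the same variable.
    """
    variables: Dict[Tuple[str, str], str] = {}

    def generalize(x: str, y: str) -> str:
        if x == y:
            return x
        return variables.setdefault((x, y), f"V{len(variables)}")

    result: List[Atom] = []
    for pred_a, args_a in facts_a:
        for pred_b, args_b in facts_b:
            if pred_a == pred_b and len(args_a) == len(args_b):
                atom = (pred_a, tuple(generalize(x, y) for x, y in zip(args_a, args_b)))
                if atom not in result:
                    result.append(atom)
    return result


# Two hypothetical decision examples in which the expert selected a door.
example_a = [
    ("in", ("agent", "r1")), ("door_in", ("d1", "r1")), ("leads_to", ("d1", "r2")),
    ("contains", ("r2", "i3")), ("wants", ("i3",)), ("selected", ("d1",)),
]
example_b = [
    ("in", ("agent", "r3")), ("door_in", ("d4", "r3")), ("leads_to", ("d4", "r4")),
    ("contains", ("r4", "i4")), ("wants", ("i4",)), ("selected", ("d4",)),
]

common = lgg(example_a, example_b)
for pred, args in common:
    print(f"{pred}({', '.join(args)})")
```

In this toy case the shared variables link the selected door to the current room and to the room that contains the wanted item. With several doors per situation the result would contain more atoms and would need pruning, which this sketch does not attempt.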

Page 25: Learning through Interactive Behavior Specifications

27

Relational Learning by Observation

“Select a door in the current room, which leads to a room that contains the item the agent wants to get”

Learn relations between what the agent wants, perceives, and knows.
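The quoted rule can be read as a query over relational facts about what the agent wants, perceives, and knows. The sketch below hand-codes that one rule in Python to show the relations involved; it is not the learned Soar rule, and the fact format, the predicate names, and the `candidate_doors` function are assumptions introduced here.

```python
from typing import List, Set, Tuple

Fact = Tuple[str, ...]  # (predicate, arg1, arg2, ...)


def candidate_doors(facts: Set[Fact]) -> List[str]:
    """Doors in the agent's current room that lead to a room containing a wanted item."""
    current_rooms = {f[2] for f in facts if f[0] == "in" and f[1] == "agent"}
    wanted_items = {f[1] for f in facts if f[0] == "wants"}

    doors = []
    for f in sorted(facts):
        if f[0] != "door_in" or f[2] not in current_rooms:
            continue
        door = f[1]
        next_rooms = {g[2] for g in facts if g[0] == "leads_to" and g[1] == door}
        if any(("contains", room, item) in facts
               for room in next_rooms for item in wanted_items):
            doors.append(door)
    return doors


# Hypothetical situation: the agent is in r1 and wants item i3, which is in r2.
facts = {
    ("in", "agent", "r1"),        # perceived
    ("door_in", "d1", "r1"),
    ("door_in", "d2", "r1"),
    ("leads_to", "d1", "r2"),     # known
    ("leads_to", "d2", "r3"),
    ("contains", "r2", "i3"),
    ("wants", "i3"),              # wanted
}

print(candidate_doors(facts))  # prints ['d1']
```

Only `d1` qualifies: `d2` is also in the current room, but it leads to a room that does not contain the wanted item.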

Page 26: Learning through Interactive Behavior Specifications

32

Summary

Diagrammatic behavior specification approach:

  • To extract rich behavior knowledge
  • Interactive behavior specification
  • Communication medium between the agents (explicit goals and assumed situation)

Relational learning by observation approach:

  • To combine multiple complex knowledge sources

Page 27: Learning through Interactive Behavior Specifications

33

Future Work

  • Improve mixed-initiative interaction of the interface
  • Explore domain-independent diagrammatic interface features
  • Allow the expert to enter context-sensitive knowledge

Page 28: Learning through Interactive Behavior Specifications

34

Mixed-initiative perspective

  • Interactive behavior specification
  • Diagrammatic representation of behavior: communication medium between the agents
  • Explicit goals and desired behavior: facilitates interaction between the agents