ISLE Transfer Learning Team: Main Technology Components

The ICARUS Architecture

[Architecture diagram: modules and their roles]

Perception: creates an internal description of the perceived environment
Perceptual Buffer: contains descriptions of the perceived objects
Long-Term Conceptual Memory: contains relational and hierarchical knowledge about relevant concepts
Conceptual Inference: generates beliefs using the observed environment and long-term conceptual knowledge
Short-Term Conceptual Memory: contains inferred beliefs about the environment
Short-Term Goal/Skill Memory: contains goals and intentions
Long-Term Skill Memory: contains hierarchical knowledge about executable skills
Skill Retrieval: selects relevant skills based on beliefs and goals
Skill Execution: executes skills in the environment
Problem Solving: finds novel solutions for achieving goals
Skill Learning: acquires skills from successful problem-solving traces
Motor Buffer
Environment

The Soar Architecture

[Architecture diagram labels: Body, Perception, Action, Short-Term Memory, Decision Procedure, Long-Term Memories (Procedural, Semantic, Episodic), Chunking, Reinforcement Learning, Semantic Learning, Episodic Learning]

The Companions Architecture

Markov Logic Networks

[Diagram labels: Markov Logic, Weighted Satisfiability, Markov Chain Monte Carlo, Inductive Logic Programming, Weight Learning]

[Other figure labels on this slide: Source Domain, Target Domain]
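The ICARUS module roles listed above suggest a perceive, infer, retrieve, execute cycle. The following is a minimal toy sketch of that data flow, not the ICARUS implementation: the class names, the set-based belief representation, and the door example are all assumptions made for illustration, and Problem Solving and Skill Learning are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Skill:
    name: str
    requires: frozenset  # beliefs that must hold before the skill applies
    achieves: str        # goal the skill is indexed under
    action: str          # primitive action placed in the motor buffer


@dataclass
class Agent:
    concepts: dict  # long-term conceptual memory: belief -> percepts it needs
    skills: list    # long-term skill memory
    goals: list     # short-term goal/skill memory
    beliefs: set = field(default_factory=set)         # short-term conceptual memory
    motor_buffer: list = field(default_factory=list)

    def infer(self, percepts):
        """Conceptual inference: derive beliefs from percepts and concepts."""
        self.beliefs = {
            belief for belief, needed in self.concepts.items() if needed <= percepts
        }

    def retrieve(self):
        """Skill retrieval: pick a skill for an unmet goal whose conditions hold."""
        for goal in self.goals:
            if goal in self.beliefs:
                continue  # goal already satisfied
            for skill in self.skills:
                if skill.achieves == goal and skill.requires <= self.beliefs:
                    return skill
        return None

    def cycle(self, percepts):
        """One cycle: fill the perceptual buffer, infer, retrieve, execute."""
        self.infer(set(percepts))
        skill = self.retrieve()
        if skill is not None:
            self.motor_buffer.append(skill.action)  # skill execution
        return skill


# Hypothetical example domain.
agent = Agent(
    concepts={"door-open": {"door", "gap"}, "inside": {"room"}},
    skills=[Skill("enter", frozenset({"door-open"}), "inside", "walk-forward")],
    goals=["inside"],
)
chosen = agent.cycle(["door", "gap"])
print(chosen.name, agent.motor_buffer)
```

In this sketch a cycle that finds no applicable skill simply returns `None`; that is the point where, according to the slide, Problem Solving would search for a novel solution and Skill Learning would store the result.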


Year 1: Transfer in Three Testbeds

1. A ball is released from rest from the top of a 200 m tall building on Earth and falls to the ground. If air resistance is negligible, which of the following is most nearly equal to the distance the ball falls during the first 4 s after it is released?

(A) 20 m ($gt/2$)
(B) 40 m ($gt$)
(C) 80 m ($gt^2/2$)
(D) 160 m ($gt^2$)

Concepts: One-dimensional kinematics with constant acceleration, distance-time relationship.

Rationale: From the distance-time relationship for constant acceleration, $x = v_0 t + \frac{1}{2} a t^2$, the initial speed $v_0$ is zero, so the distance is equal to $gt^2/2$, where $g$, the acceleration due to gravity, is approximately $10\ \mathrm{m/s^2}$.
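As a quick check of the rationale, the short sketch below evaluates the distance-time relationship with the values from the problem and compares the result with the four answer expressions. The function and variable names are illustrative choices, not part of the original item.

```python
G = 10.0  # m/s^2, the approximate value used in the rationale
T = 4.0   # s, elapsed time from the problem statement


def fall_distance(t, g=G, v0=0.0):
    """Distance under constant acceleration: x = v0*t + (1/2)*g*t**2."""
    return v0 * t + 0.5 * g * t ** 2


# The four answer expressions, evaluated at t = 4 s.
candidates = {
    "A": G * T / 2,
    "B": G * T,
    "C": G * T ** 2 / 2,
    "D": G * T ** 2,
}

distance = fall_distance(T)
answer = min(candidates, key=lambda k: abs(candidates[k] - distance))
print(distance, answer)  # 80.0 C
```

The computed 80 m is less than the 200 m building height, so the ball is still falling at 4 s and the free-fall formula applies over the whole interval.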

Urban Combat is a first-person shooter game that involves spatial reasoning and reactive control

General Game Playing covers a broad class of N-person games that involve strategic reasoning.

ETS Physics involves finding answers to physics problems through a mixture of plausible inference and search

[Figure: paired source and target scenarios; visible labels: "Crawl under", "Climb over"]

[Figure: grid of testbeds (Urban Combat, ETS Physics, GGP) against architectures (Soar, Companions, ICARUS)]

Our Year 1 efforts focused on Urban Combat (Soar and ICARUS), with some work on GGP (ICARUS) and ETS Physics (Companions).

Year 1 emphasizes lower levels (1 to 8) of transfer learning


ISLE Team: Year 2 Integration Plans

ICARUS (ISLE)

[Diagram: ICARUS architecture with modules Long-Term Conceptual Memory, Short-Term Conceptual Memory, Short-Term Goal/Skill Memory, Conceptual Inference, Skill Execution, Perception, Environment, Perceptual Buffer, Problem Solving, Skill Learning, Motor Buffer, Skill Retrieval, Long-Term Skill Memory]

Alchemy (Washington)

[Diagram labels: Markov Logic, Weighted Satisfiability, Markov Chain Monte Carlo, Inductive Logic Programming, Weight Learning]

Reinforcement Learning (UT Austin)

[Diagram: mappings β(A′) → A and γ(S′) → S between two tasks, one over S and A, the other over S′ and A′]

Hierarchical Task Networks (Maryland)

Cyc (Cycorp)

Year 2 integration will revolve around replacing existing ICARUS modules with software from other team members


Year 2: Deep Transfer in Three Testbeds

In Urban Combat, we will demonstrate transfer from urban military missions to fire rescue operations

In General Game Playing, we will show transfer across quite different games that use related strategies (e.g., forking moves)

[Figure: paired source and target scenarios; visible labels: "Crawl under", "Climb over"]

[Figure: grid of testbeds (Urban Combat, ETS Physics, GGP) against architectures (Soar, Companions, ICARUS)]

In Year 2, we will evaluate each pair of architectures on at least one shared testbed, and we will test them all on the General Game Playing testbed.

Year 2 will focus on higher levels (9 and 10) of transfer learning

In ETS Physics, we will show transfer from linear systems to rotational, thermal, hydraulic, and electrical systems



ISLE Team: Year 2 Evaluation Plans

Comparison among architectures that use different mechanisms should reveal which approaches best support transfer learning.

We will evaluate each pair of architectures on at least one shared testbed and will test them all on the General Game Playing testbed.

Experiments will examine how well the frameworks support transfer in settings that emphasize reactive control, conceptual inference, and heuristic search.

Year 2 evaluations will focus on high-level (9 and 10) transfer in each of the three testbeds.

[Figure: grid with axes Testbeds (Urban Combat, ETS Physics, GGP) and Architectures (Soar, Companions, ICARUS)]
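The requirement that each pair of architectures share at least one testbed can be checked mechanically. The sketch below uses the Year 1 coverage stated on the Year 1 slide as input; the `year2` assignment is an assumption (Year 1 coverage plus GGP for every architecture), since the slides do not give the full Year 2 assignment, and the function name is hypothetical.

```python
from itertools import combinations

# Year 1 coverage as stated on the Year 1 slide.
year1 = {
    "Soar": {"Urban Combat"},
    "ICARUS": {"Urban Combat", "GGP"},
    "Companions": {"ETS Physics"},
}


def unpaired(coverage):
    """Return architecture pairs that have no testbed in common."""
    return [
        (a, b)
        for a, b in combinations(sorted(coverage), 2)
        if not coverage[a] & coverage[b]
    ]


print(unpaired(year1))
# [('Companions', 'ICARUS'), ('Companions', 'Soar')]

# Assumed Year 2 assignment: every architecture also runs on GGP.
year2 = {arch: beds | {"GGP"} for arch, beds in year1.items()}
print(unpaired(year2))
# []
```

Under these assumptions, testing all three architectures on General Game Playing is already enough to give every pair a shared testbed.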