ITCS 3153 Artificial Intelligence
Lecture 2
Agents
Chess Article
Deep Blue (IBM)
• 418 processors, 200 million positions per second
Deep Junior (Israeli Co.)
• 8 processors, 3 million positions per second
Kasparov
• 100 billion neurons in brain, 2 moves per second
But there are 85 billion ways to play the first four moves
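A quick worked calculation using only the figures on this slide: how long would each player need just to look once at every way of playing the first four moves? This is a rough sketch. It assumes one "position" per opening line and treats Kasparov's "2 moves per second" as 2 positions per second, which is a simplification.

```python
# Figures quoted on the slide.
FIRST_FOUR_MOVES = 85_000_000_000  # ways to play the first four moves

# Evaluation rates in positions per second (Kasparov's rate is an assumption:
# the slide says "2 moves per second").
POSITIONS_PER_SECOND = {
    "Deep Blue": 200_000_000,
    "Deep Junior": 3_000_000,
    "Kasparov": 2,
}


def seconds_to_enumerate(rate_per_second: int) -> float:
    """Time needed to visit every four-move opening once at the given rate."""
    return FIRST_FOUR_MOVES / rate_per_second


for name, rate in POSITIONS_PER_SECOND.items():
    print(f"{name}: {seconds_to_enumerate(rate):,.0f} seconds")
```

Even at Deep Blue's rate, exhaustive enumeration of four moves takes minutes, so brute force alone cannot reach far into a game.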
Chess Article
Cognitive psychologists report chess is a game of pattern matching for humans
• But what patterns do we see?
• What rules do we use to evaluate perceived patterns?
What is an agent?
Perception
• Sensors receive input from environment
– Keyboard clicks
– Camera data
– Bump sensor
Action
• Actuators impact the environment
– Move a robotic arm
– Generate output for computer display
Perception
Percept
• Perceptual inputs at an instant
• May include perception of internal state
Percept Sequence
• Complete history of all prior percepts
Do you need a percept sequence to play Chess?
An agent as a function
Agent maps percept sequence to action
• Agent: f(ps) = a, with ps ∈ p* (ps: a percept sequence, a: an action)
– Set of all inputs known as state space
Agent Function
• If inputs are finite, a table can store mapping
• Scalable?
• Reverse Engineering?
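A minimal sketch of the table idea, assuming a toy two-square vacuum world. The percept format, the table entries, and the "NoOp" fallback are all hypothetical choices for illustration. The agent appends each percept to its history and looks the whole sequence up in the table.

```python
from typing import Dict, List, Tuple

Percept = Tuple[str, str]              # hypothetical percept: (location, status)
PerceptSequence = Tuple[Percept, ...]

# Hypothetical lookup table: one entry per percept sequence.
TABLE: Dict[PerceptSequence, str] = {
    (("Left", "Dirty"),): "Suck",
    (("Left", "Clean"),): "Right",
    (("Right", "Dirty"),): "Suck",
    (("Right", "Clean"),): "Left",
    (("Left", "Clean"), ("Right", "Dirty")): "Suck",
}


class TableDrivenAgent:
    """Maps the full percept sequence seen so far to an action."""

    def __init__(self, table: Dict[PerceptSequence, str]) -> None:
        self.table = table
        self.percepts: List[Percept] = []

    def act(self, percept: Percept) -> str:
        self.percepts.append(percept)
        # Sequences missing from the table fall back to doing nothing.
        return self.table.get(tuple(self.percepts), "NoOp")


agent = TableDrivenAgent(TABLE)
print(agent.act(("Left", "Clean")))   # Right
print(agent.act(("Right", "Dirty")))  # Suck
```

The table needs one row per possible percept sequence, which is why the "Scalable?" question matters: the number of rows grows with every additional time step.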
Evaluating agent programs
We agree on what an agent must do
Can we evaluate its quality?
Performance Metrics
• Very Important
• Frequently the hardest part of the research problem
• Design these to suit what you really want to happen
Rational Agent
For each percept sequence, a rational agent should select an action that maximizes its performance measure
Example: autonomous vacuum cleaner
• What is the performance measure?
• Penalty for eating the cat? How much?
• Penalty for missing a spot?
• Reward for speed?
• Reward for conserving power?
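One way to make the questions above concrete is to write the performance measure as a weighted sum. The sketch below is illustrative only: the log fields and every weight are hypothetical, and picking those weights is exactly the design problem the slide raises.

```python
from dataclasses import dataclass


@dataclass
class EpisodeLog:
    """Hypothetical summary of one cleaning run."""
    cats_eaten: int
    spots_missed: int
    seconds_taken: float
    energy_used_wh: float


# Hypothetical weights (assumptions, not values from the lecture).
CAT_PENALTY = 1000.0
MISSED_SPOT_PENALTY = 5.0
TIME_PENALTY_PER_SECOND = 0.01
ENERGY_PENALTY_PER_WH = 0.1


def performance(log: EpisodeLog) -> float:
    """Higher is better; every term is a penalty, so the best score is 0."""
    return -(
        CAT_PENALTY * log.cats_eaten
        + MISSED_SPOT_PENALTY * log.spots_missed
        + TIME_PENALTY_PER_SECOND * log.seconds_taken
        + ENERGY_PENALTY_PER_WH * log.energy_used_wh
    )


careful = EpisodeLog(cats_eaten=0, spots_missed=2, seconds_taken=1800, energy_used_wh=40)
reckless = EpisodeLog(cats_eaten=1, spots_missed=0, seconds_taken=600, energy_used_wh=20)
print(performance(careful) > performance(reckless))  # True with these weights
```

With a much smaller cat penalty the reckless run would score higher, which shows how the measure, not the agent, decides what counts as rational.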
Learning and Autonomy
Learning
• To update the agent function in light of observed performance of percept-sequence to action pairs
– Explore new parts of state space
– Learn from trial and error
– Change internal variables that influence action selection
Adding intelligence to agent function
At design time
• Some agents are designed with clear procedure to improve performance over time. Really the engineer’s intelligence.
– Camera-based user identification
At run-time
• Agent executes complicated equation to map input to output
Between trials
• With experience, agent changes its program (parameters)
How big is your percept?
Dung Beetle
• Largely feed forward
Sphex Wasp
• Reacts to environment (feedback) but not learning
A Dog
• Reacts to environment and can significantly alter behavior
Qualities of a task environment
Fully Observable
• Agent need not store any aspects of state
– The Brady Bunch as intelligent agents
– Volume of observables may be overwhelming
Partially Observable
• Some data is unavailable
– Maze
– Noisy sensors
Qualities of a task environment
Deterministic
• Always the same outcome for state/action pair
Stochastic
• Not always predictable (random)
Partially Observable vs. Stochastic
• My cats think the world is stochastic
• Physicists think the world is deterministic
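The distinction can be sketched as two transition functions over a toy one-dimensional world. The world, the function names, and the slip probability are hypothetical. The deterministic step always gives the same next state for a state/action pair, while the stochastic step sometimes fails to move.

```python
import random


def deterministic_step(position: int, action: int) -> int:
    """Same (position, action) pair always yields the same next position."""
    return position + action


def stochastic_step(position: int, action: int, rng: random.Random,
                    slip_probability: float = 0.2) -> int:
    """With probability slip_probability the move fails and the agent stays put."""
    if rng.random() < slip_probability:
        return position
    return position + action


rng = random.Random(0)
print(deterministic_step(3, 1))                          # always 4
print([stochastic_step(3, 1, rng) for _ in range(10)])   # a mix of 3s and 4s is possible
```

An observer who cannot see the random draw experiences the second world as unpredictable, which is the point of the cats-versus-physicists remark: hidden state can look like randomness.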
Qualities of a task environment
Markovian
• Future state only depends on current state
Episodic
• Percept sequence can be segmented into independent temporal categories
– Behavior at traffic light independent of previous traffic
Sequential
• Current decision could affect all future decisions
Which is easiest to program?
Qualities of a task environment
Static
• Environment doesn’t change over time
– Crossword puzzle
Dynamic
• Environment changes over time
– Driving a car
Semi-dynamic
• Environment is static, but performance metrics are dynamic
– Drag racing
Qualities of a task environment
Discrete
• Values of a state space feature (dimension) are constrained to distinct values from a finite set
– Blackjack: f(your cards, exposed cards) = action
Continuous
• Variable has infinite variation
– Antilock brakes: f(vehicle speed, wheel velocity) = unlock
– Are computers really continuous?
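The two example functions above can be sketched directly. Both bodies are hypothetical: the blackjack rule is a simple threshold policy, not a real playing strategy, and the antilock rule uses an assumed slip threshold. What matters is the input types: a finite set of card values versus real-valued speeds.

```python
from typing import List


def blackjack_action(your_cards: List[int], exposed_card: int) -> str:
    """Discrete inputs: card point values (2..11). Hypothetical threshold policy."""
    total = sum(your_cards)
    if total >= 17:
        return "stand"
    if total >= 13 and exposed_card <= 6:
        return "stand"
    return "hit"


def abs_unlock(vehicle_speed: float, wheel_velocity: float,
               slip_threshold: float = 0.2) -> bool:
    """Continuous inputs: release the brake when the wheel lags the vehicle too much."""
    if vehicle_speed <= 0.0:
        return False
    slip = (vehicle_speed - wheel_velocity) / vehicle_speed
    return slip > slip_threshold


print(blackjack_action([10, 7], 10))   # stand
print(abs_unlock(20.0, 10.0))          # True: wheel turning at half vehicle speed
```

On the last question in the slide: the floats passed to abs_unlock are themselves drawn from a finite set of representable values, so a digital controller only approximates a continuous variable.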
Qualities of a task environment
Towards a terse description of problem domains
• State space: features, dimensionality, degrees of freedom
• Observable?
• Predictable?
• Dynamic?
• Continuous?
• Performance metric
Building Agent Programs
The table approach
• Build a table mapping states to actions
– Chess has 10^150 entries (10^80 atoms in the universe)
– I’ve said memory is free, but keep it within the confines of the boundable universe
• Still, tables have their place
Discuss four agent program principles
Simple Reflex Agents
• Sense environment
• Match sensations with rules in database
• Rule prescribes an action
Reflexes can be bad
• Don’t put your hands down when falling backwards!
Inaccurate information
• Misperception can trigger reflex when inappropriate
But rules databases can be made large and complex
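The sense, match, act loop can be sketched as an ordered list of condition-action rules. The percept keys ("bump", "dirty"), the actions, and the default are hypothetical names chosen for illustration. The first rule whose condition matches the current percept wins.

```python
from typing import Callable, Dict, List, Tuple

Percept = Dict[str, object]
Rule = Tuple[Callable[[Percept], bool], str]

# Hypothetical rule database, checked in order.
RULES: List[Rule] = [
    (lambda p: p.get("bump") is True, "turn"),
    (lambda p: p.get("dirty") is True, "suck"),
]


def simple_reflex_agent(percept: Percept, rules: List[Rule] = RULES,
                        default: str = "forward") -> str:
    """Return the action of the first matching rule; uses only the current percept."""
    for condition, action in rules:
        if condition(percept):
            return action
    return default


print(simple_reflex_agent({"bump": True, "dirty": True}))  # turn
print(simple_reflex_agent({"dirty": True}))                # suck
print(simple_reflex_agent({}))                             # forward
```

Because the agent trusts the percept completely, a false "bump" reading triggers a turn every time, which is the misperception problem noted above.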
Simple Reflex Agents
Randomization
• The vacuum cleaner problem
[Figure: vacuum world with squares labeled Left and Right; label: Dirty]
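The slide gives only the heading, so the following is one plausible reading, stated as an assumption: a reflex vacuum that can sense dirt but not its own location cannot follow a fixed "move left" or "move right" rule without risking getting stuck against a wall, so it picks a direction at random. The two-square world and all names are hypothetical.

```python
import random
from typing import Dict


def randomized_reflex_vacuum(is_dirty: bool, rng: random.Random) -> str:
    """Suck if the current square is dirty; otherwise move in a random direction."""
    if is_dirty:
        return "Suck"
    return rng.choice(["Left", "Right"])


def run(world: Dict[str, bool], location: str, steps: int,
        rng: random.Random) -> Dict[str, bool]:
    """Simulate the two-square world; world maps square name to 'is dirty'."""
    world = dict(world)  # work on a copy so the caller's dict is untouched
    for _ in range(steps):
        action = randomized_reflex_vacuum(world[location], rng)
        if action == "Suck":
            world[location] = False
        else:
            # Moving toward the wall the agent is already at leaves it in place.
            location = action
    return world


print(run({"Left": True, "Right": True}, "Left", 200, random.Random(0)))
```

A deterministic rule such as "always move Left when clean" would never reach the Right square from the Left one; the random choice reaches it eventually with probability one.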
Model-based Reflex Agents
So when you can’t see something, you model it!
• Create an internal variable to store your expectation of variables you can’t observe
• If I throw a ball to you and it falls short, do I know why?
– Aerodynamics, mass, my energy levels…
– I do have a model
– Ball falls short, throw harder
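The ball example can be sketched with a single internal variable. Everything here is a hypothetical illustration: the thrower cannot observe the force actually required (it is hidden in the environment), so it keeps its own estimate and raises it whenever the ball falls short. Forces are integers in arbitrary units to keep the arithmetic exact.

```python
class Thrower:
    """Model-based reflex agent: keeps an estimate of a quantity it cannot observe."""

    def __init__(self, initial_force: int = 10, adjustment: int = 1) -> None:
        self.force = initial_force      # internal model: believed required force
        self.adjustment = adjustment

    def throw(self) -> int:
        return self.force

    def observe(self, fell_short: bool) -> None:
        # Rule from the slide: ball falls short, throw harder.
        if fell_short:
            self.force += self.adjustment


REQUIRED_FORCE = 14  # hidden from the agent; only the outcome is observable

thrower = Thrower()
attempts = 0
while thrower.throw() < REQUIRED_FORCE:
    thrower.observe(fell_short=True)
    attempts += 1

print(thrower.force, attempts)  # 14 4
```

The agent never learns why the ball fell short (aerodynamics, mass, energy levels); its crude model is still enough to choose a better action.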
Model-based Reflex Agents
Admit it, you can’t see and understand everything
Models are very important!
• We all use models to get through our lives
– Psychologists have many names for these context-sensitive models
• Agents need models too
Goal-based Agents
Lacking moment-to-moment performance measure
Overall goal is known
How to get from A to B?
• Current actions have future consequences
• Search and Planning are used to explore paths through state space from A to B
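As a small illustration of exploring paths through a state space from A to B, here is a breadth-first search over a hypothetical graph. The graph and the choice of breadth-first search are assumptions for this sketch; the lecture does not name a specific algorithm here.

```python
from collections import deque
from typing import Dict, List, Optional

# Hypothetical state space: each state lists the states reachable in one action.
GRAPH: Dict[str, List[str]] = {
    "A": ["C", "D"],
    "C": ["E"],
    "D": ["B"],
    "E": ["B"],
    "B": [],
}


def breadth_first_path(graph: Dict[str, List[str]], start: str,
                       goal: str) -> Optional[List[str]]:
    """Return a path with the fewest actions from start to goal, or None."""
    frontier = deque([[start]])
    visited = {start}
    while frontier:
        path = frontier.popleft()
        state = path[-1]
        if state == goal:
            return path
        for nxt in graph.get(state, []):
            if nxt not in visited:
                visited.add(nxt)
                frontier.append(path + [nxt])
    return None


print(breadth_first_path(GRAPH, "A", "B"))  # ['A', 'D', 'B']
```

The agent commits to no action until the whole path is found, which reflects the point that current actions have future consequences.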
Utility-based Agents
Goal-directed agents that have a utility function
• Function that maps internal and external states into a scalar
– A scalar is a number
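A minimal sketch of a utility function and how an agent might use it. The state fields, the weights, and the three actions are hypothetical: the state mixes one external quantity (distance to goal) with one internal quantity (battery), the utility collapses them into a single number, and the agent picks the action whose resulting state scores highest.

```python
from typing import Callable, Dict

State = Dict[str, float]


def utility(state: State) -> float:
    """Map external (distance_to_goal) and internal (battery) state to one scalar."""
    return -state["distance_to_goal"] + 0.5 * state["battery"]


# Hypothetical actions, each returning the state that would result.
ACTIONS: Dict[str, Callable[[State], State]] = {
    "drive": lambda s: {"distance_to_goal": s["distance_to_goal"] - 2,
                        "battery": s["battery"] - 1},
    "recharge": lambda s: {"distance_to_goal": s["distance_to_goal"],
                           "battery": s["battery"] + 2},
    "wait": lambda s: dict(s),
}


def choose_action(state: State,
                  actions: Dict[str, Callable[[State], State]] = ACTIONS) -> str:
    """Pick the action whose predicted next state has the highest utility."""
    return max(actions, key=lambda name: utility(actions[name](state)))


print(choose_action({"distance_to_goal": 10.0, "battery": 4.0}))  # drive
```

Unlike a bare goal test, the scalar lets the agent rank states that all fall short of the goal.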
Learning Agents
Learning Element
• Making improvements
Performance Element
• Selecting actions
Critic
• Provides learning element with feedback about progress
Problem Generator
• Provides suggestions for new tasks to explore state space
A taxi driver
Performance Element
• Knowledge of how to drive in traffic
Critic
• Observes tips from customers and horn honking from other cars
Learning Element
• Relates low tips to actions that may be the cause
Problem Generator
• Proposes new routes to try and improved driving skills
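The four components can be sketched as methods on one class. This is a toy under stated assumptions: the route names, the honk penalty, and the exploration probability are hypothetical, and "knowledge of how to drive" is reduced to picking the route with the best average feedback so far.

```python
import random
from typing import Dict, List

HONK_PENALTY = 0.5  # hypothetical cost per horn honk


class LearningTaxi:
    def __init__(self, routes: List[str], explore_probability: float,
                 rng: random.Random) -> None:
        self.routes = routes
        self.explore_probability = explore_probability
        self.rng = rng
        self.total_feedback: Dict[str, float] = {r: 0.0 for r in routes}
        self.trips: Dict[str, int] = {r: 0 for r in routes}

    def average_feedback(self, route: str) -> float:
        trips = self.trips[route]
        return self.total_feedback[route] / trips if trips else 0.0

    def performance_element(self) -> str:
        """Select the action: the route that has worked best so far."""
        return max(self.routes, key=self.average_feedback)

    def problem_generator(self) -> str:
        """Propose a route to try, regardless of past results."""
        return self.rng.choice(self.routes)

    def critic(self, tip: float, honks: int) -> float:
        """Turn observed tips and honking into a single feedback signal."""
        return tip - HONK_PENALTY * honks

    def learning_element(self, route: str, feedback: float) -> None:
        """Relate the feedback to the route that may have caused it."""
        self.total_feedback[route] += feedback
        self.trips[route] += 1

    def choose_route(self) -> str:
        if self.rng.random() < self.explore_probability:
            return self.problem_generator()
        return self.performance_element()


taxi = LearningTaxi(["highway", "downtown"], explore_probability=0.0,
                    rng=random.Random(0))
taxi.learning_element("downtown", taxi.critic(tip=5.0, honks=0))
taxi.learning_element("highway", taxi.critic(tip=2.0, honks=1))
print(taxi.choose_route())  # downtown
```

Setting explore_probability above zero lets the problem generator occasionally override the performance element, trading short-term tips for information about untried routes.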
Review
Outlined families of AI problems and solutions
Next class we study search problems