Logic and Decision Theory
• The prisoner’s dilemma
• Deciding whether or not to take an umbrella
• Smart choices
• Combining decision theory and abduction in Abductive Logic Programming (ALP)
• ALP gives a logical semantics to production systems
• ALP includes composite event processing and composite transactions
The need to reason with utility and uncertainty
The general pattern of cause and effect: a particular outcome (utility) happens if I do a certain action and the world is in a particular state (uncertainty).
David Poole [Journal of Artificial Intelligence, 1997] has shown that associating probabilities with assumptions gives ALP the expressive power of Bayesian networks.
The need to reason with utility and uncertainty
You will be rich if you buy a lottery ticket and your number is chosen. It will rain if you do a rain dance and the gods are pleased.
You can control your own actions (like buying a ticket or doing a rain dance), and you might be able to judge the utility of their consequences (£1 million).
But you cannot always control the state of the world (your number is chosen or the gods are pleased). You might be able to judge its probability (one in 10 million?).
The Prisoner’s Dilemma in Classical Decision Theory
Action         State of the world:
               John cooperates          John refuses
I cooperate    I get 3 years in jail    I get 0 years in jail
I refuse       I get 4 years in jail    I get 1 year in jail
Prisoner’s dilemma in Computational Logic
Goal: If I am arrested then I cooperate or I refuse
[Diagram: the agent observes "I am arrested" in the world; the goal generates the alternative candidate actions; the agent derives their consequences, judges their utility and uncertainty, decides between them, and acts on the world.]
The Prisoner’s Dilemma in Computational Logic
Maintenance Goal: If I am arrested then I cooperate or I refuse
Beliefs: A prisoner gets 0 years in jail if the prisoner cooperates and the other prisoner refuses
A prisoner gets 4 years in jail if the prisoner refuses and the other prisoner cooperates
A prisoner gets 3 years in jail if the prisoner cooperates and the other prisoner cooperates
A prisoner gets 1 year in jail if the prisoner refuses and the other prisoner refuses
Alternative candidate actions: I cooperate or I refuse
For each candidate action, the possible states of the world have probabilities p and the resulting outcomes have utilities u:
Expected utility of action1 = p11·u11 + p12·u12 + p13·u13 + p14·u14, where p11 + p12 + p13 + p14 = 1
Expected utility of action2 = p21·u21 + p22·u22 + p23·u23 + p24·u24, where p21 + p22 + p23 + p24 = 1
Decision Theory: to find the expected utility of a candidate action,
• find all its possible consequences,
• judge their probabilities and utilities,
• weigh each utility by its probability, and add them all up.
Choose the action of highest expected utility.
Classical Decision Theory
Assume: Probability John cooperates = .5, Probability John refuses = .5.
Assume I cooperate. Reason forwards:
Utility if John refuses = I get 0 years in jail
Utility if John cooperates = I get 3 years in jail
Expected utility = .5 × 0 + .5 × 3 = 1.5
Assume I refuse. Reason forwards:
Utility if John cooperates = I get 4 years in jail
Utility if John refuses = I get 1 year in jail
Expected utility = .5 × 4 + .5 × 1 = 2.5
Decide: 2.5 > 1.5 expected years in jail, so I cooperate with the police!
The decision does not depend on exact judgements of utility and probability.
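The calculation above can be sketched in a few lines of Python. This is a minimal sketch, not part of the slides; in particular, encoding utility as minus the number of years in jail (so that higher utility is better) is my assumption.

```python
# Expected-utility comparison for the prisoner's dilemma.
# Utility is encoded as minus the jail term: fewer years = higher utility.

def expected_utility(outcomes):
    """outcomes: list of (probability, utility) pairs for one action."""
    assert abs(sum(p for p, _ in outcomes) - 1.0) < 1e-9  # probabilities sum to 1
    return sum(p * u for p, u in outcomes)

# John cooperates or refuses with probability .5 each.
eu_cooperate = expected_utility([(0.5, -3),   # John cooperates: 3 years in jail
                                 (0.5,  0)])  # John refuses:    0 years in jail
eu_refuse    = expected_utility([(0.5, -4),   # John cooperates: 4 years in jail
                                 (0.5, -1)])  # John refuses:    1 year in jail

best = max([("I cooperate", eu_cooperate), ("I refuse", eu_refuse)],
           key=lambda pair: pair[1])
print(best)  # ('I cooperate', -1.5): 1.5 expected years beats 2.5
```

As the slide notes, the conclusion is robust: any probabilities that do not strongly favour John refusing still make cooperation the better action.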
An alternative is to use lower level maintenance goals (or heuristics) instead:
if an agent requests me to perform an action, and the action does not harm another person, and the action does not require many resources, then I perform the action.
if an agent requests me to perform an action, and the action harms another person, then I refuse to perform the action.
if an agent requests me to perform an action, and the action requires many resources, then I refuse to perform the action.
A more common example
Maintenance goal: if I go out then I am prepared.
Beliefs: I am prepared if I take the umbrella.
I am prepared if I leave the umbrella.
I stay dry if I take the umbrella.
I get wet if I leave the umbrella and it rains.
I stay dry if it does not rain.
Alternative candidate actions: I take the umbrella or I leave the umbrella.
Logic and Decision Theory
• The prisoner’s dilemma
• Deciding whether or not to take an umbrella
• Smart choices
• Combining decision theory and abduction in Abductive Logic Programming (ALP)
• ALP gives a logical semantics to production systems
• ALP includes composite event processing and composite transactions
Deciding whether or not to carry an umbrella
Assume: Probability it rains = .1, Probability it does not rain = .9.
Utility of getting wet = –10. Utility of staying dry = 1.
Utility of taking the umbrella = –2. Utility of leaving the umbrella = 0.
Assume I take the umbrella. Reason forwards:
I stay dry with probability 1, with utility –2 + 1 = –1.
Expected utility = 1 × –1 = –1
Assume I leave the umbrella. Reason forwards:
I get wet with probability .1 and utility –10. I stay dry with probability .9 and utility 1.
Expected utility = 0 + (–10 × .1) + (1 × .9) = –1 + .9 = –.1
Decide: –1 < –.1, so I leave the umbrella!
The decision does not depend on exact judgements of utility and probability.
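A short sketch makes the robustness claim precise: the numbers are the slide's own, but the indifference probability 2/11 is my derivation, not stated in the slides.

```python
# Umbrella decision with the slide's utilities, plus a sensitivity check.

def eu_take():
    # Taking the umbrella: dry for certain; utility -2 (carrying) + 1 (dry).
    return 1.0 * (-2 + 1)

def eu_leave(p_rain):
    # Leaving it: wet with probability p_rain (utility -10), dry otherwise (+1).
    return p_rain * (-10) + (1 - p_rain) * 1

print(eu_take(), round(eu_leave(0.1), 2))  # -1.0 -0.1: leaving wins at p = .1

# The actions are equally good when 1 - 11*p = -1, i.e. p = 2/11 (about 0.18):
# only if rain is judged more likely than roughly 18% does taking the umbrella win.
assert abs(eu_leave(2 / 11) - eu_take()) < 1e-9
```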
An alternative is to use lower level maintenance goals (or heuristics) instead:
If I go out and it is raining then I take an umbrella.
If I go out and there are dark clouds in the sky then I take an umbrella.
If I go out and the weather forecast predicts rain then I take an umbrella.
The heuristics compile decision-making into lower-level maintenance goals.
Logic and Decision Theory
• The prisoner’s dilemma
• Deciding whether or not to take an umbrella
• Smart choices
• Combining decision theory and abduction in Abductive Logic Programming (ALP)
• ALP gives a logical semantics to production systems
• ALP includes composite event processing and composite transactions
Smart Choices: A Practical Guide to Making Better Decisions
73 customer reviews: 5 star – 49, 4 star – 17, 3 star – 3, 1 star – 1
Smart choices – a better decision theory
Classical decision theory assumes that all of the alternative candidate actions are fixed and given in advance. To make smart decisions:
• identify the goals that motivate the alternatives
• identify the beliefs that reduce the goals to actions
• judge whether the beliefs are true
• investigate whether there are any other true beliefs that might generate other alternatives
• investigate and take into account any other goals that were not considered when the actions were generated
• identify the events that triggered the goal that motivated the actions, and prepare for them before they happen.
Conflicting ways of solving different goals can sometimes be resolved by finding alternative solutions
[Goal tree: Improve enjoyment of life and Improve standard of living.
Improve enjoyment of life ← work less hard.
Improve standard of living ← provide for old age ← save money or increase pay.
Increase pay ← work harder or go on strike.]
Logic and Decision Theory
• The prisoner’s dilemma
• Deciding whether or not to take an umbrella
• Smart choices
• Combining decision theory and abduction in Abductive Logic Programming (ALP)
• ALP gives a logical semantics to production systems
• ALP includes composite event processing and composite transactions
Decision making and abduction can be unified
Decision making = decide between actions.
Abduction step 1 = generate assumptions, and
Abduction step 2 = decide between assumptions.
Common features:
Reason backwards from goals to generate assumptions
Reason forwards from assumptions to generate consequences
Judge probability and utility of consequences to evaluate alternative candidate assumptions
Decide between alternative candidates
Abduction
Abduction (step 1) is the task of generating assumptions ∆ to explain observations O. For example:
Observation: there is smoke.
Belief: there is smoke if there is a fire.
The assumptions ∆ = {there is a fire}
• explain the observation
• can be generated by reasoning backwards from the observation.
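Backward reasoning to collect assumptions can be sketched for propositional Horn clauses. The Python encoding below is mine, not from the slides, and it assumes the rules are acyclic.

```python
# Abduction step 1: reason backwards from an observation, collecting
# abducible assumptions Δ. Rules are propositional (head, [body...]) clauses.

def explain(goal, rules, abducibles):
    """Return every assumption set Δ ⊆ abducibles that entails goal."""
    if goal in abducibles:
        return [frozenset([goal])]
    explanations = []
    for head, body in rules:
        if head == goal:
            # Combine explanations of all subgoals in the clause body.
            combos = [frozenset()]
            for subgoal in body:
                combos = [c | d for c in combos
                                for d in explain(subgoal, rules, abducibles)]
            explanations.extend(combos)
    return explanations

rules = [("there is smoke", ["there is a fire"])]
print(explain("there is smoke", rules, {"there is a fire"}))
# [frozenset({'there is a fire'})]

wet_rules = [("the grass is wet", ["it rained"]),
             ("the grass is wet", ["the sprinkler was on"])]
# Two alternative explanations, one per clause tried in backward reasoning:
print(explain("the grass is wet", wet_rules,
              {"it rained", "the sprinkler was on"}))
# [frozenset({'it rained'}), frozenset({'the sprinkler was on'})]
```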
Semantics of an abductive logic program <G, B, A>
G is a set of goals or constraints, represented by clauses in FOL.
B is a set of beliefs, represented as a logic program.
A is a set of atomic sentences representing actions or other assumptions.
Given a set O of atomic sentences representing observations, the task is to generate a set ∆ ⊆ A such that G ∪ O is true in the “intended” model determined by B* = B ∪ ∆.
If B is a set of Horn clauses (no negative conditions), then the intended model of B* is the unique minimal model of B*.
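For the Horn-clause case, the minimal model can be computed by forward chaining to a fixed point. This is a minimal propositional sketch of that semantics; the encoding is mine, and the example atoms are taken from the alarm-signal slide that follows.

```python
# The minimal model of a propositional Horn program B* = B ∪ Δ, computed
# by forward chaining (repeatedly firing clauses whose bodies hold).

def minimal_model(program):
    """program: list of (head, [body...]) clauses; facts have empty bodies."""
    model, changed = set(), True
    while changed:
        changed = False
        for head, body in program:
            if head not in model and all(atom in model for atom in body):
                model.add(head)
                changed = True
    return model

B = [("a person gets help", ["the person alerts the driver"]),
     ("the person alerts the driver", ["the person presses the alarm button"]),
     ("there is an emergency", ["there is a fire"]),
     ("there is smoke", ["there is a fire"])]
delta = [("there is a fire", []), ("the person presses the alarm button", [])]

model = minimal_model(B + delta)
# Both the observation and the goal's conclusion hold in the minimal model:
assert "there is smoke" in model and "a person gets help" in model
```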
The semantics of ALP is to make all goals and observations true
G: if there is an emergency then I get help.
B: a person gets help if the person alerts the driver.
a person alerts the driver if the person presses the alarm signal button.
there is an emergency if there is a fire.
there is smoke if there is a fire.
O: there is smoke.
∆ = {there is a fire, I press the alarm signal button} makes O true and makes G true:
G ∪ O is true in the minimal model of B ∪ ∆.
Abduction used to explain observations
[Diagram: the passenger observes "there is smoke" in the world. Forward reasoning from the abduced hypothesis "there is a fire" derives "there is an emergency", which triggers the maintenance goal "if there is an emergency then I get help". Backward reasoning reduces the achievement goal "I get help" to "I alert the driver" and then to the action "I press the alarm signal button".]
The term abduction was introduced by the logician Charles Sanders Peirce (1931)
He gave the following example:
Deduction: All the beans from this bag are white. These beans are from this bag. Therefore these beans are white.
Induction: These beans are from this bag. These beans are white. Therefore all the beans from this bag are white.
Abduction: All the beans from this bag are white. These beans are white. Therefore these beans are from this bag.
ALP unifies philosophy of science and decision theory
There can be several alternative ∆ that, with B, make G and O both true. The challenge (step 2) is to find the best ∆ within the computational resources available. In the philosophy of science, the value of an explanation ∆ is measured similarly, in terms of its probability and utility. (The more observations explained, the more useful the explanation.) In classical decision theory, the value of actions ∆ is measured by the expected utility of their consequences.
There can be different candidate abductive explanations ∆ of the same observation
Beliefs: the grass is wet if it rained.
the grass is wet if the sprinkler was on.
Observation: the grass is wet.
Backward reasoning generates the alternative explanations: it rained, or the sprinkler was on.
The decision-theoretic challenge (step 2) is to find the best ∆ within the computational resources available.
There can be several alternative abductive explanations of the same observation. The more observations explained the better.
Beliefs: the grass is wet if it rained.
the grass is wet if the sprinkler was on.
Observation: the grass is wet.
Additional observation: the skylight is wet.
Backward reasoning generates the hypotheses: it rained, or the sprinkler was on. Forward reasoning derives their consequences.
Different hypotheses can have different consequences. Different hypotheses can also have different probabilities. In Dubai, it would be more probable that the sprinkler was on. In Japan, it would be more probable that it rained.
Abductive hypotheses should be relevant to the observation
Relevance is automatically satisfied by reasoning backwards: backward reasoning ensures every hypothesis is relevant to the observation. Relevance is weaker than the requirement that explanations be minimal. Minimality insists that no subset of the explanation is also an explanation.
Beliefs: the floor is wet if it rained and the window was open.
the floor is wet if it rained and there is a hole in the roof.
there is a hole in the roof.
Observation: the floor is wet.
Relevant explanation: it rained and the window was open.
Minimal explanation: it rained.
Irrelevant explanation: it rained and the dog was barking.
Abductive hypotheses should be consistent with beliefs and constraints.
The consistency requirement excludes impossible explanations, such as it rained when there were clothes outside and they didn’t get wet.
G: if a thing is dry and the thing is wet then false. (i.e. nothing is both dry and wet.)
B: the grass is wet if it rained.
the grass is wet if the sprinkler was on.
the clothes outside are wet if it rained.
the clothes outside are dry.
Observation: the grass is wet.
Hypotheses: it rained or the sprinkler was on.
Hypothesis: it rained.
Forward reasoning: the clothes outside are wet.
Forward reasoning: if the clothes outside are dry then false.
Forward reasoning: false.
The derivation of false eliminates it rained as an explanation of the grass is wet.
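The consistency check can be sketched by forward chaining from B ∪ ∆ and testing the integrity constraint against the resulting model. The propositional encoding is mine; the atoms follow the slide's example.

```python
# Consistency filtering of candidate explanations Δ: reject Δ if forward
# reasoning from B ∪ Δ satisfies the body of an integrity constraint.

def forward_chain(program):
    model, changed = set(), True
    while changed:
        changed = False
        for head, body in program:
            if head not in model and all(a in model for a in body):
                model.add(head)
                changed = True
    return model

B = [("the grass is wet", ["it rained"]),
     ("the grass is wet", ["the sprinkler was on"]),
     ("the clothes outside are wet", ["it rained"]),
     ("the clothes outside are dry", [])]
# Constraint: nothing is both wet and dry (a body that must never all hold).
constraints = [["the clothes outside are wet", "the clothes outside are dry"]]

def consistent(delta):
    model = forward_chain(B + [(a, []) for a in delta])
    return not any(all(a in model for a in body) for body in constraints)

print(consistent({"it rained"}), consistent({"the sprinkler was on"}))
# False True: 'it rained' is eliminated, leaving the sprinkler hypothesis.
```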
Sherlock Holmes (The Adventure of the Beryl Coronet)
“It is an old maxim of mine that when you have excluded the impossible, whatever remains, however improbable, must be the truth.”
Sherlock Holmes described this as deduction. Deduction in logic leads from given beliefs to inescapable conclusions: if the beliefs used to deduce the conclusions are true, then the conclusions must also be true. Abduction, by contrast, can lead from true observations and other beliefs to false hypotheses.
ALP for fault diagnosis
Beliefs: the car doesn’t start if there is a fault in the battery.
the car doesn’t start if there is a fault in the fuel supply.
Observation: the car doesn’t start.
Backward reasoning generates the hypotheses: fault in the battery, or fault in the fuel supply.
We may prefer to believe the hypothesis that is judged to have greater probability, using statistics about the past. The decision-theoretic challenge (step 2) is to find the best ∆ within the computational resources available.
ALP for global warming
Beliefs: world temperatures rise if there is a man-made increase in greenhouse gases.
world temperatures rise if there is natural climate change.
Observation: world temperatures are rising.
Backward reasoning generates the hypotheses: there is a man-made increase in greenhouse gases, or there is natural climate change.
We may prefer to believe the hypothesis that is judged to have greater probability. According to expert opinion, it is more than 90% likely that most of the observed increase in global temperatures since the mid-20th century is due to the increase in man-made greenhouse gas concentrations.
The decision-theoretic challenge (step 2) is to find the best ∆ within the computational resources available.
Logic and Decision Theory
• The prisoner’s dilemma
• Deciding whether or not to take an umbrella
• Smart choices
• Combining decision theory and abduction in Abductive Logic Programming (ALP)
• ALP gives a logical semantics to production systems
• ALP includes composite event processing and composite transactions
Three kinds of production rules
Logical rules used to reason forwards.
Reactive rules that implement heuristic input-output associations.
Pro-active rules that use forward chaining with rules of the form:
If you want G and conditions C hold, then add H as a sub-goal
to simulate backward reasoning with logic programs of the form:
G if C and H
Mind: An Introduction to Cognitive Science, by Paul Thagard
“Unlike logic, rule-based systems can easily represent strategic information about what to do”: If you want to go home and you have the bus fare, then you can catch a bus. But the sentence can be expressed literally in the logical form:
can(you catch bus) if want(you go home) and you have the bus fare
This uses modal operators or modal predicates for want and can. But it misses the real logic of the procedure:
You go home if you have the bus fare and you catch a bus.
Backward reasoning with this logic behaves like forward reasoning with the rule.
Forward chaining with condition-action rules
Condition-action rule: If you want to go home and you have the bus fare, then you can catch a bus.
[Diagram: the goal "want to go home" enters working memory; the system consults working memory to check whether you have the bus fare; the candidate actions "catch a bus", "hire a car", "walk" compete; conflict resolution selects one, and the system acts on the world.]
Backward reasoning with beliefs in logic programming form
Belief: you go home if you have the bus fare and you catch a bus.
[Diagram: backward reasoning reduces the goal "you go home"; the system consults working memory to check whether you have the bus fare; the candidate actions "catch a bus", "hire a car", "walk" compete; the system decides between them and acts on the world.]
Logic for production systems (LPS)
Logical rules are represented as logic programs, used to reason forwards from observations.
Reactive rules are represented by clauses in first-order logic:
If condition_1 and condition_2 … and condition_n then conclusion_1 or conclusion_2 … or conclusion_m
representing “maintenance goals”, used to reason forwards from conditions to conclusions, which become “achievement” goals. Conflict resolution is the problem of deciding which conclusion_i to achieve.
Pro-active rules are represented by logic programs, used to reason
backwards, to evaluate conditions and to reduce achievement goals to sub-goals and eventually to actions.
Reactive rules can be represented as generalised maintenance goals
Maintenance goals have the logical form: ∀X [antecedent(X)→ ∃Y consequent(X, Y)] where X is the set of all variables that occur in antecedent(X), and Y is the set of all variables that occur only in consequent(X, Y).
ALP gives a logical semantics to production systems
Condition-action rules represented as maintenance goals:
∀ T1 [methane-level(M, T1) ∧ critical ≤ M → ∃ T2 [alarm(T2) ∧ T1 < T2 ≤ T1 + ε]]
∀ T1 [methane-level(M, T1) ∧ critical > M ∧ water-level(W, T1) ∧ high < W → ∃ T2 [pump(T2) ∧ T1 < T2 ≤ T1 + ε]]
∀ T1 [methane-level(M, T1) ∧ critical > M ∧ water-level(W, T1) ∧ low < W ∧ pump-active(T1) → ∃ T2 [pump(T2) ∧ T1 < T2 ≤ T1 + ε]]
The maintenance goals G and the observations are all true in the sequence of states represented by a single minimal model.
Logic and Decision Theory
• The prisoner’s dilemma
• Deciding whether or not to take an umbrella
• Smart choices
• Combining decision theory and abduction in Abductive Logic Programming (ALP)
• ALP gives a logical semantics to production systems
• ALP includes composite event processing and composite transactions
ALP includes composite event processing and composite transactions
G = maintenance goal (or “heuristic rule”):
∀ A, T1, T2, T [heat-sensor detects high temperature in area A at time T1
∧ smoke detector detects smoke in area A at time T2
∧ |T1 – T2| ≤ 60 sec ∧ T = max(T1, T2)
→ ∃ T3, T4 [activate sprinkler in area A at time T3 ∧ T < T3 ≤ T + 10 sec
∧ send security guard to area A at time T4 ∧ T3 < T4 ≤ T3 + 30 sec]
∨ ∃ T5 [call fire department to area A at time T5 ∧ T < T5 ≤ T + 120 sec]]
Notice that abductive reasoning to generate fire as an explanation of the observations of high temperature and smoke is compiled into a lower-level heuristic.
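The composite-event part of this goal can be sketched as follows. The event encoding, and the choice to take the first disjunct and schedule each action at its latest allowed time, are my assumptions, not the slide's.

```python
# Composite event detection for the fire-alarm maintenance goal: pair heat
# and smoke detections in the same area within 60 seconds, then schedule
# the sprinkler and security-guard responses at their deadlines.

def detect_fires(heat_events, smoke_events, window=60):
    """Each event is (area, time in seconds). Returns scheduled actions."""
    actions = []
    for area_h, t_h in heat_events:
        for area_s, t_s in smoke_events:
            if area_h == area_s and abs(t_h - t_s) <= window:
                t = max(t_h, t_s)
                # First disjunct of the goal: sprinkler within 10 sec of t,
                # then a security guard within 30 sec of the sprinkler.
                actions.append(("activate sprinkler", area_h, t + 10))
                actions.append(("send security guard", area_h, t + 10 + 30))
    return actions

print(detect_fires([("area 5", 100)], [("area 5", 130)]))
# [('activate sprinkler', 'area 5', 140), ('send security guard', 'area 5', 170)]
```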
ALP – The same beliefs can be used both to recognise complex events and to perform complex transactions
Maintenance goal:
∀ T1, T2 [sentence(you, T1, T2) → ∃ T3, T4 [sentence(me, T3, T4) ∧ T2 < T3 < T2 + 3 sec]]
Beliefs:
adjective(Agent, T1, T2) ← word(Agent, my, T1, T2)
adjective(Agent, T1, T2) ← word(Agent, your, T1, T2)
noun(Agent, T1, T2) ← word(Agent, name, T1, T2)
verb(Agent, T1, T2) ← word(Agent, is, T1, T2)
noun(Agent, T1, T2) ← word(Agent, bob, T1, T2)
noun(Agent, T1, T2) ← word(Agent, what, T1, T2)
sentence(Agent, T1, T3) ← noun-phrase(Agent, T1, T2) ∧ verb-phrase(Agent, T2, T3)
noun-phrase(Agent, T1, T3) ← adjective(Agent, T1, T2) ∧ noun(Agent, T2, T3)
noun-phrase(Agent, T1, T2) ← noun(Agent, T1, T2)
verb-phrase(Agent, T1, T3) ← verb(Agent, T1, T2) ∧ noun-phrase(Agent, T2, T3)
verb-phrase(Agent, T1, T2) ← verb(Agent, T1, T2)
Example – simplified conversation
Observations: word(you, what, 1, 2), word(you, is, 2, 3), word(you, your, 3, 4), word(you, name, 4, 5)
Actions: word(me, my, 6, 7), word(me, name, 7, 8), word(me, is, 8, 9), word(me, bob, 9, 10)
The maintenance goal ∀ T1, T2 [sentence(you, T1, T2) → ∃ T3, T4 [sentence(me, T3, T4) ∧ T2 < T3 < T2 + 3 sec]] is true in the minimal model:
word(you, what, 1, 2), noun(you, 1, 2), noun-phrase(you, 1, 2),
word(you, is, 2, 3), verb(you, 2, 3), verb-phrase(you, 2, 3),
word(you, your, 3, 4), adjective(you, 3, 4),
word(you, name, 4, 5), noun(you, 4, 5), noun-phrase(you, 4, 5),
noun-phrase(you, 3, 5), verb-phrase(you, 2, 5), sentence(you, 1, 3), sentence(you, 1, 5),
word(me, my, 6, 7), adjective(me, 6, 7),
word(me, name, 7, 8), noun(me, 7, 8), noun-phrase(me, 7, 8),
word(me, is, 8, 9), verb(me, 8, 9), verb-phrase(me, 8, 9),
word(me, bob, 9, 10), noun(me, 9, 10), noun-phrase(me, 9, 10),
noun-phrase(me, 6, 8), verb-phrase(me, 8, 10), sentence(me, 7, 10), sentence(me, 6, 10)
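The grammar clauses above can be run forwards, chart-parser style, from the timestamped word observations to the composite sentence events. The Python encoding below is my sketch (it drops the word argument after lexical lookup); the lexical and phrase-structure rules are the slide's.

```python
# Forward (chart-style) reasoning with the slide's grammar: from
# word(Agent, W, T1, T2) observations to sentence(Agent, T1, T2) events.

LEXICON = {"my": "adjective", "your": "adjective", "name": "noun",
           "bob": "noun", "what": "noun", "is": "verb"}

def parse(words):
    """words: list of (agent, word, t1, t2). Returns all derived constituents."""
    facts = {(LEXICON[w], agent, t1, t2) for agent, w, t1, t2 in words}
    changed = True
    while changed:
        new = set()
        for cat, agent, t1, t2 in facts:        # unary clauses
            if cat == "noun":
                new.add(("noun-phrase", agent, t1, t2))
            if cat == "verb":
                new.add(("verb-phrase", agent, t1, t2))
        for c1 in facts:                        # combine adjacent constituents
            for c2 in facts:
                if c1[1] == c2[1] and c1[3] == c2[2]:  # same agent, adjacent
                    agent, t1, t3 = c1[1], c1[2], c2[3]
                    if (c1[0], c2[0]) == ("adjective", "noun"):
                        new.add(("noun-phrase", agent, t1, t3))
                    if (c1[0], c2[0]) == ("verb", "noun-phrase"):
                        new.add(("verb-phrase", agent, t1, t3))
                    if (c1[0], c2[0]) == ("noun-phrase", "verb-phrase"):
                        new.add(("sentence", agent, t1, t3))
        changed = not new <= facts
        facts |= new
    return facts

chart = parse([("you", "what", 1, 2), ("you", "is", 2, 3),
               ("you", "your", 3, 4), ("you", "name", 4, 5)])
assert ("sentence", "you", 1, 5) in chart  # "what is your name" recognised
```

The same function recognises the reply "my name is bob" as sentence(me, 6, 10), so one set of beliefs serves both for recognising the observed sentence and for performing the reply.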
Beyond purely reactive systems
The semantics gives a complete specification of the task, but the operational semantics of forward and backward reasoning is incomplete.
It cannot preventively make a maintenance goal true by making its antecedents false:
attacks(X, me, T1) ∧ ¬ prepared-for-attack(me, T1) → surrender(me, T2) ∧ T1 < T2 ≤ T1 + δ
It cannot proactively make a maintenance goal true by making its consequents true before its antecedents become true:
enter-bus(me, T1) → have-ticket(me, T2) ∧ T1 < T2 ≤ T1 + ε
Conclusion: Computational Logic as a unifying framework
[Diagram: the agent cycle. Observations trigger maintenance goals by forward reasoning; maintenance goals generate achievement goals; backward reasoning reduces achievement goals to candidate actions and abductive explanations; forward reasoning derives their consequences; the agent decides between candidates, possibly using heuristic short cuts, and acts. The minimal model semantics makes the goals and observations true.]