Upload
melvin-young
View
222
Download
2
Tags:
Embed Size (px)
Citation preview
EstimationEstimation• Problems with maximum likelihood estimates:
- If I flip a coin once, and it’s heads, what’s the estimate for P(heads)?- What if I flip it 50 times with 27 heads?- What if I flip 10M times with 8M heads?
• Basic idea:- We have some prior expectation about parameters
(here, the probability of heads)- Given little evidence, we should skew toward prior- Given lots of evidence, we should listen to data
• How can we accomplish this? Stay tuned!
Bayesian Networks: Big PictureBayesian Networks: Big Picture
• Two big problems with joint probability distributions:- Unless there are only a few variables, the distribution is too big to
represent explicitly (Why?)- Hard to estimate anything empirically about more than a few
variables at a time (Why?)
• Hard to compute answers to queries of the form P(y | a) (Why?)
• Bayesian networks are a technique for describing complex joint distributions (models) using a bunch of simple, local distributions- It describes how variables interact locally- Local interactions chain together to give global, indirect interactions- For about 10 min, we’ll be very vague about how these interactions are specified
Size of a Bayes’ NetSize of a Bayes’ Net
• How big is a joint distribution over N Boolean variables?
• 2N
• How big is a Bayes net if each node has k parents?• N 2k
• Both give you the power to calculate P(X1,X2,…,Xn)• Bayesian Networks = Huge space savings!• Also easier to elicit local CPTs• Also turns out to be faster to answer queries (future
class)
Causality?Causality?• When Bayes’ nets reflect the true causal patterns:
Often simpler (nodes have fewer parents) Often easier to think about Often easier to elicit from experts
• BNs need not actually be causal Sometimes no causal net exists over the domain E.g. consider the variables Traffic and RoofDrips End up with arrows that reflect correlation, not causation
• What do the arrows really mean? Topology may happen to encode causal structure Topology really encodes conditional independencies
Creating Bayes’ NetsCreating Bayes’ Nets
• So far, we talked about how any fixed Bayes’ netencodes a joint distribution
• Next: how to represent a fixed distribution as aBayes’ net Key ingredient: conditional independence The exercise we did in “causal” assembly of BNs was a kind of intuitive use of conditional
independence Now we have to formalize the process
• After that: how to answer queries (inference)