1. Sample Space and Probability Part IV: Pascal Triangle and Bernoulli Trials

ECE 302 Fall 2009, TR 3-4:15pm
Purdue University, School of ECE

Prof. Ilya Pollak

Connection between Pascal triangle and probability theory: Number of successes in a sequence of independent Bernoulli trials

•  A Bernoulli trial is any probabilistic experiment with two possible outcomes, e.g.,
– Will Citigroup become insolvent during next 12 months?
– Democrats or Republicans in the next election?
– Will Dow Jones go up tomorrow?
– Will a new drug cure at least 80% of the patients?

•  Terminology: sometimes the two outcomes are called “success” and “failure.”

•  Suppose the probability of success is p. What is the probability of k successes in n independent trials?

Probability of k successes in n independent Bernoulli trials

•  n independent coin tosses, $P(H) = p$

•  E.g., $P(HTTHHH) = p(1-p)(1-p)p^3 = p^4(1-p)^2$

•  P(specific sequence with k H's and (n-k) T's) = $p^k (1-p)^{n-k}$

•  P(k heads) = (number of k-head sequences) · $p^k (1-p)^{n-k}$ = $\binom{n}{k} p^k (1-p)^{n-k}$, since the number of k-head sequences is the binomial coefficient $\binom{n}{k}$
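The counting argument above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides; the function name `binomial_pmf` is hypothetical, and the count of k-head sequences is taken to be the binomial coefficient, computed with the standard library's `math.comb`.

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials with success probability p."""
    # comb(n, k) counts the sequences with k heads;
    # each such sequence has probability p^k (1-p)^(n-k).
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Example: 6 fair tosses (p = 0.5), exactly 4 heads, the same head count as HTTHHH.
print(binomial_pmf(4, 6, 0.5))  # 15/64 = 0.234375
```

Note that a single sequence such as HTTHHH has probability 1/64 for a fair coin; the factor 15 accounts for all orderings with four heads.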

An interesting property of binomial coefficients

Since P(zero H's) + P(one H) + P(two H's) + … + P(n H's) = 1, it follows that

\[ \sum_{k=0}^{n} \binom{n}{k} p^k (1-p)^{n-k} = 1. \]

Another way to show the same thing is to realize that

\[ \sum_{k=0}^{n} \binom{n}{k} p^k (1-p)^{n-k} = (p + (1-p))^n = 1^n = 1. \]
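As a small worked check of this identity (an added illustration, using the case n = 2 chosen only for brevity), the three terms can be expanded by hand:

```latex
\[
\sum_{k=0}^{2} \binom{2}{k} p^k (1-p)^{2-k}
  = (1-p)^2 + 2p(1-p) + p^2
  = (1 - 2p + p^2) + (2p - 2p^2) + p^2
  = 1 .
\]
```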

Binomial probabilities: illustration

[Figure: plots of binomial probabilities]

Comments on binomial probabilities and the bell curve

•  Summing many independent random contributions usually leads to the bell-shaped distribution.

•  This is called the central limit theorem (CLT).

•  We have not yet covered the tools to precisely state the CLT, but we will later in the course.

•  The behavior of the binomial distribution for large n shown above is a manifestation of the CLT.

Interestingly, we get the bell curve even for asymmetric binomial probabilities
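A rough numerical check of this observation is sketched below. It assumes the bell curve in question is the normal density with mean np and standard deviation sqrt(np(1-p)), a standard choice that the slides do not state explicitly. The values n = 1000 and p = 0.8 and the helper names are illustrative.

```python
from math import comb, exp, pi, sqrt

def binomial_pmf(k, n, p):
    """Exact probability of k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def normal_pdf(x, mean, std):
    """Density of the normal (bell-curve) distribution."""
    return exp(-0.5 * ((x - mean) / std) ** 2) / (std * sqrt(2 * pi))

n, p = 1000, 0.8                      # asymmetric case: p is far from 1/2
mean, std = n * p, sqrt(n * p * (1 - p))

# Compare the exact binomial probabilities with the bell curve near k = np.
for k in (760, 780, 800, 820, 840):
    print(k, binomial_pmf(k, n, p), normal_pdf(k, mean, std))
```

The two printed columns should agree closely around k = 800, even though the underlying coin is far from fair.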

This tells us how to empirically estimate the probability of an event!

•  To estimate the probability p based on n flips, divide the observed number of H's by the total number of experiments: k/n.

•  To see the distribution of k/n for any n, simply rescale the x-axis in the distribution of k.

•  This distribution will tell us
– What we should expect our estimate to be, on average, and
– What error we should expect to make, on average
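The estimator k/n can be tried out with a short simulation. This is a sketch under assumptions: the true p = 0.8, n = 50 flips per experiment, 2000 repetitions, and the fixed seed are all illustrative choices, and `estimate_p` is a hypothetical helper name.

```python
import random

def estimate_p(n, p, rng):
    """Flip a p-coin n times and return the observed fraction of heads, k/n."""
    k = sum(1 for _ in range(n) if rng.random() < p)
    return k / n

rng = random.Random(0)                # fixed seed so the run is repeatable
true_p, n, repetitions = 0.8, 50, 2000

estimates = [estimate_p(n, true_p, rng) for _ in range(repetitions)]

# What we should expect the estimate to be, on average.
mean_estimate = sum(estimates) / repetitions
# What error we should expect to make, on average.
mean_abs_error = sum(abs(e - true_p) for e in estimates) / repetitions

print(mean_estimate, mean_abs_error)
```

The average estimate should land near the true p, while the average absolute error gives a feel for how far a single 50-flip estimate typically strays.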

Note:
o  for 50 flips, the most likely outcome is the correct one, 0.8
o  it’s also close to the “average” outcome
o  it’s very unlikely to make a mistake of more than 0.2
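The last point can be quantified exactly by summing binomial probabilities over the outcomes whose estimate misses by more than a given tolerance. The sketch below assumes p = 0.8 and uses a hypothetical helper `prob_error_exceeds`; exact fractions are used for the comparison so that boundary cases such as k/n = 0.6 are not misclassified by floating-point rounding.

```python
from fractions import Fraction
from math import comb

def prob_error_exceeds(n, p, eps):
    """Exact P(|k/n - p| > eps) when k is the number of heads in n flips.

    p and eps are Fractions so the comparison |k/n - p| > eps is exact.
    """
    q = float(p)
    return sum(
        comb(n, k) * q**k * (1 - q) ** (n - k)
        for k in range(n + 1)
        if abs(Fraction(k, n) - p) > eps
    )

# 50 flips, p = 0.8: chance that the estimate k/n is off by more than 0.2.
print(prob_error_exceeds(50, Fraction(4, 5), Fraction(1, 5)))
```

The same function can be called with other values of n and other tolerances to examine similar claims.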

If p=0.8, when estimating based on 1000 flips, it’s extremely unlikely to make a mistake of more than 0.05.

•  Hence, when the goal is to forecast a two-way election, and the actual p is reasonably far from 1/2, polling a few hundred people is very likely to give accurate results.

•  However,
o  independence is important;
o  getting a representative sample is important (for a country with 300M population, this is tricky!);
o  when the actual p is extremely close to 1/2 (e.g., the 2000 presidential election in Florida or the 2008 senatorial election in Minnesota), pollsters’ forecasts are about as accurate as a random guess.
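To illustrate how polling accuracy depends on the distance of p from 1/2, the sketch below computes the probability that a majority of poll respondents favors the true winner. It is an idealized model under stated assumptions: respondents are independent and representative, each favors the winner with probability p, and the poll size of 401 (odd, to avoid ties) and the listed values of p are illustrative. The function name is hypothetical.

```python
from math import comb

def prob_poll_picks_winner(n, p):
    """P(more than half of n independent respondents favor the candidate with support p)."""
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(n // 2 + 1, n + 1)
    )

# A few hundred respondents; p moves from clearly above 1/2 toward a dead heat.
for p in (0.6, 0.55, 0.51, 0.5001):
    print(p, prob_poll_picks_winner(401, p))
```

As p approaches 1/2, the printed probability drifts toward 1/2, which is what a random guess would achieve.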

Franken-Coleman election

•  Franken 1,212,629 votes
•  Coleman 1,212,317 votes

•  In our analysis, we will disregard the third-party candidate who got 437,505 votes (he actually makes pre-election polling even more complicated)

•  Effectively, p ≈ 0.500064
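The value of p follows directly from the two vote counts above, with the third-party votes left out as stated. A short check (variable names are illustrative):

```python
franken = 1_212_629
coleman = 1_212_317

# Fraction of the two-candidate vote that went to Franken.
p = franken / (franken + coleman)

print(f"{p:.6f}")  # 0.500064
```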

Probabilities for fractions of Franken vote in pre-election polling based on n=2.5M (more than all Franken and Coleman votes combined)

•  Even though we are unlikely to make an error of more than 0.001, this is not enough because p - 0.5 = 0.000064!
•  Note: 42% of the area under the bell curve is to the left of 1/2.
•  When the election is this close, no poll can accurately predict the outcome.
•  In fact, the noise in the voting process itself (voting machine malfunctions, human errors, etc.) becomes very important in determining the outcome.
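The 42% figure can be approximately reproduced with a bell-curve calculation. The sketch below assumes the poll fraction k/n is well described by a normal distribution with mean p and standard deviation sqrt(p(1-p)/n), and that the n = 2.5M responses are independent; these modeling choices and the helper name `normal_cdf` are assumptions, not taken from the slides.

```python
from math import erf, sqrt

def normal_cdf(x, mean, std):
    """Area under the bell curve to the left of x."""
    return 0.5 * (1 + erf((x - mean) / (std * sqrt(2))))

p = 1_212_629 / (1_212_629 + 1_212_317)   # two-candidate Franken fraction
n = 2_500_000                              # poll size used in the slide
std = sqrt(p * (1 - p) / n)                # spread of the poll fraction k/n

# Chance the poll shows Franken below 1/2, i.e., predicts the wrong winner.
prob_wrong_winner = normal_cdf(0.5, p, std)
# Chance the poll fraction is off by more than 0.001 in either direction.
prob_large_error = 2 * normal_cdf(p - 0.001, p, std)

print(prob_wrong_winner, prob_large_error)
```

Under these assumptions the first number comes out near 0.42 and the second is well under 1%, consistent with the two bullets above.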

Estimating the probability of success in a Bernoulli trial: summary

•  As the number n of independent experiments increases, the empirical fraction of occurrences of success becomes close to the actual probability of success, p.

•  The error goes down proportionally to $1/\sqrt{n}$. I.e., the error after 400 trials is half the error after 100 trials.

•  This is called the law of large numbers.

•  This result will be precisely described later in the course.
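The scaling of the error with n can be illustrated with the standard deviation of the fraction k/n, which equals sqrt(p(1-p)/n) for independent trials. Using the standard deviation as the measure of "error" is an assumption made here for illustration; the value p = 0.8 and the function name are likewise illustrative.

```python
from math import sqrt

def std_of_fraction(n, p):
    """Standard deviation of k/n for n independent trials with success probability p."""
    return sqrt(p * (1 - p) / n)

# Quadrupling the number of trials from 100 to 400 should halve the typical error.
ratio = std_of_fraction(100, 0.8) / std_of_fraction(400, 0.8)
print(ratio)  # approximately 2
```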
