CS 121: Lecture 23: Probability Review
Madhu Sudan
https://madhu.seas.Harvard.edu/courses/Fall2020
Book: https://introtcs.org
How to contact us
• The whole staff (faster response): CS 121 Piazza
• Only the course heads (slower): [email protected]
Announcements:
• Advanced section: Roei Tell, (De)Randomization
• 2nd feedback survey
• Homework 6 out today, due 12/3/2020
• Section 11 week starts
Where we are:
Part I: Circuits: Finite computation, quantitative study
Part II: Automata: Infinite restricted computation, quantitative study
Part III: Turing Machines: Infinite computation, qualitative study
Part IV: Efficient Computation: Infinite computation, quantitative study
Part V: Randomized computation: Extending studies to non-classical algorithms
Review of course so far
• Circuits:
  • Compute every finite function … but no infinite function
  • Measure of complexity: SIZE. ALL_n ⊆ SIZE(O(2^n / n)); ∃f ∉ SIZE(o(2^n / n))
• Finite Automata:
  • Compute some infinite functions (all regular functions)
  • Do not compute many (easy) functions (e.g. EQ(x, y) = 1 ⟺ x = y)
• Turing Machines:
  • Compute everything computable (definition/thesis)
  • HALT is not computable
• Efficient Computation:
  • P: polytime computable Boolean functions; our notion of "efficiently computable"
  • NP: efficiently verifiable Boolean functions; our wishlist …
  • NP-complete: hardest problems in NP: 3SAT, ISET, MaxCut, EU3SAT, Subset Sum
Key techniques: Counting (circuits); Pumping/Pigeonhole (automata); Diagonalization, Reductions (Turing machines); Polytime Reductions (efficient computation)
Last Module: Challenges to STCT
• Strong Turing-Church Thesis: everything physically computable in polynomial time can be computed in polynomial time on a Turing Machine.
• Challenges:
  • Randomized algorithms: already being run by our computers (weather simulation, market prediction, …)
  • Quantum algorithms
• Status:
  • Randomized: may not add power?
  • Quantum: may add power?
  • Still can be studied with the same tools. New classes: BPP, BQP
    (B = bounded error; P = probabilistic; Q = quantum)
Today: Review of Basic Probability
• Sample space
• Events
• Union/intersection/negation ↔ AND/OR/NOT of events
• Random variables
• Expectation
• Concentration / tail bounds
Throughout this lecture
Probabilistic experiment: tossing n independent unbiased coins.
Equivalently: choose x ∼ {0,1}^n uniformly.
Events
Fix the sample space: x ∼ {0,1}^n.
An event is a set A ⊆ {0,1}^n. The probability that A happens is Pr[A] = |A| / 2^n.
Example: If x ∼ {0,1}^3, then Pr[x0 = 1] = 4/8 = 1/2.
[Figure: binary tree of the three coin tosses, with leaves 000, 001, 010, 011, 100, 101, 110, 111]
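The example Pr[x0 = 1] = 1/2 can be checked by brute-force enumeration of the sample space (a small Python sketch, not part of the original slides):

```python
from itertools import product

# Sample space: all 2^3 outcomes of three unbiased coin tosses.
n = 3
outcomes = list(product([0, 1], repeat=n))

# Event A = {x : x_0 = 1}; Pr[A] = |A| / 2^n.
A = [x for x in outcomes if x[0] == 1]
print(len(A) / 2 ** n)  # 0.5
```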
Q: Let n = 3, A = {x0 = 1}, B = {x0 + x1 + x2 ≥ 2}, C = {x0 + x1 + x2 = 1 mod 2}.
What are (i) Pr[B], (ii) Pr[C], (iii) Pr[A ∩ B], (iv) Pr[A ∩ C], (v) Pr[B ∩ C]?
A: (i) Pr[B] = 4/8 = 1/2, (ii) Pr[C] = 4/8 = 1/2, (iii) Pr[A ∩ B] = 3/8, (iv) Pr[A ∩ C] = 2/8 = 1/4, (v) Pr[B ∩ C] = 1/8
Compare: Pr[A ∩ B] vs Pr[A]·Pr[B], and Pr[A ∩ C] vs Pr[A]·Pr[C].
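The quiz values and the comparison above can be verified by enumeration (illustrative Python, exact arithmetic via fractions):

```python
from itertools import product
from fractions import Fraction

outcomes = list(product([0, 1], repeat=3))

def pr(event):
    """Probability of an event over the uniform sample space {0,1}^3."""
    return Fraction(sum(1 for x in outcomes if event(x)), len(outcomes))

A = lambda x: x[0] == 1
B = lambda x: sum(x) >= 2
C = lambda x: sum(x) % 2 == 1

pr_AB = pr(lambda x: A(x) and B(x))  # 3/8
pr_AC = pr(lambda x: A(x) and C(x))  # 1/4
print(pr_AB, pr(A) * pr(B))  # 3/8 vs 1/4: A and B are NOT independent
print(pr_AC, pr(A) * pr(C))  # 1/4 vs 1/4: A and C ARE independent
```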
Operations on events
Pr[A or B happens] = Pr[A ∪ B]
Pr[A and B happen] = Pr[A ∩ B]
Pr[A doesn't happen] = Pr[¬A] = Pr[{0,1}^n ∖ A] = 1 − Pr[A]
Q: Prove the union bound: Pr[A ∪ B] ≤ Pr[A] + Pr[B]
Example: Pr[A] = 4/25, Pr[B] = 6/25, Pr[A ∪ B] = 9/25
Independence
Informal: two events A, B are independent if knowing that A happened doesn't give any information on whether B happened.
Formal: two events A, B are independent if Pr[A ∩ B] = Pr[A]·Pr[B]
Q: Which pairs are independent? [Figure: three example pairs of events (i), (ii), (iii)]
Pr[B | A] = Pr[B]. "Conditioning is the soul of statistics."
More than 2 events
Informal: events are independent if knowing whether some of them happened doesn't give any information on whether the others happened.
Formal: three events A, B, C are independent if every pair (A, B), (A, C), (B, C) is independent and Pr[A ∩ B ∩ C] = Pr[A]·Pr[B]·Pr[C]
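The triple condition is not implied by pairwise independence. A classic two-coin example (added here for illustration, not from the slides):

```python
from itertools import product
from fractions import Fraction

outcomes = list(product([0, 1], repeat=2))

def pr(event):
    return Fraction(sum(1 for x in outcomes if event(x)), len(outcomes))

A = lambda x: x[0] == 0
B = lambda x: x[1] == 0
C = lambda x: x[0] == x[1]  # "the two coins agree"

# Every pair is independent...
assert pr(lambda x: A(x) and B(x)) == pr(A) * pr(B)
assert pr(lambda x: A(x) and C(x)) == pr(A) * pr(C)
assert pr(lambda x: B(x) and C(x)) == pr(B) * pr(C)
# ...but the triple is not: Pr[A ∩ B ∩ C] = 1/4, while the product is 1/8.
assert pr(lambda x: A(x) and B(x) and C(x)) != pr(A) * pr(B) * pr(C)
```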
Random variables
Assign a number to every outcome of the coins. Formally, an r.v. is X: {0,1}^n → ℝ.
Example: X(x) = x0 + x1 + x2
Values of X on 000, 001, 010, 011, 100, 101, 110, 111: 0, 1, 1, 2, 1, 2, 2, 3

v | Pr[X = v]
0 | 1/8
1 | 3/8
2 | 3/8
3 | 1/8
Expectation
Average value of X: E[X] = Σ_{x∈{0,1}^n} 2^{−n}·X(x) = Σ_{v∈ℝ} v · Pr[X = v]
Example (as above): X(x) = x0 + x1 + x2, with Pr[X = v] = 1/8, 3/8, 3/8, 1/8 for v = 0, 1, 2, 3.
Q: What is E[X]?   0·(1/8) + 1·(3/8) + 2·(3/8) + 3·(1/8) = 12/8 = 1.5
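The distribution table and E[X] = 1.5 can be reproduced by enumeration (illustrative Python):

```python
from itertools import product
from fractions import Fraction
from collections import Counter

outcomes = list(product([0, 1], repeat=3))
X = lambda x: sum(x)  # X(x) = x0 + x1 + x2

# PMF: Pr[X = v] for each value v.
counts = Counter(X(x) for x in outcomes)
pmf = {v: Fraction(c, len(outcomes)) for v, c in counts.items()}

# E[X] two ways: average over outcomes, and sum of v * Pr[X = v].
E1 = Fraction(sum(X(x) for x in outcomes), len(outcomes))
E2 = sum(v * p for v, p in pmf.items())
assert E1 == E2 == Fraction(3, 2)
```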
Linearity of expectation
Lemma: E[X + Y] = E[X] + E[Y]
Corollary: E[x0 + x1 + x2] = E[x0] + E[x1] + E[x2] = 1.5
Proof: E[X + Y] = 2^{−n} Σ_x (X(x) + Y(x)) = 2^{−n} Σ_x X(x) + 2^{−n} Σ_x Y(x) = E[X] + E[Y]
Independent random variables
Def: X, Y are independent if the events {X = u} and {Y = v} are independent for all u, v.
Def: X0, …, X_{n−1} are independent if for every v0, …, v_{n−1} and every S ⊆ [n]:
Pr[⋀_{i∈S} X_i = v_i] = ∏_{i∈S} Pr[X_i = v_i]
Q: Let x ∼ {0,1}^n. Let X0 = x0, X1 = x1, …, X_{n−1} = x_{n−1}.
Let Y0 = Y1 = ⋯ = Y_{n−1} = x0.
Are X0, …, X_{n−1} independent? Are Y0, …, Y_{n−1} independent?
Independence and concentration
Q: Let x ∼ {0,1}^n. Let X0 = x0, X1 = x1, …, X_{n−1} = x_{n−1}, and Y0 = Y1 = ⋯ = Y_{n−1} = x0.
Let X = X0 + ⋯ + X_{n−1} and Y = Y0 + ⋯ + Y_{n−1}.
Compute E[X] and E[Y].
For n = 100, estimate Pr[X ∉ [0.4n, 0.6n]] and Pr[Y ∉ [0.4n, 0.6n]].
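A small simulation contrasting the two sums (a sketch, not from the slides; 10,000 trials, seed fixed for reproducibility):

```python
import random
random.seed(0)

n, trials = 100, 10_000
outside = lambda s: not (0.4 * n <= s <= 0.6 * n)

miss_X = miss_Y = 0
for _ in range(trials):
    bits = [random.randint(0, 1) for _ in range(n)]
    X = sum(bits)      # sum of n independent bits, E[X] = n/2
    Y = n * bits[0]    # n copies of the first bit: always 0 or n
    miss_X += outside(X)
    miss_Y += outside(Y)

print(miss_X / trials)  # small: X concentrates around 50
print(miss_Y / trials)  # 1.0: Y is always 0 or 100, never in [40, 60]
```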
Sums of independent random variables
Binomial histograms: https://www.stat.berkeley.edu/~stark/Java/Html/BinHist.htm

n = 5:    Pr[X ∉ [0.4, 0.6]·n] = 37.5%
n = 10:   Pr[X ∉ [0.4, 0.6]·n] = 34.4%
n = 20:   Pr[X ∉ [0.4, 0.6]·n] = 26.3%
n = 50:   Pr[X ∉ [0.4, 0.6]·n] = 11.9%
n = 100:  Pr[X ∉ [0.4, 0.6]·n] = 3.6%
n = 200:  Pr[X ∉ [0.4, 0.6]·n] = 0.4%
n = 400:  Pr[X ∉ [0.4, 0.6]·n] = 0.01%
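The tail probabilities above can be recomputed directly from the binomial distribution, rather than read off the linked applet (displayed values may differ slightly in rounding):

```python
from math import comb

def tail_outside(n):
    """Pr[X not in [0.4n, 0.6n]] for X = sum of n unbiased coins."""
    inside = sum(comb(n, k) for k in range(n + 1) if 0.4 * n <= k <= 0.6 * n)
    return 1 - inside / 2 ** n

for n in (5, 10, 20, 50, 100, 200, 400):
    print(n, f"{tail_outside(n):.2%}")  # shrinks as n grows
```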
Concentration bounds
X a sum of n independent random variables ⇒ ≈ "Normal/Gaussian/bell curve":
Pr[X ∉ [0.99, 1.01]·E[X]] < exp(−δ·n)
Otherwise: weaker bounds
In concentrated r.v.'s: expectation, median, mode, etc. ≈ same

Chernoff Bound: Let X0, …, X_{n−1} be i.i.d. (independent & identically distributed) r.v.'s with X_i ∈ [0,1]. Then if X = X0 + ⋯ + X_{n−1} and μ = E[X_i], for every ε > 0:
Pr[|X − μn| > εn] < 2·exp(−2ε²·n)
(Note: 2·exp(−2ε²·n) < exp(−ε²·n) once n > 1/ε².)
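A numeric sanity check of the Chernoff bound against the exact binomial tail (taking the X_i to be unbiased bits, so μ = 1/2; n = 100, ε = 0.1):

```python
from math import comb, exp

def exact_tail(n, eps):
    """Pr[|X - n/2| > eps*n] for X = sum of n unbiased coin flips."""
    outside = sum(comb(n, k) for k in range(n + 1) if abs(k - n / 2) > eps * n)
    return outside / 2 ** n

n, eps = 100, 0.1
chernoff = 2 * exp(-2 * eps ** 2 * n)  # = 2*exp(-2), about 0.27
print(exact_tail(n, eps), chernoff)    # the exact tail sits below the bound
```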
Simplest Bound: Markov's Inequality
Q: Suppose that the average age in a neighborhood is 20. Prove that at most 1/4 of the residents are 80 or older.
Thm (Markov's Inequality): Let X be a non-negative r.v. and μ = E[X]. Then for every k > 1, Pr[X ≥ kμ] ≤ 1/k.
Proof: same as the question above.
A: Suppose otherwise, i.e., more than 1/4 of the residents are 80 or older. Those residents alone contribute more than (1/4)·80 = 20 to the average age, so the average would exceed 20, a contradiction.
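An empirical check of Markov's inequality; the exponential sample below is a hypothetical choice made just for illustration (any non-negative distribution works):

```python
import random
random.seed(1)

# Draw a non-negative sample; expovariate(1.0) has mean ~1.
values = [random.expovariate(1.0) for _ in range(100_000)]
mu = sum(values) / len(values)

for k in (2, 4, 8):
    # Empirical Pr[X >= k*mu] vs Markov's bound 1/k.
    frac = sum(v >= k * mu for v in values) / len(values)
    print(k, frac, 1 / k)
    assert frac <= 1 / k
```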
Variance & Chebyshev
If X is an r.v. with μ = E[X], then Var[X] = E[(X − μ)²] = E[X²] − E[X]².  σ(X) = √Var(X).
Chebyshev's Inequality: For every r.v. X,
Pr[X deviates from μ by ≥ k standard deviations] ≤ 1/k², i.e. Pr[|X − μ| ≥ k·σ(X)] ≤ 1/k²
Compare with X = Σ X_i for i.i.d. X_i, or other r.v.'s well approximated by the Normal, where Pr[|X − μ| ≥ k·σ(X)] ≈ exp(−k²)
(Proof: Markov on Y = (X − μ)².)
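Comparing Chebyshev's bound with the exact tail of a binomial sum (X = sum of 100 unbiased bits, so μ = 50, Var = n/4, σ = 5; a sketch, not from the slides):

```python
from math import comb, sqrt

n = 100
mu, sigma = n / 2, sqrt(n) / 2  # Var = n/4 for a sum of unbiased bits
pmf = [comb(n, j) / 2 ** n for j in range(n + 1)]

for k in (2, 3, 4):
    # Exact Pr[|X - mu| >= k*sigma] vs Chebyshev's bound 1/k^2.
    tail = sum(p for j, p in enumerate(pmf) if abs(j - mu) >= k * sigma)
    print(k, tail, 1 / k ** 2)  # the exact tail is far below 1/k^2
```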
Next Lectures
• Randomized Algorithms
  • Some examples
• Randomized Complexity Classes
  • BPTIME(T(n)), BPP
  • Properties of randomized computation (reducing error, …)