
Concentration inequalities and tail bounds

John Duchi


Outline

I. Basics and motivation
   1. Law of large numbers
   2. Markov inequality
   3. Chernoff bounds

II. Sub-Gaussian random variables
   1. Definitions
   2. Examples
   3. Hoeffding inequalities

III. Sub-exponential random variables
   1. Definitions
   2. Examples
   3. Chernoff/Bernstein bounds


Motivation

- Often in this class, the goal is to argue that a sequence of random variables (or vectors) X_1, X_2, \ldots satisfies
\[
\frac{1}{n} \sum_{i=1}^{n} X_i \stackrel{p}{\longrightarrow} E[X].
\]

- Law of large numbers: if E[\|X\|] < \infty, then
\[
P\left( \lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} X_i \neq E[X] \right) = 0.
\]
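As a quick sanity check of this convergence, here is a small NumPy simulation; the Exponential(1) population (so E[X] = 1), the seed, and the sample sizes are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Draw i.i.d. Exponential(1) samples, so E[X] = 1.
n_max = 100_000
x = rng.exponential(scale=1.0, size=n_max)

# Running sample means (1/n) * sum_{i<=n} X_i for increasing n.
running_mean = np.cumsum(x) / np.arange(1, n_max + 1)

for n in (10, 100, 1_000, 10_000, 100_000):
    print(f"n = {n:>7d}   sample mean = {running_mean[n - 1]:.4f}")
# The printed means settle toward E[X] = 1 as n grows.
```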


Markov inequalities

Theorem (Markov’s inequality)

Let X be a non-negative random variable. Then for t > 0,
\[
P(X \geq t) \leq \frac{E[X]}{t}.
\]
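To get a feel for how loose Markov's inequality can be, one can compare both sides by simulation; the sketch below uses an Exponential(1) variable purely as an illustrative choice.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.exponential(scale=1.0, size=1_000_000)  # non-negative, E[X] = 1

for t in (1.0, 2.0, 5.0, 10.0):
    empirical = np.mean(x >= t)    # estimate of P(X >= t)
    markov = x.mean() / t          # Markov bound E[X] / t
    print(f"t = {t:4.1f}   P(X >= t) ~ {empirical:.5f}   Markov bound = {markov:.5f}")
# The bound E[X]/t always dominates the empirical tail, but can be quite loose.
```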


Chebyshev inequalities

Theorem (Chebyshev’s inequality)

Let X be a real-valued random variable with E[X^2] < \infty. Then
\[
P\left( |X - E[X]| \geq t \right) \leq \frac{E\left[ (X - E[X])^2 \right]}{t^2} = \frac{\mathrm{Var}(X)}{t^2}.
\]

Example: i.i.d. sampling
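One way to carry out the i.i.d. sampling example: if X_1, \ldots, X_n are i.i.d. copies of X with \mathrm{Var}(X) < \infty, the sample mean \bar{X}_n = \frac{1}{n} \sum_{i=1}^{n} X_i has \mathrm{Var}(\bar{X}_n) = \mathrm{Var}(X)/n, so Chebyshev's inequality gives
\[
P\left( \left| \bar{X}_n - E[X] \right| \geq t \right) \leq \frac{\mathrm{Var}(X)}{n t^2},
\]
a polynomial 1/n rate, to be contrasted with the exponential rates from the Chernoff-type bounds that follow.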


Chernoff bounds

Moment generating function: for a random variable X, the MGF is
\[
M_X(\lambda) := E\left[ e^{\lambda X} \right].
\]

Example: Normally distributed random variables
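For the normal example, the MGF has a standard closed form (complete the square in the Gaussian integral): if X \sim N(\mu, \sigma^2), then
\[
M_X(\lambda) = E\left[ e^{\lambda X} \right] = \exp\left( \mu \lambda + \frac{\lambda^2 \sigma^2}{2} \right) \quad \text{for all } \lambda \in \mathbb{R}.
\]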


Chernoff bounds

Theorem (Chernoff bound)

For any random variable X and t \geq 0,
\[
P(X - E[X] \geq t) \leq \inf_{\lambda \geq 0} M_{X - E[X]}(\lambda) \, e^{-\lambda t}
= \inf_{\lambda \geq 0} E\left[ e^{\lambda (X - E[X])} \right] e^{-\lambda t}.
\]
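As a worked instance of the bound (anticipating the sub-Gaussian case defined next): if M_{X - E[X]}(\lambda) \leq \exp(\lambda^2 \sigma^2 / 2) for all \lambda \geq 0, then
\[
P(X - E[X] \geq t) \leq \inf_{\lambda \geq 0} \exp\left( \frac{\lambda^2 \sigma^2}{2} - \lambda t \right) = \exp\left( -\frac{t^2}{2 \sigma^2} \right),
\]
the infimum being attained at \lambda = t / \sigma^2.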


Sub-Gaussian random variables

Definition (Sub-Gaussianity)

A mean-zero random variable X is \sigma^2-sub-Gaussian if
\[
E\left[ e^{\lambda X} \right] \leq \exp\left( \frac{\lambda^2 \sigma^2}{2} \right) \quad \text{for all } \lambda \in \mathbb{R}.
\]

Example: X \sim N(0, \sigma^2)
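Another standard example (beyond the Gaussian case on the slide) is a Rademacher variable X with P(X = \pm 1) = 1/2:
\[
E\left[ e^{\lambda X} \right] = \cosh(\lambda) = \sum_{k=0}^{\infty} \frac{\lambda^{2k}}{(2k)!} \leq \sum_{k=0}^{\infty} \frac{\lambda^{2k}}{2^k k!} = e^{\lambda^2 / 2},
\]
so X is 1-sub-Gaussian.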


Properties of sub-Gaussians

Proposition (sums of sub-Gaussians)

Let X_i be independent, mean-zero \sigma_i^2-sub-Gaussian random variables. Then \sum_{i=1}^{n} X_i is \left( \sum_{i=1}^{n} \sigma_i^2 \right)-sub-Gaussian.
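The proof is essentially one line: by independence the MGF factors, so for any \lambda \in \mathbb{R},
\[
E\left[ e^{\lambda \sum_{i=1}^{n} X_i} \right] = \prod_{i=1}^{n} E\left[ e^{\lambda X_i} \right] \leq \prod_{i=1}^{n} e^{\lambda^2 \sigma_i^2 / 2} = \exp\left( \frac{\lambda^2}{2} \sum_{i=1}^{n} \sigma_i^2 \right).
\]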


Concentration inequalities

Theorem

Let X be \sigma^2-sub-Gaussian. Then for t \geq 0,
\[
P(X - E[X] \geq t) \leq \exp\left( -\frac{t^2}{2 \sigma^2} \right)
\quad \text{and} \quad
P(X - E[X] \leq -t) \leq \exp\left( -\frac{t^2}{2 \sigma^2} \right).
\]
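A quick Monte Carlo check of the upper-tail bound; for illustration take X \sim N(0, 1), so \sigma^2 = 1 and E[X] = 0.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma2 = 1.0
x = rng.normal(scale=np.sqrt(sigma2), size=1_000_000)  # sigma^2-sub-Gaussian

for t in (0.5, 1.0, 2.0, 3.0):
    empirical = np.mean(x >= t)              # estimate of P(X - E[X] >= t)
    bound = np.exp(-t**2 / (2 * sigma2))     # sub-Gaussian tail bound
    print(f"t = {t:3.1f}   P(X >= t) ~ {empirical:.5f}   bound = {bound:.5f}")
# The bound holds at every t; for the Gaussian the true tail is smaller,
# roughly by a factor 1/(t*sqrt(2*pi)) for large t, but the e^{-t^2/(2 sigma^2)}
# decay rate is captured.
```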


Concentration: convergence of an independent sum

Corollary

Let X_i be independent \sigma_i^2-sub-Gaussian. Then for t \geq 0,
\[
P\left( \frac{1}{n} \sum_{i=1}^{n} X_i \geq t \right) \leq \exp\left( -\frac{n t^2}{2 \cdot \frac{1}{n} \sum_{i=1}^{n} \sigma_i^2} \right).
\]


Example: bounded random variables

Proposition

Let X \in [a, b] with E[X] = 0. Then
\[
E\left[ e^{\lambda X} \right] \leq \exp\left( \frac{\lambda^2 (b - a)^2}{8} \right).
\]


Maxima of sub-Gaussian random variables (in expectation)

For X_1, \ldots, X_n each \sigma^2-sub-Gaussian (not necessarily independent),
\[
E\left[ \max_{j \leq n} X_j \right] \leq \sqrt{2 \sigma^2 \log n}.
\]
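A short derivation via the MGF: for any \lambda > 0, Jensen's inequality and the bound e^{\lambda \max_j X_j} \leq \sum_j e^{\lambda X_j} give
\[
\exp\left( \lambda \, E\left[ \max_{j \leq n} X_j \right] \right) \leq E\left[ e^{\lambda \max_{j \leq n} X_j} \right] \leq \sum_{j=1}^{n} E\left[ e^{\lambda X_j} \right] \leq n \, e^{\lambda^2 \sigma^2 / 2},
\]
so E[\max_{j \leq n} X_j] \leq \frac{\log n}{\lambda} + \frac{\lambda \sigma^2}{2}; taking \lambda = \sqrt{2 \log n} / \sigma yields \sqrt{2 \sigma^2 \log n}.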


Maxima of sub-Gaussian random variables (in probability)

For X_1, \ldots, X_n each \sigma^2-sub-Gaussian (not necessarily independent) and t \geq 0,
\[
P\left( \max_{j \leq n} X_j \geq \sqrt{2 \sigma^2 (\log n + t)} \right) \leq e^{-t}.
\]


Hoeffding’s inequality

If the X_i are independent and bounded in [a_i, b_i], then for t \geq 0,
\[
P\left( \frac{1}{n} \sum_{i=1}^{n} \left( X_i - E[X_i] \right) \geq t \right) \leq \exp\left( -\frac{2 n t^2}{\frac{1}{n} \sum_{i=1}^{n} (b_i - a_i)^2} \right)
\]
and
\[
P\left( \frac{1}{n} \sum_{i=1}^{n} \left( X_i - E[X_i] \right) \leq -t \right) \leq \exp\left( -\frac{2 n t^2}{\frac{1}{n} \sum_{i=1}^{n} (b_i - a_i)^2} \right).
\]
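A numerical illustration of the one-sided bound; the Bernoulli(1/2) population and the values of n and t below are arbitrary illustrative choices (here a_i = 0, b_i = 1).

```python
import numpy as np

rng = np.random.default_rng(0)
n, trials, t = 200, 50_000, 0.1

# i.i.d. Bernoulli(1/2) samples: a_i = 0, b_i = 1, E[X_i] = 1/2.
x = rng.integers(0, 2, size=(trials, n))
deviations = x.mean(axis=1) - 0.5           # (1/n) sum (X_i - E[X_i])

empirical = np.mean(deviations >= t)        # estimate of the tail probability
hoeffding = np.exp(-2 * n * t**2)           # bound, since (b_i - a_i)^2 = 1 for all i
print(f"empirical P(mean deviation >= {t}) ~ {empirical:.5f}")
print(f"Hoeffding bound                     = {hoeffding:.5f}")
```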


Equivalent definitions of sub-Gaussianity

Theorem

The following are equivalent up to constants (the value of \sigma may change by a universal constant factor between statements):

i. E[\exp(X^2 / \sigma^2)] \leq e

ii. E[|X|^k]^{1/k} \leq \sigma \sqrt{k} for all k \geq 1

iii. P(|X| \geq t) \leq \exp\left( -\frac{t^2}{2 \sigma^2} \right) for all t \geq 0

If in addition X is mean-zero, then i–iii are also equivalent to

iv. X is \sigma^2-sub-Gaussian.


Sub-exponential random variables

Definition (Sub-exponential)

A mean-zero random variable X is (\tau^2, b)-sub-exponential if
\[
E\left[ \exp(\lambda X) \right] \leq \exp\left( \frac{\lambda^2 \tau^2}{2} \right) \quad \text{for } |\lambda| \leq \frac{1}{b}.
\]

Example: Exponential random variable, with density p(x) = \lambda e^{-\lambda x} for x \geq 0.
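To carry out the example (writing s for the MGF argument to avoid clashing with the rate \lambda): for X with density p(x) = \lambda e^{-\lambda x} on x \geq 0,
\[
E\left[ e^{s X} \right] = \int_0^{\infty} \lambda e^{-(\lambda - s) x} \, dx = \frac{\lambda}{\lambda - s} \quad \text{for } s < \lambda,
\]
and the MGF is infinite for s \geq \lambda. So the MGF exists only on a bounded interval, which is exactly the sub-exponential (rather than sub-Gaussian) behavior the definition allows.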


Sub-exponential random variables

Example: \chi^2 random variable. Let Z \sim N(0, \sigma^2) and X = Z^2. Then
\[
E\left[ e^{\lambda X} \right] = \frac{1}{\left[ 1 - 2 \lambda \sigma^2 \right]_{+}^{1/2}},
\]
which is finite only when \lambda < \frac{1}{2 \sigma^2}.
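A Monte Carlo sanity check of this MGF formula; \sigma^2 = 1 and the values of \lambda are illustrative choices, kept well below 1/(2\sigma^2) so the sample average is stable.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma2 = 1.0
z = rng.normal(scale=np.sqrt(sigma2), size=2_000_000)
x = z**2                               # squared Gaussian (chi-squared type)

for lam in (-0.5, -0.2, 0.1, 0.2):     # need lam < 1/(2 sigma^2) = 0.5
    mc = np.mean(np.exp(lam * x))      # Monte Carlo estimate of E[e^{lam X}]
    exact = 1.0 / np.sqrt(1.0 - 2.0 * lam * sigma2)
    print(f"lambda = {lam:+.1f}   MC = {mc:.4f}   1/sqrt(1 - 2*lambda*sigma^2) = {exact:.4f}")
```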


Concentration of sub-exponentials

Theorem

Let X be (\tau^2, b)-sub-exponential. Then
\[
P(X \geq E[X] + t) \leq
\begin{cases}
e^{-t^2/(2\tau^2)} & \text{if } 0 \leq t \leq \tau^2 / b \\
e^{-t/(2b)} & \text{if } t \geq \tau^2 / b
\end{cases}
= \max\left\{ e^{-t^2/(2\tau^2)},\; e^{-t/(2b)} \right\}.
\]


Sums of sub-exponential random variables

Let X_i be independent (\tau_i^2, b_i)-sub-exponential random variables. Then \sum_{i=1}^{n} X_i is \left( \sum_{i=1}^{n} \tau_i^2,\; b_* \right)-sub-exponential, where b_* = \max_i b_i.

Corollary: If the X_i satisfy the above, then
\[
P\left( \left| \frac{1}{n} \sum_{i=1}^{n} \left( X_i - E[X_i] \right) \right| \geq t \right) \leq 2 \exp\left( -\min\left\{ \frac{n t^2}{2 \cdot \frac{1}{n} \sum_{i=1}^{n} \tau_i^2},\; \frac{n t}{2 b_*} \right\} \right).
\]


Bernstein conditions and sub-exponentials

Suppose X is mean-zero with
\[
\left| E[X^k] \right| \leq \frac{1}{2} k! \, \sigma^2 b^{k-2} \quad \text{for } k = 2, 3, \ldots
\]
Then for |\lambda| < 1/b,
\[
E\left[ e^{\lambda X} \right] \leq \exp\left( \frac{\lambda^2 \sigma^2}{2 (1 - b |\lambda|)} \right).
\]


Johnson-Lindenstrauss and high-dimensional embedding

Question: Let u_1, \ldots, u_m \in \mathbb{R}^d be arbitrary. Can we find a mapping F : \mathbb{R}^d \to \mathbb{R}^n, with n \ll d, such that
\[
(1 - \epsilon) \left\| u_i - u_j \right\|_2^2 \leq \left\| F(u_i) - F(u_j) \right\|_2^2 \leq (1 + \epsilon) \left\| u_i - u_j \right\|_2^2 \quad \text{for all pairs } i, j?
\]

Theorem (Johnson-Lindenstrauss embedding)

For n \gtrsim \frac{1}{\epsilon^2} \log m, such a mapping exists.
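One standard construction, sketched below under the assumption of a Gaussian random projection (as in the proof that follows): take F(u) = X u / \sqrt{n} with X an n \times d matrix of i.i.d. N(0, 1) entries. The dimensions, \epsilon, and the constant in the choice of n are illustrative, unoptimized choices.

```python
import numpy as np

rng = np.random.default_rng(0)
d, m, eps = 10_000, 50, 0.25
n = int(np.ceil(16 * np.log(m) / eps**2))   # n of order (1/eps^2) log m

U = rng.normal(size=(m, d))                 # m (arbitrary) points in R^d
X = rng.normal(size=(n, d))                 # Gaussian projection matrix
F = U @ X.T / np.sqrt(n)                    # F(u_i) = X u_i / sqrt(n)

# Compare squared pairwise distances before and after projection.
worst = 0.0
for i in range(m):
    for j in range(i + 1, m):
        orig = np.sum((U[i] - U[j])**2)
        proj = np.sum((F[i] - F[j])**2)
        worst = max(worst, abs(proj / orig - 1.0))
print(f"n = {n}, worst relative distortion = {worst:.3f} (eps = {eps})")
# With this n, the worst distortion is typically well below eps.
```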


Proof of Johnson-Lindenstrauss continued

\[
P\left( \left| \frac{\| X u \|_2^2}{n \| u \|_2^2} - 1 \right| \geq t \right) \leq 2 \exp\left( -\frac{n t^2}{8} \right) \quad \text{for } t \in [0, 1].
\]


Reading and bibliography

1. S. Boucheron, O. Bousquet, and G. Lugosi. Concentration inequalities. In O. Bousquet, U. Luxburg, and G. Ratsch, editors, Advanced Lectures in Machine Learning, pages 208–240. Springer, 2004.

2. V. Buldygin and Y. Kozachenko. Metric Characterization of Random Variables and Random Processes, volume 188 of Translations of Mathematical Monographs. American Mathematical Society, 2000.

3. M. Ledoux. The Concentration of Measure Phenomenon. American Mathematical Society, 2001.

4. S. Boucheron, G. Lugosi, and P. Massart. Concentration Inequalities: A Nonasymptotic Theory of Independence. Oxford University Press, 2013.

