Randomness Extractors & their Many Guises
Salil Vadhan
Harvard University
to be posted at http://eecs.harvard.edu/~salil
Original Motivation [SV84,Vaz85,VV85,CG85,Vaz87,CW89,Zuc90,Zuc91]
• Randomization is pervasive in CS
  – Algorithm design, cryptography, distributed computing, …
• Typically assume a perfect random source.
  – Unbiased, independent random bits
  – Unrealistic?
• Can we use a "weak" random source?
  – Source of biased & correlated bits.
  – More realistic model of physical sources.
• (Randomness) Extractors: convert a weak random source into an almost-perfect random source.
Applications of Extractors
• Derandomization of BPP [Sip88,GZ97,MV99,STV99]
• Derandomization of logspace algorithms [NZ93,INW94,RR99,GW02]
• Distributed & Network Algorithms [WZ95,Zuc97,RZ98,Ind02]
• Hardness of Approximation [Zuc93,Uma99,MU01]
• Cryptography [CDHKS00,MW00,Lu02]
• Data Structures [Ta02]
The Unifying Role of Extractors
Extractors can be viewed as types of:
• Hash Functions
• Expander Graphs
• Samplers
• Pseudorandom Generators
• Error-Correcting Codes
Unify the theory of pseudorandomness.
This Tutorial
• Is framed around connections between extractors & other objects. We'll use these to:
  – Gain intuition for the definition.
  – Describe a few applications.
  – Hint at the constructions.
• Many omissions. For further reading:
  – N. Nisan and A. Ta-Shma. Extracting randomness: a survey and new constructions. Journal of Computer & System Sciences, 58(1):148–173, 1999.
– R. Shaltiel. Recent developments in explicit constructions of extractors. Bulletin of EATCS, 77:67-95, June 2002.
– S. Vadhan. Course Notes for CS225: Pseudorandomness. http://eecs.harvard.edu/~salil
Outline
I. Motivation
II. Extractors as extractors
III. Extractors as hash functions
IV. Extractors as expander graphs
V. Extractors as pseudorandom generators
VI. Extractors as error-correcting codes
VII. Concluding remarks & open problems
Weak Random Sources
• What is a source of biased & correlated bits?
  – Probability distribution X on {0,1}^n.
  – Must contain some "randomness".
  – Want: no independence assumptions ⇒ must work from one sample.
• Measure of "randomness"
  – Shannon entropy: H(X) = Σ_x Pr[X=x]·log(1/Pr[X=x])
    No good: e.g. X = 0^n w.p. ½ and uniform w.p. ½ has H(X) ≈ n/2, yet any Ext(X) lands on a single fixed value w.p. ≥ ½, so more than ~1 almost-uniform bit cannot be extracted.
  – Better [Zuckerman `90]: min-entropy
Min-entropy

• Def: X is a k-source if H∞(X) ≥ k,
  i.e. Pr[X=x] ≤ 2^{-k} for all x
• Examples:
  – Unpredictable Source [SV84]: ∀ i ∈ [n] and b_1, ..., b_{i-1} ∈ {0,1}:
    δ ≤ Pr[X_i = 1 | X_1 = b_1, ..., X_{i-1} = b_{i-1}] ≤ 1 − δ
  – Bit-fixing [CGH+85,BL85,LLS87,CW89]: some k coordinates of X uniform, rest fixed (or even depending arbitrarily on the others).
  – Flat k-source: uniform over S ⊆ {0,1}^n, |S| = 2^k
• Fact [CG85]: every k-source is a convex combination of flat k-sources.
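A minimal sketch (mine, not from the talk) contrasting the two measures on the bad example above: the ½-mixture of 0^n and uniform has Shannon entropy ≈ n/2 but min-entropy ≈ 1, i.e. it is essentially only a 1-source.

```python
from math import log2

def shannon_entropy(dist):
    """dist: dict mapping outcome -> probability."""
    return sum(p * log2(1 / p) for p in dist.values() if p > 0)

def min_entropy(dist):
    """H_infinity(X) = log(1 / max_x Pr[X = x])."""
    return log2(1 / max(dist.values()))

n = 10
dist = {0: 0.5}                                # the outcome 0^n gets mass 1/2 ...
for x in range(2 ** n):
    dist[x] = dist.get(x, 0) + 0.5 / 2 ** n    # ... plus a uniform component

print(shannon_entropy(dist))  # ~ 5.99 (about n/2 + 1): "lots of randomness"
print(min_entropy(dist))      # ~ 0.999: but only a ~1-source
```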
Extractors: 1st attempt
• A function Ext : {0,1}^n → {0,1}^m s.t.
  ∀ k-source X, Ext(X) is "close" to uniform.
• Impossible! ∃ a set of 2^{n-1} inputs x on which the first bit of Ext(x) is constant ⇒ a flat (n−1)-source X that is bad for Ext.

[Figure: EXT maps a k-source of length n to m almost-uniform bits]
Extractors [Nisan & Zuckerman `93]
• Def: A (k,ε)-extractor is Ext : {0,1}^n × {0,1}^d → {0,1}^m
  s.t. ∀ k-source X, Ext(X,U_d) is ε-close to U_m.
• Key point: the d-bit random seed can be much shorter than the output.
• Goals: minimize seed length, maximize output length.

[Figure: EXT maps a k-source of length n, plus a d-bit seed, to m almost-uniform bits]
Definitional Details
• U_t = uniform distribution on {0,1}^t
• Measure of closeness: statistical difference (a.k.a. variation distance)
  Δ(X,Y) = max_T | Pr[X ∈ T] − Pr[Y ∈ T] |
  – T = "statistical test" or "distinguisher"
  – a metric, takes values in [0,1], very well-behaved
• Def: X, Y are ε-close if Δ(X,Y) ≤ ε.
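For concreteness, a small helper (an illustration, not part of the talk) computing the statistical difference via its equivalent form Δ(X,Y) = ½·Σ_a |Pr[X=a] − Pr[Y=a]|:

```python
def statistical_difference(dist_x, dist_y):
    """Variation distance between two distributions given as outcome->prob
    dicts; equals the advantage of the best statistical test T."""
    support = set(dist_x) | set(dist_y)
    return 0.5 * sum(abs(dist_x.get(a, 0) - dist_y.get(a, 0)) for a in support)

uniform2 = {0: 0.25, 1: 0.25, 2: 0.25, 3: 0.25}
biased2  = {0: 0.28, 1: 0.25, 2: 0.25, 3: 0.22}
print(statistical_difference(uniform2, biased2))  # 0.03 -> these are 0.03-close
```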
The Parameters

• The min-entropy k:
  – High min-entropy: k = n − a, a = o(n)
  – Constant entropy rate: k = Ω(n)
  – Middle (hardest) range: k = n^γ, 0 < γ < 1
  – Low min-entropy: k = n^{o(1)}
• The error ε:
  – In this talk: ε = .01 (for simplicity)
  – Very small ε sometimes important.
• The output length m:
  – Certainly m ≤ k + d.
  – Can this be achieved?
The Optimal Extractor

Thm [Sip88,RT97]: For every k ≤ n, ∃ a (k,ε)-extractor w/
  – Seed length d = log(n−k) + O(1)
  – Output length m = k + d − O(1)
  – "extract almost all the min-entropy w/ logarithmic seed"
• Pf sketch: Probabilistic Method.
  Show that for a random Ext, Pr[Ext not a (k,ε)-extractor] < 1.
  – Use capital letters: N = 2^n, M = 2^m, K = 2^k, D = 2^d.
  – For a fixed flat k-source X and a fixed T ⊆ {0,1}^m:
    Pr_Ext[ |Pr[Ext(X,U_d) ∈ T] − |T|/M| > ε ] ≤ 2^{−Ω(ε²KD)}   (Chernoff)
  – # choices of X and T = C(N,K)·2^M
  – Union bound ⇒ d = log(n−k) + O(1) suffices (≈ log n, except for high min-entropy).
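The union-bound arithmetic behind the sketch, spelled out (a standard calculation; constants are suppressed, and the O(1) terms above hide the 2·log(1/ε) dependence on the error):

```latex
% Union bound over flat k-sources X and tests T:
%   (#sources) * (#tests) * Pr[failure] < 1
\[
  \binom{N}{K}\, 2^{M}\, 2^{-\Omega(\varepsilon^{2}KD)}
  \;\le\; 2^{\,K(n-k+O(1)) \,+\, M \,-\, \Omega(\varepsilon^{2}KD)} \;<\; 1,
\]
% using binom(N,K) <= (eN/K)^K = 2^{K(n-k+log e)}.  Choosing
% m = k + d - 2 log(1/eps) - O(1) makes M a small fraction of eps^2 KD,
% so the exponent is negative once
\[
  D = \Omega\!\Bigl(\tfrac{n-k}{\varepsilon^{2}}\Bigr),
  \qquad\text{i.e.}\qquad
  d = \log(n-k) + 2\log(1/\varepsilon) + O(1).
\]
```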
The Optimal Extractor
• Thm [NZ93,RT97]: the above bounds are tight up to additive constants.
• For applications, need explicit extractors:
  – Ext(x,y) computable in time poly(n).
  – A random extractor requires space ≥ 2^n to even store!
• Long line of research has sought to approach above bounds with explicit constructions.
Application: BPP w/ a Weak Source [Zuckerman `90,`91]

[Figure: one sample from a k-source, together with each d-bit seed, is fed through EXT; the m almost-uniform bits drive a randomized algorithm on input x, whose accept/reject answers are combined by majority]

• Run the algorithm using all 2^d seeds & output the majority answer.
• Only polynomial slowdown, provided d = O(log n) and Ext is explicit.
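A toy sketch of the simulation (names are placeholders; assumes `algo` is a randomized decision algorithm using m random bits with error well below ½, and `ext` is an explicit (k, .01)-extractor):

```python
def bpp_with_weak_source(algo, inp, ext, weak_sample, d):
    """Simulate algo on inp given ONE sample from a weak k-source:
    enumerate all 2^d seeds and take the majority vote."""
    votes = 0
    for seed in range(2 ** d):
        r = ext(weak_sample, seed)        # m almost-uniform bits
        votes += 1 if algo(inp, r) else -1
    return votes > 0                      # poly slowdown when d = O(log n)
```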
Strong extractors
• Output looks random even after seeing the seed.
• Def: Ext is a (k,ε) strong extractor if Ext′(x,y) = y ∘ Ext(x,y) is a (k,ε) extractor.
• i.e. ∀ k-source X: for a 1−ε′ fraction of y ∈ {0,1}^d,
  Ext(X,y) is ε′-close to U_m
• Optimal: d = log(n−k) + O(1), m = k − O(1)
• Can obtain strongness explicitly at little cost [RSW00].
Extractors as Hash Functions
[Figure: for each seed y, a hash function h_y : {0,1}^n → {0,1}^m, applied to a flat k-source, i.e. a set of size 2^k ≫ 2^m]

• For most y, h_y maps sets of size 2^k almost uniformly onto the range.
Extractors from Hash Functions
• Leftover Hash Lemma [ILL89]: universal (i.e. pairwise independent) hash functions yield strong extractors
  – output length: m = k − O(1)
  – seed length: d = O(n)
  – example: Ext(x,(a,b)) = first m bits of a·x + b over GF(2^n)
• Almost pairwise independence [SZ94,GW94]:
  – seed length: d = O(log n + k)
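A runnable sketch of the example extractor (my instantiation, not from the talk): the pairwise-independent hash h_{a,b}(x) = a·x + b over GF(2^n), truncated to m bits, here with n = 8 and the irreducible AES polynomial x^8+x^4+x^3+x+1. The seed (a,b) is 2n bits, matching d = O(n).

```python
N_BITS, MODULUS = 8, 0x11B        # GF(2^8) with the AES field polynomial

def gf_mul(a, b):
    """Carry-less ("XOR") multiplication of a and b, reduced mod MODULUS."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> N_BITS:           # degree reached n: reduce
            a ^= MODULUS
    return result

def lhl_extract(x, a, b, m):
    """Ext(x, (a,b)) = first m bits of a*x + b in GF(2^n)."""
    return (gf_mul(a, x) ^ b) >> (N_BITS - m)

print(lhl_extract(0xB7, a=0x53, b=0x2A, m=4))  # a 4-bit almost-uniform output
```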
Composing Extractors
• We have some nontrivial basic extractors.
• Idea: compose them to get better extractors
• Original approach of [NZ93] & still in use.
Increasing the Output [WZ93]
• Intuition: if m1 ≪ k, the source still has a lot of randomness left.

[Figure: EXT1, with a d1-bit seed, extracts an m1-bit output from the k-source; EXT2, with a fresh d2-bit seed, extracts a further m2-bit output from the same source]
Increasing the Output Length [WZ93]
• Prop: If Ext1 is a (k,ε)-extractor & Ext2 is a (k − m1 − O(1), ε)-extractor, then Ext (their concatenation, as in the figure) is a (k,3ε)-extractor.
• Key lemma: (X,Z) (correlated) random vars, X a k-source and |Z| = s ⇒
  w.p. ≥ 1−ε over z ← Z, X|_{Z=z} is a (k − s − log(1/ε))-source.
• Compare w/ Shannon entropy: H(X|Z) ≥ H(X) − s.
Proof of Key Lemma
Key lemma: (X,Z) (correlated) random vars, X a k-source and |Z| = s ⇒ w.p. ≥ 1−ε over z ← Z, X|_{Z=z} is a (k − s − log(1/ε))-source.

Proof: Let BAD = { z : Pr[Z=z] ≤ ε·2^{-s} }. Then
• Pr[Z ∈ BAD] ≤ 2^s · ε·2^{-s} = ε, and
• for every z ∉ BAD and every x:
  Pr[X=x | Z=z] ≤ Pr[X=x] / Pr[Z=z] ≤ 2^{-k} / (ε·2^{-s}) = 2^{-(k − s − log(1/ε))}.
Pf of Prop:
• Z1 is ε-close to uniform (because Ext1 is an extractor)
• w.p. ≥ 1−ε over z ← Z1:
  – X|_{Z1=z} is a (k − m1 − O(1))-source (by Key Lemma)
  – Z2|_{Z1=z} is ε-close to uniform (because Ext2 is an extractor)
• ⇒ (Z1, Z2) is 3ε-close to uniform.
Increasing the Output [WZ93]

[Figure: as above, with the m1-bit output labeled Z1 (seed d1) and the m2-bit output labeled Z2 (seed d2), both extracted from the k-source X]
An Application [NZ93]: Pseudorandom Bits vs. Small Space

• Start w/ a source of truly random bits (length n), streamed past an observer with a small space bound s.
• Conditioned on the observer's state, we still have a (k − s − O(1))-source w.h.p. (by Key Lemma).
• ⇒ Extracting from it with a short seed gives output that looks uniform to the observer.

[Figure: a length-n stream of bits passes a space-s observer; EXT with a short seed is applied to the remaining randomness]

• Applications:
  – derandomizing logspace algorithms [NZ93]
  – cryptography in the bounded-storage model [Lu02]
Shortening the Seed
• Idea: use the output of one extractor as the seed for another.
• Ext2 may have a shorter seed (due to its shorter output).
• Problem: Ext1 is only guaranteed to work when its seed is independent of the source.

[Figure: a d2-bit seed drives EXT2; its output seeds EXT1 on the k-source, giving an m1-bit output]
Block Sources [CG85]
• Def: (X1, X2) is a (k1,k2) block source if
  – X1 is a k1-source
  – ∀ x1: X2|_{X1=x1} is a k2-source
• Q: When does seed recycling work? A: e.g. on a block source — extract Ext1's seed from block X2, then apply Ext1 to block X1.

[Figure: a d2-bit seed drives EXT2 on block X2; its output seeds EXT1 on block X1, giving the m1-bit output]
The [NZ93] Paradigm
An approach to constructing extractors:
1. Given a general source X.
2. Convert it to a block source (X1, X2)
   – can use part of the seed for this
   – may want many blocks (X1, X2, X3, ...)
3. Apply block-source extraction (using known extractors, e.g. almost pairwise independence) — see the sketch below.

• Still useful today – it "improves" extractors, e.g. [RSW00]
• How to do Step 2?
  – get a block by randomly sampling bits from the source...
  – harder as min-entropy gets lower.
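A toy sketch of step 3, block-source extraction (an illustration; `ext1`/`ext2` are placeholders, and it assumes (x1, x2) comes from a (k1,k2) block source with ext2's output length equal to ext1's seed length):

```python
def block_extract(ext1, ext2, x1, x2, seed2):
    """Seed recycling on a block source: since X2 has min-entropy even
    conditioned on X1, ext2's output is almost uniform given x1 and can
    therefore serve as a fresh seed for ext1."""
    seed1 = ext2(x2, seed2)   # short truly random seed -> m2 almost-uniform bits
    return ext1(x1, seed1)    # reuse them as the (longer) seed for ext1
```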
Expander Graphs
• Informally: sparse graphs w/ very strong connectivity.
• Measures of Expansion:
  – Vertex Expansion: every subset S of size ≤ αn has ≥ β|S| neighbors, for constants α > 0, β > 1.
  – Eigenvalues: 2nd largest eigenvalue of the random walk on G is ≤ λ, for a constant λ < 1.
  (equivalent for constant-degree graphs [Tan84,AM85,Alo86])
• Goals:
  – Minimize the degree.
  – Maximize the expansion.
• Random graphs of degree 3 are expanders [Pin73], but explicit constructions of constant-degree expanders are much harder [Mar73,...,LPS86,Mar88].

[Figure: a set S and its neighborhood Neighbors(S)]
Extractors & Expansion [NZ93]

• Connect x ∈ {0,1}^n and y ∈ {0,1}^m if Ext(x,r) = y for some r ∈ {0,1}^d.
• Short seed ⇒ low degree D = 2^d.
• Extraction ⇒ expansion: every set of size K = 2^k has ≥ (1−ε)·M neighbors.

[Figure: bipartite graph between [N] ≅ {0,1}^n and [M] ≅ {0,1}^m; an n-bit k-source on the left (a size-K set) hits ≥ (1−ε)M of the M possible m-bit outputs]
Extractors vs. Expander Graphs
Main Differences:
1. Extractors are unbalanced, bipartite graphs.
2. Different expansion measures (extraction vs. e-value).
   – Extractors ⇒ graphs which "beat the e-value bound" [NZ93,WZ95]
3. Extractors: polylog degree; expanders: constant degree.
4. Extractors: expansion for sets of a fixed size.
   Expanders: expansion for all sets up to some size.
Expansion Measures — Extraction

• Extractors: start w/ min-entropy k, end ε-close to min-entropy m ⇒ measures how much min-entropy increases (or how little is lost).
• Eigenvalue: similar, but for "2-entropy" (w/o the ε-closeness).

[Figure: the bipartite extractor graph again]
Expansion Measures — The Eigenvalue

• Let G = a D-regular, N-vertex graph, and A = transition matrix of the random walk on G = (adjacency matrix)/D.
• Fact: A has 2nd largest eigenvalue (in absolute value) ≤ λ iff for every prob. distribution X:
  ‖AX − U_N‖₂ ≤ λ·‖X − U_N‖₂
• Fact: ‖X − U_N‖₂² = 2^{−H₂(X)} − 1/N,
  where H₂(X) := log( 1 / Pr_{x,x′←X}[x = x′] ) is the [Rényi] "2-entropy"
⇒ the e-value λ measures how much one random step increases the 2-entropy.
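A quick numerical illustration (my toy example; assumes numpy and Python ≥ 3.8): compute λ for the classic 3-regular expander on Z_p with edges x → x±1 and x → x^{-1}:

```python
import numpy as np

p = 101
A = np.zeros((p, p))
for x in range(p):
    inv = pow(x, -1, p) if x else 0            # convention: 0 maps to 0
    for y in ((x + 1) % p, (x - 1) % p, inv):
        A[x, y] += 1 / 3                       # random-walk transition matrix

eigs = np.sort(np.abs(np.linalg.eigvals(A)))
print(eigs[-1], eigs[-2])   # top eigenvalue 1.0; second-largest = lambda < 1
```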
The Degree
• Constant-degree expanders viewed as "difficult".
• Extractors typically nonconstant degree, "elementary":
  – Optimal: d ≈ log(n−k) truly random bits.
  – Typically: k = δn or k = n^γ ⇒ d = Θ(log n).
  – Lower min-entropies viewed as hardest.
• Contradictory views?
  – Easiest extractors: highest min-entropy k = n − O(1) ⇒ d = O(1) ⇒ constant degree!
  – Resolved in [RVW01]: high min-entropy extractors & constant-degree expanders from the same, simple "zig-zag product" construction.
High Min-Entropy Extractors [GW94]
• Observe: if we break a length-n (n−a)-source into two blocks of lengths n1, n2, we get (close to) an (n1 − a, n2 − a − O(1)) block source! (by Key Lemma)
• Do block-source extraction!

[Figure: block-source extraction as before — EXT2 w/ seed d2 on the second block produces a seed for EXT1 on the first block, yielding m1 bits]
Zig-Zag Product [RVW00]
• Problem: lose a bits of min-entropy at each extraction.
• Solution [RR99]:
  – Collect "buffers" which retain the unextracted min-entropy.
  – Extract from the buffers at the end (EXT3, w/ seed d3).
⇒ the zig-zag product.

[Figure: block-source extraction on a length-n, min-entropy n−a source, where each step also sets aside an a-bit buffer; EXT3 extracts from the buffers at the end]
Randomness Conductors [CRVW02]
• Six parameters: n, m, d, k, ε, Δ.
• For every k′ ≤ k and every input k′-source, the output is ε-close to a (k′+Δ)-source.
• m = k + Δ: an extractor, w/ guarantees also for smaller sets.
• Δ = d: "lossless expander" [TUZ01]
  – Equivalent to graphs w/ vertex expansion (1−ε)·degree!
  – Explicitly: very unbalanced case w/ polylog degree [TUZ01], nearly balanced case w/ const. degree [CRVW02]

[Figure: CON maps an n-bit input and a d-bit seed to an m-bit output]
Pseudorandom Generators [BM82,Y82,NW88]

• Generate many bits that "look random" from a short random seed.

[Figure: PRG stretches a d-bit seed into m bits indistinguishable from uniform]

• Distributions X, Y are computationally indistinguishable if for every efficient T (circuit of size m):
  | Pr[T(X)=1] − Pr[T(Y)=1] | ≤ ε
Hardness vs. Randomness
• Any function of high circuit complexity ⇒ PRGs [BM82,Y82,NW88,BFNW93,IW97,...]
• Thm [IW97]: If E = DTIME(2^{O(ℓ)}) requires circuits of size 2^{Ω(ℓ)}, then P = BPP.
• Current state-of-the-art [SU01,Uma02]:
  f : {0,1}^ℓ → {0,1} w/ circuit complexity k
  ⇒ PRG_f : {0,1}^{O(ℓ)} → {0,1}^m w/ m = k^{Ω(1)}
Extractors & PRGs
Thm [Trevisan `99]: Any “sufficiently general” construction of PRGs from hard functions is also an extractor.
[Figure, shown in stages: PRG_f maps a d-bit seed to m bits comp. indisting. from U_m; EXT maps n bits w/ min-entropy k, plus a d-bit seed, to m bits statistically close to U_m]

• Step I: View extractors as "randomness multipliers" [NZ93,SZ94].
• Step II: View the hard function as an input to the PRG:
  f : {0,1}^{log n} → {0,1} w/ circuit complexity k,
  identified w/ its n-bit truth table.

[Figure: PRG takes the truth table of f plus a d-bit seed and outputs bits comp. indisting. from U_m]
Analysis (intuition)

1. f drawn from a dist. of min-entropy k ⇒ w.h.p. f has circuit complexity ≈ k (even "description size" ≥ k − O(1)).
2. Statistical closeness ⇒ computational indistinguishability "relative to any distinguisher".
3. And (1) holds "relative to any distinguisher".

[Figure: the EXT (min-entropy in, statistically close out) and PRG (circuit complexity in, comp. indisting. out) diagrams side by side]
Analysis ("formal") — the "reconstruction paradigm"

1. Fix a statistical test T ⊆ {0,1}^m.
2. Suppose T distinguishes PRG_f(U_d) from U_m.
   The PRG's correctness proof ⇒ ∃ circuit C of size k₀ s.t. C^T ≡ f.
3. But w.h.p., f conditioned on C is a (k − k₀ − O(1))-source (by Key Lemma), so f is still undetermined when k₀ ≪ k. Contradiction (⇒⇐).
When does this work?
• When the PRG has a "black-box" proof: for any function f and any statistical test T,
  T distinguishes PRG_f(U_d) from uniform ⇒ ∃ a small circuit C s.t. C^T computes f;
  i.e. if the PRG construction "relativizes" [Mil99,KvM99].
• Almost all PRG constructions are of this form.
• Partial converse: if EXT is an explicit extractor and f has high description (Kolmogorov) complexity relative to T, then EXT(f, ·) is pseudorandom for T.
Consequences
• Simple (and very good) extractor based on NW PRG [Tre99] (w/ subsequent improvements [RRV99,ISW00,TUZ01])
• More importantly: new ways of thinking about both objects.
• Benefits for extractors:
  – Reconstruction paradigm: given a test T distinguishing Ext(x,U_d) from U_m, "reconstruct" x w/ a short advice string (used in [TZS01,SU01]).
  – New techniques from the PRG literature.
• Benefits for PRGs:
  – Best known PRG construction [Uma02], finer notion of optimality.
  – Distinguishes information-theoretic vs. efficiency issues.
  – To go beyond extractor limitations, must use special properties of the hard function or distinguisher (as in [IW98,Kab00,IKW01,TV02]).
Error-Correcting Codes
• Classically: large pairwise Hamming distance.
• List decoding: every Hamming ball of relative radius ½−ε in {0,1}^D has at most K codewords.
• Many PRG [GL89,BFNW93,STV99,MV99] and extractor [Tre99,...,RSW00] constructions use codes.
• [Ta-Shma & Zuckerman `01]: extractors are a generalization of list-decodable codes.

[Figure: ECC maps an n-bit message x to a D-bit codeword ECC(x); a d-bit y indexes a position]
Strong 1-bit Ext’s List-Decodable Codes
EXT
n-bit x
ECC
n-bit x
D=2d bits -closeto Ud+1
The Correspondence:ECC(x)y = EXT(x,y)
Claim: EXT (k,) extractor ECC has < 2k codewords in any (½-)-ball
Pf: Suppose 2k codewords within distance (½-) of z2{0,1}D.
• Feed extractor uniform dist. on corresponding msgs.
• Consider statistical test T={ y± : y2{0,1}d, =zy }
–Pr[extractor outputT] > ½+
–Pr[uniform distributionT] = ½.
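A concrete instance of the correspondence (my example, not spelled out in the talk): the Hadamard code, whose matching 1-bit extractor is the inner product EXT(x,y) = ⟨x,y⟩ mod 2 (the Goldreich–Levin / pairwise-independent hash):

```python
def ext_inner_product(x, y):
    """EXT(x, y) = <x, y> over GF(2)."""
    return bin(x & y).count("1") % 2

def hadamard_codeword(x, n):
    """ECC(x): the table of all D = 2^n values ECC(x)_y = EXT(x, y)."""
    return [ext_inner_product(x, y) for y in range(2 ** n)]

print(hadamard_codeword(0b101, 3))  # [0, 1, 0, 1, 1, 0, 1, 0]
```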
Strong 1-bit Ext’s List-Decodable Codes
d-bit y EXT
n-bit x
ECC
n-bit x
D=2d bits -closeto Ud+1
ECC(x)y = EXT(x,y)
Claim: ECC has < 2k codewords in any (½-)-ball EXT (k,2) extractor
Pf: Suppose on k-source X, output 2-far from Ud+1.
P : {0,1}d {0,1} s.t. PrX,Y[P(Y)=EXT(X,Y)] > ½+2.
EX[dist(P,ECC(X))] < ½-2.
But only 2k codewords in (½-)-ball around P.
Strong 1-bit Ext’s List-Decodable Codes
d-bit y EXT
n-bit x
ECC
n-bit x
D=2d bits -closeto Ud+1
ECC(x)y = EXT(x,y)
Extractors & Codes
• Many-bit extractors ↔ list-decodable codes over large alphabets (size 2^m) [TZ01]
• "Reconstruction proof" in the PRG view ↔ decoding algorithm in the code view
• Trevisan's extractor has an efficient decoding alg. [TZ01].
  – several applications (data structures for set storage [Ta02], ...)
• Idea [Ta-Shma, Zuckerman, & Safra `01]: exploit codes more directly in extractor constructions.
Extractors from Codes
• Existing codes give extractors w/ short output.
• Q: How to get many bits?
• Use the seed to select m positions in an encoding of the source.
• Independent positions: works, but the seed is too long (see the sketch below).
• Dependent positions? – [Tre99] gives one way.

[Figure: ECC expands the n-bit source x into ECC(x); the seed y picks out m positions, which form the m-bit output]
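The naive independent-positions variant, as a sketch (illustration only): the seed is just a list of m independent indices into ECC(x), costing m·log D seed bits — far too many:

```python
def ext_sample_positions(codeword, seed_indices):
    """Output the m codeword symbols selected by the seed."""
    return [codeword[i] for i in seed_indices]

cw = [0, 1, 0, 1, 1, 0, 1, 0]               # e.g. the Hadamard codeword of 0b101
print(ext_sample_positions(cw, [6, 1, 4]))  # [1, 1, 1]
```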
Dependent Projections

Naive: consecutive positions.

Analysis attempt (reconstruction):
• Goal: given T ⊆ {0,1}^m which distinguishes Ext(x,U_d) from U_m, reconstruct x with few (i.e. ≪ k) bits of advice.
• From T, get a next-bit predictor P : {0,1}^i → {0,1} s.t.
  Pr[ P(preceding i symbols of ECC(x)) = ECC(x)_i ] > ½ + ε/m   [Yao82]

[Figure: the seed y places a window of m consecutive positions in ECC(x); P predicts position i from the preceding ones]
The Reconstruction
• Easy case: P always correct.
  – Advice: the first i ≤ m consecutive positions of ECC(x).
  – Repeatedly applying P, reconstruct all of ECC(x), & hence x (toy version below).
• General case: P is correct on only a ½+ε fraction of positions.

[Figure: P slides along ECC(x), predicting each position from the window before it]
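A toy rendering of the easy case (illustration; the "predictor" here is a stand-in that happens to be always correct):

```python
def reconstruct(predictor, advice, total_len):
    """Regenerate a whole word from a short advice prefix plus a perfect
    next-symbol predictor -- the 'easy case' of the reconstruction."""
    word = list(advice)
    while len(word) < total_len:
        word.append(predictor(word))
    return word

next_bit = lambda w: w[-1] ^ w[-2]       # toy rule: XOR of the previous two bits
print(reconstruct(next_bit, [1, 0], 8))  # [1, 0, 1, 1, 0, 1, 1, 0]
```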
Dealing with Errors

Q: How to deal with errors?
• Use the error-correcting properties of the code:
  – Suffices to reconstruct "most" positions.
  – ½−ε errors requires list-decoding ⇒ additional advice.
  – But one incorrect prediction can ruin everything!
• Idea: error-correct after each prediction step.
  – Need "consecutive" to be compatible w/ decoding, to reuse advice.
• Reed–Muller code: ECC(x) = evaluations of a low-degree polynomial F^m → F
  – [Ta-Shma, Zuckerman, & Safra `01]: consecutive = along a line.
  – [Shaltiel & Umans `01]: consecutive = according to a linear map which "generates" F^m \ {0}
  – [Umans `02]: PRG from this.
Towards Optimality
• Recall: for every k ≤ n, ∃ a (k,ε)-extractor w/ seed d = log(n−k) + O(1) & output m = k + d − O(1).
• Thm [...,NZ93,WZ93,GW94,SZ94,SSZ95,Zuc96,Ta96,Ta98,Tre99,RRV99,ISW00,RSW00,RVW00,TUZ01,TZS01,SU01,LRVW02]:
  For every k ≤ n, ∃ an EXPLICIT (k,ε)-extractor w/
  – Seed d = O(log(n−k)) & Output m = .99k, or
  – Seed d = O(log²(n−k)) & Output m = k + d − O(1)
• Not there yet! Optimize up to additive constants:
  – In many apps, efficiency ≈ D = 2^d.
  – Often the entropy loss k + d − m is what matters, rather than m itself.
  – Dependence on the error ε.
Conclusions
• The many guises of randomness extractors
  – extractors, hash fns, expanders, samplers, pseudorandom generators, error-correcting codes
  – translating ideas between views is very powerful!
  – increases the impact of work on each object
• The study of extractors
  – many applications
  – many constructions: "information theoretic" vs. "reconstruction paradigm"
  – optimality important
Some Research Directions
• Exploit connections further.
• Optimality up to additive constants.
• Single, self-contained construction for all ranges of parameters. ([SU01] comes closest.)
• Study randomness conductors.
• When can we have extractors with no seed?
  – important e.g. for cryptography w/ imperfect random sources.
  – sources with "independence" conditions [vN51,Eli72,Blu84,SV84,Vaz85,CG85,CGH+85,BBR85,BL85,LLS87,CDH+00]
  – for "efficient" sources [TV02]
Further Reading
• N. Nisan and A. Ta-Shma. Extracting randomness: a survey and new constructions. Journal of Computer & System Sciences, 58 (1):148-173, 1999.
• R. Shaltiel. Recent developments in explicit constructions of extractors. Bulletin of EATCS, 77:67-95, June 2002.
• S. Vadhan. Course Notes for CS225: Pseudorandomness. http://eecs.harvard.edu/~salil
• many papers...