27
1 Numerical Integration Monte Carlo style Numerical Integration Monte Carlo style Mark Huber Dept. of Mathematics and Inst. of Statistics and Decision Sciences Duke University [email protected] www.math.duke.edu/~mhuber

Numerical Integration Monte Carlo style · The power of Monte Carlo Monte Carlo methods are the only known way to handle high dimensional numerical integration Many interesting questions

  • Upload
    others

  • View
    17

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Numerical Integration Monte Carlo style · The power of Monte Carlo Monte Carlo methods are the only known way to handle high dimensional numerical integration Many interesting questions

1

Numerical Integration Monte Carlo styleNumerical Integration Monte Carlo style

Mark HuberDept. of Mathematics and Inst. of Statistics and Decision Sciences

Duke [email protected]

www.math.duke.edu/~mhuber

Page 2: Numerical Integration Monte Carlo style · The power of Monte Carlo Monte Carlo methods are the only known way to handle high dimensional numerical integration Many interesting questions

2

Integration is hard

Nature laughs at the difficulties of integration.

Pierre-Simon de Laplace

Page 3: Numerical Integration Monte Carlo style · The power of Monte Carlo Monte Carlo methods are the only known way to handle high dimensional numerical integration Many interesting questions

3

Darwin

Darwin visited the Galapagos in 1835

Page 4: Numerical Integration Monte Carlo style · The power of Monte Carlo Monte Carlo methods are the only known way to handle high dimensional numerical integration Many interesting questions

4

Finches

Darwin noted 14 species of finches

(these 11 photographedby Dr. Robert Rothman)

Page 5: Numerical Integration Monte Carlo style · The power of Monte Carlo Monte Carlo methods are the only known way to handle high dimensional numerical integration Many interesting questions

5

Darwin's Finches

Not all finches on all islands!

A B C D E ... Sumslarge ground 0 0 1 1 1 14medium ground 1 1 1 1 1 13small ground 1 1 1 1 1 14sharp-beaked 0 0 1 1 1 10...sums 4 4 11 10 8

14 types of finches, 17 islands

Page 6: Numerical Integration Monte Carlo style · The power of Monte Carlo Monte Carlo methods are the only known way to handle high dimensional numerical integration Many interesting questions

6

The Question

Is this data random? Or is it evidence of evolution?

To answer deterministically, sum over all tables with same row and column sums

2.2×1016 tables!

Is this data random? Or is it evidence of evolution?

To answer deterministically, sum over all tables with same row and column sums

Page 7: Numerical Integration Monte Carlo style · The power of Monte Carlo Monte Carlo methods are the only known way to handle high dimensional numerical integration Many interesting questions

7

The Oldest Problem

B

What is the area of ?B

Page 8: Numerical Integration Monte Carlo style · The power of Monte Carlo Monte Carlo methods are the only known way to handle high dimensional numerical integration Many interesting questions

8

Counting versus Integration

How many integer points in ?

B

B

Page 9: Numerical Integration Monte Carlo style · The power of Monte Carlo Monte Carlo methods are the only known way to handle high dimensional numerical integration Many interesting questions

9

Why is this hard?

These problems have very high dimension

ExamplesStatistical problems

dimension is number of data pointsNetwork (graph) problems

dimension is number of nodesPhysics problems

dim. is number of interacting entities

Page 10: Numerical Integration Monte Carlo style · The power of Monte Carlo Monte Carlo methods are the only known way to handle high dimensional numerical integration Many interesting questions

10

Curse of Dimensionality

Deterministic methods existDirectly count the integer pointsRunning time grows exponential with dim.Trapezoidal Rule, Simpson's Rule, etcetera

Effectively reduce dimension by 1

#P hardCounting the proper colorings of a graphCounting Hamiltonian cycles in a graph

Page 11: Numerical Integration Monte Carlo style · The power of Monte Carlo Monte Carlo methods are the only known way to handle high dimensional numerical integration Many interesting questions

11

Basic Monte Carlo

Acceptance/Rejection1) Generate samples from bounding region2) Find percentage lie in

3) Multiply by area of bounding region

B

B

Page 12: Numerical Integration Monte Carlo style · The power of Monte Carlo Monte Carlo methods are the only known way to handle high dimensional numerical integration Many interesting questions

12

Why rarely used

The ProblemNeed “tight” bounding boxOtherwise need lots of samples for good estimateDifficult to get in high dimensions

Research Area #1Find good bounding boxes for actual high

dimensional problems of interest.

Page 13: Numerical Integration Monte Carlo style · The power of Monte Carlo Monte Carlo methods are the only known way to handle high dimensional numerical integration Many interesting questions

13

Better Idea

Many times, problem reducibleJerrum, Valiant, Vazirani, 1986Example: convex regions

Page 14: Numerical Integration Monte Carlo style · The power of Monte Carlo Monte Carlo methods are the only known way to handle high dimensional numerical integration Many interesting questions

14

Estimating volume

AB∖ A

vol B=vol A×vol Bvol A

28% 72%

Estimate vol B/vol A

Page 15: Numerical Integration Monte Carlo style · The power of Monte Carlo Monte Carlo methods are the only known way to handle high dimensional numerical integration Many interesting questions

15

Suppose convex and fairly nice

r

R

= rRlarge

(even with this help, can't come within factor of 2 efficiently with deterministic methods [Elekes 86])

Page 16: Numerical Integration Monte Carlo style · The power of Monte Carlo Monte Carlo methods are the only known way to handle high dimensional numerical integration Many interesting questions

16

Center box

Inside inner ball, box half edge length

a

a=r /dim

Page 17: Numerical Integration Monte Carlo style · The power of Monte Carlo Monte Carlo methods are the only known way to handle high dimensional numerical integration Many interesting questions

17

Slicin' and Dicin'

a

Slice off region to right of boxGenerate lots of random samplesEstimate percent of area in sliced region

Page 18: Numerical Integration Monte Carlo style · The power of Monte Carlo Monte Carlo methods are the only known way to handle high dimensional numerical integration Many interesting questions

18

Case I

a

If region with box at least 50%use as reduced problem

If region with box at least 50%use as reduced problem

62%

Page 19: Numerical Integration Monte Carlo style · The power of Monte Carlo Monte Carlo methods are the only known way to handle high dimensional numerical integration Many interesting questions

19

Case II

a

Elsefind median, use that instead

50%

Page 20: Numerical Integration Monte Carlo style · The power of Monte Carlo Monte Carlo methods are the only known way to handle high dimensional numerical integration Many interesting questions

20

Approximating the median

1) grab samples from body2) project onto one dimension3) take median of projections

Page 21: Numerical Integration Monte Carlo style · The power of Monte Carlo Monte Carlo methods are the only known way to handle high dimensional numerical integration Many interesting questions

21

Either way...

a

Either1) Match one facet of box or2) Volume of body reduced by 1/2

50%

Page 22: Numerical Integration Monte Carlo style · The power of Monte Carlo Monte Carlo methods are the only known way to handle high dimensional numerical integration Many interesting questions

22

How many steps?

Note

2Rdim≥vol original B

For center box

vol center box =2adim≥[2R/dim ]dim

Volume of body after many steps

2Rdim1/2n≥vol B after n steps

So most steps that can be taken

M :=2dd log d // log 2

Page 23: Numerical Integration Monte Carlo style · The power of Monte Carlo Monte Carlo methods are the only known way to handle high dimensional numerical integration Many interesting questions

23

How many samples?

To get median need [Cohen 97][Huber 98]

O log 1//2 samples

To get within of answer with probability 1−

Overall, if steps taken needM '=/M

O M 3 log M / total samples

O dim3 log2dim / total samples

Polynomial in the dimension!

Page 24: Numerical Integration Monte Carlo style · The power of Monte Carlo Monte Carlo methods are the only known way to handle high dimensional numerical integration Many interesting questions

24

To get samplesMost used method: Markov chains

Pick a direction uniformly at randomMove to a uniform point staying inside body

O dim7 time [Kannen, et. al. 94]

Page 25: Numerical Integration Monte Carlo style · The power of Monte Carlo Monte Carlo methods are the only known way to handle high dimensional numerical integration Many interesting questions

25

Some questions

Can bound for Markov chain be improved?

OriginallyO dim27steps

Can perfect sampling methods be used for this problem?

Research Area #2

Research Area #3

Page 26: Numerical Integration Monte Carlo style · The power of Monte Carlo Monte Carlo methods are the only known way to handle high dimensional numerical integration Many interesting questions

26

Currently working on

Some of my current research questions:Data from unknown mixtures of distributions(ex: responders versus nonresponders to drugs)Perfect matchings in a graph(ex: astronomical data is doubly truncated)Multinormal distribution on positive orthantContingency tables with extra constraints(ex: perhaps columns represent age)The many worlds version of the Ising modelSelf organizing lists(because who has time to organize their own lists?)

Page 27: Numerical Integration Monte Carlo style · The power of Monte Carlo Monte Carlo methods are the only known way to handle high dimensional numerical integration Many interesting questions

27

The power of Monte Carlo

Monte Carlo methods are the only known way to handle high dimensional numerical integration

Many interesting questions remain:Better envelopes for acceptance/rejectionBetter Markov chainsPerfect sampling algorithms instead of MC