
Feature

Using Deterministic Chaos for Superefficient Monte Carlo Simulations


Digital Object Identifier 10.1109/MCAS.2013.2283966

Date of publication: 19 November 2013

Cheng-An Yang, Kung Yao, Ken Umeno, and Ezio Biglieri

Abstract

Monte Carlo (MC) simulation methods are widely used to solve complex engineering and scientific problems. Unlike deterministic methods, MC methods use statistical sampling to produce approximate solutions. As the processed sample size $N$ grows, the uncertainty of the solution is reduced. It is well known that the mean-square approximation error decreases as $1/N$. However, for large problems like high-dimensional integrations and computationally intensive simulations, MC methods may take months or even years to obtain a solution with acceptable tolerance. The Super-Efficient (SE) Monte Carlo simulation method, originated by Umeno, produces a solution whose approximation error decreases as fast as $1/N^2$. However, it only applies to a small class of problems possessing certain properties. We describe an approximate SE Monte Carlo simulation method that is applicable to a wider class of problems than the original SE method, and yields a convergence rate as fast as $1/N^\alpha$ for $1 \le \alpha \le 2$.

I. Introduction

Ulam and von Neumann first formulated the Monte Carlo (MC) simulation methodology as one using random sequences to evaluate high-dimensional integrals [1]. Since then, MC simulations have been used in many applications to evaluate the performance of various systems that are not analytically tractable.

The simplest, yet most important, form of MC simulation is used to approximate the integral

$$I = \int_{\mathcal{X}} A(x)\,dx, \qquad (1)$$

where the integrand $A(x)$ is defined on the domain $\mathcal{X} = [a, b]$ for some real numbers $a < b$. To do this, we first choose a probability density function (pdf) $\rho(x) \neq 0$ in $\mathcal{X}$, and define the function

$$B(x) := \frac{A(x)}{\rho(x)}. \qquad (2)$$




Cheng-An Yang, Kung Yao, and Ezio Biglieri are with the Electrical Engineering Department, UCLA, Los Angeles, California, USA. E-mails: {rhymer123,yao,biglieri}@ee.ucla.edu. Ken Umeno is with the Graduate School of Informatics, Kyoto University, Kyoto, Japan. E-mail: [email protected]. Ezio Biglieri is also with King Saud University, Riyadh, KSA. His work was supported by a research grant from King Saud University.

The integral (1) is approximated by calculating the $N$-sample average

$$\frac{1}{N}\sum_{i=1}^{N} B(X_i) \approx \mathbb{E}[B(X_j)] = I, \qquad j = 1, 2, \ldots, N, \qquad (3)$$

where $N$ is the sample size, the $X_i$'s are independent identically distributed (i.i.d.) random samples whose common pdf is $\rho(x)$, and $\mathbb{E}[\cdot]$ denotes the expectation operator with respect to $\rho(x)$.

By the Strong Law of Large Numbers, the summation (3) converges almost surely to $I$ if the random samples are independent. Furthermore, the variance of the approximation decreases at rate $1/N$. That is,

$$\mathrm{Var}\!\left[\frac{1}{N}\sum_{i=1}^{N} B(X_i)\right] = \frac{1}{N}\,\mathrm{Var}[B(X_j)], \qquad j = 1, 2, \ldots, N. \qquad (4)$$

Note that (4) holds regardless of the dimension of the domain $\mathcal{X}$ of the integrand $A(x)$, which makes Monte Carlo simulation suitable for performing multidimensional integrations.
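To make (3) concrete, here is a minimal Python sketch; the function name and the particular integrand are our own illustrative choices, not from the article. With $\mathcal{X} = [0, 1]$ and $\rho$ the uniform pdf, $B(x) = A(x)$, and the sample mean approximates $I = \int_0^1 4/(1+x^2)\,dx = \pi$:

```python
import random

def mc_integrate(B, sample, N):
    """Plain Monte Carlo estimate of E[B(X)], cf. (3)."""
    return sum(B(sample()) for _ in range(N)) / N

# Illustrative integrand: A(x) = 4/(1 + x^2) on [0, 1]; rho = uniform pdf,
# so B = A/rho = A and the exact integral is pi.
A = lambda x: 4.0 / (1.0 + x * x)
print(mc_integrate(A, random.random, 10**5))  # close to 3.1416; error variance ~ 1/N
```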

Umeno's Super-Efficient Monte Carlo (SEMC) algorithm [6] is a variation of standard MC based on chaotic sequences, and exhibits a superior rate of convergence. Umeno and Yao's approximate SEMC [11] removes some restrictions of the original method to make the concept of superefficiency applicable to more general situations.

In the following sections, we review the pseudo-random number generation used in conventional MC simulation, and describe the concept of chaotic sequences and chaotic MC simulation. The correlation between samples of the chaotic sequence gives rise to the super-efficient convergence rate, which makes chaotic MC simulation super-efficient. We illustrate how to generate chaotic sequences from the practical point of view, and how to apply super-efficient simulation methods to a wide class of integrands using the notion of approximate SEMC. In the last section, we provide some concluding remarks and point to directions for future research.

II. Pseudo-Random Numbers and Chaotic Sequences

A fundamental question in implementing Monte Carlo simulation is how to generate random samples. It turns out that the generation of truly random sequences in a controlled manner is a nontrivial problem. Fortunately, in many applications it suffices to use pseudo-random (PR) sequences [2]. A PR sequence is generated deterministically by some transformation, and it appears to be random from the statistical point of view [4, Chapter 3].

The basic idea of Monte Carlo simulation is to estimate a certain quantity based on random samples. Suppose we want to estimate the blue shaded area inside a unit square, shown in Fig. S1. We do so by generating $N$ uniformly distributed samples in the square, and calculating the ratio

$$A = \frac{\text{Number of samples in the blue area}}{N}. \qquad (S1)$$

As the number of samples $N$ increases, we expect the ratio to converge to the true value $\pi/4 \approx 0.7854$. For example, using random samples, we obtained

    N       A        |A − π/4|
    10^3    0.7790   0.0064
    10^4    0.7885   0.0031
    10^5    0.7843   0.0011
    10^6    0.7857   0.0003
    10^7    0.7855   0.0001

Figure S1. Estimating the area of a quadrant of a circle in a unit square by means of the Monte Carlo method. The dots represent random samples. The shaded area is approximately the ratio of the number of blue dots to the total number of dots.

Example 1. Linear Congruential Generator (LCG). The sequence of linear congruential PR numbers (LCPRN) $x_0, x_1, \ldots$ is produced by the recursion


$$x_{n+1} = (a x_n + c) \bmod m, \qquad (5)$$

where $0 < a < m$ and the seed $x_0$ is taken from $0 < x_0 < m$ [1]. When the parameters $a$, $c$, $m$, and $x_0$ are properly selected, the linear congruential recursion can produce a sequence of period at most $m$.
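As a concrete illustration of (5), here is a small Python sketch of an LCG; the particular parameter values are one classical textbook choice, not taken from the article:

```python
def lcg(seed, a=1664525, c=1013904223, m=2**32):
    """Linear congruential generator per (5), yielding floats in [0, 1)."""
    x = seed % m
    while True:
        x = (a * x + c) % m
        yield x / m

gen = lcg(seed=12345)
print([round(next(gen), 6) for _ in range(5)])
```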

The LCG is one of the oldest and most popular algorithms for generating PR sequences, due to its simplicity and well-understood properties. Although an LCPRN sequence passes many randomness tests, the LCG has some serious defects. Most notably, it exhibits correlation between successive samples. The Mersenne Twister algorithm [3] is a better choice for generating high-quality PR numbers with reduced correlation. For example, Matlab has used the Mersenne Twister algorithm as its default uniform random number generator since version 7.4 in 2007 [5].

Another way of generating PR sequences is through dynamical systems. Formally, the initial state $x_0$ of a dynamical system at time 0 is a point in the domain $\mathcal{X}$, and the evolution of the state is governed by a mapping $T$ such that

$$x_{i+1} = T(x_i), \qquad i = 0, 1, \ldots. \qquad (6)$$

Birkhoff's theorem [7] says that, under some mild technical conditions, the "time average" of the integrand $B(x)$ will converge to the desired integral $I$ in (1). That is,

$$\frac{1}{N}\sum_{i=1}^{N} B(x_i) \to \mathbb{E}[B(X)] \quad \text{pointwise as } N \to \infty, \qquad (7)$$

where the expectation is taken with respect to the pdf $\rho(x)$, referred to as the invariant pdf of the dynamical system.

In MC simulations, a special role is played by "chaotic" sequences. These are generated deterministically from the dynamical system (6), so that orbits associated with identical initial $x_0$ are the same. What makes a system chaotic is the fact that orbits arising from different $x_0$, even if arbitrarily close to each other, grow apart exponentially.

Example 2. The doubling map

$$T_d(x) = \begin{cases} 2x, & \text{if } 0 \le x < 0.5, \\ 2x - 1, & \text{if } 0.5 \le x < 1, \end{cases} \qquad (8)$$

defined on $\mathcal{X} = [0, 1)$, is known to be chaotic. The invariant pdf $\rho(x)$ is the uniform distribution on $\mathcal{X}$, that is, $\rho(x) = 1$ on $0 \le x < 1$ and 0 elsewhere. The doubling map is related to many other chaotic dynamical systems, like the Chebyshev dynamical system in Example 3.

Implementing the doubling map in a digital computer is very simple, because the action of the doubling map $T_d$ on $x$ is simply a left-shift of the binary representation of $x$. However, there is a caveat: if the computer uses $n$ bits to store the binary representation of a number, repeatedly doubling any seed $x_0$ $n$ times will result in a zero state $T_d^n(x_0) = 0$, because all the bits are shifted out. There is a simple workaround to this problem: each time we apply the doubling map on the current state $x$, we append a randomly generated bit in the least significant position. The interpretation of this procedure is that we are doubling an irrational number. Indeed, if we think of the seed $x_0$ as a truncated irrational number with $n$ bits, the randomly generated least significant bit can be seen as the $(n+1)$-th digit of $x_0$.
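The following Python sketch illustrates this workaround; the number of state bits and the helper name are our own illustrative choices:

```python
import random

def doubling_map_orbit(x0, length, n_bits=52):
    """Iterate the doubling map (8) on an n_bits fixed-point state, appending
    one random least-significant bit per step so the orbit does not collapse
    to zero after n_bits doublings."""
    mask = 2**n_bits - 1
    state = int(x0 * 2**n_bits) & mask      # seed as an n-bit integer
    orbit = []
    for _ in range(length):
        state = ((state << 1) & mask) | random.getrandbits(1)  # 2x mod 1, plus new LSB
        orbit.append(state / 2**n_bits)                        # back to [0, 1)
    return orbit

print(doubling_map_orbit(0.314159, 5))
```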

Example 3. The Chebyshev dynamical system of order $p$ is defined on the domain $\mathcal{X} = [-1, 1]$ with the mapping

$$T_p(y) = \cos(p \arccos(y)), \qquad (9)$$

where $p$ is a positive integer. The mapping $T_p(y)$ is in fact the $p$-th order Chebyshev polynomial of the first kind; see Fig. 1. Examples of Chebyshev polynomials are

$$\begin{aligned}
T_1(y) &= y, \\
T_2(y) &= 2y^2 - 1, \\
T_3(y) &= 4y^3 - 3y, \\
T_4(y) &= 8y^4 - 8y^2 + 1, \\
T_5(y) &= 16y^5 - 20y^3 + 5y, \\
&\;\;\vdots
\end{aligned}$$

Mathematical Definition of a Dynamical System

If we think of the PR numbers as the "states" of some dynamical system, the process of generating PR numbers can be seen as applying a map $T$ on the state $x_n$ of the system to produce the next state $x_{n+1}$. Formally, a dynamical system is the quadruple $(\mathcal{X}, \mathcal{A}, \mu, T)$, where $\mathcal{X}$ is the state space, $\mathcal{A}$ is the $\sigma$-algebra on $\mathcal{X}$, $\mu(dx) = \rho(x)\,dx$ is a measure on $\mathcal{A}$ with density $\rho(x)$, and $T$ is a mapping from $\mathcal{X}$ to itself. The sequence $(x_1, x_2, \ldots)$ with seed $x_0 \in \mathcal{X}$, defined by

$$x_{i+1} = T(x_i), \qquad i = 0, 1, 2, \ldots, \qquad (S2)$$

is called the orbit of the dynamical system under $T$.



The Chebyshev dynamical system is chaotic, and its invariant pdf is

$$\rho(y) = \frac{1}{\pi\sqrt{1 - y^2}}. \qquad (10)$$

Note that the variable $y$ of the Chebyshev dynamical system is related to the variable $x$ of the doubling map in Example 2 by

$$y = \cos(2\pi x). \qquad (11)$$

Therefore, we can generate a Chebyshev chaotic sequence $y_0, y_1, \ldots$ by generating a sequence $x_0, x_1, \ldots$ using the doubling map, and applying (11) to each $x_i$ to obtain $y_i$.

III. Chaotic Monte Carlo Simulation

A chaotic MC simulation is an MC simulation with the PR sequence replaced by a chaotic sequence [6]. More specifically, let $T$ be a chaotic mapping, and $\rho(x)$ its invariant pdf. We first draw a seed $x_0$ from the invariant pdf $\rho(x)$, and use the chaotic mapping $T$ to generate the sequence $x_1, x_2, \ldots, x_N$ by $x_{i+1} = T(x_i)$ for $i = 0, 1, \ldots, N-1$, where $N$ is the sample size. Recalling (7), the "time average"

$$\langle B(x_i)\rangle_N := \frac{1}{N}\sum_{i=1}^{N} B(x_i) \qquad (12)$$

will converge to the integral $I$ defined in (1) as $N$ approaches infinity [7] (see Algorithm 1).

A. Statistical and Dynamical Correlation

The greatest distinction between conventional and chaotic MC simulation is that the chaotic sequence has correlation between samples. For conventional MC simulation, good PR number generators produce nearly i.i.d. samples. Correlation between samples is generally considered to be a bad thing, because it may decrease the convergence rate of the simulation. However, if we select the chaotic mapping carefully, the correlation between samples may actually improve the convergence rate for certain integrands. In the following, we show how the correlation can affect the variance of the approximation error.

For simplicity, denote $B(x_k)$ by $B_k$ and $\langle B(x_i)\rangle_N$ by $\langle B\rangle_N$. Define the autocorrelation function

$$R(k) = \mathbb{E}[(B_{k+i} - I)(B_i - I)], \qquad (13)$$

where the expectation is taken with respect to the invariant pdf $\rho(x)$, and $i = 1, 2, \ldots$ is arbitrary because $B(x_k)$ is stationary. The variance of the approximation error $\langle B\rangle_N - I$ is given by

$$\sigma_N^2 := \mathbb{E}\big[(\langle B\rangle_N - I)^2\big] = \frac{1}{N}\mathrm{Var}[B] + \frac{2}{N^2}\sum_{k=1}^{N}(N - k)\,R(k). \qquad (14)$$

Figure 1. Chebyshev polynomials $T_p(y)$ of degree 1 to 5.

Algorithm 1. Chaotic Monte Carlo Simulation

1: procedure Chaotic Monte Carlo Simulation($N$)    Input: sample size $N$.
2:   $B(x) \leftarrow A(x)/\rho(x)$    Defined as in (2).
3:   $x \leftarrow \rho(x)$    Generate a seed from the distribution $\rho(x)$.
4:   $S \leftarrow 0$    Initialization.
5:   $i \leftarrow 0$
6:   while $i < N$ do
7:     $x \leftarrow T(x)$    Generate the next state of the chaotic dynamical system.
8:     $S \leftarrow S + B(x)$    Calculate the cumulative sum.
9:     $i \leftarrow i + 1$
10:  end while
11:  return $S/N$    The time average.
12: end procedure
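As a sketch, Algorithm 1 translates almost line by line into Python; the map $T$, the integrand, and the sampler for the invariant pdf are supplied by the caller (the function and argument names are ours):

```python
def chaotic_mc(A, rho, T, draw_seed, N):
    """Chaotic Monte Carlo simulation, following Algorithm 1.

    A         -- integrand A(x)
    rho       -- invariant pdf rho(x) of the chaotic map T
    T         -- chaotic mapping x -> T(x)
    draw_seed -- function returning one seed x0 distributed according to rho
    N         -- sample size
    """
    B = lambda x: A(x) / rho(x)   # modified integrand, as in (2)
    x = draw_seed()               # seed from the invariant pdf
    S = 0.0
    for _ in range(N):
        x = T(x)                  # next state of the chaotic dynamical system
        S += B(x)                 # cumulative sum
    return S / N                  # time average, cf. (12)
```

For the doubling map of Example 2 one would pass, for instance, `T = lambda x: (2.0 * x) % 1.0`, `rho = lambda x: 1.0`, and `draw_seed = random.random` (in exact arithmetic; in finite precision the random-bit trick described above is preferable).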

Detailed Derivation of (14).

$$\begin{aligned}
\sigma_N^2 &:= \mathbb{E}\big[(\langle B\rangle_N - I)^2\big] = \mathbb{E}\!\left[\left(\frac{1}{N}\sum_{i=1}^{N}(B_i - I)\right)^{\!2}\right] \\
&= \frac{1}{N^2}\,\mathbb{E}\!\left[\sum_{i=1}^{N}(B_i - I)^2 + 2\sum_{i<j}(B_i - I)(B_j - I)\right] \\
&= \frac{1}{N}\mathrm{Var}[B] + \frac{2}{N^2}\sum_{i<j} R(j - i) \qquad (R(k)\ \text{is defined in (13)}) \\
&= \frac{1}{N}\mathrm{Var}[B] + \frac{2}{N^2}\sum_{k=1}^{N}(N - k)\,R(k).
\end{aligned}$$


The first term on the right-hand side of (14) is called the statistical correlation, which depends on the integrand $B(x)$ and the pdf $\rho(x)$. The second term is called the dynamical correlation, which depends on the integrand as well as on the chaotic sequence [6].

Clearly, for i.i.d. random samples $x_1, x_2, \ldots$ we have $R(k) = 0$, and hence (14) reduces to the conventional case (4), where the convergence rate is $1/N$. If there are positive correlations between samples, the variance of the approximation error $\sigma_N^2$ will increase. On the other hand, negative correlations between samples might decrease $\sigma_N^2$. It is therefore natural to ask what is the best achievable convergence rate of chaotic MC simulation. This leads to the notion of super-efficiency of the chaotic MC simulation, detailed in the next section.

B. Super-Efficient Chaotic MC Simulation

Rewrite the variance of the approximation error (14) as

$$\sigma_N^2 = \frac{1}{N}\underbrace{\left(\mathrm{Var}[B] + 2\sum_{k=1}^{N} R(k)\right)}_{\eta} - \frac{2}{N^2}\sum_{k=1}^{N} k\,R(k). \qquad (15)$$

This shows that the convergence rate of $\sigma_N^2$ has two contributors, one decaying as $1/N$ and the other as $1/N^2$. Eventually the convergence rate will be dominated by $1/N$, which suggests that the chaotic MC simulation has the same performance as the standard MC simulation. However, if the dynamical system introduces the right amount of negative correlation such that $\eta = 0$, the convergence rate becomes $1/N^2$, which is a huge improvement over the conventional MC simulation.

To obtain $\eta = 0$, and hence convergence rate $1/N^2$, one should suitably combine the sequence correlation with the integrand [6]. We say the chaotic MC simulation is Super-Efficient (SE) if the variance of the approximation error decays as $1/N^2$ for $N \to \infty$. Umeno [6] also showed that the condition $\eta = 0$ for super-efficiency is necessary as well as sufficient.

Example 4. The integrand [6, p. 1447]

$$A(x) = \frac{-8x^4 + 8x^2 + x - 1}{\pi\sqrt{1 - x^2}} \qquad (16)$$

satisfies the SE condition under Chebyshev dynamical systems (see Example 3) of order $p = 2$ and $p = 4$. Fig. 2 shows the results of applying the chaotic MC simulation algorithm (see Algorithm 1) to find the integral of $A(x)$ using Chebyshev chaotic mappings, and compares their convergence rates with the conventional MC simulation using uniform PR samples. The numerical results verify that the chaotic MC simulation is super-efficient when $p = 2$ and $p = 4$. On the other hand, both conventional and chaotic MC simulation with $p = 5$ have convergence rate $1/N$. Using the properties of the Chebyshev polynomials, it can be shown that the variances $\sigma_N^2$ under $p = 2$ and $p = 4$ are $2/N^2$ and $1/N^2$, respectively, which indicates super-efficiency. On the other hand, the case $p = 5$ yields a nonzero $\eta$ in (15). In the next section, we present a very powerful characterization of super-efficiency.
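The $p = 2$ case of this experiment can be sketched in Python as follows. The Chebyshev sequence is produced as suggested in Example 3, i.e., a doubling-map orbit with appended random bits mapped through (11); the helper names are ours, and only the qualitative behavior, not the exact curves of Fig. 2, should be expected to match:

```python
import math, random

def chebyshev_orbit(length, n_bits=52):
    """Chebyshev (p = 2) chaotic sequence: doubling-map orbit mapped through
    y = cos(2*pi*x), cf. (11), with a random LSB appended at each step."""
    state = random.getrandbits(n_bits)   # uniform seed = invariant pdf of the doubling map
    mask = 2**n_bits - 1
    ys = []
    for _ in range(length):
        state = ((state << 1) & mask) | random.getrandbits(1)
        ys.append(math.cos(2.0 * math.pi * state / 2**n_bits))
    return ys

# Modified integrand of Example 4: B(y) = A(y)/rho(y) = T1(y) - T4(y).
# Its mean under the invariant pdf (10), i.e. the integral of A(x), is 0.
B = lambda y: y - (8.0 * y**4 - 8.0 * y**2 + 1.0)

N = 10**5
print(sum(B(y) for y in chebyshev_orbit(N)) / N)   # near 0; error variance decays like 1/N^2
```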

C. Condition for Super-Efficiency

The super-efficiency condition $\eta = 0$ arising from (15) does not explicitly suggest any way to achieve it. Umeno [6] first gave a characterization of super-efficiency in terms of the coefficients of the generalized Fourier series of the modified integrand $B(x) = A(x)/\rho(x)$ for Chebyshev and piecewise linear dynamical systems, but the results in [6] did not make clear whether those conclusions are also applicable to other dynamical systems. Yao [11] established the connection between super-efficiency and the Lebesgue spectrum of ergodic theory [7], which puts the super-efficiency condition derived in Umeno's work in a general framework, and generalizes Umeno's result to a wide range of dynamical systems, namely those with a Lebesgue spectrum. This observation helps us explain super-efficiency systematically, and hopefully leads to practical algorithms, as detailed in Section IV.

Super-Efficient MC Simulation

Super-Efficient MC simulations have a $1/N^2$ convergence rate as the sample size $N \to \infty$, while conventional MC simulations have a $1/N$ convergence rate.

Figure 2. Variance of the approximation error $\sigma_N^2$ versus the sample size $N$. We compare the convergence rates of the chaotic MC simulation using Chebyshev mappings of order $p = 2, 4$, and 5, and the conventional MC simulation.


Super-Efficient Integrand

An integrand $A(x)$ is Super-Efficient under the dynamical system with mapping $T$ and invariant pdf $\rho$ if (20) holds for $B(x) = A(x)/\rho(x)$.



In this section, we briefly introduce the concept of the Lebesgue spectrum and the characterization of super-efficiency in terms of the Lebesgue spectrum.

Definition 1. Let $\mathcal{K}$ and $F$ be index sets. A dynamical system with mapping $T$ is said to have a Lebesgue spectrum if there exists an orthogonal basis containing the constant function 1 and the collection of functions $\{f_{\lambda,j}(x) \mid \lambda \in \mathcal{K}, j \in F\}$ such that

$$f_{\lambda,j}(T(x)) = f_{\lambda,j+1}(x) \qquad (17)$$

for all $\lambda$ and $j$, where the index $\lambda$ labels the classes and $j$ labels the functions within each class.

Example 5. The Chebyshev dynamical system defined in Example 3 has a Lebesgue spectrum [8]–[10]. Recall that the chaotic mapping $T_p$ is the Chebyshev polynomial of degree $p$, defined on $\mathcal{X} = [-1, 1]$ as

$$T_p(x) = \cos(p \arccos(x)). \qquad (18)$$

The $(\lambda, j)$-th basis function is given by the degree $\lambda p^j$ Chebyshev polynomial:

$$f_{\lambda,j}(x) = T_{\lambda p^j}(x), \qquad \forall\, \lambda \in \mathcal{K},\ j \in F, \qquad (19)$$

where $\mathcal{K}$ is the set of nonnegative integers relatively prime to $p$, and $F = \{0, 1, 2, \ldots\}$. Note that $\lambda = 0$ corresponds to the class with the single function $T_0(x) = 1$.

Theorem 1. If the dynamical system has a Lebesgue spectrum $\{f_{\lambda,j}(x) \mid \lambda \in \mathcal{K}, j \in F\}$ indexed by the sets $\mathcal{K}$ and $F$, then the chaotic MC simulation is super-efficient if and only if

$$d_\lambda := \sum_{j=0}^{\infty} b_{\lambda,j} = 0 \quad \text{for all } \lambda \in \mathcal{K}, \qquad (20)$$

where

$$B(x) = b_0 + \sum_{\lambda\in\mathcal{K}}\sum_{j=0}^{\infty} b_{\lambda,j}\, f_{\lambda,j}(x) \qquad (21)$$

is the generalized Fourier series of $B(x) = A(x)/\rho(x)$.

Thus, the explicit condition for super-efficiency is that the sum of coefficients in each class $\lambda$ be zero. We say that an integrand $A(x)$ is Super-Efficient (under the dynamical system with mapping $T$ and invariant pdf $\rho$) if (20) holds for $B(x) = A(x)/\rho(x)$.

From the previous example, it is now clear why the integrand $A(x)$ in Example 4 is super-efficient under the chaotic mappings $T_2$ and $T_4$ (but not $T_5$). More specifically, the modified integrand under the Chebyshev dynamical system can be written as $B(x) = T_1(x) - T_4(x)$. When $p = 2$ and $p = 4$, the sum of coefficients in the class $\lambda = 1$ is $d_1 = 1 - 1 = 0$, and all other $d_\lambda$'s are zero. Hence, the chaotic MC simulations under both chaotic mappings $T_2$ and $T_4$ are super-efficient. On the other hand, when $p = 5$, the chaotic MC simulation has the same convergence rate as the conventional MC simulation, because $d_1 = 1$ does not satisfy the super-efficiency condition.
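The class sums $d_\lambda$ of Theorem 1 can be checked numerically. The short Python sketch below uses Gauss-Chebyshev quadrature to compute $b_m = 2\,\mathbb{E}_\rho[B(Y)T_m(Y)]$ (the factor 2 comes from the Chebyshev orthogonality relation when the unnormalized polynomials $T_m$ are used as basis functions, which is our convention here, not necessarily the article's) and then sums them within a class; the helper names and the truncation depth are our own choices:

```python
import numpy as np

def cheb_T(n, x):
    """Chebyshev polynomial of the first kind, T_n(x) = cos(n arccos x)."""
    return np.cos(n * np.arccos(x))

# Gauss-Chebyshev nodes: averaging over them approximates E_rho[.] for the
# invariant pdf rho(x) = 1/(pi*sqrt(1-x^2)) of Example 3.
M = 10_000
x = np.cos(np.pi * (np.arange(M) + 0.5) / M)

B = lambda x: cheb_T(1, x) - cheb_T(4, x)   # modified integrand of Example 4

def d_class(lam, p, J=4):
    """Sum of coefficients b_{lam,j} = 2*E_rho[B*T_{lam*p^j}] over j = 0..J."""
    return sum(2.0 * np.mean(B(x) * cheb_T(lam * p**j, x)) for j in range(J + 1))

print(d_class(1, 2))                 # ~0  -> super-efficient under T_2
print(d_class(1, 5), d_class(4, 5))  # ~1 and ~-1 -> not super-efficient under T_5
```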


Figure 3. The variance of the approximation error $\sigma_N^2$ versus the number of samples $N$. The slope of the conventional MC simulation curve is $-1$, indicating its $1/N$ behavior. On the other hand, the slope of the super-efficient MC simulation is $-2$, because $\sigma_N^2$ decays like $1/N^2$. Between these two extremes are the mismatched SE MC simulations with different sizes of $\varepsilon$ (from 0.001 up to 1). For $\varepsilon = 0.001$, the curve is almost identical to the super-efficient curve. As $\varepsilon$ becomes larger, the slopes of the mismatched SE MC simulations gradually increase as $N$ becomes larger.


Example 6. Consider the integrand [6, p. 1447]

$$A(x) = \frac{-8x^4 + 8x^2 + (1+\varepsilon)x - 1}{\pi\sqrt{1 - x^2}}, \qquad B_\varepsilon(x) = \frac{A(x)}{\rho(x)}. \qquad (22)$$

Under the Chebyshev dynamical system $(\mathcal{X}, \mathcal{A}, \rho, T_p)$, $B_\varepsilon(x)$ can be expanded as

$$B_\varepsilon(x) = (1 + \varepsilon)T_1(x) - T_4(x). \qquad (23)$$

If $p = 2$, the coefficients of the generalized Fourier series are $b_{1,0} = 1 + \varepsilon$, $b_{1,2} = -1$, and zero otherwise. The sums of coefficients are $d_1 = \varepsilon$ and $d_\lambda = 0$ for $\lambda \neq 1$. Therefore $A(x)$ is super-efficient if and only if $\varepsilon = 0$. When $\varepsilon \neq 0$ we have a "mismatched" SE MC simulation, which appears to be super-efficient for small $N$ but gradually loses super-efficiency as $N$ increases [12]. See Fig. 3.

IV. Approximate Super-Efficient Monte Carlo Simulation

Applying chaotic MC simulation to super-efficient integrands yields a superior convergence rate of $1/N^2$, in contrast to the conventional convergence rate $1/N$. However, most integrands are not SE. This implies that chaotic MC simulation has no advantage over conventional MC simulation in general.

While most integrands do not satisfy the SE condition, Yao proposed the Approximate Super-Efficient (ASE) algorithm [11] that modifies the integrand so that it is approximately SE, and by applying chaotic MC simulation on the modified integrand, we get a much faster convergence rate of $1/N^\alpha$ for a convergence exponent $\alpha$ between 1 and 2 (a concept equivalent to ASE was proposed by Umeno in 2002 [13]).

A crucial step here consists of adding to $B(x)$ a function that has zero mean. This will not change the integral of $B(x)$ [11]. Therefore, if we know the sum of coefficients $d_\lambda$ in (20) for each class $\lambda$, then the new integrand

$$B'(x) = B(x) - \sum_{\lambda\in\mathcal{K}} d_\lambda f_{\lambda,0}(x) \qquad (24)$$

will be super-efficient without changing the integral of $B(x)$ (recall that the basis functions $f_{\lambda,j}(x)$'s have zero mean for all $\lambda$ and $j$). We call the function $d_\lambda f_{\lambda,0}(x)$ the compensator associated with class $\lambda$. By subtracting compensators from $B(x)$, we introduce negative dynamical correlation, and make the chaotic MC simulation nearly super-efficient.

In practice, since we do not know the sums of coefficients, it is not possible to construct infinitely many compensators to achieve perfect super-efficiency. The idea of the ASE algorithm is to approximate the sum of coefficients $d_\lambda$ by its $L_\lambda$-term partial sum

$$\hat d_\lambda \approx b_{\lambda,0} + b_{\lambda,1} + \cdots + b_{\lambda,L_\lambda}, \qquad (25)$$

using conventional or chaotic MC simulations, where $L_\lambda$ is some (hopefully not too large) positive integer. Then we form the modified integrand

$$\tilde B(x) = B(x) - \sum_{\lambda\in\mathcal{K}_L} \hat d_\lambda f_{\lambda,0}(x), \qquad (26)$$

where the index set $\mathcal{K}_L$ contains $L$ classes. If the sums of coefficients $\tilde d_\lambda = d_\lambda - \hat d_\lambda$ of $\tilde B$ are close to zero, so that $\tilde\eta = \sum_{\lambda\in\mathcal{K}} \tilde d_\lambda^2 = \varepsilon^2$ is small, then from (15) the variance of the approximation error can be written as

$$\sigma_N^2 = \frac{\varepsilon^2}{N} + \frac{\gamma}{N^2} \qquad (27)$$

for some $\gamma$. The effective convergence rate can be expressed as $1/N^\alpha$, where $\alpha \in [1, 2]$ is referred to as the convergence exponent, and it is defined as the negative slope of the $\sigma_N^2$ versus $N$ curve (see Fig. 4). For the ASE algorithm, $\alpha$ will decrease as the sample size $N$ increases. Indeed, when $N$ increases, $\alpha$ will gradually decrease to 1, because the term $\varepsilon^2/N$ will eventually dominate the convergence rate. The more accurate the estimates $\hat d_\lambda$'s are, the slower $\alpha$ decreases to 1 (Fig. 4).

Algorithm 2. Approximate Super-Efficient Monte Carlo Simulation

1: procedure ASE Monte Carlo Simulation($N$, $n$, $\mathcal{K}_L$, $\{L_\lambda\}$)
2:   for $\lambda \in \mathcal{K}_L$ do    Estimate the $d_\lambda$'s for $\lambda \in \mathcal{K}_L$.
3:     define $F_\lambda(x) = \sum_{j=0}^{L_\lambda} f_{\lambda,j}(x)$
4:     $\hat d_\lambda \leftarrow \langle F_\lambda(x_i) B(x_i)\rangle_n$    Use conventional or chaotic MC simulation.
5:   end for
6:   define $B(x) = A(x)/\rho(x)$
7:   define $\tilde B(x) = B(x) - \sum_{\lambda\in\mathcal{K}_L} \hat d_\lambda f_{\lambda,0}(x)$    Subtract the compensators as in (26).
8:   define $B(x) = \tilde B(x)$    Redefine $B(x)$.
9:   Go to line 3 of Algorithm 1.    Apply chaotic MC simulation on $\tilde B(x)$.
10: end procedure
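A compact Python sketch of this two-stage procedure for the $p = 2$ Chebyshev system is given below. The choice of classes, the truncation depth $L$, the stage-1 sample size $n$, the test integrand, and all helper names are our own illustrative assumptions; the coefficient estimates use the factor-2 Chebyshev normalization noted earlier, and the chaotic sequence is generated as in Example 3:

```python
import math, random

def T(n, y):
    """Chebyshev polynomial of the first kind."""
    return math.cos(n * math.acos(max(-1.0, min(1.0, y))))

def cheb_orbit(length, n_bits=52):
    """p = 2 Chebyshev chaotic sequence via the doubling map and y = cos(2*pi*x)."""
    state, mask = random.getrandbits(n_bits), 2**n_bits - 1
    out = []
    for _ in range(length):
        state = ((state << 1) & mask) | random.getrandbits(1)
        out.append(math.cos(2.0 * math.pi * state / 2**n_bits))
    return out

def ase_mc(B, N, n, classes=(1, 3, 5, 7, 9), L=5, p=2):
    """Two-stage ASE MC in the spirit of Algorithm 2 (a sketch, not the
    authors' code). Stage 1 estimates the class sums d_lambda from n chaotic
    samples; stage 2 runs chaotic MC on the compensated integrand."""
    F = {lam: (lambda y, lam=lam: sum(T(lam * p**j, y) for j in range(L + 1)))
         for lam in classes}
    ys1 = cheb_orbit(n)
    d_hat = {lam: 2.0 * sum(B(y) * F[lam](y) for y in ys1) / n for lam in classes}
    # Compensated integrand, cf. (26); subtracting d_hat*T_lambda leaves the
    # integral unchanged because E[T_lambda] = 0 for lambda >= 1.
    B_tilde = lambda y: B(y) - sum(d_hat[lam] * T(lam, y) for lam in classes)
    ys2 = cheb_orbit(N)
    return sum(B_tilde(y) for y in ys2) / N

# Illustrative non-SE integrand of our own choosing (not the one in Example 7):
B = lambda y: math.exp(-2.0 * y)
print(ase_mc(B, N=10**5, n=10**4))
```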

Figure 4. Illustration of the convergence rate of $\sigma_N^2$ versus the sample size $N$ (log-log axes; curves for conventional, ASE, and SE simulation). The convergence exponent $\alpha$ is the negative slope of the curve. Conventional MC simulation has $\alpha = 1$. SE MC simulation has $\alpha = 2$. The ASE algorithm has $\alpha \approx 2$ when $N$ is small, and $\alpha$ gradually decreases to $\approx 1$ when $N$ becomes large. Note that even though the decay exponent $\alpha$ of the ASE algorithm ultimately goes to 1, the error variance $\sigma_N^2$ is significantly smaller than in the conventional case.




A. Fixed-Accuracy ASE Algorithm

Yao first proposed the following two-stage ASE algorithm [11] (see Algorithm 2):

1) Approximate the sums of coefficients $\hat d_\lambda$ in (25) using $n$-sample conventional or chaotic MC simulation for each $\lambda \in \mathcal{K}_L$.
2) Subtract the compensators from the integrand $B(x)$ to form $\tilde B(x)$ as defined in (26), and apply chaotic MC simulation on $\tilde B(x)$.

Note that we need to spend $n$ samples to estimate $d_\lambda$ for each $\lambda \in \mathcal{K}_L$ in stage 1. The quality of the estimates will affect how well the chaotic MC simulation performs in the second stage. To illustrate this point, in Example 7 we apply ASE using different sizes of $n$ and compare their performance.

Figure 5. Variance of the approximation error $\sigma_N^2$ versus the number of samples $N$. Throughout the entire simulation, the conventional and super-efficient MC simulations consistently exhibit $1/N$ and $1/N^2$ behavior, respectively. The ASE MC simulations have $1/N^2$ behavior at first, but gradually degrade to $1/N$. ASE simulations with larger values of $n$ have more accurate estimates $\hat d_\lambda$, and lose super-efficiency later. The Progressive ASE simulation has $1/N$ behavior at first but gradually improves to $1/N^2$, because the estimates $\hat d_\lambda$ get more accurate as $N$ increases.

Figure 6. To see the decay exponent $\alpha$ more clearly, we use the least-squares method to find the slopes of the curves in Fig. 5. The decay exponent for the super-efficient MC simulation is 2 for the entire simulation. On the other hand, the exponent for the conventional MC simulation is 1. The ASE MC simulation with sample size $n = 100{,}000$ is super-efficient at $N = 10^4$, but its decay exponent gradually decreases to 1.1 by the end of the simulation. ASE simulations with lower sample sizes $n$ have smaller decay exponents, all of which decrease to 1 very quickly. On the other hand, the PASE simulation has a decay exponent of around 1.4 in the beginning that gradually increases to nearly 1.9 at the end, indicating that the quality of the estimates $\hat d_\lambda$ is getting better.

Algorithm 3. Progressive Approximate Super-Efficient Monte Carlo Simulation

1: procedure PASE Monte Carlo Simulation($N$, $\mathcal{K}_L$, $\{L_\lambda\}$)
2:   define $B(x) := A(x)/\rho(x)$    Defined as in (2).
3:   define $F_\lambda(x) = \sum_{j=0}^{L_\lambda} f_{\lambda,j}(x)$ for $\lambda \in \mathcal{K}_L$
4:   $x \leftarrow \rho(x)$    Generate a seed from the distribution $\rho(x)$.
5:   $i \leftarrow 0$, $S \leftarrow 0$, $D_\lambda \leftarrow 0$ for $\lambda \in \mathcal{K}_L$
6:   while $i < N$ do
7:     $x \leftarrow T(x)$    Generate the next state of the chaotic dynamical system.
8:     $S \leftarrow S + B(x)$    Calculate the cumulative sum.
9:     $D_\lambda \leftarrow D_\lambda + F_\lambda(x)B(x)$ for $\lambda \in \mathcal{K}_L$
10:    $i \leftarrow i + 1$
11:  end while
12:  return $\big(S - \sum_{\lambda\in\mathcal{K}_L} D_\lambda \langle f_{\lambda,0}\rangle_N\big)/N$    The time average.
13: end procedure
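A matching Python sketch of Algorithm 3, under the same assumptions as the ASE sketch above (Chebyshev system with $p = 2$, factor-2 coefficient normalization, helper names ours). The only structural difference is that the running sums used to estimate the $d_\lambda$'s are accumulated inside the main loop, together with the time average of each compensator basis function $f_{\lambda,0} = T_\lambda$:

```python
import math, random

def T(n, y):
    """Chebyshev polynomial of the first kind."""
    return math.cos(n * math.acos(max(-1.0, min(1.0, y))))

def pase_mc(B, N, classes=(1, 3, 5, 7, 9), L=5, p=2, n_bits=52):
    """Progressive ASE MC in the spirit of Algorithm 3 (a sketch): the
    compensator coefficients are estimated from the same chaotic samples
    used for the main time average, so they improve as N grows."""
    F = {lam: (lambda y, lam=lam: sum(T(lam * p**j, y) for j in range(L + 1)))
         for lam in classes}
    state, mask = random.getrandbits(n_bits), 2**n_bits - 1   # seed ~ invariant pdf
    S = 0.0
    D = {lam: 0.0 for lam in classes}   # running sums of F_lambda(y)*B(y)
    G = {lam: 0.0 for lam in classes}   # running sums of f_{lambda,0}(y) = T_lambda(y)
    for _ in range(N):
        state = ((state << 1) & mask) | random.getrandbits(1)  # doubling-map step
        y = math.cos(2.0 * math.pi * state / 2**n_bits)        # Chebyshev state, cf. (11)
        b = B(y)
        S += b
        for lam in classes:
            D[lam] += 2.0 * F[lam](y) * b
            G[lam] += T(lam, y)
    # Compensated time average; E[T_lambda] = 0, so the correction is unbiased.
    return (S - sum((D[lam] / N) * G[lam] for lam in classes)) / N

B = lambda y: math.exp(-2.0 * y)   # same illustrative integrand as above
print(pase_mc(B, N=10**5))
```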



Example 7. Consider the Chebyshev dynamical system of order $p = 2$ (see Example 3) and the integrand [12]

$$A(x) = (1 - x^2)\exp(-2x) = B(x)\,\rho(x). \qquad (28)$$

Unlike the previous examples, where the integrand $B(x)$ could be expressed as a finite sum of basis functions, the integrand $B(x)$ in (28) has infinitely many terms in its generalized Fourier series expansion. We choose $\mathcal{K}_L = \{1, 3, 5, 7, 9\}$ and $L_\lambda = 5$. We perform chaotic MC simulation for $N = 10^6$ samples using the conventional, ASE, and Progressive ASE (PASE) MC algorithms (the latter to be defined shortly); see Fig. 5. As a benchmark, we compute the sums of coefficients using accurate numerical integration for the super-efficient case. For the ASE MC simulations, we use different numbers of samples $n$ to estimate the $d_\lambda$'s, to demonstrate the effect of inaccurate estimates on the convergence rate. For the Progressive ASE MC simulation, we estimate the $d_\lambda$'s at the same time as the chaotic MC simulation runs.

To better visualize the decay exponent $\alpha$, we use the least-squares method to find the slopes of the curves in Fig. 5 (recall that $\alpha$ is the negative slope of the curve in the log-log plot). From (15), if the integrand is nearly super-efficient, then the decay exponent $\alpha$ will be around 2. For conventional MC simulation, $\alpha = 1$. See Fig. 6.

Note that for ASE simulations, we need to spend $n$ random samples in the first stage for each class in $\mathcal{K}_L$. The effective number of samples for ASE simulations should take those extra samples into consideration. On the other hand, the Progressive ASE algorithm (see the next section) does not have this overhead, and its convergence rate improves as $N$ increases, because the estimates of $d_\lambda$ become more and more accurate.

B. Progressive ASE Algorithm

ASE simulation is approximately super-efficient for moderate sizes of $N$. However, from (27) it is clear that ASE simulation will eventually lose the $1/N^2$ convergence rate as long as $\varepsilon \neq 0$.

In [12], Biglieri suggested computing the $\hat d_\lambda$'s iteratively to improve the accuracy of the estimation. As opposed to the original ASE algorithm, which has fixed accuracy for the entire simulation, we propose a Progressive ASE (PASE) algorithm that keeps improving the accuracy of the $\hat d_\lambda$'s as the chaotic MC simulation goes on. The idea is to use the samples $B(x_i)$ generated in the main chaotic MC simulation to estimate the $\hat d_\lambda$'s continuously. We therefore get progressively better estimates of the $\hat d_\lambda$'s and improve the decay rate. See Algorithm 3.

V. Conclusions and Future Work

While conventional MC simulation yields the convergence rate $1/N$, SE MC has a superior convergence rate $1/N^2$ for integrands of the SE type. Since most integrands are not SE, we introduced the concept of ASE. The ASE and Progressive ASE algorithms are general, and at least as fast as conventional MC simulation, while sometimes yielding a nearly super-efficient convergence rate. Furthermore, the introduction of the Lebesgue spectrum concept from ergodic theory allows us to systematically study SE MC simulation. The above discussions are also applicable to multidimensional integrands. It is of great interest to find more applications that exploit the concepts of SE and ASE.

Cheng-An Yang received the B.Sc. degree in Electrical Control Engineering from National Chiao Tung University, Hsinchu, Taiwan, in 2009. He received the M.Sc. degree in Communication Engineering from Chalmers, Göteborg, Sweden, in 2010. He is currently pursuing his Ph.D. degree in the Electrical Engineering Department at the University of California, Los Angeles. His research interests include optimization in signal processing and wireless communications.

Kung Yao received the B.S. (Summa Cum Laude), M.A., and Ph.D. degrees in electrical engineering, all from Princeton University, Princeton, N.J. Presently, he is a Distinguished Professor in the Electrical Engineering Department at UCLA. His research and professional interests include 4G cellular network systems, digital communication theory and systems, and beamforming in sensor array systems. He received the IEEE Signal Processing Society's 1993 Senior Award in VLSI Signal Processing and the 2008 IEEE Communications Society/Information Theory Society Joint Paper Award. He is the lead co-author of Detection and Estimation in Communication and Radar Systems, Cambridge Press, 2012. He is a Life Fellow of the IEEE.

Ken Umeno received his B.Sc. degree in electronic communication from Waseda University, Japan, in 1990. He received his M.Sc. and Ph.D. degrees in physics from the University of Tokyo, Japan, in 1992 and 1995, respectively. Currently, he is a professor at the Graduate School of Informatics, Kyoto University. Prior to joining Kyoto University in 2012, he worked for the Ministry of Posts and Telecommunications, Communications Research Laboratory (currently the National Institute of Information and Communications Technology of Japan, NICT). From 2004 to 2012, he was the CEO and president of ChaosWare, Inc., the first spin-off company of NICT, as well as a principal investigator at NICT. He received the LSI IP Award in 2003, and the Telecom-System Award in 2003 and 2008. He holds 46 registered Japanese patents and 23 registered U.S. patents in the fields of telecommunications, security, and financial engineering. His research interests include ergodic theory, statistical computing, coding theory, chaos theory, and its applications to communications and computing.

Ezio Biglieri was born in Aosta (Italy). He received his training in Electrical Engineering from Politecnico di Torino (Italy), where he received his Dr. Engr. degree in 1967. He is an Adjunct Professor with the Electrical Engineering Department of the University of California at Los Angeles (UCLA), with the Department TIC, Universitat Pompeu Fabra, Barcelona, Spain, and with King Saud University, Riyadh, KSA. He was elected three times to the Board of Governors of the IEEE Information Theory Society, and in 1999 he was the President of the Society. He is serving on the Scientific Board of the French company Sequans Communications, and, until 2012, he was a member of the Scientific Council of the "Groupe des Écoles des Télécommunications" (GET), France. Since 2011 he has been a member of the Scientific Advisory Board of CHIST-ERA (European Coordinated Research on Long-term Challenges in Information and Communication Sciences & Technologies ERA-Net). In the past, he was Editor-in-Chief of the IEEE Transactions on Information Theory, the IEEE Communications Letters, the European Transactions on Telecommunications, and the Journal of Communications and Networks. He is a Life Fellow of the IEEE. Among other honors, in 2000 he received the IEEE Donald G. Fink Prize Paper Award and the IEEE Third-Millennium Medal. In 2001 he received the IEEE Communications Society Edwin Howard Armstrong Achievement Award. He received twice (in 2004 and 2012) the Journal of Communications and Networks Best Paper Award. In 2012 he received from the IEEE Information Theory Society the Aaron D. Wyner Distinguished Service Award. In 2013 he received from EURASIP the Athanasios Papoulis Award for outstanding contributions to education in communications and information theory.

References

[1] R. Y. Rubinstein, Simulation and the Monte Carlo Method (Wiley Series in Probability and Mathematical Statistics). New York: Wiley, 1981.

[2] J. M. Hammersley and D. C. Handscomb, Monte Carlo Methods. Methuen & Co., 1964.

[3] M. Matsumoto and T. Nishimura, “Mersenne twister: A 623-dimen-sionally equidistributed uniform pseudo-random number generator,” ACM Trans. Model. Comput. Simul., vol. 8, pp. 3–30, 1998.

[4] D. E. Knuth, The Art of Computer Programming. Volume 2: Seminu-merical Algorithms, 2nd ed. Reading, MA: Addison-Wesley, 1981.

[5] C. Moler, “Random numbers,” in Numerical Computing with MATLAB. Philadelphia, PA: SIAM, 2004, ch. 9.

[6] K. Umeno, “Chaotic Monte Carlo computation: A dynamical effect of random-number generations,” Jpn. J. Appl. Phys., vol. 39, pp. 1442–1456, Mar. 2000.

[7] V. I. Arnold and A. Avez, Ergodic Problems of Classical Mechanics, vol. 50. New York: Benjamin, 1968.

[8] D. S. Broomhead, et al., “Codes for spread spectrum applications generated using chaotic dynamical systems,” Dyn. Stab. Syst., vol. 14, no. 1, 1999.

[9] K. Umeno, “SNR analysis for orthogonal chaotic spreading sequenc-es,” Nonlinear Anal., vol. 47, pp. 5753–5763, Aug. 2001.

[10] C. C. Chen, et al., “Design of spread-spectrum sequences using cha-otic dynamical systems and ergodic theory,” IEEE Trans. Circuits Syst. I, Fundam. Theory Appl., vol. 48, no. 9, pp. 1110–1114, Sept. 2001.

[11] K. Yao, “An approximate superefficient MC method is better than classical MC method,” Unpublished Report, 2009.

[12] E. Biglieri, “Some notes on ‘superefficient’ Monte Carlo methods,” Unpublished Report, 2009.

[13] K. Umeno, Jpn. Patent 3711270, 2005.

[14] J. L. Walsh, “A closed set of normal orthogonal functions,” Amer. J. Math., vol. 45, pp. 5–24, 1923.