

Chapter 1: Introduction 3

include a section containing references that explain the theoretical concepts associated with the methods covered in that chapter.

In this book, we cover some of the most commonly used techniques in computational statistics. While we cannot include all methods that might be a part of computational statistics, we try to present those that have been in use for several years.

Since the focus of this book is on the implementation of the methods, we include algorithmic descriptions of the procedures. We also provide examples that illustrate the use of the algorithms in data analysis. It is our hope that seeing how the techniques are implemented will help the reader understand the concepts and facilitate their use in data analysis.

Some background information is given in Chapters 2, 3, and 4 for those who might need a refresher in probability and statistics. In Chapter 2, we discuss some of the general concepts of probability theory, focusing on how they

Comparison Between Traditional Statistics and Computational Statistics [Wegman, 1988]. Reprinted with permission from the Journal of the Washington Academy of Sciences.

Traditional Statistics                            Computational Statistics
Small to moderate sample size                     Large to very large sample size
Independent, identically distributed data sets    Nonhomogeneous data sets
One or low dimensional                            High dimensional
Manually computational                            Computationally intensive
Mathematically tractable                          Numerically tractable
Well focused questions                            Imprecise questions
Strong unverifiable assumptions:                  Weak or no assumptions:
  Relationships (linearity, additivity)             Relationships (nonlinearity)
  Error structures (normality)                      Error structures (distribution free)
Statistical inference                             Structural inference
Predominantly closed form algorithms              Iterative algorithms possible
Statistical optimality                            Statistical robustness

© 2002 by Chapman & Hall/CRC

Lecture 2 Review of Basics

ISyE 6416, Yao Xie

Review of Linear Algebra

Vector spaces

a vector space or linear space (over the reals) consists of

• a set V

• a vector sum + : V × V → V

• a scalar multiplication : R× V → V

• a distinguished element 0 ∈ V

which satisfy a list of properties

Linear algebra review 3–2

• x+ y = y + x, ∀x, y ∈ V (+ is commutative)

• (x+ y) + z = x+ (y + z), ∀x, y, z ∈ V (+ is associative)

• 0 + x = x, ∀x ∈ V (0 is additive identity)

• ∀x ∈ V ∃(−x) ∈ V s.t. x+ (−x) = 0 (existence of additive inverse)

• (αβ)x = α(βx), ∀α,β ∈ R ∀x ∈ V (scalar mult. is associative)

• α(x+ y) = αx+ αy, ∀α ∈ R ∀x, y ∈ V (right distributive rule)

• (α+ β)x = αx+ βx, ∀α,β ∈ R ∀x ∈ V (left distributive rule)

• 1x = x, ∀x ∈ V


Examples

• V1 = Rn, with standard (componentwise) vector addition and scalar multiplication

• V2 = {0} (where 0 ∈ Rn)

• V3 = span(v1, v2, . . . , vk) where

span(v1, v2, . . . , vk) = {α1v1 + · · ·+ αkvk | αi ∈ R}

and v1, . . . , vk ∈ Rn


Subspaces

• a subspace of a vector space is a subset of a vector space which is itself a vector space

• roughly speaking, a subspace is closed under vector addition and scalar multiplication

• examples V1, V2, V3 above are subspaces of Rn
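
As a rough illustration of the span in example V3, the following MATLAB sketch tests whether a vector lies in span(v1, v2). It relies on the fact that w is in the span exactly when appending w to the matrix of spanning vectors leaves the rank unchanged. The vectors are made-up values chosen for illustration.

```matlab
% Columns of V are the spanning vectors v1 and v2 (illustrative values).
V = [1 0;
     0 1;
     1 1];
w_in  = 2*V(:,1) - 3*V(:,2);   % a linear combination, so it lies in the span
w_out = [0; 0; 1];             % not a combination of v1 and v2

% w is in span(V) exactly when rank([V w]) equals rank(V).
in_span  = rank([V w_in])  == rank(V);
out_span = rank([V w_out]) == rank(V);
fprintf('w_in in span: %d, w_out in span: %d\n', in_span, out_span);
```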


Vector spaces of functions

• V4 = {x : R+ → Rn | x is differentiable}, where vector sum is sum of functions:

(x+ z)(t) = x(t) + z(t)

and scalar multiplication is defined by

(αx)(t) = αx(t)

(a point in V4 is a trajectory in Rn)

• V5 = {x ∈ V4 | ẋ = Ax} (points in V5 are trajectories of the linear system ẋ = Ax)

• V5 is a subspace of V4


Independent set of vectors

a set of vectors {v1, v2, . . . , vk} is independent if

α1v1 + α2v2 + · · ·+ αkvk = 0 =⇒ α1 = α2 = · · · = 0

some equivalent conditions:

• coefficients of α1v1 + α2v2 + · · ·+ αkvk are uniquely determined, i.e.,

α1v1 + α2v2 + · · ·+ αkvk = β1v1 + β2v2 + · · ·+ βkvk

implies α1 = β1,α2 = β2, . . . ,αk = βk

• no vector vi can be expressed as a linear combination of the other vectors v1, . . . , vi−1, vi+1, . . . , vk
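
The independence conditions above can be checked numerically. This is a minimal MATLAB sketch, assuming the vectors are stored as matrix columns: k vectors are independent exactly when the matrix has rank k. The vectors are illustrative, with v3 deliberately chosen as v1 + v2.

```matlab
v1 = [1; 0; 1];
v2 = [0; 1; 1];
v3 = [1; 1; 2];                          % v3 = v1 + v2, so {v1, v2, v3} is dependent

is_indep_12  = rank([v1 v2])    == 2;    % independent
is_indep_123 = rank([v1 v2 v3]) == 3;    % not independent

% A nonzero coefficient vector c with c(1)*v1 + c(2)*v2 + c(3)*v3 = 0.
c = null([v1 v2 v3]);
residual = norm([v1 v2 v3] * c);         % zero up to rounding error
```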


Basis and dimension

set of vectors {v1, v2, . . . , vk} is a basis for a vector space V if

• v1, v2, . . . , vk span V , i.e., V = span(v1, v2, . . . , vk)

• {v1, v2, . . . , vk} is independent

equivalent: every v ∈ V can be uniquely expressed as

v = α1v1 + · · ·+ αkvk

fact: for a given vector space V , the number of vectors in any basis is the same

number of vectors in any basis is called the dimension of V , denoted dimV

(we assign dim{0} = 0, and dimV = ∞ if there is no basis)


Nullspace of a matrix

the nullspace of A ∈ Rm×n is defined as

N (A) = { x ∈ Rn | Ax = 0 }

• N (A) is set of vectors mapped to zero by y = Ax

• N (A) is set of vectors orthogonal to all rows of A

N (A) gives ambiguity in x given y = Ax:

• if y = Ax and z ∈ N (A), then y = A(x+ z)

• conversely, if y = Ax and y = Ax̃, then x̃ = x + z for some z ∈ N (A)
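
The ambiguity described above can be seen numerically. The sketch below uses the base MATLAB function null to obtain an orthonormal basis of N(A) and confirms that adding a nullspace element to x leaves y = Ax unchanged. The matrix and vectors are illustrative values.

```matlab
A = [1 2 3;
     2 4 6];              % rank 1, so N(A) has dimension 2
Z = null(A);              % columns form an orthonormal basis of N(A)

x = [1; 1; 1];
z = Z * [0.5; -2];        % an arbitrary element of N(A)

y1 = A * x;
y2 = A * (x + z);         % same measurement, different x
gap = norm(y1 - y2);      % zero up to rounding error
```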


Zero nullspace

A is called one-to-one if 0 is the only element of its nullspace: N (A) = {0} ⇐⇒

• x can always be uniquely determined from y = Ax (i.e., the linear transformation y = Ax doesn’t ‘lose’ information)

• mapping from x to Ax is one-to-one: different x’s map to different y’s

• columns of A are independent (hence, a basis for their span)

• A has a left inverse, i.e., there is a matrix B ∈ Rn×m s.t. BA = I

• det(ATA) ≠ 0

(we’ll establish these later)
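
A small MATLAB sketch of the left-inverse condition, assuming a skinny matrix with independent columns. The particular choice B = (AᵀA)⁻¹Aᵀ is one left inverse among many and is used here only for illustration.

```matlab
A = [1 0;
     1 1;
     1 2];                 % skinny, with independent columns
d = det(A' * A);           % nonzero because N(A) = {0}
B = (A' * A) \ A';         % one particular left inverse: B*A = I

x = [3; -1];
y = A * x;
x_rec = B * y;             % x is recovered exactly from y = A*x
```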


Interpretations of nullspace

suppose z ∈ N (A)

y = Ax represents measurement of x

• z is undetectable from sensors — get zero sensor readings

• x and x+ z are indistinguishable from sensors: Ax = A(x+ z)

N (A) characterizes ambiguity in x from measurement y = Ax

y = Ax represents output resulting from input x

• z is an input with no result

• x and x+ z have same result

N (A) characterizes freedom of input choice for given result


Range of a matrix

the range of A ∈ Rm×n is defined as

R(A) = {Ax | x ∈ Rn} ⊆ Rm

R(A) can be interpreted as

• the set of vectors that can be ‘hit’ by linear mapping y = Ax

• the span of columns of A

• the set of vectors y for which Ax = y has a solution


Onto matrices

A is called onto if R(A) = Rm ⇐⇒

• Ax = y can be solved in x for any y

• columns of A span Rm

• A has a right inverse, i.e., there is a matrix B ∈ Rn×m s.t. AB = I

• rows of A are independent

• N (AT ) = {0}

• det(AAT ) ≠ 0

(some of these are not obvious; we’ll establish them later)
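
The right-inverse condition can be sketched the same way for a fat matrix with independent rows. The choice B = Aᵀ(AAᵀ)⁻¹ is one right inverse among many, picked here for illustration.

```matlab
A = [1 1 1;
     0 1 2];               % fat, with independent rows
B = A' / (A * A');         % one particular right inverse: A*B = I

y = [2; -1];
x = B * y;                 % one solution of A*x = y
res = norm(A * x - y);     % zero up to rounding error
```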


Interpretations of range

suppose v ∈ R(A), w ∉ R(A)

y = Ax represents measurement of x

• y = v is a possible or consistent sensor signal

• y = w is impossible or inconsistent; sensors have failed or model is wrong

y = Ax represents output resulting from input x

• v is a possible result or output

• w cannot be a result or output

R(A) characterizes the possible results or achievable outputs


Inverse

A ∈ Rn×n is invertible or nonsingular if detA ≠ 0

equivalent conditions:

• columns of A are a basis for Rn

• rows of A are a basis for Rn

• y = Ax has a unique solution x for every y ∈ Rn

• A has a (left and right) inverse denoted A−1 ∈ Rn×n, with AA−1 = A−1A = I

• N (A) = {0}

• R(A) = Rn

• detATA = detAAT ≠ 0


Interpretations of inverse

suppose A ∈ Rn×n has inverse B = A−1

• mapping associated with B undoes mapping associated with A (applied either before or after!)

• x = By is a perfect (pre- or post-) equalizer for the channel y = Ax

• x = By is unique solution of Ax = y
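
A minimal MATLAB sketch of these interpretations for an illustrative 2×2 matrix. In practice the backslash operator is usually preferred to forming inv(A) explicitly; both are shown here only to confirm that B = A⁻¹ undoes A on either side.

```matlab
A = [4 1;
     2 3];                 % det(A) = 10, so A is invertible
y = [1; 2];

x = A \ y;                 % unique solution of A*x = y
B = inv(A);                % explicit inverse, for illustration only

left_err  = norm(B * A - eye(2));   % B undoes A when applied after A
right_err = norm(A * B - eye(2));   % and when applied before A
```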


Matrix structure and algorithm complexity

cost (execution time) of solving Ax = b with A ∈ Rn×n

• for general methods, grows as n^3

• less if A is structured (banded, sparse, Toeplitz, . . . )

flop counts

• flop (floating-point operation): one addition, subtraction, multiplication, or division of two floating-point numbers

• to estimate complexity of an algorithm: express number of flops as a (polynomial) function of the problem dimensions, and simplify by keeping only the leading terms

• not an accurate predictor of computation time on modern computers

• useful as a rough estimate of complexity

Numerical linear algebra background 9–2

vector-vector operations (x, y ∈ Rn)

• inner product xTy: 2n− 1 flops (or 2n if n is large)

• sum x+ y, scalar multiplication αx: n flops

matrix-vector product y = Ax with A ∈ Rm×n

• m(2n− 1) flops (or 2mn if n large)

• 2N if A is sparse with N nonzero elements

• 2p(n+m) if A is given as A = UV T , U ∈ Rm×p, V ∈ Rn×p

matrix-matrix product C = AB with A ∈ Rm×n, B ∈ Rn×p

• mp(2n− 1) flops (or 2mnp if n large)

• less if A and/or B are sparse

• (1/2)m(m+ 1)(2n− 1) ≈ m^2 n if m = p and C symmetric
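
As a worked example of the counts above, this short MATLAB sketch evaluates the exact and leading-term flop counts for illustrative dimensions and shows how small the difference is when n is large. The dimensions are arbitrary choices.

```matlab
m = 200; n = 300; p = 150;           % illustrative problem dimensions

flops_inner  = 2*n - 1;              % inner product of two n-vectors
flops_matvec = m * (2*n - 1);        % y = A*x with A of size m-by-n
flops_matmat = m * p * (2*n - 1);    % C = A*B with B of size n-by-p

approx_matmat = 2 * m * n * p;       % leading-term estimate
rel_err = (approx_matmat - flops_matmat) / flops_matmat;   % equals 1/(2n - 1)
```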


Rank of a matrix

we define the rank of A ∈ Rm×n as

rank(A) = dimR(A)

(nontrivial) facts:

• rank(A) = rank(AT )

• rank(A) is maximum number of independent columns (or rows) of A, hence rank(A) ≤ min(m,n)

• rank(A) + dimN (A) = n
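
These facts are easy to confirm numerically. The sketch below uses the base MATLAB functions rank and null on an illustrative matrix whose third row is the sum of the first two.

```matlab
A = [1 2 0 1;
     0 1 1 0;
     1 3 1 1];             % third row = first row + second row
[m, n] = size(A);

r  = rank(A);              % dimension of the range
rt = rank(A');             % rank of the transpose
d  = size(null(A), 2);     % dimension of the nullspace
```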


Application: fast matrix-vector multiplication

• need to compute matrix-vector product y = Ax, A ∈ Rm×n

• A has known factorization A = BC, B ∈ Rm×r

• computing y = Ax directly: mn operations

• computing y = Ax as y = B(Cx) (compute z = Cx first, then y = Bz): rn+mr = (m+ n)r operations

• savings can be considerable if r ≪ min{m,n}
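
The saving can be sketched in MATLAB as follows. The sizes and the random factors are illustrative assumptions; the operation counts follow the slide's convention of counting multiplications.

```matlab
m = 400; n = 300; r = 5;        % illustrative sizes with r much smaller than min(m, n)
B = randn(m, r);
C = randn(r, n);
A = B * C;                      % rank at most r
x = randn(n, 1);

y_direct = A * x;               % about m*n operations
y_fast   = B * (C * x);         % about (m + n)*r operations

ops_direct = m * n;
ops_fast   = (m + n) * r;
rel_diff = norm(y_direct - y_fast) / norm(y_direct);   % rounding error only
```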


Full rank matrices

for A ∈ Rm×n we always have rank(A) ≤ min(m,n)

we say A is full rank if rank(A) = min(m,n)

• for square matrices, full rank means nonsingular

• for skinny matrices (m ≥ n), full rank means columns are independent

• for fat matrices (m ≤ n), full rank means rows are independent


(Euclidean) norm

for x ∈ Rn we define the (Euclidean) norm as

$$\|x\| = \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2} = \sqrt{x^T x}$$

∥x∥ measures length of vector (from origin)

important properties:

• ∥αx∥ = |α|∥x∥ (homogeneity)

• ∥x+ y∥ ≤ ∥x∥+ ∥y∥ (triangle inequality)

• ∥x∥ ≥ 0 (nonnegativity)

• ∥x∥ = 0 ⇐⇒ x = 0 (definiteness)


RMS value and (Euclidean) distance

root-mean-square (RMS) value of vector x ∈ Rn:

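$$\mathrm{rms}(x) = \left( \frac{1}{n} \sum_{i=1}^{n} x_i^2 \right)^{1/2} = \frac{\|x\|}{\sqrt{n}}$$

norm defines distance between vectors: dist(x, y) = ∥x − y∥

[Figure: vectors x and y, with the difference x − y drawn between them]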
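
A short MATLAB sketch of the norm, RMS value, and distance for two illustrative vectors, using the base function norm.

```matlab
x = [3; 4; 0; 0];
y = [0; 4; 3; 0];

nx    = sqrt(x' * x);              % Euclidean norm, same value as norm(x): 5
rms_x = nx / sqrt(numel(x));       % RMS value: 2.5
d     = norm(x - y);               % distance between x and y: sqrt(18)

% Triangle inequality check.
tri_ok = norm(x + y) <= norm(x) + norm(y);
```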


Inner product

⟨x, y⟩ := x1y1 + x2y2 + · · ·+ xnyn = xTy

important properties:

• ⟨αx, y⟩ = α⟨x, y⟩

• ⟨x+ y, z⟩ = ⟨x, z⟩+ ⟨y, z⟩

• ⟨x, y⟩ = ⟨y, x⟩

• ⟨x, x⟩ ≥ 0

• ⟨x, x⟩ = 0 ⇐⇒ x = 0

f(y) = ⟨x, y⟩ is linear function : Rn → R, with linear map defined by row vector xT


Cauchy-Schwarz inequality and angle between vectors

• for any x, y ∈ Rn, |xTy| ≤ ∥x∥∥y∥

• (unsigned) angle between vectors in Rn defined as

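$$\theta = \angle(x, y) = \cos^{-1} \frac{x^T y}{\|x\| \, \|y\|}$$

[Figure: vectors x and y separated by the angle θ, with the projection (xTy/∥y∥^2) y of x onto the line through y]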

thus xTy = ∥x∥∥y∥ cos θ
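
The angle can be computed directly from this relation. In the MATLAB sketch below, the clamp to [−1, 1] is an added safeguard against rounding error and is not part of the definition; the vectors are illustrative.

```matlab
x = [1; 0; 0];
y = [1; 1; 0];

c = (x' * y) / (norm(x) * norm(y));   % cosine of the angle
c = min(1, max(-1, c));               % guard against rounding outside [-1, 1]
theta = acos(c);                      % pi/4 for these vectors

% Cauchy-Schwarz inequality check.
cs_ok = abs(x' * y) <= norm(x) * norm(y);
```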


special cases:

• x and y are aligned : θ = 0; xTy = ∥x∥∥y∥; (if x ≠ 0) y = αx for some α ≥ 0

• x and y are opposed : θ = π; xTy = −∥x∥∥y∥; (if x ≠ 0) y = −αx for some α ≥ 0

• x and y are orthogonal : θ = π/2 or −π/2; xTy = 0, denoted x ⊥ y


interpretation of xTy > 0 and xTy < 0:

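• xTy > 0 means ∠(x, y) is acute

• xTy < 0 means ∠(x, y) is obtuse

[Figure: two sketches of vectors x and y, one with xTy > 0 and one with xTy < 0]

{x | xTy ≤ 0} defines a halfspace with outward normal vector y, and boundary passing through 0

[Figure: the halfspace {x | xTy ≤ 0}, with the normal vector y drawn at the origin 0]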


MATLAB overview

• MATLAB is a high-level technical computing language and interactive environment for algorithm development, data visualization, data analysis, and numerical computing

• Fast in prototyping, good for us! — research, teaching, learning…

• A wide range of applications: signal and image processing, optimization, financial modeling and analysis, and computational biology

• Add-on toolboxes (collections of special-purpose MATLAB functions, available separately), e.g. the Statistics Toolbox

• The MATLAB language is based on vector and matrix operations

• Programming in MATLAB is faster: there is no need to declare variables, allocate memory, etc.

MATLAB tutorial: https://www.mathworks.com/academia/student_center/tutorials/mltutorial_launchpad.html?confirmation_page

Exploratory data analysis

• explore data qualitatively

• histogram

• quantile-quantile plot (q-q plot)

• box plot

• scatter plot

• surface plot

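Histogram

• obtained by creating a set of bins or intervals that cover the range of the data set

x = randn(1000,1);

figure; [counts, centers] = hist(x); bar(centers, counts, 1)

% Scale the normal density to expected bin counts: n * (bin width) * pdf.
y = -4:0.1:4; z = normpdf(y)*numel(x)*(centers(2) - centers(1));

hold on; plot(y, z)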

Frequency histogram vs relative frequency histogram

114 Computational Statistics Handbook with MATLAB

These plots are shown in Figure 5.1. Notice that the shapes of the histograms are the same in both types of histograms, but the vertical axis is different. From the shape of the histograms, it seems reasonable to assume that the data are normally distributed.

One problem with using a frequency or relative frequency histogram is that they do not represent meaningful probability densities, because they do not integrate to one. This can be seen by superimposing a corresponding normal distribution over the relative frequency histogram as shown in Figure 5.2.

A density histogram is a histogram that has been normalized so it will integrate to one. That means that if we add up the areas represented by the bars, then they should add up to one. A density histogram is given by the following equation

$$f(x) = \frac{\nu_k}{nh}, \qquad x \text{ in } B_k, \tag{5.1}$$

where $B_k$ denotes the k-th bin, $\nu_k$ represents the number of data points that fall into the k-th bin and h represents the width of the bins. In the following

[Figure: two panels, Frequency Histogram (left) and Relative Frequency Histogram (right); horizontal axis Length (inches).] On the left is a frequency histogram of the forearm data, and on the right is the relative frequency histogram. These indicate that the distribution is unimodal and that the normal distribution is a reasonable model.
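
The density histogram formula can be sketched in MATLAB as below. This is an illustration under stated assumptions, not the book's own example: it uses simulated standard normal data, 20 equal-width bins from the base function hist, and writes the normal density out by hand to avoid a toolbox call.

```matlab
x = randn(1000, 1);                 % simulated data (assumption, for illustration)
n = numel(x);

[nu, centers] = hist(x, 20);        % nu(k): number of points in bin k
h = centers(2) - centers(1);        % common bin width
fhat = nu / (n * h);                % bar heights nu_k / (n h)
area = sum(fhat * h);               % total bar area, equal to one

figure; bar(centers, fhat, 1); hold on
t = linspace(-4, 4, 200);
plot(t, exp(-t.^2 / 2) / sqrt(2*pi))   % standard normal density for comparison
hold off
```

Because the bar heights are divided by n·h, the overlaid density and the histogram share the same vertical scale.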

q-q plot

• visually compare two distributions by graphing the quantiles of one versus the quantiles of the other

• order statistics of the first data set

• order statistics of the second data set

• plot one versus the other

Chapter 5: Exploratory Data Analysis 119

one could plot a stem-and-leaf with one and with two lines per stem as a way of discovering more about the data. The stem-and-leaf is useful in that it approximates the shape of the density, and it also provides a listing of the data. One can usually recover the original data set from the stem-and-leaf (if it has not been rounded), unlike the histogram. A disadvantage of the stem-and-leaf plot is that it is not useful for large data sets, while a histogram is very effective in reducing and displaying massive data sets.

If we need to compare two distributions, then we can use the quantile plot to visually compare them. This is also applicable when we want to compare a distribution and a sample or to compare two samples. In comparing the distributions or samples, we are interested in knowing how they are shifted relative to each other. In essence, we want to know if they are distributed in the same way. This is important when we are trying to determine the distribution that generated our data, possibly with the goal of using that information to generate data for Monte Carlo simulation. Another application where this is useful is in checking model assumptions, such as normality, before we conduct our analysis.

In this part, we discuss several versions of quantile-based plots. These include quantile-quantile plots (q-q plots) and quantile plots (sometimes called a probability plot). Quantile plots for discrete data are discussed next. The quantile plot is used to compare a sample with a theoretical distribution. Typically, a q-q plot (sometimes called an empirical quantile plot) is used to determine whether two random samples are generated by the same distribution. It should be noted that the q-q plot can also be used to compare a random sample with a theoretical distribution by generating a sample from the theoretical distribution as the second sample.

The q-q plot was originally proposed by Wilk and Gnanadesikan [1968] to visually compare two distributions by graphing the quantiles of one versus the quantiles of the other. Say we have two data sets consisting of univariate measurements. We denote the order statistics for the first data set by

$$x_{(1)}, x_{(2)}, \ldots, x_{(n)}.$$

Let the order statistics for the second data set be

$$y_{(1)}, y_{(2)}, \ldots, y_{(m)},$$

with $m \le n$.



We look first at the case where the sizes of the data sets are equal, so $m = n$. In this case, we plot as points the sample quantiles of one data set versus the other data set. This is illustrated in Example 5.4. If the data sets come from the same distribution, then we would expect the points to approximately follow a straight line.

A major strength of the quantile-based plots is that they do not require the two samples (or the sample and theoretical distribution) to have the same location and scale parameter. If the distributions are the same, but differ in location or scale, then we would still expect the quantile-based plot to produce a straight line.

Example 5.4
We will generate two sets of normal random variables and construct a q-q plot. As expected, the q-q plot (Figure 5.6) follows a straight line, indicating that the samples come from the same distribution.

% Generate the random variables.
x = randn(1,75);
y = randn(1,75);
% Find the order statistics.
xs = sort(x);
ys = sort(y);
% Now construct the q-q plot.
plot(xs,ys,'o')
xlabel('X - Standard Normal')
ylabel('Y - Standard Normal')
axis equal

If we repeat the above MATLAB commands using a data set generated from an exponential distribution and one that is generated from the standard normal, then we have the plot shown in Figure 5.7. Note that the points in this q-q plot do not follow a straight line, leading us to conclude that the data are not generated from the same distribution.

We now look at the case where the sample sizes are not equal. Without loss of generality, we assume that $m < n$. To obtain the q-q plot, we graph the $y_{(i)}$, $i = 1, \ldots, m$, against the $(i - 0.5)/m$ quantile of the other data set. Note that this definition is not unique [Cleveland, 1993]. The $(i - 0.5)/m$ quantiles of the x data are usually obtained via interpolation, and we show in the next example how to use the function csquantiles to get the desired plot.

Users should be aware that q-q plots provide a rough idea of how similar the distribution is between two random samples. If the sample sizes are small, then a lot of variation is expected, so comparisons might be suspect. To help aid the visual comparison, some q-q plots include a reference line. These are lines that are estimated using the first and third quartiles $(q_{0.25}, q_{0.75})$ of each data set and extending the line to cover the range of the data. The


MATLAB Statistics Toolbox provides a function called qqplot that displays this type of plot. We show below how to add the reference line.

Example 5.5
This example shows how to do a q-q plot when the samples do not have the same number of points. We use the function csquantiles to get the required sample quantiles from the data set that has the larger sample size. We then plot these versus the order statistics of the other sample, as we did in the previous examples. Note that we add a reference line based on the first and third quartiles of each data set, using the function polyfit (see Chapter 7 for more information on this function).

% Generate the random variables.
m = 50;
n = 75;
x = randn(1,n);
y = randn(1,m);
% Find the order statistics for y.
ys = sort(y);
% Now find the associated quantiles using the x.
% Probabilities for quantiles:
p = ((1:m) - 0.5)/m;

[Figure: q-q plot; horizontal axis X - Standard Normal, vertical axis Y - Standard Normal.] This is a q-q plot of x and y where both data sets are generated from a standard normal distribution. Note that the points follow a line, as expected.
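
The excerpt of Example 5.5 stops before the quantiles of x are computed. The following MATLAB sketch shows one way the unequal-sample-size q-q plot could be finished. It is not the book's code: it swaps in the base function interp1 for csquantiles, under the assumption that the j-th order statistic of x is treated as the (j − 0.5)/n quantile, so its output need not match csquantiles exactly. The reference line through the first and third quartiles uses polyfit, as the text describes.

```matlab
m = 50; n = 75;
x = randn(1, n);
y = randn(1, m);
ys = sort(y);
p = ((1:m) - 0.5) / m;               % probabilities for the quantiles

% Quantiles of x by linear interpolation of its order statistics
% (assumption: x_(j) is the (j - 0.5)/n quantile; stands in for csquantiles).
xs = sort(x);
px = ((1:n) - 0.5) / n;
xq = interp1(px, xs, p);             % p lies inside [px(1), px(end)] since m < n

% Reference line through the first and third quartiles of each sample.
qx = interp1(px, xs, [0.25 0.75]);
qy = interp1(p,  ys, [0.25 0.75]);
coef = polyfit(qx, qy, 1);

plot(xq, ys, 'o'); hold on
plot(xq, polyval(coef, xq), '-'); hold off
xlabel('X - Standard Normal'); ylabel('Y - Standard Normal')
```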

box plot

• display the distribution of the sample

• five values from a data set are used to construct a box plot

• 3 sample quartiles

• min value

• max value

• IQR: interquartile range


Box plots (sometimes called box-and-whisker diagrams) have been in use for many years [Tukey, 1977]. As with most visualization techniques, they are used to display the distribution of a sample. Five values from a data set are used to construct the box plot. These are the three sample quartiles $(q_{0.25}, q_{0.5}, q_{0.75})$, the minimum value in the sample and the maximum value.

There are many variations of the box plot, and it is important to note that they are defined differently depending on the software package that is used. Frigge, Hoaglin and Iglewicz [1989] describe a study on how box plots are implemented in some popular statistics programs such as Minitab, S, SAS, SPSS and others. The main difference lies in how outliers and quartiles are defined. Therefore, depending on how the software calculates these, different plots might be obtained [Frigge, Hoaglin and Iglewicz, 1989].

Before we describe the box plot, we need to define some terms. Recall from Chapter 3, that the interquartile range (IQR) is the difference between the first and the third sample quartiles. This gives the range of the middle 50% of the data. It is estimated from the following

$$\widehat{IQR} = q_{0.75} - q_{0.25}. \tag{5.5}$$

[Figure: binomialness plot; horizontal axis Number of Females - k, vertical axis φ(n*k).] This shows the binomialness plot for the data in Table 5.2. From this it seems reasonable to use the binomial distribution to model the data.



Two limits are also defined: a lower limit (LL) and an upper limit (UL). These are calculated from the estimated IQR as follows

$$LL = q_{0.25} - 1.5 \cdot \widehat{IQR}$$
$$UL = q_{0.75} + 1.5 \cdot \widehat{IQR}. \tag{5.6}$$

The idea is that observations that lie outside these limits are possible outliers. Outliers are data points that lie away from the rest of the data. This might mean that the data were incorrectly measured or recorded. On the other hand, it could mean that they represent extreme points that arise naturally according to the distribution. In any event, they are sample points that are suitable for further investigation.

Adjacent values are the most extreme observations in the data set that are within the lower and the upper limits. If there are no potential outliers, then the adjacent values are simply the maximum and the minimum data points.

To construct a box plot, we place horizontal lines at each of the three quartiles and draw vertical lines to create a box. We then extend a line from the first quartile to the smallest adjacent value and do the same for the third quartile and largest adjacent value. These lines are sometimes called the whiskers. Finally, any possible outliers are shown as an asterisk or some other plotting symbol. An example of a box plot is shown in Figure 5.14.

Box plots for different samples can be plotted together for visually comparing the corresponding distributions. The MATLAB Statistics Toolbox contains a function called boxplot for creating this type of display. It displays one box plot for each column of data. When we want to compare data sets, it is better to display a box plot with notches. These notches represent the uncertainty in the locations of central tendency and provide a rough measure of the significance of the differences between the values. If the notches do not overlap, then there is evidence that the medians are significantly different. The length of the whisker is easily adjusted using optional input arguments to boxplot. For more information on this function and to find out what other options are available, type help boxplot at the MATLAB command line.

Example 5.10
In this example, we first generate random variables from a uniform distribution on the interval $(0, 1)$, a standard normal distribution, and an exponential distribution. We will then display the box plots corresponding to each sample using the MATLAB function boxplot.

% Generate a sample from the uniform distribution.
xunif = rand(100,1);
% Generate sample from the standard normal.
xnorm = randn(100,1);
% Generate a sample from the exponential distribution.
% NOTE: this function is from the Statistics Toolbox.
xexp = exprnd(1,100,1);
boxplot([xunif,xnorm,xexp],1)

It can be seen in Figure 5.15 that the box plot readily conveys the shape of the distribution. A symmetric distribution will have whiskers with approximately equal lengths, and the two sides of the box will also be approximately equal. This would be the case for the uniform or normal distribution. A skewed distribution will have one side of the box and whisker longer than the other. This is seen in Figure 5.15 for the exponential distribution. If the interquartile range is small, then the data in the middle are packed around the median. Conversely, if it is large, then the middle 50% of the data are widely dispersed.

[Figure: box plot; horizontal axis Column Number, vertical axis Values; annotations mark the Quartiles, the Adjacent Values, and the Possible Outliers.] An example of a box plot with possible outliers shown as points.

smallest data point within 1.5 IQR from the first quartile

largest data point within 1.5 IQR from the third quartile

x = randn(1000,1); figure; boxplot(x)
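
The quantities that a box plot displays can also be computed by hand. The MATLAB sketch below is an illustration under stated assumptions: quartiles are obtained by interpolating the order statistics with interp1, which is only one of several quartile definitions and may differ slightly from what boxplot uses, and the value 6 is appended to the simulated data so that at least one possible outlier appears.

```matlab
x = [randn(200, 1); 6];                 % simulated data plus one planted extreme value
xs = sort(x);
n = numel(xs);

% Sample quartiles by interpolation (one of several possible definitions).
pk = ((1:n)' - 0.5) / n;
q = interp1(pk, xs, [0.25 0.5 0.75]);

iqr_hat = q(3) - q(1);                  % estimated interquartile range
LL = q(1) - 1.5 * iqr_hat;              % lower limit
UL = q(3) + 1.5 * iqr_hat;              % upper limit

inside = xs(xs >= LL & xs <= UL);
adj = [min(inside), max(inside)];       % adjacent values (whisker ends)
outliers = xs(xs < LL | xs > UL);       % possible outliers
```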

scatter plot

• display each pair of data (xi, yi) as points using some plotting symbol in a two-dimensional coordinate system

x = randn(100,1); y = randn(100,1); figure; scatter(x,y)

x = randn(100,1); y = x*2 + randn(100,1); figure; scatter(x,y,'r*')

multiple variables

[Figure, from a page headed "4.4 Logistic Regression": scatterplot matrix with one panel for each pair of variables. Variable labels recovered from the panels include sbp, tobacco, ldl, famhist, obesity, and alcohol.]

ooo

oooo

o

o

o

oo

ooo

o

oo

o

o

o oo

oo

o

o

o

oo

o

o

o

oo

oo o o

o

o

oo

ooo

oo

o

o

o

ooo

oo

ooo

oo

o

o

o

o

o o

o

o

ooo

ooo

o

ooo

o

ooo

o

oo

oooo

oo

o

o

o

oo

ooo

oooo

oo

o ooo

oo

o

o

oo oo

ooo

o

o

o

oo

o

o

o

o

oo

oo

o

o

oooo

o o

oo o

oo

ooo

ooooooo

o

o

o

o oo

o

oo

o

oo

o

o

o

oo

o

o

oo

o

o

oooo

o

ooo

oo

o

o

oooo

o

oo

oo

oo

oo

ooo

o

oo

o

oo

o

oo

o

o

o

oo

o

o

o

oo

o

o

ooo

o

o

o

o

o

o

oooo

o

oo

o oo

o o

oo

o

oo

o

oo

o

o oo o

o

o o

o

o

o

oo

o

ooooo

o

oo

o

o

o

o

o

o

o

ooo

o

oooo

o

o

oooo

o

o

o

oo

ooo

oooo

o o

o

o

o ooo

oo

o

oo o o

oo

o

oo o

o

oo

oooooooo

ooo

o

o

ooo

o

oo

o

oo

oo

oo

oo oo

oo

20 40 60

2040

60

age

FIGURE 4.12. A scatterplot matrix of the South African heart disease data.Each plot shows a pair of risk factors, and the cases and controls are color coded(red is a case). The variable family history of heart disease (famhist) is binary(yes or no).

3D plot

• use 3D plots to view a surface

Contour Plot

[x, y, z] = peaks; figure; c = contour(x, y, z); clabel(c)

figure; surf(x, y, z)

figure; mesh(x, y, z)

Matlab basics

• http://www.mathworks.com/moler/intro.pdf

>> a = 1
a =
     1

>> b = 1;

>> c = [1 2 3]
c =
     1     2     3

>> c+2
ans =
     3     4     5

EE263 RS1 6

Matlab basics (contd...)

>> c
c =
     1     2     3

>> d = [4;5;6]
d =
     4
     5
     6

>> c*d
ans =
    32

>> e = d.'
e =
     4     5     6


Matlab basics (contd...)

>> c+e
ans =
     5     7     9

>> c.*e
ans =
     4    10    18

>> A = [1 2; 3 4]
A =
     1     2
     3     4

>> A*[1;1]
ans =
     3
     7


Matlab basics (contd...)

>> A(2,1)
ans =
     3

>> A(:,1)
ans =
     1
     3

>> A(2,:)
ans =
     3     4

>> t = 0:2:10
t =
     0     2     4     6     8    10


Table as a matrix

• nutrition chart

                      Vegetable
                x1      x2      · · ·   xn
          y1    0.50    0.75    · · ·   0.9
Nutrient  ...   ...     ...     ...     ...
          ym    2.05    0.01    · · ·   0.45

• vector x ∈ R^n is the vegetable diet; x_j is the amount of vegetable j

• vector y ∈ R^m is the nutrients; y_i is the amount of nutrient i

• y = Ax gives the nutrients as a function of the vegetable diet

• A_ij = amount of nutrient i in 1 unit of vegetable j
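As a small illustration of y = Ax for the nutrition chart, the sketch below uses a hypothetical chart with 2 nutrients and 3 vegetables. The entries of A reuse numbers that appear in the table above, but their arrangement and the diet vector x are invented for the example.

```matlab
% Hypothetical chart: 2 nutrients (rows), 3 vegetables (columns).
% The layout and the diet x are assumptions made for illustration.
A = [0.50 0.75 0.90;    % nutrient 1 per unit of each vegetable
     2.05 0.01 0.45];   % nutrient 2 per unit of each vegetable
x = [2; 1; 4];          % units of each vegetable in the diet

y = A*x;                % total amount of each nutrient

% The same result written out as y_i = sum over j of A_ij * x_j
y_check = zeros(2,1);
for i = 1:2
    for j = 1:3
        y_check(i) = y_check(i) + A(i,j)*x(j);
    end
end

disp(y)
```

The double loop is only there to make the meaning of A_ij explicit; in practice the single product A*x is the idiomatic form.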


Examples

• x ∈ R^n

• find A for which y = Ax is the running average of x, i.e.,

$$ y_i = \frac{1}{i} \sum_{j=1}^{i} x_j, \qquad i = 1, \ldots, n $$

Solution.

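$$
\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}
=
\begin{bmatrix}
1 & 0 & 0 & 0 & \cdots & 0 \\
1/2 & 1/2 & 0 & 0 & \cdots & 0 \\
1/3 & 1/3 & 1/3 & 0 & \cdots & 0 \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
1/n & 1/n & 1/n & 1/n & \cdots & 1/n
\end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
$$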


Examples

• Creating A in Matlab

n=5;
A = zeros(n,n);
for i=1:n
    A(i,1:i) = 1/i;
end
A

A =

    1.0000         0         0         0         0
    0.5000    0.5000         0         0         0
    0.3333    0.3333    0.3333         0         0
    0.2500    0.2500    0.2500    0.2500         0
    0.2000    0.2000    0.2000    0.2000    0.2000
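The same matrix can also be built without a loop. The sketch below is one possible alternative, not the construction used on the slide: it scales the rows of a lower-triangular matrix of ones, then checks y = Ax against a direct running average. The test vector x is made up for the check.

```matlab
n = 5;

% Lower-triangular ones, with row i scaled by 1/i
A = diag(1 ./ (1:n)) * tril(ones(n));

% Check against a direct running average on a made-up vector
x = [2; 4; 6; 8; 10];
y = A*x;                        % running average via the matrix
y_ref = cumsum(x) ./ (1:n)';    % running average computed directly

disp([y y_ref])
```

Both columns of the displayed result should agree, since row i of A averages the first i entries of x.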


• point estimator

– likelihood function $\ell(\theta|x) = f(x|\theta)$
– maximum likelihood $\hat{\theta}(x) = \arg\max_{\theta} \ell(\theta|x)$

• hypothesis testing

– hypotheses: $H_0 : x_1, \ldots, x_n \sim f_0$ versus $H_1 : x_1, \ldots, x_n \sim f_1$

– likelihood ratio test

$$ \ell(x_1, \ldots, x_n) = \sum_{i=1}^{n} \log \frac{f_1(x_i)}{f_0(x_i)} $$

Claim $H_1$ if $\ell(x_1, \ldots, x_n) > b$, where $b$ is a threshold

• confidence interval: $\hat{\theta} \in [\theta - z_\alpha, \theta + z_\alpha]$ with probability $1 - \alpha$
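
The three ideas above can be sketched numerically. Everything specific here is an assumption made for illustration and does not come from the slides: the data are modeled as Gaussian with unknown mean θ and unit variance, f0 and f1 are the densities with means 0 and 1, the threshold is b = 0, the sample is made up, and z_α is taken to be the 95% half-width 1.96/sqrt(n) under that model.

```matlab
% Assumed model (illustrative only): x_i ~ N(theta, 1)
x = [0.8; 1.3; 0.4; 1.9; 1.1];      % made-up sample
n = numel(x);

% Gaussian density with unit variance, written out to avoid toolbox calls
f = @(x, theta) exp(-(x - theta).^2 / 2) / sqrt(2*pi);

% Point estimate: maximize the likelihood by minimizing the
% negative log-likelihood numerically
negloglik = @(theta) -sum(log(f(x, theta)));
theta_hat = fminsearch(negloglik, 0);   % closed form for this model: mean(x)

% Likelihood ratio test of H0 (mean 0) against H1 (mean 1)
llr = sum(log(f(x, 1) ./ f(x, 0)));
b = 0;                                  % assumed threshold
claimH1 = llr > b;

% Interval of half-width z around the estimate (95% level, known variance)
z = 1.96 / sqrt(n);
ci = [theta_hat - z, theta_hat + z];

fprintf('theta_hat = %.4f, llr = %.4f, claim H1 = %d\n', theta_hat, llr, claimH1);
fprintf('interval = [%.4f, %.4f]\n', ci(1), ci(2));
```

For this model the log-likelihood ratio simplifies to the sum of (x_i - 1/2), so the test reduces to comparing the sample mean with a constant. The numerical maximizer is used only to mirror the argmax definition.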

Prof. Yao Xie, ISyE 6416, Computational Statistics, Georgia Tech 6