Chapter 1: Introduction 3
include a section containing references that explain the theoretical concepts associated with the methods covered in that chapter.
In this book, we cover some of the most commonly used techniques in computational statistics. While we cannot include all methods that might be a part of computational statistics, we try to present those that have been in use for several years.
Since the focus of this book is on the implementation of the methods, we include algorithmic descriptions of the procedures. We also provide examples that illustrate the use of the algorithms in data analysis. It is our hope that seeing how the techniques are implemented will help the reader understand the concepts and facilitate their use in data analysis.
Some background information is given in Chapters 2, 3, and 4 for those who might need a refresher in probability and statistics. In Chapter 2, we discuss some of the general concepts of probability theory, focusing on how they
Comparison Between Traditional Statistics and Computational Statistics [Wegman, 1988]. Reprinted with permission from the Journal of the Washington Academy of Sciences.
Traditional Statistics | Computational Statistics
Small to moderate sample size | Large to very large sample size
Independent, identically distributed data sets | Nonhomogeneous data sets
One or low dimensional | High dimensional
Manually computational | Computationally intensive
Mathematically tractable | Numerically tractable
Well focused questions | Imprecise questions
Strong unverifiable assumptions: relationships (linearity, additivity); error structures (normality) | Weak or no assumptions: relationships (nonlinearity); error structures (distribution free)
Statistical inference | Structural inference
Predominantly closed form algorithms | Iterative algorithms possible
Statistical optimality | Statistical robustness
© 2002 by Chapman & Hall/CRC
Vector spaces
a vector space or linear space (over the reals) consists of
• a set V
• a vector sum + : V × V → V
• a scalar multiplication : R× V → V
• a distinguished element 0 ∈ V
which satisfy a list of properties
Linear algebra review 3–2
• x + y = y + x, ∀x, y ∈ V (+ is commutative)
• (x + y) + z = x + (y + z), ∀x, y, z ∈ V (+ is associative)
• 0 + x = x, ∀x ∈ V (0 is additive identity)
• ∀x ∈ V ∃(−x) ∈ V s.t. x + (−x) = 0 (existence of additive inverse)
• (αβ)x = α(βx), ∀α, β ∈ R ∀x ∈ V (scalar mult. is associative)
• α(x + y) = αx + αy, ∀α ∈ R ∀x, y ∈ V (right distributive rule)
• (α + β)x = αx + βx, ∀α, β ∈ R ∀x ∈ V (left distributive rule)
• 1x = x, ∀x ∈ V
Examples
• V1 = R^n, with standard (componentwise) vector addition and scalar multiplication
• V2 = {0} (where 0 ∈ R^n)
• V3 = span(v1, v2, . . . , vk) where
span(v1, v2, . . . , vk) = {α1v1 + · · · + αkvk | αi ∈ R}
and v1, . . . , vk ∈ R^n
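A minimal sketch of how membership in a span can be checked numerically: a vector w lies in span(v1, . . . , vk) exactly when appending w to the matrix of the vi does not raise its rank. The vectors below are hypothetical examples, not taken from the slides.

```matlab
% Hypothetical example vectors in R^3 (not from the slides).
v1 = [1; 0; 1];
v2 = [0; 1; 1];
V  = [v1, v2];            % columns span the subspace

w_in  = 2*v1 - 3*v2;      % a linear combination, so it lies in the span
w_out = [0; 0; 1];        % not a combination of v1 and v2

% w is in span(v1, v2) exactly when appending it does not raise the rank.
in_span  = rank([V, w_in])  == rank(V);
out_span = rank([V, w_out]) == rank(V);
fprintf('w_in in span: %d, w_out in span: %d\n', in_span, out_span);
```

The rank test relies on a numerical tolerance inside rank, so it is only a rough check for badly scaled data.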
Subspaces
• a subspace of a vector space is a subset of a vector space which is itself a vector space
• roughly speaking, a subspace is closed under vector addition and scalar multiplication
• examples V1, V2, V3 above are subspaces of R^n
Vector spaces of functions
• V4 = {x : R_+ → R^n | x is differentiable}, where vector sum is sum of functions:
(x + z)(t) = x(t) + z(t)
and scalar multiplication is defined by
(αx)(t) = αx(t)
(a point in V4 is a trajectory in R^n)
• V5 = {x ∈ V4 | ẋ = Ax}
(points in V5 are trajectories of the linear system ẋ = Ax)
• V5 is a subspace of V4
Independent set of vectors
a set of vectors {v1, v2, . . . , vk} is independent if
α1v1 + α2v2 + · · ·+ αkvk = 0 =⇒ α1 = α2 = · · · = 0
some equivalent conditions:
• coefficients of α1v1 + α2v2 + · · ·+ αkvk are uniquely determined, i.e.,
α1v1 + α2v2 + · · ·+ αkvk = β1v1 + β2v2 + · · ·+ βkvk
implies α1 = β1,α2 = β2, . . . ,αk = βk
• no vector vi can be expressed as a linear combination of the other vectors v1, . . . , vi−1, vi+1, . . . , vk
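As a rough numerical counterpart to the definition, a set of k vectors stored as matrix columns is independent exactly when the matrix has rank k. The matrices below are hypothetical examples, and is_independent is a helper name introduced only for this sketch.

```matlab
% Hypothetical example vectors, stored as the columns of a matrix.
V_ind = [1 0; 0 1; 1 1];            % two independent vectors in R^3
V_dep = [1 0 1; 0 1 1; 1 1 2];      % third column = first + second

% A set of k vectors is independent exactly when the matrix has rank k.
is_independent = @(V) rank(V) == size(V, 2);

ind_result = is_independent(V_ind);
dep_result = is_independent(V_dep);
fprintf('V_ind independent: %d, V_dep independent: %d\n', ind_result, dep_result);
```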
Basis and dimension
set of vectors {v1, v2, . . . , vk} is a basis for a vector space V if
• v1, v2, . . . , vk span V , i.e., V = span(v1, v2, . . . , vk)
• {v1, v2, . . . , vk} is independent
equivalent: every v ∈ V can be uniquely expressed as
v = α1v1 + · · ·+ αkvk
fact: for a given vector space V , the number of vectors in any basis is the same
number of vectors in any basis is called the dimension of V , denoted dimV
(we assign dim{0} = 0, and dimV = ∞ if there is no basis)
Nullspace of a matrix
the nullspace of A ∈ R^{m×n} is defined as
N (A) = { x ∈ R^n | Ax = 0 }
• N (A) is set of vectors mapped to zero by y = Ax
• N (A) is set of vectors orthogonal to all rows of A
N (A) gives ambiguity in x given y = Ax:
• if y = Ax and z ∈ N (A), then y = A(x + z)
• conversely, if y = Ax and y = Ax̃, then x̃ = x + z for some z ∈ N (A)
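The ambiguity statement can be illustrated with a short sketch, assuming a hypothetical rank-1 matrix: null returns an orthonormal basis of the nullspace, and adding any nullspace element to x leaves y = Ax unchanged up to rounding.

```matlab
A = [1 2 3; 2 4 6];      % hypothetical rank-1 matrix, so dim N(A) = 3 - 1 = 2
Z = null(A);             % columns form an orthonormal basis for N(A)

x = [1; 1; 1];
z = Z * [2; -1];         % an arbitrary element of the nullspace

y1 = A * x;
y2 = A * (x + z);        % same measurement: x and x + z are indistinguishable
ambiguity_gap = norm(y1 - y2);
fprintf('dim N(A) = %d, ||Ax - A(x+z)|| = %.2e\n', size(Z, 2), ambiguity_gap);
```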
Zero nullspace
A is called one-to-one if 0 is the only element of its nullspace: N (A) = {0} ⇐⇒
• x can always be uniquely determined from y = Ax (i.e., the linear transformation y = Ax doesn’t ‘lose’ information)
• mapping from x to Ax is one-to-one: different x’s map to different y’s
• columns of A are independent (hence, a basis for their span)
• A has a left inverse, i.e., there is a matrix B ∈ R^{n×m} s.t. BA = I
• det(A^T A) ≠ 0
(we’ll establish these later)
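A small sketch of the left-inverse condition, assuming a hypothetical skinny matrix with independent columns. The choice B = (A^T A)^{-1} A^T is one particular left inverse (others exist when m > n); it is used here only as an illustration.

```matlab
A = [1 0; 0 1; 1 1];         % hypothetical skinny matrix with independent columns
G = A' * A;                  % Gram matrix; det(G) is nonzero when N(A) = {0}
B = G \ A';                  % one particular left inverse, (A'A)^(-1) A'

left_inverse_error = norm(B * A - eye(2));
x = [3; -2];
x_recovered = B * (A * x);   % x is recovered from y = Ax
fprintf('det(A''A) = %g, ||BA - I|| = %.2e\n', det(G), left_inverse_error);
```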
Interpretations of nullspace
suppose z ∈ N (A)
y = Ax represents measurement of x
• z is undetectable from sensors — get zero sensor readings
• x and x+ z are indistinguishable from sensors: Ax = A(x+ z)
N (A) characterizes ambiguity in x from measurement y = Ax
y = Ax represents output resulting from input x
• z is an input with no result
• x and x+ z have same result
N (A) characterizes freedom of input choice for given result
Range of a matrix
the range of A ∈ R^{m×n} is defined as
R(A) = {Ax | x ∈ R^n} ⊆ R^m
R(A) can be interpreted as
• the set of vectors that can be ‘hit’ by linear mapping y = Ax
• the span of columns of A
• the set of vectors y for which Ax = y has a solution
Onto matrices
A is called onto if R(A) = R^m ⇐⇒
• Ax = y can be solved in x for any y
• columns of A span R^m
• A has a right inverse, i.e., there is a matrix B ∈ R^{n×m} s.t. AB = I
• rows of A are independent
• N (A^T) = {0}
• det(AA^T) ≠ 0
(some of these are not obvious; we’ll establish them later)
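The right-inverse condition can be sketched the same way, assuming a hypothetical fat matrix with independent rows. B = A^T (AA^T)^{-1} is one particular right inverse, chosen here for illustration.

```matlab
A = [1 0 1; 0 1 1];          % hypothetical fat matrix with independent rows
B = A' / (A * A');           % one particular right inverse, A'(AA')^(-1)

y = [4; -1];                 % any target in R^2 can be reached
x = B * y;                   % one solution of Ax = y (not the only one)
right_inverse_error = norm(A * B - eye(2));
residual = norm(A * x - y);
fprintf('||AB - I|| = %.2e, ||Ax - y|| = %.2e\n', right_inverse_error, residual);
```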
Interpretations of range
suppose v ∈ R(A), w ∉ R(A)
y = Ax represents measurement of x
• y = v is a possible or consistent sensor signal
• y = w is impossible or inconsistent; sensors have failed or model is wrong
y = Ax represents output resulting from input x
• v is a possible result or output
• w cannot be a result or output
R(A) characterizes the possible results or achievable outputs
Inverse
A ∈ R^{n×n} is invertible or nonsingular if det A ≠ 0
equivalent conditions:
• columns of A are a basis for R^n
• rows of A are a basis for R^n
• y = Ax has a unique solution x for every y ∈ R^n
• A has a (left and right) inverse denoted A^{-1} ∈ R^{n×n}, with AA^{-1} = A^{-1}A = I
• N (A) = {0}
• R(A) = R^n
• det A^T A = det AA^T ≠ 0
Interpretations of inverse
suppose A ∈ R^{n×n} has inverse B = A^{-1}
• mapping associated with B undoes mapping associated with A (applied either before or after!)
• x = By is a perfect (pre- or post-) equalizer for the channel y = Ax
• x = By is unique solution of Ax = y
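A minimal sketch of the unique-solution statement, assuming a hypothetical nonsingular 2×2 matrix. It computes x both with the backslash operator, which solves the system without forming the inverse, and as x = By with B = inv(A); the two should agree up to rounding.

```matlab
A = [2 1; 1 3];            % hypothetical nonsingular matrix, det(A) = 5
y = [3; 5];

x_backslash = A \ y;       % solves Ax = y without forming the inverse
x_inverse   = inv(A) * y;  % x = By with B = A^(-1)

fprintf('det(A) = %g, x = [%g; %g]\n', det(A), x_backslash(1), x_backslash(2));
```

In practice the backslash form is usually preferred to an explicit inverse for both speed and accuracy.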
Matrix structure and algorithm complexity
cost (execution time) of solving Ax = b with A ∈ R^{n×n}
• for general methods, grows as n^3
• less if A is structured (banded, sparse, Toeplitz, . . . )
flop counts
• flop (floating-point operation): one addition, subtraction, multiplication, or division of two floating-point numbers
• to estimate complexity of an algorithm: express number of flops as a (polynomial) function of the problem dimensions, and simplify by keeping only the leading terms
• not an accurate predictor of computation time on modern computers
• useful as a rough estimate of complexity
Numerical linear algebra background 9–2
vector-vector operations (x, y ∈ R^n)
• inner product x^T y: 2n − 1 flops (or 2n if n is large)
• sum x + y, scalar multiplication αx: n flops
matrix-vector product y = Ax with A ∈ R^{m×n}
• m(2n − 1) flops (or 2mn if n large)
• 2N if A is sparse with N nonzero elements
• 2p(n + m) if A is given as A = UV^T, U ∈ R^{m×p}, V ∈ R^{n×p}
matrix-matrix product C = AB with A ∈ R^{m×n}, B ∈ R^{n×p}
• mp(2n − 1) flops (or 2mnp if n large)
• less if A and/or B are sparse
• (1/2)m(m + 1)(2n − 1) ≈ m^2 n if m = p and C symmetric
Rank of a matrix
we define the rank of A ∈ R^{m×n} as
rank(A) = dim R(A)
(nontrivial) facts:
• rank(A) = rank(A^T)
• rank(A) is maximum number of independent columns (or rows) of A, hence rank(A) ≤ min(m, n)
• rank(A) + dim N (A) = n
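The rank facts can be checked numerically on a small case. The sketch below assumes a hypothetical 3×3 matrix with one dependent row and uses rank and null to confirm rank(A) = rank(A^T) and rank(A) + dim N (A) = n.

```matlab
A = [1 2 3; 2 4 6; 1 0 1];     % hypothetical 3x3 matrix; row 2 = 2 * row 1
[m, n] = size(A);

r        = rank(A);            % dim R(A)
r_transp = rank(A');           % equals rank(A)
null_dim = size(null(A), 2);   % dim N(A)

fprintf('rank(A) = %d, rank(A'') = %d, rank + dim N(A) = %d (n = %d)\n', ...
        r, r_transp, r + null_dim, n);
```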
Application: fast matrix-vector multiplication
• need to compute matrix-vector product y = Ax, A ∈ R^{m×n}
• A has known factorization A = BC, B ∈ R^{m×r}
• computing y = Ax directly: mn operations
• computing y = Ax as y = B(Cx) (compute z = Cx first, then y = Bz): rn + mr = (m + n)r operations
• savings can be considerable if r ≪ min{m, n}
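The factored product can be sketched as follows, assuming hypothetical sizes m = 400, n = 300, r = 5 and random factors. The full matrix A is formed only so the two results can be compared; the operation counts printed are the rough counts used on the slide.

```matlab
rng(0);                          % fixed seed so the run is repeatable
m = 400; n = 300; r = 5;         % hypothetical sizes with r << min(m, n)
B = rand(m, r);
C = rand(r, n);
A = B * C;                       % formed here only to compare against
x = rand(n, 1);

y_direct = A * x;                % about m*n multiplications
y_fast   = B * (C * x);          % about (m + n)*r multiplications

rel_err = norm(y_direct - y_fast) / norm(y_direct);
fprintf('direct: %d ops, factored: %d ops, relative error %.2e\n', ...
        m*n, (m + n)*r, rel_err);
```

The parentheses in B * (C * x) matter: without them MATLAB evaluates (B * C) * x and the savings are lost.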
Full rank matrices
for A ∈ R^{m×n} we always have rank(A) ≤ min(m, n)
we say A is full rank if rank(A) = min(m,n)
• for square matrices, full rank means nonsingular
• for skinny matrices (m ≥ n), full rank means columns are independent
• for fat matrices (m ≤ n), full rank means rows are independent
(Euclidean) norm
for x ∈ R^n we define the (Euclidean) norm as
\|x\| = \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2} = \sqrt{x^T x}
∥x∥ measures length of vector (from origin)
important properties:
• ∥αx∥ = |α|∥x∥ (homogeneity)
• ∥x+ y∥ ≤ ∥x∥+ ∥y∥ (triangle inequality)
• ∥x∥ ≥ 0 (nonnegativity)
• ∥x∥ = 0 ⇐⇒ x = 0 (definiteness)
RMS value and (Euclidean) distance
root-mean-square (RMS) value of vector x ∈ R^n:
\mathrm{rms}(x) = \left( \frac{1}{n} \sum_{i=1}^{n} x_i^2 \right)^{1/2} = \frac{\|x\|}{\sqrt{n}}
norm defines distance between vectors: dist(x, y) = ∥x − y∥
[Figure: vectors x and y, with the difference x − y drawn between them]
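A short sketch of these quantities in MATLAB, using hypothetical vectors in R^2. The variable names norm_x, rms_x and d_xy are introduced only for this example.

```matlab
x = [3; 4];                          % hypothetical vectors in R^2
y = [0; 1];

norm_x = norm(x);                    % Euclidean norm, sqrt(x'*x)
rms_x  = norm(x) / sqrt(numel(x));   % same as sqrt(mean(x.^2))
d_xy   = norm(x - y);                % Euclidean distance between x and y

fprintf('||x|| = %g, rms(x) = %.4f, dist(x, y) = %.4f\n', norm_x, rms_x, d_xy);
```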
Inner product
⟨x, y⟩ := x1y1 + x2y2 + · · · + xnyn = x^T y
important properties:
• ⟨αx, y⟩ = α⟨x, y⟩
• ⟨x+ y, z⟩ = ⟨x, z⟩+ ⟨y, z⟩
• ⟨x, y⟩ = ⟨y, x⟩
• ⟨x, x⟩ ≥ 0
• ⟨x, x⟩ = 0 ⇐⇒ x = 0
f(y) = ⟨x, y⟩ is linear function : R^n → R, with linear map defined by row vector x^T
Cauchy-Schwarz inequality and angle between vectors
• for any x, y ∈ R^n, |x^T y| ≤ ∥x∥∥y∥
• (unsigned) angle between vectors in R^n defined as
\theta = \angle(x, y) = \cos^{-1} \frac{x^T y}{\|x\| \, \|y\|}
[Figure: vectors x and y separated by the angle θ, with the projection (x^T y / ∥y∥^2) y marked along y]
thus x^T y = ∥x∥∥y∥ cos θ
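The angle formula can be sketched as below for two hypothetical vectors. The clamp to [−1, 1] is an added safeguard against rounding, not part of the slide's definition.

```matlab
x = [1; 0];                                  % hypothetical vectors in R^2
y = [1; 1];

c = (x' * y) / (norm(x) * norm(y));          % cosine of the angle
c = max(-1, min(1, c));                      % guard against rounding outside [-1, 1]
theta = acos(c);                             % unsigned angle in [0, pi]

cs_holds = abs(x' * y) <= norm(x) * norm(y) + eps;   % Cauchy-Schwarz check
fprintf('theta = %.4f rad, Cauchy-Schwarz holds: %d\n', theta, cs_holds);
```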
special cases:
• x and y are aligned: θ = 0; x^T y = ∥x∥∥y∥; (if x ≠ 0) y = αx for some α ≥ 0
• x and y are opposed: θ = π; x^T y = −∥x∥∥y∥; (if x ≠ 0) y = −αx for some α ≥ 0
• x and y are orthogonal: θ = π/2 or −π/2; x^T y = 0; denoted x ⊥ y
interpretation of x^T y > 0 and x^T y < 0:
• x^T y > 0 means ∠(x, y) is acute
• x^T y < 0 means ∠(x, y) is obtuse
[Figure: two sketches of vectors x and y, one with x^T y > 0 and one with x^T y < 0]
{x | x^T y ≤ 0} defines a halfspace with outward normal vector y, and boundary passing through 0
[Figure: the halfspace {x | x^T y ≤ 0}, with the origin 0 on its boundary and outward normal y]
MATLAB overview
• MATLAB is a high-level technical computing language and interactive environment for algorithm development, data visualization, data analysis, and numerical computing
• Fast for prototyping, which suits research, teaching, and learning
• A wide range of applications: signal and image processing, optimization, financial modeling and analysis, and computational biology
• Add-on toolboxes (collections of special-purpose MATLAB functions, available separately), e.g. the Statistics Toolbox
• The MATLAB language is based on vector and matrix operations
• Programming in MATLAB is faster: there is no need to declare variables, allocate memory, etc.
MATLAB tutorial
https://www.mathworks.com/academia/student_center/tutorials/mltutorial_launchpad.html?confirmation_page
Exploratory data analysis
• explore data qualitatively
• histogram
• quantile-quantile plot (q-q plot)
• box plot
• scatter plot
• surface plot
Histogram
• obtained by creating a set of bins or intervals that cover the range of the data set
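x = randn(1000,1);        % 1000 standard normal samples
figure; hist(x)           % frequency histogram with the default bins
y = -4:0.1:4;             % grid on which to evaluate the density
z = normpdf(y)*1000;      % standard normal pdf scaled by the sample size (normpdf: Statistics Toolbox)
hold on; plot(y, z)       % overlay the scaled density on the histogram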
Frequency histogram vs relative frequency histogram
114 Computational Statistics Handbook with MATLAB
These plots are shown in Figure 5.1. Notice that the shapes of the histograms are the same in both types of histograms, but the vertical axis is different. From the shape of the histograms, it seems reasonable to assume that the data are normally distributed.
One problem with using a frequency or relative frequency histogram is that they do not represent meaningful probability densities, because they do not integrate to one. This can be seen by superimposing a corresponding normal distribution over the relative frequency histogram as shown in Figure 5.2.
A density histogram is a histogram that has been normalized so it will integrate to one. That means that if we add up the areas represented by the bars, then they should add up to one. A density histogram is given by the following equation
f(x) = \frac{\nu_k}{nh}, \quad x \text{ in } B_k,   (5.1)
where B_k denotes the k-th bin, \nu_k represents the number of data points that fall into the k-th bin and h represents the width of the bins. In the following
[Figure: two panels, "Frequency Histogram" and "Relative Frequency Histogram"; horizontal axis: Length (inches)]
On the left is a frequency histogram of the forearm data, and on the right is the relative frequency histogram. These indicate that the distribution is unimodal and that the normal distribution is a reasonable model.
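The normalization in equation (5.1) can be sketched as follows. This is an illustration under stated assumptions, not the book's code: the sample is a hypothetical standard normal sample, the bin count of 10 is arbitrary, and the normal density is written out by hand so that no toolbox function is needed.

```matlab
rng(1);                                   % fixed seed so the run is repeatable
x = randn(1000, 1);                       % hypothetical sample
n = numel(x);

[nu, centers] = hist(x, 10);              % nu(k) = count in bin k (10 equal-width bins)
h = centers(2) - centers(1);              % common bin width
f_dens = nu / (n * h);                    % density histogram heights, nu_k / (n h)

total_area = sum(f_dens * h);             % bar areas sum to one

figure; bar(centers, f_dens, 1);          % full-width bars touch each other
t = -4:0.1:4;
hold on; plot(t, exp(-t.^2 / 2) / sqrt(2*pi));   % standard normal density
fprintf('total area = %.6f\n', total_area);
```

Because the heights are counts divided by nh, the overlaid density and the bars share the same vertical scale, which is not the case for frequency or relative frequency histograms.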
q-q plot
• visually compare two distributions by graphing the quantiles of one versus the quantiles of the other
• order statistics for the first data set
• order statistics for the second data set
• plot one versus the other
Chapter 5: Exploratory Data Analysis 119
one could plot a stem-and-leaf with one and with two lines per stem as a way of discovering more about the data. The stem-and-leaf is useful in that it approximates the shape of the density, and it also provides a listing of the data. One can usually recover the original data set from the stem-and-leaf (if it has not been rounded), unlike the histogram. A disadvantage of the stem-and-leaf plot is that it is not useful for large data sets, while a histogram is very effective in reducing and displaying massive data sets.
If we need to compare two distributions, then we can use the quantile plot to visually compare them. This is also applicable when we want to compare a distribution and a sample or to compare two samples. In comparing the distributions or samples, we are interested in knowing how they are shifted relative to each other. In essence, we want to know if they are distributed in the same way. This is important when we are trying to determine the distribution that generated our data, possibly with the goal of using that information to generate data for Monte Carlo simulation. Another application where this is useful is in checking model assumptions, such as normality, before we conduct our analysis.
In this part, we discuss several versions of quantile-based plots. These include quantile-quantile plots (q-q plots) and quantile plots (sometimes called a probability plot). Quantile plots for discrete data are discussed next. The quantile plot is used to compare a sample with a theoretical distribution. Typically, a q-q plot (sometimes called an empirical quantile plot) is used to determine whether two random samples are generated by the same distribution. It should be noted that the q-q plot can also be used to compare a random sample with a theoretical distribution by generating a sample from the theoretical distribution as the second sample.
The q-q plot was originally proposed by Wilk and Gnanadesikan [1968] to visually compare two distributions by graphing the quantiles of one versus the quantiles of the other. Say we have two data sets consisting of univariate measurements. We denote the order statistics for the first data set by
x_{(1)}, x_{(2)}, \ldots, x_{(n)}.
Let the order statistics for the second data set be
y_{(1)}, y_{(2)}, \ldots, y_{(m)},
with m \le n.
We look first at the case where the sizes of the data sets are equal, so m = n. In this case, we plot as points the sample quantiles of one data set versus the other data set. This is illustrated in Example 5.4. If the data sets come from the same distribution, then we would expect the points to approximately follow a straight line.
A major strength of the quantile-based plots is that they do not require the two samples (or the sample and theoretical distribution) to have the same location and scale parameter. If the distributions are the same, but differ in location or scale, then we would still expect the quantile-based plot to produce a straight line.
Example 5.4
We will generate two sets of normal random variables and construct a q-q plot. As expected, the q-q plot (Figure 5.6) follows a straight line, indicating that the samples come from the same distribution.
% Generate the random variables.
x = randn(1,75);
y = randn(1,75);
% Find the order statistics.
xs = sort(x);
ys = sort(y);
% Now construct the q-q plot.
plot(xs,ys,'o')
xlabel('X - Standard Normal')
ylabel('Y - Standard Normal')
axis equal
If we repeat the above MATLAB commands using a data set generated from an exponential distribution and one that is generated from the standard normal, then we have the plot shown in Figure 5.7. Note that the points in this q-q plot do not follow a straight line, leading us to conclude that the data are not generated from the same distribution.
We now look at the case where the sample sizes are not equal. Without loss of generality, we assume that m < n. To obtain the q-q plot, we graph the y_{(i)}, i = 1, \ldots, m, against the (i − 0.5)/m quantile of the other data set. Note that this definition is not unique [Cleveland, 1993]. The (i − 0.5)/m quantiles of the x data are usually obtained via interpolation, and we show in the next example how to use the function csquantiles to get the desired plot.
Users should be aware that q-q plots provide a rough idea of how similar the distribution is between two random samples. If the sample sizes are small, then a lot of variation is expected, so comparisons might be suspect. To help aid the visual comparison, some q-q plots include a reference line. These are lines that are estimated using the first and third quartiles (q_{0.25}, q_{0.75}) of each data set and extending the line to cover the range of the data. The MATLAB Statistics Toolbox provides a function called qqplot that displays this type of plot. We show below how to add the reference line.
Example 5.5
This example shows how to do a q-q plot when the samples do not have the same number of points. We use the function csquantiles to get the required sample quantiles from the data set that has the larger sample size. We then plot these versus the order statistics of the other sample, as we did in the previous examples. Note that we add a reference line based on the first and third quartiles of each data set, using the function polyfit (see Chapter 7 for more information on this function).
% Generate the random variables.
m = 50;
n = 75;
x = randn(1,n);
y = randn(1,m);
% Find the order statistics for y.
ys = sort(y);
% Now find the associated quantiles using the x.
% Probabilities for quantiles:
p = ((1:m) - 0.5)/m;
[Figure: q-q plot; horizontal axis: X − Standard Normal; vertical axis: Y − Standard Normal]
This is a q-q plot of x and y where both data sets are generated from a standard normal distribution. Note that the points follow a line, as expected.
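For unequal sample sizes, the interpolation step can be sketched with base MATLAB. This is a sketch under stated assumptions, not the book's csquantiles function: interp1 stands in for it, and the probabilities (j − 0.5)/n attached to the order statistics of x are one possible convention among several.

```matlab
rng(2);                                   % fixed seed so the run is repeatable
m = 50; n = 75;
x = randn(1, n);
y = randn(1, m);

xs = sort(x);                             % order statistics of the larger sample
ys = sort(y);                             % order statistics of the smaller sample

p  = ((1:m) - 0.5) / m;                   % probabilities attached to ys
px = ((1:n) - 0.5) / n;                   % assumed probabilities attached to xs
xq = interp1(px, xs, p);                  % interpolated (i - 0.5)/m quantiles of x

% Reference line through the first and third quartiles of each sample.
qx = interp1(px, xs, [0.25 0.75]);
qy = interp1(p,  ys, [0.25 0.75]);
coef = polyfit(qx, qy, 1);                % slope and intercept of the line

figure; plot(xq, ys, 'o');
hold on; plot(xq, polyval(coef, xq), '-');
xlabel('X - Standard Normal'); ylabel('Y - Standard Normal');
```

Since both samples here come from the same distribution, the points should fall close to the reference line, with more scatter in the tails.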
box plot
• display the distribution of the sample
• five values from a data set are used to construct a box plot
• 3 sample quartiles
• min value
• max value
• IQR: interquartile range
Box plots (sometimes called box-and-whisker diagrams) have been in use for many years [Tukey, 1977]. As with most visualization techniques, they are used to display the distribution of a sample. Five values from a data set are used to construct the box plot. These are the three sample quartiles (q_{0.25}, q_{0.5}, q_{0.75}), the minimum value in the sample and the maximum value.
There are many variations of the box plot, and it is important to note that they are defined differently depending on the software package that is used. Frigge, Hoaglin and Iglewicz [1989] describe a study on how box plots are implemented in some popular statistics programs such as Minitab, S, SAS, SPSS and others. The main difference lies in how outliers and quartiles are defined. Therefore, depending on how the software calculates these, different plots might be obtained [Frigge, Hoaglin and Iglewicz, 1989].
Before we describe the box plot, we need to define some terms. Recall from Chapter 3, that the interquartile range (IQR) is the difference between the first and the third sample quartiles. This gives the range of the middle 50% of the data. It is estimated from the following
\widehat{\mathrm{IQR}} = q_{0.75} - q_{0.25}.   (5.5)
[Figure: binomialness plot; horizontal axis: Number of Females − k; vertical axis: \phi(n_k^*)]
This shows the binomialness plot for the data in Table 5.2. From this it seems reasonable to use the binomial distribution to model the data.
Two limits are also defined: a lower limit (LL) and an upper limit (UL). These are calculated from the estimated IQR as follows
\mathrm{LL} = q_{0.25} - 1.5 \cdot \widehat{\mathrm{IQR}}
\mathrm{UL} = q_{0.75} + 1.5 \cdot \widehat{\mathrm{IQR}}.   (5.6)
The idea is that observations that lie outside these limits are possible outliers. Outliers are data points that lie away from the rest of the data. This might mean that the data were incorrectly measured or recorded. On the other hand, it could mean that they represent extreme points that arise naturally according to the distribution. In any event, they are sample points that are suitable for further investigation.
Adjacent values are the most extreme observations in the data set that are within the lower and the upper limits. If there are no potential outliers, then the adjacent values are simply the maximum and the minimum data points.
To construct a box plot, we place horizontal lines at each of the three quartiles and draw vertical lines to create a box. We then extend a line from the first quartile to the smallest adjacent value and do the same for the third quartile and largest adjacent value. These lines are sometimes called the whiskers. Finally, any possible outliers are shown as an asterisk or some other plotting symbol. An example of a box plot is shown in Figure 5.14.
Box plots for different samples can be plotted together for visually comparing the corresponding distributions. The MATLAB Statistics Toolbox contains a function called boxplot for creating this type of display. It displays one box plot for each column of data. When we want to compare data sets, it is better to display a box plot with notches. These notches represent the uncertainty in the locations of central tendency and provide a rough measure of the significance of the differences between the values. If the notches do not overlap, then there is evidence that the medians are significantly different. The length of the whisker is easily adjusted using optional input arguments to boxplot. For more information on this function and to find out what other options are available, type help boxplot at the MATLAB command line.
Example 5.10
In this example, we first generate random variables from a uniform distribution on the interval (0, 1), a standard normal distribution, and an exponential distribution. We will then display the box plots corresponding to each sample using the MATLAB function boxplot.
% Generate a sample from the uniform distribution.
xunif = rand(100,1);
% Generate sample from the standard normal.
xnorm = randn(100,1);
% Generate a sample from the exponential distribution.
% NOTE: this function is from the Statistics Toolbox.
xexp = exprnd(1,100,1);
boxplot([xunif,xnorm,xexp],1)
It can be seen in Figure 5.15 that the box plot readily conveys the shape of the distribution. A symmetric distribution will have whiskers with approximately equal lengths, and the two sides of the box will also be approximately equal. This would be the case for the uniform or normal distribution. A skewed distribution will have one side of the box and whisker longer than the other. This is seen in Figure 5.15 for the exponential distribution. If the interquartile range is small, then the data in the middle are packed around the median. Conversely, if it is large, then the middle 50% of the data are widely dispersed.
[Figure: box plot of a single column; horizontal axis: Column Number; vertical axis: Values; annotations: Quartiles, Adjacent Values, Possible Outliers]
An example of a box plot with possible outliers shown as points.
• lower whisker: smallest data point within 1.5 IQR of the first quartile
• upper whisker: largest data point within 1.5 IQR of the third quartile
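The limits and adjacent values can be computed by hand as in the sketch below. The data vector is hypothetical, and the quartile rule (interpolating the order statistics at probabilities (i − 0.5)/n) is an assumption made for this example; as the text notes, software packages define quartiles differently, so boxplot may give slightly different values.

```matlab
x  = [2 4 4 5 6 7 8 9 10 30];             % hypothetical sample with one extreme value
xs = sort(x);
n  = numel(xs);

% Assumed quartile rule: interpolate order statistics at (i - 0.5)/n.
p   = ((1:n) - 0.5) / n;
q25 = interp1(p, xs, 0.25);
q75 = interp1(p, xs, 0.75);

iqr_hat = q75 - q25;                      % estimated IQR, equation (5.5)
LL = q25 - 1.5 * iqr_hat;                 % lower limit, equation (5.6)
UL = q75 + 1.5 * iqr_hat;                 % upper limit, equation (5.6)

outliers = xs(xs < LL | xs > UL);         % possible outliers
inside   = xs(xs >= LL & xs <= UL);
adjacent = [min(inside), max(inside)];    % adjacent values (whisker ends)

fprintf('IQR = %g, LL = %g, UL = %g\n', iqr_hat, LL, UL);
```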
scatter plot
• display each pair of data (xi, yi) as points using some plotting symbol in a two-dimensional coordinate system
x = randn(100,1); y = randn(100,1); figure; scatter(x,y)
x = randn(100,1); y = x*2 + randn(100,1); figure; scatter(x,y,'r*')
multiple variables
4.4 Logistic Regression 123
[Figure: scatterplot matrix of several variables; one variable is labeled sbp]
ooooo
o
oooo
oooo
ooo ooo
oo
oo o
o
o oooo o
o
oo
ooo oooo
oooooo
oooooooo
o
ooooo
o o oo
o
oo
oo
ooo
o ooo
oo
oo
o
oo
oo
o
o
oooo
o
oo
oo ooo
ooo oo oooooo
ooo
oooooooooo
o oo o oooo
o
tobaccoo
ooo
o
oooo ooo
oo
o
oo
oo
o
oo
oo
o
oooooo ooooo oo
o
o
ooo ooo o
ooooo o
oo
oooooo o
o
oo
ooo
oooooooo oooooooo
oo
o
ooooo
o o ooooo
oo
ooooooooo o
oo o
o
oooo
o ooooo
ooo
ooo
oooo ooooo
o
oooooooo
o
oooo oo
o
o
ooo
o
o oooo
ooooo ooo
ooo o
o
o oooo
o
o
oooooo ooooo
oooo ooooooo
o
ooooo
oo oo
oo oo
oo o oo
ooo
o
ooo
o
oooo
oooo
oooo
oo oo
oooo
ooo
oooooo
o o
o
oo
oooo
o
oo oooo
ooo
o
ooo ooo oo ooooo
o
oo oo ooo
oo
ooo
o
ooo
oo o
o
ooooooo oooooo
o
o
ooo oo
o ooooo o
ooo
ooo o
o
ooo
ooo
o
ooo
oooooo
ooooo ooooo
oo
oo
o
oo
oo
oo ooo
o
oo
o o
ooooo
oo
oo
oo
o
oo
oo
o
o
o oooo
o
ooo ooo
ooo ooooo ooo ooo
ooo oo oooo
ooooo ooo o
o
o
o ooo
oo ooo
o
o
oo
oo
o o
o
o
oo
oo
o
ooo oooo
ooooooo
o
ooo ooo o
ooo
ooo
oooooo oo o
o
oo
ooo
ooooo
oooo oooo
o o
o
oo
o
ooooooooo oo
oo
oo ooooo
oooo
oo o
o
ooo ooooooo oo oo oo
oooo oo oo o
o
ooooo ooo
o
oooo o
o
o
o
o
oo
o
ooooo
ooooooo ooooo
o
o ooooo
o
oooo oooo oooooo ooo oo ooo o
o oooooooooo oo
ooo oo
ooo
o
ooo
o
o oooo ooo
oo o
oooo
ooooo
oo
o
oo oooo
oo
o
oooo o
oo
ooo
oooooo
o
oooooooo ooooo
o
oooo oo o
ooo o
oo
oooo oo
oo
ooo o
oo ooooo o
o
o
ooooo
oo oooo o
ooo
oooo
o
oooo oo
o
oooooo oo
o
ooo o
ooo oooo
ooo
o
oo
oo
ooooo
o
oo
o o
ooo
ooo o
oo
oo
o
oo
ooo
o
ooooo
o
ooo oo
oooooo oo ooooo oooo oo oo
ooooooo oo
ooo
o
o
ooo
oooo
o oo
o
oo
ooo o
o
o
oooo
o
o oo
oooooooo o o
o
o
ooo o ooooooooo
ooooo
oooo
o
oo
ooo
ooooo
o oo oooo
oo
o
o
o o
o
oo oooo
o oo ooo
oo
o ooooo
oooo
oo o
o
ooo o oo oooo
oo oo
oo
oooooooo o
o
o o oooooo
o
oooooo
o
o
ooo
o
ooooo
ooo
ooo oo ooo o
o
o oooo
o
o
oooo oo ooo o ooooo ooo ooooo
o ooooo
ooo
o
o oooo ooo
ooo
o
ooo
o
ooooooo oo
oooooo
ooooo
ooo
oo ooo
o
oo
o
ooo oooo
oooo oo
o oo
o
o ooo ooo o o oooo
o
oo o ooo o
oo
o ooo
o ooooo
o
ooooo
o oooo
oooo
o
ooooooo
oooo ooooooo o
o
oo oooo
o
ooooo ooo
oooo
ooo
oo oo
oo
oo
o
oooooo ooo
o
oo
oo
oo
ooo
oo
oo
oo
o
o
oo
oo
o
ooooo
o
ooo ooo
ooooooo oooo
oooo oo oooooooooo oooo o
o
o
ooo
oooooo
o
o
oo
o
oo
o
o
o
oo
oo
o
ooo
ooo oooo oooo
o
ooooo
ooo ooo oo
oo
ooooooo
o
oooo
oo
oooo
oooooo
oooo
o
o o
o
ooooo oooo ooo
oo
oooooo
oooo
ooo
o
oo
o ooooooo
oo o
oo o
ooo oooooo
o
oo oooooo
o
oooo
oo
o
o
oo
o
o
ooo
oo
oooooooooooo
o
ooo o
oo
o
ooooooooooooo o
oooooo
oo o
ooooo
ooooo
ooooooo
o
oo o
o
oo o
o
oooooo oo
oooooo oooo o
o
ooo
ooooo
o
oo
o
oo
ooooo
ooooo o ooo
o
oooo ooooo
o ooo
o
oooooo o
oo
oooo
o oo
oo o
oo
oo oooooo
ooo oo
o
oo ooo
ooo ooo o
ooooooo
o
o o oo oo
o
oooo ooooo
o ooo
oo oo ooo
ooo
o
ooo
oo
oooo
o
o
ooo
ooooooo
oo
oo
o
oo
oo
o
o
oo oo
o
o
oo o oooooooooooooooo
oo ooooo
oooooo ooo
oooo
o
ooo
ooo o
o oo
o
oo
oooo
o
o
ooo o
o
o oooooo
oooo o o
o
o
oooooo o
ooo
o ooo
oo
oo
o oo o
o
oooo
oooo oo
o o o ooo
o oooo
o o
o
oo oo oo
o o oo o o
oo
o ooooo
oooo
oo o
o
oo
o oo ooo
oo o
o oooo
oooo oooo o
o
o o oooo oo
o
oooo
oo
o
o
oo
o
o
o oooo
ooo
oo o ooooo o
o
oooo
oo
o
oooo oooooo
o ooo oo ooooooo
oo o oo
oooo
oo o o
o oooo
oo o
o
ooo
o
oooo
oooooooo ooo
o ooo
o
o oo
ooo oo
o
oo
o
oo
ooo oo
ooo
oo oooo
o
oooo ooo oo
o ooo
o
ooo ooo o
oo
ooo
o
ooo
ooo
o
oo
o oooo oo
ooo o
o
o
oo oo
oo
ooo
o ooo o
oo
oo o
o
oooooo
o
o o oooo o oo
ooo oo oo oo
ooo
oo
o
oo
oo
oo o oo
o
oo
o o
ooo
oooo
oo
oo
o
oo
oo
o
o
oooo
o
o
oooo o
oooooooooo o ooo
oooo ooo
oooo
ooo ooooo
o
oooo
oooooo
ooo ooo
o
o oo
o o
ooo
oo
oooooooo
oo
ooooo
ooo o
oo
oooo
o
o o
o
ooo ooo oooooo
o
ooooooooo ooo ooo
ooo ooo
oo oo
oooo
o
o
o
oo oo oo ooo
o o
o
o
o
o
o oo
ooooo o
o
oo
ooooooo
o
ooooooooo
oo
ooo ooooooo
oo
oooo
oooo
oo
o o
oo
ooooo
oo o
o oo
o
o
ooooooo oooo o
o
oooooo oo
ooo
ooooo
oo
ooo
o
ooo
oo
ooo o
o
ooo ooo
ooo oo o
oo
oooo
oooo
oooo
ooo
ooooo
oo oo
o
oooo
oo
oo
ooo o oooooo
oo
o
ooo
ooooo ooo
o
o
oooooo
oo o o
o
ooo oo oo ooo oo
ooooo
o
oooo
oo
o
oo
o
oo
o
ooo oo o
o
oooo
ooo
oo o o
oo ooooo
oooo
oooo oo
o
o
ooo
o
o
o ooo oooo
o
ooooo o o
o
oo ooo
o
o
o
oo
oo
oooo oo ooo
o
oooo
oo
o oooooo
oo
o
oooo
oo ooo o
oo
oo
oo
o
o
o ooo
oo
oooo
oo
ooo oo
o
oo o
o o
oo o
oo
ooooo oooooo
ooo
oooo o
oo
oo
oo
o
oo
o
o ooo
oo ooo ooo
o
ooo ooooooooooooooooooooo
ooooo
o
oo
oooooo ooo
o o
o
o
o
o
oo oooooo o
o
oo
ooooo
oo
o
ooooooooooo
oo ooooo
oooo
ooo
oooo oo
o
o
oo
ooooooo
ooo
ooo
oo
oooo
oooooooo
o
ooooooo
oooo
ooo
oo
o
ooo
o
o
oo o
ooo
oooo
oooo ooooo ooo
o
o
ooo
ooo
oo
oooo
o oo
oooo o
oo oo
o
oooo
oo
oo
ooooo o
oooooo
o
ooooooo
ooooo
o
ooo
oooo
oooo
ooo ooo oooooo
o ooo o
o
oo
ooo
oo
ooo
oo
o
oo
o ooo
o
oooo
ooo
oo ooooooooooo o
oo ooo oo
o
o
o ooo
o
ooo
ooo oo
o
oo
o oo oo
o
ooooo
o
o
o
oo
oo
ooo oo oooo
o
o oooooooooooo
oo
o
ooooooooo o
ooooooo
o
oldl
oo oooo
oooo
oo
oo oo
o
ooo
o o
ooo
oo
o oooo oooooo
ooo
oo
ooo
oo
oo
oo
o
oo
o
oooo
oooo
o oo
o
o
oooooo
oo oo ooo oo ooo
ooooooooo oo
o
o
o
o ooooo ooo
o o
o
o
o
o
ooo
oooooo
o
o o
ooo
oooo
o
o oo oo ooooo o
ooooooo
ooo
oooooooooo
o
o
oo
ooo o
ooo
ooo
ooo
oo
oooo
o oooo ooo
o
oo
ooo oo o
ooo
oo
ooo
o
oooo
o
oo o
ooo
o ooo
oooooo oooo oo
oo
o ooo
oooo
oooo
oo
o
o oooo
ooo o
o
oo ooo
ooo
oooooo
ooooooo
oo
oooooo ooo
o
o
o oo
oo oo
o ooo
o oo oo ooo ooo
o
oooo o
o
oo
ooo
oo
ooo
oo
o
oo
oooo
o
oooo
o oo
oo oo
ooo oooooo oo
oo oooo
o
o
oooo
o
oooooooo
o
oo
ooo oo
o
ooooo
o
o
o
ooo
oooooo oooo
o
oooo
ooo oo oooo
o
o
o
oo oo
ooooo o
ooo
oooo
o
o o ooo
oo
oooo
oooooo
o
ooo
o oooo
oo
o oooo o
ooo
oo
ooo
ooo oo
o o
oooo
o
oo
o
oooooooo
o ooo
o
ooo oo oo
ooooo
oooo oo oo oooooo
o oo
o
o
o
o ooooo oo
o
oo
o
o
o
o
oooo
o oooo
o
o o
o
ooooo
o
o
ooo ooo o oo
ooooo
ooooooo
ooooooooooo
o
oo
oooo
ooo
ooo
ooo
o
ooooo
o oo oooo o
o
ooo ooo ooooo
oo
oo
o
o
ooo
o
o
ooo
o ooooo
o
oo
oooooo
o ooo
oo
oooooo
ooooo ooo
o
o oooo
oo oo
o
o ooo
ooo
o
o ooo oooo oo
o o
o
oooooo o
o oooo
o
o oo
o o oooo oo
ooo ooooo oo o
o
ooo
o o
o
oooo
ooo
ooo
oo
o
oo
oooo
o
oooo
ooooooo
oo oo
o ooo oo
oo
oo ooo
o
o
oo o
o
o
ooo
oo o oo
o
o
ooo oo o
o
ooooo
o
o
o
oo
ooo
oooooooo
o
oo oooooo o oooo
oo
o
o ooo
ooo
ooo
oo
oo
oo
o
o
o oooo
oooooo
ooooo o
o
o oo
oo
ooo
oo
oooo
oooo
oo
oooo
ooooo
oo
ooo o
o
oo
o
oooo
o
o ooo ooo
o
ooo oooooo o oooooo o
oooooooooo oo
o
o
o
ooooo
oooo
oo
o
o
o
o
oo ooooooo
o
oo
oo o ooo
o
o
ooooo oo ooo oooooooo
ooo
oo
oooooo ooo
o
oo
ooooooo
oooooo
oooo oooooo
ooo o
o
oo
ooo oooooo
oo
ooo
ooooo
o
ooo
oooooo
o
ooo ooo oo
o oo o
oo
ooooo
oooo
ooo
ooo
ooooo
oooo
o
ooooooo
o
ooo ooo
ooooo o
o
oo
oo ooo
ooooo
o
ooo o ooo
oo oo
ooooo oo oooo
ooooo o
o
oo
o o
oo
o
oo
o
oo
o
oooooo
o
ooo o
o oooo ooo oooooo
oooo
ooo oo
oo
o
o ooo
o
ooo
ooooo
o
oo
oo oo o
o
oo ooo
o
o
o
oo
oo
oo o oo ooo o
o
ooooooooooooo
oo
o
o ooooo ooo o
oo
ooo oo
o
o
26
1014
o ooo
oo
o ooooo
oo oo
o
o oo
o o
ooo
oo
oo ooo oo o
oo
oo
ooo
ooo o
oo
oo
oo
o
o o
o
o oo o
o
o ooo ooo
o
oo o oo o
ooooo oooo
o oooo oo o
ooooo o
o
o
o
o ooooo
ooo
o o
o
o
o
o
o o oo
oo ooo
o
o o
ooooo
oo
o
ooo ooo o oo
oooo oooooo
ooo
ooo
oo oo oo
o
ooo
oo
ooo
oo
oooo o
o
o
ooo oo
oo
oo
oooo
o
oooo oooo
ooo
oooo
o
o
ooo
o
o
ooo
ooo
o ooo
oo
oo o ooo
o ooo
oo
oo o
oo
oo
o
oo oo
oo
o
oo oo ooo oo
o
oooo
ooo
o
ooo o oo
oooooo
o
oo
oo
oooo oo
oo
o
o oo
oooo
o ooo
ooo oo oo
ooooo
o oo
o o
o
oo
o o
ooo
ooo
o o
o
oo
o ooo
o
oooo
ooo
oo o o
ooo o ooo
oo oo
oo oo oo
o
o
oo o
o
o
oooo oooo
o
o
oooo oo
o
ooooo
o
o
o
oo
oo
oooo oo ooo
o
ooooooooooo o o
oo
o
ooo ooo ooo o
oo
ooooo
o
o
0.0
0.4
0.8
o
o
o ooo
o
oo o
o
o
o o
oo
o
o oo
o
o o
oo
o
o o
o
oo o
o
o oo oo
ooo
oo
o
oo
oo
o o
o
o o
o
ooooo
o
o
o o
oo
oo
o
o
oo oo
o
ooo
o
o
oo
o
o
o
o
oo
oo
ooo oo oo
ooo
o
o
o
o
oo oo
o
ooo
o
o
o
o
o
o o
o
oooo o o
o
o
o
o
o
ooooo
o
o
o
o
o
o
o
ooo o
o o
oo oooo
o o
o oo oo oo oo oo
oo o o
oo o
ooo
ooo o
o oooo
oo
oo
o
oo
oo
o
o o
o
oo
o
oo
o
o
oo
o
o
o
o
o oo o ooo oo
o
o
o
oo
o
o
o oo
ooo ooo
o
oo o
o
oo
o
o
o o
o
o
oo o
o oooo
o
o
o
oooo
oo o
o oo
o
o
oo
o o
oooo o oo
o
ooooooo o
o
ooooo
oo
o o
o
o
o o
oo
o o
o
o
o oo
o
o
o
o
oo
o
o
o o
o
oo oo
oo
oo
oo
o
o
oo
o o
oo
oo
o
o
o
o o oooo
oo
o
o
o
o
o oo
o o
oo
o
oo
o
o o
oo
o
o oo
oo
oo
oo
oo
oo o
oooo
o
o
o
ooo
o o
o
o o o
oo ooo
ooo
o
oo
oo o
o o
oo
o
o
o
ooo oo
o
o
o
oooo
o
ooo
o
o
oo o
oo
oo
oo
o
oooo
o o
o
o o oo
o
ooo
o
o
oo
oo
o
oo o
o
oo
o o
o
o o
o
ooo
o
ooooo
o oo
oo
o
o o
oo
o o
o
oo
o
o oo oo
o
o
o o
oo
oo
o
o
ooo o
o
ooo
o
o
oo
o
o
o
o
o o
oo
oooo ooo
oo o
o
o
o
o
oooo
o
ooo
o
o
o
o
o
oo
o
oooo o o
o
o
o
o
o
ooo oo
o
o
o
o
o
o
o
oooo
oo
o ooooo
o o
o oo oo oooo o o
oo oo
ooo
ooo
oo oo
o oo oo
oo
oo
o
oo
oo
o
oo
o
oo
o
oo
o
o
o o
o
o
o
o
ooo oooo oo
o
o
o
oo
o
o
ooo
oooo oo
o
oo o
o
oo
o
o
o o
o
o
oo o
oo ooo
o
o
o
oo o o
oo o
o oo
o
o
oo
oo
ooooooo
o
oooooooo
o
oooo o
oo
oo
o
o
o o
oo
oo
o
o
o oo
o
o
o
o
o o
o
o
oo
o
o ooo
o o
oo
oo
o
o
oo
oo
oo
oo
o
o
o
ooo ooo
o o
o
o
o
o
ooo
oo
oo
o
oo
o
oo
o o
o
o oo
oo
o o
oo
oo
ooo
oo oo
o
o
o
o oo
oo
o
o oo
oo oo o
o oo
o
oo
o oo
oo
oo
o
o
o
ooooo
o
o
o
oo oo
o
ooo
o
o
ooo
oo
oo
oo
o
o ooo
o o
o
o oo o
o
oo o
o
o
oo
oo
o
ooo
o
o o
oo
o
oo
o
oo o
o
ooo oo
o oo
oo
o
oo
oo
o o
o
o o
o
o ooo o
o
o
oo
oo
oo
o
o
oooo
o
oo o
o
o
oo
o
o
o
o
o o
oo
oo oo o oo
oo o
o
o
o
o
ooo o
o
oo o
o
o
o
o
o
oo
o
o ooooo
o
o
o
o
o
ooo oo
o
o
o
o
o
o
o
oooo
o o
oo oooo
oo
o oo ooooooo o
oo oo
o oo
o oo
o ooo
ooo oo
oo
oo
o
oo
oo
o
oo
o
oo
o
oo
o
o
o o
o
o
o
o
ooo oo oo oo
o
o
o
oo
o
o
ooo
oooooo
o
ooo
o
o o
o
o
oo
o
o
ooo
ooooo
o
o
o
oooo
o oo
o oo
o
o
oo
o o
ooooooo
o
ooo ooo oo
o
oooo o
oo
oo
o
o
oo
oo
o o
o
o
o oo
o
o
o
o
oo
o
o
oo
o
oo oo
o o
o o
oo
o
o
o o
oo
o o
oo
o
o
o
oo oooo
o o
o
o
o
o
oo o
oo
oo
o
oo
o
oo
oo
o
ooo
o o
oo
oo
oo
oo o
oooo
o
o
o
ooo
o o
o
ooo
oo oo o
o oo
o
oo
ooo
oo
oo
o
o
o
ooo oo
o
o
o
ooo o
o
ooo
o
o
o o o
oo
o o
oo
o
ooo o
o
famhist
o
o
o oo o
o
oo o
o
o
oo
oo
o
ooo
o
oo
oo
o
oo
o
ooo
o
ooo o o
oo o
oo
o
oo
o o
o o
o
oo
o
o oooo
o
o
oo
oo
o o
o
o
ooo o
o
oo o
o
o
oo
o
o
o
o
oo
oo
oo ooo oo
ooo
o
o
o
o
oooo
o
oo o
o
o
o
o
o
oo
o
oooooo
o
o
o
o
o
o oo oo
o
o
o
o
o
o
o
o ooo
o o
oooooo
oo
o ooooooooo o
oo oo
oo o
o oo
o ooo
ooo oo
oo
oo
o
oo
oo
o
o o
o
oo
o
oo
o
o
oo
o
o
o
o
o oo oooo oo
o
o
o
o o
o
o
ooo
oooooo
o
oo o
o
o o
o
o
oo
o
o
ooo
ooo oo
o
o
o
oo oo
oo o
o o o
o
o
oo
oo
oo ooo oo
o
o ooo ooo o
o
oooo o
oo
o o
o
o
oo
o o
oo
o
o
ooo
o
o
o
o
oo
o
o
o o
o
oo oo
oo
oo
oo
o
o
oo
oo
oo
oo
o
o
o
oo oooo
o o
o
o
o
o
ooo
o o
oo
o
oo
o
o o
oo
o
ooo
oo
o o
oo
oo
oo o
oo oo
o
o
o
oo o
o o
o
o oo
ooo oo
o oo
o
oo
ooo
oo
oo
o
o
o
ooooo
o
o
o
ooo o
o
oo o
o
o
oo o
o o
o o
oo
o
ooo o
o o
o
o o oo
o
ooo
o
o
oo
o o
o
o oo
o
oo
oo
o
o o
o
oo o
o
oo ooo
o oo
oo
o
oo
oo
oo
o
oo
o
o oooo
o
o
o o
oo
oo
o
o
ooo o
o
ooo
o
o
oo
o
o
o
o
oo
oo
ooooooo
ooo
o
o
o
o
oo oo
o
ooo
o
o
o
o
o
oo
o
oooooo
o
o
o
o
o
o ooo o
o
o
o
o
o
o
o
o ooo
oo
oooooo
oo
o oo oo oooo oo
oooo
ooo
ooo
oooo
ooo oo
oo
oo
o
oo
oo
o
oo
o
o o
o
oo
o
o
o o
o
o
o
o
ooooooooo
o
o
o
oo
o
o
oo o
ooo ooo
o
oo o
o
oo
o
o
oo
o
o
ooo
o oooo
o
o
o
oo oo
ooo
ooo
o
o
oo
oo
ooo o ooo
o
oooo oooo
o
o ooo o
oo
oo
o
o
oo
o o
oo
o
o
o oo
o
o
o
o
oo
o
o
oo
o
oooo
oo
oo
o o
o
o
oo
o o
oo
oo
o
o
o
ooo oo o
oo
o
o
o
o
ooo
oo
oo
o
o o
o
o o
o o
o
ooo
oo
o o
oo
o o
ooo
oooo
o
o
o
oo o
o o
o
oo o
ooo oo
oo o
o
oo
o oo
oo
o o
o
o
o
ooooo
o
o
o
oooo
o
oo o
o
o
oo o
oo
oo
o o
o
o ooo
o o
o
o ooo
o
oo o
o
o
oo
oo
o
o oo
o
o o
oo
o
o o
o
ooo
o
o ooo o
o oo
oo
o
o o
oo
o o
o
oo
o
ooo oo
o
o
o o
oo
oo
o
o
oo o o
o
o o o
o
o
o o
o
o
o
o
oo
oo
oo oo o o o
o o o
o
o
o
o
oo oo
o
ooo
o
o
o
o
o
o o
o
o oo o oo
o
o
o
o
o
ooo oo
o
o
o
o
o
o
o
o ooo
o o
o ooooo
o o
o oooo oooo o o
oo oo
o o o
ooo
o o oo
o oo oo
oo
oo
o
oo
oo
o
o o
o
oo
o
o o
o
o
o o
o
o
o
o
o oo oooo oo
o
o
o
oo
o
o
oo o
oooo o o
o
oo o
o
o o
o
o
oo
o
o
oo o
oo oo o
o
o
o
o oo o
oo o
o o o
o
o
o o
oo
ooo o o oo
o
oooo ooo o
o
o ooo o
oo
o o
o
o
oo
oo
o o
o
o
o o o
o
o
o
o
oo
o
o
oo
o
o o oo
o o
oo
o o
o
o
o o
oo
o o
o o
o
o
o
oo oooo
o o
o
o
o
o
o oo
oo
o o
o
oo
o
oo
oo
o
o oo
oo
o o
oo
oo
oo o
oooo
o
o
o
ooo
oo
o
o oo
oo ooo
ooo
o
oo
oo o
o o
oo
o
o
o
ooooo
o
o
o
o o oo
o
ooo
o
o
oo o
oo
o o
oo
o
oooo
o
ooo ooo
oooo oooooooooo
oo ooo
o
o
o
ooo o
oo o
oo
ooo
ooo
o
o
o oo
o
oo o oo ooooo o oo oo
ooo
oo
oo o
o
o
oooo
ooo
o
oooo
o
oooo
o oo
o
oooo
oooooo o
o
o
ooo o
oo
o
oo ooooo
o ooo
oooo
ooooooo oo
o o
ooo
o oooo
o oooo o oo
oo oo ooooo
o oo
oo
ooo
oooooo o
o
oo
oo
oooo o
o
oo
ooo
oooo o
ooo
oooooooo oo
o
oooo
o
o
ooooo oo oo oo
o ooo ooooo o
oooo ooo oo oo oo
o
oo
o
oooooooo
oooo
o ooo oo
ooo o
ooo
oooo
ooo
oo
o
oooo ooo
o
ooo o
oo
o
oo o
oo
ooooo oo o
o
o
oo
oooo
ooooooo
oooo o
ooo
ooo oo o
ooo oooo oooo oooo
oo
ooooooo
ooo o
o
oo
oo
o
ooo oo
o
oo
o
oo
ooooo
oo
oo
o ooo o
oo
o
oo o
oooo o
oooo
ooo
ooo oooooooo
oo
ooo
o
o
ooo
oo
oo oo
ooo
oo
o
ooo o
oo
ooo
oo
ooo oo oo
o o
ooo
o oo
o
o
ooooooo
o
ooo ooooo
o
o oo
o
oooooooo oooo
o ooo oo
oo
ooo
oo
oo
oo ooo
o
o ooo
o
oooo
oooo
ooo o
oooooooo
o
oooo
oo
o
oooooooo oo
ooooo
ooo
oooooo
o o
ooo
ooooo
ooooo o ooo
o ooo
oooo ooo
ooooo
ooo
oo oo
o
ooo
oooooo
o
oo
ooooo
oooooooo oo
ooo
ooo
o
oooo
o
o
o oooooooooo
ooooo
ooooo
oo
oooooo
oo oo o
o
o o
o
oooooooo
oo
oo
oooooo
oooooo
o
ooooooooo
o
oooo
oooo
ooo o
ooo
ooo
o
oooo
ooo ooo
o
oo
o ooooo
oo oooo ooooooo o
oo o ooo
oooo
ooo
ooo
ooooooo ooo oooooo o
o
oo
oo
o
ooo oo
o
oo
o
ooooo o
o
oo
oo
oooo o
oo
o
oo oooo oo
ooo
ooo
oooooooooooo
oo
ooo
o
o
ooo
ooo
ooo
oo o
oo
o
ooo
oo
o
oooo
o ooooo o
ooo
oo ooo o
o
o
ooo oooo
oo
oo ooooo
o
o oo
o
o oo ooo oooooo oooooo
oo
ooo
o
o
oooooo o
o
oooo
o
oo
oo
oooo
ooo oo ooooooo
o
oooo
oo
o
oooooooooo
ooo
oo
oooo
o oooo
o o
ooooooooooooo ooo
oo ooooooo o oo
oo
o ooo
ooo
ooo
o
oo
oo
ooooo
o
oo
oooo
ooo oo
oooo ooo
oo
ooo
o
o ooo
o
o
oooo ooooo ooooooooo
oooo
oooo oo ooo oo o
o
o o
o
oooooo ooo
ooo
o ooo o o
ooooooo
ooo oo
o oo
oo
ooooooo
o
ooo o
ooo
ooo
oo
oo ooooooo
o
o o
ooo
oo ooooo ooo
ooo
ooo
oo
ooooo
oooo
ooooo
oo ooo
ooo oooo
o ooooo
o
o o
ooo
ooooo
o
oo
o
oo
o oooo
oo
oooo
oo oo
oo
oo oo
ooooo
ooo
oooooo ooo
oo oooo
oooo
o
o
o oo
ooo
ooo
ooo
o
o
o
oo ooo
o
o ooooooo oooooo
ooo
ooo
o
ooo
ooooo
o
oo
ooo
ooo
o
ooo
o
o ooooooooo oo ooo
o oooo
ooo
o
o
ooo oo o
o
o
ooo
oo
oo
oo
oooo
ooooo o
o ooooo
o
oooo
oo
o
ooo ooooooo
oo
oo
o
ooooo oo oo
oo
ooooo o
ooooooo ooo
oooooooooo oo
oo
ooo
ooo
oooo
o
oooo
oo ooo
o
oo
o ooo
ooo ooo oo ooo
oo
oooo
o
ooooo
o
o ooo oo ooooooooo oooo oo
oo
o ooo oooo ooo
o
oo
o
ooo
oo
ooo
ooo
o
ooo
ooooooooo
o
oooooooo
oo
oooo o
ooo
oo
oo
oo
o
oo o
o
oo oo oo ooo
o
o
o o
oooo
oooo ooo ooo o
ooo
ooo
oo oooooo
oo
oo oo ooooo
oo
ooo
o oooo o
ooo
oo
oo
o
ooo oo
o
oo
o
oo
o oooo
oo
oooo
ooooo
o
oooo
oooo
ooooooo
ooooo oo
ooooo
oooo
o
o
ooo
oo o
ooo
ooo
oo
o
obesityo
oo oo
o
oooo
ooooo
ooo oo
ooo
ooo
o
o
ooo oooo
o
ooo oo
ooo
o
ooo
o
oo oooo
ooooooooo
o oooo
oooo
o
oooo o oo
o
ooo
o
o
oooo
oooo
oooooo
oooo oo
o
oooooo
o
ooo o
oooooooooo
o
oo
oo
ooooo
oo
oo
ooooooooooo ooo
oo oo oooooooo
ooooo
ooooooo
o
ooo
ooo ooo
o
oo
oooooo oooo ooo oo
ooo
ooo
o
ooooo
o
oooooooo o
ooo oooooooo o
ooooooo oooo
oo
o
o o
o
oooo
oooooo
oo
oooooooo o ooo
o
oooo
oooo
oo
ooooo
ooo
oooo
oo
o
ooo
oo
ooooo oo oo
o
oo
oooooo
ooo ooo o
oo ooo
oooooooo
oo o oooo oo ooo oooo
oo ooo
oo oo
oo
ooo
oo
o
oo ooo
o
oo
o
oo
ooooo
oooooo
ooooo
o
oo o
oo o oo
ooo
ooo
oo
oooooooooooo
oo o
o
o
ooo
oo ooo
o
oo o
oo
o 1525
3545
ooo o
oo
o ooo o
ooo oooo oo
oo o
ooo
o
o
o ooo
oo o
o
oo
o oo
ooo
o
o oo
o
o o ooo ooo oo oo o oo
o oooo
oo o
o
o
oo
oooo o
o
ooo
o
o
oo
oo
oooo
oo o oo oo
ooo oo
o
ooo o
ooo
oo o
oo
ooo ooo
oo
oo
oo
ooo o
ooo
oo
oo
oooooo
ooooo o oo
oo oo o
o oo oooo
oo
o ooooo
oo oo
o
oo
oo
oo ooo
o
oo
ooo
o ooo oo
ooo
o oooo
o
o oo
o
oooo
o
o
ooo
oo o oo
o oo
oo o oooo
ooo
oooooooo
o ooo o
o
o o
o
oo o
oo
ooo
oo
oo
o ooooo
oo o o
ooo
oooo
ooo o
oo
ooo
o oo
oo
oo
oo
ooo
oo o
oo
ooo oo oo oo
o
o o
o oo
oo o
ooo oooo
o ooo o
o o
oo ooo
oooo
oooo
oo o oooooo
ooo
o oo oo oo o
o
oo
oo
o
ooo oo
o
oo
o
oo
o oooo
ooo
ooo
oo o
ooo
oo ooooo ooooo
o ooo
oooooooo o o
oo
ooo
o
o
ooo
oo o
ooo
ooo
oo
o
050
100
o
ooo
o
oooo o
o
oo oooo
o
o
oo o
oooooooooooo o
o oo
oo
o ooooo oo
o
oooo
o o
o
ooo o ooo
oo
o
ooooooo
oooooo
o
o oooooo ooooooo ooooo
oo oooo
o
ooooo oo o
o
oo oo
ooooo o
o ooooo
o
ooo
ooooo oooo
o oo ooo oooo
oo
o
o
oo
oo
ooooo oo o ooo ooooooo o
o oo
o
o oooo ooo oooo
ooo
oo
ooo
oooooo
oo oo oooo oo oo
ooo
ooo o
o
ooo
o
ooooo
oo
o
oooo ooo
o
o oo
ooooo o
ooooooo o
o oooo oo o oooo
oo
oo
o
ooooooo o
ooooo
o
ooo oooo
oo
oo
oo
oo
oo
oo
oo
o
oo
oo oooo oooo oo
oo
oo
o
oo
ooo
o
ooo oo o oo
ooo
oo
oooo oo
o
ooo oo
o
oo
o
o
o
o
o
oo
o
oo
o
oooo o
o o oooo o
oo
ooo o
oooo
ooo o
oooo
oo
ooo
o
o
o
ooo
oo
oooo oo oooooooo
o
o
ooooo
o
oo
o
o o
o
o ooo o
o
o
ooo
o
ooooo
o
ooo oo o
o
o
oo ooo ooo
oooo
o ooo
oooo
oooooo oooo
oooo
oo
o
o oooooooo
o
ooooooo
oooooo
o
o oo ooo
ooooooo ooooo o
oooooo
o
o oooo ooo
oo ooo
ooooo ooooooo
o
ooo
ooooo oooo
ooooo ooooo
oo
o
o
oo
oo
oooo ooo oooooooooo
ooo oo o
o oooooooooooooo
oo
oooooooo
ooooo oooo oooo oooo
ooo
o
ooo
o
ooooooo
o
o ooo oo o
o
oo o
o ooo
oo
ooooooo oo o
ooooo ooooooo
oo
o
ooooooooooooo
o
ooooooo
oo
ooo
o
oo
oooo oo
o
ooooooo ooo
oo oooo
oo
o
oo
ooo
o
ooo o ooo
o
ooo
ooooo
oooo
oo oo o
o
oo
o
o
o
o
o
oo
o
oo
o
oo oo ooooo oo o
oo
o oo
oooo
o
ooo o
o oo o
oo
ooo
o
o
o
ooooo
ooooooooooo ooo
o
o
ooooo
o
oo
o
oo
o
ooooo
o
o
ooo
o
oooo o
o
ooooo o
o
o
ooo
ooo oooo oooooo
o oo
ooooo ooo oo
o
o oo
o
oo
o
ooooo oooo
o
oo ooooo
ooo ooo
o
ooooo
ooooooo
o o oooo oo
ooooo
o
oooo ooo o
ooo
oo
o ooooo
oooo
o o
o
o oo
ooooo oooooo
o ooo oooo
oo
o
o
ooo
oo ooo o oo ooo ooo ooo
oooooo
o
o ooooooooooo
o oo
oo
ooooooo
ooo
oo o oo oo oo ooooo
oooo
o
ooo
o
ooooooo
o
ooooo oo
o
oo o
ooo
oo o
oooooo ooo oooo oo o o ooooo
oo
o
ooo oo
o ooo
oooo
o
oo oo ooooo
oo
oo
oo
oo
oooo
o
ooooo oooooo oo ooo
o
o
o
oo
ooo
o
oo oooo
oo
ooo
ooooo
oooo
oooo o
o
oo
o
o
o
o
o
oo
o
oo
o
oo ooo
o ooooo o
oo
ooooo ooo
ooo oo oo o
o o
o oo
o
o
o
ooo
oo
oooo ooooo ooo oo
o
o
oo oo o
o
oo
o
oo
o
o ooo oo
o
o ooo
oo ooo
o
ooo oooo
o
oo oooo oo
oooo
o ooo
ooo
ooooo ooo oo
o
o ooo
oo
o
ooo oo oooo
o
oo ooooo
oooo oo
o
oo ooo
oo ooooooooo
ooo
ooo ooo
o
o oooo oo o
o
ooo
ooooooo
oo oooo
o
ooo
oo oo oo ooooo o
ooooooo
oo
o
o
ooooooooo ooooooo ooooooo oo
oo
oooooo oooo oo
o oo
oo
ooo
o oo
oo
ooooooooooo oooo
oo
ooo
o
ooo
o
ooooo
oo
o
o oo ooo o
o
oo o
oooo
o o
o ooooooooooo ooo ooooo
oooo
o
oooooooo
ooooo
o
oooo oo oo
o
ooo
o
oo
oo
oo
oo
o
oo
ooo ooooooooooo
oo
o
oo
ooo
o
oo oo oooo
oo
o
o oo
ooooo
o
o ooo o
o
oo
o
o
o
o
o
oo
o
oo
o
oooo o
oooooo o
oo
ooo
ooo ooo
ooooo
oo
oo
ooo
o
o
o
ooo
oo
oooooo oo ooooo o
o
o
ooo oo
o
oo
o
oo
o
ooooo
o
o
ooo
o
oooo o
o
oooooo
o
o
oooo ooo o
oooo
o ooo
oooo
oooo o ooo o
o
oooo
oo
o
ooooooooo
o
oo ooooo
o oo ooo
o
o ooooo
o oo oo ooo oo
ooo
oo o ooo
o
oooo ooo o
ooo
oooo oo
ooooooo
o
o
o oo
oooo oooo
oooo ooooooo
oo
o
o
oooo
oooo ooo oooo oo ooo oo o ooo o
o ooooo oo ooo o
ooo
oo
ooo oooo
ooo
ooo oooo oo ooooooooo
o
ooo
o
ooooooo
o
oooooo o
o
ooo
oo ooo o
o ooo
ooo oo oo oooo oooo o
oo
oo
o
o oooo
oo oo
oooo
o
oo o ooooo
o
oo
oo
oo
oo
oooo
o
oooo oooo oooooooo
oo
o
oo
oooo
oo ooooo
o
ooo
ooooo
oooo
ooo oo
o
oo
o
o
o
o
o
oo
o
oo
o
oo oooo oo o
oo o
oo
oooo
o oo o
o
oooo
oo o
oo
ooo
o
o
o
ooo
oo
o ooo oooo oooo oo
o
o
oo ooo
o
oo
o
oo
o
oooo o
o
alcoholo
ooo
o
oo oo o
o
ooo oooo
o
oooo ooo o
oo oo
o oo o
o ooo
oooo
oo o ooo
o oo
o
oo
o
o oo oo oo
oo
o
oo ooo oo
o o o ooo
o
o oooooooo oo oo o o o
oo oo
oo ooo
o
ooooo oo o
ooo
oo
o oo ooooo
o ooo
o
o o
o
oooo ooo
oo
ooo oo
ooooo
oo
o
o
oo
oo
o ooo ooo ooo o ooooo o ooo o
o o
o ooooo ooo
oooo oo
oo
o oo
oooooooo oo o
ooo oo o oo oo
oo
oo
o
ooo
o
o ooo
ooo
o
o oooooo
o
o oo
o oo
ooo
oo oo
ooo oo oooo o o ooooo
oo
oo
o
ooooo
oo oo
oo
oo
o
ooo ooo oo o
ooo
o
oo
oo
oo
oo
o
ooooo oo o oo
oo oo
oo
o
o
o
oo
oo o
o
o oo ooooo
ooo
ooooo
o oo
o
o o oo o
o
oo
o
o
o
o
o
oo
o
oo
o
o ooo o
o o oooo o
oo
ooo oooo oo
oo ooooo
oo
ooo
o
o
o
ooo
oo
oooooooooo o ooo
o
o
oo ooo
o
oo
o
ooo
ooooo
o
100 160 220
oo
oo
ooo
o
o
oo
o
o o
ooo
ooo
o
oo
oo
oo
ooo
o o
o
oo
oo
o
o
o
oo
o
o
o
oo
o
o
oooo
o
oo
oo
o
o
oo
o
o
o
o
oooo
ooo
o
oo
oo o
ooooo
o
oo
oo
o
o
o
o
oo
o
oo
o
oo
o
o
oo
ooo
o
oo
o
o
ooo
oo
o
o
o
oo
oo
o
oo
o ooo
o
o
oo
o oo
o o
o
o
o
o oooo
ooo
oo
o
o
o
o
oo
o
o
o oo
oo
o
o
oo
o
o
ooo
o
oo
o ooo
oo
o
o
o
oo
oo
o
oooo
oo
oooo
oo
o
o
oooo
ooo
o
o
o
oo
o
o
o
o
oo
oo
oo
oo
oo
o o
oo o
oo
oo o
oooo oooo
o
o
ooo
ooo
o
oo
o
o
o
oo
o
o
oo
o
o
o
oo o
o
oo
o
oo
o
o
oooo
o
oo
oo
oo
oo
oo
o
o
oo
o
o o
o
oo
o
o
o
o oo
o
o
ooo
o
ooo
o
o
o
o
o
o
oooo
o
oo
ooo
o o
oo
o
oo
o
oo
o
ooo o
o
oo
o
o
o
o
oo
o ooo o
o
oo
o
o
o
o
o
o
o
ooo
o
oooo
o
o
oo
oo
o
o
o
ooooo
o ooo
o o
o
o
ooo
o
oo
o
oooo
oo
o
ooo
o
oo
oo oooooo
oo
o
o
o
oooo
oo
o
oo
oo
o
o
o ooo
oo o
o
o
ooo
o
o
o
oo
o
oo
oo ooo o
o
ooo
ooo
oo
o
oo
o
oo
oo
o
o
o
oo
o
o
o
oo
o
o
o
ooo
o
oo
oo
o
o
oo
o
o
o
o
oooo
o
oo
o
oo
oo o
oo
ooo
o
o ooo
o
o
o
o
ooo
ooo
oo
o
o
oo
ooo
o
oo
o
o
o o
o
oo
o
o
o
oo
oo
o
oo
o ooo
o
o
oo
ooo
o o
oo
o
ooo
oo
ooo
oo
o
o
o
o
o oo
o
ooo
oo
o
o
ooo
o
oo o
o
oo
oooo
oo
o
o
o
oo
ooo
oooo
o
o
oooo
oo
o
o
oooo
ooo
o
o
o
oo
o
o
o
o
oo
oo
o
o
oo
oo
oo
ooo
o
o
ooo
oooo o
ooo
o
o
ooo
oo
o
o
oo
o
o
o
oo
o
o
oo
o
o
oo
oo
o
oo
o
oo
o
o
oooo
o
oo
oo
oo
oo
oo
o
o
oo
o
o o
o
oo
o
oo
oo
o
oo
oo
o
o
ooo
o
o
o
o
o
o
oo
oo
o
oo
o oo
oo
ooo
oo
o
oo
o
o ooo
o
oo
o
o
o
o
oo
ooooo
o
oo
o
o
o
o
o
o
o
oo
oo
oo
oo
o
o
ooo o
o
o
o
oo
o oo
oooo
oo
o
o
ooo
o
oo
o
ooo o
ooo
ooo
o
oo
oooooooo
oo
o
o
o
ooo
o
oo
o
oo
oooo
oooo
oo
2 6 10 14
oo
oo
o oo
o
o
oo
o
oo
oo oo
oo
o
oo
oo
oo
ooo
o oo
oo
o o
o
o
o
oo
o
o
o
oo
o
o
o
oo
oo
oo
oo
o
o
oo
o
o
o
o
oo oo
ooo
o
oo
ooo
oo
ooo
o
oo
oo
o
o
o
o
oo
o
oo
oo
o
o
o
oo
o ooo
o o
o
o
ooo
oo
o
o
o
oo
oo
o
oo
oooo
o
o
o
o
ooo
o o
o
o
o
ooo
oo
ooo
oo
o
o
o
o
o oo
o
ooo
oo
o
o
oo
o
o
oo o
o
oo
oo oo
oo
o
o
o
oo
oo
o
oooo
o
o
oooo
oo
o
o
ooo o
ooo
o
o
o
oo
o
o
o
o
oo
oo
oo
oo oo
oo
ooo
oo
ooo
oooooo
oo
o
o
ooo
ooo
o
oo
o
o
o
o o
o
o
oo
o
o
o
oo o
o
ooo
oo
o
o
ooo o
o
o o
oo
oo
oo
ooo
o
oo
o
oo
o
oo
o
o
o
oo
o
o
o
ooo
o
ooo
o
o
o
o
o
o
oo
o o
o
o o
ooo
oo
oo
o
oo
o
o
o
o
oooo
o
oo
o
o
o
o
oo
ooooo
o
oo
o
o
o
o
o
o
ooo
oo
oo
oo
o
o
oo
oo
o
o
o
oo
ooo
o o oo
oo
o
o
ooo
o
oo
o
o ooo
ooo
oo o
o
oo
oo ooooo o
oo
o
o
o
oooo
o o
o
oo
oo
o
o
o ooo
oo
oo
oooo
o
o
o
oo
o
oo
ooo ooo
o
oo
oo
ooo oo
oo
o
oo
oo
o
o
o
oo
o
o
o
oo
o
o
oo
oo
o
oo
oo
o
o
oo
o
o
o
o
oooo
o
oo
o
oo
o oo
oo
o oo
o
oooo
o
o
o
o
ooo
ooo
oo
o
o
oo
ooo
o
oo
o
o
ooo
oo
o
o
o
oo
oo
o
oo
o ooo
o
o
oo
o oo
oo
o
o
o
ooo
oo
ooo
oo
o
o
o
o
ooo
o
ooo
oo
o
o
oo
o
o
o oo
o
oo
oooo
oo
o
o
o
oo
ooo
oo oo
o
o
oooo
oo
o
o
oo oo
ooo
o
o
o
oo
o
o
o
o
oo
oo
oo
oooo
oo
oo o
oo
o oo
o oo ooo oo
o
o
ooo
o
oo
o
o o
o
o
o
oo
o
o
oo
o
o
o
ooo
o
ooo
oo
o
o
oooo
o
oo
oo
oo
oo
ooo
o
oo
o
oo
o
o o
o
o
o
ooo
o
o
oo
o
o
o oo
o
o
o
o
o
o
oo
oo
o
oo
ooo
oo
oo
o
oo
o
oo
o
oooo
o
o o
o
o
o
ooo
ooooo
o
oo
o
o
o
o
o
o
o
ooo
o
ooo
o
o
o
oo
oo
o
o
o
oo
ooooooooo
o
o
ooo
o
oo
o
oooo
ooo
oo o
o
oo
oooo oo oo
ooo
o
o
oo oo
oo
o
oo
oo
o
o
oooo
oo
15 25 35 45
oo
oo
oo
o
o
o
oo
o
oo
ooo
ooo
o
oooo
oo
ooo
oo
o
oo
o o
o
o
o
oo
o
o
o
oo
o
o
o
ooo
o
oo
oo
o
o
oo
o
o
o
o
oooo
oo
o
o
oo
ooo
oo
ooo
o
oo
oo
o
o
o
o
oo
o
ooo
o
o
o
o
oo
ooo
o
o o
o
o
ooo
oo
o
o
o
oo
oo
o
oo
o ooo
o
o
oo
ooo
oo
o
o
o
ooo
oo
ooo
oo
o
o
o
o
oo
o
o
ooo
oo
o
o
oo
o
o
oo o
o
oo
oo oo
oo
o
o
o
oo
oo
o
ooo o
oo
ooo o
oo
o
o
oo oo
ooo
o
o
o
oo
o
o
o
o
oo
oo
o
o
oooo
oo
ooo
oo
ooo
oooooooo
o
o
ooo
ooo
o
o o
o
o
o
oo
o
o
oo
o
o
oooo
o
o o
o
oo
o
o
o ooo
o
oo
o
o
oo
oo
oo
o
o
oo
o
oo
o
o o
o
o
o
oo
o
oo
ooo
o
o oo
o
o
o
o
o
o
oo
oo
o
oo
ooo
oo
ooo
oo
o
oo
o
oooo
o
oo
o
o
o
ooo
o oo
o o
o
oo
o
o
o
o
o
o
oo
oo
o
ooo
o
o
o
oo
oo
o
o
o
oooo o
oo oo
oo
o
o
o ooo
oo
o
oooo
oo
o
oo o
o
o o
oooooo oo
oo
o
o
o
oooo
oo
o
oo
oo
oo
oooo
oo
oo
o
oo
oo
o
o
oo
o
oo
ooo
ooo
o
oooooo
ooo
o o
o
oo
oo
o
o
o
oo
o
o
o
oo
o
o
o
oo
oo
oo
oo
o
o
ooo
o
o
o
oooo
oo
o
o
oo
oo o
oooo o
o
oooo
o
o
o
o
ooo
oooo
o
o
o
oo
ooo
o
oo
o
o
o oo
oo
o
o
o
oo
o
o
o
oo
oo o o
o
o
oo
ooo
oo
o
o
o
ooo
oo
ooo
oo
o
o
o
o
o o
o
o
ooo
ooo
o
ooo
o
ooo
o
oo
oooo
oo
o
o
o
oo
ooo
oooo
oo
o ooo
oo
o
o
oo oo
ooo
o
o
o
oo
o
o
o
o
oo
oo
o
o
oooo
o o
oo o
oo
ooo
ooooooo
o
o
o
o oo
o
oo
o
oo
o
o
o
oo
o
o
oo
o
o
oooo
o
ooo
oo
o
o
oooo
o
oo
oo
oo
oo
ooo
o
oo
o
oo
o
oo
o
o
o
oo
o
o
o
oo
o
o
ooo
o
o
o
o
o
o
oooo
o
oo
o oo
o o
oo
o
oo
o
oo
o
o oo o
o
o o
o
o
o
oo
o
ooooo
o
oo
o
o
o
o
o
o
o
ooo
o
oooo
o
o
oooo
o
o
o
oo
ooo
oooo
o o
o
o
o ooo
oo
o
oo o o
oo
o
oo o
o
oo
oooooooo
ooo
o
o
ooo
o
oo
o
oo
oo
oo
oo oo
oo
20 40 60
2040
60
age
FIGURE 4.12. A scatterplot matrix of the South African heart disease data. Each plot shows a pair of risk factors, and the cases and controls are color coded (red is a case). The variable family history of heart disease (famhist) is binary (yes or no).
3D plot
• use 3D plots to view a surface
Contour Plot
[x, y, z] = peaks; figure; c = contour(x, y, z); clabel(c)
Matlab basics
• http://www.mathworks.com/moler/intro.pdf
>> a = 1
a =
     1
>> b = 1;
>> c = [1 2 3]
c =
     1     2     3
>> c+2
ans =
     3     4     5
EE263 RS1 6
Matlab basics (contd...)
>> c
c =
     1     2     3
>> d = [4;5;6]
d =
     4
     5
     6
>> c*d
ans =
    32
>> e = d.'
e =
     4     5     6
Matlab basics (contd...)
>> c+e
ans =
     5     7     9
>> c.*e
ans =
     4    10    18
>> A = [1 2; 3 4]
A =
     1     2
     3     4
>> A*[1;1]
ans =
     3
     7
Matlab basics (contd...)
>> A(2,1)
ans =
     3
>> A(:,1)
ans =
     1
     3
>> A(2,:)
ans =
     3     4
>> t = 0:2:10
t =
     0     2     4     6     8    10
Table as a matrix
• nutrition chart
                       Vegetable
                x1      x2      ···     xn
           y1   0.50    0.75    ···     0.9
Nutrient   ...  ...     ...     ...     ...
           ym   2.05    0.01    ···     0.45

• vector $x \in \mathbf{R}^n$ is the vegetable diet; $x_j$ is the amount of vegetable $j$
• vector $y \in \mathbf{R}^m$ is the nutrients; $y_i$ is the amount of nutrient $i$
• $y = Ax$ gives the nutrients as a function of the vegetable diet
• $A_{ij}$ = amount of nutrient $i$ in 1 unit of vegetable $j$
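The mapping from diet to nutrients can be tried on a small chart. The sketch below assumes m = 2 nutrients and n = 3 vegetables; the entries of A and the diet x are illustrative values chosen for this example, not data from the slide.

```matlab
% Hypothetical nutrition chart: rows = nutrients, columns = vegetables.
% A(i,j) = amount of nutrient i in 1 unit of vegetable j (illustrative values).
A = [0.50 0.75 0.90;
     2.05 0.01 0.45];

x = [2; 1; 4];          % assumed diet: units of each of the 3 vegetables

y = A*x                 % nutrients obtained from the diet, one entry per nutrient

% Each y(i) is the inner product of row i of A with x.
y1_check = A(1,:)*x;    % same value as y(1)
```

Changing a single entry of x changes y by the corresponding column of A times that change, which is one way to read the columns of the chart.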
Examples
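• $x \in \mathbf{R}^n$
• find $A$ for which $y = Ax$ is the running average of $x$, i.e.,
$$y_i = \frac{1}{i} \sum_{j=1}^{i} x_j, \quad i = 1, \ldots, n$$
Solution.
$$
\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}
=
\begin{bmatrix}
1 & 0 & 0 & 0 & \cdots & 0 \\
1/2 & 1/2 & 0 & 0 & \cdots & 0 \\
1/3 & 1/3 & 1/3 & 0 & \cdots & 0 \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
1/n & 1/n & 1/n & 1/n & \cdots & 1/n
\end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
$$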
Examples
• Creating A in Matlab
n = 5;
A = zeros(n,n);
for i = 1:n
    A(i,1:i) = 1/i;
end
A

A =
    1.0000         0         0         0         0
    0.5000    0.5000         0         0         0
    0.3333    0.3333    0.3333         0         0
    0.2500    0.2500    0.2500    0.2500         0
    0.2000    0.2000    0.2000    0.2000    0.2000
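The same matrix can also be built without a loop and checked against a direct running average. This is a sketch, not part of the original slides; the test vector x is an arbitrary assumption.

```matlab
% Loop-free construction of the running-average matrix.
n = 5;
A = diag(1./(1:n)) * tril(ones(n));   % row i of the lower-triangular ones matrix scaled by 1/i

x = [3; 1; 4; 1; 5];                  % arbitrary test vector (assumed)
y = A*x;                              % running averages via the matrix
y_ref = cumsum(x) ./ (1:n)';          % running averages computed directly

max(abs(y - y_ref))                   % difference should be at rounding-error level
```

The product with diag(...) is used instead of elementwise division of a matrix by a column so that the snippet does not depend on implicit expansion.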
• point estimator
– likelihood function $\ell(\theta|x) = f(x|\theta)$
– maximum likelihood $\hat{\theta}(x) = \arg\max_{\theta} \ell(\theta|x)$
• hypothesis testing
– hypotheses:
$$\begin{cases} H_0 : x_1, \ldots, x_n \sim f_0 \\ H_1 : x_1, \ldots, x_n \sim f_1 \end{cases}$$
– likelihood ratio test
$$\ell(x_1, \ldots, x_n) = \sum_{i=1}^{n} \log \frac{f_1(x_i)}{f_0(x_i)}$$
Claim $H_1$ if $\ell(x_1, \ldots, x_n) > b$, where $b$ is the threshold
• confidence interval: $\hat{\theta} \in [\theta - z_\alpha, \theta + z_\alpha]$ with probability $1 - \alpha$
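The likelihood ratio test above can be illustrated numerically. The sketch below assumes two unit-variance Gaussian densities, f0 = N(0,1) and f1 = N(1,1), a made-up sample, and a threshold b = 0; none of these choices come from the slide.

```matlab
% Log-likelihood ratio test for two assumed densities (illustrative only).
f0 = @(x) exp(-(x - 0).^2/2) / sqrt(2*pi);   % density under H0: N(0,1)
f1 = @(x) exp(-(x - 1).^2/2) / sqrt(2*pi);   % density under H1: N(1,1)

x = [0.2 1.1 0.7 1.5 -0.3];                  % made-up sample
b = 0;                                       % threshold (assumed)

llr = sum(log(f1(x) ./ f0(x)));              % test statistic: sum of log density ratios
claimH1 = llr > b;                           % decide H1 when the statistic exceeds b

% For these two densities log(f1(x)/f0(x)) = x - 1/2,
% so the statistic reduces to sum(x) - n/2.
llr_closed = sum(x) - numel(x)/2;
```

With this pair of densities the test only depends on the sample sum, so comparing llr with b is equivalent to comparing the sample mean with a shifted threshold.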
Prof. Yao Xie, ISyE 6416, Computational Statistics, Georgia Tech 6