Since the Sample Data Form the Only Available Information on Which to Base Inferences

8/12/2019 Since the Sample Data Form the Only Available Information on Which to Base Inferences


Let us start with the univariate case. Suppose that $x_1, \ldots, x_n$ constitute a random sample of size $n$ from a normal population with mean $\mu$ and variance $\sigma^2$. The two statistics that summarize all the available sample information about the unknown parameters (i.e. are sufficient for $\mu$ and $\sigma^2$) are the sample mean $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ and the sample variance $s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2$. Because of this sufficiency, all optimal procedures concerning $\mu$ and $\sigma^2$ are based on these two statistics, which are therefore the two most common statistics used in univariate inference. Both of these statistics possess sampling distributions that describe the likely fluctuations in their values from sample to sample, and it is shown in most elementary texts on statistical inference that:

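1) The sampling distribution of the sample mean $\bar{x}$ is the normal distribution with mean $\mu$ and variance $\sigma^2/n$.

2) The sampling distribution of $(n-1)s^2$, where $s^2$ is the sample variance, is $\sigma^2$ times the chi-squared distribution with $(n-1)$ degrees of freedom.

3) $\bar{x}$ and $s^2$ are statistically independent in their behaviour.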
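The three results above can be checked informally by simulation. The following is a minimal sketch rather than a proof: the values of `n`, `mu`, `sigma`, `reps` and the seed are arbitrary assumptions chosen for illustration, and `pearson` is a small helper written here, not part of any library. Note that a near-zero correlation is only consistent with independence; it does not establish it.

```python
import math
import random
import statistics


def pearson(a, b):
    """Sample correlation coefficient between two equal-length sequences."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    saa = sum((u - ma) ** 2 for u in a)
    sbb = sum((v - mb) ** 2 for v in b)
    return sab / math.sqrt(saa * sbb)


# Illustrative settings (assumptions, not taken from the text).
n, mu, sigma, reps = 10, 5.0, 2.0, 5000
rng = random.Random(0)

means, variances = [], []
for _ in range(reps):
    x = [rng.gauss(mu, sigma) for _ in range(n)]
    means.append(statistics.fmean(x))
    variances.append(statistics.variance(x))  # divisor n - 1

# Result (1): x-bar should have mean mu and variance sigma^2 / n.
mean_of_means = statistics.fmean(means)
var_of_means = statistics.variance(means)

# Result (2): (n - 1) s^2 / sigma^2 should behave like chi-squared on
# n - 1 degrees of freedom, which has mean n - 1 and variance 2(n - 1).
scaled = [(n - 1) * v / sigma**2 for v in variances]
mean_scaled = statistics.fmean(scaled)
var_scaled = statistics.variance(scaled)

# Result (3): independence implies zero correlation between x-bar and s^2.
corr = pearson(means, variances)

print(f"mean of x-bar      {mean_of_means:.3f}  (theory {mu})")
print(f"variance of x-bar  {var_of_means:.3f}  (theory {sigma**2 / n})")
print(f"mean of scaled s^2 {mean_scaled:.3f}  (theory {n - 1})")
print(f"var of scaled s^2  {var_scaled:.3f}  (theory {2 * (n - 1)})")
print(f"corr(x-bar, s^2)   {corr:.3f}  (theory 0)")
```

With a few thousand replications the simulated moments should sit close to the theoretical values, up to Monte Carlo error.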

The central limit theorem furthermore establishes that, if $n$ is large, the sampling distribution of $\bar{x}$ is approximately $N(\mu, \sigma^2/n)$, whatever the shape of the parent population of the $x_i$. Thus result (1) is applied almost invariably when $n$ is sufficiently large (say $n > 25$). No similar approximation exists for result (2), however, and result (3) is certainly not true for most non-normal distributions.
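To illustrate the contrast drawn above, the sketch below repeats the experiment with a markedly non-normal parent population. The exponential distribution with mean 1 and variance 1, the sample size `n = 30`, the number of replications and the seed are all assumptions made for illustration. The sample mean should still behave roughly like a normal variate, whereas the sample mean and sample variance should turn out clearly correlated, so result (3) fails.

```python
import math
import random
import statistics


def pearson(a, b):
    """Sample correlation coefficient between two equal-length sequences."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    saa = sum((u - ma) ** 2 for u in a)
    sbb = sum((v - mb) ** 2 for v in b)
    return sab / math.sqrt(saa * sbb)


# Illustrative settings (assumptions): exponential parent, mean 1, variance 1.
n, reps = 30, 5000
pop_mean, pop_var = 1.0, 1.0
rng = random.Random(1)

means, variances = [], []
for _ in range(reps):
    x = [rng.expovariate(1.0) for _ in range(n)]
    means.append(statistics.fmean(x))
    variances.append(statistics.variance(x))  # divisor n - 1

# Result (1), approximately: x-bar has mean mu and variance sigma^2 / n ...
mean_of_means = statistics.fmean(means)
var_of_means = statistics.variance(means)

# ... and about 95% of the x-bar values fall within 1.96 standard errors of mu.
half_width = 1.96 * math.sqrt(pop_var / n)
coverage = sum(abs(m - pop_mean) < half_width for m in means) / reps

# Result (3) breaks down: x-bar and s^2 are positively correlated here.
corr = pearson(means, variances)

print(f"mean of x-bar {mean_of_means:.3f}, variance {var_of_means:.4f}")
print(f"share within 1.96 s.e. of mu: {coverage:.3f}")
print(f"corr(x-bar, s^2): {corr:.3f}")
```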

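Now consider the multivariate analogue of the above situation. Here $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n$ constitute a random sample of $p$-vectors from a multivariate normal population with mean vector $\boldsymbol{\mu}$ and dispersion matrix $\boldsymbol{\Sigma}$. It can be shown readily that the two statistics containing all the available sample information about the unknown parameters, i.e. that are sufficient for $\boldsymbol{\mu}$ and $\boldsymbol{\Sigma}$, are the sample mean vector $\bar{\mathbf{x}} = \frac{1}{n}\sum_{i=1}^{n} \mathbf{x}_i$ and the sample covariance matrix $\mathbf{S} = \frac{1}{n-1}\sum_{i=1}^{n} (\mathbf{x}_i - \bar{\mathbf{x}})(\mathbf{x}_i - \bar{\mathbf{x}})'$.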

These two statistics have already been used extensively in the descriptive techniques of Part I. It is worth recollecting that the sample mean vector is just the vector of sample means of each of the variates, while the $(j,k)$th element of $\mathbf{S}$ contains the sample covariance $s_{jk}$ between the $j$th and $k$th variates (if $j \neq k$) or the sample variance $s_{kk}$ of the $k$th variate (if $j = k$).
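As a concrete illustration of these two statistics, the sketch below computes the sample mean vector and the sample covariance matrix for a small data matrix with NumPy. The data are randomly generated and purely hypothetical; the only assumption about layout is that rows are observations and columns are variates.

```python
import numpy as np

# Hypothetical data: n = 8 observations (rows) on p = 3 variates (columns).
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))
n, p = X.shape

# Sample mean vector: the mean of each column.
xbar = X.mean(axis=0)

# Sample covariance matrix with divisor n - 1.
D = X - xbar                 # deviations from the column means
S = D.T @ D / (n - 1)

# Diagonal elements are the sample variances of the individual variates;
# off-diagonal elements are the sample covariances between pairs of variates.
print("sample mean vector:", xbar)
print("sample covariance matrix:\n", S)
```

The same matrix is returned by `np.cov(X, rowvar=False)`, which also uses the divisor $n-1$ by default.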

It is also convenient to use the symbol $\mathbf{C}$ to denote the sample sum of squares and products (SSP) matrix, so that $\mathbf{C} = (n-1)\mathbf{S}$. Now, we saw in Chapter 7 that the multivariate normal and Wishart distributions are the multivariate analogues of the univariate normal and chi-squared distributions respectively. Hence it will come as no surprise to find that:

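(1) The sampling distribution of the sample mean vector $\bar{\mathbf{x}}$ is the multivariate normal distribution with mean vector $\boldsymbol{\mu}$ and dispersion matrix $\frac{1}{n}\boldsymbol{\Sigma}$.

(2) The sampling distribution of $\mathbf{C} = (n-1)\mathbf{S}$ is the Wishart distribution with $(n-1)$ degrees of freedom and parameter $\boldsymbol{\Sigma}$, and

(3) $\bar{\mathbf{x}}$ and $\mathbf{C}$ are statistically independent in their behaviour.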
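A rough Monte Carlo check of these multivariate results is sketched below. The mean vector, dispersion matrix, sample size, number of replications and seed are all illustrative assumptions. The check relies on two known consequences of the results: the dispersion matrix of the sample mean vector is $\frac{1}{n}\boldsymbol{\Sigma}$, and a Wishart matrix with $(n-1)$ degrees of freedom and parameter $\boldsymbol{\Sigma}$ has mean $(n-1)\boldsymbol{\Sigma}$. As in the univariate case, a near-zero correlation is only consistent with independence and does not prove it.

```python
import numpy as np

# Illustrative population parameters (assumptions, not taken from the text).
mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])
n, reps = 12, 20000

rng = np.random.default_rng(42)

# Draw `reps` independent samples, each of n bivariate normal observations.
X = rng.multivariate_normal(mu, Sigma, size=(reps, n))   # shape (reps, n, p)

# Sample mean vector and SSP matrix for every replicate.
xbar = X.mean(axis=1)                                    # shape (reps, p)
D = X - xbar[:, None, :]                                 # deviations
C = np.einsum("rij,rik->rjk", D, D)                      # shape (reps, p, p)

# Result (1): the mean vectors scatter around mu with dispersion Sigma / n.
mean_xbar = xbar.mean(axis=0)
cov_xbar = np.cov(xbar, rowvar=False)

# Result (2): the average SSP matrix should be close to (n - 1) * Sigma.
mean_C = C.mean(axis=0)

# Result (3): e.g. the first mean and the first diagonal SSP element
# should be uncorrelated.
corr = np.corrcoef(xbar[:, 0], C[:, 0, 0])[0, 1]

print("dispersion of mean vector:\n", cov_xbar, "\ntheory:\n", Sigma / n)
print("average SSP matrix:\n", mean_C, "\ntheory:\n", (n - 1) * Sigma)
print("corr(first mean, C[0,0]):", round(float(corr), 3))
```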

