Since the Sample Data Form the Only Available Information on Which to Base Inferences


  • 8/12/2019 Since the Sample Data Form the Only Available Information on Which to Base Inferences


    Let us start with the univariate case. Suppose that $x_1, \ldots, x_n$ constitute a random sample
    of size $n$ from a normal population with mean $\mu$ and variance $\sigma^2$. The two statistics
    that summarize all the available sample information about the unknown parameters (i.e. are
    sufficient for $\mu$ and $\sigma^2$) are the sample mean $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$
    and the sample variance $s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2$. Because of this
    sufficiency, all optimal procedures concerning $\mu$ and $\sigma^2$ are based on these two
    statistics, which are therefore the two most common statistics used in univariate inference.
    Both of these statistics possess sampling distributions that describe the likely fluctuations in
    their values from sample to sample, and it is shown in most elementary texts on statistical
    inference that:

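    1) The sampling distribution of $\bar{x}$ is the normal distribution with mean $\mu$ and variance $\sigma^2/n$.

    2) The sampling distribution of $(n-1)s^2$ is $\sigma^2$ times the chi-squared distribution on $(n-1)$ degrees of freedom.

    3) $\bar{x}$ and $s^2$ are statistically independent in their behaviour.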
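The three results can be checked empirically. The following is a minimal simulation sketch, not part of the original text: the parameter values, the seed, and the helper name `simulate_univariate` are illustrative assumptions, and the sample variance is taken with divisor $n-1$.

```python
import numpy as np

def simulate_univariate(mu, sigma, n, reps=20000, seed=0):
    """Draw `reps` normal samples of size n; return their means and variances."""
    rng = np.random.default_rng(seed)
    x = rng.normal(mu, sigma, size=(reps, n))
    xbar = x.mean(axis=1)
    s2 = x.var(axis=1, ddof=1)  # divisor n - 1 (assumed definition of s^2)
    return xbar, s2

# Hypothetical parameter values chosen only for illustration.
mu, sigma, n = 10.0, 2.0, 8
xbar, s2 = simulate_univariate(mu, sigma, n)

# Result 1: xbar should have mean mu and variance sigma^2 / n.
print("mean of xbar:", xbar.mean(), "theory:", mu)
print("var of xbar: ", xbar.var(), "theory:", sigma**2 / n)

# Result 2: (n-1) s^2 / sigma^2 should behave like chi-squared on n-1
# degrees of freedom, which has mean n-1 and variance 2(n-1).
q = (n - 1) * s2 / sigma**2
print("mean of q:", q.mean(), "theory:", n - 1)
print("var of q: ", q.var(), "theory:", 2 * (n - 1))

# Result 3: independence implies zero correlation. A near-zero sample
# correlation is consistent with independence but does not prove it.
corr = np.corrcoef(xbar, s2)[0, 1]
print("corr(xbar, s2):", corr)
```

Matching the first two moments is only a partial check of the distributional claims; a fuller check would compare the whole empirical distribution with the theoretical one.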

    The central limit theorem furthermore establishes that, if $n$ is large, the sampling distribution
    of $\bar{x}$ is approximately normal with mean $\mu$ and variance $\sigma^2/n$, whatever the shape
    of the parent population of the $x_i$. Thus result (1) is applied almost invariably when $n$ is
    sufficiently large (say $n > 25$). No similar approximation exists for result (2), however, and
    result (3) is certainly not true for most non-normal distributions.
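A short simulation can illustrate both points for a skewed parent population. This is a sketch under assumed settings, not taken from the original text: an exponential parent with mean 1 and variance 1, samples of size 30, and an arbitrary seed.

```python
import numpy as np

rng = np.random.default_rng(42)
n, reps = 30, 20000

# Exponential parent with scale 1: mean 1, variance 1, strongly right-skewed.
x = rng.exponential(scale=1.0, size=(reps, n))
xbar = x.mean(axis=1)
s2 = x.var(axis=1, ddof=1)

# Standardise xbar with the true mean and variance. Under the normal
# approximation, about 95% of values should lie within +/- 1.96.
z = (xbar - 1.0) / np.sqrt(1.0 / n)
coverage = np.mean(np.abs(z) < 1.96)
print("fraction of |z| < 1.96:", coverage)

# For a skewed parent, samples with a large mean also tend to have a
# large variance, so xbar and s^2 are correlated and hence not independent.
corr = np.corrcoef(xbar, s2)[0, 1]
print("corr(xbar, s2):", corr)
```

The coverage should come out close to 0.95 even though the parent is far from normal, while the correlation between the sample mean and the sample variance should be clearly positive rather than near zero.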

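    Now consider the multivariate analogue of the above situation. Here $\mathbf{x}_1, \mathbf{x}_2,
    \ldots, \mathbf{x}_n$ constitute a random sample of $p$-vectors from a multivariate normal
    population with mean vector $\boldsymbol{\mu}$ and dispersion matrix $\boldsymbol{\Sigma}$. It can
    be shown readily that the two statistics containing all the available sample information about the
    unknown parameters, i.e. that are sufficient for $\boldsymbol{\mu}$ and $\boldsymbol{\Sigma}$, are
    the sample mean vector $\bar{\mathbf{x}} = \frac{1}{n}\sum_{i=1}^{n} \mathbf{x}_i$ and the sample
    covariance matrix $\mathbf{S} = \frac{1}{n-1}\sum_{i=1}^{n} (\mathbf{x}_i - \bar{\mathbf{x}})(\mathbf{x}_i - \bar{\mathbf{x}})'$.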

    These two statistics have already been used extensively in the descriptive techniques of Part I.
    It is worth recollecting that the sample mean vector is just the vector of sample means of each of
    the variates, while the $(j,k)$th element of $\mathbf{S}$ contains the sample covariance $s_{jk}$
    between the $j$th and $k$th variates (if $j \neq k$) or the sample variance $s_{kk}$ of the $k$th
    variate (if $j = k$).
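As a concrete illustration, the two statistics can be computed directly from a data matrix. This is a minimal sketch assuming an $n \times p$ layout with one row per observation and divisor $n-1$ for the covariance; the data are randomly generated placeholders and the function name `sample_mean_and_cov` is hypothetical.

```python
import numpy as np

def sample_mean_and_cov(X):
    """Return the sample mean vector and sample covariance matrix of X (n x p)."""
    n = X.shape[0]
    xbar = X.mean(axis=0)          # vector of per-variate sample means
    D = X - xbar                   # deviations from the mean vector
    S = D.T @ D / (n - 1)          # (j, k) element: covariance of variates j and k
    return xbar, S

# Placeholder data: 12 observations on 3 variates.
rng = np.random.default_rng(1)
X = rng.normal(size=(12, 3))
xbar, S = sample_mean_and_cov(X)

# Diagonal elements are the sample variances of the individual variates.
variances = X.var(axis=0, ddof=1)
print("mean vector:", xbar)
print("diagonal of S equals variances:", np.allclose(np.diag(S), variances))

# NumPy's built-in estimator uses the same n - 1 divisor by default.
print("matches np.cov:", np.allclose(S, np.cov(X, rowvar=False)))
```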

    It is also convenient to use the symbol $\mathbf{C}$ to denote the sample sum of squares and
    products (SSP) matrix, so that $\mathbf{C} = (n-1)\mathbf{S}$. Now, we saw in Chapter 7 that the
    multivariate normal and Wishart distributions are the multivariate analogues of the univariate
    normal and chi-squared distributions respectively. Hence it will come as no surprise to find that:

    (1) The sampling distribution of $\bar{\mathbf{x}}$ is the multivariate normal distribution with mean vector $\boldsymbol{\mu}$ and dispersion matrix $\frac{1}{n}\boldsymbol{\Sigma}$.

    (2) The sampling distribution of $\mathbf{C} = (n-1)\mathbf{S}$ is the Wishart distribution with $(n-1)$ degrees of freedom and parameter $\boldsymbol{\Sigma}$, and

    (3) $\bar{\mathbf{x}}$ and $\mathbf{C}$ are statistically independent in their behaviour.
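These multivariate results can also be probed by simulation. The sketch below is illustrative only: the mean vector, dispersion matrix, sample size, seed, and the helper name `simulate_mvn_statistics` are assumptions, not values from the original text. It relies on the standard fact that a Wishart matrix with $(n-1)$ degrees of freedom and parameter $\boldsymbol{\Sigma}$ has expectation $(n-1)\boldsymbol{\Sigma}$.

```python
import numpy as np

def simulate_mvn_statistics(mu, Sigma, n, reps=20000, seed=0):
    """Return the mean vector and SSP matrix C of `reps` samples of size n."""
    rng = np.random.default_rng(seed)
    X = rng.multivariate_normal(mu, Sigma, size=(reps, n))  # shape (reps, n, p)
    xbar = X.mean(axis=1)                                   # shape (reps, p)
    D = X - xbar[:, None, :]                                # deviations per sample
    C = np.einsum("rij,rik->rjk", D, D)                     # shape (reps, p, p)
    return xbar, C

# Hypothetical population parameters chosen only for illustration.
mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])
n = 10
xbar, C = simulate_mvn_statistics(mu, Sigma, n)

# Result (1): the mean vectors should scatter with dispersion matrix Sigma / n.
cov_xbar = np.cov(xbar, rowvar=False)
print("dispersion of xbar:\n", cov_xbar, "\ntheory:\n", Sigma / n)

# Result (2): the average SSP matrix should be close to (n - 1) * Sigma.
mean_C = C.mean(axis=0)
print("mean of C:\n", mean_C, "\ntheory:\n", (n - 1) * Sigma)

# Result (3): a near-zero correlation between an element of xbar and an
# element of C is consistent with independence, though it does not prove it.
corr = np.corrcoef(xbar[:, 0], C[:, 0, 0])[0, 1]
print("corr(xbar_1, C_11):", corr)
```

As in the univariate case, agreement of means and dispersions checks only some consequences of the stated distributions, not the distributions themselves.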
