
  • Statistical Inference

    Notes of

    David Casado de Lucas

    You can decide not to print this file and consult it in digital format – paper and ink will be saved. Otherwise, print it on recycled paper, double-sided and with less ink. Be ecological. Thank you very much.

    3 December 2017

    http://www.Casado-D.org/edu/teaching.html
    http://www.Casado-D.org/edu/index.html

  • Contents Inference Theory Point Estimations Confidence Intervals Hypothesis Tests Appendixes More On Statistics Exercises and Problems

    Additional Theory

    Statistical Kitchen

    Further Readings

    Probability Theory Mathematics

    Tables of Estimators and Statistics

    Names of sections are usually links too.

    To use these textboxes you must overwrite the file or save it with a different name.

    Errata and linguistic errors are corrected as soon as possible. You may want to update (download and overwrite) the version of this file that you have.

    Links to the beginning of the document

    and the chapter, respectively.


    http://www.casado-d.org/edu/NotesStatisticalInference-Slides.pdf

  • This file contains the slides that I am writing for my students. I try to consider the pieces of advice included in:

    http://www.Casado-D.org/edu/GuideForStudents-Slides.pdf

    Solved exercises and problems are available at:

    http://www.Casado-D.org/edu/ExercisesProblemsStatisticalInference.pdf

    Prologue

    This document has been created with Linux, LibreOffice, OpenOffice, GIMP and R. (They allow me to work with my old computers.) I thank those who make this software available for free. I donate funds to these kinds of project from time to time.

    Acknowledgements

    3


  • Motivation

    One of two ways in which students can use this new book is as a supplementary text in a course that demands some statistical thinking but does not focus on statistics. The other use is as a self-teaching preparation for a course that does focus on statistics. It has been my observation, and that of my colleagues, that it is possible for a student to complete such a course without ever really thinking about statistics. Many students learn to do the required calculations but have only the foggiest conception of what the calculations mean.

    [...]

    You may be planning to study statistics not because you want to but because you have to. If so, I know how you feel. I went through the same experience years ago; if I could have avoided statistics, I probably would have. However, my attitude changed after I began to study it, for I discovered in it a new way of thinking that was truly fascinating.

    But your present task may be even more challenging than mine was. You won't have to do the computations that I did, but you are about to acquire within a very short time (and possibly by yourself) the same grasp of the underlying structure of statistics that I acquired in two full semesters under an excellent teacher.

    (From: How to Think about Statistics. Phillips, J.L. W.H. Freeman and Company.)

    4

  • Motivation

    Something [...] did happen with the draft lottery in the United States in 1970. People were assigned draft numbers on the basis of their birth dates, with a low number indicating a greater chance of being inducted. The 366 dates were put into capsules, mixed, and drawn and assigned lottery numbers 1, 2, etc. Apparently, the capsules were not mixed very well – people born in December had lottery numbers that averaged 121.5, which is pretty far away from the average of the numbers 1-366, namely 183.5. Steps were taken with the 1971 draft lottery to make the results more random by drawing both the date and the lottery number from drums, after mixing them more thoroughly.

    [...]

    A good example of a nonrandom sample was the 1936 Literary Digest presidential election poll. The Literary Digest had 2 million people respond to its poll, which is a much larger number than would have been needed to get an accurate result if the sample had been selected randomly. However, the poll predicted that Alfred Landon would be an easy winner, whereas in fact Franklin D. Roosevelt won by a landslide. The problem was that the Digest sample was not a random sample. The magazine mailed out cards to people whose names were obtained from telephone lists and other sources, but the people who had telephones at that time were not representative of the population as a whole. If a sample is not selected randomly, there is no way to estimate how far off it might be.

    (From: Business Statistics. Downing, D., and J. Clark. Barron's.)

    5

  • Suppose you are taking a 20-question multiple-choice exam. Each question has four possible answers, so the probability is .25 that you can answer a question correctly by guessing. What is the probability that you can get at least 10 questions right by pure guessing?

    (From: Business Statistics. Downing, D., and J. Clark. Barron's Educational Series.)
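A quick numerical sketch for this exam-guessing question (my own illustration, not part of the original slides): the number of correct guesses follows a binomial distribution with n = 20 and p = 0.25, and the question asks for the upper-tail probability P(X ≥ 10).

```python
from math import comb

# Number of correct guesses X ~ Bin(n = 20, p = 0.25).
n, p = 20, 0.25

# P(X >= 10): sum the binomial probabilities over the upper tail.
prob = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(10, n + 1))
print(round(prob, 4))  # roughly 0.0139
```

The chance of passing by pure guessing is thus well under 2%.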

    A sporting goods store operates in a medium-sized shopping mall. In order to plan staffing levels, the manager has asked for your assistance to determine if there is strong evidence that Monday sales are higher than Saturday sales.

    (From: Statistics for Business and Economics. Newbold, P., W. Carlson and B. Thorne. Pearson-Prentice Hall.)
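One way such a comparison could start (a sketch with hypothetical sales figures, not from the book): compute the two-sample t statistic for H0: the Monday and Saturday means are equal, against the one-sided alternative that Monday sales are higher. A large positive value is evidence for higher Monday sales.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical daily sales figures, for illustration only.
monday = [220, 240, 230, 250, 260]
saturday = [200, 210, 190, 220, 205]

# Two-sample t statistic (unequal-variances, Welch form).
n1, n2 = len(monday), len(saturday)
t = (mean(monday) - mean(saturday)) / sqrt(variance(monday) / n1 + variance(saturday) / n2)
print(round(t, 2))  # compare with a critical value from the t distribution
```

The decision rule itself (critical value or p-value from the t distribution) is developed in the Hypothesis Tests chapter.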

    Motivation 6

  • Probability Theory – Review of concepts. Basic formulas. Some well-known distributions. Continuous probability models linked to the normal distribution: χ², t and F. Sums and sequences of independent random variables: theorems, modes of convergence, the central limit theorem.

    Inference Theory – Concept of sample. Types of sampling. Concepts of statistic and estimator. Sampling distribution. Main statistics and how they are used.

    Point Estimations – Estimation. Estimators of μ and σ²: sample mean, sample proportion, sample variance, difference of means and difference of proportions, ratio of variances. Methods to estimate θ: maximum likelihood method and method of moments. Properties of the estimators: unbiasedness, mean square error, efficiency, consistency.

    Confidence Intervals – Concept of confidence interval. Construction of confidence intervals: the method of the pivotal quantity. Main cases: mean, proportion, variance, difference of means, difference of proportions, quotient of variances. Margin of error. Minimum sample size.

    Hypothesis Tests – Types of statistical hypotheses. Type I and type II errors. Critical or rejection region. P-value. Parametric tests on the mean, proportion, variance, difference of means, difference of proportions or quotient of variances. Power function. Likelihood ratio tests. Analysis of variance (ANOVA). Nonparametric tests: goodness-of-fit, independence, homogeneity. Chi-square tests. Kolmogorov-Smirnov tests.

    7

    Syllabus

  • Probability Theory – Review of concepts. Basic formulas. Some well-known distributions. Continuous probability models linked to the normal distribution: χ², t and F. Sums and sequences of independent random variables: theorems, modes of convergence, the central limit theorem.

    Inference Theory – Concept of sample. Types of sampling. Concepts of statistic and estimator. Sampling distribution. Main statistics and how they are used.

    Point Estimations – Estimation. Estimators of μ and σ²: sample mean, sample proportion, sample variance, difference of means and difference of proportions, ratio of variances. Methods to estimate θ: maximum likelihood method and method of moments. Properties of the estimators: unbiasedness, mean square error, efficiency, consistency.

    Confidence Intervals – Concept of confidence interval. Construction of confidence intervals: the method of the pivotal quantity. Main cases: mean, proportion, variance, difference of means, difference of proportions, quotient of variances. Margin of error. Minimum sample size.

    Hypothesis Tests – Types of statistical hypotheses. Type I and type II errors. Critical or rejection region. P-value. Parametric tests on the mean, proportion, variance, difference of means, difference of proportions or quotient of variances. Power function. Likelihood ratio tests. Analysis of variance (ANOVA). Nonparametric tests: goodness-of-fit, independence, homogeneity. Chi-square tests. Kolmogorov-Smirnov tests.

    What is the probability that something occurs? How do we calculate the average or the spread of a quantity? ...

    Main concepts of Probability: random experiment, random variable, probability function, distribution function, mean, variance, etc. Main discrete and continuous models: Bernoulli, binomial, Poisson, uniform, normal, etc. New models of probability distributions: χ², t and F. Calculation of probabilities and quantiles.

    8

    Syllabus

  • Probability Theory – Review of concepts. Basic formulas. Some well-known distributions. Continuous probability models linked to the normal distribution: χ², t and F. Sums and sequences of independent random variables: theorems, modes of convergence, the central limit theorem.

    Inference Theory – Concept of sample. Types of sampling. Concepts of statistic and estimator. Sampling distribution. Main statistics and how they are used.

    Point Estimations – Estimation. Estimators of μ and σ²: sample mean, sample proportion, sample variance, difference of means and difference of proportions, ratio of variances. Methods to estimate θ: maximum likelihood method and method of moments. Properties of the estimators: unbiasedness, mean square error, efficiency, consistency.

    Confidence Intervals – Concept of confidence interval. Construction of confidence intervals: the method of the pivotal quantity. Main cases: mean, proportion, variance, difference of means, difference of proportions, quotient of variances. Margin of error. Minimum sample size.

    Hypothesis Tests – Types of statistical hypotheses. Type I and type II errors. Critical or rejection region. P-value. Parametric tests on the mean, proportion, variance, difference of means, difference of proportions or quotient of variances. Power function. Likelihood ratio tests. Analysis of variance (ANOVA). Nonparametric tests: goodness-of-fit, independence, homogeneity. Chi-square tests. Kolmogorov-Smirnov tests.

    How should we study a characteristic of a population? How can we take a representative collection of data? How can we infer population characteristics from sample information? ...

    Main concepts of Statistical Inference: population, sample, types of sampling, statistics, sampling distribution, etc.

    9

    Syllabus

  • Probability Theory – Review of concepts. Basic formulas. Some well-known distributions. Continuous probability models linked to the normal distribution: χ², t and F. Sums and sequences of independent random variables: theorems, modes of convergence, the central limit theorem.

    Inference Theory – Concept of sample. Types of sampling. Concepts of statistic and estimator. Sampling distribution. Main statistics and how they are used.

    Point Estimations – Estimation. Estimators of μ and σ²: sample mean, sample proportion, sample variance, difference of means and difference of proportions, ratio of variances. Methods to estimate θ: maximum likelihood method and method of moments. Properties of the estimators: unbiasedness, mean square error, efficiency, consistency.

    Confidence Intervals – Concept of confidence interval. Construction of confidence intervals: the method of the pivotal quantity. Main cases: mean, proportion, variance, difference of means, difference of proportions, quotient of variances. Margin of error. Minimum sample size.

    Hypothesis Tests – Types of statistical hypotheses. Type I and type II errors. Critical or rejection region. P-value. Parametric tests on the mean, proportion, variance, difference of means, difference of proportions or quotient of variances. Power function. Likelihood ratio tests. Analysis of variance (ANOVA). Nonparametric tests: goodness-of-fit, independence, homogeneity. Chi-square tests. Kolmogorov-Smirnov tests.

    How can the real value of a population measure or parameter be approximated? Which estimator should we consider? How can the quality of an estimator be measured? How does an estimator behave when the amount of information increases? ...

    Two methods to find estimators of any parameter θ. Properties and quality of all these estimators. Some well-known estimators of the population measures μ and σ².

    10

    Syllabus

  • Probability Theory – Review of concepts. Basic formulas. Some well-known distributions. Continuous probability models linked to the normal distribution: χ², t and F. Sums and sequences of independent random variables: theorems, modes of convergence, the central limit theorem.

    Inference Theory – Concept of sample. Types of sampling. Concepts of statistic and estimator. Sampling distribution. Main statistics and how they are used.

    Point Estimations – Estimation. Estimators of μ and σ²: sample mean, sample proportion, sample variance, difference of means and difference of proportions, ratio of variances. Methods to estimate θ: maximum likelihood method and method of moments. Properties of the estimators: unbiasedness, mean square error, efficiency, consistency.

    Confidence Intervals – Concept of confidence interval. Construction of confidence intervals: the method of the pivotal quantity. Main cases: mean, proportion, variance, difference of means, difference of proportions, quotient of variances. Margin of error. Minimum sample size.

    Hypothesis Tests – Types of statistical hypotheses. Type I and type II errors. Critical or rejection region. P-value. Parametric tests on the mean, proportion, variance, difference of means, difference of proportions or quotient of variances. Power function. Likelihood ratio tests. Analysis of variance (ANOVA). Nonparametric tests: goodness-of-fit, independence, homogeneity. Chi-square tests. Kolmogorov-Smirnov tests.

    Why should we base an estimation on a unique numerical value (and its standard deviation)? What is the level of certainty of this value? How can we select an interval of values around the unknown population value? What is the level of certainty of an interval? What is the maximum error (in probability) of an interval? Given the maximum error, what is the minimum number of data necessary to guarantee it?

    Main statistics to study the population measures μ and σ².

    Method of the pivotal quantity to construct confidence intervals. Confidence. Margin of error. Minimum sample size.

    11

    Syllabus

  • Probability Theory – Review of concepts. Basic formulas. Some well-known distributions. Continuous probability models linked to the normal distribution: χ², t and F. Sums and sequences of independent random variables: theorems, modes of convergence, the central limit theorem.

    Inference Theory – Concept of sample. Types of sampling. Concepts of statistic and estimator. Sampling distribution. Main statistics and how they are used.

    Point Estimations – Estimation. Estimators of μ and σ²: sample mean, sample proportion, sample variance, difference of means and difference of proportions, ratio of variances. Methods to estimate θ: maximum likelihood method and method of moments. Properties of the estimators: unbiasedness, mean square error, efficiency, consistency.

    Confidence Intervals – Concept of confidence interval. Construction of confidence intervals: the method of the pivotal quantity. Main cases: mean, proportion, variance, difference of means, difference of proportions, quotient of variances. Margin of error. Minimum sample size.

    Hypothesis Tests – Types of statistical hypotheses. Type I and type II errors. Critical or rejection region. P-value. Parametric tests on the mean, proportion, variance, difference of means, difference of proportions or quotient of variances. Power function. Likelihood ratio tests. Analysis of variance (ANOVA). Nonparametric tests: goodness-of-fit, independence, homogeneity. Chi-square tests. Kolmogorov-Smirnov tests.

    Can a population measure be considered bigger than five (e.g.)? Is a variable distributed with the same spread in two different populations? Are two populations really different? Should a probability distribution be used to represent the variable of an experiment? ...

    Main concepts to test hypotheses: types of hypotheses, type I and type II errors, methodologies, certainty of the decision, significance, etc. Parametric tests: questions based on the population measures or a parameter. Analysis of variance: comparison of the means of several populations. Nonparametric tests: questions based on general characteristics of the population.

    12

    Syllabus

  • There are three main chapters and some additional ones. The last chapters may be read first when preparing the subject, since they are tools.

    Within each chapter, there is a main body of slides with the basic contents, plus some appendixes with complementary ideas that students should or can use according to their background and interests.

    Theory is difficult to understand without solving exercises (and quality is more important than quantity). A document with dozens of solved exercises is available. There are many proposed exercises, which can be used for self-evaluation, with their solutions at the end of the chapters.

    The slides with the practicals are also at the end of each chapter.

    A good way of learning Statistical Inference may be based on the alternating use of the textbook, these slides (which may help to organize concepts and ideas, since they are intended not only for lectures but also for students to read autonomously) and the exercises.

    How to Use These Slides 13

  • Symbols – How to Use These Slides

    Some slides contain or are marked with one of the following symbols, which mean:

    It is useful for you to get a general view of the document

    This result plays a role, although we do not use it directly but through an easier form.

    This slide mentions some further readings you may consider.

    This formula should be looked at carefully, to understand it thoroughly.

    This slide or section contains steps that may be useful for beginners.

    This slide contains tricky uses of Statistics you should be aware of.

    14

  • [1] Downing, D., and J. Clark. Business Statistics. Barron's Educational Series.

    [2] Frank, H., and S.C. Althoen. Statistics: Concepts and Applications. Cambridge University Press.

    [3] Grimmett, G., and D. Stirzaker. Probability and Random Processes. Oxford University Press.

    [4] Mendenhall, W., D.D. Wackerly and R.L. Scheaffer. Mathematical Statistics with Applications. Duxbury Press.

    [5] Newbold, P., W. Carlson and B. Thorne. Statistics for Business and Economics. Pearson-Prentice Hall.

    [6] Pérez, C. Técnicas de muestreo estadístico. Garceta.

    [7] Serfling, R.J. Approximation Theorems of Mathematical Statistics. John Wiley & Sons.

    [8] Wikipedia. http://www.wikipedia.org/

    References (I) Theory 15

  • http://www.picgifs.com

    http://actividades.parabebes.com

    The Cartoon Guide to Statistics. Larry Gonick and Woollcott Smith. Harper.

    http://blogs.20minutos.es/que-paso-en-el-mundial/2014/05/ (Photos: FIFA)

    http://all-free-download.com/

    (I have not been able to find other links, and the source is not always clear.)

    References (II) Symbols 16

  • [1] Downing, D., and J. Clark. Business Statistics. Barron's Educational Series.

    [2] Kazmier, L.J. Business Statistics. McGraw Hill.

    [3] Mann, P.S. Introductory Statistics. John Wiley & Sons, Inc.

    [4] Miller, I., and M. Miller. John E. Freund's Mathematical Statistics with Applications. Pearson.

    [5] Newbold, P., W. Carlson and B. Thorne. Statistics for Business and Economics. Pearson-Prentice Hall.

    [6] Spiegel, M.R., and L.J. Stephens. Statistics. McGraw Hill.

    [7] Wikipedia. http://www.wikipedia.org/

    [8] The R Project for Statistical Computing. http://www.r-project.org/

    [9] Materials of my Department.

    References (III) Exercises and Problems 17

  • References (IV) My Documents 18

    [1] A Brief Guide for Students. http://www.Casado-D.org/edu/GuideForStudents-Slides.pdf

    [2] Notes of Probability Theory. http://www.Casado-D.org/edu/NotesProbabilityTheory-Slides.pdf

    [3] Notes of Statistical Inference. http://www.Casado-D.org/edu/NotesStatisticalInference-Slides.pdf

    [4] Solved Exercises and Problems of Statistical Inference. http://www.Casado-D.org/edu/ExercisesProblemsStatisticalInference.pdf

    [5] R Code Applied to Statistics. http://www.Casado-D.org/edu/CodeAppliedToStatistics-Slides.pdf


  • Inference Theory

    19

    Sections

  • Probability Theory – Review of concepts. Basic formulas. Some well-known distributions. Continuous probability models linked to the normal distribution: χ², t and F. Sums and sequences of independent random variables: theorems, modes of convergence, the central limit theorem.

    Inference Theory – Concept of sample. Types of sampling. Concepts of statistic and estimator. Sampling distribution. Main statistics and how they are used.

    Point Estimations – Estimation. Estimators of μ and σ²: sample mean, sample proportion, sample variance, difference of means and difference of proportions, ratio of variances. Methods to estimate θ: maximum likelihood method and method of moments. Properties of the estimators: unbiasedness, mean square error, efficiency, consistency.

    Confidence Intervals – Concept of confidence interval. Construction of confidence intervals: the method of the pivotal quantity. Main cases: mean, proportion, variance, difference of means, difference of proportions, quotient of variances. Margin of error. Minimum sample size.

    Hypothesis Tests – Types of statistical hypotheses. Type I and type II errors. Critical or rejection region. P-value. Parametric tests on the mean, proportion, variance, difference of means, difference of proportions or quotient of variances. Power function. Likelihood ratio tests. Analysis of variance (ANOVA). Nonparametric tests: goodness-of-fit, independence, homogeneity. Chi-square tests. Kolmogorov-Smirnov tests.

    Chapters 20

  • Explain some basic ideas and concepts of Statistics.

    Describe the sampling process – convenience or necessity.

    Present the main kinds of sampling, especially simple random sampling.

    Define the concepts of statistic and sampling distribution.

    Define the concepts of estimator, estimate and estimation.

    Present and motivate the statistics we work with.

    Use software to practice some of the concepts.

    Chapter Goals 21

  • Among the basic ways of selecting the elements of a sample, only simple random sampling will be considered. Why Statistics works is motivated through the convergence of the histogram to the probability function, or of the sample distribution function to its population counterpart (thanks to the laws of large numbers).
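This convergence can be watched directly by simulation (an illustrative sketch, not part of the original slides; the sample size and seed are arbitrary choices): for a fair die, the relative frequencies of the six faces approach the probability function 1/6 as the sample grows.

```python
import random

# Simulate n throws of a fair die and compare the histogram of relative
# frequencies with the probability function, which assigns 1/6 to each face.
random.seed(1)
n = 60_000
counts = {face: 0 for face in range(1, 7)}
for _ in range(n):
    counts[random.randint(1, 6)] += 1

rel_freqs = {face: c / n for face, c in counts.items()}
# By the law of large numbers, each relative frequency is close to 1/6.
print(rel_freqs)
```

Rerunning with a larger n should bring the frequencies even closer to 1/6.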

    The mathematical functions that use the sample to study the population are studied theoretically, for any possible sample, before being used with a particular sample. A few tables contain all the estimators and statistics necessary for our methods.

    For the population quantities of interest, we present the types of statistical problem and the cases that we deal with.

    Basic concepts: randomness, units of measurement, quantities of interest, population, sample, sampling, histograms, use of data, etc.

    Types of problem: point estimations, confidence intervals, hypothesis tests.

    Statistics and estimators: statistics, estimators, statistics made with estimators, sampling distribution, tables of statistics Ts.

    Cases.

    Statistical studies: steps, qualities, useful questions.

    Use of Ts: how T is usually used, notation, framework.

    Appendixes: practicals, guide for students, inference in other fields, etc.

    Advice is given to understand how the estimators and the statistics are used to solve the problems, and how the mathematical notation must be interpreted. Finally, a summary is given with the conditions under which we work.

    Contents 22

  • Sections

    ( Introduction: Basic Concepts )

    Inference Theory

    Types of Problem

    Statistics and Estimators

    Cases

    Statistical Studies

    Use of T's

    23

    ( Appendixes: Practicals)

    Statistical Inference in Other Fields

    A Brief Guide for Students

  • Randomness

    Quantity of interest → variable of interest.

    Deterministic: total knowledge about the process generating the values.

    Random or stochastic: partial knowledge (values and probabilities) about the process generating the values.

    Probability distributions (for the variable) are used to explain the real relation between values and probabilities (quantity of interest).

    Statistics exploits data (some values and their frequencies) in order to select and study a probability distribution model (values and their probabilities) that explains the process of interest.

    This is what we will do, both theoretically and in practice.

    Basic Concepts 24

  • We are frequently interested in a characteristic – variable – of the elements of a group – population –, or we may have interest in how a property behaves in two populations or more. When population variables are supposed to be stochastic, Probability Theory provides a framework to try to explain them. The most important quantities are the measures

    μ = E(X)  and  σ² = Var(X) = E([X − μ]²)

    If any (well-known) parametric probability distribution is used as a model to explain the variable X (model-based approach), we are interested in studying its parameters (e.g. p, λ, μ, σ) or the entire distribution, that is, F_X(x). There is a relation between the measures μ and σ and the parameters (for the normal distribution, μ and σ are directly used as parameters in the density function).

    Basic Concepts 25

  • Studying each element of the population is too time-consuming, expensive or even impossible (the population is infinite, or it is necessary to break or spoil the elements).

    Fortunately, it is possible to consider a sample of elements – usually few with respect to the size of the population – by applying proper sampling techniques, so as to guarantee that the sample is representative and that we will therefore succeed in inferring population information. Additionally, considering a sample can reduce some types of error.

    There are statistical techniques designed to describe the most important characteristics of both models and data (Descriptive Statistics), to select the model that best suits some data or to approximate the parameters of a particular model (Inferential Statistics), or, once a good enough model has been found, to predict or forecast future values (Predictive Statistics).

    When the main statistical process involves distributions with parameters θ, we talk about Parametric Statistics; otherwise, about Nonparametric (or Distribution-Free) Statistics.

    Basic Concepts 26

  • Real World → Probability Model:

    1 coin → X ~ B(p)
    n independent coins → X ~ Bin(n, p)
    1 die → X ~ UnifDisc(6)
    Economic problem → X ~ F with f(x)

    Statistics: methods that use the sample X = {X1,...,Xn}

    (1) to study μ = E(X)
    (2) to study σ² = Var(X)
    (3) to study the parameter(s) θ (and hence possibly μ and σ² too)
    (4) to study a characteristic (e.g. the median of X)
    (5) to study the whole F_X

    Relation between μ, σ and θ

    For most distributions, θ appears in the expressions of μ and σ:

    μ = E(X) = ∫ x f(x) dx        σ² = E([X − E(X)]²) = ∫ (x − μ)² f(x) dx

    (For discrete variables, sums instead of integrals.) Examples:

    Bernoulli: θ = p, and then μ = p and σ² = p(1 − p)
    Poisson: θ = λ, and then μ = λ and σ² = λ
    Normal: θ = (μ, σ), and then the mean is μ and the variance is σ²

    Thus, when we estimate θ we obtain natural (plug-in) point estimates of μ and σ.

    Basic Concepts 27
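The plug-in idea above can be sketched numerically (an illustration with made-up data, not from the slides): for a Bernoulli variable, estimating the parameter p by the sample proportion immediately yields estimates of μ = p and σ² = p(1 − p).

```python
# Illustrative sketch with hypothetical yes/no data: X ~ Bernoulli(p),
# so mu = p and sigma^2 = p(1 - p).  Estimating p gives plug-in
# estimates of both population measures.
data = [1, 0, 1, 1, 0, 1, 0, 1, 1, 1]  # made-up sample of size 10

p_hat = sum(data) / len(data)      # estimate of the parameter p
mu_hat = p_hat                     # plug-in estimate of mu = p
var_hat = p_hat * (1 - p_hat)      # plug-in estimate of sigma^2 = p(1 - p)
print(p_hat, var_hat)
```

The same plug-in reasoning applies to the Poisson and normal examples above.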

  • The Statistical World

    Theory: theoretical population – the variable of interest X, with distribution F(x; θ), density f(x; θ) and measures μ = E(X) and σ² = Var(X); theoretical sample (subset) X = {X1, ..., Xn}.

    Practice: empirical population; empirical sample (subset) – elements of the sample and the variable – with data x = {x1, ..., xn}.

    Inferential process: tools of Probability Theory used for point estimations, confidence intervals, hypothesis tests, other types of problem, and formulas.

    Basic Concepts 28

  • The Statistical World

    Deduction, induction (inference).

    Theory: theoretical population – random variable X with probability function f(x; θ) (for X continuous) and expected histogram; parameters and main measures μ = E(X) and σ² = Var(X); theoretical sample (subset) X = {X1, X2, ..., Xn}; estimator and statistic T(X), e.g. the sample mean (1/N) Σ Xi.

    Practice: empirical population; empirical sample (subset) x = {x1, x2, ..., xn} – possible values of the random variable – and empirical histogram; evaluations of the estimator and the statistic T(x), e.g. the sample mean (1/N) Σ xi.

    Basic Concepts 29

  • Basic Concepts 30

    Randomness

    Having partial knowledge and using only some elements of the population implies, on the one hand, that variables must be assigned a random character (we can only hypothesize about the other elements) and, on the other hand, that the results will have no total certainty, in the sense that statements will be set with some probability. For example: a 95% confidence in applying a method must be interpreted as any other probability: the results are true with probability 0.95 and false with probability 1 − 0.95 = 0.05 (frequently, we will never know whether the method has failed or not).
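This frequency interpretation of 95% confidence can be checked by simulation (an illustrative sketch, not from the slides; all numbers are arbitrary choices): build many 95% confidence intervals from independent samples and count how often they contain the true value.

```python
import random

# Repeatedly construct a 95% confidence interval for the mean of a normal
# population with known sigma, and count how often it covers the true mean.
random.seed(0)
mu, sigma, n, trials = 10.0, 2.0, 30, 2000
z = 1.96  # standard normal quantile for 95% confidence
covered = 0
for _ in range(trials):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    x_bar = sum(sample) / n
    half = z * sigma / n ** 0.5
    if x_bar - half <= mu <= x_bar + half:
        covered += 1

coverage = covered / trials
print(coverage)  # should be close to 0.95
```

About 5% of the intervals miss the true mean, and in practice we never know whether ours is one of them.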

    Units of Measurement

    In Probability Theory, random variables are dimensionless quantities; in real-life problems, variables almost always are not. Since this fact does not usually cause trouble in Statistics, we do not pay much attention to the units of measurement; we can understand that the magnitude of the real-life variable, with no unit of measurement, is the part being modeled with the proper probability distribution and the proper parameter values (of course, units of measurement are not random). To get used to paying attention to the units of measurement and to managing them, they are written in many numerical expressions.

  • Naranjito Has a Question

    Yes. In fact, I have The Question:

    What can these things be used for?

    Basic Concepts 31

  • The Roper Organization conducted a poll in 1992 (Roper, 1992) in which one of the questions asked was whether or not the respondent had ever seen a ghost. Of the 1525 people in the 18 to 29-year-old age group, 212 said yes.

    a. What is the risk of someone in this age group seeing a ghost?
    b. What is the approximate margin of error that accompanies the proportion in (a)?
    c. What is the interval that is 95% certain to contain the actual proportion of people in this age group who have seen a ghost?

    (From: Mind on Statistics. Utts, J.M., and R.F. Heckard. Thomson.)

    The U.S. Senate has 100 members. Information was obtained from the individuals responsible for managing correspondence in 61 senators' offices. Of these, 38 specified a minimum number of letters that must be received on an issue before a form letter in response is created.

    a) Assume these observations constitute a random sample from the population, and find a 90% confidence interval for the proportion of all senators' offices with this policy.
    b) In fact, information was not obtained from a random sample of senate offices. Questionnaires were sent to all 100 offices, but only 61 responded. How does this information influence your view of the answer to part (a)?

    (From: Statistics for Business and Economics. Newbold, P., W. Carlson and B. Thorne. Pearson-Prentice Hall.)
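    As a first numerical contact with these two exercises, the normal-approximation (Wald) interval for a proportion can be computed directly. The following Python sketch is not part of the original notes; the function name and the rounded normal quantiles 1.96 and 1.645 are choices of this sketch:

```python
import math

def proportion_ci(successes, n, confidence=0.95):
    """Normal-approximation (Wald) confidence interval for a proportion."""
    z = {0.90: 1.645, 0.95: 1.96}[confidence]    # rounded normal quantiles
    p_hat = successes / n                        # sample proportion
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, margin, (p_hat - margin, p_hat + margin)

# Ghost poll: 212 "yes" answers out of 1525 people aged 18 to 29
p, m, (lo, hi) = proportion_ci(212, 1525, confidence=0.95)

# Senate offices: 38 out of 61 offices with the policy, 90% confidence
p2, m2, (lo2, hi2) = proportion_ci(38, 61, confidence=0.90)
```

    Here p ≈ 0.139 plays the role of the "risk", m is the margin of error and (lo, hi) the 95% interval; for the second exercise the 90% interval is roughly (0.52, 0.73).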

    Basic Concepts – Real-World Problems

    32

  • Basic Concepts – Real-World Problems

    33

    Well, let's think a little...

    [Population] There is a huge population of 18 to 29-year-old students (it cannot be determined in more detail from the information in the statement).

    [Model] The answer can be no or yes, so we can model any student of the population through a Bernoulli variable X whose parameter would be the probability for any student to answer yes (this is a model-based approach).

    [Sample] A sample of n = 1525 students was gathered (by applying simple random sampling with or without replacement) and the data are (x1,...,x1525), where 212 answers are yes and the others are no.

    [Strategy] We need an estimator of the parameter to use the data. That "risk" they talk about is the probability mentioned, that is, the parameter of the Bernoulli model. We must look up the definition of margin of error and how to calculate it. Finally, we must learn a method to build confidence intervals. On the other hand, we will work with random variables, (X1,...,Xn), while doing the calculations, until the statistical values (x1,...,xn) are finally substituted in the theoretical expressions.

    We must interpret the probability, the margin of error and the confidence interval (in Statistics, there is always a measure of error).

    Thoughts

  • Statistics and Samples

    Population
    Variable: presence of the policy (in a senator's office)
    Interest: distribution, mean and variance of the variable
    Character: since we cannot control all the details of the underlying process, the variable is treated as a random variable

    Probability
    Random variable X ≡ presence, μ = E(X), σ² = Var(X)
    Which distribution explains X best?

    Sampling
    X = {X1,...,Xn}, sample of elements
    Methods: how should the elements of the sample be selected? How many of them?

    Tools
    Concretely, what do we want to study about the variable "presence of the policy"?
    How do we use the sample? How trustworthy will our conclusions be?

    Basic Concepts – Real-World Problem

    34

  • The path from a real-world question to a real-world answer can be represented as a chain of five arrows:

    Real-World Question → (1) → Subject (Probability Theory) → (2) → Mathematical Formulas → (3) → Numerical Results → (4) → Statistical Interpretation → (5) → Real-World Answer

    Quantity partially known: values and frequencies, x = {x1,...,xn}
    Random variable: values and probabilities, f(x;θ), X = {X1,...,Xn}

    In some exercises we go through arrows 2 to 4 only; in others, through arrows 1 to 5.

    Basic Concepts 35

  • Population: set of elements in which we are interested. Examples: (1) All possible clients in a region. (2) The light bulbs of a batch. (3) All five-year-old children of the world.

    Parameter: fixed quantity (not a variable) that appears in the expression of the functions of the random variables.

    Sample: subset of the population that is considered. Examples: (1) Some potential clients randomly selected for interview. (2) The first six light bulbs of a batch of one thousand. (3) The five-year-old children living near certain interviewers.

    Sampling: process to properly select the elements of the sample from the elements of the population.

    Note: When working with two populations, we assume that they are independent, meaning that their values do not influence each other (models X and Y are independent). This independence between populations is different from the independence within samples (the Xi independent and, on the other hand, the Yi independent). Sometimes data are paired to reduce the effect (variability) of a factor (e.g. the person who operates a machine); these paired populations need special statistical methods.

    Basic Concepts 36

  • Simple: each element is selected independently and with the same probability. E.g.: Inhabitants are selected with the same probability and independently from the whole country.

    Cluster: elements are previously grouped into subsets (clusters) as similar as possible among them and to the population. E.g.: Inhabitants are selected with the same probability and independently from some cities representative of the country.

    Stratified: elements are previously grouped, by using a characteristic or factor, into subsets (strata) as different as possible among them. E.g.: To analyse the possible effect of the city size, inhabitants are selected with the same probability and independently from some cities of quite different size.

    Basic Types of Sampling

    Simple Random Sampling. The main theoretical implications for us are the following:

    ...

    We work under this type of sampling. This implies that the random variables of the sample will be independent copies of the model:

    F(x1, x2,..., xn) = Π_{j=1}^n F_{Xj}(xj)        f(x1, x2,..., xn) = Π_{j=1}^n f_{Xj}(xj)

    General formulas:

    E(X1 ± X2) = E(X1) ± E(X2)
    Var(X1 ± X2) = Var(X1) + Var(X2) ± 2·cov(X1, X2)
    E(Σ_{j=1}^k Xj) = Σ_{j=1}^k E(Xj)
    Var(Σ_{j=1}^k Xj) = Σ_{j=1}^k Var(Xj) + Σ_{i≠j} cov(Xi, Xj)

    Under independence:

    Var(X1 ± X2) = Var(X1) + Var(X2)
    Var(Σ_{j=1}^k Xj) = Σ_{j=1}^k Var(Xj)

    Note: Applying the appropriate sampling allows saving money: e.g. by reducing the travels in the cluster sampling, or by quickly attaining the necessary sample sizes in the stratified sampling. Additionally, not all the elements of the population can always be accessed. On the other hand, it does not matter whether the sampling is applied with or without replacement, since we assume that the sample size n is much smaller than the population size.

    Basic Concepts
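    These formulas can be verified exactly on a toy example by enumerating all outcomes. A minimal Python sketch (the two independent fair dice are an illustrative assumption of this sketch, not of the notes):

```python
from itertools import product

def mean(vals):
    return sum(vals) / len(vals)

def var(vals):
    m = mean(vals)
    return sum((v - m) ** 2 for v in vals) / len(vals)

# All 36 equally likely outcomes of two independent fair dice
sums = [x1 + x2 for x1, x2 in product(range(1, 7), repeat=2)]

v_sum = var(sums)                    # Var(X1 + X2), computed exactly
v_single = var(list(range(1, 7)))    # Var of one die: 35/12

# Under independence the covariance term vanishes:
# Var(X1 + X2) = Var(X1) + Var(X2)
```

    The exact enumeration gives Var(X1 + X2) = 2 · (35/12), with no covariance term.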

  • Note: If the sample is not representative of the population, the inferential process will fail. Thus, paying attention to the sampling process applied must be the first step in reading and interpreting any statistical analysis.

    A small population and three possible samples

    Taken from: R Code Applied to Statistics. David Casado de Lucas. http://www.Casado-D.org/edu/CodeAppliedToStatistics-Slides.pdf

    Appropriate sampling: the sample does represent the population

    (Trustworthy results.)

    Inappropriate sampling: the sample does not represent the population(Untrustworthy results.)

    Basic Concepts 38


  • Probability Function and Histograms

    X random variable; the classes Ci partition the range of X.

    Probability function (mass function if X is discrete, density function if X is continuous):
    pi = P(Ci), the probability for X to take a value in the i-th class Ci.

    Expected histogram (theoretical sample X1,...,Xn):
    ei = n·pi, the expected absolute frequency of the i-th class Ci (think about the expectation of the binomial variable counting the number of trials inside Ci); fi = ei/n is the relative version.

    Empirical histogram (empirical sample x1,...,xn):
    Ni, the empirical absolute frequency of the i-th class Ci (or the proportion of values inside Ci, for the relative version).

    Histograms can be built by using either absolute or relative frequencies:

    f(x)        fn(x) = #{Xi ∈ Cx}/n        fn(x) = #{xi ∈ Cx}/n

    By the laws of large numbers, the histograms tend to the probability function, which justifies the use of samples to infer population information.

    39

    Basic Concepts

  • Distribution Function

    X random variable.

    Distribution function: F(x) = P(X ≤ x).

    Expected sample distribution function (theoretical sample X1,...,Xn):
    Fn(x) = #{Xi ≤ x}/n.

    Empirical sample distribution function (empirical sample x1,...,xn):
    Fn(x) = #{xi ≤ x}/n.

    By the laws of large numbers, the sample distribution functions Fn(x) tend to the distribution function, which justifies the use of samples to infer population information (in fact, the Glivenko-Cantelli theorem proves that the convergence is uniform).

    Basic Concepts 40
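    The uniform convergence stated by the Glivenko-Cantelli theorem can be observed numerically. A Python sketch (the Uniform(0,1) population, the sample sizes and the fixed seed are choices of this sketch):

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

def ecdf_sup_distance(sample):
    """Sup distance between the empirical CDF of a Uniform(0,1) sample
    and the true CDF F(x) = x (the sup is attained at the jump points)."""
    xs = sorted(sample)
    n = len(xs)
    return max(max(i / n - x, x - (i - 1) / n)
               for i, x in enumerate(xs, start=1))

d_small = ecdf_sup_distance([random.random() for _ in range(50)])
d_large = ecdf_sup_distance([random.random() for _ in range(5000)])
# d_large should be much smaller than d_small
```

    Increasing the sample size from 50 to 5000 shrinks the sup distance max_x |Fn(x) − F(x)| by roughly an order of magnitude.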

  • Use of the Samples

    Mathematically, it is possible to consider all possible samples, even if the number of them is infinite. It is usually difficult to cope with this task directly, but for most situations we consider additional, indirect results. For example,

    E(X̄) = μ = E(X)

    can be interpreted as follows: if all possible samples and their probabilities were considered, the sample mean were evaluated at them, and all these quantities were substituted into the expression of the expectation, the final value would be the same as the expectation of X.

    Complete sampling: a representation of all possible samples of size n (the number of them can be infinite, or they may not be totally ordered, in fact):

    X(1) = {X1(1), X2(1),..., Xn(1)}
    X(2) = {X1(2), X2(2),..., Xn(2)}
    ...
    X(m) = {X1(m), X2(m),..., Xn(m)}

    Each time we use a sample quantity Q, one sample is considered to obtain one value:

    q = Q(X1, X2,..., Xn)

    Theoretically (complete sampling), the population mean of Q is

    μQ = E(Q) = Σ qj·fQ(qj) = ∫ q·fQ(q) dq        (discrete / continuous)

    In practice (partial, real sampling), to estimate the population mean of Q we consider m samples and the sample mean of Q:

    q̄ = (1/m) Σ_{j=1}^m qj

    Basic Concepts 41
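    The interpretation of E(X̄) = μ can be checked exactly for a tiny population, where "all possible samples" can really be enumerated. A Python sketch (the population values are an arbitrary illustration):

```python
from itertools import product

population = [1, 3, 7, 9]               # a tiny population, chosen for illustration
mu = sum(population) / len(population)  # population mean: 5.0

n = 2
# All equally likely samples of size n (with replacement: independent copies)
samples = list(product(population, repeat=n))
sample_means = [sum(s) / n for s in samples]

# Averaging the sample mean over ALL samples recovers mu exactly: E(X bar) = mu
grand_mean = sum(sample_means) / len(sample_means)
```

    The average of the 16 sample means equals the population mean exactly, which is the complete-sampling reading of E(X̄) = μ.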

  • Use of the Samples

    Complete sampling gives the joint probability distribution of X and, through Q, the sampling probability distribution of Q:

    X(1) = {X1(1),..., Xn(1)}   with probability P(X(1))   →   Q(X(1))
    X(2) = {X1(2),..., Xn(2)}   with probability P(X(2))   →   Q(X(2))
    ...
    X(m) = {X1(m),..., Xn(m)}   with probability P(X(m))   →   Q(X(m))

    Random samplings (not others, e.g. those based on experts) are usually the only ones guaranteeing that the sample is representative. Apart from this fact, let us think of a particular representative sample, say {x1,...,xn}, and two practitioners.

    (1) A practitioner using mathematics but not inference theory can enumerate (by extension or by comprehension) all possible samples and hence all possible values for Q, from which some posterior calculations and representations can be done: sample mean, sample variance, sample median, histogram, et cetera.

    (2) Another practitioner using both mathematics and inference theory can enumerate (by extension or by comprehension) all possible samples and their probabilities, and hence all possible values for Q with their chances, that is, the sampling distribution of Q. This probability distribution plays the role of a system of reference where values can be compared statistically, which allows this practitioner to quantify the statistical statements, to compare what has happened with what could or should have happened (sampling error), to study what can or will happen, et cetera. (This second practitioner can also do, obviously, what the first practitioner can. Both could study the true error if the whole population were also studied.)

    Basic Concepts 42

  • Asymptoticity

    Some concepts are studied for a finite value of the sample size n, while others are studied in the limit n → ∞.

    Finite-sample-size framework. Some concepts: unbiasedness, efficiency, etc.

    Asymptotic framework. Some concepts: asymptotic unbiasedness, consistency, etc.

    There is not a severe change of behaviour at any value of n, although in practice we consider as asymptotic those sample sizes larger than 30 (or 25).

    In cases where only few data will be available, asymptotic concepts make no sense, while these concepts are the only important ones when many data will always be involved.

    Note: Although more data will usually imply better information, this is not true if the data do not have a minimum quality. This is a problem we do not face here, but it may appear in real statistical analyses.

  • Sections

    ( Introduction: Basic Concepts )

    Inference Theory

    Types of Problem

    Statistics and Estimators

    Cases

    Statistical Studies

    Use of T's

    44

    ( Appendixes: Practicals)

    Statistical Inference in Other Fields
    A Brief Guide for Students

  • Data are used as input in statistics and estimators, which will be used in the statistical methods that allow us to study the population characteristics of interest: mean, variance, parameters, et cetera.

    We can talk about three main kinds of problem:

    Point estimations: By using a proper estimator, the value of μ, σ or θ has to be estimated. Well-known estimators and general methods of building estimators are introduced. The fulfilment of some properties of the sampling distribution of the estimators is studied.

    Confidence intervals: By using a proper statistic, an interval of values (instead of only one value) has to be obtained in such a way that we have a minimum certainty that the unknown true value is inside the interval.

    Hypothesis tests: By using a proper statistic, a decision about the value of a measure or a parameter (parametric problem) is made by applying testing methodologies. Additionally, other statistics to make decisions about more general questions (nonparametric problems) are introduced. All these statistics evaluate possible discrepancies between the sample information and the population information expected under theoretical conditions.

    Note: We (and many authors) represent the population quantities through Greek letters: θ (theta), μ (mu), σ (sigma), λ (lambda), κ (kappa), η (eta), etc. Latin letters or accents are used for sample quantities: S, x̄, etc.

    Types of Problem 45

  • Subject

    Probability Theory + Hypotheses → Statistical Inference → Results → Interpretation

    Descriptive Statistics

    Inference Theory
    Point Estimations: methods (moments and maximum likelihood); properties (unbiasedness, consistency, etc.)
    Confidence Intervals: methods (pivotal quantity); minimum sample size
    Hypothesis Tests: methodologies (critical region and p-value); character (parametric and nonparametric)

    Applications
    Direct, to solve problems: X ≡ speed of a particle, X ≡ presence of an error, X ≡ gross domestic product; μ, σ, θ, f(x;θ), F(x;θ)
    Indirect, to develop new theory: estimation of the coefficients of a model; tests of the hypotheses of a model; tests for the diagnosis of a model

    Metasubject
    Thinking; listening, reading, writing and speaking; teaching and learning; exploitation

    Types of Problem 46

  • Statistical Problem: study a variable of a population by using samples.
    question → random variable → probability distribution (Probability Theory, Statistical Inference)

    Sampling: appropriate process for the sample to be representative of the population.
    Types of sampling, simple random sampling... (Inference Theory)

    Statistic: function that uses the sample information in a proper way.
    Sample mean, sample variance... sampling probability distribution... (Probability Theory)

    Inferential method: technique to answer the statistical question and solve the problem.
    A unique value that estimates a measure (μ, σ², etc.) or a parameter (θ, λ, κ, η, etc.): estimators (X̄, S², etc.) and statistics, methods (maximum likelihood, moments), properties (MSE, etc.)... (Point Estimations)
    A set of values and the probability for an unknown measure or parameter to be inside: estimators and statistics, methods (pivotal quantity)... (Confidence Intervals)
    A decision to choose between two hypotheses about a measure, a parameter, a characteristic of F (probability of an event, median, symmetry, etc.) or the entire distribution F, made with a bounded probability (the significance α) of rejecting the null hypothesis when it is true: estimators and statistics, types of error, p-value, power function, methodologies... (Hypothesis Tests)

    Real-World Problem: study a characteristic of a group.
    weight ~ tree of a forest (Biology); speed ~ particle of a gas (Physics); benefit ~ industrial sector (Economics)
    (In some cases, more than one variable or population are considered.)

    Do not confuse the population distributions with the sampling distributions of estimators and statistics.

    Types of Problem 47

  • Point Estimations, Confidence Intervals and Hypothesis Tests

    X = {X1,...,XnX} and Y = {Y1,...,YnY}, simple random samples

    Real-World Problem: variables of interest, processes, means and variabilities...
    Populations: X and Y, functions FX(x;θ), fX(x;θ), FY(y;θ) and fY(y;θ), measures μX, μY, σX, σY,...

    Although we may know the sampling distribution of the estimators in some cases (e.g. X̄ for normal populations), in general we have to use statistics involving them such that: (1) a theorem tells us the (asymptotic) sampling distribution necessary to calculate probabilities and quantiles; (2) they are dimensionless, so they do not depend on the scale in which the data are measured.

    To study measures (μ, σ², etc.) or parameters (θ, λ, κ, η, etc.):

    To study the means μX and μY, e.g. (X̄ − μ)·√n / S

    To study the variances σX² and σY², e.g. (n − 1)·S² / σ²

    To study the parameters θ (they also allow studying the means and the variances)

    To study characteristics or whole probability distributions:

    Σ_{i=1}^K (Ni − ei)² / ei        max_x |Fn(x) − F(x;θ)|

    Types of Problem 48

  • Statistical Questions

    49

    In 1990, 25% of births were by mothers of more than 30 years of age. This year a simple random sample of 120 births has been taken, yielding the result that 34 of them were by mothers of over 30 years of age.

    a) With a significance of 10%, can it be accepted that the proportion of births by mothers of over 30 years of age is still 25%, against that it has increased? Select the statistic, write the critical region and make a decision. Calculate the p-value and make a decision. If the critical region is Rc = {sample proportion > 0.30}, calculate β (the probability of the type II error) for an alternative proportion equal to 0.35. Plot the power function with the help of a computer.

    b) Obtain a 90% confidence interval for the proportion. Use it to make a decision about the value of the proportion, which is equivalent to having applied a two-sided (nondirectional) hypothesis test in the first section.

    1 Nonparametric test to validate this assumption (the simple random sampling).
    2 Parametric test to evaluate these hypotheses about the value of the proportion.
    3 Confidence interval to bound the value of the proportion.

    Types of Problem
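    The calculations asked for in this exercise can be sketched numerically. The following Python snippet is not part of the notes; the helper phi is a standard-normal CDF built from the error function, and 1.645 is the rounded 95% normal quantile:

```python
import math

def phi(z):
    """Standard normal CDF, via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

n, successes, p0 = 120, 34, 0.25
p_hat = successes / n                                     # about 0.283

# One-sided test, H0: proportion = 0.25 against H1: proportion > 0.25
z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
p_value = 1 - phi(z)

# 90% confidence interval for the proportion (normal approximation)
margin = 1.645 * math.sqrt(p_hat * (1 - p_hat) / n)
ci = (p_hat - margin, p_hat + margin)
```

    Since the p-value (about 0.20) is larger than 0.10 and the value 0.25 lies inside the interval, the hypothesis that the proportion is still 25% is not rejected.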

  • Sections

    ( Introduction: Basic Concepts )

    Inference Theory

    Types of Problem

    Statistics and Estimators

    Cases

    Statistical Studies

    Use of T's

    50

    ( Appendixes: Practicals)

    Statistical Inference in Other Fields
    A Brief Guide for Students

  • A statistic T is a mathematical function using the information contained in the sample X = {X1,...,Xn}, that is:

    T(X) = T(X1,...,Xn)

    Since the Xi are random variables, T is a random quantity too. Its distribution (possible values and their probabilities) is termed the sampling distribution. Sometimes this distribution is one of the well-known probability models; other times it is difficult to know, although we can always study it empirically (mean, variance, histogram, et cetera). For unknown values X = {X1,...,Xn}, a statistic T(X) is still a random quantity; for specific values x = {x1,...,xn}, the evaluation T(x) is no longer stochastic but a number.

    Estimators. An estimator is a statistic that is used to estimate the value of a quantity of interest (it cannot depend on it): any of the measures μ and σ, or a parameter θ. To guarantee the quality of an estimator, we define some concepts. The evaluation of an estimator of θ is termed an estimate of the parameter θ. The whole process is termed estimation.

    Pay attention to the notation we use: upper-case letters for random quantities, lower-case letters for their possible values (known or unknown).

    Statistics and Estimators – Statistics

    51

  • To study the quantities in which we are interested, μ, σ or θ, we need estimators of them. We introduce two general methods to build estimators of any parameter, and some well-known estimators of the mean and the variance.

    Once the quality of an estimator is guaranteed (by studying concepts and properties based on its mean and variance), we need to know its sampling distribution so as to be able to do calculations (quantiles and probabilities).

    In fact, instead of using the estimators themselves we frequently define statistics involving them such that:

    Their sampling distribution (exact or asymptotic) is known in theory

    They are dimensionless versions of the estimators

    The basic statistics are summarized in tables from which we will select the appropriate one for each situation. The underlying theorems will be mentioned.

    This will allow us to do the calculations and to evaluate the agreement between the sample information and the theoretical assumptions.

    That is why we need not care about the units of measurement of the data during the calculations (although we should care about them for a proper interpretation of the problem and the solution). On the other hand, the natural spread of the data is taken into account by these statistics.

    Statistics and Estimators – Statistics Made with Estimators

    52

  • For an estimator E. Given a value e of E, is it small or large? To answer this question we need a system of reference; the probability distribution of E plays this role. How many values ('how much', rather, for continuous distributions) are above it? Quantiles (median, quartiles, deciles and centiles) are referential values. Random variables are dimensionless.

    For any random quantity, say E. The distribution of E must be known to judge a value e. Nevertheless, we do not always know this distribution, but rather the distribution of a quantity involving E, say T. This is enough to judge any value e of E, since it is possible to judge the transformed value te within the distribution of T. This quantity T is dimensionless, which makes it even more useful.

    Example. In an exam, the average score of the class has been 6.7 points. How good is this score? Answer: It must or should be compared (e.g. using a figure) with the distribution of the variable "average score" for any class taking that exam. (A variable can also be defined as the average of other quantities.)

    (Analogous figures can be created for the discrete case: densities fE(e) and fT(t), with the values e and t marked on their axes.)

    Statistics and Estimators – Referential Values of a Sampling Distribution

    53

  • Both statistics and estimators are unidimensional random quantities. The mean and the variance of their sampling distributions should be analysed to study how these quantities behave, as if we were going to use them many times with different samples, even if in practice they are to be used only once.

    Let Q be a statistic or an estimator; theoretically, these two measures are

    μQ = Σ qj·fQ(qj)   (discrete)        μQ = ∫ q·fQ(q) dq   (continuous)

    σQ² = Σ (qj − μQ)²·fQ(qj)   (discrete)        σQ² = ∫ (q − μQ)²·fQ(q) dq   (continuous)

    Nevertheless, it is usually difficult to know fQ. Instead, to try to find μQ and σQ², we will apply the basic general properties of the measures E(·) and Var(·).

    Note: We are interested in the sampling distribution of the univariate quantity Q, not in the joint distribution of the random vector (X1,...,Xn).

    Statistics and Estimators – Sampling Distribution

    54

  • Concretely, we want to study how the values of Q are distributed with respect to the quantity under study, say θ.

    We want the average value E(Q) to be as close to θ as possible. Nevertheless, the average value E(Q) may be close, or even equal, to θ while all possible values of Q are far from it: variability measures how close to the average value E(Q), and hence among them, the possible values qj are. What if the only time we are going to use Q it takes a value q4, far from θ? Thus, it is necessary to study the behaviour, or quality, of the estimators.

    Standard Errors (Absolute and Relative)

    The quantity σQ is termed the sampling or standard (absolute) error of Q; the dimensionless quantity σQ/|μQ| is termed the sampling or standard (relative) error of Q. For example, when Q = X̄ is used to estimate μ = μX, it takes the form

    σQ = σX̄ = √(σX̄²) = √(σX²/n) = σX/√n ≈ S/√n

    As the sampling distribution fQ is usually unknown, σQ is calculated or estimated by using the last approximation.

    If the standard error has the same order of magnitude as q, or higher, the estimate is not trustworthy.

    Statistics and Estimators – Sampling Distribution

    55
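    In practice, the last approximation is computed from the data. A minimal Python sketch (the data values are invented for illustration):

```python
import math

data = [12.1, 11.8, 12.5, 12.0, 12.3, 11.9, 12.2, 12.4]  # invented data
n = len(data)
x_bar = sum(data) / n

# Sample quasivariance S^2 (divisor n - 1) and the estimated
# standard (absolute) error S / sqrt(n)
S2 = sum((x - x_bar) ** 2 for x in data) / (n - 1)
std_error = math.sqrt(S2 / n)

# Dimensionless (relative) version of the standard error
rel_error = std_error / abs(x_bar)
```

    Here the relative error is well below 1% of the estimate, so the estimate would be considered trustworthy in the sense of the slide.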

  • Let Q(X1,...,Xn) be a sample quantity. We are interested in the behaviour of Q, that is, in its possible values and their probabilities. This is usually difficult to obtain. Some details can sometimes be seen by using measures and figures instead of the values themselves (as we do in Descriptive Statistics). Possible properties are:

    Probability p of some events involving Q, or quantile c determining the set of the smallest or biggest values that Q can take with certain probability, for example p = P(Q ≤ c)

    Mean E(Q), variance Var(Q), moments, etc.

    Bias: b(Q) = E(Q) − θ

    Mean square error: MSE(Q) = b(Q)² + Var(Q)

    Sufficiency: Q contains the same information to estimate the parameter as the whole sample

    Asymptotic behaviour: asymptotic bias, consistency

    On the other hand, comparison of estimators is of great interest:

    Relative efficiency: comparison of the mean square errors of two estimators

    Efficiency: unbiasedness plus minimum variance, that is, b(Q) = 0 and Var(Q) minimum once b(Q) = 0

    Asymptotic behaviour: asymptotic (relative) efficiency

    Statistics and Estimators – Sampling Distribution

    56
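    The decomposition MSE(Q) = b(Q)² + Var(Q) can be checked exactly by complete enumeration, in the spirit of the earlier "complete sampling" slides. A Python sketch (the tiny population and the choice of estimator, the variance of the sample with divisor n, are illustrative assumptions):

```python
from itertools import product

population = [0, 2, 4]                                   # tiny illustrative population
N = len(population)
mu = sum(population) / N
sigma2 = sum((x - mu) ** 2 for x in population) / N      # true variance: 8/3

n = 2
samples = list(product(population, repeat=n))            # all equally likely samples

def s2(sample):
    """Variance of the sample (divisor n), as estimator Q of sigma^2."""
    m = sum(sample) / n
    return sum((x - m) ** 2 for x in sample) / n

values = [s2(s) for s in samples]
e_q = sum(values) / len(values)                          # E(Q)
var_q = sum((v - e_q) ** 2 for v in values) / len(values)
bias = e_q - sigma2                                      # b(Q) = E(Q) - sigma^2
mse = sum((v - sigma2) ** 2 for v in values) / len(values)
# Check: mse equals bias**2 + var_q
```

    The enumeration also exhibits the bias of this estimator: E(Q) = σ²·(n − 1)/n, so b(Q) = −σ²/n here.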

  • Population Quantities and Sample Quantities

    Mean μ (or average)       ↔  Sample mean
    Proportion                ↔  Sample proportion
    Variance σ²               ↔  Variance of the sample; sample variance; sample quasivariance
    Standard deviation σ      ↔  Standard deviation of the sample; sample standard deviation; sample quasi-standard deviation
    Parameter(s) θ            ↔  Estimator of θ

    For two populations, similar quantities can be written.

    Statistics and Estimators 57

  • Random Quantities and Nonrandom Quantities

    Random variable X  ↔  value of the random variable (a specific one), x

    Sample (any) X = {X1, X2,..., Xn}  ↔  sample (a specific one) x = {x1, x2,..., xn}

    Population quantities: two important measures of (the model of) the variable X, μ(X) and σ²(X).

    Sample quantities: three important statistics to study the two important measures of the variable X, and the estimator of θ  ↔  the values of the three random statistics when they are evaluated at a specific sample x, and the estimate of θ.

    Two important measures of any important statistic (the sample mean, now) are used to study the two important measures of the variable X.

    For two populations, similar quantities can be written.

    Statistics and Estimators 58

  • Main Theorems

    Laws of large numbers. Intuitively, sample relative frequencies tend to the population probabilities, which justifies why Statistics works: the empirical histogram tends to the population histogram and both tend to the population probability distribution.

    Linear combinations of normal variables. When a normally distributed variable is added to, subtracted from, multiplied by or divided by a quantity, we can know the normal distribution of the result. As a particular case, the probability distributions of the total sum and the sample mean are known.

    Central limit theorems. For any population probability distribution, these theorems allow us to know the asymptotic probability distribution of the total sum and the sample mean.

    Fisher's theorem (nonlinear combinations of normal variables). For normally distributed population variables, this theorem allows us to have a result involving both the population variance and an estimator of it.

    Others.

    Statistics and Estimators. Probability Theory provides results to compare population and sample information, and hence to support Statistics.

    59
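    The central limit theorems can be illustrated by simulation. A Python sketch (the Bernoulli(0.2) population, the sample size 200, the number of repetitions and the fixed seed are choices of this sketch, not of the notes):

```python
import math
import random

random.seed(1)  # fixed seed for reproducibility

def draw():
    """One value from a skewed, clearly non-normal population: Bernoulli(0.2)."""
    return 1.0 if random.random() < 0.2 else 0.0

n, reps = 200, 2000
means = [sum(draw() for _ in range(n)) / n for _ in range(reps)]

# Standardise with mu = 0.2 and sigma^2 = 0.2 * 0.8 = 0.16
z = [(m - 0.2) * math.sqrt(n) / math.sqrt(0.16) for m in means]

# By the CLT, about 95% of the standardised sample means fall in (-1.96, 1.96)
inside = sum(-1.96 < v < 1.96 for v in z) / reps
```

    Even though the population is far from normal, the standardised sample means already behave almost normally at n = 200.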

  • We are frequently interested in studying μ = E(X) and σ² = Var(X) (or, more ambitiously, the parameters θ of the entire FX(x;θ)) for one or more populations. We learn ways of finding estimators and evaluating their quality.

    Statistics are made with estimators by applying well-known mathematical results (theorems); now we do not see the results but merely tabulate the statistics. Since any variable of a simple random sample follows the same distribution as X, and the sample is used through statistics T, the measures μ and σ² of X appear also in the expressions of E(T) and Var(T).

    Statistics for nonparametric methods (to study characteristics of the population distribution different from the mean or the variance) are also tabulated.

    Example: X̄ is an estimator of μ.

    Inferential tool: statistic with which the sample information is used in a proper way to answer the statistical question. A theoretical result (theorem) tells us its distribution, which we use to calculate probabilities or find quantiles.

    Sample information: number of populations (1 or 2); type of population: normal (any n), any (big n) or Bernoulli (big n).

    Population information: parameter on which the statistical question is based; knowledge about the other parameter of the distribution.

    These criteria organize the Tables of Statistics.

    Statistics and Estimators – Motivation of the Statistics

    60

  • Let us apply a mathematical zoom to some statistics:

    X

    S2nParameter: population mean (n1)S2

    2

    Estimator: sample mean

    Dissimilarity: a comparison, based on a difference, between what the data say and the population value

    Variability: this denominator is a measure of the order of magnitude of the spread of the data, to have a reference with which the dissimilarity is measured. It makes the quotient a dimensionless quantity.

    Dissimilarity: a comparison, based on a quotient, between what the data say and the population value

    Estimator: sample quasivariance

    Parameter: population variance

    Cautions: Even if dimensionless quantities are necessary not to depend on the units of measurement, it is also necessary to look at the different terms in the expression of statistics. Otherwise, too small or large values of a term can be hightlighted or hidden by other terms. We will insist on this fact several times.

    Statistics and Estimators 61
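As a sketch of how the two statistics above are computed from data (Python is used only for illustration; the simulated values of μ, σ and n are assumptions, not part of the notes):

```python
import math
import random

random.seed(0)

# Hypothetical simulated sample: here mu and sigma are known because we simulate.
mu, sigma = 10.0, 2.0
n = 50
x = [random.gauss(mu, sigma) for _ in range(n)]

x_bar = sum(x) / n                                   # estimator of mu
S2 = sum((xi - x_bar) ** 2 for xi in x) / (n - 1)    # sample quasivariance

# Dissimilarity (a difference) over variability (order of magnitude of spread):
T_mean = (x_bar - mu) / math.sqrt(S2 / n)            # t-type statistic

# Dissimilarity as a quotient between data and population value:
T_var = (n - 1) * S2 / sigma ** 2                    # chi-square-type statistic

print(T_mean, T_var)
```

Note that the first statistic is dimensionless, while the second compares the data-based quasivariance with the population variance through a quotient.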

  • Taken from: Solved Exercises and Problems of Statistical Inference. David Casado. http://www.Casado-D.org/edu/ExercisesProblemsStatisticalInference.pdf

    Basic Measures

    Basic Estimators

    Statistics and Estimators 62


  • Basic Quantities and Estimators

    Statistics and Estimators 63

  • Basic Statistics

    Equivalent formulas (S² is the sample quasivariance, with denominator n − 1; s² is the sample variance, with denominator n):

    (X̄ − μ)/√(σ²/n) = (X̄ − μ)/(σ/√n) = √n (X̄ − μ)/σ

    (X̄ − μ)/√(S²/n) = (X̄ − μ)/(S/√n) = √n (X̄ − μ)/S

    (X̄ − μ)/√(s²/(n − 1)) = (X̄ − μ)/(s/√(n − 1)) = √(n − 1) (X̄ − μ)/s

    Statistics and Estimators 64
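The equivalent standardizations of the sample mean can be checked numerically; a sketch (the data are hypothetical, and S² is assumed to denote the quasivariance with denominator n − 1, s² the sample variance with denominator n):

```python
import math

# Hypothetical data, just to check the algebraic equivalence
x = [4.1, 5.0, 3.8, 6.2, 5.5, 4.9]
n = len(x)
mu = 5.0                                           # population mean, assumed for the check

x_bar = sum(x) / n
s2 = sum((xi - x_bar) ** 2 for xi in x) / n        # sample variance
S2 = sum((xi - x_bar) ** 2 for xi in x) / (n - 1)  # sample quasivariance

a = (x_bar - mu) / math.sqrt(S2 / n)               # first form
b = (x_bar - mu) * math.sqrt(n) / math.sqrt(S2)    # second form
c = (x_bar - mu) / math.sqrt(s2 / (n - 1))         # same statistic via s2

print(a, b, c)   # the three forms coincide
```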

  • Basic Statistics

    Statistics and Estimators 65

  • Basic Statistics

    Statistics and Estimators 66

  • Basic Statistics

    [(X̄ − Ȳ) − (μ_X − μ_Y)] / √(S_p²/n_X + S_p²/n_Y)
    = [(X̄ − Ȳ) − (μ_X − μ_Y)] / √(S_p² (1/n_X + 1/n_Y))
    = [(X̄ − Ȳ) − (μ_X − μ_Y)] / √(S_p² (n_X + n_Y)/(n_X n_Y))

    Equivalent formula:

    [(X̄ − Ȳ) − (μ_X − μ_Y)] √(n_X n_Y/(n_X + n_Y)) / √((n_X s_X² + n_Y s_Y²)/(n_X + n_Y − 2))

    or

    [(X̄ − Ȳ) − (μ_X − μ_Y)] √(n_X n_Y/(n_X + n_Y)) / √(((n_X − 1)S_X² + (n_Y − 1)S_Y²)/(n_X + n_Y − 2))

    Statistics and Estimators 67
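The equivalence of the pooled two-sample formulas can also be checked numerically; a sketch with hypothetical samples (S² denotes quasivariances with denominator n − 1, s² sample variances with denominator n):

```python
import math

# Hypothetical two samples, to check the equivalent pooled formulas
x = [10.2, 9.8, 11.0, 10.5, 9.9]
y = [8.9, 9.4, 9.1, 10.0]
nx, ny = len(x), len(y)
mx, my = sum(x) / nx, sum(y) / ny
dmu = 0.0                                      # mu_X - mu_Y in the comparison

Sx2 = sum((v - mx) ** 2 for v in x) / (nx - 1)   # quasivariances
Sy2 = sum((v - my) ** 2 for v in y) / (ny - 1)
sx2 = sum((v - mx) ** 2 for v in x) / nx         # sample variances
sy2 = sum((v - my) ** 2 for v in y) / ny

Sp2 = ((nx - 1) * Sx2 + (ny - 1) * Sy2) / (nx + ny - 2)   # pooled quasivariance

a = ((mx - my) - dmu) / math.sqrt(Sp2 / nx + Sp2 / ny)
b = ((mx - my) - dmu) / math.sqrt(Sp2 * (1 / nx + 1 / ny))
c = ((mx - my) - dmu) * math.sqrt(nx * ny / (nx + ny)) \
    / math.sqrt((nx * sx2 + ny * sy2) / (nx + ny - 2))

print(a, b, c)   # the three forms coincide
```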

  • Basic Statistics

    Statistics and Estimators 68

  • Basic Statistics

    Statistics and Estimators 69

  • Basic Statistics

    Statistics and Estimators 70

  • Tests Based on the F Distribution and Analysis of Variance (ANOVA)

    Statistics and Estimators 71

  • Chi-Square Tests

    Statistics and Estimators 72

  • Kolmogorov-Smirnov Tests

    Statistics and Estimators 73

  • Other Tests

    Statistics and Estimators 74

  • Tables of Statistics

    Statistics and Estimators 75

  • Sections

    ( Introduction: Basic Concepts )

    Inference Theory

    Types of Problem

    Statistics and Estimators

    Cases

    Statistical Studies

    Use of T's

    76

    ( Appendixes: Practicals)

    Statistical Inference in Other Fields
    A Brief Guide for Students

  • One Population and One Variable

    X, X_j

    X ~ F(x; θ), f(x; θ)

    μ = E(X), σ² = Var(X)

    Tools of Probability Theory used for: Point Estimations Confidence Intervals Hypothesis Tests Other types of problem

    Data

    Formulas

    X

    Inferential Process

    ...

    x, x_j

    Mathematical representation of the variable in which we are interested.

    Probability distribution to explain the behaviour of X (a model for it). There is at least one unknown quantity we want to study, unless we are in a simulated situation where an estimator is being studied; the other quantities can be known or unknown.

    Tables of statistics from which we select an appropriate T, taking into account the information about X and the sample size.

    Type of statistical question (an easy one, in this subject).

    Cases 77

  • Two Populations and One Variable

    Tools of Probability Theory used for: Point Estimations Confidence Intervals Hypothesis Tests Other types of problem

    Data

    Formulas

    Inferential Process

    μ₁ − μ₂

    σ₁²/σ₂²

    ...

    X, X_j

    X ~ F(x; θ_X), f(x; θ_X)

    μ_X = E(X), σ_X² = Var(X)

    ...

    Y, Y_j

    Y ~ F(y; θ_Y), f(y; θ_Y)

    μ_Y = E(Y), σ_Y² = Var(Y)

    ...

    Cases 78

    x, x_j

    y, y_j

  • One Population and Two Variables

    (X, Y), (X_j, Y_j)

    (X, Y) ~ F(x, y; θ), f(x, y; θ)

    α₁,₁ = E(XY)

    Tools of Probability Theory used for: Point Estimations Confidence Intervals Hypothesis Tests Other types of problem

    Data

    Formulas

    (X ,Y )

    Inferential Process

    ...

    (x, y), (x_j, y_j)

    Mathematical representation of the variables in which we are interested. We can also talk about one bivariate random variable.

    Probability distribution to explain the joint behaviour of (X, Y), and all the concepts around it: joint distribution and probability functions, bivariate moments, marginal distributions, conditional distributions...

    Tables of statistics from which we select an appropriate T, taking into account the information about X and the sample size.

    Type of statistical question (an easy one, in this subject). In these slides, we consider only the nonparametric hypothesis test of independence.

    α_{r₁,r₂} = E(X^{r₁} Y^{r₂})

    Cases 79

  • Several Populations and One Variable

    Cases 80

    Tools of Probability Theory used for: Point Estimations Confidence Intervals Hypothesis Tests Other types of problem

    Data

    Formulas

    Inferential Process

    Tables of statistics from which we select an appropriate T, taking into account the information about X and the sample size.

    Type of statistical question (an easy one, in this subject). In these slides, we consider only parametric hypothesis tests of the equality of means: Analysis of Variance (ANOVA).

    X^(1), X_j^(1);  X^(1) ~ F₁(x; θ₁)

    X^(2), X_j^(2);  X^(2) ~ F₂(x; θ₂)

    ...

    X^(P), X_j^(P);  X^(P) ~ F_P(x; θ_P)

    Data: x^(1), x_j^(1);  x^(2), x_j^(2);  ...;  x^(P), x_j^(P)

    Statistical question: μ_i = μ_j for i, j = 1, 2, ..., P?
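The ANOVA question of the equality of the P means can be sketched numerically; a minimal illustration (the P = 3 simulated samples and their parameters are hypothetical) of the F statistic that the tables tabulate:

```python
import random

random.seed(1)

# Hypothetical samples from P = 3 populations with equal means (H0 true)
samples = [[random.gauss(5.0, 1.0) for _ in range(20)] for _ in range(3)]

P = len(samples)
n = sum(len(s) for s in samples)
grand = sum(sum(s) for s in samples) / n

means = [sum(s) / len(s) for s in samples]
ss_between = sum(len(s) * (m - grand) ** 2 for s, m in zip(samples, means))
ss_within = sum(sum((v - m) ** 2 for v in s) for s, m in zip(samples, means))

# Under H0 and normality, F follows an F(P-1, n-P) distribution
F = (ss_between / (P - 1)) / (ss_within / (n - P))
print(F)
```

Large values of F indicate that the between-samples variability is too large compared with the within-samples variability, which is evidence against the equality of the means.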

  • Populations, Variables and Statistical Techniques

    X

    (X (1) , X (2))

    (X (1) , X (2) , ..., X (k ))

    (X (1) , X (2) , ..., X (k ) ,...)

    Univariate Statistical Methods: descriptive statistics, statistical inference, etc.

    Bivariate Statistical Methods: descriptive statistics, statistical inference, simple regression, independence, etc.

    Multivariate Statistical Methods: descriptive statistics, statistical inference, multiple regression, independence, principal components, etc.

    Infinite-Dimensional Statistical Methods: discrete- and continuous-time stochastic processes, random functions, descriptive statistics, statistical inference, independence, etc.

    One Population, (Quantitative) Variables

    Cases 81

    Several Populations, One (Quantitative) Variable

    X, Y: comparison of two populations

    X^(1), X^(2), ..., X^(P): comparison of P populations: ANOVA, homogeneity hypothesis tests

  • Number of populations | Number of variables | Number of data | Quantity of interest | Knowledge about other parameters | Statistic T

    1 (normal)    | 1 | any n           | μ         | σ² known [23], [24], [25]
                  |   |                 | μ         | σ² unknown [26]
                  |   |                 | σ²        | μ known [27]
                  |   |                 | σ²        | μ unknown [28]
    1 (any)       | 1 | n large         | μ         | σ² known or unknown [29], [30], [31]
    1 (Bernoulli) | 1 | n large         | p         | [32], [33]
    2 (normal)    | 1 | any n           | μ_X − μ_Y | σ_X² and σ_Y² known [34], [35]
                  |   |                 | μ_X − μ_Y | σ_X² and σ_Y² unknown [36], [41]
                  |   |                 | σ_X²/σ_Y² | μ_X and μ_Y known [37]
                  |   |                 | σ_X²/σ_Y² | μ_X and μ_Y unknown [38]
    2 (any)       | 1 | n_X, n_Y large  | μ_X − μ_Y | σ_X² and σ_Y² known or unknown [42], [43]
    2 (Bernoulli) | 1 | n_X, n_Y large  | p_X − p_Y | [44], [45]

    Cases 82

  • Main Cases

    Number of populations | Number of variables | Number of data | Quantity of interest | Knowledge about other parameters | Statistic T

    P (normal) | 1 | n_k large | μ_k (equality of means) | σ_k² = σ² unknown [60]

    Number of populations | Number of variables | Number of data | Quantity of interest | Statistic T

    1  | 1 | n large   | F₀(x; θ)   | [61], [68]
    1? | 1 | n_k large | F(x|S)     | [64], [71]
    1  | 2 | n large   | f(x, y; θ) | [66]

    Cases 83

  • Probability distributions random variables X and Y can follow in this subject: N(μ, σ²) (any n), Bern(p) (large n, > 30), Bin(n, p), P(λ), ...

    Good news: two possible situations: normal populations or many data.

    Probability distributions statistics T in the previous tables can follow: N(0,1), t_ν, χ²_ν, F_{ν₁,ν₂}

    Good news: we need only the probability tables of these four cases.

    Main Cases

    Cases 84
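As an illustration of how a probability table is used, the N(0,1) case can be reproduced with the Python standard library (used here only as an aside; the t, χ² and F quantiles would come from printed tables or a statistical package such as scipy.stats):

```python
from statistics import NormalDist

# Quantile lookup that replaces the N(0,1) probability table:
z = NormalDist().inv_cdf(0.975)   # upper quantile for a 95% two-sided interval
p = NormalDist().cdf(1.96)        # probability below 1.96

print(round(z, 3), round(p, 3))   # prints: 1.96 0.975
```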

  • Sections

    ( Introduction: Basic Concepts )

    Inference Theory

    Types of Problem

    Statistics and Estimators

    Cases

    Statistical Studies

    Use of T's

    85

    ( Appendixes: Practicals)

    Statistical Inference in Other Fields
    A Brief Guide for Students

  • [1] Real-world problem: identify the quantities, the assumptions or hypotheses, and the main question [Economics, Business Administration, Finance, etc.]

    [2] Translation into the mathematical language

    [3] Design of the whole process: the number of data needed, how to obtain them while guaranteeing the representativeness, and how to use these data [sampling process, minimum sample size, steps to solve exercises and problems]

    [4] Theoretical calculations: e.g. the inferential methods we are learning [point estimations, confidence intervals, hypothesis tests]

    [5] Data obtainment: collection of real data or generation of simulated data [others' real data and simulated data]

    [6] Analysis of data: characteristics, erroneous data, missing data, outliers, treatment (e.g. removal of the units of measurement) [descriptive statistics, standardization]

    [7] Use of the data: with the theoretical expressions [substitution into formulas]

    [8] Statistical interpretation: within the statistical framework [including: standard error, confidence, significance, types of error]

    [9] Solution or answer: based on the interpretation of the results within the framework of the real-world problem [from the mathematical language to the real world]

    Statistical Study (Here, the contents of the subject are in this color.)

    Statistical Studies 86

  • Our Study

    [3] and [4] We mention how to calculate the number of data needed (minimum sample size) in simple cases, as well as the basic ideas on sampling (simple random sampling). We learn some inferential methods (point estimations, confidence intervals, hypothesis tests) and whether they can be applied (assumptions or hypotheses). Besides, we mention the main theoretical results supporting them (theorems). Finally, we also design ways of solving the exercises and problems (steps).

    [1], [2], [5], [7], [8], and [9] We apply the methods to real-world problems (Economics, Business Administration, Finance, etc.).

    [6] We do not deal with analyses of data directly, although we frequently standardize to use some statistics T.

    In General

    It is especially interesting to highlight:

    [1] and [2] To be allowed to use the theorems behind the T's, the assumptions or hypotheses must be fulfilled. For any statistical study to be useful, the real-world problem must be well-stated (including the assumptions) and properly translated into the mathematical language.

    [7] To base the results upon reliable data and formulas, we must pay attention to the values the statistics T take, but also to the values taken by the terms they are made of.

    [8] To interpret the results statistically:
    In point estimations, we must pay attention to the estimates but also to the standard error.
    In confidence intervals, we must pay attention to the endpoints but also to the confidence.
    In hypothesis tests, we must pay attention to the decision but also to the power function.

    Statistical Studies 87

  • The usual steps of a statistical study have already been mentioned. Even though we will focus on the inferential methods and their applications (steps 4, 7 and 8), it is worthwhile mentioning at least once the importance of the quality of all those steps.

    [1] Real-world problem: First of all, the convenience and appropriateness of the real-world problem.
    [2] Translation into the mathematical language: The translation into the mathematical language must also be correct, and this mathematization may sometimes be done with several degrees of quality.
    [3] Design of the whole process: The design of the whole statistical process determines the characteristics of the data: type of sampling and representativeness, sample size, registration...
    [4] Theoretical calculations: The statistical methods themselves: selection of the method, assumptions or hypotheses, theoretical calculations...
    [5] and [6] Data obtainment and analysis: In practice, attention must be paid to obtaining and analysing the data.
    [7] Use of the data: Data must be used in the right theoretical formulas in the right way. Usually, this is not a problem if we understood step 4 or even did those calculations ourselves.
    [8] and [9] Statistical interpretation and solution or answer: The statistical interpretation of the results and its translation into the field of the real problem are quite important, obviously.

    For the types of statistical problem we deal with, well-known statistical methods are already given (for step 4) in these slides. We will apply them either to practise their use or to solve particular real-world problems.

    Qualities
    Statistical Studies 88

  • 89

    In Statistics, results may change severely when the assumptions are actually false, another method is applied, a different certainty is considered, or the data carry no proper information (representativeness, quantity, quality, etc.). Throughout this document, we do insist on the cautions that statisticians and readers of statistical works must take in interpreting results. Even if you are not interested in statistically cooking data, you had better know the recipes... (Some of them have been included in the notes mentioned in the prologue.)

    We highlight once more the very basic points on which results are based: The data available. The assumptions. The statistical method applied, including particular details of its steps, mathematical theorems and, finally, its precision. The certainty with which the method is applied: probability, confidence or significance.

    Now, let us devote some words to what quality means in applying our methods.

    Quality of Statistical Methods

    We would like, though it is not possible in Statistics, that for any sample X:

    The point estimator always provides the true value. The interval always contains the true value. The test always provides the right decision.

    Qualities
    Statistical Studies

    P(θ̂(X) = θ) = 1, ∀X

    P(θ ∈ I(X)) = 1, ∀X

    P({Rc | H₀} ∪ {Ra | H₁}) = 0, ∀X
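That these ideal properties cannot hold can be seen empirically; a sketch (the normal population with known σ and the simulation settings are assumptions) estimating the coverage of a 95% confidence interval for μ, which is close to 0.95 but never 1:

```python
import math
import random
from statistics import NormalDist

random.seed(2)

# Simulated situation: the true mu is known, so we can check the interval
mu, sigma, n = 0.0, 1.0, 30
z = NormalDist().inv_cdf(0.975)          # N(0,1) quantile
half = z * sigma / math.sqrt(n)          # half-length of the interval

hits = 0
reps = 2000
for _ in range(reps):
    x_bar = sum(random.gauss(mu, sigma) for _ in range(n)) / n
    if x_bar - half <= mu <= x_bar + half:
        hits += 1

coverage = hits / reps
print(coverage)   # close to 0.95, but not 1
```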

  • 90

    In studying a population quantity, say θ, all methods work with estimations: values calculated from the data that are expected to be close to the real value of θ (only in simulated situations, e.g. in exercises or practicals, is θ known).

    Estimations are provided by an estimator, say θ̂. Each time data are substituted in the estimator, the value provided is different (we have already talked about randomness). Then, how different do these values tend to be? How do we measure the error and the quality of an estimator? For confidence intervals, formulas will also provide different endpoints for different data. Finally, in testing hypotheses, the decisions are based on values that are also different each time.

    Under good conditions, the essential information in different samples is quite similar even if most of their values are different. Either way, the knowledge about the possible samples or the possible values of the estimator, plus the knowledge about their probabilities, allows us to have a reference and hence to measure how similar or different the results tend to be.

    Errors

    For a particular sample, we can talk about an estimate and the classical errors (we simplify the notation of the estimator):

    θ̂ = θ̂(x₁, ..., xₙ)

    Absolute error: θ̂ − θ
    Relative error: (θ̂ − θ)/θ

    Qualities
    Statistical Studies

    The sign can be removed from the previous quantities by considering the absolute value or the square. These errors appear in concepts like bias, mean square error, consistency, etc.
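A minimal numeric illustration of the classical errors (the values of θ and θ̂ are hypothetical, as in a simulated situation where θ is known):

```python
# Hypothetical true value and estimate from a particular sample
theta = 4.0
theta_hat = 4.3

abs_error = theta_hat - theta                 # absolute error
rel_error = (theta_hat - theta) / theta       # relative error (dimensionless)

# Removing the sign, with the absolute value or the square:
abs_error_unsigned = abs(abs_error)
abs_error_squared = abs_error ** 2

print(abs_error, rel_error, abs_error_unsigned, abs_error_squared)
```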

  • 91

    For any sample, we must talk about an estimator and consider that the previous errors are also random, so they can take different values with some probabilities (these are the probability distributions explaining their behaviour: the concept of sampling distribution that we will study). Thus, in Statistics we cannot usually talk about the classical errors but about concepts (positive, usually) related to the probability distributions of the estimator itself or of the classical errors, namely:

    In the literature, the word precision can refer to any of these concepts and even some others (e.g. the variance of the estimator but also the inverse of the variance of the estimator). The probability that appears in the two probabilistic expressions is termed confidence, and it is a measure of the strength with which the bound is ensured.

    Due to the purpose of confidence intervals and hypothesis tests, the quality is measured (although the quantities above can be involved in the process) through the length and the confidence, for the former, and through the probability of making wrong decisions for different real values of θ, for the latter.

    We will learn how to interpret the different measures of quality.

    θ̂ = θ̂(X₁, ..., Xₙ)

    σ_θ̂ = √Var(θ̂), with Var(θ̂) = E([θ̂ − E(θ̂)]²)
    MSE(θ̂) = E([θ̂ − θ]²)
    CV(θ̂) = σ_θ̂ / |E(θ̂)|
    E(|θ̂ − θ|)
    E such that P(|θ̂ − E(θ̂)| ≤ E) is high
    E such that P(|θ̂ − θ| ≤ E) is high

    Note: Mathematically, an expression like |θ̂ − θ| ≤ E can also exist for θ̂ random, though such deterministic bounds are not frequent or useful.

    Qualities
    Statistical Studies

    The interpretation depends on whether they compare the estimator with the true value or not, whether they involve a deterministic or a probabilistic bound, whether they penalize large differences (e.g. using an exponent), and whether they are dimensionless or not, etc.
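A Monte Carlo sketch (the simulation settings are assumptions) of how bias, variance and mean square error measure the quality of an estimator; here the sample variance s² with denominator n, a biased estimator of σ²:

```python
import random

random.seed(3)

# Simulated situation: sigma^2 is known, so bias and MSE can be estimated
mu, sigma2, n, reps = 0.0, 1.0, 10, 5000

estimates = []
for _ in range(reps):
    x = [random.gauss(mu, sigma2 ** 0.5) for _ in range(n)]
    m = sum(x) / n
    estimates.append(sum((v - m) ** 2 for v in x) / n)   # s2, denominator n

mean_est = sum(estimates) / reps
bias = mean_est - sigma2                                  # approx -sigma2/n = -0.1
var = sum((e - mean_est) ** 2 for e in estimates) / reps
mse = sum((e - sigma2) ** 2 for e in estimates) / reps    # MSE = var + bias^2

print(bias, var, mse)
```

The well-known decomposition MSE = Var + bias² holds exactly for the empirical quantities computed above, which makes a convenient sanity check.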

  • On the Populations
    How many populations are there? Are their probability distributions known?

    On the Samples
    If populations are not normally distributed, are the sample sizes large enough to apply asymptotic results? Do we know the data themselves, or only some quantities calculated from them?

    On the Assumptions
    What is supposed to be true? Does it seem reasonable? Do we need to prove it? Should it be checked for the populations: the random character, the independence of the populations, the goodness-of-fit to the supposed models, the homogeneity between the populations, et cetera? Should it be checked for the samples: the within-sample randomness and independence, the between-samples independence, et cetera? Are there other assumptions (neither mathematical nor statistical)?

    On the Statistical Problem
    What are the quantities to be studied statistically?

    Useful Questions

    Statistical Studies 92

  • Concretely, what is the statistical problem: point estimation, confidence interval, hypothesis test, etc.?

    On the Statistical Tools
    Which are the estimators, the statistics and the methods that will be applied?

    On the Quantities
    Which are the units of measurement? Are all the units equal? How large are the magnitudes? Do they seem reasonable? Are all of them coherent (variability is positive, probabilities and relative frequencies are between 0 and 1, etc.)?

    On the Interpretation
    What is the statistical interpretation of the solution? How is the statistical solution interpreted in the framework of the problem we are working on? Do the qualitative results seem reasonable (as expected)? Do the quantities seem reasonable (signs, order of magnitude, etc.)?

    Useful Questions

    Statistical Studies 93

  • Sections

    ( Introduction: Basic Concepts )

    Inference Theory

    Types of Problem

    Statistics and Estimators

    Cases

    Statistical Studies

    Use of T's

    94

    ( Appendixes: Practicals)

    Statistical Inference in Other Fields
    A Brief Guide for Students

  • Main statistics are summarized in tables

    The measures μ = E(X) and σ² = Var(X) are two main moments of the probability distribution of X. Any variable of a simple random sample follows this distribution too (it is a copy), and the sample is used through statistics T, which explains why the μ and σ² of X also appear in the expressions of E(T) and Var(T). Statistics for nonparametric methods (to study characteristics of the population distribution other than the mean or the variance) are also tabulated.

    Use of T's 95

    Population information
    Number of populations: 1 or 2
    Type of population: normal (any n), any (big n) or Bernoulli (big n)

    Sample information

    Inferential tool

    Statistic with which the sample information is used in a proper way to answer the statistical question. A theoretical result (theorem) tells us its distribution, which we use to calculate probabilities or find quantiles.

    X̄ is an estimator of μ

    Parameter on which the statistical question is based

    Knowledge about the other parameter of the distribution
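The remark above, that the μ and σ² of X reappear in the expressions of E(T) and Var(T), can be sketched by simulation; for T = X̄, E(T) = μ and Var(T) = σ²/n (the settings below are hypothetical):

```python
import random

random.seed(4)

# Hypothetical simulated population: mu and sigma are assumptions
mu, sigma, n, reps = 3.0, 2.0, 25, 4000

means = []
for _ in range(reps):
    x = [random.gauss(mu, sigma) for _ in range(n)]
    means.append(sum(x) / n)                  # one realization of T = X_bar

e_t = sum(means) / reps                       # empirical E(T)
var_t = sum((m - e_t) ** 2 for m in means) / reps   # empirical Var(T)

print(e_t, var_t)   # close to mu = 3 and sigma^2/n = 0.16
```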

  • They can be used under some assumptions or hypotheses, and their use is not justified otherwise.

    To use them, we need to make them appear in the expressions we are working with.

    Statistics T are mathematical theorems comparing population and sample information, and, therefore, allowing us to quantify our statistical statements and answers.

    How T is Usually UsedWhat T's do

    96Use of T's

    In the following slides we use one-population cases and easy questions because they are easier to understand at the beginning. You need not try to understand all the details the first time you read them; come back to these slides while preparing the other contents.

  • 1. Select T
    By reading the statement, we identify:

    The main characteristics of the population: assumptions or hypotheses, knowledge about the population distribution (measures, parameters, etc.)

    The main characteristics of the sample: type of sampling, quality of data, quantity of data (size n).

    The statistical question: type of problem, translation into the mathematical language, quantities involved (e.g. estimators)

    2. Rewrite the Question
    The question is usually posed in terms of estimators. Since only in a few cases can we know their sampling distribution, we need to rewrite the event so as to make the statistic T appear. In rewriting the question or event, we must take into account that:

    There