Class 04 - Basic Concepts of Statistics and Probability (2 of 2)


  • 7/30/2019 Class 04 - Basic Concepts of Statistics and Probability (2 of 2)


    Basic Concepts of Statistics & Probability

    Review of Statistical Concepts

    Sampling from Distributions

    Hypothesis Testing

    Industrial Engineering

    Define the following.

    Probability

    Population

    Sample

    Mean

    Median

    Mode

    Standard Deviation

    Variance

    Range

    Box-plot

    Histogram

    Descriptive Statistics


    Types of Distributions

    Continuous Distributions

    Normal Distribution

    Chi-square (χ²) Distribution

    t-Distribution

    F-Distribution

    Exponential Distribution

    Weibull Distribution

    Discrete Distributions

    Binomial Distribution

    Poisson Distribution


    Normal Distribution

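    The probability density function of the normal random variable x is

    $$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2 / (2\sigma^2)}$$

    Probabilities for the normal random variable are given by areas under the curve.

    For the Standard Normal Distribution, μ = 0 and σ = 1. The constants are π = 3.14159 and e = 2.71828.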
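As a minimal sketch, the normal density can be evaluated directly from its formula. The function name `normal_pdf` and its default arguments are illustrative choices, not part of the slides; the defaults correspond to the standard normal case (mean 0, standard deviation 1).

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a normal random variable with mean mu and standard deviation sigma."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coeff * math.exp(exponent)

# Height of the standard normal curve at its mean (the peak of the bell curve).
peak = normal_pdf(0.0)

# The curve is symmetric about the mean, so these two heights are equal.
left, right = normal_pdf(-1.0), normal_pdf(1.0)
```

The density value itself is not a probability; probabilities come from areas under this curve between two points.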


    Normal Distribution

    [Figure: normal density curve, with x running from -4 to 4]

    For a population that is normally distributed:

    approx. 68% of the data will lie within ±1 standard deviation of the mean;

    approx. 95% of the data will lie within ±2 standard deviations of the mean; and

    approx. 99.7% of the data will lie within ±3 standard deviations of the mean.
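The three percentages above can be checked numerically. This is a small sketch that uses the error function from Python's standard library; the helper name `prob_within` is an illustrative choice.

```python
import math

def prob_within(k):
    """P(|Z| <= k) for a standard normal Z: the area within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2.0))

# Area under the normal curve within 1, 2 and 3 standard deviations of the mean.
coverage = {k: prob_within(k) for k in (1, 2, 3)}

for k, p in coverage.items():
    print(f"within +/-{k} standard deviation(s): {100 * p:.2f}%")
```

Because any normal variable can be standardized, the same areas apply to every normal population, whatever its mean and standard deviation.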


    Chi-square (χ²) Distribution

    If x1, x2, ..., xn are normally and independently distributed random variables with mean zero and variance one, then the random variable

    $$y = x_1^2 + x_2^2 + \cdots + x_n^2$$

    is distributed as chi-square with n degrees of freedom.

    Furthermore, the sampling distribution of

    $$y = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{\sigma^2} = \frac{(n-1)S^2}{\sigma^2}$$

    is chi-square with n − 1 degrees of freedom when sampling from a normal population.
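A short simulation can make the two chi-square statements concrete. The sketch below relies on the fact that the mean of a chi-square variable equals its degrees of freedom, so the first statistic should average about n and the second about n − 1. The seed, sample size, repetition count and the population values (mean 10, standard deviation 2) are arbitrary assumptions chosen for illustration.

```python
import random
import statistics

def sum_of_squares_statistic(n, rng):
    """y = x_1^2 + ... + x_n^2 for n independent standard normal draws."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n))

def scaled_variance_statistic(n, mu, sigma, rng):
    """y = (n - 1) * S^2 / sigma^2 for a sample of size n from a normal population."""
    sample = [rng.gauss(mu, sigma) for _ in range(n)]
    return (n - 1) * statistics.variance(sample) / sigma ** 2

rng = random.Random(0)   # fixed seed so the run is repeatable
n, reps = 5, 20000       # hypothetical sample size and number of repetitions

y1 = [sum_of_squares_statistic(n, rng) for _ in range(reps)]
y2 = [scaled_variance_statistic(n, 10.0, 2.0, rng) for _ in range(reps)]

mean_y1 = statistics.fmean(y1)   # should be close to n
mean_y2 = statistics.fmean(y2)   # should be close to n - 1
```

The drop from n to n − 1 degrees of freedom reflects the one constraint used up by estimating the mean from the same sample.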


    [Figure: chi-square (χ²) distribution for various degrees of freedom]


    t-distribution

    If x is a standard normal random variable, y is a chi-square random variable with k degrees of freedom, and x and y are independent, then

    $$t = \frac{x}{\sqrt{y/k}}$$

    is distributed as t with k degrees of freedom.
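The construction of t from a standard normal and an independent chi-square variable can be sketched as a simulation. It uses the known result that a t variable with k > 2 degrees of freedom has variance k/(k − 2), which is larger than the standard normal variance of 1. The seed, k = 10 and the repetition count are assumptions for illustration.

```python
import math
import random
import statistics

def t_ratio(x, y, k):
    """t = x / sqrt(y / k) for a standard normal x and a chi-square y with k degrees of freedom."""
    return x / math.sqrt(y / k)

def draw_t(k, rng):
    """Build one t draw from its definition, using independent normal draws."""
    x = rng.gauss(0.0, 1.0)
    y = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))  # chi-square with k degrees of freedom
    return t_ratio(x, y, k)

rng = random.Random(1)   # fixed seed so the run is repeatable
k = 10                   # hypothetical degrees of freedom
draws = [draw_t(k, rng) for _ in range(20000)]

sample_variance = statistics.variance(draws)   # theory: k / (k - 2) = 1.25
```

The extra spread relative to the normal shrinks as k grows, which is why the t-distribution approaches the standard normal for large samples.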


    F-distribution

    If w and y are two independent chi-square random variables with u and v degrees of freedom, respectively, then

    $$F = \frac{w/u}{y/v}$$

    is distributed as F with u numerator degrees of freedom and v denominator degrees of freedom.
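The F ratio can be illustrated the same way, by building it from two independent chi-square draws. The sketch uses the known result that an F variable with v > 2 denominator degrees of freedom has mean v/(v − 2). The seed, the values u = 4 and v = 12, and the repetition count are assumptions for illustration.

```python
import random
import statistics

def chi_square_draw(df, rng):
    """One chi-square draw with df degrees of freedom, as a sum of squared standard normals."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))

def f_ratio(w, u, y, v):
    """F = (w / u) / (y / v) for chi-square values w and y with u and v degrees of freedom."""
    return (w / u) / (y / v)

rng = random.Random(2)   # fixed seed so the run is repeatable
u, v = 4, 12             # hypothetical numerator and denominator degrees of freedom

draws = [
    f_ratio(chi_square_draw(u, rng), u, chi_square_draw(v, rng), v)
    for _ in range(20000)
]

mean_f = statistics.fmean(draws)   # theory: v / (v - 2) = 1.2
```

In practice the two chi-square quantities usually come from scaled sample variances, which is why the F-distribution appears when two variances are compared.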


    Hypothesis Testing

    Statistical Hypothesis

    A statement (assumption) about either the parameters of a probability distribution or the parameters of a model. This assumption may or may not be true.

    Statistical Hypothesis Test

    A method of making statistical decisions using experimental data. It typically consists of examining a random sample from the population. If the sample data are consistent with the statistical hypothesis, the hypothesis is accepted; if not, it is rejected.

    There are two types of statistical hypotheses:

    Null Hypothesis. The null hypothesis, denoted by H0, is usually the hypothesis that sample observations result purely from chance.

    Alternative Hypothesis. The alternative hypothesis, denoted by H1, is the hypothesis that sample observations are influenced by some non-random cause.


    The significance level, α, is the maximum probability tolerated for rejecting a true null hypothesis.

    The p-value is the probability, computed assuming the null hypothesis is true, of a departure from the null hypothesis at least as extreme as the one observed in the data.
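As a rough illustration of how a p-value is compared with the significance level, the sketch below computes a two-sided p-value for a z statistic. The choice of a z-test, the observed value 2.5 and the level 0.05 are assumptions made for the example; the slides do not specify a particular test.

```python
from statistics import NormalDist

def two_sided_p_value(z):
    """P(|Z| >= |z|) for a standard normal Z, computed assuming the null hypothesis is true."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))

alpha = 0.05                      # significance level, chosen before looking at the data
z_observed = 2.5                  # hypothetical observed test statistic
p_value = two_sided_p_value(z_observed)

# The null hypothesis is rejected only when the p-value falls below alpha.
reject_h0 = p_value < alpha
```

A statistic of about 1.96 gives a two-sided p-value of about 0.05, which is why 1.96 is the familiar cutoff for a two-sided test at the 5% level.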


    The hypotheses are stated in such a way that they are mutually exclusive.

    That is, if one is true, the other must be false.

    Errors in Hypothesis Testing

    Type I error occurs when the null hypothesis is rejected when it is true: an α error, or a "false positive", indicating a test of poor specificity.

    Type II error occurs when the null hypothesis is not rejected when it is false: a β error, or a "false negative", indicating a test of poor sensitivity.

    α = P(type I error) = P(reject H0 | H0 is true)

    β = P(type II error) = P(fail to reject H0 | H0 is false)

    Power = 1 − β = P(reject H0 | H0 is false)

    Critical Region or Rejection Region

    The set of values of the test statistic for which the null hypothesis is rejected.
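The relationship between α, β, power and the critical region can be sketched for one specific case. The example assumes a one-sided z-test of H0: μ = μ0 against H1: μ > μ0 with a known population standard deviation, and evaluates β at one hypothetical alternative mean μ1. None of these numbers come from the slides.

```python
from math import sqrt
from statistics import NormalDist

def one_sided_z_test_errors(mu0, mu1, sigma, n, alpha):
    """Critical value, beta and power for H0: mu = mu0 vs H1: mu > mu0 with known sigma."""
    se = sigma / sqrt(n)                          # standard error of the sample mean
    z_crit = NormalDist().inv_cdf(1.0 - alpha)    # upper-alpha point of the standard normal
    critical_value = mu0 + z_crit * se            # reject H0 when the sample mean exceeds this
    beta = NormalDist(mu1, se).cdf(critical_value)  # P(fail to reject H0 | true mean is mu1)
    return critical_value, beta, 1.0 - beta

# Hypothetical numbers: null mean 50, true mean 52, sigma 4, sample size 16, alpha 0.05.
critical_value, beta, power = one_sided_z_test_errors(
    mu0=50.0, mu1=52.0, sigma=4.0, n=16, alpha=0.05
)
```

Here the critical region is every sample mean above `critical_value`. Lowering α moves that boundary outward, which raises β and lowers power unless the sample size is increased.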


    Steps for Hypothesis Testing

    1. Formulate a null hypothesis and the alternative hypothesis. The hypotheses are statements about the population parameters.

    2. State the test statistic.

    3. Define the level of significance of the test (the probability of rejecting H0 when it is true) and hence the critical region.

    4. Collect the data, calculate the observed value of the test statistic using the sample data, and find the p-value.

    5. Reject H0 if the observed value of the test statistic falls in the critical region or the p-value is less than α. Otherwise there is no evidence to reject H0.

    6. State the conclusions clearly in non-technical terms.
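The six steps above can be walked through in a short sketch. It assumes a two-sided one-sample z-test with a known population standard deviation; the sample values, the null mean of 50 and the standard deviation of 1.5 are all made up for illustration.

```python
from math import sqrt
from statistics import NormalDist, fmean

def z_test(sample, mu0, sigma, alpha):
    """Two-sided one-sample z-test of H0: mu = mu0 against H1: mu != mu0, with known sigma."""
    n = len(sample)
    # Step 2: the test statistic is z = (sample mean - mu0) / (sigma / sqrt(n)).
    z = (fmean(sample) - mu0) / (sigma / sqrt(n))
    # Step 3: the critical region is |z| > z_crit for the chosen alpha.
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    # Step 4: p-value of the observed statistic, computed assuming H0 is true.
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    # Step 5: the critical-region rule and the p-value rule give the same decision.
    return {"z": z, "z_crit": z_crit, "p_value": p_value, "reject_h0": abs(z) > z_crit}

# Step 1: H0: mu = 50 versus H1: mu != 50 (hypothetical process target).
sample = [51.2, 49.8, 52.5, 50.9, 53.1, 51.7, 50.4, 52.0, 51.5]  # made-up measurements
result = z_test(sample, mu0=50.0, sigma=1.5, alpha=0.05)

# Step 6: state the conclusion in plain language.
if result["reject_h0"]:
    print("The data give evidence that the process mean differs from 50.")
else:
    print("The data give no evidence that the process mean differs from 50.")
```

When the population standard deviation is not known, the same steps apply with a t statistic in place of z.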