
Probability and Statistics


  • GENG200

    Probability and Statistics for Engineers

    Spring 2013

    1

  • Course Information

    Instructor: Dr. Adel Elomri

    Office: E212, corridor 5

    E-mail: [email protected]

    Office Hours: to be fixed later

    TA:

    Lectures: Monday, Wednesday 09:30-10:45

    Course Webpage: http://mybb.qu.edu.qa

    2

  • 3

    Textbook: Applied Statistics and Probability for Engineers, Douglas C. Montgomery and George C. Runger, 5th edition, John Wiley & Sons, 2007.

    ISBN: 978-0-471-74589-1

    Reference: Probability and Statistics for Engineering and the Sciences, Jay L. Devore, 6th edition, John Wiley & Sons, Inc., 2007.

    See library textbooks.

  • Course Objectives

    1. Provide students with statistical methods, both descriptive and analytical, for dealing with the variability in observed data.

    2. Provide students with fundamental concepts of probability and random variables.

    3. Introduce the concepts of statistical inference: hypothesis testing and confidence intervals for parameters.

    4. Emphasize practical engineering-based applications and the use of real-data examples.

    4

  • 5

    Covered Topics (weeks allotted in parentheses):

    Introduction. (1)

    Probability: addition rule, conditional probability, multiplication rule and Bayes' theorem. (1)

    Discrete random variables. Probability mass function. Mean and variance of discrete random variables. (1)

    Probability distribution functions: uniform, binomial, geometric and Poisson distributions. (2)

    Continuous random variables. Probability density functions. (1)

    Normal distribution. Approximation to the binomial and Poisson distributions. Exponential distribution. Uniform distribution. (1)

    Joint probability functions. Multiple discrete and continuous random variables. (1)

    Covariance and correlation. Linear combinations of random variables. Functions of random variables. (2)

    Descriptive statistics: data summary and presentation: stem-and-leaf diagrams, frequency distributions, histograms, box plots. (1)

    Parameter estimation. Properties of estimators. Method of moments. (1)

    Interval estimation. Inference on the mean of a population: variance known or unknown. Inference on the variance of a normal population. (1)

    Hypothesis testing about the mean and proportion: small and large samples. (1)

  • Evaluation Scheme

    Quizzes: There will be announced quizzes during class hours. There will be no make-up for missed quizzes.

    Homework: You will be given some homework assignments, which will be announced in class and posted on the course website on Blackboard. You will have one week to submit each homework. Late submissions are accepted up to 3 days late, with a penalty of 15% per day.

    Examination: There will be one mid-term Exam in addition to the final

    Exam. If you miss one of these exams, you must have a university accepted official written excuse to take a make-up.

    Term Project/Paper: The project details, guidelines and evaluation criteria will be provided in due course.

    6

  • Evaluation

    Homework: 5%
    Quizzes: 10%
    Term Project: 10%
    Midterm Exam: 35%
    Final Exam: 40%

    Some of the exercises and in-class examples will be worked only on the whiteboard. It is the student's responsibility to take notes.

    7

  • Our ground rules

    Please switch off or silence your phones when in class.

    Arrive on time. If you are late, just open the door and have a seat quietly. Late arrivals are not accepted after 20 minutes.

    Attendance will be taken at the beginning of class (within the first 10-15 minutes); after that time, late arrivals are considered absent.

    In general, be courteous to others.

    For more details see the course syllabus.

  • GENG200

    Probability and Statistics for Engineers

    Spring 2013

    9


  • An engineer is someone who solves problems of interest to society by the efficient application of scientific principles by

    Refining existing products

    Designing new products or processes

    1-1 The Engineering Method and

    Statistical Thinking

    12

  • 1-1 The Engineering Method and

    Statistical Thinking

    Figure 1.1 The engineering method

    13

  • Engineering Example An engineer is designing a nylon connector to be used in an automotive engine application. The engineer is considering establishing the design specification on wall thickness at 3/32 inch but is somewhat uncertain about the effect of this decision on the connector pull-off force. If the pull-off force is too low, the connector may fail when it is installed in an engine. Eight prototype units are produced and their pull-off forces measured (in pounds): 12.6, 12.9, 13.4, 12.3, 13.6, 13.5, 12.6, 13.1.

    1-1 The Engineering Method and

    Statistical Thinking 14

  • The field of statistics deals with the collection, presentation, analysis, and use of data to

    Make decisions

    Solve problems

    Design products and processes

    1-1 The Engineering Method and

    Statistical Thinking 15

  • Statistical techniques are useful for describing and understanding variability.

    By variability, we mean successive observations of a system or phenomenon do not produce exactly the same result.

    Statistics gives us a framework for describing this variability and for learning about potential sources of variability.

    1-1 The Engineering Method and

    Statistical Thinking 16

  • Engineering Example The dot diagram is a very useful plot for displaying a small body of data - say up to about 20 observations. This plot allows us to see easily two features of the data; the location, or the middle, and the scatter or variability.

    1-1 The Engineering Method and

    Statistical Thinking 17

  • Engineering Example The engineer considers an alternate design and eight prototypes are built and pull-off force measured. The dot diagram can be used to compare two sets of data

    Figure 1-3 Dot diagram of pull-off force for two wall thicknesses.

    1-1 The Engineering Method and

    Statistical Thinking 18
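    Below is a minimal Python sketch (not part of the original slides) that summarizes the eight pull-off force measurements from the connector example and draws a dot diagram with matplotlib.

    # Dot diagram and summary statistics for the eight pull-off force
    # measurements (in pounds) from the nylon connector example.
    import statistics
    import matplotlib.pyplot as plt

    force = [12.6, 12.9, 13.4, 12.3, 13.6, 13.5, 12.6, 13.1]

    print("sample mean:", statistics.mean(force))   # location (the middle): 13.0
    print("sample std: ", statistics.stdev(force))  # scatter (variability)

    # One dot per observation along a single axis.
    plt.plot(force, [0] * len(force), "o")
    plt.yticks([])
    plt.xlabel("Pull-off force (pounds)")
    plt.title("Dot diagram of pull-off force")
    plt.show()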

  • Engineering Example Since pull-off force varies or exhibits variability, it is a random variable.

    A random variable, X, can be modeled by

    X = μ + ε

    where μ is a constant and ε a random disturbance.

    1-1 The Engineering Method and

    Statistical Thinking

    ε captures environmental effects, such as test equipment, differences in material, and others.

    19

  • 1-1 The Engineering Method and

    Statistical Thinking

    Examples: Ohm's law and the ideal gas law.

    These laws help us with designing products and processes.

    Data is not available about the whole population (e.g., all nylon connectors produced over the next two weeks, or all connectors sold to customers). So a sample is taken, measured, and considered to represent the whole population.

    20

  • Three basic methods for collecting data:

    A retrospective study using historical data

    An observational study

    A designed experiment

    21

    1-2 Collecting Engineering Data

  • A retrospective study using historical data: It may involve a lot of data, so extracting only the needed data (for solving a specific problem) is not an easy job. The data may contain relatively little useful information.

    Some information could be missing; other information may not have been recorded at all.

    Accuracy of the collected data could be an issue (outliers). Therefore you have to be careful before making general conclusions.

    22

    1-2 Collecting Engineering Data

  • An observational study

    It solves most of the problems of the retrospective method by collecting the needed data (sometimes more) with accuracy.

    To avoid disturbing the current process, some variations of interest cannot be tested. Therefore, experiments on the current system (or a model of it) will be necessary.

    23

    1-2 Collecting Engineering Data

  • 1-2.4 Design of Experiment

    Example 1: For the nylon connectors, two values of thickness are considered, to test their effect on the connectors' pull-off force.

    Therefore, the engineer is interested in determining if there is any difference between the 3/32-inch and the 1/8-inch connectors.

    Hypothesis testing can be used to answer the question.

    24

  • 1-2.4 Design of Experiment

    Example 2: Acetone-Butyl Alcohol Distillation column.

    The process has three factors

    Reboil Temperature

    Condensate Temperature

    Reflux Rate

    We want to study the effect of these factors on the concentration of the produced acetone.

    Each factor has two levels (values) that can be denoted as low (-1) and high (+1).

    So the number of combinations = 2 × 2 × 2 = 8 (enumerated in the sketch below).

    25
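    As a small illustration (not from the slides), the eight factor-level combinations can be enumerated mechanically in Python:

    # Enumerate the 8 runs of the 2x2x2 design for the distillation column.
    from itertools import product

    factors = ["reboil temperature", "condensate temperature", "reflux rate"]
    levels = (-1, +1)  # low / high coding used in the slides

    runs = list(product(levels, repeat=len(factors)))
    for run in runs:
        print(dict(zip(factors, run)))
    print("number of combinations:", len(runs))  # 2**3 = 8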

  • 1-2.4 Design of Experiment 26

  • 1-2.4 Designed Experiments

    Figure 1-5 The factorial design for the distillation column

    27

  • 1-2.4 Designed Experiments

    A four-factorial experiment for the distillation column

    Fourth factor can be considered: type of distillation column

    Number of combinations = 2^4 = 16

    The general form is 2^k: k factors with 2 levels each.

    28

  • 1-2.5 Observing Processes Over Time

    Whenever data are collected over time it is important to plot the data over time. Phenomena that might affect the system or process often become more visible in a time-oriented plot and the concept of stability can be better judged.

    The dot diagram illustrates variation but does not identify the problem.

    The large variation shown in the dot diagram indicates a lot of variability in concentration, but the chart does not help explain the reason for the variation. 29

  • 1-2.5 Observing Processes Over Time

    A time series plot of concentration provides more information than a dot diagram.

    The plot shows a shift in the process mean level and an estimate of the time of the shift can be obtained.

    30

  • 1-2.5 Observing Processes Over Time

    Deming's funnel experiment.

    Strategy 1: No adjustment

    Strategy 2: Adjust the funnel to compensate for the error (in the opposite direction of the error of the previous trial)

    31

    An experiment to understand the nature of variability in processes and systems over time. W. Edwards Deming: a very influential industrial statistician.

  • 1-2.5 Observing Processes Over Time

    Adjustments applied to random disturbances overcontrol the process and increase the deviations from the target.

    32

  • 1-2.5 Observing Processes Over Time

    Process mean shift is detected at observation number 57, and one adjustment (a decrease of two units) reduces the deviations from target.

    33

  • 1-2.6 Observing Processes Over Time

    A control chart for the chemical process concentration data.

    Some disturbance happened at sample # 20 because all of the following observations are below the center line and 2 of them are below the lower limit.

    34

  • 1-2.6 Observing Processes Over Time

    Enumerative versus analytic study.

    Enumerative Study: Collect data from a process to evaluate the current production (sampling in quality control).

    Analytic Study: Use data from current production to evaluate future production. This requires a stable process (use control charts to verify that).

    35

  • 1-3 Mechanistic and Empirical Models

    A mechanistic model is built from our underlying knowledge of the basic physical mechanism that relates several variables.

    Example: Ohm's Law

    Current = voltage/resistance

    I = E/R

    I = E/R + ε

    Due to uncontrolled factors (e.g., temperature, humidity, variations in voltage, and the measuring unit), the actual measured current can differ from E/R; the disturbance term ε accounts for this. 36

  • 1-3 Mechanistic and Empirical Models

    An empirical model is built from our engineering and

    scientific knowledge of the phenomenon, but is not

    directly developed from our theoretical or first-principles

    understanding of the underlying mechanism.

    37

  • 1-3 Mechanistic and Empirical Models

    Example

    Suppose we are interested in the number-average molecular weight (Mn) of a polymer. We know that Mn is related to the viscosity of the material (V), and it also depends on the amount of catalyst (C) and the temperature (T) in the polymerization reactor when the material is manufactured. The relationship between Mn and these variables is

    Mn = f(V, C, T), say, where the form of the function f is unknown. A first-order approximation is Mn = b0 + b1 V + b2 C + b3 T, where the b's are unknown parameters. These parameters can be estimated using the least squares method.

    38

  • Example:

    39

  • 1-3 Mechanistic and Empirical Models

    In general, this type of empirical model is called a regression model.

    The estimated regression line is given by

    40
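    As an illustration only (the data below are made up, NOT the textbook's wire bond measurements), a regression model y = b0 + b1*x1 + b2*x2 can be estimated by least squares with numpy:

    # Least-squares estimation of an empirical (regression) model.
    import numpy as np

    x1 = np.array([2.0, 4.0, 6.0, 8.0, 10.0, 12.0])    # first predictor
    x2 = np.array([50., 120., 200., 295., 400., 550.]) # second predictor
    y = np.array([10.1, 15.8, 22.3, 27.9, 34.2, 41.0]) # response

    X = np.column_stack([np.ones_like(x1), x1, x2])    # design matrix [1, x1, x2]
    b, *_ = np.linalg.lstsq(X, y, rcond=None)          # least-squares estimates
    print("estimated b0, b1, b2:", b)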

  • Three-dimensional plot of the wire and pull strength data.

    41

  • Plot of the predicted values of pull strength from the

    empirical model.

    42

  • 1-4 Probability and Probability Models

    Probability models help quantify the risks involved in statistical inference, that is, risks involved in decisions made every day.

    Probability provides the framework for the study and application of statistics.

    Example:

    a container has 25 items. One of them is defective.

    If a sample of size (n) is taken, what is the chance that the defective part will be detected?

    What is the risk of not detecting it?

    Probability models will quantify this risk.

    43
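    A minimal sketch (not in the slides) quantifying that risk: with 1 defective item among 25 and a sample of size n drawn without replacement, the chance of detection works out to n/25.

    # P(defective appears in the sample) for several sample sizes.
    from math import comb

    N, defectives = 25, 1
    for n in (1, 5, 10, 20):
        p_detect = 1 - comb(N - defectives, n) / comb(N, n)  # equals n/25 here
        print(f"n = {n:2d}: P(detect) = {p_detect:.2f}, "
              f"risk of missing = {1 - p_detect:.2f}")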

  • Probability versus Statistics

    Probability deals with predicting the likelihood of future events, while statistics involves the analysis of the frequency of past events.

    Probability is primarily a theoretical branch of mathematics. Statistics is an applied branch of mathematics.

    In summary, probability theory enables us to find the consequences of a given ideal world, while statistical theory enables us to measure the extent to which our world is ideal.

    44

  • Example: Probability The standard example is flipping a fair coin. Fair means, technically, that the probability of heads on a given flip is 50%, and the probability of tails on a given flip is 50%. This doesn't mean that every other flip will give a head; after all, three heads in a row is no surprise.

    Another example would be flipping an unfair coin, where we know ahead of time that there's a 60% chance of heads on each toss.

    A third example would be rolling a loaded die, where (for example) the chances of rolling 1, 2, 3, 4, 5, or 6 are 25%, 5%, 20%, 20%, 20%, and 10%, respectively.

    45

  • Probability

    1

  • Sample Spaces and Events

    2

  • Random Experiment

    3

    An experiment that can result in different outcomes, even though it is repeated in the same manner every time, is called a random experiment.

  • Random Experiments - Example

    Communication system such as voice communication network.

    The information capacity available (number of external lines) to service customers (answer their calls) is an important design consideration.

    Assuming that each line can carry only one conversation, how many lines should be purchased (too few or too many)?

    To answer this question, we need to develop a model that shows the number of calls and the duration of calls.

    Suppose you know that, on average, a call is received every 5 minutes and lasts 5 minutes. Is that enough information? If you deal with this problem in a deterministic manner, how many lines would be purchased?

    4

  • Random Experiments - Example

    Communication system

    5

    Conclusion: considering the variation in our analysis of communication systems is very important. A deterministic analysis based on average values does not match real-life behavior; a stochastic model is needed.

  • Sample Space

    6

    The set of all possible outcomes of a random experiment is called the sample space of the experiment. The sample space is denoted as S.

  • Sample Space - Example

    7

  • Sample Space - Example

    8

  • Sample Spaces Example 2

    9

    Or S = R+ × R+

  • Sample Spaces Example 2

    10

  • Tree Diagrams

    11

    Sample spaces can also be described graphically with tree diagrams.

    When a sample space can be constructed in several steps or stages, we can represent each of the n1 ways of completing the first step as a branch of a tree.

    Each of the ways of completing the second step can be represented as n2 branches starting from the ends of the original branches, and so forth.

  • Tree Diagrams - Example

    12

  • Tree Diagrams - Example

    13 Tree diagram for three messages.

  • Tree Diagrams

    Example 2:

    Assume we have a bag containing 6 balls (1 white, 2 red, 3 black). A random experiment consists of taking two balls from the bag without replacement.

    1- Use a tree diagram to find the sample space.

    S = {WR, WB, BW, BR, BB, RW, RR, RB}

    2- What will the sample space be if the two balls are selected with replacement?

    S = {WW, WR, WB, BW, BR, BB, RW, RR, RB}

    14
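    A minimal Python sketch (not in the slides) that enumerates both sample spaces programmatically:

    # Sample spaces for drawing two balls (1 white, 2 red, 3 black).
    from itertools import product

    colors = ["W", "R", "B"]
    counts = {"W": 1, "R": 2, "B": 3}

    # With replacement: every ordered pair of colors is possible.
    with_repl = {a + b for a, b in product(colors, repeat=2)}

    # Without replacement: a doubled color needs at least 2 such balls,
    # so WW is impossible (there is only one white ball).
    without_repl = {a + b for a, b in product(colors, repeat=2)
                    if a != b or counts[a] >= 2}

    print(sorted(without_repl))  # 8 outcomes, no 'WW'
    print(sorted(with_repl))     # 9 outcomes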

  • Events

    15

    An event is a subset of the sample space of a random experiment.

  • Representing Events with Sets

    16

  • Representing Events with Sets Venn Diagrams

    17

  • Event Representations - Example

    18

    Consider the sample space S = {yy, yn, ny, nn} in Example 2-2. Identify the following events:

    E1: the event in which at least one part conforms
    E2: the event in which at most one part conforms
    E3: the event in which at least one part does not conform
    E4: the event in which both parts do not conform
    E5: the event in which both parts conform

    Find the following: E1 ∪ E3, E1 ∩ E3, E1 ∩ E1′, E4 ∪ E4′, E4 ∩ E5

  • Event Representations - Example

    19

  • Mutually Exclusive Events

    20

  • Probability

    21

  • Probability

    22

  • Interpretations of Probability

    23

    Used to quantify likelihood or chance

    Used to represent risk or uncertainty in engineering applications

    Can be interpreted as our degree of belief or relative frequency

  • Interpretations of Probability

    24

    Relative frequency of corrupted pulses sent over a

    communications channel.

  • Axioms of Probability

    25

  • Equally Likely Outcomes

    26

  • Definition

    27

  • Example

    28

  • Addition Rules: Probability of Union

    29

  • Example

    30

    Suppose one wafer is selected at random.

    Let H denote the event that the wafer contains high level of contamination.

    Let C denote the event that the wafer is in the center of a sputtering tool.

    Find:

    P(H), P(C), P(H ∩ C), P(H ∪ C)

    The table below lists the history of 940 wafers in a semiconductor process.

  • Example Continued

    31

    Suppose one wafer is selected at random.

    Let H denote the event that the wafer contains high level of contamination.

    Let C denote the event that the wafer is in the centre of a sputtering tool.

    Find:

    P(H), P(C), P(H ∩ C), P(H ∪ C)

    P(H) = 358/940

    P(C) = 626/940

    P(H ∩ C) = probability that the wafer is from the centre of the sputtering tool and contains a high level of contamination = 112/940

    P(H ∪ C) = probability that the wafer is from the centre of the sputtering tool, or contains a high level of contamination, or both = P(H) + P(C) − P(H ∩ C) = 872/940
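    A minimal sketch (not in the slides) checking the addition rule on these counts:

    # 940 wafers: 358 with high contamination (H), 626 from the centre (C),
    # 112 in both.
    n, n_H, n_C, n_HC = 940, 358, 626, 112

    P_H, P_C, P_HC = n_H / n, n_C / n, n_HC / n
    P_H_or_C = P_H + P_C - P_HC            # addition rule
    print(P_H_or_C, 872 / 940)             # both ~0.9277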

  • Addition Rules: Mutually Exclusive Events

    32

  • Example

    33

    Example: semiconductor process with more details.

    1-What is the probability that a wafer was either at the edge or that it contains 4 or more particles?

    2-What is the probability that a wafer contains less than 2 particles, or it is both at the edge and contains more than 4 particles?

  • Addition Rules

    34

    What is the probability that a wafer was either at the edge or that it contains 4 or more particles?

    Let E1 denote the event that a wafer contains 4 or more particles

    Let E2 denote the event that a wafer is at the edge

    Then, the required probability is P(E1 ∪ E2)

    P(E1 ∪ E2) = P(E1) + P(E2) − P(E1 ∩ E2) = (0.05 + 0.10) + 0.28 − (0.01 + 0.03) = 0.39

  • Addition Rules

    35

    What is the probability that a wafer contains less than 2 particles, or it is both at the edge and contains more than 4 particles?

    Let E1 denote the event that a wafer contains less than 2 particles

    Let E2 denote the event that a wafer is both at the edge and contains more than 4 particles

    Then, the required probability is P(E1 ∪ E2)

    P(E1 ∪ E2) = P(E1) + P(E2) − P(E1 ∩ E2) = (0.2 + 0.4) + 0.03 − 0 = 0.63

    E1 and E2 are mutually exclusive; i.e., E1 ∩ E2 = ∅.

  • Addition Rules: Union of Three Events

    36

  • Addition Rules: Generalized Mutually Exclusive Events

    37

  • Addition Rules: Generalized Mutually Exclusive Events

    38

    Venn diagram of four mutually exclusive events

  • Conditional Probability

    To introduce conditional probability, consider an example involving manufactured parts.

    Let D denote the event that a part is defective and let F denote the event that a part has a surface flaw.

    Then, we denote the probability of D given, or assuming, that a part has a surface flaw as P(D|F). This notation is read as the conditional probability of D given F, and it is interpreted as the probability that a part is defective, given that the part has a surface flaw.

    39

  • Conditional Probability - Example

    40

    In a manufacturing process, 10% of parts contain visible surface flaws (F), and 25% of these flawed parts are defective (D). 5% of parts without surface flaws are defective.

    P(D|F) = 25%

    P(D|F′) = 5%

  • Conditional Probability Example

    41

    The table below shows an example of 400 parts classified by surface flaws and whether they are functionally defective.

    P(D|F) = 10/40 = 0.25

    P(D|F′) = 18/360 = 0.05

  • Conditional Probability

    42

    Same Last Example:

    P(D|F) = P(D ∩ F)/P(F) = (10/400)/(40/400) = 10/40

    P(F) = 40/400

    P(F|D) = P(F ∩ D)/P(D) = 10/28

    P(D) = P(D|F)P(F) + P(D|F′)P(F′)

    P(D) = 28/400 = (18/360)(360/400) + (10/40)(40/400)
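    A minimal sketch (not in the slides) reproducing these conditional probabilities from the table's counts:

    # 400 parts: 40 with surface flaws (F), of which 10 are defective (D);
    # 360 without flaws, of which 18 are defective.
    n_F, n_DF = 40, 10
    n_noF, n_DnoF = 360, 18
    n = n_F + n_noF                 # 400 parts in total

    print(n_DF / n_F)               # P(D|F)  = 10/40  = 0.25
    print(n_DnoF / n_noF)           # P(D|F') = 18/360 = 0.05
    print((n_DF + n_DnoF) / n)      # P(D)    = 28/400
    print(n_DF / (n_DF + n_DnoF))   # P(F|D)  = 10/28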

  • Multiplication Rule

    43

    10 parts from tool 1

    40 parts from tool 2

    What is the conditional probability that a part from tool 2 is selected at random (without replacement), given that a part from tool 1 is selected first?

    Let E1 denote the event that the first part is from tool 1, and E2 denote the event that the second part is from tool 2.

    P(E2|E1) = ?

  • Multiplication and Total Probability Rules

    Let E denote the outcome that the first part is from tool 1 and the second part is from tool 2.

    Find P(E).

    P(E2|E1) = 40/49 (from the previous slide)

    P(E1) = 10/50

    P(E) combines the probability of selecting a part from tool 1 with the conditional probability of selecting a part from tool 2 given that a part from tool 1 has already been selected.

    P(E) = P(E1 ∩ E2) = P(E1) P(E2|E1) = (10/50)(40/49) = 8/49

    44

  • Sometimes the probability of an event is given under each of several conditions.

    Example:

    In semiconductor manufacturing, the probability that a chip subjected to high levels of contamination causes a product failure is 0.10.

    The probability that a chip not subjected to high levels of contamination causes a product failure is 0.005.

    In a particular production run, 20% of the chips are subjected to high levels of contamination.

    What is the probability that a product using one of these chips fails?

    Total Probability Rule

    45
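    A minimal sketch (not in the slides) applying the total probability rule to this question:

    # P(F) = P(F|H)P(H) + P(F|H')P(H') for the contamination example.
    P_F_given_H = 0.10       # failure probability under high contamination
    P_F_given_notH = 0.005   # failure probability otherwise
    P_H = 0.20               # fraction of chips with high contamination

    P_F = P_F_given_H * P_H + P_F_given_notH * (1 - P_H)
    print(P_F)               # 0.024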

  • The previous example can be summarized as follows:

    Total Probability Rule

    46

  • Understanding the Total Probability Rule

    47

    Total Probability Rule

    Partitioning an event into two mutually exclusive subsets. Partitioning an event into several mutually exclusive subsets.

  • Total Probability Rule (Two Events)

    48

  • Example:

    Multiplication and Total Probability Rules

    49

  • Total Probability Rule: Multiple Events

    50

    Total Probability Rule (multiple events)

  • Independence

    51

    Definition (two events)

  • Independence

    52

    Definition (multiple events)

  • Bayes Theorem

    53

    Remember what we mentioned about total probability rules that

    sometimes information is given in terms of conditional probability.

    But after a random experiment generates an outcome, we may be

    interested in the probability that a condition was present (e.g. high

    contamination) given an outcome (e.g. a semiconductor failure).

    In other words; you would like to know the chance that the failure is due

    to the high contamination condition.

  • Bayes Theorem

    54

    Thomas Bayes addressed this issue in the 18th century and developed Bayes' theorem.

    Example: recall the example solved on slide 64.

    Now we want to know the probability P(H|F), i.e., a failure has happened and we ask for the chance that the high-contamination condition was present.

  • Bayes Theorem

    55

    P(H|F) = P(F|H) P(H)/P(F)

    P(F) was estimated on the previous slide: P(F) = 0.024

    P(H|F) = (0.10 × 0.20)/0.024 = 0.83

    We can use the total probability rule (multiple events) to obtain the general form of Bayes' theorem on the next slide.

  • Bayes Theorem

    56

    Bayes Theorem

    There are many conditions that could cause the outcome B (E1, E2, …, EK).

    All of these conditions are mutually exclusive. What is the chance that the current outcome was caused by the E1 condition?

  • Example: New Medical Procedure

    Because a new medical procedure has been shown to be effective in the early detection of an illness, a medical screening of the population is proposed.

    The probability that the test correctly identifies someone with the illness as positive is 0.99.

    The probability that the test correctly identifies someone without the illness as negative is 0.95.

    The incidence of the illness in the general population is 0.0001.

    You take the test and the result is positive. What is the probability that you have the illness? 57

  • Example: New Medical Procedure

    Let D denote the event that you have the illness,

    and let S denote the event that the test signals positive.

    The probability requested can be denoted as P(D|S).

    Why is it D|S?

    The test result is already known (either positive or negative), so this is the given. Now we are looking for the chance that it was caused by the condition of being ill or not ill. 58

  • Example: New Medical Procedure

    Based on these definitions, what is the given information:

    1) P(D) = 0.0001

    2) P(D′) = 1 − 0.0001

    3) P(S|D) = 0.99

    4) The probability that the test correctly signals someone without the illness as negative is 0.95. Consequently, the probability of a positive test without the illness is P(S|D′) = 0.05.

    59

  • Example: New Medical Procedure

    From Bayes Theorem:

    60

    P(D|S) = P(S|D) P(D) / [P(S|D) P(D) + P(S|D′) P(D′)]

    = 0.99(0.0001) / [0.99(0.0001) + 0.05(1 − 0.0001)]

    = 0.002 ≈ 1/506
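    A minimal sketch (not in the slides) of the same computation in Python:

    # Bayes' theorem for the screening example.
    P_D = 0.0001           # incidence of the illness
    P_S_given_D = 0.99     # P(positive | ill)
    P_S_given_notD = 0.05  # P(positive | not ill)

    P_S = P_S_given_D * P_D + P_S_given_notD * (1 - P_D)  # total probability
    print(P_S_given_D * P_D / P_S)   # P(D|S) ~ 0.00198, about 1 in 506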

  • Example: New Medical Procedure

    That is, the probability of having the illness given a positive result from the test is only 0.2% (about 1 in 500).

    Surprisingly, even though the test is effective, in the sense that P(S|D) is high and P(S|D′) is low, because the incidence of the illness in the general population is low, the chances are quite small that you actually have the disease even if the test is positive. 61

  • Chapter 3

    1

  • Random Variables

    2

  • Random Variables

    3

  • Introductory Example

    Suppose that a day's production of 850 manufactured parts contains 50 parts that do not conform to customer requirements.

    Two parts are selected at random, without replacement, from the batch.

    Let the random variable X equal the number of nonconforming parts in the sample.

    What are all the possible values of X, and what are the associated probabilities?

    This case is generalized on the next slide. 4

  • 3-1 Discrete Random Variables

    Many physical systems can be modeled by the same or

    similar random experiments and random variables.

    The distribution of the random variables involved in each

    of these common systems can be analyzed, and the results of

    the analysis can be used in different applications and

    examples.

    So instead of using the sample space of the random

    experiment we describe the distribution of a particular

    random variable.

    5

  • 3-1 Discrete Random Variables

    Example: Work Sampling

    A voice communication system for a business contains 48 external lines.

    At a particular time, the system is observed and some of the lines are used.

    X denotes the number of lines in use (0 ≤ X ≤ 48).

  • 3-2 Probability Distributions and

    Probability Mass Functions

    Probability Distribution: The probability distribution of a random variable X is a description of the probabilities associated with the possible values of X. For a discrete random variable, the distribution is often specified by just a list of the possible values along with the probability of each.

    In some cases, it is convenient to express the probability in terms of a formula.

    7

  • 3-2 Probability Distributions and

    Probability Mass Functions

    Example: Bits in error

    There is a chance that a bit transmitted in a digital transmission channel is received in error.

    Let X equal the number of bits in error in the next 4 bits transmitted. The possible values of X are (0,1,2,3,4).

    The probability distribution of X can be specified by the possible

    values along with the probability of each:

    P(X=0) = 0.656 P(X=1) = 0.292 P(X=2) = 0.049

    P(X=3) = 0.004 P(X=4) = 0.0001

    8

  • 3-2 Probability Distributions and

    Probability Mass Functions

    Probability distribution for bits in error. 9

  • 3-2 Probability Distributions and

    Probability Mass Functions

    Loadings at discrete points on a long, thin beam.

    Example: Loading on a long beam. At each discrete values of X, what is the probability of a certain weight.

    10

  • 3-2 Probability Distributions and

    Probability Mass Functions

    Definition

    11

  • Example

    X: a random variable denoting the number of semiconductor

    wafers that need to be analyzed to detect a large particle of

    contamination.

    Probability that a wafer contains a large particle is 1%.

    Determine the probability distribution of X.

    Let p denote a wafer on which a large particle is present

    Let a denote a wafer on which a large particle is absent 12

  • Example

    The sample space is infinite and consists of all sequences of a's that end with p.

    S = {p, ap, aap, aaap, aaaap, and so forth}

    Examples: P(X=1) = P(p) = 0.01

    P(X=2) = P(ap) = 0.99*0.01 = 0.0099

    13

  • Example 3-5 (continued)

    Describing the probabilities associated with X by the formula P(X = x) = 0.99^(x−1)(0.01), for x = 1, 2, 3, …, is the simplest method of describing the distribution in this example.

    14

  • 3-3 Cumulative Distribution Functions

    For the example of bits in error (slide # 9), if we would like to find P(X ≤ 3), we know that it is the union of the mutually exclusive events X=0, X=1, X=2, and X=3.

    Hence P(X ≤ 3) is the summation (or accumulation) of the probabilities of these events:

    P(X ≤ 3) = P(X=0) + P(X=1) + P(X=2) + P(X=3) = 0.9999

    Using the cumulative probabilities is another way to

    represent the probability distribution of a random variable.

    15

  • 3-3 Cumulative Distribution Functions

    Definition

    16

  • Example 3-8

    Suppose that a day's production of 850 manufactured parts contains 50 parts that do not conform to customer requirements.

    Two parts are selected at random, without replacement, from the batch.

    Let the random variable X equal the number of nonconforming parts in the sample.

    What is the probability mass function of X? What is the cumulative distribution function of X?

    17

  • Example 3-8

    X={0, 1, 2}

    18

    probability mass function

    cumulative distribution

  • Example 3-8

    Cumulative distribution function for Example 3-8.

    19
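    A minimal sketch (not in the slides) computing the probability mass and cumulative distribution functions for this example:

    # 850 parts, 50 nonconforming; sample of 2 drawn without replacement.
    from math import comb

    N, K, n = 850, 50, 2
    pmf = {x: comb(K, x) * comb(N - K, n - x) / comb(N, n) for x in range(n + 1)}
    cdf = {x: sum(pmf[k] for k in range(x + 1)) for x in range(n + 1)}
    print(pmf)  # f(0) ~ 0.886, f(1) ~ 0.111, f(2) ~ 0.003
    print(cdf)  # F(2) = 1.0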

  • Probability Distributions and Probability Mass Functions

    20

  • 21

  • 3-4 Mean and Variance of a Discrete

    Random Variable

    Definition

    22

  • 3-4 Mean and Variance of a Discrete

    Random Variable

    A probability distribution can be viewed as a loading with the mean

    equal to the balance point.

    Parts (a) and (b) illustrate equal means, but Part (a) illustrates a larger

    variance.

    23

  • 3-4 Mean and Variance of a Discrete

    Random Variable

    The probability distribution illustrated in Parts (a) and (b) differ even

    though they have equal means and equal variances.

    What does the length of each line represent in the above graphs? 24

  • Example 3-11

    Determine the mean and standard deviation of the number of messages sent per hour.

    The number of messages sent per hour over a computer network has the following distribution:

    25

  • 3-4 Mean and Variance of a Discrete

    Random Variable

    Expected Value of a Function [h(X)] of a Discrete Random Variable

    Where h(X) is any function of the random variable X.

    Example: the expected value of the function (X − μ)² is the variance of the random variable X (see slide # 19).

    26

  • 3-4 Mean and Variance of a Discrete

    Random Variable

    Example: Bits in error (see slide #9)

    Let X equal the number of bits in error in the next 4 bits transmitted. P(X=0) = 0.656; P(X=1) = 0.292; P(X=2) = 0.049; P(X=3) = 0.004; P(X=4) = 0.0001

    What is the expected value of X²? h(X) = X²

    E[h(X)] = 0²(0.656) + 1²(0.292) + 2²(0.049) + 3²(0.004) + 4²(0.0001) = 0.52

    27
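    A minimal sketch (not in the slides) of this computation:

    # E[h(X)] with h(X) = X**2 for the bits-in-error distribution.
    pmf = {0: 0.656, 1: 0.292, 2: 0.049, 3: 0.004, 4: 0.0001}
    print(sum(x**2 * p for x, p in pmf.items()))  # ~0.52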

  • Chapter 3

    1

    Outline Discrete Uniform Distribution Binomial Distribution Geometric Distribution Poisson Distribution

  • 3-5 Discrete Uniform Distribution

    The first digit of a part's serial number is equally likely to be any one of the digits 0 to 9. If a part is selected from a batch, what is the probability that the first digit is 8? The first digit is a discrete random variable with range

    R = {0, 1, 2, …, 9}

    The probability of each value in this range is equal. Hence, f(x) = 1/10 = 0.1 = 10%

    Example of a serial number of a product: 723-975632188

    2

  • 3-5 Discrete Uniform Distribution

    Probability mass function for a discrete uniform random variable.

    3

  • 3-5 Discrete Uniform Distribution

    Definition

    4

  • 3-5 Discrete Uniform Distribution

    Estimating the Mean and Variance

    Example: tossing a fair die.

    Face (B)   Probability (C)   B×C    (B−3.5)² (E)   C×E
    1          1/6               1/6    6.25           6.25/6
    2          1/6               2/6    2.25           2.25/6
    3          1/6               3/6    0.25           0.25/6
    4          1/6               4/6    0.25           0.25/6
    5          1/6               5/6    2.25           2.25/6
    6          1/6               6/6    6.25           6.25/6

    Sum: mean = 3.5, variance = 2.91667

    5

  • 3-5 Discrete Uniform Distribution

    Mean and Variance

    6

  • 3-5 Discrete Uniform Distribution

    Example of work sampling

    A voice communication system for a business contains 48 external lines. At a particular time, the system is observed and some of the lines are in use. X denotes the number of lines in use (0 ≤ X ≤ 48). Assume X is uniformly distributed over the range 0-48.

    E(X) = (48 + 0)/2 = 24

    σ = {[(48 − 0 + 1)² − 1]/12}^0.5 = 14.14 = standard deviation

    Let Y denote the proportion of the 48 lines that are in use at a particular time. Then Y = X/48.

    E(Y) = E(X)/48 = 24/48 = 0.5 = 50%

    V(Y) = V(X)/48² = 0.087

    7
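    A minimal sketch (not in the slides) of these calculations:

    # Discrete uniform on 0..48: number of lines in use.
    a, b = 0, 48
    E_X = (a + b) / 2                      # 24 lines
    V_X = ((b - a + 1) ** 2 - 1) / 12      # 200
    print(E_X, V_X ** 0.5)                 # std dev ~14.14

    # Y = X/48, the proportion of lines in use.
    print(E_X / 48, V_X / 48**2)           # 0.5 and ~0.087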

  • 8

    End of Discrete Uniform Distribution

  • Bernoulli Distribution

    Any example of a random experiment with (only) two possible outcomes?

    Examples:

    Tossing a coin. The gender of an expected baby. The result of a basketball match. Whether a produced part is good or defective.

    Two options:

    success with probability p

    failure with probability 1 − p

    9

  • Bernoulli Distribution

    What if a Bernoulli trial is repeated n times?

    Jacob Bernoulli (1655-1705)

  • 3-6 Binomial Distribution

    Random experiments and random variables

    11

  • 3-6 Binomial Distribution

    Random experiments and random variables

    12

  • 3-6 Binomial Distribution

    13

    Given n Bernoulli trials,

    how many successes and how many failures?

    e.g., the probability of the particular sequence success, failure, failure, success, failure, failure is P(1−P)(1−P)P(1−P)(1−P).

  • 3-6 Binomial Distribution

    Example:

    The chance that a bit transmitted through a digital

    transmission channel is received in error is 0.1.

    Assume that the transmission trials are independent.

    Let X = the number of bits in error in the next 4 bits

    transmitted.

    Determine P(X=3). 14

  • 3-6 Binomial Distribution

    Let E denote a bit in error, and O denote the bit is fine.

    The outcomes can be represented as follows:

    15
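    A minimal sketch (not in the slides) of the binomial computation:

    # P(X = 3) for n = 4 bits, each in error with probability p = 0.1.
    from math import comb

    n, p, x = 4, 0.1, 3
    print(comb(n, x) * p**x * (1 - p) ** (n - x))  # 4 * 0.001 * 0.9 = 0.0036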

  • 3-6 Binomial Distribution

    Definition

    16

  • 3-6 Binomial Distribution

    Mean and Variance

    17

  • 3-6 Binomial Distribution

    Binomial distributions for selected values of n and p.

    For a fixed n, the distribution becomes more symmetric as p increases from 0 to 0.5 or decreases from 1 to 0.5.

    For a fixed p, the distribution becomes more symmetric as n increases. 18

  • 3-6 Binomial Distribution

    Example

    19 P(X = x) = (number of outcomes that result in x errors) × (0.1)^x (0.9)^(16−x)

  • 3-6 Binomial Distribution

    Example (cont.)

    20

  • 3-6 Binomial Distribution

    Example (cont.)

    21

  • 3-6 Binomial Distribution

    Example

    For the number of transmitted bits received in error:

    n = 4, p=0.1,

    So E(X) = 4(0.1) = 0.4

    V(X) = 4(0.1)(0.9) = 0.36

    22

  • End Binomial Distribution

  • 3-7 Geometric Distribution

    24

    How many trials (including the success) are needed to get a success for the first time?

    (1−P) (1−P) (1−P) (1−P) P — success for the first time

    The number of successes is known (= 1); the number of trials is unknown (= the random variable).

  • 3-7 Geometric Distribution

    Example

    25

    1- What is the sample space?
    2- What are the possible values of X?
    3- What is the event associated with X=5?
    4- Calculate the probability of X=5, P(X=5)
    5- Derive the probability mass function of X

  • 3-7 Geometric Distribution

    Example

    26

  • 3-7 Geometric Distribution

    Definition

    27

  • 3-7 Geometric Distribution

    Definition

    28

  • 3-7 Geometric Distribution

    Geometric distributions for selected values of the parameter p.

    The probability decreases as in a geometric series. That is why it is called the geometric distribution.

    29

  • 3-7 Geometric Distribution

    Example X: a random variable denoting the number of semiconductor

    wafers that need to be analyzed to detect a large particle of

    contamination (see slide # 13).

    Probability that a wafer contains a large particle is 1%.

    What is the probability that exactly 125 wafers will be

    analyzed to find the first wafer with a large particle?

    30

  • 3-7 Geometric Distribution

    Example

    Let X denote the number of samples analyzed until a large

    particle is detected.

    Then X is a geometric random variable with p = 0.01.

    The requested probability is:

    31
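    A minimal sketch (not in the slides) of the geometric computation:

    # First wafer with a large particle is the 125th analyzed, p = 0.01.
    p, x = 0.01, 125
    print((1 - p) ** (x - 1) * p)  # ~0.0029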

  • 3-7 Geometric Distribution

    Example

    32

    X: a random variable denoting the number of times a die needs to be thrown to get face number 2 on top.

    What is the probability that exactly 4 trials will be needed to get face number 2 on top for the first time?

    P(X = 14 | X > 10) = ?

  • 3-7 Geometric Distribution

    Lack of Memory Property

    P(X = 106 | X > 100) = P(X = 6)

    33

  • 3-7 Geometric Distribution

    Lack of Memory Property

    34

  • Poisson Distribution

    Example

    Consider the transmission of n bits over a digital communication channel.

    Let the RV equal the number of bits in error.

    When the probability that a bit is in error is constant and the transmissions are independent, X has a binomial distribution.

    Let p denote the probability that a bit is in error.

    Let λ = pn. Then E(X) = pn = λ. 35

  • Poisson Distribution

    Now, suppose that the number of bits transmitted increases and the probability of an error decreases exactly enough that pn remains equal to a constant. That is, n increases and p decreases accordingly, such that E(X) = pn = λ remains constant.

    Then, with some work on the binomial probability, it can be shown that in the limit

    P(X = x) = e^(−λ) λ^x / x!,  x = 0, 1, 2, …

    36

  • Poisson Distribution

    So that:

    Also, because the number of bits transmitted tends to

    infinity, the number of errors can equal any nonnegative

    integer.

    Therefore, the range of X is the integers from zero to

    infinity. 37

  • Poisson Distribution

    Example: Flaws occur at random along the length of a thin copper wire.

    Let X denote the RV that counts the number of flaws in a length of L millimeters of wire, and suppose the average number of flaws in L is λ.

    Partition the length of wire into n subintervals of small length, say 1 micrometer each.

    38

  • Poisson Distribution

    Example

    If the subinterval chosen is small enough, the probability

    that more than one flaw occurs in the subinterval is

    negligible.

    Every subinterval has the same probability of containing

    a flaw, say p.

    Finally, if we assume that the probability that a

    subinterval contains a flaw is independent of other

    subintervals, then:

    39

  • Poisson Distribution

    Example

    E(X) = λ = np

    Then p = λ/n

    With small enough subintervals, n is very large and p is

    very small, the distribution of X can be obtained as in

    the previous example.

    40

  • Definition

    Poisson Distribution

    41

  • Consistent Units

    Poisson Distribution

    42

  • Poisson Distribution

    Example

    Contamination is a problem in the manufacture of optical

    storage disks.

    The number of particles of contamination that occur on an

    optical disk has a Poisson distribution, and the average number

    of particles per centimeter squared of media surface is 0.1.

    The area of a disk under study is 100 squared centimeters.

    Find the probability that 12 particles occur in the area of a disk

    under study.

    43

  • Poisson Distribution

    Let X denote the number of particles in the area of a disk

    under study.

    Because the mean number of particles is 0.1 particles per cm² and the disk area is 100 cm², then:

    E(X) = 100 × 0.1 = 10 particles

    Therefore:

    P(X = 12) = e^(−10) 10¹²/12! ≈ 0.095

    44

  • Poisson Distribution

    The probability that zero particles occur in the area of the disk is:

    P(X = 0) = e^(−10) ≈ 4.54 × 10⁻⁵

    Determine the probability that 12 or fewer particles occur in the area of the disk under study. The probability is:

    P(X ≤ 12) = Σ (x = 0 to 12) e^(−10) 10^x / x!

    You need a computer to evaluate it! 45
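    A minimal sketch (not in the slides) doing that computation:

    # Poisson probabilities for the optical-disk example, lambda = 10.
    from math import exp, factorial

    lam = 0.1 * 100                          # 0.1 particles/cm^2 over 100 cm^2

    def pmf(x):
        return exp(-lam) * lam**x / factorial(x)

    print(pmf(12))                           # ~0.095
    print(pmf(0))                            # e**-10 ~ 4.54e-05
    print(sum(pmf(x) for x in range(13)))    # P(X <= 12) ~ 0.79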

  • Mean and Variance

    Poisson Distribution

    46

  • Poisson Distribution

    47

  • Poisson Distribution

    Suppose that the number of flaws follows a Poisson distribution with a mean of 2.3 flaws per millimeter.

    Determine the probability of exactly 2 flaws in 1 millimeter of wire:

    P(X = 2) = e^(−2.3) 2.3²/2! = 0.265

    Determine the probability of 10 flaws in 5 millimeters of wire:

    E(X) = 5 mm × 2.3 flaws/mm = 11.5 flaws

    P(X = 10) = e^(−11.5) 11.5¹⁰/10! = 0.113

    Example: The copper wire

    48

  • Poisson Distribution

    Determine the probability of at least 1 flaw in 2 millimeters of wire.

    E(X) = 2 mm × 2.3 flaws/mm = 4.6 flaws

    P(X ≥ 1) = 1 − P(X = 0) = 1 − e^(−4.6) = 0.9899

    Example: The copper wire

    49

  • Example 3-118

    The number of failures of a testing instrument from contamination particles on the product is a Poisson random variable with a mean of 0.02 per hour.

    (a) what is the probability that the instrument does not fail in an 8-hour shift?

    (b) what is the probability of at least one failure in a 24-hour day?

    50
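    A minimal sketch (not in the slides) of one way to answer these questions:

    # N_t ~ Poisson(0.02 * t) failures in t hours: P(N_t = 0) = exp(-0.02 * t).
    from math import exp

    rate = 0.02
    print(exp(-rate * 8))       # (a) P(no failure in an 8-hour shift) ~ 0.852
    print(1 - exp(-rate * 24))  # (b) P(at least one failure in 24 hours) ~ 0.381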

  • Example 3 131

    The probability that your call to a service line is answered in less than 30 seconds is 0.75. Assume that your calls are independent.

    If you call 10 times, what is the probability that exactly 9 of your calls are answered within 30 seconds?

    If you call 20 times, what is the probability that at least 16 calls are answered in less than 30 seconds?

    If you call 20 times, what is the mean number of calls that are answered in less than 30 seconds?

    51
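    A minimal sketch (not in the slides) of one way to answer these questions:

    # Calls answered within 30 s: X ~ Binomial(n, p = 0.75).
    from math import comb

    def binom_pmf(n, p, x):
        return comb(n, x) * p**x * (1 - p) ** (n - x)

    print(binom_pmf(10, 0.75, 9))                              # ~0.188
    print(sum(binom_pmf(20, 0.75, x) for x in range(16, 21)))  # P(X >= 16)
    print(20 * 0.75)                                           # mean: 15 calls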

  • 4-1 Continuous Random Variables

    2

    The objective of this chapter is to introduce some continuous random distributions.

  • 4-2 Probability Distributions and

    Probability Density Functions

    Density function of a loading on a long, thin beam.

    Density functions are commonly used to describe physical systems.

    The load over the interval can be found by integrating the density function over the interval.

    You cannot find the load at a discrete point, but you can find the load over an interval of length. How?

  • 4-2 Probability Distributions and

    Probability Density Functions

    Probability determined from the area under f(x).

    Similar to a discrete RV, the probability density function f(x) can be used to describe the probability distribution of a continuous random variable X.

    The probability that the random variable X is between points a and b is the integral of f(x) from a to b.

  • 4-2 Probability Distributions and

    Probability Density Functions

    Definition

  • 4-2 Probability Distributions and

    Probability Density Functions

    Histogram approximates a probability density function.

    Probability of each interval

  • 4-2 Probability Distributions and

    Probability Density Functions

    P(X=x) = 0

  • 4-2 Probability Distributions and

    Probability Density Functions

    Example

    If a part with a diameter larger than 12.60 millimeters is scrapped, what proportion of parts is scrapped?

    A part is scrapped if X > 12.60.

  • 4-2 Probability Distributions and

    Probability Density Functions

    Probability density function for previous example

  • 4-2 Probability Distributions and

    Probability Density Functions

    Example (cont.)
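    A minimal sketch (not in the slides); the slide's density appeared only as a figure, so the form used below, f(x) = 20 e^(−20(x − 12.5)) for x ≥ 12.5, is an ASSUMPTION based on the textbook's version of this drilling example:

    # Proportion of scrapped parts, P(X > 12.60), under the assumed density.
    from math import exp

    def F(x):  # cumulative distribution function of the assumed density
        return 0.0 if x < 12.5 else 1.0 - exp(-20 * (x - 12.5))

    print(1 - F(12.60))  # e**-2 ~ 0.135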

  • 4-3 Cumulative Distribution Functions

    Definition

  • 4-3 Cumulative Distribution Functions

    Example

    For the drilling operation in the previous example, F(x) consists of two expressions:

    And for 12.5 ≤ x:

  • 4-3 Cumulative Distribution Functions

    Example (cont.)

    Therefore,

    The figure below displays a graph of F(x).

    Cumulative distribution function

  • 4-4 Mean and Variance of a Continuous

    Random Variable

    Definition

  • 4-4 Mean and Variance of a Continuous

    Random Variable

    Expected Value of a Function of a Continuous

    Random Variable

  • Exercises in section 4-2, 4-3, 4-4

  • UNIFORM DISTRIBUTION

    NORMAL DISTRIBUTION

    Exponential Distribution

  • 4-5 Continuous Uniform Random

    Variable

    Definition

  • 4-5 Continuous Uniform Random

    Variable

    Continuous uniform probability density function.

  • 4-5 Continuous Uniform Random

    Variable

    Mean and Variance

  • 4-5 Continuous Uniform Random

    Variable

    Example

    What is the probability that a measurement of current is between 5 and 10 mA?

    X Continuous Uniform Random Variable with a range of 0 to 20 mA (i.e., a=0 and b=20).

    f(x) = 1/(20 − 0) = 0.05, so P(5 < X < 10) = (10 − 5)(0.05) = 0.25

  • 4-5 Continuous Uniform Random

    Variable

    Example: The mean and variance formulas can be applied with a = 0 and b = 20. Therefore, E(X) = 10 mA and V(X) = 20²/12 = 33.33 mA².

    Consequently, the standard deviation of X is 5.77 mA.

  • 4-5 Continuous Uniform Random

    Variable

    The cumulative distribution function of a continuous uniform random variable is obtained by integration. If a ≤ x ≤ b, F(x) = (x − a)/(b − a).

    Therefore, the complete description of the cumulative distribution function of a continuous uniform random variable is: F(x) = 0 for x < a; F(x) = (x − a)/(b − a) for a ≤ x < b; F(x) = 1 for b ≤ x.
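    A minimal sketch (not in the slides) for the current example:

    # X uniform on [0, 20] mA.
    a, b = 0.0, 20.0
    print((10 - 5) / (b - a))          # P(5 < X < 10) = 0.25
    print((a + b) / 2)                 # E(X) = 10 mA
    print(((b - a) ** 2 / 12) ** 0.5)  # standard deviation ~5.77 mA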

  • Exercise

    A movie theatre scheduled three sessions for a movie at 5:00pm, 7:00pm and 9:00pm.

    Once the movie starts, the gate will be closed. A visitor will arrive at the movie theatre at a time uniformly distributed between 4:00pm and 9:00pm (i.e., T ~ U[4:00, 9:00]).

    Determine the cumulative distribution function of the arrival time in minutes (hint: set 4:00pm = 0 min).

    Use the cumulative distribution function to determine the following:

    1- The probability that the visitor attends at least one movie session

    2- The probability that the visitor waits more than 20 min for a movie session

    3- The probability that the visitor waits less than 10 min for a movie session

    4- The probability that the visitor attends the second show knowing that he missed the first show

  • Determine the cumulative distribution function of the arrival time in minutes (hint: set 4:00pm = 0 min)

    T ~ U[0, 300], so

    F(t) = 0 for t < 0; F(t) = t/300 for 0 ≤ t < 300; F(t) = 1 for 300 ≤ t.

    1- The probability that the visitor attends at least one movie session

    According to the arrival distribution, the visitor will come at some time between 4pm and 9pm (0 and 300 minutes), which means he will attend one movie session for sure (worst case, he arrives at 9pm and attends the third session):

    P(attending at least one movie session) = P(0 ≤ T ≤ 300) = F(300) − F(0) = 1

    2- The probability that the visitor waits more than 20 min for a movie session

    The sessions start at t = 60, 180 and 300 minutes, so the visitor waits more than 20 minutes if he arrives in [0, 40], [60, 160] or [180, 280]:

    P = P(0 ≤ T ≤ 40) + P(60 ≤ T ≤ 160) + P(180 ≤ T ≤ 280) = (40 + 100 + 100)/300 = 240/300 = 0.8

    3- The probability that the visitor waits less than 10 min for a movie session

    P = P(50 ≤ T ≤ 60) + P(170 ≤ T ≤ 180) + P(290 ≤ T ≤ 300) = (10 + 10 + 10)/300 = 30/300 = 0.1

    4- The probability that the visitor attends the second show knowing that he missed the first show

    Let A be the event of being able to attend the second session (60 < T ≤ 180), and let B be the event of missing the first session (T > 60). Then

    P(A|B) = P(60 < T ≤ 180)/P(T > 60) = [F(180) − F(60)]/[1 − F(60)] = (120/300)/(240/300) = 120/240 = 0.5
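    A minimal simulation sketch (not in the slides) cross-checking these answers:

    # Arrival uniform on [0, 300] minutes; sessions at t = 60, 180, 300.
    import random

    sessions = [60, 180, 300]
    N = 200_000
    arrivals = [random.uniform(0, 300) for _ in range(N)]
    waits = [min(s - t for s in sessions if s >= t) for t in arrivals]

    print(sum(w > 20 for w in waits) / N)   # ~0.8
    print(sum(w < 10 for w in waits) / N)   # ~0.1
    missed_first = [t for t in arrivals if t > 60]
    print(sum(t <= 180 for t in missed_first) / len(missed_first))  # ~0.5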

  • 4-6 Normal Distribution

    Definition

  • 4-6 Normal Distribution

    Normal probability density functions for selected values of the parameters μ and σ².

  • 4-6 Normal Distribution

  • 4-6 Normal Distribution

    Definition : Standard Normal

  • 4-6 Normal Distribution

    Example

    Standard normal probability density function

    Assume that Z is a standard normal random variable.

    Find P(Z ≤ 1.5).

    Appendix Table II provides P(Z ≤ z): read the row for 1.5 and the column for 0.00, since z = 1.50 = 1.5 + 0.00.

    P(Z ≤ 1.5) = 0.9332

    P(Z ≤ 1.93): z = 1.93 = 1.9 + 0.03, so P(Z ≤ 1.93) = 0.9732

    P(Z ≤ 1.938) ≈ P(Z ≤ 1.93) or P(Z ≤ 1.94)


  • 4-6 Normal Distribution

    Example

    Find P(Z ≤ 1.53): z = 1.53 = 1.5 + 0.03.

  • 4-6 Normal Distribution

    Standardizing

  • 4-6 Normal Distribution

    Example

    Let X denote the current in mA; here X is normal with mean 10 and standard deviation 2.

    The requested probability is P(X < 13).

    Transform it to Z = (X − 10)/2.

    So X < 13 corresponds to Z < (13 − 10)/2 = 1.5.

    Hence we look up P(Z < 1.5) in the standard normal table; the table gives P(Z < 1.5) = 0.9332.

  • 4-6 Normal Distribution

    Standardizing a normal random variable.

  • 4-6 Normal Distribution

    To Calculate Probability

  • 4-6 Normal Distribution

    Example (cont.)

    Based on the previous example, what is the probability that the current

    is between 9 and 11 mA?

  • 4-6 Normal Distribution

    Example (cont.)

    Determine the value for which the probability that a current is below it is 0.98.

    So we need x such that P(X ≤ x) = 0.98.

    Determining the value of x to meet a specified probability.

  • 4-6 Normal Distribution

    Example (cont.)

    From Appendix Table II, we can find the value of z that gives a probability of 0.98: z = 2.05.

    2.05 = (x − 10)/2, so x = 14.1 mA.
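    A minimal sketch (not in the slides) reproducing these lookups with scipy instead of Appendix Table II:

    from scipy.stats import norm

    print(norm.cdf(1.5))                   # P(Z <= 1.5) ~ 0.9332
    z = norm.ppf(0.98)                     # z with P(Z <= z) = 0.98, ~2.054
    print(10 + 2 * z)                      # x = mu + sigma*z ~ 14.1 mA

    # Current example: X ~ N(mu = 10, sigma = 2).
    print(norm.cdf(13, loc=10, scale=2))   # P(X < 13)
    print(norm.cdf(11, loc=10, scale=2) - norm.cdf(9, loc=10, scale=2))  # P(9 < X < 11)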

  • 4-8 Exponential Distribution

    Definition

  • 4-8 Exponential Distribution

    Mean and Variance

  • 4-8 Exponential Distribution

    Example

    In a large corporate computer network, user log-ons to the system can be modeled as a Poisson process with a mean of 25 log-ons per hour.

    What is the probability that there are no log-ons in an

    interval of 6 minutes?

  • 4-8 Exponential Distribution

    Let X denote the time in hours from the start of the interval until the first log-on.

    Then X has an exponential distribution with λ = 25 log-ons/hour.

    Required: P(X > 6 minutes)

    λ is given per hour, so we need to express all times in hours:

    6 minutes = 0.1 hour!

    Example (cont.)
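    A minimal sketch (not in the slides) of the computation:

    # X exponential with rate 25 per hour; P(no log-on in 6 min) = P(X > 0.1 h).
    from math import exp
    print(exp(-25 * 0.1))  # ~0.082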

  • 4-8 Exponential Distribution

    Probability for the exponential distribution in Example 4-21.

    Example (cont.)

  • 4-8 Exponential Distribution

    Example (cont.)

  • 4-8 Exponential Distribution

    Example (cont.)

  • 4-8 Exponential Distribution

    Example (cont.)

  • 4-8 Exponential Distribution

    Example 2

    Let X denote the time between detections of a particle with a Geiger counter.

    Assume that X has an exponential distribution with a mean of 1.4 minutes.

    The probability that we detect a particle within 30 seconds of starting the counter is:

    P(X < 0.5 minute) = 1 − e^(−0.5/1.4) = 0.30

  • 4-8 Exponential Distribution

    Example 2 (cont.)

    Now suppose we turn on the Geiger counter and wait 3 minutes without detecting a particle.

    What is the probability that a particle is detected in the next 30 seconds?

    Do you think that the probability will be higher than 0.3?

    No.

    Why?

    This is the nature of the exponential distribution!

    Prove it!

  • 4-8 Exponential Distribution

    This situation can be expressed as the conditional probability P(X < 3.5 | X > 3).

    P(X < 3.5 | X > 3) = P(3 < X < 3.5) / P(X > 3),

    where

    Example 2 (cont.)

  • 4-8 Exponential Distribution

    Example (cont.)

    P(3 < X < 3.5) = 0.035 and P(X > 3) = 0.117.

    Therefore P(X < 3.5 | X > 3) = 0.035/0.117 = 0.30.

    The fact that you waited 3 minutes without a detection does not change the probability of a detection in the next 30 seconds.
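    A minimal sketch (not in the slides) verifying the lack-of-memory numbers:

    # X exponential with mean 1.4 minutes.
    from math import exp

    lam = 1 / 1.4                           # rate per minute
    surv = lambda t: exp(-lam * t)          # P(X > t)

    print((surv(3) - surv(3.5)) / surv(3))  # P(X < 3.5 | X > 3) ~ 0.30
    print(1 - surv(0.5))                    # P(X < 0.5), also ~0.30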

  • 4-8 Exponential Distribution

    Lack of Memory Property

    So we have two distributions with this property: one for discrete RVs (geometric) and one for continuous RVs (exponential).

  • 5-2 TWO OR MORE CONTINUOUS RANDOM VARIABLES

  • + 5-1 Two Discrete Random Variables

    In previous chapters, we studied the probability distributions

    of a single random variable.

    Sometimes it is useful to study more than one random

    variable in a random experiment.

    Example 1: transmitted signals

    X= the number of high quality signals

    Y = the number of low quality signals

  • + 5-1 Two Discrete Random Variables

    Example 2: Injection molded parts

    X = length of dimension of injected part

    Y = length of another dimension of the injected part

    Let the specs for X be 2.95 to 3.05, and for Y 7.60 to 7.80.

    We may be interested in the probability that a part satisfies both specs; that is, P(2.95 ≤ X ≤ 3.05 and 7.60 ≤ Y ≤ 7.80).

    In general, if X and Y are two random variables, the probability distribution that defines their simultaneous behavior is called a joint probability distribution.

  • + 5-1 Two Discrete Random Variables

    Example 5-1

    Calls are made to check the airline schedule at your departure city.

    You monitor the number of bars of signal strength on your cell phone

    and the number of times you have to state the name of your departure

    city before the voice system recognizes the name.

    Let:

    X denote the number of bars of signal strength on your cell phone

    Y denote the number of times you need to state your departure city

  • +Joint probability distribution of X and Y in Example 5-1.

    By specifying the probability of each of the points, we define the range of the random variables (X, Y) to be the set of points (x, y) in two-dimensional space for which the probability that X = x and Y = y is positive.

    The joint probability distribution of the two random variables is sometimes referred to as the bivariate probability distribution or bivariate distribution.

  • + 5-1 Two Discrete Random Variables

    5-1.1 Joint Probability Distributions

  • + 5-1 Two Discrete Random Variables

    5-1.2 Marginal Probability Distributions

    Example 5-2 Find the marginal probability distribution of X in example 5.1:

    P(X=3) = P(X=3, Y=1)+ P(X=3, Y=2)+ P(X=3, Y=3)+ P(X=3, Y=4)

    = 0.25 + 0.2 + 0.05 + 0.05 = 0.55
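
    As a sketch, the same marginal sum can be written as a loop over the joint
    pmf stored as a Python dictionary (only the X = 3 column of Example 5-1 is
    entered below; the other columns are omitted):

        # joint pmf f_XY stored as {(x, y): probability}
        f_xy = {(3, 1): 0.25, (3, 2): 0.20, (3, 3): 0.05, (3, 4): 0.05}

        def marginal_x(f, x):
            # f_X(x) = sum over y of f_XY(x, y)
            return sum(p for (xi, _), p in f.items() if xi == x)

        print(marginal_x(f_xy, 3))   # 0.55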

  • + 5-1 Two Discrete Random Variables

    Example 5-3

    Based on example 5-1:

    P(Y=1| X=3) = P(X=3, Y=1)/P(X=3)

    = fXY(3,1)/fX(3) = 0.25/0.55 = 0.454

    The probability that Y=2 given X=3 is:

    P(Y=2| X=3) = P(X=3, Y=2)/P(X=3)

    = fXY(3,2)/fX(3) = 0.2/0.55 = 0.364

  • + 5-1 Two Discrete Random Variables

    5-1.3 Conditional Probability Distributions

    The conditional pmf of Y given X = x is fY|x(y) = fXY(x,y)/fX(x), for fX(x) > 0.

  • + 5-1 Two Discrete Random Variables

    5-1.3 Conditional Probability Distributions

  • +Example 5-4

    Conditional probability distributions of Y given X, fY|x(y) in Example 5-6.

  • 5-1 Two Discrete Random Variables

    Definition: Conditional Mean and Variance

    E(Y|x) = μ_Y|x = Σ_y y fY|x(y)   and   V(Y|x) = σ²_Y|x = Σ_y (y - μ_Y|x)² fY|x(y)

  • + 5-1 Two Discrete Random Variables

    Example 5-5

    Based on example 5-1, the conditional mean of Y given X=1 is

    obtained from the conditional distribution in figure 5.3:

    E(Y|1) = μ_Y|1 = 1(0.05) + 2(0.1) + 3(0.1) + 4(0.75) = 3.55

    The conditional mean is interpreted as the expected number of times

    the city name is stated given that one bar of signal is present.

    The conditional variance of Y given X=1 is:

    V(Y|1) = σ²_Y|1 = (1-3.55)²(0.05) + (2-3.55)²(0.1) + (3-3.55)²(0.1) + (4-3.55)²(0.75) = 0.748
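
    The same two sums in Python; the dictionary below is the conditional
    distribution f_Y|1(y) used in Example 5-5:

        f_y_given_1 = {1: 0.05, 2: 0.10, 3: 0.10, 4: 0.75}   # f_Y|X=1(y)

        cond_mean = sum(y * p for y, p in f_y_given_1.items())
        cond_var = sum((y - cond_mean) ** 2 * p for y, p in f_y_given_1.items())

        print(cond_mean, cond_var)   # 3.55 and 0.7475 (0.748 rounded)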

  • + 5-1 Two Discrete Random Variables

    5-1.4 Independence

    In some random experiments, knowledge of the value of X does not

    change any of the probabilities associated with the values of Y.

    Example 5-6

    In a plastic molding operation, each part is classified as to whether it

    conforms to color and length specifications.

    Define the random variables X & Y as: X = 1 if the part conforms to the

    color spec (0 otherwise), and Y = 1 if it conforms to the length spec (0 otherwise).

  • + 5-1 Two Discrete Random Variables

    Example 5-6 (cont.) Assume the joint probability

    distribution of X & Y is defined by

    fXY(x,y) in the Figure. The marginal probability

    distributions of X & Y are also

    shown.

    Note that fXY(x,y) = fX(x) fY(y)

  • + 5-1 Two Discrete Random Variables

    Example 5-6 (cont.)

    The conditional probability mass

    function fY|X(y) is shown in Figure.

    Notice that for any x, fY|x(y) = fY(y). That is, knowledge of whether or not the part meets the color specification

    does not change the probability that it meets length specifications.
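
    A quick numerical test of independence is to compare f_XY(x,y) with
    f_X(x)*f_Y(y) in every cell. The joint table below is a hypothetical
    stand-in (the actual numbers of Example 5-6 are in the figure, not
    reproduced here); the check itself is general:

        # hypothetical joint pmf, constructed so that X and Y are independent
        f_xy = {(0, 0): 0.02, (0, 1): 0.08, (1, 0): 0.18, (1, 1): 0.72}

        xs = {x for x, _ in f_xy}
        ys = {y for _, y in f_xy}
        f_x = {x: sum(f_xy[(x, y)] for y in ys) for x in xs}   # marginal of X
        f_y = {y: sum(f_xy[(x, y)] for x in xs) for y in ys}   # marginal of Y

        independent = all(abs(f_xy[(x, y)] - f_x[x] * f_y[y]) < 1e-12
                          for x in xs for y in ys)
        print(independent)   # True: f_XY(x,y) = f_X(x) f_Y(y) in every cell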

  • + 5-1 Two Discrete Random Variables

    5-1.4 Independence

    The discrete random variables X and Y are independent if fXY(x,y) = fX(x) fY(y) for all x and y; equivalently, fY|x(y) = fY(y).

  • + Announcement

    Midterm Exam 2 has been moved from MON 17 December to SAT 22 DEC, 10:00-11:30.


  • +

    Covariance and Correlation

  • + 5-3 Covariance and Correlation

    Definition

    cov(X,Y) = σ_XY = E[(X - μ_X)(Y - μ_Y)] = E(XY) - μ_X μ_Y

    Covariance is a measure of the linear relationship between two random variables.

  • + 5-3 Covariance and Correlation

    Determine the covariance for the following joint

    probability distribution.

  • + 5-3 Covariance and Correlation

  • + 5-3 Covariance and Correlation

    Definition

    If cov(X,Y) > 0, then Y tends to increase as X increases;

    if cov(X,Y) < 0, then Y tends to decrease as X increases;

    if cov(X,Y) = 0, there is no linear relationship between X and Y.

  • + 5-3 Covariance and Correlation

    Definition

  • + 5-3 Covariance and Correlation

    Figure 5-13 Joint probability distributions and the sign of covariance between X and Y.

  • + 5-3 Covariance and Correlation

    If cov(X,Y) > 0, then Y tends to increase as X increases;

    if cov(X,Y) < 0, then Y tends to decrease as X increases.

  • + 5-3 Covariance and Correlation

    Definition

    ρ_XY = corr(X,Y) = cov(X,Y) / (σ_X σ_Y), and -1 ≤ ρ_XY ≤ +1.

  • + 5-3 Covariance and Correlation

    The correlation coefficient is +1 in the case of a perfect positive

    (increasing) linear relationship and -1 in the case of a perfect negative

    (decreasing) linear relationship (anticorrelation), and it takes some value

    between -1 and +1 in all other cases, indicating the degree of linear

    dependence between the variables. As it approaches zero there is

    less of a relationship (closer to uncorrelated). The closer the coefficient is

    to either -1 or +1, the stronger the correlation between the variables.

  • + 5-3 Covariance and Correlation

    Example 5-26

    Joint pmf fXY(x,y) (columns: x = 0,...,3; rows: y = 0,...,3):

         x=0   x=1   x=2   x=3
    y=0  0.2    -     -     -
    y=1   -    0.1   0.1    -
    y=2   -    0.1   0.1    -
    y=3   -     -     -    0.4

  • + 5-3 Covariance and Correlation

    Example 5-26 (continued)
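
    A sketch of the covariance and correlation computation in Python. Note
    that the cell placement in the table above is a reconstruction, so treat
    the specific numbers as illustrative; the computation pattern, using
    cov(X,Y) = E(XY) - E(X)E(Y) and ρ = cov(X,Y)/(σ_X σ_Y), is general:

        # joint pmf as reconstructed above: {(x, y): probability}
        f_xy = {(0, 0): 0.2, (1, 1): 0.1, (1, 2): 0.1,
                (2, 1): 0.1, (2, 2): 0.1, (3, 3): 0.4}

        ex = sum(x * p for (x, y), p in f_xy.items())             # E(X)
        ey = sum(y * p for (x, y), p in f_xy.items())             # E(Y)
        exy = sum(x * y * p for (x, y), p in f_xy.items())        # E(XY)
        vx = sum((x - ex) ** 2 * p for (x, y), p in f_xy.items()) # V(X)
        vy = sum((y - ey) ** 2 * p for (x, y), p in f_xy.items()) # V(Y)

        cov = exy - ex * ey
        rho = cov / (vx ** 0.5 * vy ** 0.5)
        print(cov, rho)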

  • 6-1 Numerical Summaries

    Definition: Sample Mean

    x̄ = (x_1 + x_2 + ... + x_n)/n = (Σ_{i=1}^{n} x_i)/n

  • 6-1 Numerical Summaries

    Example 6-1

  • 6-1 Numerical Summaries

    Fulcrum

    The sample mean as a balance point for a system of weights.

  • 6-1 Numerical Summaries

    Population Mean

    For a finite population with N measurements, the mean is

    μ = (x_1 + x_2 + ... + x_N)/N

    The sample mean is a reasonable estimate of the

    population mean.

  • 6-1 Numerical Summaries

    Definition: Sample Variance

    s² = Σ_{i=1}^{n} (x_i - x̄)² / (n - 1)

    Why (n-1)?

    (1) The x_i tend to be closer to the sample mean x̄ than to the true mean μ. To compensate for that, and to avoid obtaining a variance that is often smaller than the true variance σ², we divide by (n-1).

    (2) The number of degrees of freedom of the sum is (n-1), because the n deviations (x_i - x̄) must sum to zero.

  • 6-1 Numerical Summaries

    Example 6-2

    The table below displays the quantities needed for calculating the sample variance and sample standard deviation for the pull-off force data.

  • 6-1 Numerical Summaries

    Example 6-2

  • 6-1 Numerical Summaries

    Computation of s²

    Shortcut formula: s² = [ Σ x_i² - ( Σ x_i )²/n ] / (n - 1)
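
    Both formulas in Python (a minimal sketch; the eight observations are the
    pull-off force values used in this chapter's examples, entered by hand):

        data = [12.6, 12.9, 13.4, 12.3, 13.6, 13.5, 12.6, 13.1]

        n = len(data)
        xbar = sum(data) / n                                    # sample mean: 13.0

        s2_def = sum((x - xbar) ** 2 for x in data) / (n - 1)   # definitional formula
        s2_short = (sum(x * x for x in data) - sum(data) ** 2 / n) / (n - 1)

        print(xbar, s2_def, s2_short)   # the two variance formulas agree (about 0.229)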

  • 6-1 Numerical Summaries

    Population Variance

    When the population is finite and consists of N values, we

    may define the population variance as

    σ² = Σ_{i=1}^{N} (x_i - μ)² / N

    The sample variance is a reasonable estimate of the population variance.

  • 6-1 Numerical Summaries

    Definition

  • 6-2 Stem-and-Leaf Diagrams

    A stem-and-leaf diagram is a good way to obtain an informative visual

    display of a data set x1, x2, ..., xn, where each number xi consists of at

    least two digits.

    To construct a stem-and-leaf diagram, use the following steps:

    (1) Divide each number x_i into a stem (one or more leading digits) and a leaf (the remaining digit).

    (2) List the stem values in a vertical column.

    (3) Record the leaf for each observation beside its stem.

    (4) Write the units for the stems and leaves on the display.
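
    A short sketch that applies these steps for two-or-more-digit positive
    integers (stem = all digits but the last, leaf = the last digit):

        from collections import defaultdict

        def stem_and_leaf(data):
            # stem = x // 10 (leading digits), leaf = x % 10 (ones digit)
            stems = defaultdict(list)
            for x in sorted(data):
                stems[x // 10].append(x % 10)
            for stem in sorted(stems):
                leaves = "".join(str(leaf) for leaf in stems[stem])
                print(f"{stem:3d} | {leaves}")

        # a handful of illustrative strength values (psi)
        stem_and_leaf([105, 97, 245, 163, 207, 134, 218, 199, 160, 196])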

  • 6-2 Stem-and-Leaf Diagrams

    Data of compressive strengths, in pounds per square inch, of 80 specimens of a new aluminum-lithium alloy undergoing evaluation

    as a possible material for aircraft structural elements.

  • 6-2 Stem-and-Leaf Diagrams

    Stem-and-leaf diagram for

    the compressive strength

    data in Table 6-2.

    Units:

    Stem: Tens and hundreds

    (psi)

    Leaf: Ones (psi)

  • 6-2 Stem-and-Leaf Diagrams

    The last column is the frequency count of the number of

    leaves associated with each stem.

    Inspection of this display reveals that most of the

    compressive strengths lie between 110 and 200 psi and that

    a central value is somewhere between 150 and 160 psi.

    The stem-and-leaf diagram enables us to determine quickly

    some features of the data that were not immediately

    obvious in the original display in the table.

  • 6-2 Stem-and-Leaf Diagrams

    Example 6-5

    Too few stems, moderate number of stems, or too many?

    With too few stems, the display does not provide much information.

  • 6-2 Stem-and-Leaf Diagrams

    Example 6-5

    Too few stems, moderate number of stems, or too many?

    Splitting each stem into L (leaves 0,1,2,3,4) and U (leaves 5,6,7,8,9)

    provides more information.

  • 6-2 Stem-and-Leaf Diagrams

    Example 6-5

    Too few stems, moderate number of stems, or too many?

    Splitting each stem five ways: z (0,1), t (2,3), f (4,5), s (6,7), e (8,9).

    With too many stems, the display does not tell much about

    the shape of the data.

  • 6-2 Stem-and-Leaf Diagrams

    Ordered stem-and-leaf diagram produced by Minitab software.

    Median: measures the central tendency;

    average of the 40th and 41st ordered values:

    (160+163)/2 = 161.5

    Sample mode: the most frequently occurring value;

    mode = 158

  • Data Features

    The range is a measure of variability that can be easily

    computed from the ordered stem-and-leaf display. It is the

    maximum minus the minimum measurement. From previous

    slide, the range is 245 - 76 = 169.

    6-2 Stem-and-Leaf Diagrams

  • Data Features

    The median is a measure of central tendency that divides the

    data into two equal parts, half below the median and half above.

    If the number of observations is even, the median is halfway

    between the two central values.

    In the previous slide, the 40th and 41st values of strength are 160

    and 163, so the median is (160 + 163)/2 = 161.5. If the number of

    observations is odd, the median is the central value.

    6-2 Data Features: Quartiles

  • Data Features

    When an ordered set of data is divided into four equal parts, the division points are called quartiles.

    The first or lower quartile, q1, is a value that has approximately one-fourth (25%) of the observations below it and approximately 75% of the observations above it.

    The second quartile, q2, has approximately one-half (50%) of the observations below its value; the second quartile is exactly equal to the median.

    The third or upper quartile, q3, has approximately three-fourths (75%) of the observations below its value.

    As in the case of the median, the quartiles may not be unique.

    6-2 Data Features: Quartiles

  • Data Features

    The compressive strength data in Figure 6-6 contains

    n = 80 observations. Minitab software calculates the first and third

    quartiles as the (n + 1)/4 and 3(n + 1)/4 ordered observations and

    interpolates as needed.

    For example, (80 + 1)/4 = 20.25 and 3(80 + 1)/4 = 60.75.

    Therefore, Minitab interpolates between the 20th (143) and 21st (145)

    ordered observation to obtain q1 = 143.50 and between the 60th and 61st

    observation to obtain q3 =181.00.

    6-2 Data Features: Quartiles

  • Data Features

    The interquartile range (IQR) is the difference between

    the upper and lower quartiles (q3 - q1), and it is sometimes

    used as a measure of variability.

    IQR = q3-q1

    6-2 Data Features: Quartiles

  • 6-2 Data Features: Quartiles

    #    Data   Sorted (A->Z)
    1    80     11
    2    95     15
    3    20     20
    4    67     55
    5    93     67
    6    11     75
    7    15     80
    8    55     93
    9    75     95
    10   96     96

    First Quartile (q1)

    (n+1)/4 = (10+1)/4 = 2.75, so q1 is between X(2) and X(3); by interpolation:

    q1 = X(2) + 0.75(X(3) - X(2)) = 15 + 0.75(20 - 15) = 18.75

    (Or, without interpolation: q1 = (X(2) + X(3))/2 = (15 + 20)/2 = 17.5)

    Second Quartile (q2) = median

    (n+1)/2 = (10+1)/2 = 5.5, so q2 is between X(5) and X(6):

    q2 = (X(5) + X(6))/2 = (67 + 75)/2 = 71

    Third Quartile (q3)

    3(n+1)/4 = 3(10+1)/4 = 8.25, so q3 is between X(8) and X(9); by interpolation:

    q3 = X(8) + 0.25(X(9) - X(8)) = 93 + 0.25(95 - 93) = 93.5

    (Or: q3 = (X(8) + X(9))/2 = (93 + 95)/2 = 94)

    IQR = q3 - q1 = 93.5 - 18.75 = 74.75
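
    The interpolation rule above, written as a function (a sketch; note that
    statistical packages differ slightly in their quartile conventions):

        def quartile(data, k):
            # k-th quartile (k = 1, 2, 3) at ordered position (n + 1) * k / 4,
            # with linear interpolation between neighbouring order statistics
            xs = sorted(data)
            pos = (len(xs) + 1) * k / 4
            i = int(pos)              # index of the lower neighbour (1-based)
            frac = pos - i
            if i >= len(xs):
                return xs[-1]
            return xs[i - 1] + frac * (xs[i] - xs[i - 1])

        data = [80, 95, 20, 67, 93, 11, 15, 55, 75, 96]
        print(quartile(data, 1), quartile(data, 2), quartile(data, 3))
        # 18.75 71.0 93.5, matching the worked example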

  • 6-2 Data Features: Quartiles

    First Quartile (q1)

    (n+1)/4 = (11+1)/4 = 3, so q1 = X(3) = 12

    Second Quartile (q2) = median

    (n+1)/2 = (11+1)/2 = 6, so q2 = X(6) = 23

    Third Quartile (q3)

    3(n+1)/4 = 3(11+1)/4 = 9, so q3 = X(9) = 73

    #    Data   Sorted (A->Z)
    1    3      1
    2    74     3
    3    1      12   <- q1
    4    99     13
    5    12     20
    6    73     23   <- q2
    7    40     27
    8    23     40
    9    20     73   <- q3
    10   13     74
    11   27     99

  • 6-4 Box Plots

    The box plot is a graphical display that simultaneously

    describes several important features of a data set, such

    as center, spread, departure from symmetry, and

    identification of observations that lie unusually far from

    the bulk of the data (outliers).

    (Figure: a box plot with the whiskers, an outlier, and an extreme outlier labeled.)

  • 6-4 Box Plots

    Description of a box plot:

    The lower whisker extends to the smallest data point within 1.5(IQR) of the first quartile;

    the upper whisker extends to the largest data point within 1.5(IQR) of the third quartile.

    The box spans the interquartile range, IQR = q3 - q1.

  • Box plot for compressive strength data in Table 6-2.

    6-4 Box Plots

    Example:

    IQR = q3 - q1 = 181 - 143.5 = 37.5

    q3 + 1.5(IQR) = 237.25: points above this fence are outliers

    q1 - 1.5(IQR) = 87.25: points below this fence are outliers

    Points beyond 3(IQR) from the quartiles are extreme outliers.
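
    The fence computation behind this example, assuming the usual 1.5(IQR)
    and 3(IQR) conventions for outliers and extreme outliers:

        q1, q3 = 143.5, 181.0            # quartiles of the compressive strength data
        iqr = q3 - q1                    # 37.5

        lower_fence = q1 - 1.5 * iqr     # 87.25: points below are outliers
        upper_fence = q3 + 1.5 * iqr     # 237.25: points above are outliers
        extreme_low = q1 - 3.0 * iqr     # beyond 3 IQR: extreme outliers
        extreme_high = q3 + 3.0 * iqr

        print(iqr, lower_fence, upper_fence, extreme_low, extreme_high)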

  • 6-4 Box Plots Exp1

    The quartiles of these data were computed above: q1 = 12, q2 = 23, q3 = 73,

    so IQR = q3 - q1 = 73 - 12 = 61.

  • 6-4 Box Plots Exp1

  • Box plots are preferable to stem-and-leaf displays when

    - there is a large amount of data, or

    - three or more groups are to be compared.

    More information:

    http://onlinestatbook.com/2/graphing_distributions/boxplots.html

    6-4 Box Plots

  • 6-4 Box Plots

  • 6-3 Frequency Distributions and Histograms

    A frequency distribution is a more compact summary of data than a stem-and-leaf diagram. To construct a frequency distribution, we must divide the range of the data into intervals, which are usually called class intervals, cells, or bins.

    The number of bins is usually between 5 and 20, and it increases as the number of observations increases. Rule of thumb: number of bins = √n.
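
    A sketch of the rule of thumb and the binning step in plain Python
    (numpy.histogram or any plotting package would do the same job):

        import math

        def frequency_table(data, num_bins=None):
            # default bin count: round(sqrt(n)), the rule of thumb above
            n = len(data)
            k = num_bins or max(1, round(math.sqrt(n)))
            lo, hi = min(data), max(data)
            width = (hi - lo) / k
            counts = [0] * k
            for x in data:
                i = min(int((x - lo) / width), k - 1)   # clamp the maximum into the last bin
                counts[i] += 1
            return [(lo + i * width, lo + (i + 1) * width, c)
                    for i, c in enumerate(counts)]

        # a few illustrative strength values (psi)
        print(frequency_table([76, 87, 97, 105, 121, 134, 143,
                               158, 160, 163, 181, 199, 207, 218, 245]))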

  • Histogram of compressive strength for 80 aluminum-lithium

    alloy specimens.

    It looks like a bell shape!!

    6-3 Frequency Distributions and Histograms

  • A histogram of the compressive strength data from Minitab

    with 17 bins.

    Histograms are most useful when the number of

    observations is large (say, more than 75).

    6-3 Frequency Distributions and Histograms

  • A histogram of the compressive strength data

    from Minitab with nine bins.

    6-3 Frequency Distributions and Histograms

  • A cumulative distribution plot of the compressive

    strength data from Minitab.

    6-3 Frequency Distributions and Histograms

  • Histograms for symmetric and skewed distributions.

    6-3 Frequency Distributions and Histograms

    (x̄ denotes the sample mean and x̃ the sample median.)

    For a positive (right) skew: mode < median < mean;

    the order is reversed for a negative (left) skew.

  • 6-5 Time Sequence Plots

    A time series or time sequence is a data set in which the

    observations are recorded in the order in which they occur.

    A time series plot is a graph in which the vertical axis denotes the

    observed value of the variable (say x) and the horizontal axis

    denotes the time (which could be minutes, days, years, etc.).

    When measurements are plotted as a time series, we

    often see

    trends,

    cycles, or

    other broad features of the data

  • 6-5 Time Sequence Plots

    Company sales by year

    Upward trend

  • 6-5 Time Sequence Plots

    Company sales by quarter

    Cycle

  • 6-5 Time Sequence Plots

    A digidot plot of the compressive strength data

    A combined plot of time series and stem-and-leaf plot. There is no pattern.

  • 6-5 Time Sequence Plots

    A digidot plot of chemical process concentration readings,

    observed hourly.

    Until observation 20, the average was about 85 grams/liter

    The average went down after that point!!

  • 6-6 Probability Plots

    Probability plotting is a graphical method for determining

    whether sample data conform to a hypothesized distribution

    based on a subjective visual examination of the data.

    Probability plotting typically uses special graph paper,

    known as probability paper, that has been designed for the

    hypothesized distribution. Probability paper is widely

    available for the normal, lognormal, Weibull, and various chi-

    square and gamma distributions.

  • 6-6 Probability Plots

    Example

    Ten observations on the effective service life in minutes of

    batteries used in a portable personal computer are as follows:

    176, 191, 214, 220, 205,192, 201, 190, 183, 185.

    It is assumed that battery life is modeled by a normal distribution.

    Use the probability plotting to investigate this assumption.

    First, arrange the observations in ascending order and calculate their

    cumulative frequencies (j - 0.5)/n, here (j - 0.5)/10, as shown below.
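
    A sketch computing the plotting positions; the standardized normal scores
    z_j solve Φ(z_j) = (j - 0.5)/n (statistics.NormalDist is in the Python
    standard library from version 3.8):

        from statistics import NormalDist

        life = [176, 191, 214, 220, 205, 192, 201, 190, 183, 185]

        n = len(life)
        for j, x in enumerate(sorted(life), start=1):
            p = (j - 0.5) / n                 # cumulative frequency
            z = NormalDist().inv_cdf(p)       # standardized normal score
            print(f"{j:2d}  x(j)={x}  (j-0.5)/n={p:.2f}  z={z:6.2f}")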

  • 6-6 Probability Plots

    Example

  • 6-6 Probability Plots

    Normal probability plot for battery life.

  • 6-6 Probability Plots

    A straight line is drawn through the plotted points.

    A good rule of thumb is to draw the line approximately

    between the 25th & 75th percentile points.

    If all the points are covered by an imaginary fat pencil laid along the straight

    line, a normal distribution adequately describes the data.

    Example

  • Normal probability plot obtained from standardized normal scores (on ordinary graph paper).

  • 7-1 Introduction

    The field of statistical inference consists of those methods used to make decisions or to draw conclusions about a population.

    These methods utilize the information contained in a sample from the population in drawing conclusions.

    Statistical inference may be divided into two major areas:

    Parameter estimation

    Hypothesis testing

  • 7-1 Introduction

    Suppose that we want to obtain a point estimation of a population

    parameter.

    The observations are random variables, say X1, X2, , Xn.

    Therefore, any function of the observations, or any statistic, is also a

    random variable.

    For example, the sample mean X, and the sample variance S2 are

    statistics and they are also random variables.

    Since a statistic is RV, it has a probability distribution.

    We call the probability distribution of a statistic a sampling

    distribution.

  • 7-1 Introduction

  • Definition

    7-1 Introduction

    A point estimate of a parameter θ is a single numerical value θ̂ of a statistic Θ̂.

    The statistic Θ̂ is called the point estimator; Θ̂ is the uppercase of θ̂.

  • 7-1 Introduction

  • 7.2 Sampling Distributions and the Central Limit Theorem

    Statistical inference is concerned with making decisions about a population based on the information contained in a random sample from that population.

    Definitions:

  • 7.2 Sampling Distributions and the Central Limit Theorem

    If we are sampling from a population that has an unknown

    probability distribution, the sampling distribution of the

    sample mean will still be approximately normal with

    mean μ and variance σ²/n, if the sample size is large.

    This is one of the most useful theorems in statistics; it

    is called the central limit theorem.

  • 7.2 Sampling Distributions and the Central Limit Theorem

    X̄ is approximately N(μ, σ²/n).

    If the population is continuous, unimodal, and symmetric, n = 5 is enough.

    If n ≥ 30, that will be enough regardless of the shape of the population.

    If n < 30, the approximation is still fine provided the population distribution is not severely non-normal.

  • 7.2 Sampling Distributions and the Central Limit Theorem

    Example 7-1

    An electronic company manufactures resistors that have a mean

    resistance of 100 ohms and a standard deviation of 10 ohms.

    The distribution of resistance is normal

    Find the probability that a random sample of n=25 resistors will have

    an average resistance less than 95 ohms.

    Note that the sampling distribution of X̄ is normal, with mean μ = 100

    ohms and a standard deviation of σ/√n = 10/√25 = 2 ohms.

  • 7.2 Sampling Distributions and the Central Limit Theorem

    Example 7-1

    Therefore, the desired probability, the shaded area shown in the figure,

    can be found by standardizing the point X̄ = 95:

    z = (95 - 100)/2 = -2.5

    and therefore

    P(X̄ < 95) = P(Z < -2.5) = 0.0062

    (Figure: probability for the previous example.)
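
    The same standardization in Python (NormalDist().cdf gives Φ):

        from statistics import NormalDist

        mu, sigma, n = 100, 10, 25
        se = sigma / n ** 0.5        # standard error of the mean: 10/5 = 2 ohms

        z = (95 - mu) / se           # z = -2.5
        p = NormalDist().cdf(z)      # P(Xbar < 95) = P(Z < -2.5)
        print(z, round(p, 4))        # -2.5 0.0062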

  • 7.2 Sampling Distributions and the Central Limit Theorem

    Example 7-2

    Suppose that the random variable X has a continuous uniform

    distribution f(x) = 0.5 for 4 ≤ x ≤ 6, and 0 otherwise.

    Find the distribution of the sample mean of a random sample of size

    n=40.

    The mean and variance of X are μ = (4+6)/2 = 5 and σ² = (6-4)²/12 = 1/3, respectively.

    According to the central limit theorem, X̄ is approximately normally distributed with:

    μ_X̄ = 5 and σ²_X̄ = σ²/n = (1/3)/40 = 1/120

  • 7.2 Sampling Distributions and the Central Limit Theorem

    Example 7-2
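
    A simulation sketch of Example 7-2: draw many samples of size 40 from the
    Uniform(4, 6) population and compare the mean and variance of the sample
    means with 5 and 1/120 (the seed and replication count are arbitrary):

        import random
        import statistics

        random.seed(1)
        means = [statistics.fmean(random.uniform(4, 6) for _ in range(40))
                 for _ in range(10_000)]

        print(statistics.fmean(means))      # close to mu = 5
        print(statistics.variance(means))   # close to (1/3)/40 = 1/120 = 0.00833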

  • 7-3 General Concepts of Point Estimation

    7-3.1 Unbiased Estimators

    An estimate should be close to the true value of the unknown

    parameter.

    Formally, we say that Θ̂ is an unbiased estimator of θ if the expected value of Θ̂ is equal to θ: E(Θ̂) = θ.

    Definition

  • Example

    7-3 General Concepts of Point Estimation

    Suppose we have a random sample of size 2n from a population denoted by X, with E(X) = μ and V(X) = σ².

    Let

    X̄1 = (1/(2n)) (X_1 + X_2 + ... + X_{2n})   and   X̄2 = (1/n) (X_1 + X_2 + ... + X_n)

    be two estimators of μ. Which is an unbiased estimator of μ?

    E(X̄1) = (1/(2n)) [E(X_1) + ... + E(X_{2n})] = (1/(2n)) (2n)μ = μ

  • Example (cont.)

    7-3 General Concepts of Point Estimation

    Similarly, E(X̄2) = (1/n) [E(X_1) + ... + E(X_n)] = (1/n) (n)μ = μ.

    Hence X̄1 and X̄2 are both unbiased estimators of μ.
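
    A simulation check of the result (a sketch; the exponential population
    with μ = 2 is an arbitrary choice). Both estimators average out to μ,
    but the variance line shows why X̄1 is still preferable:

        import random
        import statistics

        random.seed(1)
        mu, n, reps = 2.0, 10, 20_000

        est1, est2 = [], []
        for _ in range(reps):
            sample = [random.expovariate(1 / mu) for _ in range(2 * n)]
            est1.append(sum(sample) / (2 * n))   # Xbar1 uses all 2n observations
            est2.append(sum(sample[:n]) / n)     # Xbar2 uses only the first n

        print(statistics.fmean(est1), statistics.fmean(est2))        # both near 2
        print(statistics.variance(est1), statistics.variance(est2))  # V(Xbar1) < V(Xbar2)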

  • Example

    7-3 General Concepts of Point Estimation