Business Statistics August 2015 Examination Theory Concepts Revision

Revision Theory

DESCRIPTION

Business Statistics in NP


Business Statistics
August 2015 Examination
Theory Concepts Revision

Frequency Distribution
A grouping of data into mutually exclusive classes.
It shows the number of observations in each class.
10/04/2015 Slide number 2

Frequency Distribution - Terms
Class midpoint: A point that divides a class into two equal parts. This is the average of the upper and lower class limits.

Class frequency: The number of observations in each class.

Class interval (class width): The class interval is obtained by subtracting the lower limit of a class from the lower limit of the next class.

Constructing a Frequency Distribution
Preferably between 5 and 15 classes.
If possible, the class interval should be the same for all classes.
The classes must be mutually exclusive, i.e. avoid overlapping classes. Each data point must fall in only one class.
The classes must be all-inclusive, i.e. the classes must provide a place to record every value in the data set.
Preferably no open-ended classes.
Open-ended classes: classes without a lower or upper limit. Example: below 7.5; above 37.5.

Relative Frequency
A relative frequency distribution shows the percent of observations in each class.

Graphical Presentation of a Frequency Distribution

Histograms
Classes are marked on the horizontal axis.
Frequency is marked on the vertical axis.
The frequency of each class is represented by the height of its bar.
The bars are adjacent to each other.

Frequency Polygon
The midpoints of the classes are marked on the horizontal axis.
Frequency is marked on the vertical axis.
Line segments connect the points that represent the frequencies of their respective classes.

Other Graphical Presentation of Data
Line Graph: used to show the change or trend in a variable over time.
Bar Chart: depicts both qualitative and quantitative data.
Pie Chart: useful for displaying a relative frequency distribution. A circle is divided proportionally to the relative frequency, and portions of the circle are allocated to the different groups.

Cumulative Frequency
A cumulative frequency distribution is used to determine how many, or what proportion, of the data values are below or above a certain value.
The cumulative frequency of a particular class is found by adding the frequency of that class to the cumulative frequency of the previous class.

Properties of mean, median and mode
Please refer to the textbook, pp. 59 and 63-64, for the properties of the mean, median and mode.
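To tie together the frequency-distribution terms introduced earlier (class midpoint, class frequency, relative frequency and cumulative frequency), here is a minimal Python sketch. The data values, the class limits and the function name frequency_table are hypothetical and chosen only for illustration; each class is assumed to run from its lower limit up to, but not including, the lower limit of the next class.

```python
# Hypothetical raw data and class limits, for illustration only.
data = [12, 17, 21, 22, 25, 28, 31, 33, 34, 38, 41, 44]
lower_limits = [10, 20, 30, 40]   # lower limit of each class
class_width = 10                  # lower limit of next class minus lower limit of this class


def frequency_table(values, lowers, width):
    """Return one row per class: limits, midpoint, frequency,
    relative frequency and cumulative frequency."""
    rows = []
    cumulative = 0
    for lower in lowers:
        upper = lower + width
        # Mutually exclusive classes: a value belongs to exactly one class.
        freq = sum(1 for x in values if lower <= x < upper)
        # Cumulative frequency = this class frequency + previous cumulative frequency.
        cumulative += freq
        rows.append({
            "class": (lower, upper),
            "midpoint": (lower + upper) / 2,
            "frequency": freq,
            "relative": freq / len(values),
            "cumulative": cumulative,
        })
    return rows


table = frequency_table(data, lower_limits, class_width)
for row in table:
    print(row)
```

The last cumulative frequency equals the number of observations, which is a quick check that the classes are all-inclusive for this data set.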

Note the disadvantages of the mean:
The mean is affected by extreme values.
Extreme values = very large or very small values.

The mean is inappropriate if there is an open-ended class in grouped data, because we cannot find the midpoint of an open-ended class.
Open-ended class: a class without a lower limit or an upper limit, e.g. $50 or less; or $1000 or more.

Relation: mean, median & mode
The relative values of the mean, median and mode indicate the shape of the distribution.
Shapes of the distribution:
Symmetric
Right-skewed (positively skewed)
Left-skewed (negatively skewed)

Relation: Symmetric
Symmetric: the areas on both sides of the distribution are equal.
Mean = Median = Mode
There are no extreme values (values that are very large or very small).

Relation: Right-skewed
Right-skewed (positively skewed):
More values at the lower end than at the higher end.
Long tail at the right.
Mean > Median > Mode
Arises when the mean is increased by some unusually high values.

Relation: Left-skewed
Left-skewed (negatively skewed):
More values at the higher end than at the lower end.
Long tail at the left.
Mean < Median < Mode
Arises when the mean is reduced by some unusually low values.

Choice of Measures of Central Tendency
If the distribution is symmetrical, no choice is needed. We can use the mean, median or mode, since they all have the same value.

If the distribution is skewed, either to the right or to the left, then the median is often the best measure, because the mean will be distorted by the extreme values in these cases.

Measures of Dispersion
What is dispersion?
It refers to the spread or variability of values in a distribution with respect to its central location, i.e. it shows the extent to which the observations are scattered.
Importance:
It enables us to judge the reliability of the measures of central tendency.
It enables us to compare the dispersion of various data sets.

Measures of Dispersion
Data sets may have the same mean but differ in dispersion.
Curve A has a very small spread.
Curve B has a moderate spread.

Curve C has a very large spread.
The bigger the spread, the less reliable it is to use the measures of central tendency as representatives of the values in the data set.

Dispersion - Range
Major drawbacks of the range:
It involves only 2 values and thus ignores how the other data are distributed.
It is affected by extreme values.
Two data sets can have the same range but different spread.
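The two drawbacks of the range can be seen with a small numerical sketch. It assumes the usual definition of the range (largest value minus smallest value), which the slides do not spell out, and uses two hypothetical data sets that share the same range. Dropping the single largest value shows how much of the range depends on that one observation.

```python
# Hypothetical data sets chosen so that both have the same range.
spread_out = [10, 20, 30, 40, 50, 60, 70]
concentrated = [10, 11, 12, 12, 13, 14, 70]   # one extremely high value


def value_range(values):
    """Range = largest value minus smallest value (uses only 2 values)."""
    return max(values) - min(values)


def range_without_largest(values):
    """Range after removing the single largest value,
    to show the influence of one extreme observation."""
    trimmed = sorted(values)[:-1]
    return max(trimmed) - min(trimmed)


range_a = value_range(spread_out)
range_b = value_range(concentrated)
trimmed_a = range_without_largest(spread_out)
trimmed_b = range_without_largest(concentrated)

print(range_a, range_b)       # identical ranges
print(trimmed_a, trimmed_b)   # very different once the largest value is removed
```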

One data set is widely spread out; the other is more concentrated at lower values but affected by an extremely high value.

Variance and Standard Deviation
Most commonly used measures of dispersion.
Take into account how the data are distributed.
Show data dispersion (variation) with respect to the central location, the mean.

Mean and Standard Deviation
Ungrouped Data vs Grouped Data
Grouped data are more organized.
The calculation of the mean and standard deviation with grouped data is less accurate, as the values are estimated by the midpoints instead of the actual values of the data.
The mean and standard deviation cannot be computed for grouped data with open-ended classes, because the midpoints of the open-ended classes cannot be ascertained.

Dispersion - Coefficient of Variation
Measure of relative dispersion.
Shows data dispersion (variation) as a percentage (%) of the central location, the mean.
Free of units: a good choice of measure for comparing the dispersion of 2 or more data sets when there is a substantial difference in
the size of the mean values, or
the units of measurement.

Mutually Exclusive Events
Events are mutually exclusive if the occurrence of any one event means that none of the others can occur at the same time.
Events are independent if the occurrence of one event does not affect the occurrence of another (see the multiplicative rule).
Events are collectively exhaustive if at least one of the events must occur when an experiment is conducted.

Definitions
There are three definitions of probability: classical, empirical, and subjective.
The classical definition applies when there are n equally likely outcomes.
Under the empirical definition, the probability is the number of times the event happens divided by the number of observations.
Subjective probability is based on whatever information is available.

Definitions continued
An experiment is the observation of some activity or the act of taking some measurement.
An outcome is the particular result of an experiment.
An event is the collection of one or more outcomes of an experiment.
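The difference between the classical and empirical definitions can be sketched with a simulated die. The die, the event (an even number) and the seed are assumptions made purely for illustration. The classical value comes from counting equally likely outcomes, while the empirical value comes from counting how often the event happened in repeated observations.

```python
import random
from fractions import Fraction

# Experiment: rolling a fair six-sided die (hypothetical example).
outcomes = [1, 2, 3, 4, 5, 6]
# Event: a collection of one or more outcomes, here "an even number appears".
event = {x for x in outcomes if x % 2 == 0}

# Classical probability: favourable outcomes / n equally likely outcomes.
classical = Fraction(len(event), len(outcomes))

# Empirical probability: times the event happened / number of observations.
rng = random.Random(2015)   # fixed seed so the run is repeatable
observations = [rng.choice(outcomes) for _ in range(10_000)]
empirical = sum(1 for x in observations if x in event) / len(observations)

print(classical)   # exact value from counting outcomes
print(empirical)   # close to the classical value, but generally not identical
```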
Basic Rules of Probability
If two events A and B are mutually exclusive, the special rule of addition states that the probability of A or B occurring equals the sum of their respective probabilities:
P(A or B) = P(A) + P(B)
Mutually exclusive: P(A and B) = 0

The Complement Rule
The complement rule is used to determine the probability of an event occurring by subtracting the probability of the event not occurring from 1.

If P(A) is the probability of event A and P(~A) is the probability of the complement of A, then
P(A) + P(~A) = 1, or P(A) = 1 - P(~A).

The General Rule of Addition
If A and B are two events that are not mutually exclusive, then P(A or B) is given by the following formula:

Not mutually exclusive:
P(A or B) = P(A) + P(B) - P(A and B)
P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
Mutually exclusive:
P(A ∩ B) = 0, so P(A ∪ B) = P(A) + P(B)

Special Rule of Multiplication
The special rule of multiplication requires that two events A and B are independent.

Two events A and B are independent if the occurrence of one has no effect on the probability of the occurrence of the other.
This rule is written:
P(A and B) = P(A)P(B)

Joint Probability
A joint probability measures the likelihood that two or more events will happen concurrently.

An example would be the event that a student has both a stereo and a TV in his or her dorm room.

Conditional Probability
A conditional probability is the probability of a particular event occurring, given that another event has occurred.
The probability of the event A given that the event B has occurred is written P(A|B).

General Multiplication Rule
The general rule of multiplication is used to find the joint probability that two events will occur.

It states that for two events A and B, the joint probability that both events will happen is found by multiplying the probability that event A will happen by the conditional probability of B given that A has occurred:
P(A and B) = P(A)P(B|A)

Tree Diagrams
A tree diagram is useful for portraying conditional and joint probabilities.
It is particularly useful for analyzing business decisions involving several stages.

Bayes' Theorem
Bayes' Theorem is a method for revising a probability given additional information.

It is computed using the following formula:
P(A1|B) = P(A1)P(B|A1) / [P(A1)P(B|A1) + P(A2)P(B|A2)]

Random Variables & Probability Distribution
A random variable is a numerical value determined by the outcome of an experiment.
A probability distribution is the listing of all possible outcomes of an experiment and the corresponding probability.
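As a numerical check on the probability rules covered so far (general addition, general multiplication and Bayes' Theorem), the following is a small Python sketch. The function names and all figures are hypothetical; the Bayes function assumes exactly two events A1 and A2 that are mutually exclusive and collectively exhaustive, matching the two-event form of the formula.

```python
def general_addition(p_a, p_b, p_a_and_b):
    """P(A or B) = P(A) + P(B) - P(A and B)."""
    return p_a + p_b - p_a_and_b


def general_multiplication(p_a, p_b_given_a):
    """P(A and B) = P(A) * P(B|A)."""
    return p_a * p_b_given_a


def bayes(p_a1, p_b_given_a1, p_a2, p_b_given_a2):
    """P(A1|B) for two mutually exclusive, collectively exhaustive events."""
    numerator = p_a1 * p_b_given_a1
    return numerator / (numerator + p_a2 * p_b_given_a2)


# Hypothetical figures: 5% of items are defective (A1), 95% are not (A2).
# An inspection flags (B) 90% of defective items and 10% of good items.
posterior = bayes(0.05, 0.90, 0.95, 0.10)
print(round(posterior, 4))   # revised probability that a flagged item is defective

# Hypothetical figures for the other two rules.
print(general_addition(0.5, 0.4, 0.2))
print(general_multiplication(0.5, 0.4))
```

Setting the joint probability to 0 in general_addition reproduces the special rule of addition for mutually exclusive events.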

Types of Probability Distributions
A discrete probability distribution can assume only certain outcomes.

A continuous probability distribution can assume an infinite number of values within a given range.

Discrete variable, e.g. counting
Continuous variable, e.g. measurement

Features of a Discrete Distribution
The sum of the probabilities of all possible outcomes is 1.00.
The probability of a particular outcome is between 0 and 1.00.
The outcomes are mutually exclusive.
0 ≤ P(x) ≤ 1
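The features above translate directly into a validity check. This is a minimal sketch: the function name is hypothetical, and a distribution is assumed to be stored as a dictionary mapping each outcome to its probability, so distinct keys stand in for mutually exclusive outcomes.

```python
import math


def is_valid_discrete_distribution(dist, tol=1e-9):
    """Check that every probability lies in [0, 1] and that they sum to 1."""
    each_in_range = all(0.0 <= p <= 1.0 for p in dist.values())
    sums_to_one = math.isclose(sum(dist.values()), 1.0, abs_tol=tol)
    return each_in_range and sums_to_one


# Number of heads in two tosses of a fair coin.
heads = {0: 0.25, 1: 0.50, 2: 0.25}
# A hypothetical invalid listing: the probabilities sum to more than 1.
bad = {0: 0.6, 1: 0.6}

print(is_valid_discrete_distribution(heads))
print(is_valid_discrete_distribution(bad))
```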

Binomial Probability Distribution
The binomial distribution has the following characteristics:
An outcome of an experiment is classified into one of two mutually exclusive categories, such as a success or a failure.
The data collected are the results of counts.
The probability of success stays the same for each trial.
The trials are independent.
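The slides list the characteristics but do not reproduce the binomial formula. As a sketch, the standard formula P(X = x) = C(n, x) p^x (1 - p)^(n - x) can be coded as follows; the function name and the values of n and p are assumptions for illustration only.

```python
from math import comb


def binomial_pmf(x, n, p):
    """P(X = x) successes in n independent trials,
    each with the same success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Hypothetical case: 5 trials, success probability 0.2 on every trial.
n, p = 5, 0.2
probabilities = [binomial_pmf(x, n, p) for x in range(n + 1)]

for x, prob in enumerate(probabilities):
    print(x, round(prob, 4))

# The counts 0..n are mutually exclusive and exhaustive, so the total is 1.
print(round(sum(probabilities), 10))
```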

Poisson Probability Distribution
The Poisson probability distribution describes the number of times some event occurs during a specific interval. The interval may be time, distance, area or volume.

The random variable is the number of times some event occurs during a defined interval.

The probability of the event is proportional to the size of the interval

The intervals are independent and non-overlapping
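A short sketch of the standard Poisson formula, P(X = x) = mean^x e^(-mean) / x!, which the slides describe but do not write out. The function names and numbers are hypothetical. The final lines compare it with a binomial probability for a large number of trials and a small success probability, the situation in which the slides use the Poisson distribution as an estimate.

```python
from math import comb, exp, factorial


def poisson_pmf(x, mean):
    """P(X = x) when events occur at an average of `mean` per interval."""
    return mean**x * exp(-mean) / factorial(x)


def binomial_pmf(x, n, p):
    """P(X = x) successes in n independent trials with success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Hypothetical case: an average of 2 events per interval.
p_zero = poisson_pmf(0, 2.0)
p_three = poisson_pmf(3, 2.0)
print(round(p_zero, 4), round(p_three, 4))

# Hypothetical binomial with large n and small p; the Poisson mean is n * p.
n, p = 1000, 0.002
exact = binomial_pmf(0, n, p)
approx = poisson_pmf(0, n * p)
print(round(exact, 4), round(approx, 4))   # the two values are close
```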

Poisson Distribution
We will use the Poisson distribution to estimate a binomial probability when n, the number of trials, is large and the probability of success is small.

When b > 0
X & Y change in the same direction; a direct relationship is observed.

When b < 0
X & Y change in opposite directions; an indirect/inverse relationship is observed.

When b = 0
No evident relationship is observed.

The slope of the regression line has the same sign as the coefficient of correlation.
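The link between the sign of the slope b and the sign of the coefficient of correlation can be checked numerically. The sketch below assumes a regression line of the form Y = a + bX fitted by least squares, and the usual Pearson coefficient of correlation; the function names and data are hypothetical.

```python
from math import sqrt


def fit_line(xs, ys):
    """Least-squares y-intercept a and slope b for the line Y = a + bX."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def correlation(xs, ys):
    """Pearson coefficient of correlation between xs and ys."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical data: one rising and one falling relationship.
xs = [1, 2, 3, 4, 5]
ys_up = [2, 4, 5, 4, 6]
ys_down = [9, 7, 6, 4, 3]

a_up, b_up = fit_line(xs, ys_up)
a_down, b_down = fit_line(xs, ys_down)
r_up = correlation(xs, ys_up)
r_down = correlation(xs, ys_down)

print(b_up, r_up)       # both positive: direct relationship
print(b_down, r_down)   # both negative: inverse relationship
```

Both quantities share the numerator sxy and have positive denominators, which is why their signs always agree.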

The y-intercept, a
Represents the number of units expected in the dependent variable Y with zero units in the independent variable X.

Cautionary Notes and Limitations
Be careful with the units of measurement.

The results of the correlation analysis indicate only the strength of the linear relationship between two variables. They do NOT indicate a causal effect, even though in reality the two variables may be causally related.

Beware of spurious (false) correlation.

Both variables have risen over time, but for different reasons.

Influence of other variables on the dependent and independent variables.

Extrapolation beyond the range of the original observed data

Using historical data to estimate future trends
The conditions in the past may have changed, rendering the relationship invalid.
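The warning about spurious correlation can be illustrated with two made-up series that both rise over time. The series and the function name are hypothetical, and nothing in the calculation connects the two variables except their shared upward trend, yet the coefficient of correlation comes out close to 1.

```python
from math import sqrt


def correlation(xs, ys):
    """Pearson coefficient of correlation between xs and ys."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Two hypothetical yearly series that have both risen over time,
# possibly for entirely different reasons.
series_a = [100, 108, 115, 125, 131, 140]
series_b = [5.0, 5.6, 6.1, 6.5, 7.2, 7.8]

r = correlation(series_a, series_b)
print(round(r, 3))   # a strong linear association, which says nothing about cause
```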