8/3/2019 1_introduction to Statistics_June-22, 2011 [Compatibility Mode]
Quantitative Methods and Business Statistics for Decision Making (MSA606)
Ramesh [email protected]
Department of Mechanical Engineering, NIT Calicut, Kerala, India - 673 601.
Dedicated to
Professor S. G. Deshmukh
Objectives of this course
Appreciate the role of statistics in various decision-making situations.
Summarize data with frequency distributions and graphic presentation.
Interpret descriptive statistics for central tendency, dispersion and location.
Define and interpret probability.
Utilize discrete and continuous probability distributions to determine probabilities in various managerial applications.
Apply the central limit theorem to determine probabilities of sample means, and compute and interpret point and interval estimates.
Conduct hypothesis tests for means.
Utilize linear regression to estimate and predict variables.
Understand basic concepts of design of experiments.
Understand the importance of non-parametric tests.
Lab/tutorial
The laboratory content requires prior experience of working with Excel. There will be quizzes/assignments every week. The lab assignments are to be submitted on that day itself. Students will also be required to visit and consult useful web resources.
Mode of Evaluation and Grades
Grades are based on total points earned from tests 1 & 2, lab/tutorial/assignments, mini-project and end-semester examination.
Component                                       Weight
Test 1                                          30 %
Article critique & Presentation                 10 %
End Semester                                    40 %
Lab/tutorial/quizzes/assignments (every week)   10 %
Mini-Project                                    10 %
Reference
Meyer, P. L., Introductory Probability and Statistical Applications, Oxford and IBH Publishers.
Miller, I. R., Freund, J. E. and Johnson, R., Probability and Statistics for Engineers, Prentice-Hall (I) Ltd.
Walpole, R. E. and Myers, R. H., Probability & Statistics for Engineers and Scientists, Macmillan.
Levin, R. I. and Rubin, D. S., Statistics for Management, Pearson Education.
Statistics...
Plays an important role in many facets of human endeavour
Occurs remarkably frequently in our everyday lives
Is often incorrectly thought of as just a collection of data, graphs and diagrams
Statistics in Business
Accounting: auditing and cost estimation
Economics: regional, national, and international economic performance
Finance: investments and portfolio management
Management: human resources, compensation, and quality management
Management Information Systems (ERP): performance of systems which gather, summarize, and disseminate information to various managerial levels
Marketing: market analysis and consumer research
International Business: market and demographic analysis
What is Statistics?
Science of gathering, analyzing, interpreting, and presenting data
Branch of mathematics
Facts and figures
Measurement taken on a sample
Statistics is the scientific method that enables us to make decisions as responsibly as possible.
Statistics
The science of data to answer research questions:
Formulate a research question(s) (hypothesis)
Collect data
Analyze and summarize data
Draw conclusions to answer research questions
Statistical Inference
In the presence of variation
Answers Questions from Everyday Life
Business: Will a new marketing strategy be profitable?
Industry: Will a product's life exceed the warranty period?
Medicine: Will this year's flu vaccine reduce the chance of flu?
Education: Will technology improve learning?
Government: Will a change in interest rates affect inflation?
Statistics: Science of variability?
Virtually everything varies
Variation occurs among individuals
Variation occurs within any one individual as time passes
Can Statistics Be Trusted?
"There are three kinds of lies: lies, damned lies, and statistics." -- Mark Twain
"It is easy to lie with statistics. But it is easier to lie without them." -- Frederick Mosteller
"Figures won't lie, but liars will figure." -- Charles Grosvenor
Population Versus Sample
Population: the whole
a collection of persons, objects, or items under study
the entire group of individuals in a statistical study that we want information about
Census: gathering data from the entire population
Sample: a portion of the whole
a subset of the population
a part of the population from which we actually collect information, used to draw conclusions about the whole (statistical inference)
Statistics can be split into two broad categories
1. Descriptive statistics
2. Statistical inference
Descriptive Statistics
Collect data
ex. Survey
Present data
ex. Tables and graphs
Characterize data
ex. Sample mean: $\bar{X} = \frac{\sum X_i}{n}$
Descriptive statistics..
Encompasses the following:
Graphical or pictorial display
Condensation of large masses of data into a form such as tables
Preparation of summary measures to give a concise description of complex information (e.g. an average figure)
Exhibition of patterns that may be found in sets of information
Inferential Statistics
Estimation
ex. Estimate the population mean weight using the sample mean weight
Hypothesis testing
ex. Test the claim that the population mean weight is 120 pounds
Drawing conclusions and/or making decisions concerning a population based on sample results.
Inferential Statistics..
Especially relates to:
Determining whether characteristics of a situation are unusual or if they have happened by chance
Estimating values of numerical quantities and determining the reliability of those estimates
Using past occurrences to attempt to predict the future
Process of Inferential Statistics
1. Start from a population with mean $\mu$ (parameter).
2. Select a random sample.
3. Compute the sample mean $\bar{x}$ (statistic).
4. Use $\bar{x}$ to estimate $\mu$.
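The process of inferential statistics can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population is simulated (10,000 hypothetical weights with mean 120 and standard deviation 15), and the sample size of 50 is arbitrary. None of these values come from the slides.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical population (assumption): weights in pounds of 10,000 individuals.
population = [random.gauss(120, 15) for _ in range(10_000)]
mu = statistics.mean(population)       # parameter, normally unknown in practice

# Select a simple random sample without replacement.
sample = random.sample(population, 50)
x_bar = statistics.mean(sample)        # statistic, used to estimate mu

print(f"population mean (parameter): {mu:.2f}")
print(f"sample mean (statistic):     {x_bar:.2f}")
```

In a real study only the sample is observed; the population mean is computed here solely to show how close the estimate lands.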
Population vs. Sample
Measures used to describe the population are called parameters.
Measures computed from sample data are called statistics.
Parameter vs. Statistic
Parameter: descriptive measure of the population
Usually represented by Greek letters
Statistic: descriptive measure of a sample
Usually represented by Roman letters
Symbols for Population Parameters
$\mu$ denotes population mean
$\sigma^2$ denotes population variance
$\sigma$ denotes population standard deviation
Symbols for Sample Statistics
$\bar{x}$ denotes sample mean
$S^2$ denotes sample variance
$S$ denotes sample standard deviation
Types of Variables
Categorical (qualitative) variables have values that can only be placed into categories, such as yes and no.
Numerical (quantitative) variables have values that represent quantities.
Types of Variables
Data
  Categorical (defined categories)
    Examples: Marital Status, Political Party, Eye Color
  Numerical
    Discrete (counted items)
      Examples: Number of Children, Defects per hour
    Continuous (measured characteristics)
      Examples: Weight, Voltage
Levels of Data Measurement
Nominal: lowest level of measurement
Ordinal
Interval
Ratio: highest level of measurement
Levels of Measurement
A nominal scale classifies data into distinct categories in which no ranking is implied.
Categorical Variable          Categories
Personal Computer Ownership   Yes / No
Type of Stocks Owned          Growth / Value / Other
Internet Provider             Microsoft Network / AOL
Levels of Measurement
An ordinal scale classifies data into distinct categories in which ranking is implied.
Categorical Variable             Ordered Categories
Student class designation        Freshman, Sophomore, Junior, Senior
Product satisfaction             Satisfied, Neutral, Unsatisfied
Faculty rank                     Professor, Associate Professor, Assistant Professor, Instructor
Standard & Poor's bond ratings   AAA, AA, A, BBB, BB, B, CCC, CC, C, DDD, DD, D
Student Grades                   A, B, C, D, F
Levels of Measurement
An interval scale is an ordered scale in which the difference between measurements is a meaningful quantity but the measurements do not have a true zero point.
A ratio scale is an ordered scale in which the difference between the measurements is a meaningful quantity and the measurements have a true zero point.
Interval and Ratio Scales
Usage Potential of Various Levels of Data
Nominal
Ordinal
Interval
Ratio
Data Level, Operations, and Statistical Methods
Data Level   Meaningful Operations                               Statistical Methods
Nominal      Classifying and counting                            Nonparametric
Ordinal      All of the above plus ranking                       Nonparametric
Interval     All of the above plus addition and subtraction      Parametric
Ratio        All of the above plus multiplication and division   Parametric
Data preparation rules
Data presented must be
factual
relevant
Before presentation always check:
the source of the data
that the data has been accurately transcribed
that the figures are relevant to the problem
Methods of visual presentation of data
Table
        1st Qtr   2nd Qtr   3rd Qtr   4th Qtr
East    20.4      27.4      90        20.4
West    30.6      38.6      34.6      31.6
North   45.9      46.9      45        43.9
Methods of visual presentation of data
Graphs
[Figure: graph of the quarterly values (1st Qtr to 4th Qtr) for East, West and North; vertical axis from 0 to 90]
Methods of visual presentation of data
Pie chart
[Figure: pie chart with slices for 1st Qtr, 2nd Qtr, 3rd Qtr and 4th Qtr]
Methods of visual presentation of data
Multiple bar chart
[Figure: horizontal multiple bar chart of North, West and East for 1st Qtr to 4th Qtr; horizontal axis from 0 to 100]
Methods of visual presentation of data
Simple pictogram
[Figure: pictogram of East, North and West for 1st Qtr to 4th Qtr; vertical axis from 0 to 100]
Frequency distributions
Frequency tables
Class Interval   Frequency   Cumulative Frequency
< 20 13 13
Example of Ungrouped Data
Ages of a Sample of Managers from XYZ
42 30 53 50 52 30 55 49 61 74
26 58 40 40 28 36 30 33 31 37
32 37 30 32 23 32 58 43 30 29
34 50 47 31 35 26 64 46 40 43
57 30 49 40 25 50 52 32 60 54
Frequency Distribution of Ages
Class Interval   Frequency
20-under 30      6
30-under 40      18
40-under 50      11
50-under 60      11
60-under 70      3
70-under 80      1
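As a rough sketch, the frequency distribution above can be reproduced by tallying the 50 ages into "lower-under upper" classes. The function name `frequency_distribution` and its parameters are illustrative choices, not part of the course material; the starting point 20, width 10 and six classes are taken from the table.

```python
# Ages of the 50 managers, in the order they appear on the slide.
ages = [
    42, 30, 53, 50, 52, 30, 55, 49, 61, 74,
    26, 58, 40, 40, 28, 36, 30, 33, 31, 37,
    32, 37, 30, 32, 23, 32, 58, 43, 30, 29,
    34, 50, 47, 31, 35, 26, 64, 46, 40, 43,
    57, 30, 49, 40, 25, 50, 52, 32, 60, 54,
]

def frequency_distribution(data, start, width, n_classes):
    """Count values per class; each class includes its lower bound only."""
    counts = [0] * n_classes
    for x in data:
        idx = (x - start) // width
        if not 0 <= idx < n_classes:
            raise ValueError(f"{x} falls outside the class intervals")
        counts[idx] += 1
    return [(start + i * width, start + (i + 1) * width, c)
            for i, c in enumerate(counts)]

for lo, hi, count in frequency_distribution(ages, start=20, width=10, n_classes=6):
    print(f"{lo}-under {hi}: {count}")
```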
Data Range
42 30 53 50 52 30 55 49 61 74
26 58 40 40 28 36 30 33 31 37
32 37 30 32 23 32 58 43 30 29
34 50 47 31 35 26 64 46 40 43
57 30 49 40 25 50 52 32 60 54
Smallest = 23
Largest = 74
Range = Largest - Smallest
= 74 - 23
= 51
Number of Classes and Class Width
The number of classes should be between 5 and 15.
Fewer than 5 classes cause excessive summarization.
More than 15 classes leave too much detail.
Class Width
Divide the range by the number of classes for an approximate class width
Round up to a convenient number
$\text{Approximate Class Width} = \frac{51}{6} = 8.5$
$\text{Class Width} = 10$
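The class-width arithmetic can be checked with a short sketch. The smallest and largest ages (23 and 74) and the choice of 6 classes come from the slides; interpreting "a convenient number" as the next multiple of 5 is an assumption made here only so the rounding step can be written down.

```python
import math

smallest, largest = 23, 74      # extremes of the ages data
n_classes = 6                   # chosen number of classes (between 5 and 15)

data_range = largest - smallest             # 74 - 23
approx_width = data_range / n_classes       # 51 / 6

# Assumption: "convenient" means the next multiple of 5 at or above the
# approximate width.
step = 5
class_width = math.ceil(approx_width / step) * step

print(data_range, approx_width, class_width)
```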
Class Midpoint
$\text{Class Midpoint} = \frac{\text{beginning class endpoint} + \text{ending class endpoint}}{2} = \frac{30 + 40}{2} = 35$
$\text{Class Midpoint} = \text{class beginning point} + \frac{1}{2}(\text{class width}) = 30 + \frac{1}{2}(10) = 35$
Relative Frequency
Class Interval   Frequency   Relative Frequency
20-under 30      6           .12   (= 6/50)
30-under 40      18          .36   (= 18/50)
40-under 50      11          .22
50-under 60      11          .22
60-under 70      3           .06
70-under 80      1           .02
Total            50          1.00
Cumulative Frequency
Class Interval   Frequency   Cumulative Frequency
20-under 30      6           6
30-under 40      18          24   (= 18 + 6)
40-under 50      11          35   (= 11 + 24)
50-under 60      11          46
60-under 70      3           49
70-under 80      1           50
Total            50
Class Midpoints, Relative Frequencies, and Cumulative Frequencies
Class Interval   Frequency   Midpoint   Relative Frequency   Cumulative Frequency
20-under 30      6           25         .12                  6
30-under 40      18          35         .36                  24
40-under 50      11          45         .22                  35
50-under 60      11          55         .22                  46
60-under 70      3           65         .06                  49
70-under 80      1           75         .02                  50
Total            50                     1.00
Cumulative Relative Frequencies
Class Interval   Frequency   Relative Frequency   Cumulative Frequency   Cumulative Relative Frequency
20-under 30      6           .12                  6                      .12
30-under 40      18          .36                  24                     .48
40-under 50      11          .22                  35                     .70
50-under 60      11          .22                  46                     .92
60-under 70      3           .06                  49                     .98
70-under 80      1           .02                  50                     1.00
Total            50          1.00
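The derived columns (midpoint, relative, cumulative and cumulative relative frequency) all follow from the class boundaries and frequencies. A minimal sketch, assuming the grouped data is held as `(lower, upper, frequency)` tuples; the dictionary keys are illustrative names only:

```python
# Grouped ages data: (lower bound, upper bound, frequency).
classes = [(20, 30, 6), (30, 40, 18), (40, 50, 11),
           (50, 60, 11), (60, 70, 3), (70, 80, 1)]

total = sum(freq for _, _, freq in classes)

rows = []
cumulative = 0
for lo, hi, freq in classes:
    cumulative += freq  # running total of frequencies
    rows.append({
        "interval": f"{lo}-under {hi}",
        "midpoint": (lo + hi) / 2,
        "frequency": freq,
        "relative": round(freq / total, 2),
        "cumulative": cumulative,
        "cumulative_relative": round(cumulative / total, 2),
    })

for row in rows:
    print(row)
```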
Common Statistical Graphs
Histogram -- vertical bar chart of frequencies
Frequency Polygon -- line graph of frequencies
Ogive -- line graph of cumulative frequencies
Pie Chart -- proportional representation for categories of a whole
Stem and Leaf Plot
Pareto Chart
Scatter Plot
Histogram
Class Interval   Frequency
20-under 30      6
30-under 40      18
40-under 50      11
50-under 60      11
60-under 70      3
70-under 80      1
[Figure: histogram of the frequencies above; horizontal axis Years (0 to 80), vertical axis Frequency (0 to 20)]
Histogram Construction
Class Interval   Frequency
20-under 30      6
30-under 40      18
40-under 50      11
50-under 60      11
60-under 70      3
70-under 80      1
[Figure: histogram under construction; horizontal axis Years (0 to 80), vertical axis Frequency (0 to 20)]
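Since a histogram is a bar chart of class frequencies, its construction can be illustrated without any plotting library. The sketch below draws one row of `#` symbols per class, so the bars run horizontally rather than vertically as on the slide; `text_histogram` is a hypothetical helper name, and the course labs themselves use Excel.

```python
# Class labels and frequencies from the ages example.
classes = [("20-under 30", 6), ("30-under 40", 18), ("40-under 50", 11),
           ("50-under 60", 11), ("60-under 70", 3), ("70-under 80", 1)]

def text_histogram(classes, symbol="#"):
    """Return one text line per class; bar length equals the frequency."""
    width = max(len(label) for label, _ in classes)
    return [f"{label:<{width}} | {symbol * freq} {freq}"
            for label, freq in classes]

for line in text_histogram(classes):
    print(line)
```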
Frequency Polygon
Class Interval   Frequency
20-under 30      6
30-under 40      18
40-under 50      11
50-under 60      11
60-under 70      3
70-under 80      1
[Figure: frequency polygon of the frequencies above; horizontal axis Years (0 to 80), vertical axis Frequency (0 to 20)]
Ogive
Class Interval   Cumulative Frequency
20-under 30      6
30-under 40      24
40-under 50      35
50-under 60      46
60-under 70      49
70-under 80      50
[Figure: ogive of the cumulative frequencies above; horizontal axis Years (0 to 80), vertical axis Frequency (0 to 60)]
Relative Frequency Ogive
Class Interval   Cumulative Relative Frequency
20-under 30      .12
30-under 40      .48
40-under 50      .70
50-under 60      .92
60-under 70      .98
70-under 80      1.00
[Figure: ogive of the cumulative relative frequencies above; horizontal axis Years (0 to 80), vertical axis Cumulative Relative Frequency (0.00 to 1.00)]
Complaints by Passengers
COMPLAINT           NUMBER   PROPORTION   DEGREES
Stations, etc.      28,000   .40          144.0
Train Performance   14,700   .21          75.6
Equipment           10,500   .15          54.0
Personnel           9,800    .14          50.4
Schedules, etc.     7,000    .10          36.0
Total               70,000   1.00         360.0
Complaints by Passengers
[Figure: pie chart of complaints: Stations, etc. 40%; Train Performance 21%; Equipment 15%; Personnel 14%; Schedules, etc. 10%]
Second Quarter Truck Production
Company   2d Quarter Truck Production
A         357,411
B         354,936
C         160,997
D         34,099
E         12,747
Totals    920,190
Second Quarter Truck Production
[Figure: pie chart of truck production by company: A 39%, B 39%, C 17%, D 4%, E 1%]
Pie Chart Calculations for Company A
Company   2d Quarter Truck Production   Proportion   Degrees
A         357,411                       .388         140
B         354,936                       .386         139
C         160,997                       .175         63
D         34,099                        .037         13
E         12,747                        .014         5
Totals    920,190                       1.000        360
$\frac{357{,}411}{920{,}190} = .388$
$.388 \times 360 \approx 140$
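The pie chart calculation can be repeated for every company with a few lines of code. This is a sketch that mirrors the slide's two-step rounding (proportion to three decimals, then degrees to the nearest whole number); the variable names are illustrative.

```python
# Second quarter truck production by company.
production = {"A": 357_411, "B": 354_936, "C": 160_997, "D": 34_099, "E": 12_747}

total = sum(production.values())

table = {}
for company, units in production.items():
    proportion = round(units / total, 3)   # share of the whole
    degrees = round(proportion * 360)      # slice angle in a 360-degree circle
    table[company] = (proportion, degrees)

for company, (proportion, degrees) in table.items():
    print(f"{company}: proportion {proportion:.3f}, {degrees} degrees")
```

Rounded degrees are not guaranteed to sum to exactly 360 in general; for this data they happen to.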
Pareto Chart
[Figure: Pareto chart of defect causes (Poor Wiring, Short in Coil, Defective Plug, Other); left axis Frequency (0 to 100), right axis percentage (0% to 100%)]
Scatter Plot
Registered Vehicles (1000's)   Gasoline Sales (1000's of Gallons)
5                              60
15                             120
9                              90
15                             140
7                              60
[Figure: scatter plot of the data above; horizontal axis Registered Vehicles (0 to 20), vertical axis Gasoline Sales (0 to 200)]
Principles of Excellent Graphs
The graph should not distort the data.
The graph should not contain unnecessary
adornments (sometimes referred to as chart junk).
The scale on the vertical axis should begin at zero.
All axes should be properly labeled.
The graph should contain a title.
The simplest possible graph should be used for a
given set of data.
Graphical Errors: Chart Junk
Minimum Wage
1960: $1.00
1970: $1.60
1980: $3.10
1990: $3.80
[Figure: two charts of Minimum Wage, labelled Bad Presentation and Good Presentation; the Good Presentation plots $ (0 to 4) against year (1960 to 1990)]
Graphical Errors: Compressing the Vertical Axis
[Figure: two charts of Quarterly Sales ($) for Q1 to Q4, labelled Good Presentation and Bad Presentation; one vertical axis runs from 0 to 50, the other from 0 to 200]
Graphical Errors: No Zero Point on the Vertical Axis
Graphing the first six months of sales
[Figure: two charts of Monthly Sales ($) for J, F, M, A, M, J, labelled Good Presentations and Bad Presentation; one vertical axis runs from 36 to 45 with no zero point, the other starts at 0]
Thank You
http://www.stats.gla.ac.uk/steps/glossary/presenting_data.html
http://www.ilir.uiuc.edu/courses/lir593/