selvanathan-ramasamy
Introduction to basic statistics.
Basic Statistics: Descriptive Analysis (Graphical)
Dealing with Uncertainty
Everyday decisions are based on incomplete information
Consider:
Will the job market be strong when I graduate?
Will the price of HELP's stock be higher in six months than it is now?
Dealing with Uncertainty (continued)
Numbers and data are used to assist decision making
Statistics is a tool to help process, summarize, analyze, and interpret data
Population vs. Sample
Values calculated using population data are called parameters
Values computed from sample data are called statistics
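The distinction between a parameter and a statistic can be illustrated with a minimal sketch. The population here is hypothetical and randomly generated for illustration only (it is not data from the slides), and the names `population` and `sample` are assumptions.

```python
import random
import statistics

# Hypothetical population: weights (in pounds) of 1,000 people.
# Generated only for illustration; these are not data from the slides.
random.seed(42)
population = [random.gauss(140, 15) for _ in range(1000)]

# A parameter is calculated from every member of the population.
population_mean = statistics.mean(population)

# A statistic is computed from a sample drawn from that population.
sample = random.sample(population, 50)
sample_mean = statistics.mean(sample)

print(f"Population mean (parameter): {population_mean:.1f}")
print(f"Sample mean (statistic):     {sample_mean:.1f}")
```

The two values will generally be close but not identical, which is why a statistic is treated as an estimate of the corresponding parameter.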
Symbols
Descriptive and Inferential Statistics
Two branches of statistics:
  Descriptive statistics: using graphical and numerical procedures to summarize and process data
  Inferential statistics: using data to make predictions, forecasts, and estimates to assist decision making
Descriptive Statistics
  Collect data, e.g., survey
  Present data, e.g., tables and graphs
  Summarize data, e.g., sample mean = (sum of the observations) / (number of observations)
Inferential Statistics
  Estimation, e.g., estimate the population mean weight using the sample mean weight
  Hypothesis testing, e.g., test the claim that the population mean weight is 140 pounds
Inference is the process of drawing conclusions or making decisions about a population based on sample results
Types of Data
  Categorical (defined categories or groups)
    Examples: marital status; are you registered to vote?; eye color
  Numerical, discrete (counted items)
    Examples: number of children; defects per hour
  Numerical, continuous (measured characteristics)
    Examples: weight; voltage
Measurement Levels
  Quantitative Data
    Ratio Data: differences between measurements, true zero exists
    Interval Data: differences between measurements but no true zero
  Qualitative Data
    Ordinal Data: ordered categories (rankings, order, or scaling)
    Nominal Data: categories (no ordering or direction)
Measurement Levels
Variables can be split into categorical and continuous, and within these types there are different levels of measurement:
  Categorical
    Nominal scale
      The lowest scale
      Numbers are assigned only to identify attributes
      No order or sequence
    Ordinal scale
      The same as a nominal variable, but the categories have a logical order
      Arranged from lowest to highest or vice versa
  Continuous
    Interval scale
      Equal intervals on the variable represent equal differences in the property being measured
      Arbitrary zero
    Ratio scale
      The same as an interval variable: equal intervals on the variable represent equal differences in the property being measured
      True zero
What is the measurement scale of each of these variables?
  1. Speed (km/h)
  2. Motivation scores
  3. Number of SMS received
  4. Nationality
  5. Perception scores
  6. Quality of work life scores
  7. Income categories
  8. Musical ability
  9. Favorite food
  10. Speaking ability
What is the measurement scale of each of these variables? (Answers)
  1. Speed (km/h): Ratio
  2. Motivation scores: Interval
  3. Number of SMS received: Ratio
  4. Nationality: Nominal
  5. Perception scores: Interval
  6. Quality of work life scores: Interval
  7. Income categories: Ordinal
  8. Musical ability: Ordinal
  9. Favorite food: Nominal
  10. Speaking ability: Ordinal
Descriptive Statistics: Graphical Procedures
Data in raw form are usually not easy to use for decision making
Some type of organization is needed:
  Table
  Graph
The type of graph to use depends on the variable being summarized
Descriptive Statistics: Graphical Procedures (continued)
  Categorical variables
    Frequency distribution
    Bar chart
    Pie chart
  Numerical variables
    Frequency distribution
    Histogram and ogive
    Stem-and-leaf display
    Scatter plot
Tables and Graphs for Categorical Variables
  Categorical Data
    Tabulating data: frequency distribution table
    Graphing data: bar chart, pie chart
The Frequency Distribution Table
Example: Hospital Patients by Unit

Hospital Unit      Number of Patients
Cardiac Care                    1,052
Emergency                       2,245
Intensive Care                    340
Maternity                         552
Surgery                         4,630

(Variables are categorical)
Bar and Pie Charts
Bar charts and pie charts are often used for qualitative (category) data
Height of bar or size of pie slice shows the frequency or percentage for each category
Bar Chart Example
[Figure: bar chart "Hospital Patients by Unit"; vertical axis: number of patients per year]

Hospital Unit      Number of Patients
Cardiac Care                    1,052
Emergency                       2,245
Intensive Care                    340
Maternity                         552
Surgery                         4,630
Pie Chart Example
[Figure: pie chart "Hospital Patients by Unit"] (Percentages are rounded to the nearest percent)

Hospital Unit      Number of Patients    % of Total
Cardiac Care                    1,052         11.93
Emergency                       2,245         25.46
Intensive Care                    340          3.86
Maternity                         552          6.26
Surgery                         4,630         52.50
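The "% of Total" column can be reproduced from the patient counts. The following is a minimal sketch using the counts from the hospital example; the variable names are illustrative assumptions.

```python
# Patient counts from the hospital example.
patients = {
    "Cardiac Care": 1052,
    "Emergency": 2245,
    "Intensive Care": 340,
    "Maternity": 552,
    "Surgery": 4630,
}

total = sum(patients.values())

# Each unit's share of all patients, expressed as a percentage.
percentages = {unit: 100 * count / total for unit, count in patients.items()}

for unit, pct in percentages.items():
    print(f"{unit:<15}{patients[unit]:>6,}{pct:>8.2f}")
```

Each percentage determines the size of the corresponding pie slice, and the raw counts determine the bar heights.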
Graphs to Describe Numerical Variables
  Numerical Data
    Frequency distributions and cumulative distributions: histogram, ogive
    Stem-and-leaf display
Frequency Distributions
What is a Frequency Distribution?
A frequency distribution is a list or a table containing class groupings (categories or ranges within which the data fall) and the corresponding frequencies with which data fall within each class or category
Why Use Frequency Distributions?
A frequency distribution is a way to summarize data
The distribution condenses the raw data into a more useful form and allows for a quick visual interpretation of the data
Frequency Distribution Example
Example: A manufacturer of insulation randomly selects 20 winter days and records the daily high temperature:
24, 35, 17, 21, 24, 37, 26, 46, 58, 30, 32, 13, 12, 38, 41, 43, 44, 27, 53, 27

Interval               Frequency   Relative Frequency   Percentage
10 but less than 20            3                 0.15           15
20 but less than 30            6                 0.30           30
30 but less than 40            5                 0.25           25
40 but less than 50            4                 0.20           20
50 but less than 60            2                 0.10           10
Total                         20                 1.00          100

Data in ordered array:
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
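The frequency distribution for the temperature example can be built by counting how many observations fall in each class. This is a minimal sketch that assumes the class width of 10 and the lower bounds 10 through 50 used in the example; the variable names are illustrative.

```python
# Daily high temperatures from the insulation manufacturer example.
temps = [24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
         32, 13, 12, 38, 41, 43, 44, 27, 53, 27]

width = 10
lower_bounds = range(10, 60, width)  # 10, 20, 30, 40, 50

# Count the values in each class "lower but less than lower + width".
frequency = {
    lower: sum(lower <= t < lower + width for t in temps)
    for lower in lower_bounds
}

n = len(temps)
for lower, f in frequency.items():
    label = f"{lower} but less than {lower + width}"
    print(f"{label:<22}{f:>3}{f / n:>8.2f}{100 * f / n:>6.0f}")
```

The half-open comparison `lower <= t < lower + width` matches the "but less than" wording, so a value of exactly 30 is counted in the 30 to 40 class.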
Histogram
A graph of the data in a frequency distribution is called a histogram
The interval endpoints are shown on the horizontal axis
The vertical axis is either frequency, relative frequency, or percentage
Bars of the appropriate heights are used to represent the number of observations within each class
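Histogram Example
[Figure: histogram "Histogram: Daily High Temperature"; horizontal axis: temperature in degrees; vertical axis: frequency; no gaps between bars]

Interval               Frequency
10 but less than 20            3
20 but less than 30            6
30 but less than 40            5
40 but less than 50            4
50 but less than 60            2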
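As a rough text-only stand-in for the histogram, each class can be drawn as a bar whose length equals its frequency. This sketch assumes a horizontal layout and the `#` character purely for illustration; a real histogram places the intervals on the horizontal axis.

```python
# Class frequencies from the temperature example.
classes = {
    "10 but less than 20": 3,
    "20 but less than 30": 6,
    "30 but less than 40": 5,
    "40 but less than 50": 4,
    "50 but less than 60": 2,
}

# One "#" per observation, so bar length is proportional to frequency.
bars = {label: "#" * f for label, f in classes.items()}

for label, bar in bars.items():
    print(f"{label:<20} {bar} ({len(bar)})")
```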
The Cumulative Frequency Distribution
Data in ordered array:
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58

Class                  Frequency   Percentage   Cumulative Frequency   Cumulative Percentage
10 but less than 20            3           15                      3                      15
20 but less than 30            6           30                      9                      45
30 but less than 40            5           25                     14                      70
40 but less than 50            4           20                     18                      90
50 but less than 60            2           10                     20                     100
Total                         20          100
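The cumulative columns are running totals of the frequency column. A minimal sketch, assuming the class frequencies from the temperature example and illustrative variable names:

```python
from itertools import accumulate

# Class frequencies for 10-20, 20-30, 30-40, 40-50, 50-60.
frequencies = [3, 6, 5, 4, 2]
n = sum(frequencies)

# Running total of the frequencies.
cumulative_frequency = list(accumulate(frequencies))

# Running total expressed as a percentage of all observations.
cumulative_percentage = [100 * c / n for c in cumulative_frequency]

print(cumulative_frequency)
print(cumulative_percentage)
```

The last cumulative frequency always equals the number of observations, and the last cumulative percentage is always 100.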
The Ogive Example
[Figure: ogive "Ogive: Daily High Temperature"; horizontal axis: interval endpoints; vertical axis: cumulative percentage]

Interval               Upper interval endpoint   Cumulative Percentage
Less than 10                                10                       0
10 but less than 20                         20                      15
20 but less than 30                         30                      45
30 but less than 40                         40                      70
40 but less than 50                         50                      90
50 but less than 60                         60                     100
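The points plotted on an ogive pair each upper interval endpoint with the cumulative percentage reached at that endpoint. The sketch below assumes the temperature example, with an extra leading class "less than 10" that contains no observations; the names are illustrative.

```python
from itertools import accumulate

# Upper endpoints of the classes, starting with "less than 10".
upper_endpoints = [10, 20, 30, 40, 50, 60]
# Frequencies per class; no observations fall below 10.
frequencies = [0, 3, 6, 5, 4, 2]
n = sum(frequencies)

# (upper endpoint, cumulative percentage) pairs to plot and join with lines.
ogive_points = [
    (x, 100 * c / n)
    for x, c in zip(upper_endpoints, accumulate(frequencies))
]

for x, pct in ogive_points:
    print(f"{x:>3}  {pct:>5.0f}")
```

Starting the curve at a cumulative percentage of 0 anchors the ogive on the horizontal axis.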
Stem-and-Leaf Diagram
A simple way to see distribution details in a data set
METHOD: Separate the sorted data series into leading digits (the stem) and the trailing digits (the leaves)
Example
Data in ordered array:
21, 24, 24, 26, 27, 27, 30, 32, 38, 41
Here, use the 10s digit for the stem unit:
  21 is shown as    Stem 2 | Leaf 1
  38 is shown as    Stem 3 | Leaf 8
Completed stem-and-leaf diagram:
Stem | Leaves
   2 | 1 4 4 6 7 7
   3 | 0 2 8
   4 | 1
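The method above can be sketched in a few lines: split each value into its 10s digit (stem) and units digit (leaf), then group the leaves by stem. The variable names are illustrative assumptions.

```python
from collections import defaultdict

data = [21, 24, 24, 26, 27, 27, 30, 32, 38, 41]

stems = defaultdict(list)
for value in sorted(data):
    # The 10s digit is the stem, the units digit is the leaf.
    stem, leaf = divmod(value, 10)
    stems[stem].append(leaf)

for stem in sorted(stems):
    leaves = " ".join(str(leaf) for leaf in stems[stem])
    print(f"{stem:>2} | {leaves}")
```

Sorting the data first keeps the leaves in ascending order within each stem.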
Using other stem units
Using the 100s digit as the stem:
Round off the 10s digit to form the leaves
  613 would become     Stem 6 | Leaf 1
  776 would become     Stem 7 | Leaf 8
  . . .
  1224 becomes         Stem 12 | Leaf 2
The completed stem-and-leaf display:
Stem | Leaves
   6 | 1 3 6
   7 | 2 2 5 8
   8 | 3 4 6 6 9 9
   9 | 1 3 3 6 8
  10 | 3 5 6
  11 | 4 7
  12 | 2
Data:
613, 632, 658, 717, 722, 750, 776, 827, 841, 859, 863, 891, 894, 906, 928, 933, 955, 982, 1034, 1047, 1056, 1140, 1169, 1224
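With the 100s digit as the stem, each value is first rounded to the nearest ten, and the rounded 10s digit becomes the leaf. The sketch below assumes halves round up (so 955 becomes stem 9, leaf 6); the function name `stem_and_leaf_hundreds` is hypothetical.

```python
from collections import defaultdict

data = [613, 632, 658, 717, 722, 750, 776, 827, 841, 859, 863, 891,
        894, 906, 928, 933, 955, 982, 1034, 1047, 1056, 1140, 1169, 1224]

def stem_and_leaf_hundreds(values):
    """Group values by 100s stem, using the rounded 10s digit as the leaf."""
    display = defaultdict(list)
    for value in sorted(values):
        # Round to the nearest ten (halves up), counted in units of ten.
        tens = (value + 5) // 10
        # 100s part is the stem; the remaining 10s digit is the leaf.
        stem, leaf = divmod(tens, 10)
        display[stem].append(leaf)
    return dict(display)

display = stem_and_leaf_hundreds(data)
for stem in sorted(display):
    leaves = " ".join(str(leaf) for leaf in display[stem])
    print(f"{stem:>2} | {leaves}")
```

One consequence of rounding is that a value such as 696 would round to 700 and appear under stem 7 with leaf 0; no value in this data set is affected.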
Relationships Between Variables
Graphs illustrated so far have involved only a single variable
When two variables exist, other techniques are used:
  Categorical (qualitative) variables: cross tables
  Numerical (quantitative) variables: scatter plots
Scatter Diagrams
Scatter diagrams are used for paired observations taken from two numerical variables
The scatter diagram: one variable is measured on the vertical axis and the other variable is measured on the horizontal axis
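Scatter Diagram Example
[Figure: scatter diagram "Cost per Day vs. Production Volume"; horizontal axis: volume per day; vertical axis: cost per day]

Volume per day   Cost per day
            23            125
            26            140
            29            146
            33            160
            38            167
            42            170
            50            188
            55            195
            60            200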
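Each point on the scatter diagram is one paired observation. The sketch below builds those pairs from the example data, assuming volume goes on the horizontal axis and cost on the vertical axis, and then checks a simple pattern a reader would look for in the plot; the variable names are illustrative.

```python
volume_per_day = [23, 26, 29, 33, 38, 42, 50, 55, 60]
cost_per_day = [125, 140, 146, 160, 167, 170, 188, 195, 200]

# One (x, y) point per day: x = volume (horizontal), y = cost (vertical).
points = list(zip(volume_per_day, cost_per_day))

# In this data set the cost rises every time the volume rises.
cost_always_rises = all(
    y_next > y_prev
    for (_, y_prev), (_, y_next) in zip(points, points[1:])
)

for x, y in points:
    print(f"{x:>3}  {y:>4}")
print("Cost rises with volume:", cost_always_rises)
```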
Cross Table and Side by Side Bar Chart
Sales by quarter for three sales territories: