Copyright © 2016, 2012, 2008 Pearson Education, Inc. Chapter 1, Slide 1
Chapter 1
The Nature of Statistics
First Week Coverage:
Section 1.1: Statistics Basics
Section 2.1: Variables and Data
Section 2.2: Organizing Qualitative Data
Section 2.3: Organizing Quantitative Data
Section 2.4: Distribution Shapes
STAT 2510, Section 130
Wed/Fri, Aug. 17/19
Section 1.1
Statistics Basics
Definition 1.1
Descriptive Statistics
Descriptive statistics consists of methods for organizing
and summarizing information.
Descriptive statistics includes the construction of
graphs, charts, and tables and the calculation of
various descriptive measures such as averages,
measures of variation, and percentiles.
Descriptive statistics describe samples.
Example 1.1
The 1948 Baseball Season. In 1948, the Washington
Senators played 153 games, winning 56 and losing
97. They finished seventh in the American League
and were led in hitting by Bud Stewart, whose
batting average was .279.
The work of baseball statisticians is an illustration of
descriptive statistics.
Example 1.1.b
This section of STAT 2510 is a sample from the population
of all STAT 2510 students in Fall 2016.
Here are a few descriptive statistics:
Sample size: n = 95 students registered in the class
Gender: 32 Males (33.7%) and 63 Females
(66.3%)
Colleges: Mode: 33.7% of students are in Nursing
Class: Mode: 53.7% of students are Sophomores
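The gender percentages above can be reproduced from the counts. The following is a minimal sketch; the variable names are illustrative, and only the counts 32 and 63 come from the slide.

```python
# Counts taken from the slide: 32 males and 63 females in this section.
counts = {"Male": 32, "Female": 63}

# Sample size is the total number of registered students.
n = sum(counts.values())

# Percent in each category, rounded to one decimal place as on the slide.
percents = {group: round(100 * count / n, 1) for group, count in counts.items()}

# The mode of a qualitative variable is its most frequent category.
mode = max(counts, key=counts.get)

print(f"n = {n}")
for group, pct in percents.items():
    print(f"{group}: {counts[group]} ({pct}%)")
print(f"Mode: {mode}")
```

Each of these numbers describes only the sample (this section), so each is a descriptive statistic.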
Definition 1.2
Population and Sample
Population: The collection of all individuals or items under
consideration in a statistical study.
Sample: That part of the population from which information
is obtained.
Parameters are numerical values describing a population.
Statistics are numerical values describing a sample.
Example 1.2
Political polling provides an example of inferential
statistics. Interviewing everyone of voting age in the
United States on their voting preferences would be
expensive and unrealistic. Statisticians who want to
gauge the sentiment of the entire population of U.S.
voters can afford to interview only a carefully chosen
group of a few thousand voters. This group is called
a sample of the population.
Example 1.2.b
There are a total of 11 sections of STAT 2510, for a
total population of size N = 624 students. The
section (sample) sizes are n1 = 27, n2 = 48, n3 = 26,
n4 = 95, n5 = 27, n6 = 132, n7 = 27, n8 = 93, n9 = 95,
n10 = 27, and n11 = 27.
Definition 1.3
Inferential Statistics
Inferential statistics consists of methods for drawing and
measuring the reliability of conclusions about a population
based on information obtained from a sample of the
population.
Statisticians analyze the information obtained from a
sample of the voting population to make inferences
(draw conclusions) about the preferences of the entire
voting population. Inferential statistics provides
methods for drawing such conclusions.
Figure 1.1 Relationship between population and sample
Figure 1.2 Relationship between population and sample
Sample
Section 130 (n =95)
Population All Sections (N = 624)
Relationship between population and sample
STAT 2510 Example:
Statistics (one per section, each section treated as a sample)
Sample Size    %Female
n1 = 27        70.4%
n2 = 48        58.3%
n3 = 26        42.3%
n4 = 95        75.8%
n5 = 27        70.4%
n6 = 132       57.6%
n7 = 27        59.3%
n8 = 93        68.8%
n9 = 95        66.3%
n10 = 27       63.0%
n11 = 27       74.1%
Parameters (population of all sections)
Size: N = 624
%Female: 64.9%
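As a check on how the section statistics relate to the population parameter, the sketch below rebuilds the population %Female from the 11 section values. It assumes each section's female count can be recovered by rounding size times percent to the nearest whole student; that reconstruction is an assumption, not something stated on the slide.

```python
# Section sizes and %Female values as listed on the slide.
sizes = [27, 48, 26, 95, 27, 132, 27, 93, 95, 27, 27]
pct_female = [70.4, 58.3, 42.3, 75.8, 70.4, 57.6, 59.3, 68.8, 66.3, 63.0, 74.1]

# Population size N is the sum of the section sizes.
N = sum(sizes)

# Assumption: recover each section's female count by rounding n * p.
female_counts = [round(n * p / 100) for n, p in zip(sizes, pct_female)]
total_female = sum(female_counts)

# Population parameter: overall percent female, to one decimal place.
population_pct = round(100 * total_female / N, 1)

print(f"N = {N}, females = {total_female}, %Female = {population_pct}%")
```

Note that the parameter is a size-weighted combination of the section statistics, not the plain average of the 11 percentages.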
On Your Own
Classifying Statistical Studies
(pages 4-6)
Observational Studies versus Designed Experiments
Chapter 2
Organizing Data
Section 2.1
Variables and Data
Definition 2.1
Variables
Variable: A characteristic that varies from one person or
thing to another.
Qualitative variable: A nonnumerically valued variable.
Quantitative variable: A numerically valued variable.
Discrete variable: A quantitative variable whose possible
values can be listed. In particular, a quantitative variable
with only a finite number of possible values is a discrete
variable.
Continuous variable: A quantitative variable whose
possible values form some interval of numbers.
Figure 2.1
Types of variables
Definition 2.2
Data
Data: Values of a variable.
Qualitative data: Values of a qualitative variable.
Quantitative data: Values of a quantitative variable.
Discrete data: Values of a discrete variable.
Continuous data: Values of a continuous variable.
Section 2.2
Organizing Qualitative Data
Definition 2.3
Frequency Distribution of Qualitative Data
A frequency distribution of qualitative data is a listing of
the distinct values and their frequencies.
Table 2.1
Political party affiliations of the students in introductory
statistics
Table 2.2
Table for constructing a frequency distribution for the
political party affiliation data in Table 2.1
Definition 2.4
Relative-Frequency Distribution of Qualitative Data
A relative-frequency distribution of qualitative data is a
listing of the distinct values and their relative frequencies.
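A frequency distribution and a relative-frequency distribution can both be built by counting distinct values. The sketch below uses a small hypothetical list of party affiliations; it is not the data of Table 2.1, which is not reproduced in these slides.

```python
from collections import Counter

# Hypothetical observations of a qualitative variable (NOT the Table 2.1 data).
parties = [
    "Democratic", "Republican", "Other", "Republican", "Democratic",
    "Republican", "Other", "Democratic", "Republican", "Republican",
]

# Frequency distribution: each distinct value with its count.
frequency = Counter(parties)

# Relative-frequency distribution: each count divided by the number of observations.
n = len(parties)
relative_frequency = {value: count / n for value, count in frequency.items()}

for value, count in frequency.most_common():
    print(f"{value:<12} {count:>3} {relative_frequency[value]:.3f}")
```

The relative frequencies always sum to 1 (up to rounding), which is a quick check on the table.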
Table 2.3
Relative-frequency distribution for the political party
affiliation data in Table 2.1
Definition 2.5
Pie Chart
A pie chart is a disk divided into wedge-shaped pieces
proportional to the relative frequencies of the qualitative data.
Figure 2.2
Pie chart of the political party affiliation data in Table 2.1
Definition 2.6
Bar Chart
A bar chart displays the distinct values of the qualitative
data on a horizontal axis and the relative frequencies (or
frequencies or percents) of those values on a vertical axis.
The relative frequency of each distinct value is represented
by a vertical bar whose height is equal to the relative
frequency of that value. The bars should be positioned so
that they do not touch each other.
Figure 2.3
Bar chart of the political party affiliation data in Table 2.1
Section 2.3
Organizing Quantitative Data
Table 2.4
Number of TV sets in each of 50 randomly selected
households.
Table 2.5
Frequency and relative-frequency distributions, using single-
value grouping, for the number-of-TVs data in Table 2.4
Table 2.6
Days to maturity for 40 short-term investments
Table 2.6 Days to maturity for 40 short-term investments
Organizing first by ordering.
36 38 39 47 50 51 51 53
55 55 56 57 60 62 63 64
64 65 66 67 68 69 70 70
70 71 75 78 79 80 81 83
85 86 87 89 95 98 99 99
Order statistics are the ordered values of the data. The first order statistic is the
minimum, 36; the last order statistic is the maximum, 99.
Is this enough of a summary? Since we are looking at whole numbers, maybe we
could group these data by tens (30s, 40s, 50s, etc.).
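Grouping by tens can be sketched as follows. The data are the 40 ordered values above; the function name, the first lower limit of 30, and the class width of 10 are choices made for this illustration.

```python
# Days to maturity for 40 short-term investments (ordered values from the slide).
days = [
    36, 38, 39, 47, 50, 51, 51, 53,
    55, 55, 56, 57, 60, 62, 63, 64,
    64, 65, 66, 67, 68, 69, 70, 70,
    70, 71, 75, 78, 79, 80, 81, 83,
    85, 86, 87, 89, 95, 98, 99, 99,
]

def limit_grouping(values, first_lower, width):
    """Return (lower limit, upper limit, frequency, relative frequency) per class."""
    n_classes = (max(values) - first_lower) // width + 1
    freqs = [0] * n_classes
    for v in values:
        freqs[(v - first_lower) // width] += 1
    return [
        (first_lower + i * width, first_lower + (i + 1) * width - 1, f, f / len(values))
        for i, f in enumerate(freqs)
    ]

table = limit_grouping(days, first_lower=30, width=10)
for lower, upper, freq, rel in table:
    print(f"{lower}-{upper}: frequency {freq:>2}, relative frequency {rel:.3f}")
```

The frequencies add up to 40 and the relative frequencies add up to 1, as they should for any grouping that covers all the observations.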
Table 2.7
Frequency and relative-frequency distributions, using limit
grouping, for the days-to-maturity data in Table 2.6
Definition 2.7
Terms Used in Limit Grouping
Lower class limit: The smallest value that could go in a class.
Upper class limit: The largest value that could go in a class.
Class width: The difference between the lower limit of a class
and the lower limit of the next-higher class.
Class mark: The average of the two class limits of a class.
Frequency and relative-frequency distributions, using limit
grouping, for the days-to-maturity data in Table 2.6
Annotated table labels: Classes, Lower Class Limit, Upper Class Limit
Class width = 10 = 40 - 30 = 50 - 40 = 60 - 50 = 70 - 60 = 80 - 70 = 90 - 80
Class marks: 34.5, 44.5, 54.5, 64.5, 74.5, 84.5, 94.5
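The class width and class marks follow directly from Definition 2.7. In the sketch below, the lower limits 30 through 90 come from the width calculation on the slide; the upper limits 39 through 99 are an assumption inferred from the class marks, since the table itself is not reproduced here.

```python
# Lower class limits from the slide's width calculation (30, 40, ..., 90).
lower_limits = list(range(30, 100, 10))

# Assumption: upper class limits are 39, 49, ..., 99 (whole-number data).
upper_limits = [lower + 9 for lower in lower_limits]

# Class width: lower limit of the next-higher class minus lower limit of the class.
widths = [nxt - cur for cur, nxt in zip(lower_limits, lower_limits[1:])]

# Class mark: average of the two class limits.
marks = [(lower + upper) / 2 for lower, upper in zip(lower_limits, upper_limits)]

print("Class widths:", widths)
print("Class marks:", marks)
```

Note that the width is 10, not 9: it is measured between lower limits, not from the lower limit to the upper limit of the same class.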
Cutpoint Grouping (continuous data with decimals)
Table 2.8 (Example 2.14): Weights of 18- to 24-year-old males (in lb)
129.2 185.3 218.1 182.5 142.8
155.2 170 151.3 187.5 145.6
167.3 161 178.7 165 172.5
191.1 150.7 187 173.7 178.2
161.7 170.1 165.8 214.6 136.7
278.8 175.6 188.7 132.1 158.5
146.4 209.1 175.4 182 173.6
149.9 158.6
Definition 2.8
Terms Used in Cutpoint Grouping
Lower class cutpoint: The smallest value that could go in a
class.
Upper class cutpoint: The smallest value that could go in the
next-higher class (equivalent to the lower cutpoint of the
next-higher class).
Class width: The difference between the cutpoints of a class.
Class midpoint: The average of the two cutpoints of a class.
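Cutpoint grouping can be sketched with the weight data of Table 2.8. The first cutpoint of 120 and the class width of 20 are assumptions chosen for this illustration; the slides do not state which cutpoints to use. Each class runs from its lower cutpoint up to, but not including, its upper cutpoint.

```python
import math

# Weights (lb) of 18- to 24-year-old males, from Table 2.8 on the slide.
weights = [
    129.2, 185.3, 218.1, 182.5, 142.8,
    155.2, 170, 151.3, 187.5, 145.6,
    167.3, 161, 178.7, 165, 172.5,
    191.1, 150.7, 187, 173.7, 178.2,
    161.7, 170.1, 165.8, 214.6, 136.7,
    278.8, 175.6, 188.7, 132.1, 158.5,
    146.4, 209.1, 175.4, 182, 173.6,
    149.9, 158.6,
]

def cutpoint_grouping(values, first_cutpoint, width):
    """Return (lower cutpoint, upper cutpoint, midpoint, frequency) per class."""
    n_classes = math.floor((max(values) - first_cutpoint) / width) + 1
    freqs = [0] * n_classes
    for v in values:
        freqs[math.floor((v - first_cutpoint) / width)] += 1
    rows = []
    for i, f in enumerate(freqs):
        lower = first_cutpoint + i * width
        upper = lower + width
        rows.append((lower, upper, (lower + upper) / 2, f))
    return rows

# Assumed cutpoints: 120, 140, ..., 280 (class width 20).
classes = cutpoint_grouping(weights, first_cutpoint=120, width=20)
for lower, upper, midpoint, freq in classes:
    print(f"{lower} to under {upper} (midpoint {midpoint}): {freq}")
```

Empty classes (here 220 to under 240 and 240 to under 260) are kept in the table so that the gap before the largest weight stays visible.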
Choosing the Grouping Method
Definition 2.9
Histogram
A histogram displays the classes of the quantitative data on a
horizontal axis and the frequencies (relative frequencies, percents) of
those classes on a vertical axis. The frequency (relative frequency,
percent) of each class is represented by a vertical bar whose height
is equal to the frequency (relative frequency, percent) of that class.
The bars should be positioned so that they touch each other.
• For single-value grouping, we use the distinct values of the
observations to label the bars, with each such value centered under
its bar.
• For limit grouping or cutpoint grouping, we use the lower class
limits (or, equivalently, lower class cutpoints) to label the bars.
Note: Some statisticians and technologies use class marks or class
midpoints centered under the bars.
Figure 2.4 Single-value grouping. Number of TVs per household:
(a) frequency histogram; (b) relative-frequency histogram
Figure 2.5
Limit grouping. Days to maturity: (a) frequency histogram; (b) relative-frequency
histogram
Figure 2.6
Cutpoint grouping. Weight of 18- to 24-year-old males: (a) frequency
histogram; (b) relative-frequency histogram
Definition 2.10
Dotplot
A dotplot is a graph in which each observation is plotted as
a dot at an appropriate place above a horizontal axis.
Observations having equal values are stacked vertically.
Table 2.11 & Figure 2.7
Prices, in dollars, of 16 DVD players
Definition 2.11
Stem-and-Leaf Diagrams
In a stem-and-leaf diagram (or stemplot), each observation
is separated into two parts: a stem, consisting of
all but the rightmost digit, and a leaf, the rightmost digit.
Table 2.13 & Figure 2.9
Cholesterol levels
for 20 high-level patients
Stem-and-leaf diagram for cholesterol levels:
(a) one line per stem; (b) two lines per stem
Table 2.12 & Figure 2.8
Days to maturity for 40 short-term investments
Constructing a stem-and-leaf diagram
for the days-to-maturity data
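A stem-and-leaf diagram with one line per stem can be sketched as below for the days-to-maturity data. The function name is illustrative, and the code assumes nonnegative whole-number observations, so that the stem is the value with its last digit removed and the leaf is that last digit.

```python
from collections import defaultdict

# Days to maturity for 40 short-term investments (values from the slides).
days = [
    36, 38, 39, 47, 50, 51, 51, 53,
    55, 55, 56, 57, 60, 62, 63, 64,
    64, 65, 66, 67, 68, 69, 70, 70,
    70, 71, 75, 78, 79, 80, 81, 83,
    85, 86, 87, 89, 95, 98, 99, 99,
]

def stem_and_leaf(values):
    """Map each stem to a string of its leaves, in increasing order."""
    stems = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)  # stem: all but the last digit; leaf: last digit
        stems[stem].append(leaf)
    return {
        stem: "".join(str(leaf) for leaf in leaves)
        for stem, leaves in sorted(stems.items())
    }

diagram = stem_and_leaf(days)
for stem, leaves in diagram.items():
    print(f"{stem} | {leaves}")
```

Unlike a frequency table, the diagram keeps every original value while still showing the shape of the distribution.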
Section 2.4
Distribution Shapes
Definition 2.12
Distribution of a Data Set
The distribution of a data set is a table, graph, or
formula that provides the values of the observations and
how often they occur.
Figure 2.10 Relative-frequency histogram and approximating smooth curve
for the distribution of heights
Figure 2.11 Examples of (a) unimodal, (b) bimodal, and (c) multimodal
distributions
Figure 2.12 Examples of symmetric distributions: (a) bell shaped, (b) triangular,
and (c) uniform
Figure 2.13 Generic skewed distributions: (a) right skewed, (b) left skewed
Figure 2.14 Reverse-J-shaped distribution
Figure 2.15 Relative-frequency histogram for household size
Definition 2.13
Population and Sample Data
Population data: The values of a variable for the entire
population.
Sample data: The values of a variable for a sample of the
population.
Definition 2.14
Population and Sample Distributions; Distribution of a Variable
The distribution of population data is called the population
distribution, or the distribution of the variable.
The distribution of sample data is called a sample distribution.
Figure 2.16
Population distribution and
six sample distributions for
household size
Key Fact 2.1
Population and Sample Distributions
For a simple random sample, the sample distribution
approximates the population distribution (i.e., the
distribution of the variable under consideration). The
larger the sample size, the better the approximation
tends to be.
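Key Fact 2.1 can be illustrated with a small simulation. The population distribution of household size below is hypothetical (it is not the distribution shown in Figure 2.16), and the sampling is done with replacement as a stand-in for simple random sampling from a very large population. The sketch measures how far each sample distribution is from the population distribution.

```python
import random

# Hypothetical population distribution of household size (NOT from Figure 2.16).
population = {1: 0.27, 2: 0.34, 3: 0.16, 4: 0.14, 5: 0.06, 6: 0.02, 7: 0.01}

def sample_distribution(n, rng):
    """Relative frequencies of household size in a random sample of size n."""
    draws = rng.choices(list(population), weights=list(population.values()), k=n)
    return {size: draws.count(size) / n for size in population}

def max_error(n, rng):
    """Largest gap between sample and population relative frequencies."""
    sample = sample_distribution(n, rng)
    return max(abs(sample[size] - population[size]) for size in population)

rng = random.Random(2016)  # fixed seed so the run is repeatable
errors = {n: max_error(n, rng) for n in (10, 100, 1000, 100000)}
for n, err in errors.items():
    print(f"n = {n:>6}: largest gap = {err:.4f}")
```

The gap tends to shrink as n grows, but any single small sample can be closer or farther than expected, which is why the Key Fact says "tends to be."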