Chapter 4 Section 1 day 3 2016s Notes.notebook
1
March 18, 2016
Honors Statistics
Daily Agenda
3. Notes 4.2 Quiz
4. Check homework C4#3
line 107
Use two-digit numbers from 01 to 40.
Start at line 107 and read two-digit numbers.
Continue until 5 different two-digit numbers from 01 to 40 are selected.
82 73 95 78 90 81 67 65 53 00 94
Sample consists of the following classmates.
Johnson (20)
Rider (31) Calloway (07)
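The label, table, and stopping-rule steps above can be sketched in Python. This is only an illustration: the digit string below is made up and is NOT line 107 of any real random digit table, and the function name is hypothetical.

```python
def sample_from_digit_line(digits, lowest, highest, how_many):
    """Read two-digit groups left to right, keeping labels in range and skipping repeats."""
    clean = [ch for ch in digits if ch.isdigit()]  # ignore the spaces in the table
    chosen = []
    # Step through the digits two at a time.
    for i in range(0, len(clean) - 1, 2):
        label = int(clean[i] + clean[i + 1])
        if lowest <= label <= highest and label not in chosen:
            chosen.append(label)
        if len(chosen) == how_many:
            break  # stopping rule: enough different labels selected
    return chosen

# Hypothetical digits for illustration only, not a real table line.
made_up_line = "82739 52035 96200 77813 10745 20351 5"
selected = sample_from_digit_line(made_up_line, 1, 40, 5)
print(selected)
```

Each selected label would then be matched back to the classmate who was given that number.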
Use two-digit numbers from 01 to 33.
Start at line 117 and read two-digit numbers.
Continue until 3 different two-digit numbers from 01 to 33 are selected.
Sample consists of the following complexes
Fairington (16) Fowler (18)
79 85
Use numbers from 1 to 1410
RANDOM: Use calculator command to generate random numbers
RANDINT(1,1410,1)
Stopping Procedure: Continue until 141 DIFFERENT numbers from 1 to 1410 are selected
Sample consists of the following plots. (do only first 3)
Seed Calculator with number 222 (just for classroom purposes)
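A rough Python equivalent of this calculator procedure is sketched below. It assumes Python's standard `random` module in place of the calculator command, so seeding with 222 here will not reproduce the calculator's numbers; the function name is hypothetical.

```python
import random

def draw_distinct(low, high, how_many, seed=None):
    """Draw random integers one at a time, ignoring repeats,
    until how_many DIFFERENT labels have been selected."""
    rng = random.Random(seed)
    chosen = []
    seen = set()
    while len(chosen) < how_many:
        label = rng.randint(low, high)  # both endpoints are included
        if label not in seen:
            seen.add(label)
            chosen.append(label)
    return chosen

# 141 different plot numbers from 1 to 1410 (seed used only so the run repeats).
plots = draw_distinct(1, 1410, 141, seed=222)
print(plots[:3])  # only the first 3, as in the class exercise
```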
Use numbers from 1 to 55,914
RANDOM: Use calculator command to generate random numbers
RANDINT(1,55914,1)
Stopping Procedure: Continue until 395 DIFFERENT numbers from 1 to 55914 are selected
Sample consists of the following gravestones. (do only first 3)
gravestones 43962, 1387, 4182
Seed Calculator with number 555 (just for classroom purposes)
The expected number of 0s in 40 random digits is 40 × 1/10 = 4,
but exactly four 0s will not always occur in a "random" sample
of 40 digits.
With 10 possible digits there is a 1/100
chance of two digits in a row being 00 (1 out of 100).
This will not happen very often, and there is also a chance that
four digits in a row could be 0000: a 1 out of 10,000 chance that
it will occur. ALSO, the number 0000 is just as likely as any other 4
digits!! Think about it.
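The arithmetic behind these statements can be checked with a short Python sketch. It assumes each digit is independent and equally likely to be 0 through 9, and it uses the binomial formula for the chance of exactly four 0s; the variable names are only illustrative.

```python
from math import comb

n_digits = 40
p_zero = 1 / 10  # each digit 0-9 is assumed equally likely

# Expected count of 0s in 40 digits.
expected_zeros = n_digits * p_zero

# Chance of getting EXACTLY four 0s in 40 digits (binomial formula).
p_exactly_four = comb(n_digits, 4) * p_zero**4 * (1 - p_zero) ** (n_digits - 4)

# Chance that two given digits are 00, and that four given digits are 0000.
p_two_zeros = p_zero**2
p_four_zeros = p_zero**4

print(expected_zeros, round(p_exactly_four, 3), p_two_zeros, p_four_zeros)
```

Under these assumptions the chance of exactly four 0s comes out to roughly 1 in 5, which is why a count other than 4 should not be a surprise.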
Picking the same "random" sample each time means that
your samples are no longer random.
Finding a sample of 20 randomly selected phones might be a
problem if you waited until the end of the day. Perhaps the managers
could select 20 numbers from 1 to 1000 ahead of time. But then you
would not want to tell the people manufacturing the phones which ones
were selected for inspection ...
b) The last 20 phones are not a random sample, so bias could result.
Perhaps the last 20 phones are of poorer quality than the first phones
manufactured. Tired people or tired machines (overheated, etc.) could
result in the last phones not having the same quality.
This is NOT a simple random sample because all possible groups
of 20 phones would not be equally likely to be selected. It would be
impossible (probability = 0) for 20 phones in a row to be selected.
Perhaps 20 phones in a row might be a desirable group occasionally.
It would take a very long time (and time is money) to count and label all of the trees in Rocky Mountain National Park.
An SRS is not a practical way to sample in this setting.
sample could result in a biased sample. Perhaps the car
emissions are contributing to the trees dying, so the sample would
produce a result that overestimates the dead trees.
c) Scientists should not conclude that exactly 35% of all pine trees on the
west side of the park are infested just because the sample yielded 35%. If the
sampling method was carried out correctly, they can be confident
that the infestation rate is approximately 35%. Sampling variability is the
idea that each new sample could give a different
estimate of the true population value. With proper sampling techniques these
estimates should vary only slightly.
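Sampling variability can be seen in a small simulation. The sketch below is hypothetical: it assumes a true infestation rate of 35% and a sample of 200 trees (neither number for the population is given in the notes), and it treats each sampled tree as independent.

```python
import random

def sample_percentages(true_rate, sample_size, repeats, seed=0):
    """Simulate several random samples and return the percent infested in each one."""
    rng = random.Random(seed)
    results = []
    for _ in range(repeats):
        # Each sampled tree is infested with probability true_rate.
        infested = sum(1 for _ in range(sample_size) if rng.random() < true_rate)
        results.append(100 * infested / sample_size)
    return results

# Assumed values for illustration: true rate 35%, 200 trees per sample, 10 samples.
estimates = sample_percentages(true_rate=0.35, sample_size=200, repeats=10)
print(estimates)
```

The printed percentages should differ from sample to sample while staying in the neighborhood of 35%.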
4.23 Is it an SRS?
Population: 2000 male engineers, 500 female engineers
Sample: 200 male engineers, 50 female engineers
Is this sample an SRS?
Simple Random Sample
A simple random sample (SRS) of size n consists of n individuals from the population chosen in such a way that every set of n individuals has an equal chance to be the sample actually selected.
While all people have the same probability
of being selected (1 out of 10), not all
sample groups are possible because of the
pre-grouping (which is good for this study, but it is not an SRS).
There cannot be all females (a group of
250 females) or all males (250 males) sent
to the "activity": training, meeting, etc.
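The difference can be illustrated with a short Python sketch, assuming the population of 2000 male and 500 female engineers from the problem. The pre-grouped sample always contains exactly 50 women, while an SRS of 250 lets the number of women vary from sample to sample.

```python
import random

rng = random.Random(1)
population = ["M"] * 2000 + ["F"] * 500  # index = engineer's label

# Pre-grouped (stratified) sample: 200 of the men and 50 of the women, chosen separately.
men = [i for i, sex in enumerate(population) if sex == "M"]
women = [i for i, sex in enumerate(population) if sex == "F"]
stratified = rng.sample(men, 200) + rng.sample(women, 50)
strat_women = sum(1 for i in stratified if population[i] == "F")

# SRS: 250 engineers chosen from all 2500 at once, repeated 5 times.
srs_women_counts = []
for _ in range(5):
    srs = rng.sample(range(len(population)), 250)
    srs_women_counts.append(sum(1 for i in srs if population[i] == "F"))

print(strat_women, srs_women_counts)
```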
methods of sampling
probability sampling: simple random sampling (SRS), stratified random sampling, cluster sampling, systematic random sampling, multistage sampling
voluntary response, convenience sample
Sampling involves studying a part in order to gain information about the whole.
Sampling Frame
A sampling frame is the LIST of individuals from which a sample is selected. This list should include all of the intended population of interest.
Picking a Sample ... Names out of a hat
Label
Random
Rules
Identify
Stratified random sampling works best when the individuals within each stratum are similar with respect to what is being measured and when there are large differences between strata.
Simple Random Sampling:
1. Obtain a list of all KHS students and their unique student ID #s.
2. Use computer software (or the random digit table) to generate students' ID #s that will be included in our sample.
(Every student and every group of students has an equally likely chance of being chosen for the sample.)
Stratified Random Sampling:
1. Divide students into meaningful groups, for example, by grade level.
2. Obtain a list of all KHS students and their unique student ID #s, divided up by grade level.
3. Use computer software (or the random digit table) to generate students' ID #s within each grade level that will be included in our sample.
(Every student does not have an equally likely chance of being chosen out of the population to be in the sample. Every student in a sub-group has an equally likely chance of being chosen from that sub-group to be in the sample.)
Cluster Sampling:
1. Divide students into meaningful groups, for example, by homerooms.
2. Obtain a list of all homerooms and assign unique numbers to each one.
3. Use computer software (or the random digit table) to generate homeroom numbers that will be included in our sample.
(Every student does not have an equally likely chance of being chosen out of the population to be in the sample. Every homeroom has an equally likely chance of being chosen out of all the homerooms to be in the sample.)
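The three procedures above can be compared side by side in a Python sketch. The roster is invented for illustration (400 students, 4 grade levels, 20 homerooms of 20 students, ID numbers starting at 1000); none of these numbers comes from KHS, and the function names are hypothetical.

```python
import random

rng = random.Random(2016)

# Hypothetical roster: 400 students, grades 9-12, homerooms numbered 0-19.
roster = [
    {"id": 1000 + i, "grade": 9 + i // 100, "homeroom": i // 20}
    for i in range(400)
]

def simple_random_sample(students, n):
    # Every group of n students is equally likely.
    return rng.sample(students, n)

def stratified_sample(students, per_grade):
    # Separate random selection inside each grade level (stratum).
    chosen = []
    for grade in sorted({s["grade"] for s in students}):
        stratum = [s for s in students if s["grade"] == grade]
        chosen.extend(rng.sample(stratum, per_grade))
    return chosen

def cluster_sample(students, n_homerooms):
    # Randomly pick whole homerooms, then take everyone in them.
    homerooms = sorted({s["homeroom"] for s in students})
    picked = set(rng.sample(homerooms, n_homerooms))
    return [s for s in students if s["homeroom"] in picked]

srs = simple_random_sample(roster, 40)
strat = stratified_sample(roster, 10)
clus = cluster_sample(roster, 2)
print(len(srs), len(strat), len(clus))
```

All three calls return 40 students here, but only the first allows every possible group of 40 to be chosen.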
CLUSTER SAMPLING because they randomly
selected the blocks and then contacted all the families in the
selected blocks.
b) The company most likely chose this method because it would
take less time to gather the information from "blocks" of families
than to travel all over the subdivision to seek out the
randomly selected families for the survey.
100% of largest accounts are verified
500 in amounts $1000 to $50,000
4400 in amounts under $1000
LABEL:
TABLE:
STOPPING PROCEDURE:
IDENTIFY SAMPLE:
(let's only do the first 3 for practice)
This would take much time and effort. You may also get a sample that mostly includes the "cheap" seats, and
therefore your sample would underestimate the financial status of the population.
Using lettered rows would be the best choice. You could then
randomly select from expensive to cheap seats.
For clusters choose the numbered sections. Each section
contains all the different prices of seats so you would
have a sample that represents the population.
Convenience Sampling often produces unrepresentative data.
Bias is NOT just bad luck in one sample.
It's the result of a bad study design!
People who choose to participate in call-in, text-in, or phone-in
surveys are usually NOT representative of some larger
population of interest. They have strong feelings about the
Be sure you understand the difference between
CLUSTER - a group of individuals that are a smaller image of the population.
Cluster - diverse like the population.
Some students misuse the term "Voluntary Response" to
explain why some randomly selected people do not respond
in a sample survey. (This is called NONRESPONSE.)
VISCOG
http://www.theinvisiblegorilla.com/gorilla_experiment.html
I would recommend making strata based on the ticket type.
I recommend this because the tickets are different prices and
the athletic department would be able to receive opinions from the
people in the three different price ranges of seats. They would
most likely be purchasing different concessions (or different
amounts of concessions). Perhaps I would also add additional strata of home sideline, visitor sideline, home corner, visitor
corner, home endzone, and visitor endzone. This way I would
include visiting fans' opinions if the survey would benefit from that type
of information.
I would recommend the same grouping for the clusters: home and
visitor sideline, corner, and endzone seating.
Using a stratified random sample will involve selecting random people from
strata that are spread all over the stadium. This would be very difficult to keep
track of and very time-consuming.
A cluster sample would be easier to obtain because you could randomly
pick a seat section from the "sideline" cluster (home sideline 26, 25, 24,
23, 22, 21, 20) and use all people in this section to answer the survey
question. This could be done by placing a survey on their seats before the
game starts and then having a person collect the surveys from the people in
the section.
(30)(40) = 1200 rooms
The Hotel manager will want to obtain information from people who stay
does not guarantee that a variety of guests will be selected (you could get
every one from the bottom floor golf course side ...)
The strata should be grouped by view side and floor level: Level 1 golf,
Level 1 water, Level 2 golf, Level 2 water, ..., Level 30 golf, Level 30 water.
Then choose 2 guests (or rooms) from each stratum (there are 60 strata, 2
views on each of 30 floors).
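The stratified plan can be sketched in Python as follows. The notes give 30 floors, 40 rooms per floor, and 2 views; the even split of 20 golf-side and 20 water-side rooms per floor is an assumption made only for this illustration.

```python
import random

rng = random.Random(30)

# Assumption: each of the 30 floors has 20 golf-side and 20 water-side rooms.
rooms = [
    (floor, view, number)
    for floor in range(1, 31)
    for view in ("golf", "water")
    for number in range(1, 21)
]

# One stratum per (floor, view) pair; choose 2 rooms at random from each.
sample = []
for floor in range(1, 31):
    for view in ("golf", "water"):
        stratum = [r for r in rooms if r[0] == floor and r[1] == view]
        sample.extend(rng.sample(stratum, 2))

print(len(rooms), len(sample))
```

With 60 strata and 2 rooms from each, the sample has 120 rooms and is guaranteed to include every floor and both views.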
b) The floors could be used as clusters. This would make collection of the data easier (only survey 3 floors) but ...
from my experience, my stay is affected by the guests in rooms directly around, above, and below me, so clustering could add an unintended BIAS ...
An SRS gives every individual the same
chance or probability of being selected, and every
possible group can be selected as well.
an SRS because it would be impossible to
select a group of 5 students over age 21.
The "give-away" is that we started by the selecting by
CLUSTER SAMPLING because they randomly
selected the rectangles and then examined all the trees in the
selected rectangles.
b) The company most likely chose this method because it would
take less time to gather the information from 20 rectangles of
trees than to travel all over the forest to seek out the
randomly selected trees for the survey. (Plus you would have to
label all of the trees in each rectangle.)
a) The sample result will not be exactly the same as the true population proportion, but it should be
very close to it. This is sampling
variability. (Each sample will have different students and could
have a slightly different result.)
An SRS of 100 students will provide more data and should
come out closer to the true population proportion.
Larger random samples are better than smaller random samples.
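A simulation can show why larger random samples tend to land closer to the truth. In this sketch the true proportion of 0.5 and the smaller sample size of 50 are assumptions chosen for illustration; only the sample size of 100 comes from the notes.

```python
import random

def spread_of_estimates(true_p, n, repeats=500, seed=4):
    """Simulate many random samples of size n and return the
    standard deviation of the sample proportions."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(repeats):
        hits = sum(1 for _ in range(n) if rng.random() < true_p)
        estimates.append(hits / n)
    mean = sum(estimates) / repeats
    variance = sum((e - mean) ** 2 for e in estimates) / repeats
    return variance ** 0.5

small = spread_of_estimates(0.5, 50)   # assumed smaller sample size
large = spread_of_estimates(0.5, 100)  # SRS of 100 students
print(round(small, 3), round(large, 3))
```

The spread for samples of 100 should come out smaller than the spread for samples of 50, meaning the estimates cluster more tightly around the true proportion.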
This survey will yield a biased result because you wanted to know the
average amount of money spent by the fans on opening day and your
survey does NOT include the "big spenders". Thus your result should
underestimate the average amount of money spent by the fans.
This BIAS is called UNDERCOVERAGE (we did not select anyone from
the $$ seats).
This survey should yield a
because it does not include any students that are
given a ride to school by their parents or that drive
to school themselves. I believe the survey will
overestimate the time that students get
out of bed before school starts because the
buses take longer to transport students to school
(they have to stop for everyone on the route).
The estimate will be