# Slides to accompany Weathington, Cunningham & Pittenger (2010), Chapter 7: Sampling

• Published on 15-Mar-2016

Transcript

• Slides to accompany Weathington, Cunningham & Pittenger (2010), Chapter 7: Sampling

• Objectives
    • Samples, in general
    • Probability sampling
    • Probability sampling methods
    • Nonprobability sampling
    • Central Limit Theorem
    • Applications of CLT
    • Sources of bias and error

• Why Worry about Sampling?
    • Don't worry, just appreciate it
    • Objective sampling helps us avoid the Idols of the Cave
    • Improving external validity of our conclusions
    • Good sampling allows us to make comparisons and predictions from our data

• Samples...
    • are (hopefully) valid representatives of the population you are studying
    • can grant you better (more objective, empirical) data than you will find in anecdotes
    • allow you to avoid reliance on one person's opinions, perspectives, and biases

• Probability Examples
    • Probability of heads in one flip of a fair coin: p(H) = 1/2 = .5; p(T) = 1/2 = .5
    • p(one H and one T, in either order) in two flips = 2/4 = .5
    • p(correct answer on a 4-option multiple-choice question, when guessing) = .25
    • Probability of choosing a woman in a single random selection from a class of 223 students with 150 women: p(w) = 150/223 = .673
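
The arithmetic on this slide can be checked with a short Python sketch. The variable names are illustrative only, and the two-flip case is worked out by listing all four equally likely outcomes.

```python
from fractions import Fraction
from itertools import product

# One flip of a fair coin
p_heads = Fraction(1, 2)

# Two flips: list the four equally likely outcomes (HH, HT, TH, TT)
# and count those that contain one head and one tail
outcomes = list(product("HT", repeat=2))
mixed = [o for o in outcomes if set(o) == {"H", "T"}]
p_one_each = Fraction(len(mixed), len(outcomes))

# Guessing on a 4-option multiple-choice question
p_guess = Fraction(1, 4)

# One random selection from a class of 223 students, 150 of them women
p_woman = Fraction(150, 223)

print(float(p_heads), float(p_one_each), float(p_guess), round(float(p_woman), 3))
# prints: 0.5 0.5 0.25 0.673
```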

• Probability Sampling
    • Random: each outcome has an equal probability of occurring, every time
        • Every time I flip a coin, the probability is .5 that it will be H and .5 that it will be T
    • Random sampling depends on this independence of outcomes
    • Law of large numbers: On average, a large selection of items will have the same characteristics as those in the population
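
A minimal simulation of the law of large numbers, assuming a fair coin and a fixed seed (both chosen here only for illustration): the observed proportion of heads tends to settle near .5 as the number of flips grows.

```python
import random

rng = random.Random(42)  # fixed seed so the run is repeatable (illustrative choice)

def proportion_heads(n_flips):
    """Flip a fair coin n_flips times and return the observed proportion of heads."""
    heads = sum(rng.random() < 0.5 for _ in range(n_flips))
    return heads / n_flips

# Small runs can stray far from .5; large runs usually land close to it
for n in (10, 100, 10_000):
    print(n, proportion_heads(n))
```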

• Populations and Samples
    • Target vs. sampling population
        • Target (universe): e.g., all depressed persons
        • Sampling (accessible): e.g., all persons diagnosed as depressed
    • Sampling frame: all who can be reached
    • Subject (participant) pool: those willing to participate
    • Descriptive data help us compare our sample against the population
    • External validity depends largely on representativeness in sampling

• Probability Sampling Characteristics
    • Each population member has an equal chance of being a potential sample member
        • No systematic exclusions
    • Sampling procedures are based on a protocol
        • Prevents bias effects on sample selection
    • Probability of any specific sample can be calculated
        • Helps connect results with population

• Simple Random Sampling
    • Each population member has an equal probability of selection to the sample
    • If selection is random, a sample should, on average, represent the population from which it was chosen
    • Random numbers are available in tables and in Excel-type computer programs

• Simple Random Sampling: How-To
    • Generate a list of possible participants (population) in Microsoft Excel
    • In the next column insert the function =RAND()
        • Creates a random number between 0 and 1
    • Sort both columns by the random numbers
    • Select the first N individuals for your sample
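
The same sort-by-random-number procedure can be sketched in Python instead of Excel. The function name `simple_random_sample` and the roster of 223 student labels are hypothetical, made up for this illustration.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Mimic the spreadsheet recipe: attach a random number to each member,
    sort by that number, and keep the first n members."""
    rng = random.Random(seed)
    # Equivalent of filling a second column with =RAND()
    keyed = [(rng.random(), member) for member in population]
    # Equivalent of sorting both columns by the random numbers
    keyed.sort(key=lambda pair: pair[0])
    return [member for _, member in keyed[:n]]

# Hypothetical class roster used as the population list
roster = [f"student_{i:03d}" for i in range(1, 224)]
sample = simple_random_sample(roster, 20, seed=1)
print(sample)
```

Python's built-in `random.sample(roster, 20)` draws a sample without replacement in one call; the longer version above is shown only because it mirrors the spreadsheet steps.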

• Sequential/Systematic Sampling
    • Random is not always practical
    • All sampling population members are listed and every kth member is selected to the sample
    • k = sampling interval = (population size) / (desired sample size N)
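
A rough sketch of systematic selection in Python. Two details are assumptions not stated on the slide: k is rounded down to a whole number, and the starting position is chosen at random within the first interval. The function name and the list of 223 IDs are hypothetical.

```python
import random

def systematic_sample(population, n, seed=None):
    """Select every k-th member, where k = population size // desired sample size."""
    k = len(population) // n  # sampling interval, rounded down (assumption)
    if k < 1:
        raise ValueError("desired sample size exceeds population size")
    # Assumption: begin at a random position within the first interval
    start = random.Random(seed).randrange(k)
    return population[start::k][:n]

roster = list(range(1, 224))  # hypothetical IDs for a sampling population of 223
sample = systematic_sample(roster, 20, seed=5)  # here k = 223 // 20 = 11
print(sample)
```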

• Stratified Sampling
    • Good option when the sample needs to include subgroups from a population
        • Based on gender, age, education, etc.
    • Relative size of each subgroup in the final sample must match its relative size in the population
    • Can use simple random or sequential sampling to fill each subgroup
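
One way to sketch proportional stratified sampling in Python, assuming simple random sampling within each subgroup. The function name, the `stratum_of` parameter, and the roster of 150 women and 73 men are hypothetical. Because each quota is rounded, the total can differ slightly from the requested size for other inputs.

```python
import random
from collections import Counter, defaultdict

def stratified_sample(population, stratum_of, n, seed=None):
    """Fill each subgroup in proportion to its share of the population."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for member in population:
        strata[stratum_of(member)].append(member)
    sample = []
    for members in strata.values():
        # Quota = requested sample size times the subgroup's population share
        quota = round(n * len(members) / len(population))
        sample.extend(rng.sample(members, quota))
    return sample

# Hypothetical class: 150 women ("w") and 73 men ("m")
roster = [("w", i) for i in range(150)] + [("m", i) for i in range(73)]
sample = stratified_sample(roster, stratum_of=lambda person: person[0], n=30, seed=2)
counts = Counter(person[0] for person in sample)
print(counts["w"], counts["m"])
# prints: 20 10
```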

• Cluster Sampling
    • Good option when participants are already in groups that cannot be easily separated
        • e.g., study of coaching's impact on different sports teams
    • Instead of randomly selecting team members, you randomly select teams
    • If you need certain subgroup representation, this may limit your choice of teams
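
A minimal cluster-sampling sketch in Python: whole teams are drawn at random and every member of a chosen team is kept. The team names, member names, and the function `cluster_sample` are invented for illustration.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly select whole clusters (e.g., teams) and keep all of their members."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: clusters[name] for name in chosen}

# Hypothetical teams and members
teams = {
    "soccer": ["Ana", "Ben", "Cy"],
    "tennis": ["Dee", "Eli"],
    "rowing": ["Fay", "Gus", "Hal", "Ivy"],
    "track": ["Jo", "Kai", "Lee"],
}
selected = cluster_sample(teams, 2, seed=11)
print(selected)
```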

• Nonprobability Sampling
    • Sampling based on some factor other than probability
    • May be more convenient
    • May not be as representative
    • Can't establish probabilities associated with sample membership
    • Can still be useful if treated with caution

• Convenience Sampling
    • "Person on the street" approach
    • Sampling from easy-to-find population members (a special subset)
    • Sample determined in part by the researcher's sampling method
        • Not by probability
    • Can bias/distort results
    • Sometimes the only option

• Snowball Sampling
    • Good for cohort studies or when trying to reach a dispersed population
    • Using one cohort member to find others, and so on...
    • Pros: Good for research on populations that are difficult to reach (e.g., homeless)
    • Cons: No guarantee of a representative sample

• Central Limit Theorem
    • Refers to distribution of characteristics within the probability samples
    • As n (sample size) increases, the shape of the sampling distribution of means will approach a normal distribution
    • $\mu_M = \mu$ (mean of sample means = population mean)
    • $\sigma_M = \sigma / \sqrt{n}$ (SEM)

• CLT Sampling Distribution Shape
    • Figure 7.4
    • Note how $\mu_M$ becomes closer to $\mu$ as N increases
    • $\mu_M$ = mean of means = (sum of all sample means)/(number of samples)
    • M = unbiased estimate of $\mu$
    • $\sigma_M$ = std. dev. of the sampling distribution of M
    • As n increases, the distribution of sample means will cluster closer to $\mu$, giving a more accurate estimate
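
A small simulation in the spirit of this slide, using 5000 sample means at sample sizes 2, 20, and 40. The skewed population (exponential), its size, and the seed are assumptions made for the sketch, not the data behind the textbook figure.

```python
import random
import statistics

rng = random.Random(7)  # fixed seed, illustrative only

# Hypothetical skewed population, so any normality comes from the CLT
population = [rng.expovariate(1.0) for _ in range(20_000)]
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)

def sampling_distribution(n, n_samples=5000):
    """Draw n_samples samples of size n (with replacement); return their means."""
    return [statistics.fmean(rng.choices(population, k=n)) for _ in range(n_samples)]

for n in (2, 20, 40):
    means = sampling_distribution(n)
    # Compare mean of means with mu, and SD of means with sigma / sqrt(n)
    print(n,
          round(statistics.fmean(means), 3), round(mu, 3),
          round(statistics.pstdev(means), 3), round(sigma / n ** 0.5, 3))
```

In each row the mean of the sample means should sit close to the population mean, and the spread of the sample means should shrink roughly as sigma divided by the square root of n.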

• CLT
    • If we use probability sampling, M = unbiased estimate of $\mu$
    • M becomes a better estimate of $\mu$ when n increases
    • We can determine the probability of obtaining various M

• Standard Error of the Mean
    • Represents uncertainty of how well M represents $\mu$
    • SEM = SD of sampling distribution of means: $\sigma_M = \sigma / \sqrt{n}$ (n = sample size)
    • http://www.miniwebtool.com/standard-error-calculator/
    • SEM is affected by:
        • $\sigma$: as this decreases, SEM decreases
        • n: as this increases, SEM decreases ($1/\sqrt{n}$)
    • M is best estimate of $\mu$ when SEM is low
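
A short Python sketch of the SEM calculation. The scores and the value sigma = 10 are made up; `sem_from_scores` substitutes the sample SD for the unknown population SD, which is a common estimate but an assumption beyond what the slide states.

```python
import math
import statistics

def sem_from_sigma(sigma, n):
    """SEM when the population SD is known: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

def sem_from_scores(scores):
    """Estimated SEM from one sample: sample SD / sqrt(n)."""
    return statistics.stdev(scores) / math.sqrt(len(scores))

# Quadrupling n only halves the SEM (diminishing returns)
for n in (4, 16, 64):
    print(n, sem_from_sigma(10, n))
# prints: 4 5.0, then 16 2.5, then 64 1.25

scores = [4, 8, 6, 5, 7]  # hypothetical sample
print(round(sem_from_scores(scores), 3))
# prints: 0.707
```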

• Applying CLT
    • Reliability of a sample mean (M)
        • Use SEM to calculate confidence intervals around M (see Fig 7.4, p 212)
        • There will be variability among sample M, but a CI can help you determine the expected range
    • Adequacy of a sample size (n)

• Confidence Intervals
    • In a normal distribution, 68% of M fall within 1 SEM of $\mu$, 95% within 1.96 SEM of $\mu$, and 99% within 2.58 SEM of $\mu$
    • Can use CI to predict other M
    • 95% CI = 95% of future sample M should fall within this range
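
The multipliers on this slide (1, 1.96, 2.58) can be turned into an interval with a few lines of Python. The mean of 100, SD of 15, and n of 36 are made-up values, and the function name is hypothetical.

```python
import math

# SEM multipliers from the slide, keyed by confidence level (%)
Z = {68: 1.0, 95: 1.96, 99: 2.58}

def confidence_interval(mean, sd, n, level=95):
    """Return (lower, upper) limits: mean plus or minus z * SEM."""
    sem = sd / math.sqrt(n)
    half_width = Z[level] * sem
    return mean - half_width, mean + half_width

# Hypothetical values: SEM = 15 / sqrt(36) = 2.5, half-width = 1.96 * 2.5 = 4.9
low, high = confidence_interval(mean=100, sd=15, n=36)
print(round(low, 1), round(high, 1))
# prints: 95.1 104.9
```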

• Sources of Bias and Error
    • Bias: nonrandom, systematic factors that may make M differ from $\mu$
        • Could be controlled
    • Error: random events that have the same effect, but cannot be controlled
    • Figure 7.7 is a good illustration
        • Ideally, = , but not in these examples
        • Possible nonsampling biases at work

• Bias and Error
    • If the sampling is random, then even if there is a nonsampling bias present, M =
    • Sampling bias: systematic selection bias while sampling
    • Total error = M - $\mu$
        • Sum of effects from nonsampling bias, sampling bias, and sampling error
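
To make the distinction between random error and systematic bias concrete, here is a hedged simulation. The population of scores, the biased sampling frame (which drops the lowest-scoring quarter), and all names are assumptions invented for the sketch; total error is taken as the sample mean minus the population mean.

```python
import random
import statistics

rng = random.Random(3)  # fixed seed, illustrative only

# Hypothetical population of scores
population = [rng.gauss(50, 10) for _ in range(10_000)]
mu = statistics.fmean(population)

# Biased frame: systematically excludes the lowest-scoring quarter
biased_frame = sorted(population)[len(population) // 4:]

def mean_total_error(frame, n=25, n_samples=2000):
    """Average of (M - mu) over many samples of size n drawn from the frame."""
    errors = [statistics.fmean(rng.sample(frame, n)) - mu for _ in range(n_samples)]
    return statistics.fmean(errors)

# Random sampling from the full population: errors average out near zero
print(round(mean_total_error(population), 2))
# Sampling from the biased frame: errors stay above zero on average
print(round(mean_total_error(biased_frame), 2))
```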

• What is Next? (instructor to provide details)

* Good = scientific, objective, representative

* This is how we can achieve representativeness in our samples

* The following methods are not necessarily used by themselves; sometimes a combination of them is necessary to adequately sample with respect to the resources you have for a study

* Consider demonstrating these techniques using the class roster as the sampling frame or population list, depending on your perspective

* Yes, random numbers from Excel are pseudorandom, but they are usually good enough for their common uses in research and sampling

* Above all else, you are still trying to approximate the population by random selection of teams; the members therein are expected to be mixed as well

* Not necessarily nonrepresentative; you just can't rely on probability theory for support

* Basis for our use of samples to describe populations: the mean of the sampling distribution of sample means will equal the mean of the population, and the SD of the sampling distribution of means will equal the SD of the population divided by the square root of the sample size

* Remember: n = number of observations in a single sample; N = number of samples drawn from the population

* 5000 sample means for each sampling distribution, for sample sizes 2, 20, and 40. Note how with larger sample sizes, the shape of the sampling distribution approaches normality.

* To decrease $\sigma$, consider tightening up your definition of the population under study

* Optimal, NOT maximum, n is the goal; diminishing returns is the rule here (more to be discussed in later chapters)

* Using the online SEM calculator, notice how the SEM decreases as sample size increases (the more scores you enter)

* The how-to is discussed in this chapter of the text

* If an observed M falls outside this range, then you can be fairly confident that this is not due to chance, but rather to some real effect