Math 1040 Skittles Project - Part I and Worksheet Jason Morton

Part I. For your own single 2.17-ounce bag of Skittles, record the numbers in the table below.

Color     Number of candies
Red       16
Orange    9
Yellow    7
Green     15
Purple    15
Total     62

Using the data compiled from the entire class, record the following information:

The total number of candies in the sample = 1252

Color     Number of candies   Proportion
Red       251                 0.200
Orange    238                 0.190
Yellow    250                 0.200
Green     249                 0.199
Purple    264                 0.211

Throughout this entire project, use decimals rounded to three places for all of your proportions. Do not use percents.

The total number of candies in your own single 2.17-ounce bag of Skittles = 62

The total number of bags in the sample collected by the entire class = 21

The total number of candies in the sample collected by the entire class = 1252

For the entire sample:

x̄ = 59.6 (the mean number of candies per bag, rounded to one decimal place)

Method: add the candy totals for all bags together and divide by the number of bags (1252 / 21).

s = 2.75 (the standard deviation of the number of candies per bag, rounded to two decimal places)

Method:
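The Method line for s is left blank in the worksheet. The sketch below shows one way the reported values could be reproduced. It assumes the usual sample formulas (mean = sum / n, and a standard deviation with an n - 1 divisor) and uses the 21 per-bag counts listed under the 5-number summary. The variable names are illustrative and are not part of the original worksheet.

```python
import math

# Candy counts for the 21 bags, as listed in the worksheet's sorted data.
counts = [51, 57, 58, 58, 58, 58, 58, 59, 59, 59, 60,
          60, 61, 61, 61, 61, 62, 62, 62, 63, 64]

n = len(counts)
mean = sum(counts) / n

# Sample variance: squared deviations from the mean, divided by n - 1
# (assumed here because the 21 bags are a sample, not the whole population).
variance = sum((x - mean) ** 2 for x in counts) / (n - 1)
s = math.sqrt(variance)

print(n, sum(counts))   # 21 1252
print(round(mean, 1))   # 59.6
print(round(s, 2))      # 2.75
```

Under these assumptions the rounded results agree with the values entered above (59.6 and 2.75).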

5-number summary (rounded to one decimal place where necessary):

51, 58, 60, 61, 64 (min, Q1, Q2 (median), Q3, max)

Method: sort the candy counts for the bags from lowest to highest and number them sequentially.

Position   1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21
Count     51 57 58 58 58 58 58 59 59 59 60 60 61 61 61 61 62 62 62 63 64

Minimum = value at position 1 = 51
Q1: (25/100) x 21 = 5.25, round up to position 6 = 58
Q2 (median): (50/100) x 21 = 10.5, round up to position 11 = 60
Q3: (75/100) x 21 = 15.75, round up to position 16 = 61
Maximum = value at position 21 = 64
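The locator method used above can be sketched in a few lines of Python. This is only an illustration of the rule as the worksheet applies it (compute (k/100) x n and round up to the next position). Averaging two neighboring values when the locator is a whole number is an assumption taken from the common textbook version of the rule, and it is not exercised by this data. The function name `percentile_value` is hypothetical.

```python
# Sorted candy counts for the 21 bags, as listed in the worksheet.
counts = [51, 57, 58, 58, 58, 58, 58, 59, 59, 59, 60,
          60, 61, 61, 61, 61, 62, 62, 62, 63, 64]

def percentile_value(sorted_values, k):
    """Return the k-th percentile using the locator L = (k/100) * n."""
    n = len(sorted_values)
    whole, remainder = divmod(k * n, 100)
    if remainder == 0:
        # Whole-number locator: average positions L and L + 1 (assumed rule).
        return (sorted_values[whole - 1] + sorted_values[whole]) / 2
    # Otherwise round the locator up; position whole + 1 is list index `whole`.
    return sorted_values[whole]

five_number = (
    counts[0],
    percentile_value(counts, 25),
    percentile_value(counts, 50),
    percentile_value(counts, 75),
    counts[-1],
)
print(five_number)  # (51, 58, 60, 61, 64)
```

Statistical software often uses other quartile conventions, so its results may differ slightly from this hand method.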


Fill in the appropriate values on this page and keep it handy as you do your calculations.

Quick Reference for Confidence Intervals

For the interval estimate of the proportion of purple candies:

n = 1252   x = 264   p̂ = 0.211   α = 0.05

For the interval estimate of the mean number of candies in a bag:

n = 21   x̄ = 59.6   α = 0.01   s = 2.75

For the interval estimate of the standard deviation of the number of candies in a bag:

n = 21   s = 2.75   α = 0.01   χ²_R = 37.566   χ²_L = 8.260

Quick Reference for Hypothesis Tests

For testing the claim that 20% of Skittles are green:

n = 1252   x = 249   p̂ = 0.199   α = 0.01

H0: p = 0.20   H1: p ≠ 0.20

For testing the claim that the mean number of Skittles in a 2.17-oz. bag is 55:

n = 21   x̄ = 59.6   α = 0.05

H0: μ = 55   H1: μ ≠ 55


Math 1040 Skittles Term Project Part II

Introduction:

The goal of the project was to apply statistical methods learned in Math 1040 to better understand a real-world situation, namely the variability in packaging of a commercial product, Skittles candy. Skittles can be purchased in single 2.17-ounce bags. Each member of the class (21 of us) purchased one bag. Each bag contained a variable number of red, orange, yellow, green and purple candies. We each counted the number of each color of candy in our bag and provided these numbers to the instructor, who compiled a spreadsheet of the individual and combined data. Each of us was then to analyze the combined data for the class (21 bags) and compare it to our own bag, to learn about the variability in packaging from bag to bag and how this evens out when a large number of bags of Skittles are considered together. My initial hypothesis was that there would be very little variability between bags, but that proved not to be the case. There appeared to be quite a lot of variability both in the number of candies per bag and in the number of each color of candy. However, when all 21 bags were considered together, these differences seemed less impressive.

Categorical Data: Colors

The proportion of each color represented in the overall sample gathered by the class (21 bags) was first determined. This was calculated by dividing the total number of candies of each color by the total number of candies for the entire class (1252). As shown in the following pie and Pareto charts, these proportions ranged from 0.190 (19.0% of the total) for orange to 0.211 (21.1% of the total) for purple. Visually, the differences between the bars and between the sizes of the pie sections in these charts seem small, suggesting that the number of candies of each color was similar.
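The calculation described above (each color total divided by the class total, rounded to three places) can be sketched as follows. The counts are the class totals recorded in Part I. The dictionary and variable names are illustrative only.

```python
# Class totals for each color, from the Part I worksheet.
class_counts = {
    "red": 251,
    "orange": 238,
    "yellow": 250,
    "green": 249,
    "purple": 264,
}

total = sum(class_counts.values())

# Proportion of each color, rounded to three decimal places as the project requires.
proportions = {color: round(count / total, 3)
               for color, count in class_counts.items()}

for color, p in proportions.items():
    print(f"{color:<7} {p:.3f}")

smallest = min(proportions, key=proportions.get)
largest = max(proportions, key=proportions.get)
print(total, smallest, largest)  # 1252 orange purple
```

The printed proportions agree with the worksheet: 0.200, 0.190, 0.200, 0.199 and 0.211.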


The similarity between the proportions of each color of candy in the overall data (from 21 bags) surprised me, because the numbers and proportions of each color in my own bag were quite different from each other and from the class means, as shown in the following table and chart. On closer inspection, however, the number of candies of each color in my bag, as well as the total number of candies, was within one standard deviation of the class mean, with the exception of the red and yellow candies, which were slightly more than one standard deviation above and below the class mean, respectively. Whether these exceptions are statistically significant is not clear.

Color of candy   Number of candies           Proportion of total
                 Class mean (s)    My bag    Class mean   My bag
Red              11.952 (3.008)    16        0.200        0.258
Orange           11.333 (3.039)    9         0.190        0.145
Yellow           11.905 (3.520)    7         0.200        0.113
Green            11.857 (3.395)    15        0.199        0.242
Purple           12.571 (2.993)    15        0.211        0.242
Total            59.619 (2.747)    62
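The comparison between my bag and the class can be checked directly from the table. The sketch below assumes that "within one standard deviation" means |my count - class mean| <= s, and uses the class means and standard deviations exactly as tabulated. The names are illustrative.

```python
# (class mean, class standard deviation) per bag, from the table above.
class_stats = {
    "red": (11.952, 3.008),
    "orange": (11.333, 3.039),
    "yellow": (11.905, 3.520),
    "green": (11.857, 3.395),
    "purple": (12.571, 2.993),
    "total": (59.619, 2.747),
}

# Counts from my own bag.
my_bag = {"red": 16, "orange": 9, "yellow": 7,
          "green": 15, "purple": 15, "total": 62}

# Flag every row where my count is more than one standard deviation from the mean.
outside = [name for name, (mean, s) in class_stats.items()
           if abs(my_bag[name] - mean) > s]

print(outside)  # ['red', 'yellow']
```

This only describes where the counts fall relative to the class spread. It is not a significance test.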


Quantitative Data (Numbers of candies):

An assessment of the number of candies in each bag was made. Although each bag (supposedly) weighed exactly the same amount (2.17 ounces), the number of candies differed from bag to bag, although these differences were small. A total of 1252 candies represented the entire sample from the class (21 bags), for a mean of 59.6 candies per bag. The standard deviation of the number of candies per bag in the overall sample was quite small (2.75). The number of candies in my own bag was 62, which was within one standard deviation of the mean for the overall sample. The frequency distribution of the number of Skittles per bag was roughly normal in shape between 56 and 64 candies per bag, with a single outlier below that range, as shown in the table and chart that follow.

Frequency distribution

# of Skittles per bag   Frequency
50-52                   1
53-55                   0
56-58                   6
59-61                   9
62-64                   5
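The frequency table can be rebuilt from the 21 per-bag counts recorded in Part I. The sketch below uses the same class limits as the table and treats both limits as inclusive. The names are illustrative.

```python
# Candy counts for the 21 bags, from the Part I worksheet.
counts = [51, 57, 58, 58, 58, 58, 58, 59, 59, 59, 60,
          60, 61, 61, 61, 61, 62, 62, 62, 63, 64]

# Class limits (inclusive), matching the frequency table.
classes = [(50, 52), (53, 55), (56, 58), (59, 61), (62, 64)]

# Count how many bags fall into each class.
frequency = {f"{low}-{high}": sum(low <= x <= high for x in counts)
             for low, high in classes}

for label, freq in frequency.items():
    print(f"{label:<6} {freq}")
```

The frequencies come out as 1, 0, 6, 9 and 5, and they add up to the 21 bags in the sample.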


As shown in the box plot below, this outlier was the minimum, at 51 candies in a bag. The 5-number summary for the data was 51, 58, 60, 61 and 64. The first (Q1), second (Q2) and third (Q3) quartiles of the number of Skittles per bag were tightly clustered, at 58, 60 and 61. This shows that the differences in the number of candies per bag were fairly small.
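One common way to decide whether a value counts as an outlier in a box plot is the 1.5 x IQR rule. The report does not say which rule was used to draw its box plot, so the sketch below is an assumption. It applies that rule to the quartiles from the 5-number summary.

```python
# Candy counts for the 21 bags, from the Part I worksheet.
counts = [51, 57, 58, 58, 58, 58, 58, 59, 59, 59, 60,
          60, 61, 61, 61, 61, 62, 62, 62, 63, 64]

# Quartiles taken from the 5-number summary in the report.
q1, q3 = 58, 61
iqr = q3 - q1

# Fences under the 1.5 x IQR rule (assumed, not stated in the report).
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = [x for x in counts if x < lower_fence or x > upper_fence]

print(lower_fence, upper_fence)  # 53.5 65.5
print(outliers)                  # [51]
```

Under this rule, the bag with 51 candies is the only value outside the fences, which agrees with the single outlier described above.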

Summary:

The differences in the number of candies per bag, and in the numbers of each color in the overall sample from the class, seemed quite small. At first glance, the numbers of each color of candy in my bag of Skittles seemed quite different from those of the class as a whole. However, the number of candies of each color in my bag fell within one standard deviation of the class mean for that color (with the exception of red and yellow, which were slightly more than one standard deviation from the mean). There were some differences in the number of candies per bag, although the standard deviation for the overall sample was fairly small as well. Because each bag is sold as 2.17 ounces, it is possible that differences in the number of candies per bag are due to slight differences in the weight or size of the candies, if the bags are packaged by weight. Alternatively, 2.17 ounces could be an average weight, and the actual weight of each bag could be slightly different.

Reflection

Quantitative (numerical) data consist of numbers representing counts or measurements. The number of Skittles in one bag is an example. An individual's weight and age are also quantitative data. Using appropriate units of measurement, such as dollars, hours, feet and meters, is very important. Quantitative data can be either discrete or continuous. Categorical (qualitative) data consist of names or labels that are not numbers representing counts or measurements. The colors of Skittles are categorical data. Other examples of categorical data include gender, political party affiliation, social security numbers, and sports jersey numbers.

Graphs are commonly used in statistical analysis because they aid in the understanding and interpretation of data. Quantitative data are displayed in scatterplots, time-series plots, dotplots and stemplots. Categorical data are displayed in bar graphs, Pareto charts and pie charts. We used a Pareto chart and a pie chart in this project to help us describe and make sense of the colors of the Skittles (categorical data).

A histogram (graph of a frequency distribution) consists of a graph that is easier to interpret than a table of nu