K McMullen 2012
Further Mathematics: Displaying Bivariate Data
Bivariate data: data with two variables (two quantities or qualities that change)
Generally, one variable depends on the other
The dependent variable depends on the independent variable
E.g. height and weight
E.g. hours studied (independent) and test result (dependent)
The distinction between dependent and independent variables matters most when plotting scatterplots
Back-to-back stem plots: used to display the relationship between a numerical variable and a two-valued categorical variable
They let you compare the two data sets using summary statistics such as measures of centre and measures of spread
E.g. comparing Further Maths study scores (numerical variable) by gender (male or female: a two-valued categorical variable)
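A back-to-back stem plot can be sketched in plain text as follows. This is only an illustration: the scores are made up, the function name `back_to_back_stem_plot` is hypothetical, and it assumes two-digit whole-number values so that the tens digit is the stem and the units digit is the leaf.

```python
def back_to_back_stem_plot(left, right):
    """Return the rows of a text back-to-back stem plot.

    Assumes non-negative whole numbers; stem = tens, leaf = units.
    """
    combined = left + right
    stems = range(min(combined) // 10, max(combined) // 10 + 1)
    rows = []
    for stem in stems:
        # Left leaves are written outwards from the stem, so sort them descending.
        left_leaves = sorted((v % 10 for v in left if v // 10 == stem), reverse=True)
        right_leaves = sorted(v % 10 for v in right if v // 10 == stem)
        left_text = " ".join(str(leaf) for leaf in left_leaves)
        right_text = " ".join(str(leaf) for leaf in right_leaves)
        rows.append(f"{left_text:>12} | {stem} | {right_text}")
    return rows

# Hypothetical study scores for the two categories
group_one = [31, 35, 42]
group_two = [33, 47, 48]

rows = back_to_back_stem_plot(group_one, group_two)
for row in rows:
    print(row)
```

The shared stem column in the middle is what makes the two groups directly comparable for centre and spread.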
Parallel box plots: used to display the relationship between a numerical variable and a categorical variable with two or more categories
They let you compare sets of data using summary statistics such as measures of centre and measures of spread; also think of the five-number summary
Remember that parallel box plots must be drawn on the same axis (you can also do this on CAS)
E.g. the results achieved by 4 different Further Maths classes
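Each box in a parallel box plot is drawn from a five-number summary (minimum, Q1, median, Q3, maximum). A minimal sketch of that calculation is below. The class results are made up, and the quartile rule used here (the median of each half, leaving out the middle value when the count is odd) is an assumption; your CAS or textbook may use a slightly different rule.

```python
from statistics import median

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    # For an odd count, skip the middle value when forming the upper half.
    upper = data[half + 1:] if n % 2 else data[half:]
    return (data[0], median(lower), median(data), median(upper), data[-1])

# Hypothetical results for two classes (any number of classes works)
classes = {
    "Class A": [22, 25, 28, 30, 31, 33, 35, 38, 41],
    "Class B": [18, 24, 27, 29, 30, 32, 36, 40],
}

summaries = {name: five_number_summary(results) for name, results in classes.items()}
for name, summary in summaries.items():
    print(name, summary)
```

Comparing the medians compares centre; comparing the ranges and interquartile ranges (Q3 minus Q1) compares spread.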
Two-way frequency tables: used to display the relationship between two categorical variables; they can be represented graphically as a segmented bar chart
Remember that it is easier to compare data sets if you work with percentages instead of totals
In a two-way frequency table, place the independent variable along the top row (as the column headings) and the dependent variable down the left column; once the counts are converted to column percentages, each column must add to 100%
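A minimal sketch of turning raw counts into column percentages is shown below. The survey data and the function name `column_percentages` are made up for illustration; year level plays the role of the independent variable (columns) and preferred subject the dependent variable (rows).

```python
from collections import Counter

# Hypothetical responses as (independent variable, dependent variable) pairs
responses = [
    ("Year 11", "Maths"), ("Year 11", "Maths"), ("Year 11", "English"),
    ("Year 11", "Science"), ("Year 12", "Maths"), ("Year 12", "English"),
    ("Year 12", "English"), ("Year 12", "English"), ("Year 12", "Science"),
]

def column_percentages(pairs):
    """Return {column: {row: percentage}}, where each column sums to 100."""
    counts = Counter(pairs)
    column_totals = Counter(column for column, _ in pairs)
    rows = sorted({row for _, row in pairs})
    return {
        column: {row: 100 * counts[(column, row)] / total for row in rows}
        for column, total in sorted(column_totals.items())
    }

table = column_percentages(responses)
for column, cells in table.items():
    print(column, {row: round(pct, 1) for row, pct in cells.items()})
```

Because each column is divided by its own total, the two year levels can be compared fairly even though they have different numbers of students.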
Scatterplots: used to display the relationship (correlation) between two numerical variables
The dependent variable is displayed on the vertical axis
The independent variable is displayed on the horizontal axis
The relationship between variables on a scatterplot can be described in terms of:
Strength (strong, moderate, weak)
Direction (positive, negative)
Form (linear, non-linear)
Scatterplots (continued)
Pearson’s product-moment correlation coefficient (r) measures the strength and direction of the linear relationship shown in a scatterplot
The values of r range from -1 (perfect negative) to 1 (perfect positive)
You can approximate the value of r (see the formula on p. 101), but you can also calculate it using CAS (more reliable)
To interpret r, refer to and copy the table on page 100 of your textbook
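For reference, r can also be computed directly from the data. The sketch below uses the standard definition of Pearson's r (the sum of the products of the deviations, divided by the square root of the product of the sums of squared deviations); the formula in your textbook may be written in a different but equivalent form. The data and the function name `pearson_r` are made up for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's correlation coefficient for paired lists xs and ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    sum_yy = sum((y - mean_y) ** 2 for y in ys)
    return sum_xy / sqrt(sum_xx * sum_yy)

# Hypothetical data: hours studied (independent) and test result (dependent)
hours = [1, 2, 3, 4, 5]
results = [52, 60, 61, 70, 77]

r = pearson_r(hours, results)
print(round(r, 3))
```

A value close to 1 here matches what the scatterplot would show: a strong, positive, linear relationship.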
Scatterplots (continued)
• The coefficient of determination (r²): this indicates the degree to which one variable can be predicted from another variable, provided that the variables have a linear relationship
• The coefficient of determination is calculated by squaring the correlation coefficient (r)
• When commenting using r², always convert your value into a percentage
• Comment:
“The coefficient of determination tells us that (r² × 100)% of the variation in the dependent variable is explained by the variation in the independent variable”
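The steps above (square r, convert to a percentage, write the comment) can be sketched as follows. The function name `describe_r_squared` and the example values are made up for illustration, and the rounding to one decimal place is an assumption.

```python
def describe_r_squared(r, dependent, independent):
    """Square r, convert to a percentage, and fill in the standard comment."""
    r_squared = r ** 2
    percentage = round(100 * r_squared, 1)
    return (
        f"The coefficient of determination tells us that {percentage}% of the "
        f"variation in {dependent} is explained by the variation in {independent}."
    )

# Hypothetical example: r = 0.9 for hours studied and test result
comment = describe_r_squared(0.9, "test result", "hours studied")
print(comment)
```

Note that a negative r gives the same r² as the matching positive r, so r² describes predictability only and says nothing about direction.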
• You must remember the difference between correlation and causation
• To interpret your scatterplot, stick to the variables given and don’t make any unnecessary assumptions
• If your scatterplot is negative then: “As the IV (independent variable) increases, the DV (dependent variable) decreases”
• If your scatterplot is positive then: “As the IV increases, the DV increases”
Example: Age and arm span of teenage boys
Comment: As the age of teenage boys increases, their arm span also increases
Assumption: As teenage boys get taller, their arm span increases
They obviously do get taller, but height is not one of the variables, so you should not comment on it
E.g. the number of cigarettes smoked and fitness level
Comment: As the number of cigarettes smoked increases, the fitness level of participants decreases
Assumption: Smoking cigarettes causes fitness levels to decrease
You must remember that other factors can account for low levels of fitness, such as lack of exercise or weight
E.g. people catching public transport and the sales of designer handbags
Comment: As the number of people catching public transport increases, the number of people buying designer handbags decreases
Assumption: A high proportion of people catching public transport has caused a decline in the sales of designer handbags
These two variables are clearly not causally related, even though they may be correlated. Always question the validity of statistics: what else could have caused public transport use to increase and designer handbag sales to decrease?
Work through Ch 4 questions and chapter review