The Practice of Statistics Unit 4/ Chapter 3 Examining Relationships

Examining Relationships

When might we want to examine a relationship between two variables?

• Height & Heart Attacks
• Weight & Blood Pressure
• Hours studying & test scores
• What else?

In this chapter we will deal with relationships between quantitative variables; the next chapter will deal more with categorical variables.

The response variable is our dependent variable (traditionally y)

The explanatory variable is our independent variable (traditionally x)

Explanatory or Response?

Which is the explanatory and which is the response variable?

Jim wants to know how the mean 2005 SAT Math and Verbal scores in the 50 states are related to each other. He doesn't think that either score explains or causes the other.

Julie looks at some data. She asks, “Can I predict a state's mean 2005 SAT Math score if I know its mean 2005 SAT Verbal score?”

Explanatory and Response Variables

When we deal with cause and effect, there is always a definite response variable and explanatory variable.

But calling one variable response and one variable explanatory doesn't necessarily mean that one causes change in the other.

When analyzing several-variable data, the same principles apply…

Data Analysis Toolbox

To answer a statistical question of interest involving one or more data sets, proceed as follows.

• DATA: Organize and examine the data. Answer the key questions.

• GRAPHS: Construct appropriate graphical displays.

• NUMERICAL SUMMARIES: Calculate relevant summary statistics.

• INTERPRETATION: Look for overall patterns and deviations. When the overall pattern is regular, use a mathematical model to describe it.

W5HW

Scatterplots

Let's say we wanted to examine the relationship between the percent of a state's high school seniors who took the SAT exam in 2005 and the mean SAT Math score in each state that year. A scatterplot is an effective way to graphically represent our data.

But first, what is the explanatory variable and what is the response variable in this situation?

Scatterplots

Once we decide on the response and explanatory variables, we can create a scatterplot.

[Figure: scatterplot with axes labeled "explanatory variable" and "response variable"]

Scatterplot Tips

Plot the explanatory variable on the horizontal axis. If there is no explanatory-response distinction, either variable can go on the horizontal axis.

Label both axes!

Scale the horizontal and vertical axes. The intervals must be uniform.

If you are given a grid, try to adopt a scale so that your plot uses the whole grid. Make your plot large enough so that the details can be easily seen.

Interpreting Scatterplots

Direction?

Form?

Strength?

Outliers?

Adding Categorical Data

The mean SAT Math scores and percent of high school seniors who take the test, by state, with the southern states highlighted.

Is the South different?

Making scatterplots on a calculator

See page 183

Measuring Linear Association: Correlation

Linear relations are important because, when we discuss the relationship between two quantitative variables, a straight line is a simple pattern that is quite common.

Measuring Linear Association: Correlation

A strong linear relationship has points that lie close to a straight line.

A weak linear relationship has points that are widely scattered about a line.

Our eyes are not good measures of how strong a linear relationship is...

A numerical measure along with a graph gives the linear association an exact value.
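As a rough illustration of how such a numerical measure can be computed, the sketch below calculates the correlation r as the average product of standardized values, dividing by n - 1. The function name `correlation` and the data are hypothetical, invented for this example; they are not taken from the slides.

```python
from math import sqrt

def correlation(x, y):
    """Correlation r, computed from standardized x- and y-values."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sample standard deviations (divide by n - 1)
    s_x = sqrt(sum((xi - mean_x) ** 2 for xi in x) / (n - 1))
    s_y = sqrt(sum((yi - mean_y) ** 2 for yi in y) / (n - 1))
    # Sum the products of standardized values, then divide by n - 1
    total = sum(((xi - mean_x) / s_x) * ((yi - mean_y) / s_y)
                for xi, yi in zip(x, y))
    return total / (n - 1)

# Hypothetical data (invented for illustration): hours studied and test score
hours = [1, 2, 3, 4, 5, 6]
scores = [62, 70, 68, 80, 85, 91]

r = correlation(hours, scores)
print(round(r, 3))
```

For these made-up points, which rise fairly steadily together, r comes out close to 1, matching what a scatterplot of the same data would suggest.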

Correlation on the calculator.

Facts about Correlation

Correlation makes no distinction between explanatory and response variables.

r doesn't change when we change the units of measurement of x, y, or both.

r is positive when the association is positive and is negative when the association is negative.

The correlation r is always a number between -1 and 1. Values of r near 0 indicate a very weak linear relationship. The strength of the linear relationship increases as r moves away from 0 toward either -1 or 1.

Patterns closer to a straight line have correlations closer to 1 or -1
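These facts can be checked numerically. The sketch below uses a small helper, `corr`, and made-up height and weight values; both are assumptions introduced only for illustration. It compares r after swapping the variables, after converting units, and after reversing the direction of the association.

```python
from math import sqrt

def corr(x, y):
    """Correlation r between two equal-length lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical data (invented): height in inches, weight in pounds
height_in = [60, 62, 65, 68, 70, 73]
weight_lb = [115, 120, 140, 155, 160, 190]

r = corr(height_in, weight_lb)

# Fact 1: swapping explanatory and response leaves r unchanged
r_swapped = corr(weight_lb, height_in)

# Fact 2: changing units (inches to cm, pounds to kg) leaves r unchanged
height_cm = [h * 2.54 for h in height_in]
weight_kg = [w * 0.4536 for w in weight_lb]
r_metric = corr(height_cm, weight_kg)

# Fact 3: reversing the direction of the association flips the sign of r
r_negated = corr(height_in, [-w for w in weight_lb])

print(r, r_swapped, r_metric, r_negated)
```

The first three printed values agree up to rounding error, and the last has the same size with the opposite sign.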

Cautionary Notes about Correlation

Correlation requires that both variables be quantitative.

Correlation does not describe curved relationships, no matter how strong they are.

Like the mean and standard deviation, the correlation is not resistant; r is strongly affected by a few outlying observations.

Correlation is not a complete summary of two-variable data. You should give the means and standard deviations of both x and y along with the correlation.
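Two of these cautions are easy to see with small made-up data sets. In the sketch below (the helper `corr` and all values are hypothetical), a perfectly curved relationship gives r = 0, and a single outlying point pulls a nearly perfect linear correlation far below 1.

```python
from math import sqrt

def corr(x, y):
    """Correlation r between two equal-length lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Curved relationship: y is completely determined by x (y = x squared),
# yet there is no linear trend, so r is 0.
x_curve = [-3, -2, -1, 0, 1, 2, 3]
y_curve = [v ** 2 for v in x_curve]
r_curved = corr(x_curve, y_curve)

# Nearly linear hypothetical data, then the same data plus one outlier
x_clean = [1, 2, 3, 4, 5, 6]
y_clean = [2.1, 3.9, 6.2, 8.0, 9.8, 12.1]
r_clean = corr(x_clean, y_clean)
r_outlier = corr(x_clean + [7], y_clean + [0.0])

print(r_curved, r_clean, r_outlier)
```

This is why a scatterplot should always accompany r: the number alone cannot distinguish "no relationship" from "strong curved relationship", and it cannot reveal that one unusual point is driving the result.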

Scoring Figure Skaters

Until a scandal at the 2002 Olympics brought change, figure skating was scored by judges on a scale from 0.0 to 6.0. The scores were often controversial. We have the scores awarded by two judges, Pierre and Elena, for many skaters. How well do they agree? We calculate that the correlation between their scores is r = 0.9. The mean of Pierre’s scores is 0.9 point lower than Elena’s mean. Do these facts contradict each other?
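One way to explore this question is to see what happens to r when one judge's scores are all shifted by a constant. The scores below are invented for illustration (the slides do not give the actual data), and `corr` is a hypothetical helper.

```python
from math import sqrt

def corr(x, y):
    """Correlation r between two equal-length lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical scores on the 0.0 to 6.0 scale (not the real judges' data)
elena = [5.2, 5.6, 4.9, 5.8, 5.4, 5.0]
pierre_same_level = [5.3, 5.5, 5.0, 5.7, 5.3, 5.1]

# Suppose Pierre ranks skaters the same way but scores 0.9 point lower
pierre = [p - 0.9 for p in pierre_same_level]

r_before = corr(elena, pierre_same_level)
r_after = corr(elena, pierre)
mean_gap = sum(elena) / len(elena) - sum(pierre) / len(pierre)

print(r_before, r_after, mean_gap)
```

In this sketch the shift changes the gap between the two means but leaves r the same, because r is computed from deviations about each judge's own mean. A high correlation says the judges tend to order the skaters similarly; it says nothing about whether one judge scores consistently lower.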
