Using SAS to Analyze Data
Nadia Akseer
MSc. Candidate
Brock University

Nadia Akseer, MSc. Candidate - SAS Group Presentations/Hamilton-User-Group...Example: Height, Weight, BMI, sex and activity level measurements are available for a group of



Agenda
a. Examine Individual Variable Distributions
   i. Continuous data: Proc Univariate - distributions, tests for normality, plots; Proc Means
   ii. Categorical data: Proc Freq - distributions
b. Examine Relationships Between Variables
   i. Continuous data: Scatter plots; Correlation - Spearman, Pearson
   ii. Categorical data: Freq tables, probabilities, Chi-square, Fisher's Exact
   iii. Continuous and Categorical data: Proc t-test

Example
Height, Weight, BMI, sex and activity level measurements are available for a group of physically active students
**Note: 5 activity questions asked ‘1=none…4=very active’ **

Example Cont'd
• What are the mean, median, mode of the BMI?
• How dispersed is the BMI data?
• Is BMI normally distributed?

Proc Univariate
• Provides information on:
  • Measures of central tendency (mean, median, mode etc.)
  • Measures of dispersion (standard deviation, range, IQR etc.)
• Allows us to visualize data (stem-leaf, normality & box plots)
• Used for a continuous variable

Proc Univariate Syntax
  Proc univariate data=bmi plot normal;
  Var bmi;
  Histogram / normal;
  Run;

Proc Univariate Output

Is the data normally distributed? If not, which way is it skewed?

Treat the variable as normally distributed if the normality test p-value>0.05 (we fail to reject the hypothesis of normality)
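One way to read the normality p-values programmatically, rather than from the printed listing, is sketched below. It assumes the data set and the variable are both named bmi, as in the Proc Univariate syntax. The output data set names normtests and normcheck are hypothetical, and the column names Test and pValue are what the TestsForNormality ODS table is expected to contain; running `ods trace on;` first will confirm the table and column names on a given installation.

```sas
/* Save the "Tests for Normality" table to a data set.               */
/* normtests is a hypothetical name chosen for this sketch.          */
ods output TestsForNormality=normtests;
proc univariate data=bmi normal;
  var bmi;
run;

/* Apply the 0.05 rule to each test in the table. */
data normcheck;
  set normtests;
  length decision $ 40;
  if pValue > 0.05 then decision = 'Fail to reject normality';
  else decision = 'Reject normality';
run;

proc print data=normcheck;
  var Test pValue decision;
run;
```

With small samples the Shapiro-Wilk row is usually the one consulted, but the histogram and normal probability plot should be checked alongside any single p-value.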

Example Cont'd
• How many individuals have complete data for height, weight and BMI?
• What is the range of data for all three variables?
• What are the means and standard deviations?

Proc Means
Used to obtain mean, standard deviation, min and max for multiple continuous variables
  Proc means data=bmi;
  Var bmi ht wt;
  Run;
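To answer the complete-data and range questions directly, Proc Means can be asked for N, NMISS and RANGE explicitly. The following is a sketch that assumes the same bmi data set with variables bmi, ht and wt; the data set name complete and the column name n_complete are made up for illustration.

```sas
/* N = non-missing count, NMISS = missing count, per variable. */
proc means data=bmi n nmiss mean std min max range;
  var bmi ht wt;
run;

/* Keep only individuals with all three measurements present. */
data complete;
  set bmi;
  if nmiss(bmi, ht, wt) = 0;   /* subsetting IF: no missing values */
run;

/* Report how many complete records there are. */
proc sql;
  select count(*) as n_complete from complete;
quit;
```

The per-variable N from Proc Means can differ from n_complete, because a person may be missing only one of the three measurements.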

Example Cont'd
• What proportion of the sample are boys?
• What proportion are girls?
• What proportion of the sample are not physically active in the first activity question?
• What are the physical activity trends in all 5 activity questions?

Proc Freq
• Looks at distribution of categorical variables
• Gives information about frequency and proportions
• Can look at multiple variables at a time
  Proc Freq data=bmi;
  Table sex active1-active5;
  Run;


Correlation
• Two variables are considered to be correlated when there is a relationship between them
• ρ (rho) a.k.a. “Correlation Coefficient (r)”
• Used to express the strength of the association between the two variables
• Has a range of values: -1 ≤ ρ ≤ 1
  • |ρ| = 1 ⇒ perfect linear relationship
  • ρ → 0 ⇒ weak linear relationship
  • |ρ| → 1 ⇒ strong linear relationship
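For reference, the sample coefficient r that estimates ρ is usually written as below. This is the standard textbook form, not something taken from the slides; the Spearman coefficient is the same formula applied to the ranks of x and y instead of the raw values.

```latex
\[
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \; \sum_{i=1}^{n} (y_i - \bar{y})^2}},
\qquad -1 \le r \le 1
\]
```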

Correlation
• Hypotheses
  • What is our H0 in correlation? ρ = 0 ⇒ There is no linear correlation
  • What is our HA in correlation? ρ ≠ 0 ⇒ There is a linear correlation

Correlation
• Procedure for determining if there is a correlation between two variables
  1. Run a scatter plot
  2. Check Assumptions - Normal distribution
  3. Run either a Pearson or a Spearman
  4. Determine if you reject/fail to reject Ho
  5. If you reject, look at correlation coefficient – How strong is the relationship?

Review
5. If H0 is rejected, determine the strength of the relationship

  ρ            Relationship
  >0.7         Strong
  0.4 – 0.7    Medium
  <0.4         Weak
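The cut-offs in the table can be applied mechanically with a user-defined format. This is only a sketch: the format name rstrength, the data set name corr_strength and the three r values are invented for illustration, and taking the absolute value first is an assumption that the table is meant to describe the magnitude of the coefficient regardless of sign.

```sas
/* Map |r| to the labels in the review table.        */
/* rstrength is a hypothetical format name.          */
proc format;
  value rstrength
    low -<  0.4  = 'Weak'
    0.4  -  0.7  = 'Medium'
    0.7 <-  high = 'Strong';
run;

data corr_strength;
  input r;
  abs_r = abs(r);                       /* strength ignores the sign */
  strength = put(abs_r, rstrength.);
  datalines;
0.85
0.55
-0.20
;
run;

proc print data=corr_strength;
run;
```

The label is only meaningful once H0 has been rejected; a large r with a non-significant p-value should not be described as a strong relationship.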

Example
• Both the Pearson Correlation and the Spearman Correlation will be used on the same example data to show the differences between the two methods

Table 9.1. Lengths and Weights of Male Bears
  x Length (in.)   53.0   67.5   72.0   72.0   73.5   68.5   73.0   37.0
  y Weight (lb)      80    344    416    348    262    360    332     34
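The table can be read into SAS with a short data step. The data set name bears is an assumption, and the variable names length and weight are chosen to match the names used in the plot and correlation code; the eight (length, weight) pairs are the values in Table 9.1.

```sas
/* bears is a hypothetical data set name. */
data bears;
  input length weight @@;   /* @@ reads several (length, weight) pairs per line */
  datalines;
53.0 80  67.5 344  72.0 416  72.0 348
73.5 262  68.5 360  73.0 332  37.0 34
;
run;

/* Check that all eight pairs were read. */
proc print data=bears;
run;
```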

Example
1. Run a Scatter Plot
  proc plot;
  plot weight*length;
  run;

General form:
  proc plot;
  plot y*x;
  title ‘….’;
  run;
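Proc Plot draws a character-based plot in the listing. Where ODS Graphics is available, Proc Sgplot gives a graphical scatter plot of the same variables. This sketch assumes the bear data are in a data set named bears (a hypothetical name) with variables length and weight; the axis labels and title text are illustrative.

```sas
proc sgplot data=bears;
  scatter x=length y=weight;       /* one point per bear */
  xaxis label='Length (in.)';
  yaxis label='Weight (lb)';
  title 'Weight versus length of male bears';
run;
title;                             /* clear the title for later steps */
```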

[Figure: scatter plot of weight against length]
Can you see an association??

Example
2. Check Assumptions
  ☑ Random sample
  ☑ Points approximately on a straight line
  ☑ Outliers examined
  ☒ Normal distribution for both variables (P-values<0.05)
[Figure: normality test output for Weight and Length]

Example
3. Decide Between a Pearson and a Spearman
• Only 3/4 assumptions were met, therefore we should proceed with a….
*Normal distribution most important*

Example
Pearson
  Proc corr;
  Var weight length;
  Run;

Example
[Figure: Proc Corr output, annotated with “Correlation coefficient (r)” and “p-value”]
Is there a linear relationship? Strength?

Example
Spearman
  Proc corr spearman;
  Var weight length;
  Run;
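Both coefficients can also be requested in a single call by listing both options on the Proc Corr statement, which makes the side-by-side comparison easier. A sketch, assuming the bear data sit in a data set named bears (a hypothetical name) with variables weight and length:

```sas
/* PEARSON and SPEARMAN together print one correlation table each. */
proc corr data=bears pearson spearman;
  var weight length;
run;
```

Naming the data set with data= also avoids relying on whichever data set happened to be created last, which is what Proc Corr uses when data= is omitted.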


Example
4. Determine the fate of H0
5. Determine the strength of the relationship
• Spearman ⇒ r=0.35929, p=0.3821 ∴ FTR (fail to reject) H0 – There is no linear relationship between the weight and length of a bear
• Pearson ⇒ r=0.897, p=0.0025 ∴ Reject H0 – There is a strong linear relationship between the weight and length of a bear

Chi-square Tests
• Chi-Square testing is generally used to test claims about categorical data consisting of frequency counts for different categories
• Uses Chi-square distribution
• Many different types of tests: e.g. Independence, Homogeneity, Goodness of fit, Fisher’s exact, McNemar’s

Example: Test of Independence
• Let's do a test of independence between sex (M or F) and BMI group (Normal, Overweight, Obese)
• H0: Sex and BMI group are independent
• HA: Sex and BMI group are not independent

SAS Syntax
  proc freq data=mydata.newbmi;
  table sex*owt / nopercent norow nocol expected chisq;
  run;

Explanation of Syntax:
• expected = prints the expected frequency of each cell, calculated under the assumption of independence
• chisq = chi-square test

What is our conclusion?
P-value>0.05 ⇒ FTR Ho

Fisher’s Exact Test
• When any expected cell count is <5, the Chi-square test is not valid
• In this case, we use Fisher’s Exact test
• Example: Association between wearing helmets and getting face injuries?

                   Helmet
  Face Injury     yes     no
  yes               2     13
  no                6     19
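The expected counts for the helmet table can be worked out by hand from the row and column totals (face injury: 15 yes, 25 no; helmet: 8 yes, 32 no; N = 40). This is ordinary arithmetic under the independence assumption, shown here as a check rather than as SAS output.

```latex
\[
E_{ij} = \frac{(\text{row } i \text{ total}) \times (\text{column } j \text{ total})}{N}
\]
\[
E_{\text{yes},\text{yes}} = \frac{15 \times 8}{40} = 3, \quad
E_{\text{yes},\text{no}} = \frac{15 \times 32}{40} = 12, \quad
E_{\text{no},\text{yes}} = \frac{25 \times 8}{40} = 5, \quad
E_{\text{no},\text{no}} = \frac{25 \times 32}{40} = 20
\]
```

One of the four cells (injured, wearing a helmet) has an expected count of 3, which is below 5, so Fisher's exact test is the appropriate choice here.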

SAS Syntax
  Data helmet;
  Input helmet $ faceinj $ count @@;
  Datalines;
  yes yes 2 no yes 13
  yes no 6 no no 19
  ;
  run;

  proc freq order=data;
  weight count;
  table faceinj*helmet / nopercent norow nocol expected;
  exact chisq;
  run;

[Figure: Proc Freq output] This tells us the Chi-square test is not valid, therefore use Fisher’s exact p-value

Using the two-sided p-value and significance level=0.05, what is our conclusion?

T-Test
• Used to compare a continuous variable between two populations or groups of a categorical variable
• Assess difference between the two means
• Assumptions (ND = normally distributed):
  1. Equal variance for both populations
  2. The sample data need to be randomly sampled
  3. The two samples are independent
  4. Small sample size (<30) if it is ND, or
  5. Larger sample size if it is not ND

Example
• Let’s examine if the systolic blood pressure is different between a normal blood pressure group (n=15) and hypertensive (n=10) group
• Ho: µNormal − µHypertensive = 0
• Ha: µNormal − µHypertensive ≠ 0

Input Data into SAS
Normal BP (mmHg) / Hypertensive BP (mmHg)
114 117 130 155 115 115 125 138 148 132 121 100 132 121 100 115 122 156 122 162 140 151 110 156 122 162 130 158

  data bp;
  input SYM $ sbp @@;
  datalines;
  . . .
  ;
  Run;

Mean SBP For Both Groups
  Proc sort;
  By sym;
  Run;

  proc means;
  Var sbp;
  By sym;
  Run;

• Assumption: Check to see if the groups are normally distributed?
  Proc univariate normal plot;
  Var sbp;
  by sym;
  Run;
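BY-group processing requires the data to be sorted by sym first, which the earlier Proc Sort step takes care of. An alternative that does not depend on sort order is the CLASS statement, sketched here on the assumption that the data set is named bp with variables sym and sbp as in the data step:

```sas
/* CLASS produces separate statistics and normality tests per group */
/* without requiring the data to be sorted beforehand.              */
proc univariate data=bp normal plot;
  class sym;
  var sbp;
run;
```

Either form yields one set of normality tests per blood pressure group; the choice mainly matters when the sort step is easy to forget.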

Normality Check
[Figure: normality test output by group]
Is the normal group ND?
Is the hypertensive group ND?

• Assumption: Are variances equal?
  • Yes -> use the pooled method (t)
  • No -> use Satterthwaite’s method (t’)
  Proc ttest;
  Class sym;
  Var sbp;
  Run;
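For orientation, the two statistics being chosen between are usually written as follows. These are the standard textbook forms rather than anything quoted from the slides; the subscripts 1 and 2 stand for the two groups, and s² is a sample variance.

```latex
% Pooled method (equal variances assumed)
\[
s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2},
\qquad
t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}},
\qquad
df = n_1 + n_2 - 2
\]

% Satterthwaite method (unequal variances)
\[
t' = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}},
\qquad
df' = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}
           {\frac{(s_1^2/n_1)^2}{n_1 - 1} + \frac{(s_2^2/n_2)^2}{n_2 - 1}}
\]
```

With group sizes of 15 and 10, the pooled test has 23 degrees of freedom, while the Satterthwaite degrees of freedom depend on the two sample variances and are generally not a whole number.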

[Figure: Proc Ttest output, annotated as follows]
• Difference between means is significant: CI does not include 0
• F-value>1 and p-value<0.05; variances not equal
• Difference between means is significant (p<0.05) so REJECT NULL

Example
• Variances are not equal (p=0.0499<0.05)
• Satterthwaite p=0.0276 <0.05
  -> Reject null
  -> Blood pressure between the normal and hypertensive groups is significantly different
*Interpret with caution since normal distribution assumption not met*

Further Readings
• Step-By-Step Basic Statistics Using SAS: Exercises
  Author: Larry Hatcher
• Data analysis using SAS for Windows: Basic
  Author: Mirka Ondrack