1
PARAMETRIC TESTS
DR DEEPIKA G
1ST YEAR PG
DEPT OF PHARMACOLOGY
2
CONTENTS:
INTRODUCTION
STATISTICAL DEFINITIONS
MEASURES OF CENTRAL TENDENCY AND
DISPERSION
DISTRIBUTION AND HYPOTHESIS
PARAMETRIC TESTS
REFERENCES
3
INTRODUCTION
• Statistics:- science of data
- study of uncertainty
• Biostatistics: statistics applied to data from medicine and the biological
sciences (statistical methods are also used in business, education, psychology,
agriculture, economics...)
• Types: Descriptive statistics
Inferential statistics
4
1. Descriptive Statistics - give an overview
of the attributes of a data set. These include
frequency histograms, measurements of central tendency
(mean, median & mode) and
dispersion (range, variance & standard
deviation)
2. Inferential Statistics - provide measures of how
well the data support a hypothesis and whether the
findings are generalizable beyond what was
tested (significance tests)
5
Data: Observations recorded during research
Types of data:
1. Nominal data: synonymous with categorical
data; observations are assigned names/categories based on
characteristics, without ranking between categories.
ex. male/female, yes/no, death/survival
6
2. Ordinal data: ordered or graded data,
expressed as scores or ranks
ex. pain graded as mild, moderate and severe
3. Interval data: there is an equal and definite interval
between two successive measurements;
it can be continuous or discrete
ex. weight expressed as 20, 21, 22, 23, 24;
the interval between 20 & 21 is the same as that between 23 & 24
7
Measures of Central Tendency:
• In a normal distribution, the mean and median are the same
• If the median and mean differ markedly, this suggests that the data are not normally distributed
• The mode is of little, if any, practical use
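The three measures can be compared on a small data set. This is a minimal sketch using Python's standard `statistics` module; the values are hypothetical (not from the slides) and deliberately skewed, so the mean is pulled away from the median.

```python
import statistics

# Hypothetical right-skewed data (assumed for illustration only).
values = [2, 3, 3, 4, 18]

mean = statistics.mean(values)      # arithmetic average
median = statistics.median(values)  # middle value after sorting
mode = statistics.mode(values)      # most frequent value

print(mean, median, mode)
```

Here the single large value (18) raises the mean well above the median, which is the pattern the second bullet describes.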
8
MEASURES OF VARIABILITY
Range: It is the interval between the highest and lowest observations.
• Ex. Diastolic BP of 5 individuals is 90, 80, 78, 84, 98.
Highest observation is 98
Lowest observation is 78
Range is 98-78= 20.
9
Standard deviation (SD): It is defined as the positive square root of the arithmetic mean of the squared deviations taken from the arithmetic mean.
• It describes the variability of the observations about the mean.
Variance: average squared deviation around the
mean.
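variance = $\frac{\sum (X - \bar{X})^2}{n}$ (population) or $\frac{\sum (X - \bar{X})^2}{n - 1}$ (sample)
SD = $\sqrt{\frac{\text{Sum of (Individual Value} - \text{Mean Value)}^2}{\text{Number of values}}}$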
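The range, variance and SD can be worked through for the diastolic BP example given under Range. This is a minimal sketch; the variable names are chosen here for illustration.

```python
import math

bp = [90, 80, 78, 84, 98]  # diastolic BP of 5 individuals (from the Range example)

n = len(bp)
mean = sum(bp) / n
data_range = max(bp) - min(bp)

# Sum of squared deviations from the mean.
ss = sum((x - mean) ** 2 for x in bp)

var_population = ss / n        # divide by n
var_sample = ss / (n - 1)      # divide by n - 1
sd_sample = math.sqrt(var_sample)

print(data_range, mean, var_population, var_sample, sd_sample)
```

The n - 1 form is the one normally used when the five readings are treated as a sample from a larger population.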
10
Coefficient of Variation (CV):
It is the standard deviation (SD) expressed as a
percentage of the mean.
CV = (SD / mean) × 100
• It is dimensionless (independent of any unit of
measurement)
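As a rough illustration, the CV of the diastolic BP readings used in the Range example can be computed as below. The sketch assumes the sample SD (n - 1 denominator), which is what `statistics.stdev` returns.

```python
import statistics

bp = [90, 80, 78, 84, 98]  # diastolic BP example from the Range slide

sd = statistics.stdev(bp)            # sample SD (n - 1 denominator)
cv = sd / statistics.mean(bp) * 100  # SD as a percentage of the mean

print(round(cv, 2))
```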
11
Correlation coefficient:
It measures relationship between two variables.
denoted by ‘r’ , unitless quantity,
it is a pure number.
values lie between -1 and +1
If the variables are not linearly correlated, r will be zero.
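A minimal sketch of how r can be computed from paired observations is given below. The function name `pearson_r` and the dose/response numbers are assumptions made for illustration, not data from the slides.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient r for two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of the deviations.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical paired observations.
dose = [1, 2, 3, 4, 5]
response = [2, 4, 5, 4, 5]
r = pearson_r(dose, response)

print(round(r, 4))
```

A perfectly increasing straight-line relationship gives r = +1 and a perfectly decreasing one gives r = -1, matching the limits stated above.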
12
PROBABILITY DISTRIBUTIONS
1. Binomial Distribution:
The conditions to be fulfilled are:
i. There is a fixed number (n) of trials;
ii. Only two outcomes, ‘success’ and ‘failure’, are possible at each trial;
iii. The trials are independent;
iv. There is a constant probability (p) of success at each trial;
v. The variable is the total number of successes in n trials.
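When these conditions hold, the probability of exactly k successes in n trials is C(n, k) p^k (1 - p)^(n - k). The sketch below assumes n = 4 and p = 0.5 purely for illustration; `binomial_pmf` is a name introduced here.

```python
import math

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

# Assumed example: 4 trials, probability of success 0.5 at each trial.
n, p = 4, 0.5
probabilities = [binomial_pmf(k, n, p) for k in range(n + 1)]

print(probabilities)
```

The probabilities over k = 0 to n add up to 1, since one of these outcomes must occur.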
13
2. Poisson Distribution:
• There are situations in which the number of times an
event occurs is meaningful and can be counted,
but the number of times the event did not occur is
meaningless or cannot be counted.
• It is discrete and has an infinite number of
possible values.
• It has a single parameter, λ (the mean number of events).
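The Poisson probability of observing exactly k events is e^(-λ) λ^k / k!. A minimal sketch follows, assuming λ = 2 for illustration; `poisson_pmf` is a name introduced here.

```python
import math

def poisson_pmf(k, lam):
    """Probability of exactly k events when the mean count is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

lam = 2.0  # assumed mean number of events
p_zero = poisson_pmf(0, lam)
p_at_most_3 = sum(poisson_pmf(k, lam) for k in range(4))

print(round(p_zero, 4), round(p_at_most_3, 4))
```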
14
3.Gaussian or Normal Distribution:
Important characteristics are:
i. The shape of the distribution resembles a bell
and is symmetric around the midpoint;
ii. At the centre of distribution which is peaked,
mean median and mode coincide;
15
iii. The area under the curve between any two
points which correspond to the proportion of
observations between any two values of the
variate can be found out in terms of a
relationship between the mean and the
standard deviation.
iv. Parameters used: mean (μ) and SD (σ)
16
• Standard Error of Mean (SEM): the square root of the variance of the sample means, i.e. the SD of the sampling distribution of the mean
SE of sample mean = $SD / \sqrt{n}$
SE of sample proportion = $\sqrt{pq / n}$, where q = 1 − p
• Applications of SEM:
i. To determine whether a sample is drawn from the same population or not when the population mean is known.
ii. To work out the limits of desired confidence within which the population mean should lie.
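Both standard errors can be sketched in a few lines. The BP readings are those from the Range example; the proportion (p = 0.4, n = 100) is an assumed value, and `se_proportion` is a name introduced here.

```python
import math
import statistics

bp = [90, 80, 78, 84, 98]  # diastolic BP example from the Range slide

# SE of the sample mean = SD / sqrt(n), using the sample SD.
sem = statistics.stdev(bp) / math.sqrt(len(bp))

def se_proportion(p, n):
    """SE of a sample proportion = sqrt(p * q / n), with q = 1 - p."""
    return math.sqrt(p * (1 - p) / n)

se_p = se_proportion(0.4, 100)  # assumed proportion and sample size

print(round(sem, 4), round(se_p, 4))
```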
17
Confidence Interval Or Fiducial Limits:
• Confidence limits are the two extremes within which the population mean is expected to lie with a stated confidence (usually 95%).
Lower confidence limit = mean – ( t0.05 × SEM)
Upper confidence limit = mean + ( t0.05 × SEM)
• The important difference between the ‘p’ value and the confidence interval is that the confidence interval shows the size and precision of the estimate, which helps in judging clinical significance, whereas the ‘p’ value indicates statistical significance.
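The two limits can be worked out for the diastolic BP example. This is a sketch only: the critical value 2.776 is the two-tailed 5% value of t for 4 degrees of freedom taken from standard t tables, hard-coded here because the Python standard library has no t distribution.

```python
import math
import statistics

bp = [90, 80, 78, 84, 98]  # diastolic BP example from the Range slide

# Two-tailed 5% critical value of t for n - 1 = 4 df (from standard t tables).
T_CRIT_DF4 = 2.776

mean = statistics.mean(bp)
sem = statistics.stdev(bp) / math.sqrt(len(bp))

lower = mean - T_CRIT_DF4 * sem
upper = mean + T_CRIT_DF4 * sem

print(round(lower, 2), round(upper, 2))
```

With only five readings the interval is wide; a larger sample would reduce the SEM and narrow the limits.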
18
Standard Normal Distribution
Mean ± 1 SD encompasses about 68% of observations
Mean ± 2 SD encompasses about 95% of observations
Mean ± 3 SD encompasses about 99.7% of observations
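These percentages can be checked numerically. For a normal distribution, the proportion within k SD of the mean equals erf(k / √2); the helper name `coverage_within` is introduced here for illustration.

```python
import math

def coverage_within(k):
    """Proportion of a normal distribution lying within mean +/- k SD."""
    return math.erf(k / math.sqrt(2))

coverage = {k: coverage_within(k) for k in (1, 2, 3)}

for k, prop in coverage.items():
    print(k, round(prop * 100, 2))
```

The exact figure for 2 SD is slightly above 95%, which is why 1.96 SD is used when exactly 95% is wanted.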
19
Statistical Hypothesis:
• They are hypotheses that are stated in such a way that they may be evaluated by appropriate statistical techniques.
• There are two types of hypotheses:
• Null hypothesis H0: It is the hypothesis which assumes that there is no difference between two values. H0: μ1 = μ2
• Alternative hypothesis HA: It is the hypothesis that differs from the null hypothesis. HA: μ1 ≠ μ2
20
Hypothesis Errors:
Type-I Error:
• It is the error of finding a difference when
no such difference actually exists.
• Ex. acceptance of an inactive compound as active
• It is also known as α error / false positive
21
Type-II Error:
• It is the failure to detect a difference
when such a difference actually exists, thus
resulting in rejection of an active compound as
inactive.
• It is also known as β error / false negative.
22
Level of significance (l.o.s):
• The probability of committing a type I error
• Denoted by α
• An l.o.s of 0.05 (5%) means the risk of making a wrong decision is only 5 out of 100 cases, i.e. we are 95% confident
Power of the test:
• The probability of committing a type II error is denoted by β; 1 − β is the power of the test
• Power is the probability of rejecting H0 when H0 is false, i.e. of making the correct decision.
23
• The p-value is defined as the smallest
value of α for which the null hypothesis can
be rejected.
• If the p-value is less than α, we reject the
null hypothesis (p < α)
• If the p-value is greater than α, we do not
reject the null hypothesis (p > α)
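The decision rule can be sketched for a test statistic that follows the standard normal distribution. The z value of 2.5 is assumed for illustration, and `two_tailed_p` is a name introduced here; the two-tailed p-value is erfc(|z| / √2).

```python
import math

def two_tailed_p(z):
    """Two-tailed p-value for a standard normal test statistic z."""
    return math.erfc(abs(z) / math.sqrt(2))

alpha = 0.05
z = 2.5  # assumed test statistic
p_value = two_tailed_p(z)
reject_h0 = p_value < alpha

print(round(p_value, 4), reject_h0)
```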
24
Critical Region
One tailed test:
• The rejection region is in one or the other tail of the distribution
• The difference could only be there in one
direction/ possibility
• Ex. English men are taller than Indian men.
25
Two Tailed Test:
• The rejection region is split between the two sides or tails of
the distribution
• The difference could be in either direction/
possibility
• Ex. Comparative study of drug ‘X’ with atenolol
for antihypertensive property
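The effect of splitting the rejection region can be seen in the critical values of the standard normal distribution. A minimal sketch, assuming α = 0.05 and using `statistics.NormalDist` from the Python standard library:

```python
from statistics import NormalDist

alpha = 0.05
std_normal = NormalDist()  # mean 0, SD 1

# One tailed: the whole of alpha sits in one tail.
z_one_tailed = std_normal.inv_cdf(1 - alpha)

# Two tailed: alpha is split equally, alpha / 2 in each tail.
z_two_tailed = std_normal.inv_cdf(1 - alpha / 2)

print(round(z_one_tailed, 3), round(z_two_tailed, 3))
```

The two tailed cut-off is further from the mean, so the same observed difference needs to be larger to reach significance.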
26
27
SAMPLE SIZE:
• Large sample: sample size is more than 30
• Small sample: sample size is less than or equal to 30
• Many statistical tests are based upon the assumption that the data are sampled from a Gaussian distribution.
• Procedures for testing hypotheses about parameters in a population described by a specified distributional form (e.g. the normal distribution) are called parametric tests.
28
Types of Parametric tests
1. Large sample tests
Z-test
2. Small sample tests
t-test
* Independent/ unpaired t-test
* Paired t-test
ANOVA (Analysis of variance)
* One way ANOVA
* Two way ANOVA
29
Z- Test:
• A z-test is used for testing the mean of a
population against a standard, or comparing
the means of two populations, with large (n
> 30) samples, whether or not the
population standard deviation is known.
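A one-sample z-test can be sketched as follows. The numbers (sample mean 52, standard 50, SD 10, n = 100) are hypothetical, and `one_sample_z` is a name introduced here; z = (sample mean − standard) / (SD / √n).

```python
import math

def one_sample_z(sample_mean, mu0, sd, n):
    """z statistic for testing a sample mean against a standard mu0."""
    return (sample_mean - mu0) / (sd / math.sqrt(n))

# Hypothetical large sample (n = 100).
z = one_sample_z(sample_mean=52.0, mu0=50.0, sd=10.0, n=100)

# Two-tailed p-value from the standard normal distribution.
p_value = math.erfc(abs(z) / math.sqrt(2))

print(z, round(p_value, 4))
```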
30
• It is also used for testing the proportion of some
characteristic versus a standard proportion, or
comparing the proportions of two populations.
Ex. Comparing the average engineering salaries
of men versus women.
Ex. Comparing the fraction defectives from two
production lines.
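The comparison of two proportions can be sketched for the production-line example. The counts are hypothetical, `two_proportion_z` is a name introduced here, and the standard error uses the pooled proportion, which is one common choice under H0.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """z statistic for comparing two sample proportions (pooled SE)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# Hypothetical counts: 30 defectives in 200 from line A, 20 in 200 from line B.
z = two_proportion_z(30, 200, 20, 200)
p_value = math.erfc(abs(z) / math.sqrt(2))  # two-tailed

print(round(z, 3), round(p_value, 3))
```

With these assumed counts the p-value is above 0.05, so the difference in defective fractions would not be declared significant at the 5% level.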
31
T- test: Derived by W S Gosset in 1908.
• Properties of t distribution:
i. It has mean 0
ii. It has variance greater than one
iii. It is bell shaped symmetrical distribution about mean
• Assumption for t test:
i. Sample must be random, observations independent
ii. Standard deviation is not known
iii. Normal distribution of population
32
Uses of t test:
i. To test the mean of a single sample against a population mean
ii. To test the difference between means, i.e. to compare
two samples
iii. To test the significance of a correlation coefficient
Types of t test:
a. Paired t test
b. Unpaired t test
33
Paired t test:
• Used when the data consist of a sample of matched pairs of similar
units, or one group of units that has been tested
twice (a "repeated measures" t-test).
• Ex. subjects are tested prior to a
treatment, say for high blood pressure, and the
same subjects are tested again after treatment
with a blood-pressure lowering medication.
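The before/after design can be sketched as below. The BP readings are hypothetical; the test statistic is the mean of the paired differences divided by its standard error, with n − 1 degrees of freedom. The critical value 2.776 (two-tailed 5%, 4 df) is taken from standard t tables.

```python
import math
import statistics

# Hypothetical systolic BP of the same 5 subjects before and after treatment.
before = [150, 160, 145, 155, 165]
after = [140, 150, 140, 150, 150]

# Work on the within-subject differences.
diffs = [b - a for b, a in zip(before, after)]
n = len(diffs)

t_stat = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
df = n - 1

T_CRIT_DF4 = 2.776  # two-tailed 5% critical value for 4 df (from t tables)
significant = abs(t_stat) > T_CRIT_DF4

print(round(t_stat, 3), df, significant)
```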
34
Unpaired t test:
• Used when two separate sets of
independent and identically distributed samples are
obtained, one from each of the two populations being
compared.
• Ex: 1. compare the height of girls and boys.
2. compare 2 stress reduction interventions
when one group practiced mindfulness meditation
while the other learned progressive muscle
relaxation.
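A sketch of the unpaired comparison follows. The scores are hypothetical, and the calculation assumes equal variances in the two groups (the pooled-variance form of Student's t); the critical value 2.306 (two-tailed 5%, 8 df) is taken from standard t tables.

```python
import math
import statistics

# Hypothetical stress scores in two independent groups.
group_a = [10, 12, 14, 16, 18]
group_b = [8, 9, 10, 11, 12]

n1, n2 = len(group_a), len(group_b)
var1 = statistics.variance(group_a)  # sample variance (n - 1 denominator)
var2 = statistics.variance(group_b)

# Pooled variance, assuming both groups share a common variance.
pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))

t_stat = (statistics.mean(group_a) - statistics.mean(group_b)) / se_diff
df = n1 + n2 - 2

T_CRIT_DF8 = 2.306  # two-tailed 5% critical value for 8 df (from t tables)
significant = abs(t_stat) > T_CRIT_DF8

print(round(t_stat, 3), df, significant)
```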
35
ANALYSIS OF VARIANCE(ANOVA):
• Analysis of variance (ANOVA) is a collection of
statistical models, and their associated procedures (such as
partitioning "variation" among and between groups), used to
analyze the differences between group means.
• Compares multiple groups at one time
• Developed by R.A. Fisher.
• Two types: i. One way ANOVA
ii. Two way ANOVA
36
One Way ANOVA:
• It compares three or more unmatched groups
when data are categorized in one way
Ex.
1. Compare control group with three different
doses of aspirin in rats
2. Effect of supplementation of vit C in each
subject before, during and after the treatment
(strictly a matched, repeated-measures design).
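The F ratio for a one way ANOVA can be sketched as the between-group mean square divided by the within-group mean square. The three groups below are hypothetical, and the critical value 5.14 (5% level, 2 and 6 df) is taken from standard F tables.

```python
# Hypothetical responses in three unmatched groups.
groups = [
    [1, 2, 3],
    [2, 3, 4],
    [5, 6, 7],
]

all_values = [x for g in groups for x in g]
grand_mean = sum(all_values) / len(all_values)
group_means = [sum(g) / len(g) for g in groups]

# Between-group and within-group sums of squares.
ss_between = sum(len(g) * (m - grand_mean) ** 2
                 for g, m in zip(groups, group_means))
ss_within = sum((x - m) ** 2
                for g, m in zip(groups, group_means) for x in g)

df_between = len(groups) - 1
df_within = len(all_values) - len(groups)

f_ratio = (ss_between / df_between) / (ss_within / df_within)

F_CRIT_2_6 = 5.14  # 5% critical value of F for (2, 6) df (from F tables)
significant = f_ratio > F_CRIT_2_6

print(round(f_ratio, 3), df_between, df_within, significant)
```

A significant F only says that at least one group mean differs; it does not identify which one.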
37
Two way ANOVA:
• Used to determine the effect of two nominal
predictor variables on a continuous outcome
variable.
• A two-way ANOVA test analyzes the effect of the
independent variables on the expected outcome
along with their relationship to the outcome itself.
38
Difference between one & two way ANOVA
• An example of when a one-way ANOVA could be
used is if we want to determine whether there is a
difference in the mean height of stalks of three
different types of seeds. Since there are more than
two means to compare and only one factor (type of seed)
that could be making the heights different, we can use
a one-way ANOVA.
39
• Now, if we take these three different types of
seeds, and then add the possibility that three
different types of fertilizer are used, then we would
want to use a two-way ANOVA.
• The mean height of the stalks could be different
for a combination of several reasons:
40
• The types of seed could cause the change,
the types of fertilizer could cause the change,
and/or there is an interaction between the type of
seed and the type of fertilizer.
• There are two factors here (type of seed and type
of fertilizer), so, if the assumptions hold, then we
can use a two-way ANOVA.
41
Summary of parametric tests applied for different types of data
Sl no   Type of Group                                    Parametric test
1.      Comparison of two paired groups                  Paired ‘t’ test
2.      Comparison of two unpaired groups                Unpaired ‘t’ test
3.      Comparison of three or more matched groups       Two way ANOVA
4.      Comparison of three or more unmatched groups     One way ANOVA
5.      Correlation between two variables                Pearson correlation
42
References:
1. Dr J V Dixit’s Principles and practice of
biostatistics 5th edition.
2. Rao & Murthy’s applied statistics in health sciences 2nd edition.
3. Sarmukaddam’s fundamentals of biostatistics 1st edition.
4. Internet sources…….
43