Basics in Epidemiology & Biostatistics 2 RSS6 2014

Basics in Epidemiology & Biostatistics

Hashem Alhashemi MD, MPH, FRCPC Assistant Professor, KSAU-HS

Parametric data

• Large samples (> 30).

• Normally distributed.

• Descriptive statistics: Range, Mean, SD.

Non-parametric data

• For small samples & variables that are not normally distributed.

• No basic assumptions (distribution free).

• Descriptive statistics: Range, Rank, Median, & the interquartile range (the middle 50% = Q3 − Q1).

• Median is the middle number in a ranked list of numbers (regardless of its frequency).
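As a minimal sketch of the two sets of descriptive statistics, the snippet below computes them with Python's standard `statistics` module. The data values and variable names are hypothetical, chosen only for illustration, and `statistics.quantiles` is used with its default ("exclusive") method, which is one of several conventions for quartiles.

```python
import statistics

# Hypothetical measurements (illustration only, not real data)
data = [2, 3, 5, 7, 8, 9, 11]
# Same list with the largest value replaced by an extreme value
skewed = [2, 3, 5, 7, 8, 9, 110]

def describe(values):
    """Return range, mean, SD, median and interquartile range of a list."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # default 'exclusive' method
    return {
        "range": max(values) - min(values),
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),      # sample SD (divides by n - 1)
        "median": statistics.median(values),
        "iqr": q3 - q1,                      # the middle 50% = Q3 - Q1
    }

summary = describe(data)
summary_skewed = describe(skewed)

for name in ("range", "mean", "sd", "median", "iqr"):
    print(f"{name:>6}: {summary[name]:8.2f}   with extreme value: {summary_skewed[name]:8.2f}")
```

In this sketch the mean and SD change a lot when one extreme value is introduced, while the median and the interquartile range stay the same, which is why they suit data that are not normally distributed.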

The Mean

• It sums all the values (a great numerical summary).

• But it is affected by extreme values, so it is not a good summary if your data are not normal (symmetrical bell shape).

• The differences of the data above and below the mean sum to 0.

"Love of extremes is excess; the best of things is the middle." (Arabic proverb)

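Standard Deviation

Roughly the average of the differences from the mean: the differences are squared and summed (sum of squares, SS), divided by the number of differences (n − 1), and the square root is taken.

Sample set:

1, 2, 3, 4, 5, 6, 7

X = 28/7 = 4

Number of differences = n − 1 = 6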
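The calculation for this sample set can be sketched step by step as follows. This is an illustrative sketch that assumes the usual sample formula (sum of squares divided by n − 1, then the square root); the variable names are not from the slides.

```python
import math

sample = [1, 2, 3, 4, 5, 6, 7]          # the sample set from the slide
n = len(sample)

mean = sum(sample) / n                   # 28 / 7 = 4.0
deviations = [x - mean for x in sample]  # differences from the mean
sum_of_deviations = sum(deviations)      # differences above and below cancel out

ss = sum(d ** 2 for d in deviations)     # sum of squares (SS) = 28.0
df = n - 1                               # number of differences = 6
variance = ss / df                       # about 4.67
sd = math.sqrt(variance)                 # about 2.16

print(f"mean={mean}, SS={ss}, df={df}, variance={variance:.2f}, SD={sd:.2f}")
```

Squaring is what keeps the differences from cancelling: the raw differences add up to 0, so they cannot be averaged directly.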

Standard deviation

A unit of deviation of the data from the mean.

Differences?

• Similar: < +/- 1𝛔

• Slightly different: < +/- 2𝛔

• Very different: > +/- 2𝛔 (0.02)

• Extremely different: > +/- 3𝛔 (0.001)

• The Z distribution is a hypothetical population (model) with 𝛍 = 0 and 𝛔 = 1.

• Six 𝛔 (+/- 3𝛔 around the mean) make up 0.997 of the area under the curve.
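The areas under the Z curve can be checked numerically. The sketch below uses the standard identity that the area within +/- k𝛔 of the mean of a normal distribution equals erf(k/√2); the function names are illustrative. Reading the 0.02 and 0.001 figures quoted for the cut-offs as one-tail areas is an assumption, although the numbers agree with it.

```python
import math

def area_within(k):
    """Area under the standard normal curve between -k and +k sigma."""
    return math.erf(k / math.sqrt(2))

def one_tail_beyond(k):
    """Area under the standard normal curve above +k sigma (one tail only)."""
    return (1 - area_within(k)) / 2

for k in (1, 2, 3):
    print(f"+/-{k} sigma: inside={area_within(k):.4f}, one tail beyond={one_tail_beyond(k):.4f}")
```

With k = 3 the area inside is about 0.997, which matches the "six 𝛔" statement (three on each side of the mean).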

Z distribution

Parametric Data

Population

• God knows everything.

• Does not need to take samples.

• Commits no mistakes.

Central Limit Theorem

• The mean of all possible sample means will be approximately equal to the mean of the population.

• The distribution of all possible sample means will be approximately normal (when the samples are large enough).

• If you limit your prediction to the center, you will be OK (averages are normally distributed).
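The first statement can be verified exactly on a tiny example by listing every possible sample. The sketch below assumes a hypothetical population (the six faces of a die) and samples of size 2 drawn with replacement; these choices and all names are for illustration only. With such a small sample size the distribution of means is only triangular, not yet bell-shaped, but it already piles up at the center.

```python
import itertools
import math
import statistics
from collections import Counter

population = [1, 2, 3, 4, 5, 6]  # hypothetical population: faces of a die
n = 2                            # sample size (kept tiny so all samples can be listed)

# Every possible sample of size n, drawn with replacement (36 samples)
samples = list(itertools.product(population, repeat=n))
sample_means = [sum(s) / n for s in samples]

mu = statistics.mean(population)               # population mean
mean_of_means = statistics.mean(sample_means)  # mean of all possible sample means

sigma = statistics.pstdev(population)          # population SD
sd_of_means = statistics.pstdev(sample_means)  # spread of the sample means

counts = Counter(sample_means)                 # how often each sample mean occurs

print(f"population mean = {mu}, mean of sample means = {mean_of_means}")
print(f"SD of sample means = {sd_of_means:.4f}, sigma / sqrt(n) = {sigma / math.sqrt(n):.4f}")
for value in sorted(counts):
    print(f"{value:>4}: {'#' * counts[value]}")
```

The spread of the sample means equals 𝛔/√n here, which is the quantity the standard error estimates from a single sample.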

Carl Friedrich Gauss (1777 – 1855)

"Love of extremes is excess; the best of things is the middle." (Arabic proverb)

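• The t distribution is a hypothetical model for samples, centered on 0 and measured in SE units; its shape depends on the degrees of freedom (n − 1) and approaches the Z distribution as n grows.

• For large samples, six SE (+/- 3 SE) make up about 0.997 of the area under the curve; for small samples the tails are heavier.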

t distribution

Parametric Data

Sample

Sampling distribution

• Similar: < +/- 1 SE

• Slightly different: < +/- 2 SE

• Very different: > +/- 2 SE (0.02)

• Extremely different: > +/- 3 SE (0.001)

Standard Error

SE is the unit for error in estimating the population mean.

SE is the unit for deviation of all possible sample means from the population mean.

SE is the unit for the average difference of all possible sample means from the population mean.

SE = S / √n (√n because the variance of the sample mean is S²/n, and S is the square root of the variance).
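A minimal sketch of the standard error calculation, assuming the usual formula SE = S / √n and reusing the sample set 1 to 7 from the standard deviation slide; the variable names are illustrative.

```python
import math
import statistics

sample = [1, 2, 3, 4, 5, 6, 7]
n = len(sample)

s = statistics.stdev(sample)   # sample standard deviation S, about 2.16
se = s / math.sqrt(n)          # standard error of the mean, about 0.82

print(f"S = {s:.3f}, n = {n}, SE = {se:.3f}")
```

Because n sits under a square root, quadrupling the sample size only halves the standard error.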

The Average Idea

• X (mean): the average.

• S (Standard Deviation): a unit of deviation of the data from the sample mean; a unit for the average of differences of the data from the sample mean.

• SE (Standard Error): a unit for error in estimation of the population mean; a unit for deviation of all possible sample means from the population mean; a unit for the average of differences of all possible sample means from the population mean.

Biostatistics

A Fancy World made of %s & Averages

General formula

95% Confidence Interval (C.I) = Estimate +/- Margin of Error = Estimate +/- 2 SE

• Estimate the sample size.

• Calculate the estimate: X, P, OR, or Rate.

• Calculate the Standard Error (SE).

• Estimate the parameter (μ, π, Ω, or λ) with the 95% C.I.
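The general formula (estimate +/- 2 SE as an approximate 95% confidence interval) can be sketched as below. The function name `confidence_interval` and the default multiplier of 2 are illustrative assumptions; the example reuses the sample set 1 to 7 and the formula SE = S / √n.

```python
import math
import statistics

def confidence_interval(estimate, se, multiplier=2.0):
    """Return (lower, upper) as estimate +/- multiplier * SE."""
    margin_of_error = multiplier * se
    return estimate - margin_of_error, estimate + margin_of_error

sample = [1, 2, 3, 4, 5, 6, 7]
mean = statistics.mean(sample)                            # estimate of mu: 4
se = statistics.stdev(sample) / math.sqrt(len(sample))    # about 0.82

lower, upper = confidence_interval(mean, se)
print(f"mean = {mean}, 95% C.I approx. ({lower:.2f}, {upper:.2f})")
```

The multiplier 2 is a large-sample approximation. For a sample as small as this one, a multiplier taken from the t distribution with n − 1 degrees of freedom would be larger and give a somewhat wider interval.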

SD vs SE

• Standard Deviation calculates the variability of the data within a sample in relation to the sample mean. So, it helps identify the % of data above and below a certain measurement.

• Standard Error estimates the variability of all possible sample means in relation to the population mean. So, it helps identify the degree of error in your estimation.


Difference between studying populations & samples:

Population (descriptive):

• Calculate Mean μ (measures).

• Calculate proportion 𝛑 (counts).

• Calculate Standard deviation.

• Calculate Parameters: μ & 𝛑.

Sample (inferential):

• Estimate Sample size.

• Calculate Mean X.

• Calculate Standard deviation S.

• Calculate Standard error SE & 95% C.I (Confidence Interval).

• Calculate Statistics.

• Estimate Parameters: μ & 𝛑.




Objectives

• Definitions.

• Types of Data.

• Data summaries.

• Mean X, Standard deviation S.

• Standard Error SE, Confidence Interval C.I of μ.

Types of Data

Quantitative Data

Discrete (Count): Non-Parametric Data

• Categorical:

1- Dichotomous (binary): Sex.

2- Multichotomous: no order (Race) or ordinal (Education).

• Numerical: number of pregnancies/residents.

Continuous (Measure): Parametric Data

• Ratio (real zero) / Interval (no zero): Temperature/BP.

Data Presentation

Summaries

• Measures (any value): numerical summaries X, 𝛍, s, 𝛔; visual summary: Histogram.

• Counts (categories): numerical summaries P, 𝛑, s, 𝛔; visual summary: Bar & Pie Chart.

Normality & Approximation to Normality

Approximation to Normality

• If the choices are equally likely to happen,

• and the trial is repeated a large number of times,

• the distribution of the totals (or averages) will look normal,

• whether it is a coin or a die (dichotomous or multichotomous).
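The idea can be illustrated without any randomness by counting every equally likely outcome. The sketch below is only an illustration under assumed settings (10 coin tosses, 3 dice); the helper name `sum_distribution` is hypothetical. It counts how many outcomes give each total and prints a rough text histogram.

```python
from collections import Counter
from itertools import product

def sum_distribution(faces, repeats):
    """Count how many equally likely outcomes give each total."""
    return Counter(sum(outcome) for outcome in product(faces, repeat=repeats))

# Dichotomous: number of heads (1) in 10 tosses of a fair coin
coin = sum_distribution([0, 1], 10)

# Multichotomous: total of 3 fair six-sided dice
dice = sum_distribution(range(1, 7), 3)

print("Heads in 10 coin tosses")
for total in sorted(coin):
    print(f"{total:2d} {'#' * (coin[total] // 8)}")

print("Total of 3 dice")
for total in sorted(dice):
    print(f"{total:2d} {'#' * dice[total]}")
```

Both histograms are symmetric and peak in the middle, and they move closer to the bell shape as the number of repetitions grows.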

Normality & Approximation to Normality

Clinical Relevance?

• Choices equally likely to happen, i.e. the probability of the outcome of interest is unknown (research ethics).

• Repeated a large number of times, i.e. a large sample size.

• The normality assumption helps us predict the probability of our outcome.

The Bell / Normal curve

Standard deviation (SD) / sample curve; true error (SE) / population curve

• It was first discovered by Abraham de Moivre in 1733.

• Gauss reproduced it in 1809 and identified it as the normal distribution (error curve).

Abraham de Moivre: born 1667 in Champagne, France; died 1754 in London, England.

De Moivre had hoped for a chair of mathematics, but foreigners were at a disadvantage, so although he was free from religious discrimination, he still suffered discrimination as a Frenchman in England.

SD estimate: based on the range (Largest Value − Smallest Value).
