Statistics
Jonas Ranstam PhD
A scientific report
The idea is to try and give all the information to help others to judge the value of your contributions, not just the information that leads to judgment in one particular direction or another.
Richard P. Feynman
Description of observed data
mean, median, mode (central tendency)
standard deviation, range (dispersion)
Presentation of uncertainty
p-value, statistical significance (hypothesis testing)
confidence interval, SEM (interval estimation)
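As a rough illustration of the two groups of quantities listed above, the following sketch computes them for a small sample. The data values are hypothetical, and the critical value 2.776 is the two-sided 95% point of Student's t distribution with 4 degrees of freedom, which applies only to this sample size (n = 5).

```python
import math
import statistics

# Hypothetical concentrations from five animals (illustrative values only).
values = [4.1, 5.0, 5.0, 5.6, 6.3]
n = len(values)

# Description of observed data
mean = statistics.mean(values)
median = statistics.median(values)
mode = statistics.mode(values)
sd = statistics.stdev(values)            # sample SD (n - 1 in the denominator)
value_range = (min(values), max(values))

# Presentation of uncertainty (interval estimation)
sem = sd / math.sqrt(n)
t_crit = 2.776                           # assumption: 95% two-sided, df = n - 1 = 4
ci_low = mean - t_crit * sem
ci_high = mean + t_crit * sem

print(f"mean={mean:.2f} median={median:.2f} mode={mode:.2f}")
print(f"SD={sd:.2f} range={value_range}")
print(f"SEM={sem:.2f} 95% CI=({ci_low:.2f}, {ci_high:.2f})")
```

The SD and range describe the observed sample, while the SEM and the confidence interval describe how uncertain the estimated mean is.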
What uncertainty?
Generalization uncertainty, for example:
- From one sample of rats to all rats (the uncertainty caused by biological variability)
- From a single measurement on a single rat to all measurements of the same kind on the same rat (the uncertainty caused by using imperfect measurement instruments, i.e. reliability)
What determines the degree of uncertainty?
1. The number of observations
2. The variability
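Both factors appear directly in the usual formulas for a mean. The notation below is an assumption made for illustration (it is not taken from the slides): s is the sample standard deviation, n the number of independent observations, x̄ the sample mean, and t the quantile of Student's t distribution with n - 1 degrees of freedom. The formulas presuppose independent, approximately Gaussian observations.

```latex
\mathrm{SEM} = \frac{s}{\sqrt{n}},
\qquad
\text{CI}_{1-\alpha}:\quad \bar{x} \pm t_{1-\alpha/2,\;n-1}\,\frac{s}{\sqrt{n}}
```

More observations (larger n) or less variability (smaller s) both narrow the interval.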
What is important when presenting results?
1. The number of observations
2. The variability
Reported mean concentration with ±SD (bar chart)
Observed mean concentration (dotplot)
Estimated mean concentration with 95% confidence intervals
Why use CI instead of SEM?
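Because the CI is the better measure of uncertainty: the confidence level of a mean ± 1 SEM interval depends on n, whereas a 95% CI has the same confidence level for every n.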
n     SEM    CI (for a mean value)
2     ±1     50%
3     ±1     58%
4     ±1     63%
6     ±1     64%
7     ±1     65%
∞     ±1     68%
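The confidence level of a mean ± 1 SEM interval can be sketched as the probability that a Student's t variable with n - 1 degrees of freedom lies between -1 and 1. The code below is a minimal sketch under the assumption of independent Gaussian observations; the function names are hypothetical, and the integral is approximated with Simpson's rule, so values computed this way need not match the rounded figures listed above exactly.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def coverage_of_one_sem(n, steps=2000):
    """Approximate P(|T| < 1) for df = n - 1 using Simpson's rule on [-1, 1]."""
    df = n - 1
    h = 2.0 / steps                      # steps must be even for Simpson's rule
    total = t_pdf(-1.0, df) + t_pdf(1.0, df)
    for i in range(1, steps):
        x = -1.0 + i * h
        total += (4 if i % 2 else 2) * t_pdf(x, df)
    return total * h / 3

for n in (2, 3, 5, 10, 30, 1000):
    print(n, f"{100 * coverage_of_one_sem(n):.1f}%")
```

With few observations the interval mean ± 1 SEM covers far less than the roughly 68% it approaches for very large n, which is why reporting an explicit confidence interval is easier to interpret.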
Other problems related to generalization uncertainty
1. Independence of observations
2. Gaussian probability distribution
3. Multiplicity
1. Independent observations (2 treatments, 4 rats, n = ?)
n = 4
n = 8
n = 96
Independent observations
Distinguish between:
1. Biological variation
2. Measurement reliability
Describe the sources of variation clearly in the manuscript! How many animals, repeated observations, technical replicates, etc. have been analyzed?
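One common way to keep the two sources of variation apart is to summarize repeated measurements per animal before comparing treatments. The sketch below assumes that the rat is the experimental unit; the records, identifiers, and values are hypothetical.

```python
import statistics
from collections import defaultdict

# Hypothetical records: (treatment, rat_id, measured value).
# Each rat is measured three times (technical replicates).
records = [
    ("A", "rat1", 5.1), ("A", "rat1", 5.3), ("A", "rat1", 5.2),
    ("A", "rat2", 6.0), ("A", "rat2", 5.8), ("A", "rat2", 5.9),
    ("B", "rat3", 7.1), ("B", "rat3", 6.9), ("B", "rat3", 7.0),
    ("B", "rat4", 7.6), ("B", "rat4", 7.8), ("B", "rat4", 7.7),
]

# Group the repeated measurements by animal.
per_rat = defaultdict(list)
for treatment, rat, value in records:
    per_rat[(treatment, rat)].append(value)

# One summary value per rat: replicates inform reliability, not sample size.
rat_means = {key: statistics.mean(vals) for key, vals in per_rat.items()}

n_measurements = len(records)   # number of measurements taken
n_animals = len(rat_means)      # number of independent units

print("measurements:", n_measurements)
print("independent units (rats):", n_animals)
for (treatment, rat), m in sorted(rat_means.items()):
    print(treatment, rat, round(m, 2))
```

Here twelve measurements come from only four independent animals, so a comparison of treatments rests on four observations, not twelve.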
Recommended reading
Churchill GA. Fundamentals of experimental design for cDNA microarrays. Nature Genetics 2002;32S:490-495.
2. Gaussian distribution
Are your results empirically supported?
Or do they rely on your assumptions?
- Student's t-test (Gaussian, identical variance)
- Mann-Whitney U-test (identical shape and variance)
Did you check if the assumptions were fulfilled?
- How did you do it?
- What was the result?
Describe it in the manuscript!
3. Multiplicity
With more than one tested null hypothesis, the actual significance level will differ from the nominal level
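How far the actual level drifts from the nominal one can be sketched for the simplest case. The function below is a hypothetical helper that assumes k independent tests, all with true null hypotheses, each performed at level alpha; with dependent tests the exact figure differs.

```python
def familywise_error_rate(alpha, k):
    """Probability of at least one false positive among k independent tests
    of true null hypotheses, each performed at significance level alpha."""
    return 1 - (1 - alpha) ** k

for k in (1, 2, 5, 10, 20):
    print(k, round(familywise_error_rate(0.05, k), 3))
```

Under these assumptions, ten tests at a nominal 5% level already give about a 40% chance of at least one false positive.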
- Multiplicity corrections correct the type-1 error rate
- Multiplicity corrections increase the type-2 error rate
- Bonferroni is not a good method; several better ones exist, for example the methods developed by Holm and Hochberg
- P-value corrections within endpoints do not solve the problem of testing multiple endpoints
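The difference between these procedures can be sketched with a few lines of code. The functions below are illustrative implementations of the Bonferroni, Holm (step-down), and Hochberg (step-up) rules, and the p-values are hypothetical. Note that Hochberg's procedure is valid only under certain dependence conditions, such as independent tests, whereas Holm's requires no such assumption.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Reject each hypothesis whose p-value is at most alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]

def holm_reject(p_values, alpha=0.05):
    """Holm step-down: go from the smallest p-value upwards and
    stop at the first one that exceeds its threshold."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, i in enumerate(order):
        if p_values[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            break
    return reject

def hochberg_reject(p_values, alpha=0.05):
    """Hochberg step-up: go from the largest p-value downwards; at the first
    one meeting its threshold, reject it and all smaller p-values."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank in range(m - 1, -1, -1):
        if p_values[order[rank]] <= alpha / (m - rank):
            for j in order[: rank + 1]:
                reject[j] = True
            break
    return reject

# Hypothetical p-values from four tests.
p = [0.040, 0.010, 0.045, 0.015]
print("Bonferroni:", bonferroni_reject(p))
print("Holm:      ", holm_reject(p))
print("Hochberg:  ", hochberg_reject(p))
```

All three control the familywise error rate at alpha under their respective assumptions, but Holm never rejects fewer hypotheses than Bonferroni, and Hochberg never fewer than Holm.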
What is your strategy for dealing with multiplicity? Are Bonferroni corrections necessary? Are all multiplicity issues addressed?
Describe it in the manuscript!
Summary
As an author of a scientific report your task is to perform an adequate evaluation and presentation of the uncertainty and limitations of your findings.
This involves more than just calculating a p-value.
When a well-done trial or experiment or observational study is fairly, honestly, and thoroughly reported, it will have so many warts, footnotes, and exceptions that it may be hard for the uninitiated to believe that the work was of high quality.
Frederick Mosteller
Thank you for your attention!