27 June 2008 Copyright: Ganesha Associates 1
Basic reading, writing and informatics skills for biomedical research
Segment 6. Developing and presenting your project
Contents
• Writing a project proposal
• Experimental design
• Making a presentation
Outline of a research proposal - 1
• Title
• Abstract
• Specific Aims
• Background & Significance
• Preliminary Data
• Methods
• Resources
Outline of a research proposal - 2
• Title
– What is the problem ?
• Abstract
– Write it last
– State the problem and the specific aims of the project
– Describe the main methodologies to be used
– State the significance of the work
– May be the only thing some reviewers read
Outline of a research proposal - 3
• Specific Aims
– One page
– Short conceptual narrative followed by well-defined objectives and success criteria
– Relationship to experimental plan should be clear
• Background & Significance
– Helps reviewer understand why you have chosen this particular problem and how it builds on previous work
– Shows you know what the important issues are and why
• Preliminary Data
– Proof that the project is realistic and feasible
Outline of a research proposal - 4
• Methods
– Presents a detailed plan of attack for each specific aim
– Should support costs proposed in the budget
– Describes how you will evaluate success in achieving your aims
– Provides a flow chart of logic for each experiment's results and the subsequent steps in the research plan
– Addresses sub-optimal methodologies and offers rationale for their use
• Resources
– Includes time table, often at end of section, to make organizational and resourcing requirements apparent
– Budget
Methods - choose your model system carefully
• In vivo, in vitro, in silico
• Pharmacological, surgical, genetic
• Example: Fetal malnutrition and metabolic syndrome
– Animal: Rat, mouse, human ?
– Diet: Global under-nutrition, low maternal protein, high fat diet during pregnancy
– Single, or multigenerational study ?
– Pharmacological, genetic or surgical model
– Disease: diabetes, hypertension, cardiovascular
– See review Brit. Med. Bull. 2001, 60, 103-121
Methodology – make sure you understand the variables you are measuring
• What is the normal range of variation in measurement values ?
• Do you know why these arise ?
• What is the time course of the effect ?
Project proposal – quick check list
• Why is the problem under study of importance ?
– Economic, medical significance ?
– What are the underlying key issues of basic scientific significance ?
– Are strong links to the consensus view established ?
• How is the problem to be addressed experimentally ?
– Has an appropriate model system been chosen ?
– What information needs to be collected ?
– Which methods have been chosen for this purpose and why ?
• Limitations
– Have the most-likely reasons for failure been identified ?
– What is the ‘Fail early’ strategy ?
• Literature review
– Is it up-to-date ?
– Are all key points of logical development in the text backed by an appropriate reference ?
Experimental design
• Hypothesis
• Assumptions, expectations
• Statistics
• Experiment 1
• Results
• Test assumptions
• Experiment 2
• Results...
Experimental design
• An experimental strategy, often involving specialist statistical techniques, used to test hypotheses involving independent and dependent variables by means of manipulation of variables, controls and randomization.
• A true experiment involves the random allocation of participants to experimental and control groups, manipulation of the independent variable, and the use of a control group for comparison purposes.
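Random allocation can be sketched in a few lines of Python. This is only an illustration under assumed names: the `allocate` helper, the animal labels and the fixed seed are hypothetical and do not come from the slides.

```python
import random

def allocate(participants, seed=None):
    """Randomly split participants into a control and an experimental group."""
    rng = random.Random(seed)      # a fixed seed makes the allocation reproducible
    shuffled = list(participants)  # copy, so the caller's list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"control": shuffled[:half], "experimental": shuffled[half:]}

# Hypothetical example: twelve animals labelled rat01 ... rat12
animals = [f"rat{i:02d}" for i in range(1, 13)]
groups = allocate(animals, seed=1)
```

Because chance alone decides which group each subject joins, the experimenter cannot (even unconsciously) place the healthier-looking subjects in one group.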
Early example of experimental design
• In 1747, while serving as surgeon on HMS Salisbury, James Lind carried out a controlled experiment to develop a cure for scurvy.
• Lind selected 12 men from the ship, all suffering from scurvy, and divided them into six pairs, giving each pair a different addition to their basic diet for a period of two weeks. The treatments were all remedies that had been proposed at one time or another.
Early example of experimental design
• They were
– A quart of cider per day
– Twenty five gutts of elixir vitriol three times a day upon an empty stomach
– Half a pint of seawater every day
– A mixture of garlic, mustard and horseradish, in a lump the size of a nutmeg
– Two spoonfuls of vinegar three times a day
– Two oranges and one lemon every day
Early example of experimental design
• The men who had been given citrus fruits recovered dramatically within a week. One of them returned to duty after 6 days and the other became nurse to the rest. The others experienced some improvement, but nothing was comparable to the citrus fruits, which were proved to be substantially superior to the other treatments.
Early example of experimental design
• In this study his subjects' cases "were as similar as I could have them", that is, he applied strict entry requirements to reduce extraneous variation.
• The men were paired, which provided replication. From a modern perspective, the main thing that is missing is randomized allocation of subjects to treatments.
Statistics
• There are many types of statistical tests
• Most can be carried out in Excel or with a specialist statistics package
• The problems include:
– Selecting the right test (preferably before you do the experiment)
– Understanding the assumptions on which the test is based (which may have an impact on your experimental design)
– Making sure the power of the test is adequate
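One way to check power before the experiment is to estimate the sample size needed. The sketch below uses the standard normal approximation for comparing two group means; the function name `n_per_group` and the example numbers (a 10 mmHg difference, a 12 mmHg standard deviation) are assumptions made for illustration, not values from the slides.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(delta, sd, alpha=0.05, power=0.80):
    """Approximate subjects per group for a two-sided comparison of two means.

    Normal approximation: n = 2 * ((z_alpha + z_beta) * sd / delta) ** 2
    """
    z = NormalDist()                    # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the significance level
    z_beta = z.inv_cdf(power)           # quantile matching the desired power
    return ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)

# Hypothetical: detect a 10 mmHg change when the SD between animals is 12 mmHg
n = n_per_group(delta=10.0, sd=12.0)
```

Halving the difference you want to detect roughly quadruples the number of subjects needed, which is why the expected effect size should be considered at the design stage.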
Variables
• The independent variables are the ones that the researcher expects to be the cause of an outcome of interest.
• The dependent variable is the outcome variable. In experimental research, this variable is expected to depend on a predictor (or independent) variable.
• For example, if a researcher wants to examine the effect of a drug on blood pressure, the drug is the independent variable, the blood pressure response the dependent variable.
• An experiment can have more than one independent or dependent variable, e.g. multivariate ANOVA
Some definitions
• For a data set, the mean is the sum of the observations divided by the number of observations.
• The mean is often quoted along with the standard deviation which describes the spread of the data about the mean.
• Standard error – a statistical measure of the variation in a population of sample means, i.e. the standard deviation of the means obtained from repeated samples
• The variance is a measure of statistical dispersion, the average of the squared differences between sample values and the expected value (mean).
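These definitions can be checked with Python's standard `statistics` module. The readings below are hypothetical values chosen only to make the arithmetic easy to follow.

```python
from math import sqrt
from statistics import mean, pvariance, stdev, variance

# Hypothetical blood pressure readings (mmHg)
readings = [118.0, 122.0, 125.0, 119.0, 121.0]

m = mean(readings)             # sum of the observations / number of observations
pop_var = pvariance(readings)  # average squared difference from the mean (divide by n)
var = variance(readings)       # sample estimate of the variance (divide by n - 1)
sd = stdev(readings)           # square root of the sample variance
se = sd / sqrt(len(readings))  # standard error of the mean
```

Dividing by n matches the definition of the variance given above; dividing by n - 1 is the usual choice when the data are a sample used to estimate the variance of a larger population.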
Measurement - 1
• Repeated measurements are rarely the same
• This variation can be expressed as a frequency histogram
• The variation may be due to experimental error or to natural variations in the variable being measured
• The standard deviation about the mean is a statistic that is used to define this variation precisely
Measurement - 2
• When many observations are made, the histogram becomes a curve.
• In many cases this curve can be described precisely by a mathematical equation – called the ‘normal distribution’.
• The normal distribution can be defined mathematically by its mean and its standard deviation.
• Note, biological phenomena at best only approximate to the normal curve
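For reference, the equation of the normal curve is given below. The symbols are notation introduced here rather than taken from the slides: \mu is the mean, \sigma the standard deviation and x the measured value.

```latex
f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)
```

Changing \mu shifts the curve along the axis; changing \sigma makes it wider or narrower. No other parameters are needed.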
Measurement - 3
• If you take a sample of n measurements of a variable that has a normal distribution (blue) then you can calculate an estimate of the mean and the standard deviation.
• If you repeat this sampling many times and plot the sample means, then you will get a second, narrower normal distribution (green - samples of size n, red - samples of size 4n).
• The standard deviation of this distribution of sample means is known as the standard error.
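A small simulation shows the narrowing. This is a sketch under assumed values: a population mean of 120 and standard deviation of 10 are hypothetical, and `sd_of_sample_means` is an illustrative helper name.

```python
import random
from statistics import mean, stdev

def sd_of_sample_means(n, repeats=2000, mu=120.0, sigma=10.0, seed=0):
    """Draw `repeats` samples of size n and return the spread of their means."""
    rng = random.Random(seed)
    means = [mean(rng.gauss(mu, sigma) for _ in range(n)) for _ in range(repeats)]
    return stdev(means)

se_n = sd_of_sample_means(n=5)    # expected to be near 10 / sqrt(5)
se_4n = sd_of_sample_means(n=20)  # expected to be near 10 / sqrt(20)
```

Quadrupling the sample size only halves the standard error, because the standard error falls with the square root of n.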
Measurement - 4
• Imagine that the green curve is the distribution of possible means of n measurements for the blood pressure of control animals, and the purple curve is the corresponding distribution for animals receiving the drug.
• The actual mean recorded for the test animals is shown by the grey arrow on the left, controls on the right.
• Can I tell from these measurements whether the drug had an effect ?
Measurement - 5
• No !
• All I can do is calculate the probability of obtaining a difference at least this large if both sets of measurements come from the same normal distribution, i.e. if Ho, the null hypothesis (‘there is no effect’), is true
• If the probability is sufficiently low, usually p<0.05, then I may choose to reject Ho. But I could still be wrong...
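The probability in question can be estimated directly by simulation. The sketch below uses a permutation test rather than a t-test, a substitution made here because it needs only the standard library: if there is no effect, the group labels are interchangeable, so we shuffle them and count how often chance alone gives a difference as large as the one observed. The data and the name `permutation_p_value` are hypothetical.

```python
import random
from statistics import mean

def permutation_p_value(control, treated, n_shuffles=10000, seed=0):
    """Two-sided p-value for a difference in means, assuming Ho (no effect)."""
    rng = random.Random(seed)
    observed = abs(mean(treated) - mean(control))
    pooled = list(control) + list(treated)
    k = len(control)
    extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)  # reassign the group labels at random
        diff = abs(mean(pooled[k:]) - mean(pooled[:k]))
        if diff >= observed:
            extreme += 1
    return extreme / n_shuffles

# Hypothetical mean blood pressures (mmHg), one value per animal
control = [121.0, 118.0, 125.0, 122.0, 119.0]
treated = [109.0, 112.0, 107.0, 115.0, 110.0]
p = permutation_p_value(control, treated)
```

A small p says the observed difference would be unusual if there were no effect. It does not say how likely it is that the drug works.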
Statistical tests
• Most statistical tests begin with the assumption that each data sample (control, test, etc) was drawn from the same population, i.e. that there is no treatment effect
• Many (the parametric tests) also assume that the individual measurements are normally distributed (or can be transformed so that they approximate to a normal distribution)
Assumptions
• Controls and test subjects must come from identical populations
– Age, gender, medical history, genetics...
• Data are independent
• Effects of multiple testing have been accounted for
• Sources of human bias have been controlled for
• The power of the statistical test is sufficient to detect the change predicted. Use a positive control
Assumptions - controls
• Suppose a farmer wishes to evaluate a new fertilizer. She uses the new fertilizer on one field of crops (A), while using her current fertilizer on another field of crops (B).
• The irrigation system on field A has recently been repaired and provides adequate water to all of the crops, while the system on field B will not be repaired until next season.
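• She concludes that the new fertilizer is far superior, but the two fields also differ in irrigation, so the effect of the fertilizer cannot be separated from that of the water supply.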
• Examples from clinical genetics
Assumptions – independence (1)
• Statistical tests are based on the assumption that each subject was sampled independently of the rest.
• Consider the following three situations:
– You are measuring blood pressure in animals. You have five animals in each group, and measure the blood pressure three times in each animal. You do not have 15 independent measurements, because the triplicate measurements in one animal are likely to be closer to each other than to measurements from the other animals. You should average the three measurements in each animal. Now you have five mean values that are independent of each other.
Assumptions – independence (2)
– You have done a biochemical experiment three times, each time in triplicate. You do not have nine independent values, as an error in preparing the reagents for one experiment could affect all three triplicates. If you average the triplicates, you do have three independent mean values.
– You are doing a clinical study, and recruit ten patients from an inner-city hospital and ten more patients from a suburban clinic. You have not independently sampled 20 subjects from one population. The data from the ten inner-city patients may be closer to each other than to the data from the suburban patients. You have sampled from two populations, and need to account for this in your analysis.
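The averaging step in the first two situations can be sketched as follows. The animal labels and readings are hypothetical; the point is only that the value passed on to the statistical test is one mean per animal (or per experiment), not every raw reading.

```python
from statistics import mean

# Hypothetical triplicate blood pressure readings (mmHg) for one group of five animals
triplicates = {
    "animal1": [120.0, 122.0, 121.0],
    "animal2": [131.0, 129.0, 130.0],
    "animal3": [115.0, 117.0, 116.0],
    "animal4": [125.0, 124.0, 126.0],
    "animal5": [119.0, 121.0, 120.0],
}

# Collapse each animal's repeated readings into a single value
per_animal_means = {animal: mean(values) for animal, values in triplicates.items()}

# These five values, not the 15 raw readings, are the independent data points
independent_values = list(per_animal_means.values())
```

Treating all 15 readings as independent would overstate the sample size and make the test look more powerful than it is.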
Assumptions – multiple tests
• If you test several independent null hypotheses, and leave the threshold at 0.05 for each comparison, there is greater than a 5% chance of obtaining at least one "statistically significant" result by chance
• For example, if you test three null hypotheses and use the traditional cutoff of p<0.05 for declaring each p value to be significant, there would be a 14% chance of observing one or more significant p values, even if all three null hypotheses were true.
• To keep the overall chance at 5%, you need to lower the threshold for significance to 0.0170.
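The figures quoted above follow from simple arithmetic, assuming the tests are independent. The slides do not name the adjustment; the formula below (often called the Šidák correction) is one that reproduces the 0.0170 threshold, and the function names are illustrative.

```python
def familywise_error(alpha, k):
    """Chance of at least one 'significant' result among k independent true nulls."""
    return 1 - (1 - alpha) ** k

def sidak_threshold(overall_alpha, k):
    """Per-test threshold that keeps the overall chance at overall_alpha."""
    return 1 - (1 - overall_alpha) ** (1 / k)

chance = familywise_error(0.05, 3)     # about 0.143, i.e. the 14% quoted above
threshold = sidak_threshold(0.05, 3)   # about 0.0170
```

The simpler Bonferroni rule, 0.05 / 3 ≈ 0.0167, gives almost the same threshold and is slightly more conservative.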
Assumptions - bias
• Double blind experiments
• A research design where both the experimenter and the subjects are unaware of which is the treatment group (drug) and which is the control (placebo).
Types of statistical test - 1
• Number of independent variables
– Drug, diet...
• Number of dependent variables
– Blood pressure, heart rate, glucose levels...
• Type of data
– Parametric, non-parametric
Types of statistical test - 2
• Student's t-test
• Chi-square test
• Analysis of variance (ANOVA)
• Mann-Whitney U
• Regression analysis
• Factor Analysis
• Correlation
• Pearson product-moment correlation coefficient
• Spearman's rank correlation coefficient
• Time Series Analysis
Types of statistical test - 3
• Interval, or parametric
– 0.32, 1052, etc
– Normal distribution
• Nominal, or non-parametric
– Male, pregnant, red
– Binary distribution
• Ordinal, or non-parametric
– First, third
– Order by rank
Types of statistical test - 5
When we have more than two groups, it is inappropriate to simply compare each pair using a t-test because of the problem of multiple testing.
The correct way to do the analysis is to use a one-way analysis of variance (ANOVA) to evaluate whether there is any evidence that the means of the populations differ.
If the ANOVA leads to a conclusion that there is evidence that the group means differ, we might then be interested in investigating which of the means are different.
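The core of a one-way ANOVA is the F statistic: the variation between the group means divided by the variation within the groups. The sketch below computes it from first principles with hypothetical data. Turning F into a p-value requires the F distribution, which is not in Python's standard library, so that step is left to a statistics package.

```python
from statistics import mean

def one_way_anova_f(*groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    # Variation of the group means about the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the individual values about their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# Hypothetical responses for three groups (e.g. control, low dose, high dose)
f, df1, df2 = one_way_anova_f([1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0])
```

A large F means the group means differ by more than the scatter within the groups would lead you to expect. It does not say which groups differ.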
Types of statistical test - 6
Tukey's multiple comparison test is one of several tests that can be used to determine which means amongst a set of means differ from the rest.
The results are presented as a matrix showing the result for each pair, either as a P-value or as a confidence interval.
The Tukey multiple comparison test, like both the t-test and ANOVA, assumes that the data from the different groups come from populations where the observations have a normal distribution and the standard deviation is the same for each group.
“Why most published research findings are false”
• There is increasing concern that most current published research findings are false.
• A research finding is less likely to be true:
– when the studies conducted in a field are smaller
– when effect sizes are smaller
– when there is a greater number and lesser pre-selection of tested relationships
– where there is greater flexibility in designs, definitions, outcomes, and analytical modes
– when there is greater financial and other interest and prejudice
– when more teams are involved in a scientific field in chase of statistical significance.
• For many current scientific fields, claimed research findings may often be simply accurate measures of the prevailing bias.
John Ioannidis, PLoS Medicine, 30 Aug 2005
Learning points
• If you aren’t certain how much variation to expect in your experiment, try a small scale preliminary version.
• The more measurements you take, the greater the precision, but first try to identify and eliminate some of the sources of variation
Collecting data – check key assumptions
[Figure: colonisation (%) (y-axis, 0 to 60) by sub-area (x-axis: 16 I, 16 II, 16 III, 12 I, 12 II, 12 III, 8 I, 8 II, 8 III, 4 I, 4 II, 4 III, Mata I, Mata II, Mata III); series Jan-05 and Aug-05, each with a linear trend line]
Beware, in biology there are many unknowns
“As we know,
There are known knowns.
There are things we know we know.
We also know
There are known unknowns.
That is to say
We know there are some things
We do not know.
But there are also unknown unknowns,
The ones we don't know
We don't know.”
Donald Rumsfeld, US Secretary of Defense (sic)
Feb. 12, 2002, Department of Defense news briefing
from "The Poetry of D.H. Rumsfeld"
http://slate.msn.com/id/2081042/
Presenting your ideas
• Create a slide show that is an outline, not a script
• Use the slide show...
– to select important information and visuals
– to organize content
– to create a hierarchy
• Many of the subsequent slides were adapted from work done by the Cain Project in Engineering & Professional Communication
• www.owlnet.rice.edu/~cainproj
Selecting Content
• Consider your audience – not everyone will have your knowledge of the problem!
• State problem/question clearly, early and repeat (in the title, in the introduction)
• Explain the significance, context
• Include background: organism/system/model
• State the point of departure for work precisely
Displaying Text
• Remember that your audience...
– skims each slide
– looks for critical points, not details
– needs help reading/ seeing text
– So keep to an outline only
• Help your audience by…
– Projecting a clear font
– Using bullets
– Using content-specific headings
– Using short phrases
– Using grammatical parallelism
Project a clear font
• Serif: easy to read in printed documents
– Times New Roman, Palatino, Garamond
• Sans serif: easy to see projected across the room
– Arial, Helvetica, Geneva
Use bullets – but not too many
• Bullets help your audience
– to skim the slide
– to see relationships between information
– to organize information in a logical way
• For example, this is Main Point 1, which leads to...
– Sub-point 1
  • Further subordinated point 1
  • Further subordinated point 2
– Sub-point 2
Use content-specific headings
• “Results” suggests the content area for a slide
• “Substance X up-regulates gene Y” (with data shown below) shows the audience what is observed
Use short phrases
• Be clear, concise, accurate
• Write complete sentences only in certain cases:
– Hypothesis / problem statement
– Quote
Difficult to read:
DNA polymerase catalyzes elongation of DNA chains in the 5’ to 3’ direction
Better:
DNA polymerase extends 5’ to 3’
Use grammatical parallelism
• Use same grammatical form in lists
• Not Parallel:
– Cells were lysed in buffer
– 5 minute centrifuging of lysate
– Removed supernatant
• Parallel:
– Lysed cells in buffer
– Centrifuged lysate for 5 minutes
– Removed supernatant
Use grammatical parallelism
How would you revise this list?
Telomeres
• Contain non-coding DNA
• Telomerases can extend telomeres
• Cells enter senescence/apoptosis when telomeres are too short
Use grammatical parallelism
One possible revision…
Telomeres
• Contain non-coding DNA
• Are extended by telomerase
• Cause senescence/apoptosis when shortened too much
Displaying visuals
• Select visuals that enhance understanding
– Figures from your work: evidence for argument
– Figures from other sources (web; review articles):
  • Model a process or concept
  • Help explain background, context
• Design easy-to-read visuals
– Are the visuals easy to read by all members of your audience?
• Draw attention to aspects of visuals
Simplify and draw attention
http://www.indstate.edu/thcme/mwking/tca-cycle.html
Cite others’ visuals
http://www.bioc.rice.edu/~shamoo/shamoolab.html
Harvey et al. (2005) Cell 122:407-20
Samples
Features to consider:
• Text
– Fonts, use of phrases, parallelism
• Visuals
– Readability, drawing attention
• Slide design
• Organization/ hierarchy
– Titles, bullets, arrangement of information, font size
The Calcium Ion
Calcium is a crucial cell-signaling molecule
–Calcium is toxic at high intracellular concentrations because of the phosphate-based energy system
–Intracellular concentrations of calcium are kept very low, which allows an influx of calcium to be a signal to alter transcription
Microarrays
Phillips G. (2004) Iowa State University College of Veterinary Medicine.
Delivery
• Physical Environment
• Stance
– Body language
– Handling notes
• Gestures
• Eye contact
• Voice quality
– Volume
– Inflection
– Pace
Handling Questions
• LISTEN
• Repeat or rephrase
• Watch body language
• Don’t pretend to know
Practical activity 6a - Developing and presenting your project
• Total duration - ca. 2 hours.
• Identify the five most important research articles that frame your hypothesis, i.e. the fundamental facts and assumptions upon which your idea is based.
• Describe the basis for your hypothesis in a paragraph of no more than seven sentences.
• Read the article by Peter Norvig on experimental design. (For Firefox users the alternative URL is here.)
• What alternative experimental approaches are available to answer your question ?
• How do you intend to verify your hypothesis?
• Identify and justify the journal you want to publish the results of your research in.
• Give a 5-slide presentation to justify your choices at the next session.
Practical activity 6b - Thinking about probability and statistics
• Total duration - ca. 3 hours.
• First read the series of articles published recently by Wai-Ching Leung in the British Medical Journal. Although intended for a medical audience, these articles provide the basis for a useful primer for almost all fields of biomedical research. The articles are:
– Why and when do we need medical statistics
– Measuring chances
– Summarising information
– Testing hypotheses
• Now answer the following questions:
– I have a plant extract which I believe has an effect on blood pressure. I measure its effects by injecting the substance into rats and measuring their blood pressure before and after the injection. The statistical test I use tells me that the probability of collecting this sample of results is less than 0.05. What does this mean ?
– 1% of women aged forty who participate in routine mammography screening have breast cancer. 80% of the women with breast cancer get a positive result. 9.6% of women without breast cancer will also get a positive result. So, if a woman from this group gets a positive result, what is the probability that she has breast cancer ?
– In the UK, car registration plates can typically consist of a string of 6 or 7 alphanumeric characters (A, B, C, etc, 1, 2, 3 etc). So the probability of a specific sequence of characters (e.g. DB1979) is less than 1 in 2 billion. I send a small group of people out into a car park and ask them to look for a registration plate that has personal significance for them. What is the likelihood of this happening ?
– A friend of mine has consistently predicted the results of 5 of the football matches leading to today's final. He is offering to sell me his prediction for the final match so that I can place a bet and make some money. What are the odds that he will predict the outcome of the last match correctly ?
– A murder is committed. Traces of your fingerprints are found on the murder weapon. What is the probability that you are guilty ?
Practical activity 6c - Presenting data
• Total duration - ca. 1 hour.
• Read Mary Purugganan's presentation about data visualisation. Identify some examples of illustrations used in recent primary research papers which illustrate some of the points she makes.