Reproducibility, Argument and Data in Translational Medicine
© 2015 Massachusetts General Hospital
Tim Clark, Ph.D., Assistant Professor of Neurology
Massachusetts General Hospital & Harvard Medical School
Massachusetts Alzheimer Disease Research Center
Seminar presentation at the Biostatistics Department, Harvard T.H. Chan School of Public Health, February 4, 2015
“It has become apparent that an alarming number of published results cannot be reproduced by other people.
That is what caused John Ioannidis to write his now famous paper, Why Most Published Research Findings Are False [1].
That sounds very strong. But in some areas of science it is probably right.”
- David Colquhoun [2]
1. Ioannidis, J.P.A. (2005) Why Most Published Research Findings Are False, PLoS Med, 2, e124.
2. Colquhoun, D. (2014) An investigation of the false discovery rate and the misinterpretation of p-values, Royal Society Open Science, 1.
Outline
• The translation gap
• The false reported discovery rate
• Attrition in pharmaceutical pipelines
• Historical background on reproducibility
• Logical status of scientific articles
• Coping strategies at the ecosystem level
• The global argument graph
• Conclusions & postscript
• ~80% to 90% of top-tier academic research cannot be reproduced in pharma target discovery labs.
• All phases of the pharma pipeline (discovery, preclinical development and clinical development) show significant attrition.
• ~90% attrition in clinical trials has huge financial and social impact, driving risk avoidance.
T1
Hay et al.(2014) Nature Biotechnology 32,40–51.Begley and Ellis (2012) Nature, 483, 531-533.Prinz et al. (2011) Nat Rev Drug Discov, 10, 712.
• Obokata et al. received extraordinary scrutiny because of its surprising conclusions.
But what proportion of more “ordinary” papers receive this type of scrutiny?
It received further scrutiny because, upon examination, fraud was uncovered.
What about non-fraudulent, but incorrect papers?
• Furthermore…
(1) It seems possible that Obokata’s fraudulent use of data came from her inability to reproduce the original Vacanti lab experiments in the RIKEN environment.
(2) We do not know whether the technique began with fraud at Harvard, or was simply “reproduced by fraud” when legitimate reproduction failed at RIKEN.
Colquhoun 2014
• “Almost universal failure of biomedical papers to appreciate what governs the false discovery rate.”
• “If you use p=0.05 to suggest that you have made a discovery, you will be wrong at least 30% of the time.”
• “If, as is often the case, experiments are underpowered, you will be wrong most of the time.”
• “To keep your false discovery rate below 5%, you need to use a three-sigma rule, or to insist on p≤0.001.”
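Colquhoun's three-sigma recommendation can be checked numerically. A minimal sketch in Python, assuming (as in his paper's worked example) that 10% of tested hypotheses are true and power is 0.8:

```python
# Expected false discovery rate at two significance thresholds,
# assuming 10% of tested hypotheses are true and power = 0.8.
prior, power = 0.10, 0.8

for alpha in (0.05, 0.001):
    true_pos = prior * power          # real effects correctly detected
    false_pos = (1 - prior) * alpha   # null effects passing the threshold
    fdr = false_pos / (true_pos + false_pos)
    print(f"alpha = {alpha}: FDR = {fdr:.1%}")  # 36.0% at 0.05, 1.1% at 0.001
```

Under these assumptions, moving the threshold from p = 0.05 to p = 0.001 drops the expected FDR from 36% to about 1%, which is why p ≤ 0.001 keeps the FDR below 5%.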
False discovery rate in diagnostic tests
• For disorder X, a test correctly diagnoses
• 95% of people without X as “false(X)” (specificity = .95) and
• 80% with X as “true(X)” (sensitivity = .80).
• Prevalence of X in the population = 1%
Diagnostic tests (contd.)
Colquhoun 2014, “An investigation of the false discovery rate and the misinterpretation of p-values”, Royal Society Open Science, 1.
False discovery rate: 86%
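The 86% figure follows from a short Bayes'-rule calculation. A minimal sketch in Python, using the slide's numbers (1% prevalence, 80% sensitivity, 95% specificity):

```python
# False discovery rate of the diagnostic test via Bayes' rule.
prevalence = 0.01    # 1% of the population has disorder X
sensitivity = 0.80   # P(test positive | X)
specificity = 0.95   # P(test negative | not X)

true_pos = prevalence * sensitivity               # 0.008 of the population
false_pos = (1 - prevalence) * (1 - specificity)  # 0.0495 of the population
fdr = false_pos / (true_pos + false_pos)
print(f"FDR = {fdr:.0%}")  # 86%
```

Even with good specificity, the 1% prevalence means false positives from the healthy 99% swamp the true positives.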
Drug screening
• Assume drug candidates work in 10% of cases.
• Power = 0.8, significance level 0.05
• False discovery rate = 45/(45+80)=36%
False discovery rate: 36%
“We optimistically estimate the median statistical power of studies in the neuroscience field to be between about 8% and about 31%.”
Button et al. 2013 Nature Reviews Neuroscience 14: 365-376
Underpowered
sensitivity = 0.2: 20% test positive (20 true positive tests),
80% test negative (80 false negative tests)
• False discovery rate = 45/(45+20)=69%
False discovery rate: 69%
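Both screening FDRs come from the same formula, with only the power changed. A minimal sketch in Python (the function name is illustrative, not from the talk):

```python
def false_discovery_rate(prior, power, alpha):
    """Expected FDR when a fraction `prior` of tested candidates truly work."""
    true_pos = prior * power         # working candidates that test positive
    false_pos = (1 - prior) * alpha  # inert candidates that test positive
    return false_pos / (true_pos + false_pos)

# Well-powered screen: 10% real effects, power 0.8, alpha 0.05
print(f"{false_discovery_rate(0.10, 0.8, 0.05):.0%}")  # 36%

# Underpowered screen, power 0.2 (cf. Button et al. 2013)
print(f"{false_discovery_rate(0.10, 0.2, 0.05):.0%}")  # 69%
```

Halving power twice does not change the 45 false positives per 1000 candidates; it only shrinks the true positives they are compared against, so the FDR climbs from 36% to 69%.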
Pharma attrition & productivity
attrition = 95.9%
$1.78 billion per new drug
Paul, S.M., et al. (2010) How to improve R&D productivity: the pharmaceutical industry's grand challenge, Nat Rev Drug Discov, 9, 203-214.
target selection
?
“Improving the quality of target selection is the single most important factor to transform industry productivity and bring innovative new medicines to patients.”
Bunnage, M.E. (2011) Getting pharmaceutical R&D back on target, Nat Chem Biol, 7, 335-339.
Reproducibility
“Virtual witnessing” for those not present, using the new information technology of the scientific journal & the scientific article.
c. 1660: Robert Boyle and colleagues concerned with the scientific validity of claims, e.g. “transformation of lead into gold”…
Scientific facts will now be established by reproducible demonstration before a “jury of one’s peers”.
adapted from Steven Shapin (1984), Pump and Circumstance: Robert Boyle’s Literary Technology. Social Studies of Science 14(4):481-520
BOYLE: “We took a large and lusty frog and having included him in a small receiver we drew out the air not very much and left him very much swelled and able to move his throat from time to time - though not so fast as when he freely breathed before the exsuction (extraction) of the air. He continued alive about two hours that we took notice of, sometimes removing from one side of the receiver to the other, but he swelled more than before, and did not appear by any motion of his throat or thorax (chest) to exercise respiration. But his head was not very much swelled, nor his mouth forced open. After he had remained there somewhat above 3 hours, for it was not 3 hours and an half, perceiving noe signe of life in him, we let in the air upon him, at which the formerly tumid (swelled) body shrunk very much, but seemed not to have any other change wrought in it and though we took him out of the receiver yet in the free air it self, he continued to appear stark dead nevertheless to see the utmost of the experiment having caused him to be carried into a garden and layd upon the grass all night, the next morning we found him perfectly alive again.” (BP 18, fol. 127r)
adapted from Carusi 2015, “Virtual Witnessing”, in Future of Research Communications & eScholarship, Mathematical Institute, Oxford UK, 11-12 January 2015.
Logical status of a scientific article
Definition: A scientific article is
1. a defeasible argument for claims; supported by
2. exhibited, reproducible data and methods, and
3. explicit references to other work in the domain;
4. described using domain-agreed technical terminology.
5. It exists in a complex ecosystem of technologies, people and activities.
Efforts to improve the ecosystem
• Mandatory open access
• Direct data citation & archiving
• Methods cataloging & ID
• Open annotation (W3C OA)
• Micro- & nano-publications μPub
• Reproducibility initiative
“Micropublications” may be used to construct a graph of the discussion and evidence, including challenges.
Clark, Ciccarese & Goble: Micropublications: a Semantic Model of Claims, Evidence, Argument and Annotation for Biomedical Communication. Journal of Biomedical Semantics 2014 5:28 (http://www.jbiomedsem.com/content/5/1/28/abstract).
IPS: http://www.ebi.ac.uk/efo/EFO_0004905
Stem Cell: http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#C12662
Semantic Tags
http://purl.org/mp/mp:claim
http://purl.org/mp/mp:supportedBy
http://purl.org/mp/mp:data
Micropublication
Micropublication semantic summary
{ :MP3 rdf:type mp:Micropublication ;
    mp:name "MP(a3)" ;
    mp:description "Digital summary of Spillman et al. 2010" ;
    pav:authoredBy [ a foaf:Person ; foaf:name "Tim Clark" ] ;
    pav:createdBy [ a foaf:Person ; foaf:name "Tim Clark" ] ;
    pav:createdOn "2013-03-06T09:49:12-05:00"^^xsd:dateTime ;
    mp:argues :C3 ;
    mp:supportedBy <info:doi:10.1371/journal.pone.0009979> . } .

:MP3 = {
  :S1 rdf:type mp:Statement ;
    mp:hasContent "Rapamycin [is] an inhibitor of the mTOR pathway." ;
    mp:supportedBy <info:doi/10.1038/nature08221> .

  :S2 rdf:type mp:Statement ;
    mp:hasContent "PDAPP mice accumulate soluble and deposited Aβ and develop AD-like synaptic deficits as well as cognitive impairment and hippocampal atrophy." ;
    mp:supportedBy <info:doi/10.1073/pnas.96.6.3228> .

  :S3 rdf:type mp:Statement ;
    mp:hasContent "Rapamycin-fed transgenic PDAPP mice showed improved learning (Figure 1a) and memory (Figure 1b). We observed significant deficits in learning and memory in control-fed transgenic PDAPP animals." ;
    mp:supportedBy <http://www.jneurosci.org/content/20/11/4050> .

  :M1 rdf:type mp:Procedure ;
    mp:hasName "Rapamycin-supplemented mouse diet protocol" ;
    mp:hasContent "We fed a rapamycin-supplemented diet... or control chow to groups of PDAPP mice and littermate non-transgenic controls for 13 weeks. At the end of treatment (7 mo), learning and memory were tested using the Morris water maze." .

  :M2 rdf:type mp:Material ;
    mp:hasName "PDAPP J20" ;
    mp:hasDescription "Lennart Mucke's PDAPP J20 transgenic mice, as obtained from JAX, stock#006293" ;
    mp:describedBy <http://jaxmice.jax.org/strain/006293.html> .

  :D1 rdf:type mp:Data ;
    pav:retrievedFrom <http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0009979#pone-0009979-g001> ;
    mp:supportedBy :M1, :M2 .

  :C3 rdf:type mp:Claim ;
    mp:hasContent "Inhibition of mTOR by rapamycin can slow or block AD progression in a transgenic mouse model of the disease." ;
    mp:supportedBy :S1, :S2, :S3, :D1 . } .
W3C Open Annotation Model
<body1> a cnt:ContentAsText, dctypes:Text ;
    cnt:chars "content" ;
    dc:format "text/plain" .

<target1> dc:format "application/pdf" .

<anno1> a oa:Annotation ;
    oa:hasBody <body1> ;
    oa:hasTarget <target1> .
RDF
Micropublication of Obokata’s original claims & data
Micropublication of discussion from PubPeer & RIKEN
But is this really such a great idea?
Does failure to reproduce invalidate the original experiment, or the reproduction experiment?
Transparency vs. Reproducibility
• Both require significant effort to achieve progress, but transparency is more pragmatic.
• Transparency should naturally lead to more rapid correction/validation/responsibility.
• Open licenses will facilitate assessment of reproducibility in transparent content.
• Innovation and standardization needed in filtering and identification of most reproducible works.
adapted with thanks, from a talk by Iain Hrynaszkiewicz, Nature Publishing Group, on “Transparency vs. Reproducibility”, Mathematical Institute, Oxford UK, Jan. 11, 2015
Should Scholarly Research Aim for Reproducibility or Robustness?
Reproducibility: the ability of an entire experiment or study to be reproduced, ideally following the same experimental description and procedure.
Robustness: the ability of a phenomenon or finding to be detected reliably while the variables of the test system are altered.
A robust concept can be observed without failure under a variety of conditions.
A robust finding may be (biologically) more relevant than a merely reproducible one.
⇨ Robustness of data may be key
adapted with thanks, from a talk by Thomas Steckler, Janssen Pharmaceuticals, on “Reproducibility vs. Robustness”, Mathematical Institute, Oxford UK, Jan. 11, 2015
Conclusions
• The false reported discovery rate (FRDR) is a systemic problem in biomedical research and communication.
• FRDR drives up pharmaceutical attrition and the cost of health care, and negatively impacts translation (T1-T4).
• There are statistical, ethical, informatics and social components to scientific reproducibility - all of which need to be addressed.
Ernest Rutherford: “All science is either physics or stamp collecting.”
Paraphrase: Physics is the best and most rigorous of all scientific enterprises, i.e., the “gold standard”.
Historical values of the speed of light
• pre-17th century: ∞ (instantaneous)
• 1638 Galileo: at least 10 times faster than sound
• 1675 Ole Roemer: 200,000 km/s
• 1728 James Bradley: 301,000 km/s
• 1849 Hippolyte Louis Fizeau: 313,300 km/s
• 1862 Leon Foucault: 299,796 km/s
• Today: 299,792.458 km/s