Confidential: For Review Only
Impact of funding source and disease studied on clinical trial dissemination and design: a large-scale integrative analysis of clinicaltrials.gov and PubMed data
Journal: BMJ
Manuscript ID BMJ.2017.041628.R1
Article Type: Research
BMJ Journal: BMJ
Date Submitted by the Author: 12-Jan-2018
Complete List of Authors:
Zwierzyna, Magdalena; BenevolentBio Ltd, Biomedical Informatics; University College London, Institute of Cardiovascular Science
Davies, Mark; BenevolentBio Ltd
Hingorani, Aroon; University College London, Institute of Cardiovascular Science; Farr Institute of Health Informatics
Hunter, Jackie; BenevolentBio Ltd; St George's Hospital Medical School
Keywords: clinical trials, research transparency, study design, data analysis, data integration, trial reporting, publication bias
https://mc.manuscriptcentral.com/bmj
BMJ
Impact of funding source and disease studied on clinical trial dissemination and design
A large-scale integrative analysis of clinicaltrials.gov and PubMed data
Magdalena Zwierzyna, biomedical data scientist1,2,*, Mark Davies, VP of biomedical informatics1, Aroon D Hingorani, professor, consultant2,3, Jackie Hunter, CEO, professor1,4
1BenevolentBio Ltd, London, United Kingdom
2Institute of Cardiovascular Science, University College London, London, United
Kingdom
3Farr Institute of Health Informatics, London, United Kingdom
4St George’s Hospital Medical School, London, United Kingdom
*Corresponding author
Magdalena Zwierzyna
Email: [email protected] (MZ)
Tel: +44 (0) 203 096 0720
BenevolentBio Ltd
40 Churchway
NW1 1LW London
United Kingdom
Word count: 5,084
Abstract
Objective: To investigate the distribution, design characteristics, and dissemination of clinical trials
by funding organisation and medical specialty.
Design: Cross-sectional descriptive analysis.
Data sources: Trial protocol information from clinicaltrials.gov, metadata of journal articles in which
trial results were published (PubMed), and quality metrics of associated journals from SCImago
Journal & Country Rank database.
Selection criteria: All 45,620 clinical trials evaluating small-molecule therapeutics, biologics,
adjuvants, and vaccines, completed after January 2006 and prior to July 2015, including randomized
controlled trials (RCTs) and non-randomized studies across all clinical phases.
Results: Industry was more likely to fund large international RCTs compared to non-profit funders,
although methodological differences have been decreasing with time. Amongst 27,835 completed
efficacy trials (Phase 2-4), 54.2% had disclosed their findings publicly. Industry was more likely to
disseminate trial results compared to non-profit trial funders (59.3% vs 45.3%), with large
pharmaceutical companies having higher disclosure rates than small ones (66.7% vs 45.2%). NIH-
funded trials were disseminated more often than those of other non-profit institutions (60.0% vs
40.6%). Big Pharma and NIH-funded study results were more likely to appear in clinicaltrials.gov
than those from non-profit funders, which were published mainly as journal articles. Trials reporting
the use of randomization were more likely to be published in a journal article compared to non-
randomized studies (34.9% vs 18.2%), and journal publication rates varied across disease areas,
ranging from 42% for autoimmune diseases to 20% for oncology.
Conclusions: Trial design and result dissemination varies substantially depending on the type and
size of funding institution as well as the disease area under study.
What this paper adds
What is already known on this subject:
• Suboptimal study design and slow dissemination of findings are common amongst registered
clinical trials, with almost half of all studies remaining unpublished years after completion.
• A wider range of organisations is now funding clinical trials.
• Reporting rates and other quality measures may vary by the type of funding organization.
What this study adds:
• Design characteristics (as a marker of methodological quality) vary by funder and medical
specialty, and associate with reporting rates and the likelihood of publication in high-impact
journals.
• Interventional trials with industrial funding are, on average, more robustly designed than
those funded by non-commercial organizations.
• Industry-funded trials are also more likely to be disclosed than those from other funders,
although small pharmaceutical organizations lag behind “Big Pharma”.
• Trials with NIH funding are more likely to disclose results compared to other academic and
non-profit organizations.
• Big Pharma and NIH-funded studies show high rates of reporting within the clinicaltrials.gov
results database.
• Trials funded by other academic and non-profit organizations were disclosed primarily as
journal articles, and rarely submitted to clinicaltrials.gov.
• Result dissemination rate varies by medical specialty, mainly due to differential journal
publication rates, ranging from 42% for autoimmune diseases to 20% for oncology.
INTRODUCTION
Well conducted clinical trials are widely regarded as the best source of evidence on the efficacy and
safety of medical interventions.[1] Trials of first-in-class drugs also provide the most rigorous test of
causal mechanisms in human disease.[2] Clinical trial findings inform regulatory approvals of novel
drugs, key clinical practice decisions and guidelines, and fuel the progress of translational
medicine.[3, 4] However, this model relies on trial activity being high-quality, transparent, and
discoverable.[5] Randomization, blinding, adequate power, and a clinically relevant patient
population are among the hallmarks of high-quality trials.[1, 5] In addition, timely reporting of
findings, including negative and inconclusive results, ensures fulfilment of ethical obligations to trial
volunteers and avoids unnecessary duplication of research and the creation of biases in the clinical knowledge base.[6, 7]
To improve the visibility and discoverability of clinical trials, the Food and Drug Administration (FDA)
mandated the registration of interventional efficacy trials and public disclosure of their results.[8-11]
Currently, 17 registries store records of clinical studies.[12, 13] Clinicaltrials.gov, the oldest and
largest of such platforms,[14, 15] serves as both a continually updated trial registry and a database
of trial results, thus offering an opportunity to explore, examine, and monitor the clinical research
landscape.[10, 14]
Previous studies of clinical research activity used registered trial protocols and associated published
reports to investigate trends in quality, reporting rate, and potential for publication bias (e.g.[14, 16-
21]). However, many previous studies relied on time-consuming manual literature searches, which
confined their scope to pivotal randomized controlled trials or research funded by major trial
sponsors. Large-scale analysis of relationships between a wide range of protocol characteristics and
result dissemination rates, and the differences between large and small organizations has not been
previously undertaken. This is important, given a reported growth in trials funded by small industrial
and academic institutions.[22-26]
To address this gap, we performed the most comprehensive analysis to date of all interventional
studies of small molecule and biologic therapies registered on clinicaltrials.gov, up to July 19, 2017.
We linked trial registry entries to the publication of findings in journal articles, and used descriptive
statistics, semantic enrichment, and data visualization methods to investigate and illustrate the
landscape and evolution of registered clinical trials. Specifically, we investigated the influence of
funder type and disease area on the design and publication rates of trials across a wide range of
medical journals; changes over time; and relationships to reported attrition rates in drug
development. To facilitate comparison across diverse study characteristics, we designed a novel
visualization method allowing rapid, intuitive inspection of basic trial properties.
METHODS
Our large-scale cross-sectional analysis and field synopsis of the current and historical clinical trial
landscape uses a combination of basic trial protocol information and published trial results. The
developed data processing and integration workflow manipulates and links data from
clinicaltrials.gov with information about journal articles in which trial results were published.
Clinicaltrials.gov data processing and annotation
We downloaded the clinicaltrials.gov database as of July 19, 2017 – nearly two decades after the
FDA Modernization Act (FDAMA) legislation requiring trial registration.[8] The primary dataset
encompassed 249,904 studies in total. We then identified records of all pharmaceutical and
biopharmaceutical trials, defined as clinical studies evaluating small-molecule therapeutics,
biologics, adjuvants, and vaccines.[27] To extract the relevant subset, the Study type field was
restricted to “interventional” and Intervention type was restricted to either “Drug” or “Biological”.
This led to a set of 119,840 clinical trials with a range of organizations, disease areas, and design
characteristics (Suppl. File 1).
To enable robust aggregate analyses, we processed and further annotated the dataset. Firstly, date
formats and age eligibility fields were harmonised. Next, wherever possible, missing categories were
inferred from available fields as described by Califf and colleagues[16]. For example, for single-arm
interventional trials with missing allocation and blinding type annotations, allocation was assigned as
“non-randomized” and blinding as “open label”.[16] The dataset was further enriched with derived
variables. For example, trial duration value was derived from available trial start and primary
completion date; trial locations were mapped to continents and flagged if they involved emerging
markets, and trials making use of placebo and/or comparator controls were labelled based on
information from intervention name, arm type, trial title, and description field. For instance, if a
study had an arm of type “Placebo Comparator” or an intervention whose name included words
“placebo”, “vehicle” or “sugar pill”, it was annotated as “placebo-controlled”; similarly, studies with
an arm of type “Active Comparator” were classified as “comparator-controlled”; see Suppl. File 2 for
details. Finally, each trial was annotated with the number of outcome measures and eligibility
criteria following parsing of the unstructured eligibility data field as described by Chondrogiannis et
al.[28]
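As an illustration, the control-arm labelling rules described above can be sketched in Python (a hypothetical re-implementation with assumed field names, not the actual pipeline code):

```python
# Hypothetical sketch of the rule-based control-arm labelling described
# in the text; the trial dict layout and field names are assumptions.
PLACEBO_WORDS = ("placebo", "vehicle", "sugar pill")

def annotate_controls(trial):
    """Label a trial dict as placebo- and/or comparator-controlled."""
    arm_types = {a.lower() for a in trial.get("arm_types", [])}
    names = " ".join(trial.get("intervention_names", [])).lower()
    placebo = ("placebo comparator" in arm_types
               or any(w in names for w in PLACEBO_WORDS))
    comparator = "active comparator" in arm_types
    return {"placebo_controlled": placebo, "comparator_controlled": comparator}
```

In the full workflow the same keyword matching is also applied to the trial title and description fields.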
To categorize clinical trials by medical specialty, we mapped reported disease condition names to
corresponding Medical Subject Headings (MeSH) terms (from both MeSH Thesaurus and MeSH
Supplementary Concept Records)[29] using a dictionary-based named entity recognition
approach.[30] A total of 21,884 distinct condition terms were annotated with MeSH identifiers,
accounting for 86.4% of all disease names from 88% of the clinical trials. The mapping enabled
normalization of various synonyms to preferred disease names as well as automatic classification of
diseases into distinct categories using the MeSH hierarchy. To compare our results with previously
published clinical trial success rates, we focused on seven major disease areas investigated in the
study by Hay et al.[31]: oncology, neurology, autoimmune, endocrine, respiratory, cardiovascular,
and infectious diseases. Trial conditions annotated with MeSH terms were mapped to these
categories based on matching terms from levels 2 and 3 of the MeSH term hierarchy; see Suppl. File
2 for details on MeSH term assignment and disease category mapping.
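The dictionary-based normalization at the heart of this step can be sketched as follows (the synonym dictionary is illustrative; in the actual analysis it was built from the MeSH Thesaurus and Supplementary Concept Records):

```python
# Minimal sketch of dictionary-based named entity recognition for
# normalizing reported condition names to MeSH; entries are illustrative.
MESH_DICT = {
    "heart attack": ("D009203", "Myocardial Infarction"),
    "myocardial infarction": ("D009203", "Myocardial Infarction"),
}

def map_condition(name):
    """Return (MeSH ID, preferred name) for a reported condition, or None."""
    return MESH_DICT.get(name.strip().lower())
```

Normalizing synonyms to a single identifier is what allows trials reporting "heart attack" and "myocardial infarction" to be counted within the same disease category.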
To categorize studies by funder type, we first extended the agency classification available from
clinicaltrials.gov. Specifically, we used two main categories: industry and non-profit organizations;
and four sub-categories: Big Pharma, Small Pharma, NIH, and Other; the last category corresponding
to universities, hospitals, and other non-profit research organizations.[32, 33] Industrial
organizations were classified as “Big Pharma” and “Small Pharma” based on their sales in 2016 as
well as overall number of registered trials they sponsored. Specifically, an industrial organization was
classed as “Big Pharma” if it featured on the list of 50 companies from the 2016 Pharmaceutical
Executive ranking[34] or if it sponsored more than 100 clinical trials; otherwise it was classified as
“Small Pharma”; see Suppl. File 2. Next, we derived the likely source of funding for each trial based
on lead_sponsor and collaborators data fields, using a previously described method[16] extended to
include the distinction between big and small commercial organizations. Briefly, if NIH was listed
either as the lead sponsor of a trial or a collaborator, the study was classified as NIH-funded. If NIH
was not involved, but either the lead sponsor or a collaborator was from industry, the study was
classified as funded by either “Big Pharma” if it included any Big Pharma organization or “Small
Pharma” otherwise. Remaining studies were assigned “Other” as the funding source.
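A minimal sketch of these funder-assignment rules (the organisation sets below are placeholders for the full lists used in the analysis):

```python
# Sketch of the funder-assignment rules described above; BIG_PHARMA and
# NIH_NAMES are placeholder sets, not the full lists used in the paper.
BIG_PHARMA = {"GSK", "Pfizer"}   # e.g. 2016 Pharm Exec top 50, or >100 trials
NIH_NAMES = {"NIH", "National Cancer Institute"}

def classify_funder(lead_sponsor, collaborators, industry_orgs):
    """Return one of 'NIH', 'Big Pharma', 'Small Pharma', 'Other'."""
    orgs = {lead_sponsor, *collaborators}
    if orgs & NIH_NAMES:                      # NIH as sponsor or collaborator
        return "NIH"
    involved_industry = orgs & industry_orgs  # any commercial organisation
    if involved_industry:
        return "Big Pharma" if involved_industry & BIG_PHARMA else "Small Pharma"
    return "Other"
```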
Establishing links between trial registry records and published results
We linked the trial registry records to their disclosed results, either deposited directly on
clinicaltrials.gov or published as a journal article, to establish if the results of a particular trial had
been disseminated. We implemented the approach described by Powell-Smith and Goldacre[20] to
identify eligible trials that should have publicly disclosed results. These were studies completed after January 2006 and by July 2015 (to allow sufficient time for publication) that had not filed an application to delay submission of the results (based on the firstreceived_results_disposition_date field in the registry).[20] We included studies from all organizations (even minor institutions)
regardless of how many trials they funded. The selection process led to a dataset of 45,620 studies,
including 27,835 efficacy trials (Phase 2-4); summarized by the flow diagram in Suppl. File 1.
For each eligible trial, we searched for published results using two methods. Firstly, we searched for
structured reports submitted directly to the results database of clinicaltrials.gov (based on the
clinical_results field). Next, we queried the PubMed database with NCT numbers of eligible trials
using an approach that involves searching for the NCT number in the title, abstract, and Secondary
Source ID field of PubMed-indexed articles.[20, 35, 36] To exclude study protocols, commentaries,
and other non-relevant publication types, we used the broad “Therapy” filter – a standard validated
PubMed search filter for clinical queries[37] – as well as simple keyword matching for “study
protocol” in publication titles.[20] In addition, we excluded articles published before trial
completion.
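The article-filtering step can be sketched as below (assuming candidate articles have already been retrieved; the dict field names are illustrative, not the actual PubMed record schema):

```python
# Sketch of the NCT-number matching and exclusion rules described above.
# Article dicts and their field names are assumptions for illustration.
from datetime import date

def links_trial(article, nct_id, completion_date):
    """True if an article plausibly reports results for the given trial."""
    text = (article["title"] + " " + article["abstract"]).lower()
    mentions_nct = (nct_id.lower() in text
                    or nct_id in article.get("secondary_ids", []))
    is_protocol = "study protocol" in article["title"].lower()
    published_after = article["pub_date"] > completion_date
    return mentions_nct and not is_protocol and published_after
```

In the actual workflow the PubMed "Therapy" clinical-query filter is also applied to exclude further non-relevant publication types.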
We then determined the proportion of publicly disseminated trial reports for the studies covered by
mandate for result submission (efficacy trials: Phase 2-4) as well as for all completed drug trials
including those not covered by the FDA Amendment Act (Phase 0-1 trials and studies with phase set
to “n/a”). In addition to result dissemination rate, we calculated the dissemination lag for individual
studies to determine the proportion of reports disclosed within the required 12 months after
completion of a trial. We defined dissemination lag as time between study completion and first
public disclosure of the results (whether through submission to clinicaltrials.gov or publication in a
journal).
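The lag computation amounts to the following sketch (dates and the 12-month window as defined above):

```python
# Sketch of the dissemination-lag computation: days from study completion
# to the earliest public disclosure, compared against the 12-month window.
from datetime import date

def dissemination_lag_days(completion, disclosure_dates):
    """Days to first disclosure, or None if the trial was never disclosed."""
    if not disclosure_dates:
        return None
    return (min(disclosure_dates) - completion).days

def within_12_months(lag_days):
    return lag_days is not None and lag_days <= 365
```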
Categorizing scientific journals
To investigate the relation between characteristics of published trials and quality metrics of scientific
journals, we classified journals as high or low impact based on journal metrics information from
SCImago Journal & Country Rank, a publicly available resource that includes journal quality
indicators developed from the information contained in the Scopus database.[38] Specifically, a
journal was classified as high-impact if its 2016 H-index was larger than 400 and as low-impact
if its H-index was below 50.
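As a sketch (the handling of journals between the two thresholds is our assumption; the text defines only the two extreme classes):

```python
# Journal classification by SCImago H-index, as described above.
# The "intermediate" label for journals between 50 and 400 is an
# assumption; the paper only defines the high- and low-impact classes.
def classify_journal(h_index):
    if h_index > 400:
        return "high-impact"
    if h_index < 50:
        return "low-impact"
    return "intermediate"
```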
Statistical analysis and data visualization
We examined the study protocol data on 15 characteristics available from clinicaltrials.gov and from
our derived annotations. These were: (1) randomization; (2) interventional study design (e.g. parallel
assignment, sequential assignment or crossover studies); (3) masking (open-label, single-blinded,
double-blinded studies); (4) enrolment (total number of participants in completed studies or
estimated number of participants for ongoing studies); (5) arm types (placebo or active comparator);
(6) duration; (7) number of outcomes (clinical endpoints); (8) number of eligibility criteria; (9) study
phase; (10) funding source; (11) disease category; (12) countries; (13) number of locations; (14)
publication of results; (15) reported use of DMCs (Data monitoring committees). Further
explanation and definitions of trial properties are available in Suppl. File 2.
Based on these characteristics, we examined the methodological quality of trials funded by
different types of organizations and compared studies published in higher and lower impact journals
and reported on clinicaltrials.gov. In addition, we analysed trial dissemination rates across funding
categories and disease areas and compared them with clinical success rates reported previously.[31]
We used descriptive statistics and data visualization methods to characterize trial categories. For
categorical data we report frequencies and percentages; for continuous variables, we report medians and interquartile ranges. To visualize multiple features of clinical trial design, we used radar plots with
each axis showing the fraction of trials with a different property, such as reported use of
randomization, blinding or active comparators. We converted three continuous variables to binary
categories: we classified trials as “small” if they enrolled fewer than 100 participants, as “short” if
they lasted less than 365 days, and as “multi-country” if trial recruitment centres were in more than
one country. Further details are available in supplementary tables. To facilitate comparisons
between several radar plots, we always standardized the order and length of axes. We used Python for all data analysis tasks (scikit-learn, pandas, scipy, and numpy libraries), for working with geographical data (geonamescache, geopy, and Basemap libraries), and for data processing and visualization (matplotlib, Basemap, and seaborn libraries).
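The binary flags and per-axis fractions can be sketched as follows (thresholds as given above; the trial dict field names are assumptions):

```python
# Sketch of the binary flags used as radar-plot axes, following the
# thresholds in the text (enrolment < 100, duration < 365 days, more
# than one recruitment country). Field names are illustrative.
def radar_flags(trial):
    return {
        "small": trial["enrollment"] < 100,
        "short": trial["duration_days"] < 365,
        "multi_country": len(set(trial["countries"])) > 1,
    }

def axis_fraction(trials, flag):
    """Fraction of trials with a given flag: one radar-plot axis value."""
    return sum(radar_flags(t)[flag] for t in trials) / len(trials)
```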
Patient involvement
Patients were not involved in any aspect of the study design, conduct, or in the development of the
research question. The study is an analysis of existing publicly available data and therefore there was
no active patient recruitment for data collection.
RESULTS
Overview of clinical trial activity by funder type
To date, 119,840 pharmaceutical trials have been registered with clinicaltrials.gov. As illustrated by
Fig 1A, the distribution of trials funded by different types of organizations has changed over time. In
particular, there has been a rise in the proportion of studies funded by organizations other than
industry or NIH – mainly universities, hospitals and other academic and non-profit agencies,
classified here as “Other”. Overall, these institutions funded 36.5% of all registered pharmaceutical
trials, followed by Big Pharma with 31%, and Small Pharma with 21.2% of studies. NIH funded 11.3%
of registered pharmaceutical trials (Fig 1B).
Amongst the 10 main study funders ranked by the overall number of trials, two belonged to the
“NIH” category (National Cancer Institute leading with 7,219 trials) and eight to the “Big Pharma”
category (GSK at the top with 3,171 trials). Out of 4,914 “Small Pharma” organizations, 73.6% had
fewer than five trials, whilst 43.1% funded only one study. Similarly, amongst the 6,468 funders
classified as “Other”, 79.2% had fewer than five trials, whilst 32.7% funded only one study.
The approach outlined in the Methods section identified 45,620 clinical trials in our dataset that
should also have available published results. The remaining analysis in the Results section focuses on
this subset of studies.
Trial design characteristics
The dataset included various types of study designs from large randomized clinical trials (RCTs) to
small single-site studies, many of which lacked controls (Suppl. File 2). Comparative analysis showed
that methodological characteristics of trials varied substantially across clinical phases (Fig 2A),
disease areas (Fig 2B), and funding source (Fig 3). The following section will focus on a comparative
analysis by funder type whilst detailed overview of differences across phases and diseases is
available in Suppl. File 2.
Industrial organizations were more likely to fund large international trials and more often reported
the use of randomization, controls and blinding, compared to non-industrial funders. Whilst
differences were evident across all clinical phases (Suppl. File 2), the largest variability was observed
for Phase 2 trials, illustrated by Fig 3. For instance, 68.5% of all industry-funded Phase 2 trials were
randomized (compared to 39.4% and 50.6% of studies funded by NIH and other institutions,
respectively). Studies with industrial funding were also substantially larger (median 80 patients
compared to 43 and 48) and more likely to include international locations (31.3% for industry vs
4.0% for Other and 11.0% for NIH). Despite larger enrollment size, industry-funded studies were
shorter on average compared to trials by other funders (median 547 days compared to 1,431 days
for NIH and 941 for “Other” funding category). Non-industrial funders were more likely to report the
use of data monitoring committees (DMCs): 41.9% of trials with NIH funding and 48.1% of studies
funded by other organizations involved DMCs, compared to 26.7% of trials funded by industry.
Design characteristics did not differ substantially between Big and Small Pharma (Suppl. File 2).
Trial design changed substantially over time, with the trends most evident in Phase 2. As shown in
Fig 4, the methodological differences between industry and non-industrial funders have generally
decreased over time. Although overall a higher proportion of industry-led studies reported the use
of randomization, the proportion of RCTs funded by non-profit organizations in Phase 2 has recently
increased (Fig 4). Notably, the use of blinding is still much lower in non-commercial than commercial
trials. Additional analysis showed that the average number of trial eligibility criteria and outcome
measures have increased over time, leading to a greater number of procedures performed in an
average study, and hence to increase in trial complexity (Suppl. File 2).
Trial results dissemination and characteristics of published trials
Of the 45,620 completed trials, 27,835 were efficacy studies in Phases 2-4 covered by the mandate
for results publication (Suppl. File 1). Of these, only 15,084 (54.2%) have disclosed their results
publicly. Consistent with previously reported trends,[39, 40] dissemination rates were lower for
earlier clinical phases (not covered by the publication mandate) as only 23.4% of trials in Phases 0-1
had publicly disclosed results (Suppl. File 2).
Detailed analysis of Phase 2-4 trials showed that only 25.2% of published studies disclosed their
results within the required period of 12 months after completion. The median time to first public reporting was 18.6 months, whether through direct submission to clinicaltrials.gov or publication in a
journal. Dissemination lag was smaller for results submitted to clinicaltrials.gov (median 15.3
months) compared to results published in a journal (median 23.9 months).
A total of 10,554 (37.9%) Phase 2-4 studies disclosed their results through structured submission to clinicaltrials.gov, 8,338 (29.95%) through a journal article, and 3,808 (13.7%) disclosed through both
resources. Scientific articles were published in 1,454 titles of diverse nature and varying impact
factor. These ranged from prestigious general medical journals (BMJ, JAMA, Lancet, NEJM) to
narrower-focus specialist journals with no impact information available.
Journal publication status depended on study design. Trials reporting the use of randomization were
more likely to be published compared to non-randomized studies (34.9% vs 18.2%), and large trials
enrolling more than 100 participants were more likely to be published than studies with fewer than
100 participants (39.3% vs 20.5%).
In general, studies whose results were disclosed in the form of a scientific publication were more
likely to follow the ideal RCT design paradigms compared to those whose results were submitted to
clinicaltrials.gov only (Fig 5). Furthermore, a similar trend was observed for trials with results
published in higher vs. lower impact journals. For instance, 89.9% of trials published in journals with
H-index > 400 reported the use of randomization, compared to 79% of trials published in journals
with H-index < 50, and 60.4% of trials whose results were submitted to clinicaltrials.gov only (Fig 5).
The characteristics of trials disclosed solely through submission to clinicaltrials.gov did not
substantially differ from those that were not disseminated in any form (Suppl. Files 2 and 3).
Trial dissemination rates across funder types
Trial dissemination rates varied amongst the different funding source categories. In general,
industry-funded studies were more likely to disclose results compared to non-commercial trials
(59.3% vs 45.3% of all completed Phase 2-4 studies). They were also more likely to report the results
on clinicaltrials.gov, overall (45.6% vs 24.4%) and within the required period of 12 months after
completion (23.3% vs 10.3%). Notably, only a small difference was observed for journal publication
rates (30.98% vs 28.4%). Additional analysis showed substantial differences across more granular
funder categories (Fig 6A). Among non-profit funders, there was a gap between NIH-funded studies
and those funded by other academic and non-profit organizations: 60% vs 40.6% of trials being
disclosed from the two categories, respectively. Similarly, among the industry-funded clinical trials,
Big Pharma had a higher result dissemination rate compared to Small Pharma (66.7% vs 45.2% of
studies).
In addition to overall reporting rates, the analysis also highlighted differences in the nature and
timing of results dissemination across trial funding sources (Fig 6A-D). Big Pharma had the highest
dissemination rate, both in terms of clinicaltrials.gov submission (53.2% of studies) and journal
publications (34.8%). By contrast, funders classified as “Other” showed the lowest percentage of
trials published on clinicaltrials.gov (15.9%). Organizations from this category disclosed trial results
primarily through journal articles (29.1% of all studies) and were the only funder type with fewer
trials reported through clinicaltrials.gov submission than through journal publication. The time to reporting on clinicaltrials.gov was shortest for industry-funded trials and longest for studies funded by organizations classed as "Other"; publication lag for journal articles showed the reverse
profile (see Fig 6C-D and Suppl. File 2 for detailed statistics).
Analysis of the fraction of trials with results disclosed within the required 12 months after
completion showed that dissemination rates have been growing steadily over time, with the largest
increase observed after 2007, when FDAAA came into effect.[9] Further analysis showed that the
observed growth was largely driven by increased timely reporting on clinicaltrials.gov, most evident
for studies funded by Big Pharma and NIH (Suppl. File 2).
Trial dissemination rates across disease categories
Disease areas with highest result dissemination were autoimmune diseases and infectious diseases,
whilst neurology and oncology showed the lowest dissemination rates (Fig 7). As shown in Fig 7A,
these differences were largely driven by differential trends in journal publication, with less variability
observed for rates of result reporting on clinicaltrials.gov. Oncology in particular had consistently the
lowest journal publication rate across all clinical phases and funder categories.
Two disease categories with the highest fraction of published trials (autoimmune and infectious
diseases) were also previously associated with highest likelihood of new drug approval (LoA) (12.7%
and 16.7%, respectively). Similarly, two disease areas showing the lowest proportion of published
trials also had the lowest drug approval rates: neurology (LoA of 9.4%) and oncology (6.7%).[31]
Cardiovascular diseases had a relatively high publication rate, but low clinical trial success rate with
only 7.1% of tested compounds eventually approved for marketing.[31]
DISCUSSION
This analysis, enabled by automated data-driven analytics, provided several important insights into
the scope of research activity, design, visibility and reporting of clinical trials of pharmaceutical
interventions. There has been a growth and diversification of trial activity with a more diverse range
of profit and non-profit organizations undertaking clinical research across a wide range of disease
areas, and the analysis highlighted potentially important differences in trial quality and
dissemination.
In terms of trial design, industry-funded studies tended to use more robust methodology compared
to studies funded by non-profit institutions, although the methodological differences have been
decreasing with time. Result dissemination rates varied substantially depending on disease area as
well as the type and size of funding institution. In general, the results of industry-funded trials were
more likely to be disclosed compared to non-commercial studies, although small pharmaceutical
organizations lagged behind “Big Pharma”. Similarly, trials with NIH funding more often disclosed
their results compared to those funded by other academic and non-profit organizations. Although
overall dissemination rates improved with time, the trend was largely driven by increased results
reporting on clinicaltrials.gov for Big Pharma and NIH-funded studies. By contrast, the results of
trials funded by other academic and non-profit organizations were primarily published as journal
articles and were rarely submitted to clinicaltrials.gov. Finally, differences in dissemination rates
across medical specialties were mainly due to differential journal publication rates ranging from 42%
for autoimmune diseases to 20% for oncology.
Strengths and limitations
One of the major strengths of this work is the larger volume of clinical trial data and the greater
depth and detail of the analysis compared with previous studies in this area. Many factors were
analyzed for the first time in the context of both trial design quality and transparency of research.
These included the distinction between Big Pharma and smaller industrial organizations, analysis of
publication trends across distinct medical specialties, and differences between trials whose results
were disseminated through clinicaltrials.gov submission and publication in higher or lower impact
journals. In addition, the study provided a novel method for visualizing trial characteristics that
enables a rapid assimilation of similarities and differences across a variety of factors.
The study had several limitations related to the underlying dataset and the methodology used to
annotate and integrate the data. Firstly, it was limited to the clinicaltrials.gov database and hence
did not include unregistered studies. However, given that trial registration has been widely accepted
and enforced by regulators and journal editors,[8, 12] it is likely that most of the influential studies
have been captured in the analysis.[41] Additional limitations stem from data quality issues
such as missing or incorrect entries, ambiguous terminology, and lack of semantic
standardization.[14, 16] Many trials were annotated as placebo- or comparator-controlled based on
information mined from their unstructured titles and descriptions, as the correct annotation in the
structured arm type field was often missing. Although a subset of annotations was checked
manually, some trials may still have been misclassified by our automatic workflow. In addition,
owing to the non-standardized free-text format of the data, only a simple count of eligibility criteria
and outcome measures was used as an imperfect proxy for assessing trial complexity and the
homogeneity of the enrolled population.
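As a minimal illustration of this kind of keyword-based annotation, the sketch below applies simple regular-expression rules to free-text trial titles; the patterns and labels here are hypothetical examples, not the study's actual rules:

```python
import re

# Hypothetical keyword rules approximating the annotation of control arms from
# unstructured titles and descriptions; the study's real workflow is not shown here.
PLACEBO = re.compile(r"\bplacebo\b", re.IGNORECASE)
COMPARATOR = re.compile(r"\bactive[- ]control(?:led)?\b|\bcomparator\b", re.IGNORECASE)

def classify_control(text: str) -> str:
    """Label a trial record as placebo- or comparator-controlled from free text."""
    if PLACEBO.search(text):
        return "placebo-controlled"
    if COMPARATOR.search(text):
        return "comparator-controlled"
    return "unknown"
```

Rules of this kind are easy to audit but, as noted above, inevitably misclassify some records, which is why a manually checked subset is needed.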
Finally, our method for linking registry records to associated publications relied on the presence of
a registry identifier (NCT number) in the text or metadata of a journal article. Its absence from a trial
publication would lead to misclassified publication status and, consequently, underestimated
publication rates. However, owing to the rigour of the search methods, it is unlikely that
publication status was misclassified in a systematic manner, e.g. depending on study funding source
or disease area. Hence, although absolute estimates might be affected, it is unlikely that the
comparative analysis based on automated record linkage is biased towards any one category.
Notably, reporting of registration numbers is encouraged by medical journals through the ICMJE
recommendations and the CONSORT checklist, and it can be argued that compliance is part of
investigators’ responsibility to ensure that their research is transparent and discoverable.[20, 35]
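The linkage step can be illustrated with a minimal sketch that assumes only the NCT identifier format (the letters "NCT" followed by eight digits); the function names are illustrative and the study's full search pipeline is not reproduced here:

```python
import re

# clinicaltrials.gov identifiers have the form "NCT" + 8 digits.
NCT_PATTERN = re.compile(r"\bNCT\d{8}\b")

def extract_nct_ids(article_text: str) -> set[str]:
    """Return every registry identifier mentioned in an article's text or metadata."""
    return set(NCT_PATTERN.findall(article_text))

def articles_mentioning(nct_id: str, articles: dict[str, str]) -> list[str]:
    """Return keys (e.g. PubMed IDs) of articles that cite the given trial."""
    return [pmid for pmid, text in articles.items() if nct_id in extract_nct_ids(text)]
```

As the limitation above notes, a linkage rule of this shape can only find identifiers that authors actually reported, so missing NCT numbers depress the measured publication rate.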
Comparison with other studies
In addition to providing novel contributions, our analysis confirmed and extended several findings
reported previously by others. Notably, a larger sample size, wide inclusion criteria, and additional
mappings allowed us to calculate more accurate statistics and to ask previously unexplored
questions in an unbiased way. For instance, Bala et al.[42] showed, based on a manually searched
sample of 1,140 RCT reports, that trials published in higher impact journals have larger enrollment.
Our study extended this analysis to the entire clinicaltrials.gov registry (not only RCTs) and to many
additional trial properties beyond enrollment size, including reported use of DMCs, trial duration,
and number of countries. Other previously reported trends confirmed and further examined in this
study included the lower reporting rates observed for earlier clinical phases[39, 40] and academic
sponsors,[39, 40] the longer dissemination lag for journal publications,[43] and overall improvements
in results dissemination.[44]
Several previous studies investigated the transparency of clinical research using various subsets of
clinicaltrials.gov and diverse approaches for classifying study results as published,[17, 20, 36, 39, 40,
45, 46] as summarized in detail elsewhere.[47, 48] The general trends shown by our study fell within
the range of estimates reported previously, and our estimate of 54% for the overall trial
dissemination rate agrees with two recent systematic reviews.[47, 48] Our study is methodologically most similar
to the work by Powell-Smith and Goldacre[20] with one important distinction: we did not restrict
the analysis to trials sponsored by organizations with more than 30 studies as this would exclude
small pharmaceutical companies and research centers, and bias the results towards large
institutions that routinely perform or fund clinical research. As a consequence, our analysis showed
analogous trends but slightly lower overall reporting rates than that study.[20]
Meaning and implications of the study
The higher methodological quality of industry-funded trials may reflect the available financial and
organizational resources, as well as better infrastructure and expertise in conducting clinical
trials.[49, 50] It is also possible that the differing objectives of industrial and non-profit
investigators (development of new medicines with a view to gaining a license for market access vs
mechanistic research and rapid publication of novel findings) contribute to the observed
differences.[49, 51]
The analysis suggests that trials funded by large institutions that regularly conduct clinical research
are more likely to disseminate their results publicly. Although industry led reporting overall, ‘Big
Pharma’ performed much better than smaller organizations, likely because many large companies
have developed explicit disclosure policies in recent years.[52-54] Similarly, the NIH has developed
specific regulations for disseminating the results of the studies it funds. The most recent such policy
requires the publication of all NIH-funded trials, including (unprecedentedly) Phase 1 safety
studies.[55] Whilst large trial sponsors seem to actively pursue better transparency,[39]
smaller organizations, particularly those in the non-profit sector, were less responsive to the FDAAA
publication mandate. It is possible that such institutions are less able, or less motivated, to allocate
time and resources required to prepare and submit the results, or may be less familiar with the
legislation.[39, 41]
Our study suggests that the FDA mandate for trial results dissemination and the establishment of
the clinicaltrials.gov results database have improved the transparency of clinical research. Time to
results dissemination is shorter for clinicaltrials.gov than for traditional publication, and the
database stores the results of many otherwise unpublished studies. The lack of journal publications
might be explained by the reluctance of editors to publish smaller studies with suboptimal designs,
or studies with “negative” or non-significant outcomes that are unlikely to influence clinical
practice.[42, 56, 57] Our analysis showed that journal publication varied with trial design and
disease area, with certain medical specialties and smaller non-randomized studies more likely to be
disseminated solely through clinicaltrials.gov. The difference was particularly evident for oncology:
only a fifth of completed cancer trials were published in a journal, perhaps due to the high
prevalence of small sample sizes, suboptimal trial designs, and negative outcomes in this field.[14,
31] The high clinicaltrials.gov dissemination rates, including for cancer trials and small studies,
suggest that the resource plays an important role in reducing publication bias. Thus, to ensure
better completeness of the available evidence, clinical and drug discovery researchers should search
the database in addition to reviewing the literature.
Importantly, automated analyses and methods for identifying yet-to-be-published studies could be
used by regulators and funders of clinical research to track investigators who have failed to disclose
the results of their studies. Such technology would be particularly useful when coupled with
automated notifications and additional incentives (or sanctions) for results dissemination. Recently,
the concept of automated ongoing audit of trial reporting was implemented in the online
TrialsTracker tool, which monitors major clinical trial sponsors with more than 30 studies.[20]
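The core of such an audit can be sketched as a single check, assuming the FDAAA-style 12-month reporting deadline mentioned elsewhere in this paper; the function and field names below are hypothetical and not taken from any existing tool:

```python
from datetime import date, timedelta

# Assumed reporting window: 12 months after trial completion (FDAAA-style).
DEADLINE = timedelta(days=365)

def is_overdue(completion_date: date, has_results: bool, today: date) -> bool:
    """Flag a completed trial whose results remain undisclosed past the deadline."""
    return not has_results and today > completion_date + DEADLINE
```

Run over a registry snapshot, a rule like this yields the list of sponsors to notify; in practice the deadline logic would also need to handle extension requests and missing completion dates.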
Improvements to the quality of clinical trial data could facilitate the development of robust
monitoring systems in the future. Firstly, data quality should be taken into account when registering
new trials and standardized nomenclature should be used whenever possible. Ideally, the user
interface of the clinicaltrials.gov registry itself should be designed to encourage and facilitate
normalization. In addition, stricter adherence to the CONSORT requirement for including the trial
registration number with each published article would improve the accuracy of automated mapping
of registry records to associated publications.[41]
In addition to improved monitoring, new regulations might be needed to ensure better
transparency, and it is important that these efforts also reach small industrial organizations and
academic centres. One possible solution was proposed by Anderson and colleagues[39]: journal editors
from the ICMJE could call for submission of results to clinicaltrials.gov as a requirement for article
publication. Given the dramatic increase in trial registration rates that followed previous ICMJE
policies,[58, 59] it is likely that such a call would improve the reporting of ongoing clinical
research.[39]
Acknowledgements
We thank our colleagues from BenevolentAI, University College London, and Farr Institute of Health
Informatics for much useful discussion and advice, in particular John Overington (currently
Medicines Discovery Catapult), Nikki Robas, Bryn Williams-Jones, Chris Finan, and Peter Cox. We also
thank the clinicaltrials.gov team for their assistance and advice.
Funding
This study received no specific external funding from any institution in the public, commercial, or
not-for-profit sectors and was conducted by BenevolentBio as part of its normal business.
Competing interests
All authors have completed the ICMJE uniform disclosure form at
http://www.icmje.org/coi_disclosure.pdf and declare: no financial relationships with any
organisations that might have an interest in the submitted work and no other relationships or
activities that could appear to have influenced the submitted work.
Licence
This is an Open Access article distributed in accordance with the Creative Commons Attribution Non
Commercial (CC BY-NC 3.0) license, which permits others to distribute, remix, adapt, build upon this
work non-commercially, and license their derivative works on different terms, provided the original
work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/3.0/.
Contributorship
Contributors: MZ, MD, AH, and JH conceived and designed this study. MZ and MD acquired the
data. MZ, MD, AH, and JH analysed and interpreted the data. The initial manuscript was drafted by
MZ; all authors critically revised the manuscript and approved its final version. MZ takes
responsibility for the overall integrity of the data and the accuracy of the data analysis. JH is the
guarantor.
Transparency
The authors affirm that the manuscript is an honest, accurate, and transparent account of the study
being reported; that no important aspects of the study have been omitted; and that any
discrepancies from the study as planned (and, if relevant, registered) have been explained.
Data sharing
All the data used in the study are available from public resources. The dataset can be made available
on request from the authors.
Patient consent
Not required
Ethical approval
Not required
References
1. Sibbald B and Roland M. Understanding controlled trials. Why are randomised controlled
trials important? BMJ 1998;316(7126):201.
2. Lindsay MA. Target discovery. Nat Rev Drug Discov 2003;2(10):831.
3. Sackett DL, Rosenberg WMC, Gray JAM, et al. Evidence based medicine: what it is and what
it isn't. BMJ 1996;312:71.
4. Woolf SH. The meaning of translational research and why it matters. JAMA 2008;299(2):211-
213.
5. Kendall J. Designing a research project: randomised controlled trials and their principles.
Emerg Med J 2003;20(2):164-168.
6. Simes RJ. Publication bias: the case for an international registry of clinical trials. J Clin Oncol
1986;4(10):1529-1541.
7. Yamey G. Scientists who do not publish trial results are “unethical”. BMJ
1999;319(7215):939.
8. Food and Drug Administration Modernization Act (FDAMA) of 1997. Available from:
https://www.fda.gov/RegulatoryInformation/LawsEnforcedbyFDA/SignificantAmendmentst
otheFDCAct/FDAMA/default.htm; accessed 09/01/2018.
9. Food and Drug Administration Amendments Act (FDAAA) of 2007. Available from:
https://www.fda.gov/drugs/guidancecomplianceregulatoryinformation/guidances/ucm0649
98.htm; accessed 09/01/2018.
10. Zarin DA and Tse T. Moving towards transparency of clinical trials. Science
2008;319(5868):1340.
11. Zarin DA, Tse T, Williams RJ, et al. Trial reporting in ClinicalTrials.gov — the final rule. N Engl J
Med 2016;375(20):1998-2004.
12. ICMJE's clinical trial registration policy. Available from:
http://www.icmje.org/recommendations/browse/publishing-and-editorial-issues/clinical-
trial-registration.html; accessed 09/01/2018.
13. WHO International Clinical Trials Registry Platform (ICTRP): Primary Registries. Available
from: http://www.who.int/ictrp/network/primary/en/; accessed 09/01/2018.
14. Hirsch BR, Califf RM, Cheng SK, et al. Characteristics of oncology clinical trials: insights from a
systematic analysis of ClinicalTrials.gov. JAMA Intern Med 2013;173(11):972-979.
15. Zarin DA, Ide NC, Tse T, et al. Issues in the registration of clinical trials. JAMA
2007;297(19):2112-2120.
16. Califf RM, Zarin DA, Kramer JM, et al. Characteristics of clinical trials registered in
ClinicalTrials.gov, 2007-2010. JAMA 2012;307(17):1838-1847.
17. Chen R, Desai NR, Ross JS, Zhang W, et al. Publication and reporting of clinical trial results:
cross sectional analysis across academic medical centers. BMJ 2016;352:i637.
18. Hakala A, Kimmelman J, Carlisle B, et al. Accessibility of trial reports for drugs stalling in
development: a systematic assessment of registered trials. BMJ 2015;350:h1116.
19. Mathieu S, Boutron I, Moher D, et al. Comparison of registered and published primary
outcomes in randomized controlled trials. JAMA 2009;302(9):977-984.
20. Powell-Smith A and Goldacre B. The TrialsTracker: Automated ongoing monitoring of failure
to share clinical trial results by all major companies and research institutions. F1000Res
2016;5:2629.
21. Ross JS, Mulvey GK, Hines EM, et al. Trial publication after registration in ClinicalTrials.gov:
a cross-sectional analysis. PLoS Med 2009;6(9):e1000144.
22. Small drug companies grow as big pharma outsources. Available from:
http://www.telegraph.co.uk/business/2017/01/22/top-50-privately-owned-pharma-
companies-britain/; accessed 09/01/2018.
23. Barden CJ and Weaver DF. The rise of micropharma. Drug Discov Today 2010;15(3):84-87.
24. Munos B. Lessons from 60 years of pharmaceutical innovation. Nat Rev Drug Discov
2009;8(12):959-968.
25. Munos BH and Chin WW. How to revive breakthrough innovation in the pharmaceutical
industry. Sci Transl Med 2011;3(89):89cm16.
26. Pammolli F, Magazzini L, and Riccaboni M. The productivity crisis in pharmaceutical R&D.
Nat Rev Drug Discov 2011;10(6):428-438.
27. Thiers FA, Sinskey AJ, and Berndt ER. Trends in the globalization of clinical trials. Nat Rev
Drug Discov 2008;7(1):13.
28. Chondrogiannis E, Andronikou V, Tagaris A, et al. A novel semantic representation for
eligibility criteria in clinical trials. J Biomed Inform 2017;69:10-23.
29. Coletti MH and Bleich HL. Medical subject headings used to search the biomedical literature.
J Am Med Inform Assoc 2001;8(4):317-323.
30. Jensen LJ, Saric J, and Bork P. Literature mining for the biologist: from information retrieval
to biological discovery. Nat Rev Genet 2006;7(2):119-129.
31. Hay M, Thomas DW, Craighead JL, et al. Clinical development success rates for
investigational drugs. Nat Biotechnol 2014;32(1):40.
32. ClinicalTrials.gov Glossary of common site terms. Available from:
https://clinicaltrials.gov/ct2/about-studies/glossary; accessed 09/01/2018.
33. ClinicalTrials.gov Protocol registration data element definitions
for interventional and observational studies. Available from:
http://prsinfo.clinicaltrials.gov/definitions.html; accessed 09/01/2018.
34. Top 50 Global Pharma Companies. Annual ranking published by Pharmaceutical Executive.
Available from: https://www.rankingthebrands.com/The-Brand-
Rankings.aspx?rankingID=370; accessed 09/01/2018.
35. Huser V and Cimino JJ. Linking ClinicalTrials.gov and PubMed to track results of
interventional human clinical trials. PLoS One 2013;8(7):e68409.
36. Ramsey S and Scoggins J. Commentary: practicing on the tip of an information iceberg?
Evidence of underpublication of registered clinical trials in oncology. Oncologist
2008;13(9):925-929.
37. Lokker C, Haynes RB, Wilczynski NL, et al. Retrieval of diagnostic and treatment studies for
clinical use through PubMed and PubMed's Clinical Queries filters. J Am Med Inform Assoc
2011;18(5):652-659.
38. SCImago, SJR — SCImago Journal & Country Rank. Available from: www.scimagojr.com;
accessed 09/01/2018.
39. Anderson ML, Chiswell K, Peterson ED, et al. Compliance with results reporting at
ClinicalTrials.gov. N Engl J Med 2015;372(11):1031-1039.
40. Prayle AP, Hurley MN, and Smyth AR. Compliance with mandatory reporting of clinical trial
results on ClinicalTrials.gov: cross sectional study. BMJ 2012;344:d7373.
41. Manzoli L, Flacco ME, D'Addario M, et al. Non-publication and delayed publication of
randomized trials on vaccines: survey. BMJ 2014;348:g3058.
42. Bala MM, Akl EA, Sun X, et al. Randomized trials published in higher vs. lower impact
journals differ in design, conduct, and analysis. J Clin Epidemiol 2013;66(3):286-295.
43. Riveros C, Dechartres A, Perrodeau E, et al. Timing and completeness of trial results posted
at ClinicalTrials.gov and published in journals. PLoS Med 2013;10(12):e1001566.
44. Miller JE, Wilenzick M, Ritcey N, et al. Measuring clinical trial transparency: an empirical
analysis of newly approved drugs and large pharmaceutical companies. BMJ Open
2017;7(12):e017917.
45. Jones CW, Handler L, Crowell KE, et al. Non-publication of large randomized clinical trials:
cross sectional analysis. BMJ 2013;347:f6104.
46. Ross JS, Tse T, Zarin DA, et al. Publication of NIH funded trials registered in ClinicalTrials.gov:
cross sectional analysis. BMJ 2012;344:d7292.
47. Chan AW, Song F, Vickers A, et al. Increasing value and reducing waste: addressing
inaccessible research. Lancet 2014;383(9913):257-266.
48. Schmucker C, Schell LK, Portalupi S, et al. Extent of non-publication in cohorts of studies
approved by research ethics committees or included in trial registries. PloS One
2014;9(12):e114023.
49. Laterre PF and François B. Strengths and limitations of industry vs. academic randomized
controlled trials. Clin Microbiol Infect 2015;21(10):906-909.
50. Martin L, Hutchens M, Hawkins C, et al. How much do clinical trials cost? Nat Rev Drug
Discov 2017;16(6):381-382.
51. Ehlers MD. Lessons from a recovering academic. Cell 2016;165(5):1043-1048.
52. Novartis Position on Clinical Study Transparency – Clinical Study Registration, Results
Reporting and Data Sharing. Available from:
https://www.novartis.com/sites/www.novartis.com/files/clinical-trial-data-
transparency.pdf; accessed 09/01/2018.
53. Pfizer: Trial Data & Results. Available from: https://www.pfizer.com/science/clinical-
trials/trial-data-and-results; accessed 09/01/2018.
54. GSK: Data transparency. Available from: https://www.gsk.com/en-gb/behind-the-
science/innovation/data-transparency/; accessed 09/01/2018.
55. NIH Policy on the Dissemination of NIH-Funded Clinical Trial Information. Available from:
https://grants.nih.gov/grants/guide/notice-files/NOT-OD-16-149.html; accessed
09/01/2018.
56. Hopewell S, Loudon K, Clarke MJ, et al. Publication bias in clinical trials due to statistical
significance or direction of trial results. Cochrane Database Syst Rev 2009;1:MR000006.
57. Stern JM and Simes RJ. Publication bias: evidence of delayed publication in a cohort study of
clinical research projects. BMJ 1997;315(7109):640-645.
58. Laine C, Horton R, DeAngelis CD, et al. Clinical trial registration—looking back and moving
ahead. Lancet 2007;369(9577):1909-1911.
59. Viergever RF and Li K. Trends in global clinical trial registration: an analysis of numbers of
registered clinical trials in different parts of the world from 2004 to 2013. BMJ Open
2015;5(9):e008932.
Figure captions
Fig 1. Funding source distribution for all trials registered with clinicaltrials.gov. A: Trends in funding
source distribution for all pharmaceutical trials conducted between Jan 1, 1997 and July 19, 2017
(based on the start_date data field). B: Overall funding source distribution for all registered
pharmaceutical studies.
Fig 2. Trial design properties. Each radar chart illustrates the differences between three groups of
trials (represented by coloured polygons) with respect to fifteen trial protocol characteristics
(individual radial axes). Each axis shows the fraction of trials with a given property, such as reported
use of randomization, blinding, or data monitoring committees (DMC). A: Trial characteristics by
clinical phase. B: Characteristics of Phase 2 treatment-oriented trials for three representative
diseases. The profile of glioblastoma (with many small non-randomized studies) is typical of
oncology trials; see Suppl. File 2.
Fig 3. Phase 2 trial properties by funding source. Detailed statistics across 4 funding categories and
remaining clinical phases are available in Suppl. File 4.
Fig 4. Changes in clinical trial design over time. Plots compare the characteristics of all Phase 2 trials
(regardless of completion status) divided into four temporal subsets based on their starting year.
Additional results are available in Suppl. File 2.
Fig 5. Characteristics of trials whose results were published as a journal article or through a direct
submission to the clinicaltrials.gov results database. Trials published as an article are divided into
high- and low-impact classes based on the 2016 H-index of the corresponding journal; see Methods.
Amongst 945 trials published in high-impact journals (H-index > 400), 459 (48.6%) were funded by
Big Pharma, 178 by Other, 169 by Small Pharma, and 139 by NIH. Detailed statistics are available in
Suppl. File 3.
Fig 6. Trial results dissemination rates by trial funding category. A: Trial dissemination rates by
funder type: overall dissemination rate, journal publications, and submission of structured results to
clinicaltrials.gov. B: Time to first results dissemination (whether through submission to
clinicaltrials.gov results database or publication in a journal article). The vertical line indicates the
12-month deadline for reporting mandated by FDAAA. C: Time to results reporting on
clinicaltrials.gov. D: Time to journal article publication.
Fig 7. Trial results dissemination rates and funding source distribution across seven medical
specialties. A: For each specialty, the plot shows the fraction of completed Phase 2-4 trials whose
results were publicly disseminated, either as a journal article or through direct submission to
clinicaltrials.gov results database. B: Funding source distribution by medical specialty.
[Study selection flow diagram]
249,904 clinical studies registered with clinicaltrials.gov until July 19, 2017
  Excluded studies (N=130,064): 50,111 non-interventional studies; 79,953 studies not evaluating drugs or biologics
119,840 interventional (bio)pharmaceutical clinical studies (overview analysis)
  Excluded studies (N=74,220): 14,873 studies with missing start or completion date information; 45,458 studies not yet completed; 7,241 studies completed before Jan 2006; 6,648 studies completed after July 19, 2015 (less than 24 months ago); 0 studies that filed an application to postpone results reporting
45,620 completed studies expected to publish their results
  Excluded studies (N=17,785): 4,149 studies with no phase assignment; 13,636 studies with phases other than 2-4
27,835 completed studies from clinical phases covered by the mandate for results publication (Phases 2-4) (main analysis: trial design and dissemination)
Appendix
Impact of funding source and disease studied on trial dissemination and design. A large integrative analysis.
Magdalena Zwierzyna1,2,*, Mark Davies1, Aroon Hingorani2,3, Jackie Hunter1,4
1BenevolentBio Ltd, London, United Kingdom; 2Institute of Cardiovascular Science, University College London, London, United Kingdom; 3Farr Institute of Health Informatics, London, United Kingdom; 4St George’s Hospital Medical School, London, United Kingdom

Methods
- Annotating controlled trials
- MeSH term assignment and disease category mapping
- Categorizing industrial trial sponsors
- Trial metadata: field definitions
Results
- Overall characteristics of studies registered with clinicaltrials.gov
- Trials across clinical phases
  - Table 1: Trial characteristics
  - Figure 1: Funding source distribution per clinical phase
- Trials across diseases
  - Figure 2: Diseases studied in clinical trials conducted over time
  - Figure 3: Maps of clinical trial locations for two disease areas
  - Figure 4: Trial characteristics: cancer vs non-cancer trials
- Trial design by funder type
  - Figure 5: Phase 1 trial characteristics by funding source
  - Figure 6: Phase 1 trial characteristics by funding source
  - Figure 7: Characteristics of trials sponsored by big and small industrial organizations (across all clinical phases)
- Trial dissemination rates
  - Figure 8: Characteristics of trials whose results were published only through a direct submission to the clinicaltrials.gov results database (no journal publication) and trials whose results were not published in any form
  - Figure 9: Trial results dissemination rates by clinical phase
  - Table 2: Dissemination rates and time to results disclosure by sponsor type
  - Figure 10: Trends in trial reporting on clinicaltrials.gov over time
Trialdisseminationrates......................................................................................................................10Figure8:Characteristicsoftrialswhoseresultswerepublishedonlythroughadirectsubmissiontotheclinicaltrials.govresultsdatabase(nojournalpublication)andtrialswhoseresultswerenotpublishedinanyform.......................................................................................................................10Figure9:Trialresultsdisseminationratesbyclinicalphase............................................................11Table2:Disseminationratesandtimetoresultsdisclosurebysponsortype..................................11Figure10:Trendsintrialreportingonclinicaltrials.govovertime..................................................11
Globalizationofclinicalresearch.........................................................................................................11Figure11:Trendsinclinicaltriallocations.......................................................................................12Figure12:Locationsoftrialsregisteredwithclinicaltrials.govforfourtemporalsubsets..............12
Changesintrialcomplexity..................................................................................................................12Figure13:Averagenumberofeligibilitycriteriaandoutcomemeasuresusedinclinicaltrials......13
References..................................................................................................................................14
Methods

Annotating controlled trials

Prior to the analysis, all trials were annotated as controlled or non-controlled based on the information in the clinicaltrials.gov registry. Due to inconsistencies in the trial records, information about the use of controls was sought in both structured data fields (intervention name, arm type) and unstructured data fields (brief title, official title, brief summary, detailed description). The keyword-based searches were developed following a manual analysis of a subset of 500 randomly selected clinical trials. An interventional trial was tagged as "placebo-controlled" if:
• It has an arm of type "Placebo comparator", "No intervention" or "Sham comparator"
• It has an intervention whose name includes any of the following phrases: "placebo", "vehicle", "sugar pill"
• Its brief title, official title, brief summary or detailed description contains any of the following phrases: "placebo", "vehicle", "sugar pill"

An interventional trial was tagged as "comparator-controlled" if:
• It has an arm of type "Active comparator"
• Its brief title, official title, brief summary or detailed description contains any of the following phrases: "active-controlled", "active comparator", "comparative study", "non-inferiority", "standard therapy", "standard of care", "standard treatment"

An interventional trial was tagged as "controlled" if:
• It has been tagged as either "placebo-controlled" or "comparator-controlled"
• It has an intervention whose name includes the word "control"
• Its brief title, official title, brief summary or detailed description contains any of the following phrases: "control", "controlled"

MeSH term assignment and disease category mapping

To enable analysis of clinical trials by disease categories, trial conditions were mapped to MeSH terms using a dictionary-based named entity recognition approach, with custom dictionaries created through integration of MeSH terms (branches C and F) with Disease Ontology. A total of 171,772 (86.04%) condition terms were assigned a MeSH term, corresponding to 21,884 (70.92%) of distinct disease strings from 105,453 (88%) of the trials. The annotations were manually evaluated on a subset of 500 randomly selected trial conditions. Many conditions were not grounded as they do not represent a disease, e.g. "Healthy volunteers", "Contraception", "Anesthesia", "Smoking cessation", etc. Following MeSH term assignment, the diseases were mapped to disease categories based on matching terms from levels 2 and 3 of the MeSH term hierarchy, as specified in the table below.
MeSH tree term(s)       Disease category
C15.378                 Hematology
C01; C02; C03           Infectious disease
C11                     Ophthalmology
C18                     Metabolic disease
C06                     Gastroenterology
C20.543                 Allergy
C19                     Endocrine
C08                     Respiratory disease
C12; C13                Urology
C20.111                 Autoimmune disease
C10                     Neurology
C14                     Cardiovascular disease
F01; F02; F03; F04      Psychiatry
C04                     Oncology
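The keyword-based control annotation rules described above are straightforward to express in code. The sketch below is ours, not the authors' pipeline: the field names (`arm_types`, `intervention_names`, etc.) are hypothetical stand-ins for the corresponding clinicaltrials.gov record fields, and the word-boundary match for "control"/"controlled" is one plausible reading of the final rule.

```python
import re

# Keyword lists taken from the rules in "Annotating controlled trials" above.
PLACEBO_TERMS = ("placebo", "vehicle", "sugar pill")
COMPARATOR_TERMS = ("active-controlled", "active comparator", "comparative study",
                    "non-inferiority", "standard therapy", "standard of care",
                    "standard treatment")

def _free_text(trial):
    # Concatenate the unstructured fields searched by the rules.
    fields = ("brief_title", "official_title", "brief_summary", "detailed_description")
    return " ".join(trial.get(f, "") for f in fields).lower()

def tag_controls(trial):
    """Return the set of control tags for one trial record (a dict)."""
    text = _free_text(trial)
    arms = {a.lower() for a in trial.get("arm_types", [])}
    interventions = " ".join(trial.get("intervention_names", [])).lower()
    tags = set()
    if (arms & {"placebo comparator", "no intervention", "sham comparator"}
            or any(t in interventions for t in PLACEBO_TERMS)
            or any(t in text for t in PLACEBO_TERMS)):
        tags.add("placebo-controlled")
    if ("active comparator" in arms
            or any(t in text for t in COMPARATOR_TERMS)):
        tags.add("comparator-controlled")
    # "Controlled" if either tag fired, or the keyword itself appears.
    if tags or "control" in interventions or re.search(r"\bcontrol(led)?\b", text):
        tags.add("controlled")
    return tags
```

For example, a record whose title mentions "placebo-controlled" and that has a "Placebo Comparator" arm is tagged both "placebo-controlled" and "controlled".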
Categorizing industrial trial sponsors

For detailed analysis of industry-led trials, industrial organizations were further classified as "Big Pharma" or "Small Pharma". We first generated a list of the top 50 pharmaceutical companies, based on the 2016 Pharmaceutical Executive ranking [1]. Next, we appended the list with 11 companies that were not included in the 2016 ranking, but sponsored more than 100 trials registered with clinicaltrials.gov. Next, for each company we developed a regular expression to match the "lead sponsor" field in the registry; the expressions might include official company names (e.g. GlaxoSmithKline), acronyms (GSK), and previous names (Glaxo Wellcome, SmithKline Beecham). The performance of the regular expressions was manually evaluated on a subset of the data.

Big Pharma companies include: Pfizer, Novartis, Roche, Merck, Sanofi, Johnson & Johnson, Gilead Sciences, GSK, AbbVie, Amgen, AstraZeneca, Allergan, Teva Pharmaceutical Industries, Bristol-Myers Squibb, Eli Lilly, Bayer, Novo Nordisk, Boehringer Ingelheim, Takeda, Celgene, Astellas Pharma, Shire, Mylan, Biogen Idec, Daiichi Sankyo, CSL, Merck KGaA, Valeant Pharmaceuticals International, Otsuka, Sun Pharmaceutical Industries, Eisai Co, Les Laboratoires Servier, Endo Health Pharmaceuticals, UCB, Abbott, Fresenius, Chugai Pharmaceuticals, Grifols, Regeneron Pharmaceuticals, Dainippon Sumitomo Pharma, Alexion Pharmaceuticals, Mallinckrodt Pharmaceuticals, Menarini, Mitsubishi Tanabe Pharma, Lupin, Actelion, Aspen Pharmacare, Kyowa Hakko Kirin, Ono Pharmaceutical, Ferring Pharmaceuticals, Janssen, Genentech, Alcon, Dr. Reddy's Laboratories, Forest Laboratories (now Allergan), Sunovion Pharmaceuticals, Actelion Pharmaceuticals, Leo Pharma, Ranbaxy Laboratories, MedImmune.

Trial metadata: field definitions

The following definitions were derived from the "Glossary of Common Site Terms" and "Protocol Registration Data Element Definitions" pages of the clinicaltrials.gov registry [2,3].
1. Randomization (Allocation)
A clinical trial design strategy used to assign participants to an arm of a study. Types of allocation include randomized and nonrandomized. Randomized: participants are assigned to intervention groups by chance; Nonrandomized: participants are expressly assigned to intervention groups through a non-random method, such as physician choice.

2. Intervention Model
The general design of the strategy for assigning interventions to participants in a clinical study. Types of intervention models include Single Group design, Parallel design, Cross-over design, and Factorial design.

3. Masking (Blinding)
A clinical trial design strategy in which one or more parties involved in the trial, such as the investigator or participants, do not know which participants have been assigned which interventions. Types of masking include Open Label, Single Blind Masking, and Double Blind Masking.

4. Enrolment
The number of participants in a clinical study (total number of participants for completed studies or estimated number of participants for ongoing studies).

5. Arm Types
Arm type is a general description of the clinical study arm. It identifies the role of the intervention that participants will receive. Types of arms include Experimental, Active Comparator, Placebo Comparator, Sham Comparator, and No Intervention.

6. Duration of the Study
Time (in months) between the study start date and primary completion date.

7. Number of Outcome Measures
An outcome measure is the planned measurement described in the protocol that is used to determine the effect of interventions on participants in a clinical trial. For observational studies, a measurement or observation that is used to describe patterns of diseases or traits, or associations with exposures, risk factors, or treatment. Types of outcome measures include Primary Outcome Measure and Secondary Outcome Measure.

8. Number of Eligibility Criteria
Eligibility criteria are the key standards that people who want to participate in a clinical study must meet or the characteristics they must have. Eligibility criteria include both inclusion criteria and exclusion criteria. For example, a study might only accept participants who are above or below certain ages.

9. Study Phase
Food and Drug Administration (FDA) descriptions of the clinical trial of a drug based on the study's characteristics, such as the objective and number of participants. There are five phases: Early Phase 1 (Phase 0), Phase 1, Phase 2, Phase 3, Phase 4.

10. Lead Sponsor
The organization or person who oversees the clinical study and is responsible for analyzing the study data.

11. Disease Category
Trial condition is the disease, disorder, syndrome, illness, or injury that is being studied. On ClinicalTrials.gov, conditions may also include other health-related issues, such as lifespan, quality of life, and health risks. Disease category is a main clinical specialty assigned to the trial condition based on the MeSH disease hierarchy, as described above.

12. Countries
Countries in which study locations are based.

13. Number of Locations
Location is an organization where the clinical study is being conducted.

14. Dissemination of Results
Public disclosure status of trial results; includes structured result submission to the clinicaltrials.gov results database and publication in a journal article.

15. DMC (Data Monitoring Committee)
A group of independent scientists who monitor the safety and scientific integrity of a clinical trial. The group can recommend to the study sponsor that the study be stopped if it is not effective, is harming participants, or is unlikely to serve its scientific purpose. Members are chosen based on the scientific skills and knowledge needed to monitor the particular trial. Also referred to as a data safety and monitoring board (DSMB).
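The sponsor categorization described under "Categorizing industrial trial sponsors" can be sketched as follows. The company patterns shown are illustrative examples of the kind of regular expressions the Methods describe (official name, acronym, previous names), not the study's actual expressions.

```python
import re

# Illustrative Big Pharma patterns: official name, acronym, and previous names.
BIG_PHARMA_PATTERNS = {
    "GSK": r"glaxo\s*smith\s*kline|\bgsk\b|glaxo\s*wellcome|smithkline\s*beecham",
    "Pfizer": r"\bpfizer\b",
    "Sanofi": r"\bsanofi\b",
}

def categorize_sponsor(lead_sponsor):
    """Map a clinicaltrials.gov 'lead sponsor' string to a Big Pharma company, if any."""
    name = lead_sponsor.lower()
    for company, pattern in BIG_PHARMA_PATTERNS.items():
        if re.search(pattern, name):
            return company
    return None  # not matched: small pharma, non-profit, or other sponsor
```

With such patterns, "GlaxoSmithKline", "GSK" and the previous name "SmithKline Beecham" all resolve to the same company, which is the point of including historical names in the expressions.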
Results

Overall characteristics of studies registered with clinicaltrials.gov

Table 1 reports basic characteristics of all trials in the dataset. Registered studies encompass a variety of designs and vary in terms of duration (IQR 335-1,218 days), size (IQR 26-157 patients), number of outcome measures, and eligibility criteria. Most registered studies are oriented towards treatment (78.5%) and prevention (8.6%). 14.3% of trials are international (involving locations in up to 60 countries), 31.7% reported multiple collaborating sponsors, and 34.9% reported involvement of data monitoring committees (DMCs). Overall, the majority of registered trials are small (with 62.4% of studies enrolling fewer than 100 patients) and single-site (62.6% of studies); small trials are common even in Phase 3, where almost a quarter of all studies report an enrolment size lower than 100. In addition, a relatively small fraction of studies reported the use of randomization (62.0%), a control group (64.2%), and blinding (42.9%). As reported in the next sections, these statistics vary substantially depending on trial phase, sponsor type, and disease area.

Trials across clinical phases

Table 1: Trial characteristics
Basic trial characteristics across phases. Medians and interquartile ranges are reported for continuous variables; counts and percentages are reported for categorical data (with total number of trials in a given category used as denominator).
                    | All trials             | Phase 1            | Phase 2              | Phase 3
Trial count         | 119,840                | 23,388             | 32,295               | 22,573
Non-randomized      | 41,266 (34.4%)         | 11,258 (48.1%)     | 13,273 (41.1%)       | 3,456 (15.3%)
Randomized          | 74,313 (62.0%)         | 10,880 (46.5%)     | 17,288 (53.5%)       | 18,701 (82.8%)
Allocation missing  | 4,261 (3.6%)           | 1,250 (5.3%)       | 1,734 (5.4%)         | 416 (1.8%)
Open Label          | 64,222 (53.6%)         | 15,688 (67.1%)     | 18,002 (55.7%)       | 8,750 (38.8%)
Single Blind        | 6,383 (5.3%)           | 905 (3.9%)         | 961 (3.0%)           | 992 (4.4%)
Double Blind        | 45,094 (37.6%)         | 5,701 (24.4%)      | 11,820 (36.6%)       | 12,102 (53.6%)
Masking missing     | 4,141 (3.5%)           | 1,094 (4.7%)       | 1,512 (4.7%)         | 729 (3.2%)
Multi-site          | 44,775 (37.4%)         | 6,356 (27.2%)      | 15,281 (47.3%)       | 12,678 (56.2%)
International       | 17,173 (14.3%)         | 1,804 (7.7%)       | 5,564 (17.2%)        | 7,078 (31.4%)
Enrollment          | 60 (IQR: 26-157)       | 30.0 (IQR: 18-50)  | 52.0 (IQR: 27-116)   | 252.0 (IQR: 99-550)
Duration (days)     | 701.0 (IQR: 335-1,218) | 366.0 (IQR: 92-974)| 792.0 (IQR: 427-1,370)| 730.0 (IQR: 396-1,249)
The dataset derived from the clinicaltrials.gov registry encompassed Phase 1 (n=23,388), Phase 2 (n=32,295), and Phase 3 studies (n=22,573). As summarized by Table 1, trials at different stages differ widely in terms of design characteristics and publication rates, to some extent reflecting the difference in their primary objectives, i.e. first-in-human safety studies in Phase 1 vs efficacy evaluation in later phases. As expected, Phase 1 studies had the smallest enrolment size (median 30) and duration (median 366 days), as well as the smallest fraction of RCTs, as shown by the lowest percentage of studies reporting the use of randomization (46.5%), blinding (28.2%), and control groups (46.5%).

Despite similar efficacy-oriented objectives, registered Phase 2 and Phase 3 trials have largely different characteristics. Whilst most Phase 3 studies follow RCT design paradigms, a relatively small fraction of Phase 2 trials reported the use of randomization, blinding, and control groups. Specifically, only 53.5% of Phase 2 trials are randomized (compared to 82.8% in Phase 3); 58.3% are controlled (compared to 79.9% in Phase 3); and 39.6% are blinded (compared to 58% in Phase 3). Phase 2 trials have smaller enrolment size, with a median of 52 vs 252 patients in Phase 3. However, despite smaller patient populations in Phase 2, both phases have comparable duration: median 792 days (IQR 427-1,370 days) in Phase 2 and median 730 days (IQR 396-1,249 days) in Phase 3. In addition, Phase 2 trials have more eligibility criteria (suggesting a much narrower target patient population) and are less commonly carried out in multiple locations (47.3% vs 56.2% in Phase 3) or across multiple countries (17.2% vs 31.4%), reflecting lower recruitment needs.

Funding source distribution varied across clinical phases. Industry was the main funding source of Phase 1 through Phase 3 trials. Non-profit sponsors (NIH and Other) were more likely to fund Phase 2 and Phase 4 studies; see Fig 1.
Figure 1: Funding source distribution per clinical phase. The graph shows the number of trials funded by four different types of institutions across clinical phases.
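The medians and interquartile ranges reported in Table 1 can be reproduced from per-trial records with the standard library alone; the enrollment figures below are toy values for illustration, not data from the registry.

```python
from statistics import median, quantiles

def median_iqr(values):
    """Return (median, Q1, Q3) for a list of numeric values."""
    q1, _, q3 = quantiles(values, n=4)  # quartile cut points (exclusive method)
    return median(values), q1, q3

# Toy enrollment figures for a handful of hypothetical Phase 1 trials.
enrollment = [18, 24, 30, 36, 50, 64]
med, q1, q3 = median_iqr(enrollment)
print(f"median {med} (IQR: {q1}-{q3})")
```

Grouping trial records by phase (or by sponsor type) and applying this helper per group yields summary rows in the format used throughout the appendix tables.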
Trials across diseases

Trials registered in clinicaltrials.gov cover multiple medical specialties. Oncology is the single most represented discipline with 29.1% of trials, followed by infectious diseases and cardiovascular diseases. Closer analysis of trials conducted over time revealed several interesting trends in trial indication focus. These included the decline of industry-funded studies in cardiovascular diseases (with an increasing number of such studies funded by universities and other non-profit organizations) as well as trends in studying particular conditions (Fig 2). For instance, high growth in the number of trials was observed for arthritis and psoriasis, whilst several CNS disorders, including depression and bipolar disorder, showed negative growth. By contrast, prostate cancer, lymphoma, and several other cancers showed no obvious trend.

Figure 2: Diseases studied in clinical trials conducted over time.

Since differences between diseases might impact the recruitment, design, and overall "success" of clinical trials [4-6], we next investigated the trial variability by 7 main disease areas defined in Hay et al.: oncology, neurology, autoimmune, endocrine, respiratory, cardiovascular, and infectious diseases [4]. As in the sponsor-centered analysis described in the main text, heterogeneity between trials by disease category was most evident in Phase 2. The analysis showed large heterogeneity as a function of disease category, including discrepancies in trial design characteristics, geographic locations (Fig 3), and publication rates (Fig 7 in the main manuscript). Trials in different disease areas varied considerably in duration and enrolment size. The longest Phase 2 trials were in oncology (median 1,157 days), the shortest in respiratory diseases (median 457 days). On the level of individual diseases, Waldenstrom macroglobulinemia (a type of non-Hodgkin lymphoma), splenic marginal zone lymphoma, and anaplastic astrocytoma had the longest trials on average, whilst allergic seasonal rhinitis, ocular hypertension, allergic conjunctivitis, and influenza were the diseases with the shortest trials. Similarly, Phase 2 trials conducted in different disease areas differed substantially in terms of enrolment size. The smallest trials were in oncology (median 51 patients), the largest in infectious diseases (median 117 patients), with influenza having the largest enrolment size in the entire dataset.

Figure 3: Maps of clinical trial locations for two disease areas. The choropleth maps show the number of trials conducted per country for psychiatry and infectious diseases, demonstrating differences in terms of geographic locations between clinical research in the two specialties.
In addition to size and length, trials varied in terms of design strategies. Cancer trials, in particular, had very different characteristics compared to other disease areas. For instance, they rarely followed RCT design in Phase 2, as only 25.8% of Phase 2 cancer studies reported the use of randomization, 34.5% were controlled, and 7.8% were blinded (compared to 83.3%, 85.2%, and 79% of trials conducted for respiratory diseases – the category with the largest proportion of RCTs). Fig 4 compares the characteristics of cancer trials to studies conducted in all other indications.

Figure 4: Trial characteristics: cancer vs non-cancer trials. Three radar charts compare the characteristics of clinical trials in oncology with trials in other specialties across three clinical phases.
[Figure 4: three radar charts (Phase 1, Phase 2, Phase 3), each with axes Blinded, Open label, Randomized, Non-randomized, Placebo, Comparator, Non-controlled, Large, Multi-country, Long, Small, Crossover, Single group, Short, and DMC, on a 20-100% scale; legend: Cancer vs Non-cancer.]
Trial design by funder type

Figure 5: Phase 1 trial characteristics by funding source.

Figure 6: Phase 3 trial characteristics by funding source.
Figure 7: Characteristics of trials sponsored by big and small industrial organizations (across all clinical phases).

Trial dissemination rates

Figure 8: Characteristics of trials whose results were published only through a direct submission to the clinicaltrials.gov results database (no journal publication) and trials whose results were not published in any form.
Figure 9: Trial results dissemination rates by clinical phase. Left: trial dissemination rates by trial funding source across clinical phases. Right: overall trial dissemination rates and distinction between journal publications and submission of structured results to clinicaltrials.gov.

Table 2: Dissemination rates and time to results disclosure by sponsor type.

Funder category | Fraction of publicly disclosed trials | Median dissemination lag | Median time to reporting on clinicaltrials.gov | Median time to journal publication | Number of trials
Big pharma      | 0.667 | 15.467 | 13.083 | 26.000 | 11,508
Small pharma    | 0.452 | 21.767 | 20.833 | 25.767 | 6,119
NIH             | 0.600 | 17.883 | 18.167 | 18.283 | 2,417
Other           | 0.406 | 18.933 | 17.767 | 19.933 | 7,649

Figure 10: Trends in trial reporting on clinicaltrials.gov over time.
Globalization of clinical research

Studies registered with clinicaltrials.gov have become more global with time, although there exist large discrepancies depending on the sponsor type; see Fig 11 and Fig 4d in the main manuscript. The overall number of countries represented in the registry increased from 84 in 2001 to 185 in 2017 and, in addition, the geographical distribution of trials has shifted over time; see Fig 12. For instance, when ranked by the number of registered clinical trials, China has jumped from 35th place in 2005 to 8th place in 2017. These changes in the number and composition of countries included in clinicaltrials.gov reflect the wider use of this primarily American registry as well as trends in the globalisation of clinical trials in general [7].

Figure 11: Trends in clinical trial locations.

Figure 12: Locations of trials registered with clinicaltrials.gov for four temporal subsets.
Changes in trial complexity

Analysis of clinicaltrials.gov data shows that the number of trial eligibility criteria and outcome measures used in trials has increased over time (Fig 13). As these reflect the number of procedures that have to be performed in a trial, this trend suggests that the complexity – and hence the cost – of running clinical trials has increased in recent years.

Figure 13: Average number of eligibility criteria and outcome measures used in clinical trials.
References
1. Top 50 Global Pharma Companies. Annual ranking published by Pharmaceutical Executive. Available from: https://www.rankingthebrands.com/The-Brand-Rankings.aspx?rankingID=370. Accessed: 21/06/2017.
2. ClinicalTrials.gov glossary of common site terms. Available from: https://clinicaltrials.gov/ct2/about-studies/glossary. Accessed: 19/12/2017.
3. ClinicalTrials.gov protocol registration data element definitions for interventional and observational studies. Available from: prsinfo.clinicaltrials.gov/definitions.html. Accessed: 19/12/2017.
4. Hay M, et al. Clinical development success rates for investigational drugs. Nature Biotechnology 2014;32(1):40.
5. Hwang TJ, et al. Failure of investigational drugs in late-stage clinical development and publication of trial results. JAMA Internal Medicine 2016;176(12):1826-1833.
6. Lagakos SW. Clinical trials and rare diseases. Massachusetts Medical Society, 2003.
7. Moses H, et al. The anatomy of medical research: US and international comparisons. JAMA 2015;313(2):174-189.
Trial characteristics and publication of trial results
The table shows the characteristics of published and unpublished trials, including studies whose results were published in high-impact journals (2016 H-index > 400), lower-impact journals (H-index < 50), as well as trials whose results were not published in the form of a journal article, but were submitted to the clinicaltrials.gov results database.
Medians and interquartile ranges are reported for continuous variables; counts and percentages are reported for categorical data (with totalnumber of trials in a given category used as denominator). Percentages might not sum to 100% as some categories are not mutually exclusive(for instance, some trials include both comparator and placebo control groups).
The information is based on the data from clinicaltrials.gov data and from our derived annotations.
• Information about controls used in trials was derived from intervention name, arm type, trial title, and description fields (see the Methodssection of the article).
• Trials were designated as “collaborative” if they were sponsored by more than one institution.
• Industrial organizations were classified as Big Pharma and Small Pharma based on their sales in 2016 as well as overall number ofregistered trials they sponsored (see Methods).
• Primary purpose of clinical trials was classified as “Other purpose” for the following types in clinicaltrials.gov: “Diagnostic”, “SupportiveCare”, “Health Services Research”, “Screening”, “Educational/Counseling/Training”, “Device Feasibility”, “Other”.
• Clinical phase was classified as “Other phase” for the following types in clinicaltrials.gov: Phase 0 (Early Phase 1), combined Phase1/Phase 2, combined Phase 2/Phase 3, and trials with no phase assigned.
• Trials were classified as “Large” if they enrolled more than 100 patients and as “Long” if their duration was longer than 365 days.
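The "Large" and "Long" categories in the notes above are simple threshold rules, which can be made explicit in a few lines; the function names are ours, introduced only for illustration.

```python
def size_category(enrollment):
    """'Large' if the trial enrolled more than 100 patients, else 'Small' (as defined above)."""
    return "Large" if enrollment > 100 else "Small"

def length_category(duration_days):
    """'Long' if the trial duration exceeded 365 days, else 'Short' (as defined above)."""
    return "Long" if duration_days > 365 else "Short"
```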
Table 1: Characteristics of published and unpublished clinical trials

                           | High-impact journals        | Low-impact journals        | clinicaltrials.gov         | Unpublished
Trial count                | 1126                        | 1548                       | 8999                       | 24926
Non-Randomized             | 108 (9.6%)                  | 322 (20.8%)                | 3526 (39.2%)               | 8494 (34.1%)
Randomized                 | 1009 (89.6%)                | 1223 (79.0%)               | 5459 (60.7%)               | 15909 (63.8%)
Allocation missing         | 9 (0.8%)                    | 3 (0.2%)                   | 14 (0.2%)                  | 523 (2.1%)
Open Label                 | 349 (31.0%)                 | 627 (40.5%)                | 5043 (56.0%)               | 13177 (52.9%)
Single Blind               | 41 (3.6%)                   | 122 (7.9%)                 | 451 (5.0%)                 | 1457 (5.8%)
Double Blind               | 725 (64.4%)                 | 796 (51.4%)                | 3500 (38.9%)               | 9740 (39.1%)
Masking missing            | 11 (1.0%)                   | 3 (0.2%)                   | 5 (0.1%)                   | 552 (2.2%)
Comparator-controlled      | 511 (45.4%)                 | 699 (45.2%)                | 3256 (36.2%)               | 8145 (32.7%)
Placebo-controlled         | 654 (58.1%)                 | 723 (46.7%)                | 3237 (36.0%)               | 9318 (37.4%)
Non-controlled             | 164 (14.6%)                 | 334 (21.6%)                | 3204 (35.6%)               | 9236 (37.1%)
Phase 1                    | 58 (5.2%)                   | 190 (12.3%)                | 904 (10.0%)                | 8003 (32.1%)
Phase 2                    | 232 (20.6%)                 | 290 (18.7%)                | 2960 (32.9%)               | 5890 (23.6%)
Phase 3                    | 617 (54.8%)                 | 497 (32.1%)                | 2157 (24.0%)               | 3395 (13.6%)
Phase 4                    | 105 (9.3%)                  | 344 (22.2%)                | 1629 (18.1%)               | 3466 (13.9%)
Other phase                | 65 (5.8%)                   | 92 (5.9%)                  | 601 (6.7%)                 | 1778 (7.1%)
Phase missing              | 49 (4.4%)                   | 135 (8.7%)                 | 748 (8.3%)                 | 2394 (9.6%)
Collaborative              | 446 (39.6%)                 | 416 (26.9%)                | 2748 (30.5%)               | 6930 (27.8%)
Big Pharma                 | 500 (44.4%)                 | 738 (47.7%)                | 4601 (51.1%)               | 8211 (32.9%)
Small Pharma               | 211 (18.7%)                 | 267 (17.2%)                | 1703 (18.9%)               | 6591 (26.4%)
NIH                        | 176 (15.6%)                 | 44 (2.8%)                  | 1129 (12.5%)               | 2007 (8.1%)
Other                      | 229 (20.3%)                 | 494 (31.9%)                | 1491 (16.6%)               | 7983 (32.0%)
U.S. Fed                   | 10 (0.9%)                   | 5 (0.3%)                   | 75 (0.8%)                  | 134 (0.5%)
DMC reported               | 645 (57.3%)                 | 487 (31.5%)                | 2565 (28.5%)               | 7018 (28.2%)
no DMC reported            | 227 (20.2%)                 | 762 (49.2%)                | 4882 (54.3%)               | 12409 (49.8%)
DMC missing                | 254 (22.6%)                 | 299 (19.3%)                | 1552 (17.2%)               | 5499 (22.1%)
Crossover Assignment       | 36 (3.2%)                   | 155 (10.0%)                | 846 (9.4%)                 | 3773 (15.1%)
Factorial Assignment       | 43 (3.8%)                   | 20 (1.3%)                  | 90 (1.0%)                  | 291 (1.2%)
Parallel Assignment        | 915 (81.3%)                 | 1053 (68.0%)               | 4886 (54.3%)               | 11977 (48.1%)
Sequential Assignment      | 0 (0.0%)                    | 0 (0.0%)                   | 0 (0.0%)                   | 3 (0.0%)
Single Group Assignment    | 114 (10.1%)                 | 313 (20.2%)                | 3163 (35.1%)               | 8081 (32.4%)
Intervention model missing | 18 (1.6%)                   | 7 (0.5%)                   | 14 (0.2%)                  | 801 (3.2%)
Basic Science              | 8 (0.7%)                    | 53 (3.4%)                  | 288 (3.2%)                 | 1576 (6.3%)
Prevention                 | 164 (14.6%)                 | 220 (14.2%)                | 790 (8.8%)                 | 2023 (8.1%)
Treatment                  | 922 (81.9%)                 | 1173 (75.8%)               | 7227 (80.3%)               | 18345 (73.6%)
Other purpose              | 17 (1.5%)                   | 41 (2.6%)                  | 335 (3.7%)                 | 1089 (4.4%)
Primary purpose missing    | 15 (1.3%)                   | 61 (3.9%)                  | 359 (4.0%)                 | 1893 (7.6%)
Enrollment                 | 387.5 (IQR: 137.0-1107.25)  | 104.0 (IQR: 42.0-323.0)    | 63.0 (IQR: 30.0-186.75)    | 48.0 (IQR: 24.0-109.0)
Large                      | 906 (80.5%)                 | 809 (52.3%)                | 3518 (39.1%)               | 6837 (27.4%)
Small                      | 214 (19.0%)                 | 734 (47.4%)                | 5480 (60.9%)               | 17652 (70.8%)
Size category missing      | 6 (0.5%)                    | 5 (0.3%)                   | 1 (0.0%)                   | 437 (1.8%)
Length                     | 1003.5 (IQR: 580.0-1522.0)  | 488.0 (IQR: 243.0-823.0)   | 669.0 (IQR: 334.0-1216.0)  | 488.0 (IQR: 184.0-1034.0)
Long                       | 946 (84.0%)                 | 921 (59.5%)                | 6329 (70.3%)               | 13522 (54.2%)
Short                      | 118 (10.5%)                 | 573 (37.0%)                | 2634 (29.3%)               | 9372 (37.6%)
Length category missing    | 62 (5.5%)                   | 54 (3.5%)                  | 36 (0.4%)                  | 2032 (8.2%)
Multi-site                 | 804 (71.4%)                 | 724 (46.8%)                | 4215 (46.8%)               | 8264 (33.2%)
Multi-country              | 522 (46.4%)                 | 312 (20.2%)                | 1655 (18.4%)               | 2698 (10.8%)
Multi-continent            | 391 (34.7%)                 | 203 (13.1%)                | 1081 (12.0%)               | 1580 (6.3%)
Number of locations        | 14.0 (IQR: 2.0-68.0)        | 2.0 (IQR: 1.0-17.0)        | 2.0 (IQR: 1.0-13.0)        | 1.0 (IQR: 1.0-3.0)
Number of eligibility criteria | 14.0 (IQR: 8.0-23.0)    | 12.0 (IQR: 8.0-20.0)       | 13.0 (IQR: 8.0-22.0)       | 12.0 (IQR: 6.0-20.0)
Number of outcomes         | 5.0 (IQR: 3.0-9.0)          | 4.0 (IQR: 2.0-7.0)         | 4.0 (IQR: 2.0-8.0)         | 3.0 (IQR: 2.0-5.0)
Trial characteristics across sponsors
The table shows the characteristics of clinical trials across four sponsor types available in clinicaltrials.gov (lead sponsor field). Each table covers trials of different clinical phases.
Medians and interquartile ranges are reported for continuous variables; counts and percentages are reported for categorical data (with totalnumber of trials in a given category used as denominator). Percentages might not sum to 100% as some categories are not mutually exclusive(for instance, some trials include both comparator and placebo control groups).
The information is based on the data from clinicaltrials.gov data and from our derived annotations.
• Information about controls used in trials was derived from intervention name, arm type, trial title, and description fields (see the Methodssection of the article).
• Trials were designated as “collaborative” if they were sponsored by more than one institution.
• Industrial organizations were classified as Big Pharma and Small Pharma based on their sales in 2016 as well as overall number ofregistered trials they sponsored (see Methods).
• Primary purpose of clinical trials was classified as “Other purpose” for the following types in clinicaltrials.gov: “Diagnostic”, “SupportiveCare”, “Health Services Research”, “Screening”, “Educational/Counseling/Training”, “Device Feasibility”, “Other”.
• Clinical phase was classified as “Other phase” for the following types in clinicaltrials.gov: Phase 0 (Early Phase 1), combined Phase1/Phase 2, combined Phase 2/Phase 3, and trials with no phase assigned.
• Trials were classified as “Large” if they enrolled more than 100 patients and as “Long” if their duration was longer than 365 days.
Table 1: Trial characteristics across sponsors (Phase 1)
Big pharma Small pharma NIH Other
Trial count 5123 3116 878 1300
Non-Randomized 1866 (36.4%) 1227 (39.4%) 470 (53.5%) 747 (57.5%)
Randomized 3224 (62.9%) 1862 (59.8%) 294 (33.5%) 524 (40.3%)
Allocation missing 33 (0.6%) 27 (0.9%) 114 (13.0%) 29 (2.2%)
Open Label 3243 (63.3%) 2007 (64.4%) 488 (55.6%) 874 (67.2%)
Single Blind 269 (5.3%) 126 (4.0%) 17 (1.9%) 73 (5.6%)
Double Blind 1596 (31.2%) 962 (30.9%) 209 (23.8%) 326 (25.1%)
Masking missing 15 (0.3%) 21 (0.7%) 164 (18.7%) 27 (2.1%)
Comparator-controlled 1108 (21.6%) 852 (27.3%) 134 (15.3%) 328 (25.2%)
Placebo-controlled 1704 (33.3%) 984 (31.6%) 237 (27.0%) 366 (28.2%)
Non-controlled 2490 (48.6%) 1439 (46.2%) 494 (56.3%) 645 (49.6%)
Collaborative 754 (14.7%) 620 (19.9%) 508 (57.9%) 338 (26.0%)
DMC reported 457 (8.9%) 690 (22.1%) 327 (37.2%) 648 (49.8%)
no DMC reported 3448 (67.3%) 2051 (65.8%) 93 (10.6%) 491 (37.8%)
DMC missing 1218 (23.8%) 375 (12.0%) 458 (52.2%) 161 (12.4%)
Crossover Assignment 1759 (34.3%) 920 (29.5%) 57 (6.5%) 187 (14.4%)
Factorial Assignment 22 (0.4%) 27 (0.9%) 17 (1.9%) 22 (1.7%)
Parallel Assignment 1618 (31.6%) 982 (31.5%) 257 (29.3%) 381 (29.3%)
Sequential Assignment 0 (0.0%) 3 (0.1%) 0 (0.0%) 0 (0.0%)
Single Group Assignment 1692 (33.0%) 1151 (36.9%) 358 (40.8%) 670 (51.5%)
Intervention model missing 32 (0.6%) 33 (1.1%) 189 (21.5%) 40 (3.1%)
Basic Science 899 (17.5%) 334 (10.7%) 38 (4.3%) 139 (10.7%)
Prevention 141 (2.8%) 149 (4.8%) 155 (17.7%) 143 (11.0%)
Treatment 3021 (59.0%) 2061 (66.1%) 611 (69.6%) 826 (63.5%)
Other purpose 108 (2.1%) 94 (3.0%) 39 (4.4%) 91 (7.0%)
Primary purpose missing 954 (18.6%) 478 (15.3%) 35 (4.0%) 101 (7.8%)
Enrollment 32.0 (IQR: 53.0-20.0) 30.0 (IQR: 48.0-20.0) 28.5 (IQR: 49.25-17.0) 21.0 (IQR: 38.5-12.0)
Large 350 (6.8%) 180 (5.8%) 63 (7.2%) 67 (5.2%)
Small 4738 (92.5%) 2909 (93.4%) 785 (89.4%) 1216 (93.5%)
Size category missing 35 (0.7%) 27 (0.9%) 30 (3.4%) 17 (1.3%)
Length 153.0 (IQR: 456.0-61.0) 153.0 (IQR: 485.0-61.0) 1127.0 (IQR: 1917.75-633.5) 638.0 (IQR: 1217.0-273.0)
Long 1418 (27.7%) 890 (28.6%) 675 (76.9%) 830 (63.8%)
Short 3470 (67.7%) 2124 (68.2%) 91 (10.4%) 407 (31.3%)
Length Category missing 235 (4.6%) 102 (3.3%) 112 (12.8%) 63 (4.8%)
Multi-site 1391 (27.2%) 775 (24.9%) 234 (26.7%) 169 (13.0%)
Multi-country 492 (9.6%) 161 (5.2%) 40 (4.6%) 17 (1.3%)
Multi-continent 324 (6.3%) 81 (2.6%) 21 (2.4%) 7 (0.5%)
Number of locations 1.0 (IQR: 2.0-1.0) 1.0 (IQR: 2.0-1.0) 1.0 (IQR: 2.0-1.0) 1.0 (IQR: 1.0-1.0)
Number of eligibility criteria 12.0 (IQR: 23.0-6.0) 17.0 (IQR: 26.0-9.0) 18.0 (IQR: 30.0-2.0) 15.0 (IQR: 23.0-9.0)
Number of outcomes 4.0 (IQR: 7.0-2.0) 3.0 (IQR: 5.0-2.0) 3.0 (IQR: 5.0-2.0) 2.0 (IQR: 4.0-2.0)
Table 2: Trial characteristics across sponsors (Phase 2)
Big pharma Small pharma NIH Other
Trial count 4158 3078 1677 2535
Non-Randomized 1464 (35.2%) 777 (25.2%) 859 (51.2%) 1194 (47.1%)
Randomized 2674 (64.3%) 2282 (74.1%) 661 (39.4%) 1283 (50.6%)
Allocation missing 20 (0.5%) 19 (0.6%) 157 (9.4%) 58 (2.3%)
Open Label 2031 (48.8%) 1108 (36.0%) 1061 (63.3%) 1531 (60.4%)
Single Blind 147 (3.5%) 134 (4.4%) 16 (1.0%) 112 (4.4%)
Double Blind 1968 (47.3%) 1824 (59.3%) 469 (28.0%) 848 (33.5%)
Masking missing 12 (0.3%) 12 (0.4%) 131 (7.8%) 44 (1.7%)
Comparator-controlled 1197 (28.8%) 926 (30.1%) 267 (15.9%) 812 (32.0%)
Placebo-controlled 1863 (44.8%) 1716 (55.8%) 483 (28.8%) 852 (33.6%)
Non-controlled 1471 (35.4%) 777 (25.2%) 964 (57.5%) 1057 (41.7%)
Collaborative 1188 (28.6%) 754 (24.5%) 1083 (64.6%) 697 (27.5%)
DMC reported 1038 (25.0%) 897 (29.1%) 702 (41.9%) 1219 (48.1%)
no DMC reported 1937 (46.6%) 1749 (56.8%) 407 (24.3%) 953 (37.6%)
DMC missing 1183 (28.5%) 432 (14.0%) 568 (33.9%) 363 (14.3%)
Crossover Assignment 282 (6.8%) 255 (8.3%) 61 (3.6%) 179 (7.1%)
Factorial Assignment 19 (0.5%) 37 (1.2%) 29 (1.7%) 33 (1.3%)
Parallel Assignment 2430 (58.4%) 2010 (65.3%) 582 (34.7%) 1097 (43.3%)
Single Group Assignment 1404 (33.8%) 755 (24.5%) 790 (47.1%) 1135 (44.8%)
Intervention model missing 23 (0.6%) 21 (0.7%) 215 (12.8%) 91 (3.6%)
Basic Science 36 (0.9%) 18 (0.6%) 18 (1.1%) 31 (1.2%)
Prevention 412 (9.9%) 165 (5.4%) 127 (7.6%) 186 (7.3%)
Treatment 3606 (86.7%) 2760 (89.7%) 1481 (88.3%) 2182 (86.1%)
Other purpose 40 (1.0%) 80 (2.6%) 38 (2.3%) 89 (3.5%)
Primary purpose missing 64 (1.5%) 55 (1.8%) 13 (0.8%) 47 (1.9%)
Enrollment 83.0 (IQR: 196.0-40.0) 76.5 (IQR: 159.0-36.0) 48.0 (IQR: 87.0-26.0) 43.0 (IQR: 75.0-25.0)
Large 1877 (45.1%) 1275 (41.4%) 342 (20.4%) 432 (17.0%)
Small 2263 (54.4%) 1771 (57.5%) 1244 (74.2%) 2061 (81.3%)
Size category missing 18 (0.4%) 32 (1.0%) 91 (5.4%) 42 (1.7%)
Length 638.0 (IQR: 1095.0-336.0) 456.0 (IQR: 789.0-243.0) 1431.0 (IQR: 2010.0-914.0) 942.0 (IQR: 1461.0-546.0)
Long 2866 (68.9%) 1660 (53.9%) 1470 (87.7%) 1978 (78.0%)
Short 1087 (26.1%) 1243 (40.4%) 73 (4.4%) 373 (14.7%)
Length Category missing 205 (4.9%) 175 (5.7%) 134 (8.0%) 184 (7.3%)
Multi-site 2841 (68.3%) 1978 (64.3%) 779 (46.5%) 644 (25.4%)
Multi-country 1507 (36.2%) 758 (24.6%) 184 (11.0%) 101 (4.0%)
Multi-continent 1034 (24.9%) 419 (13.6%) 88 (5.2%) 27 (1.1%)
Number of locations 9.0 (IQR: 26.0-2.0) 5.0 (IQR: 17.0-1.0) 1.0 (IQR: 8.0-1.0) 1.0 (IQR: 2.0-1.0)
Number of eligibility criteria 14.0 (IQR: 24.0-8.0) 16.0 (IQR: 24.0-9.0) 14.0 (IQR: 27.0-0.0) 13.0 (IQR: 19.0-7.0)
Number of outcomes 5.0 (IQR: 8.0-2.0) 3.0 (IQR: 6.0-2.0) 3.0 (IQR: 5.0-2.0) 3.0 (IQR: 5.0-2.0)
Table 3: Trial characteristics across sponsors (Phase 3)
Big pharma Small pharma NIH Other
Trial count 4867 2186 428 1692
Non-Randomized 1019 (20.9%) 396 (18.1%) 18 (4.2%) 148 (8.7%)
Randomized 3828 (78.7%) 1787 (81.7%) 398 (93.0%) 1541 (91.1%)
Allocation missing 20 (0.4%) 3 (0.1%) 12 (2.8%) 3 (0.2%)
Open Label 1956 (40.2%) 695 (31.8%) 152 (35.5%) 638 (37.7%)
Single Blind 148 (3.0%) 102 (4.7%) 17 (4.0%) 134 (7.9%)
Double Blind 2749 (56.5%) 1379 (63.1%) 234 (54.7%) 873 (51.6%)
Masking missing 14 (0.3%) 10 (0.5%) 25 (5.8%) 47 (2.8%)
Comparator-controlled 1972 (40.5%) 995 (45.5%) 195 (45.6%) 935 (55.3%)
Placebo-controlled 2179 (44.8%) 1164 (53.2%) 219 (51.2%) 849 (50.2%)
Non-controlled 1202 (24.7%) 349 (16.0%) 75 (17.5%) 226 (13.4%)
Collaborative 825 (17.0%) 464 (21.2%) 362 (84.6%) 520 (30.7%)
DMC reported 965 (19.8%) 673 (30.8%) 309 (72.2%) 803 (47.5%)
no DMC reported 2147 (44.1%) 1181 (54.0%) 37 (8.6%) 621 (36.7%)
DMC missing 1755 (36.1%) 332 (15.2%) 82 (19.2%) 268 (15.8%)
Crossover Assignment 192 (3.9%) 75 (3.4%) 14 (3.3%) 105 (6.2%)
Factorial Assignment 27 (0.6%) 21 (1.0%) 23 (5.4%) 50 (3.0%)
Parallel Assignment 3678 (75.6%) 1675 (76.6%) 325 (75.9%) 1271 (75.1%)
Single Group Assignment 947 (19.5%) 398 (18.2%) 31 (7.2%) 192 (11.3%)
Intervention model missing 23 (0.5%) 17 (0.8%) 35 (8.2%) 74 (4.4%)
Basic Science 7 (0.1%) 4 (0.2%) 0 (0.0%) 13 (0.8%)
Prevention 715 (14.7%) 221 (10.1%) 55 (12.9%) 245 (14.5%)
Treatment 4037 (82.9%) 1840 (84.2%) 320 (74.8%) 1321 (78.1%)
Other purpose 51 (1.0%) 81 (3.7%) 52 (12.1%) 83 (4.9%)
Primary purpose missing 57 (1.2%) 40 (1.8%) 1 (0.2%) 30 (1.8%)
Enrollment 350.0 (IQR: 674.0-144.0) 280.0 (IQR: 516.0-129.0) 255.0 (IQR: 607.0-129.5) 110.0 (IQR: 300.25-50.0)
Large 4011 (82.4%) 1780 (81.4%) 342 (79.9%) 923 (54.6%)
Small 841 (17.3%) 376 (17.2%) 77 (18.0%) 749 (44.3%)
Size category missing 15 (0.3%) 30 (1.4%) 9 (2.1%) 20 (1.2%)
Length 607.0 (IQR: 973.0-365.0) 518.0 (IQR: 822.0-303.0) 1795.0 (IQR: 2517.75-1218.0) 881.0 (IQR: 1461.0-456.0)
Long 3324 (68.3%) 1373 (62.8%) 390 (91.1%) 1217 (71.9%)
Short 1235 (25.4%) 696 (31.8%) 8 (1.9%) 308 (18.2%)
Length Category missing 308 (6.3%) 117 (5.4%) 30 (7.0%) 167 (9.9%)
Multi-site 3511 (72.1%) 1301 (59.5%) 292 (68.2%) 490 (29.0%)
Multi-country 2141 (44.0%) 661 (30.2%) 127 (29.7%) 125 (7.4%)
Multi-continent 1623 (33.3%) 386 (17.7%) 63 (14.7%) 62 (3.7%)
Number of locations 25.0 (IQR: 66.0-5.0) 10.0 (IQR: 36.0-1.0) 10.0 (IQR: 42.0-1.0) 1.0 (IQR: 2.0-1.0)
Number of eligibility criteria 11.0 (IQR: 20.0-7.0) 13.0 (IQR: 21.0-8.0) 11.0 (IQR: 22.25-0.0) 11.0 (IQR: 17.0-7.0)
Number of outcomes 6.0 (IQR: 10.0-3.0) 3.0 (IQR: 6.0-2.0) 4.0 (IQR: 7.0-2.0) 3.0 (IQR: 5.0-2.0)
Impact of funding source and disease studied on clinical trial
dissemination and design
A large-scale integrative analysis of clinicaltrials.gov and PubMed data
Print abstract
Study question: How do the distribution, design characteristics, and dissemination of clinical trials
differ by funding organisation and medical specialty?
Methods: Data for 45,620 clinical trials evaluating small molecules, biologics and vaccines were
downloaded from clinicaltrials.gov, processed, categorized by funder type and disease area, and
linked to disclosed study results. Design characteristics and result dissemination rates were
examined for studies funded by different types of organizations and across major medical
specialties.
Study answer and limitations: Industry was more likely to fund large randomized clinical trials and
to disseminate trial results compared to non-profit trial funders (59.3% vs 45.3%), with large
pharmaceutical companies having higher disclosure rates than small ones (66.7% vs 45.2%). Trials
reporting the use of randomization were more likely to be published in a journal article compared to
non-randomized studies (34.9% vs 18.2%), and journal publication rates varied across disease areas,
ranging from 42% for autoimmune diseases to 20% for oncology. The analysis was limited to studies
registered with clinicaltrials.gov, and our method for linking registry records to associated
publications relied on the presence of a registry identifier (NCT number) in the text or metadata of a
journal article.
What this study adds: Design characteristics (as a marker of methodological quality) vary by funder
and medical specialty, and associate with reporting rates and the likelihood of publication in high-
impact journals.
Funding, competing interests, data sharing: This study received no specific external funding. The
authors declare no competing interests. All the data used in the study are available from public
resources.
Impact of funding source and disease studied on clinical trial
dissemination and design
A large-scale integrative analysis of clinicaltrials.gov and PubMed data
Magdalena Zwierzyna1,2,*, Mark Davies1, Aroon D Hingorani2,3, Jackie Hunter1,4
1BenevolentBio Ltd, London, United Kingdom
2Institute of Cardiovascular Science, University College London, London, United
Kingdom
3Farr Institute of Health Informatics, London, United Kingdom
4St George’s Hospital Medical School, London, United Kingdom
*Corresponding author
Magdalena Zwierzyna
Email: [email protected] (MZ)
Tel: +44 (0) 203 096 0720
BenevolentBio Ltd
40 Churchway
NW1 1LW London
United Kingdom
Word count: 5,084
Abstract
Objective: To investigate the distribution, design characteristics, and dissemination of clinical trials
by funding organisation and medical specialty.
Design: Cross-sectional descriptive analysis.
Data sources: Trial protocol information from clinicaltrials.gov, metadata of journal articles in which
trial results were published (PubMed), and quality metrics of associated journals from SCImago
Journal & Country Rank database.
Selection criteria: All 45,620 clinical trials evaluating small-molecule therapeutics, biologics,
adjuvants, and vaccines, completed after January 2006 and prior to July 2015, including randomized
controlled trials (RCTs) and non-randomized studies across all clinical phases.
Results: Industry was more likely to fund large international RCTs compared to non-profit funders,
although methodological differences have been decreasing with time. Amongst 27,835 completed
efficacy trials (Phase 2-4), 54.2% had disclosed their findings publicly. Industry was more likely to
disseminate trial results compared to non-profit trial funders (59.3% vs 45.3%), with large
pharmaceutical companies having higher disclosure rates than small ones (66.7% vs 45.2%). NIH-
funded trials were disseminated more often than those of other non-profit institutions (60.0% vs
40.6%). Big Pharma and NIH-funded study results were more likely to appear in clinicaltrials.gov
than those from non-profit funders, which were published mainly as journal articles. Trials reporting
the use of randomization were more likely to be published in a journal article compared to non-
randomized studies (34.9% vs 18.2%), and journal publication rates varied across disease areas,
ranging from 42% for autoimmune diseases to 20% for oncology.
Conclusions: Trial design and result dissemination vary substantially depending on the type and
size of the funding institution as well as the disease area under study.
What this paper adds
What is already known on this subject:
• Suboptimal study design and slow dissemination of findings are common amongst registered
clinical trials; almost half of all studies remain unpublished years after completion.
• A wider range of organisations is now funding clinical trials.
• Reporting rates and other quality measures may vary by the type of funding organization.
What this study adds:
• Design characteristics (as a marker of methodological quality) vary by funder and medical
specialty, and associate with reporting rates and the likelihood of publication in high-impact
journals.
• Interventional trials with industrial funding are, on average, more robustly designed than
those funded by non-commercial organizations.
• Industry-funded trials are also more likely to be disclosed than those from other funders,
although small pharmaceutical organizations lag behind “Big Pharma”.
• Trials with NIH funding are more likely to disclose results compared to other academic and
non-profit organizations.
• Big Pharma and NIH-funded studies show high rates of reporting within the clinicaltrials.gov
results database.
• Trials funded by other academic and non-profit organizations were disclosed primarily as
journal articles, and rarely submitted to clinicaltrials.gov.
• Result dissemination rate varies by medical specialty, mainly due to differential journal
publication rates, ranging from 42% for autoimmune diseases to 20% for oncology.
INTRODUCTION
Well conducted clinical trials are widely regarded as the best source of evidence on the efficacy and
safety of medical interventions.[1] Trials of first-in-class drugs also provide the most rigorous test of
causal mechanisms in human disease.[2] Clinical trial findings inform regulatory approvals of novel
drugs, key clinical practice decisions and guidelines, and fuel the progress of translational
medicine.[3, 4] However, this model relies on trial activity being high-quality, transparent, and
discoverable.[5] Randomization, blinding, adequate power, and a clinically relevant patient
population are among the hallmarks of high-quality trials.[1, 5] In addition, timely reporting of
findings, including negative and inconclusive results, ensures fulfilment of ethical obligations to trial
volunteers and avoids unnecessary duplication of research or the creation of biases in the clinical
knowledge base.[6, 7]
To improve the visibility and discoverability of clinical trials, the Food and Drug Administration (FDA)
mandated the registration of interventional efficacy trials and public disclosure of their results.[8-11]
Currently, 17 registries store records of clinical studies.[12, 13] Clinicaltrials.gov, the oldest and
largest of such platforms,[14, 15] serves as both a continually updated trial registry and a database
of trial results, thus offering an opportunity to explore, examine, and monitor the clinical research
landscape.[10, 14]
Previous studies of clinical research activity used registered trial protocols and associated published
reports to investigate trends in quality, reporting rate, and potential for publication bias (e.g.[14, 16-
21]). However, many previous studies relied on time-consuming manual literature searches, which
confined their scope to pivotal randomized controlled trials or research funded by major trial
sponsors. Large-scale analysis of relationships between a wide range of protocol characteristics and
result dissemination rates, and the differences between large and small organizations has not been
previously undertaken. This is important, given a reported growth in trials funded by small industrial
and academic institutions.[22-26]
To address this gap, we performed the most comprehensive analysis to date of all interventional
studies of small molecule and biologic therapies registered on clinicaltrials.gov, up to July 19, 2017.
We linked trial registry entries to the publication of findings in journal articles, and used descriptive
statistics, semantic enrichment, and data visualization methods to investigate and illustrate the
landscape and evolution of registered clinical trials. Specifically, we investigated the influence of
funder type and disease area on the design and publication rates of trials across a wide range of
medical journals; changes over time; and relationships to reported attrition rates in drug
development. To facilitate comparison across diverse study characteristics, we designed a novel
visualization method allowing rapid, intuitive inspection of basic trial properties.
METHODS
Our large-scale cross-sectional analysis and field synopsis of the current and historical clinical trial
landscape uses a combination of basic trial protocol information and published trial results. Our
data processing and integration workflow links records from clinicaltrials.gov with information
about the journal articles in which trial results were published.
Clinicaltrials.gov data processing and annotation
We downloaded the clinicaltrials.gov database as of July 19, 2017 – nearly two decades after the
FDA Modernization Act (FDAMA) legislation requiring trial registration.[8] The primary dataset
encompassed 249,904 studies in total. We then identified records of all pharmaceutical and
biopharmaceutical trials, defined as clinical studies evaluating small-molecule therapeutics,
biologics, adjuvants, and vaccines.[27] To extract the relevant subset, the Study type field was
restricted to “interventional” and Intervention type was restricted to either “Drug” or “Biological”.
This led to a set of 119,840 clinical trials with a range of organizations, disease areas, and design
characteristics (Suppl. File 1).
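This selection step can be sketched in Python; the record layout and field names below are illustrative stand-ins for the registry export, not the exact clinicaltrials.gov schema:

```python
# Sketch of the cohort selection: keep interventional studies whose
# interventions include a drug or biological. Field names are illustrative.
trials = [
    {"nct_id": "NCT001", "study_type": "Interventional", "intervention_types": ["Drug"]},
    {"nct_id": "NCT002", "study_type": "Observational", "intervention_types": ["Drug"]},
    {"nct_id": "NCT003", "study_type": "Interventional", "intervention_types": ["Biological"]},
    {"nct_id": "NCT004", "study_type": "Interventional", "intervention_types": ["Behavioral"]},
]

def is_pharma_trial(trial):
    """True for interventional studies evaluating a drug or biological."""
    return trial["study_type"] == "Interventional" and any(
        t in ("Drug", "Biological") for t in trial["intervention_types"]
    )

pharma_trials = [t["nct_id"] for t in trials if is_pharma_trial(t)]
print(pharma_trials)  # ['NCT001', 'NCT003']
```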
To enable robust aggregate analyses, we processed and further annotated the dataset. Firstly, date
formats and age eligibility fields were harmonised. Next, wherever possible, missing categories were
inferred from available fields as described by Califf and colleagues[16]. For example, for single-arm
interventional trials with missing allocation and blinding type annotations, allocation was assigned as
“non-randomized” and blinding as “open label”.[16] The dataset was further enriched with derived
variables. For example, trial duration value was derived from available trial start and primary
completion date; trial locations were mapped to continents and flagged if they involved emerging
markets, and trials making use of placebo and/or comparator controls were labelled based on
information from intervention name, arm type, trial title, and description field. For instance, if a
study had an arm of type “Placebo Comparator” or an intervention whose name included words
“placebo”, “vehicle” or “sugar pill”, it was annotated as “placebo-controlled”; similarly, studies with
an arm of type “Active Comparator” were classified as “comparator-controlled”; see Suppl. File 2 for
details. Finally, each trial was annotated with the number of outcome measures and eligibility
criteria following parsing of the unstructured eligibility data field as described by Chondrogiannis et
al.[28]
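The control-arm labelling rules described above can be expressed as a small function; a minimal sketch, assuming arm types and intervention names have already been extracted from the registry record:

```python
PLACEBO_WORDS = ("placebo", "vehicle", "sugar pill")

def annotate_controls(arm_types, intervention_names):
    """Label a trial as placebo- and/or comparator-controlled based on
    arm types and intervention names, per the rules in the text."""
    names = " ".join(intervention_names).lower()
    placebo = "Placebo Comparator" in arm_types or any(
        word in names for word in PLACEBO_WORDS
    )
    comparator = "Active Comparator" in arm_types
    return {"placebo_controlled": placebo, "comparator_controlled": comparator}

print(annotate_controls(["Experimental", "Placebo Comparator"], ["Drug X", "Matching placebo"]))
# {'placebo_controlled': True, 'comparator_controlled': False}
```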
To categorize clinical trials by medical specialty, we mapped reported disease condition names to
corresponding Medical Subject Headings (MeSH) terms (from both MeSH Thesaurus and MeSH
Supplementary Concept Records)[29] using a dictionary-based named entity recognition
approach.[30] A total of 21,884 distinct condition terms were annotated with MeSH identifiers,
accounting for 86.4% of all disease names from 88% of the clinical trials. The mapping enabled
normalization of various synonyms to preferred disease names as well as automatic classification of
diseases into distinct categories using the MeSH hierarchy. To compare our results with previously
published clinical trial success rates, we focused on seven major disease areas investigated in the
study by Hay et al.[31]: oncology, neurology, autoimmune, endocrine, respiratory, cardiovascular,
and infectious diseases. Trial conditions annotated with MeSH terms were mapped to these
categories based on matching terms from levels 2 and 3 of the MeSH term hierarchy; see Suppl. File
2 for details on MeSH term assignment and disease category mapping.
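The dictionary-based lookup at the heart of this mapping can be sketched as below; the two MeSH headings shown are real, but the dictionary itself is a toy stand-in for the full MeSH Thesaurus and Supplementary Concept Records:

```python
# Toy dictionary-based named-entity-recognition lookup mapping free-text
# condition names to (MeSH ID, preferred term). Illustrative only.
MESH_DICT = {
    "breast cancer": ("D001943", "Breast Neoplasms"),
    "breast carcinoma": ("D001943", "Breast Neoplasms"),
    "type 2 diabetes": ("D003924", "Diabetes Mellitus, Type 2"),
}

def map_condition(condition):
    """Normalize a reported condition name to its preferred MeSH term,
    or return None if the name is not in the dictionary."""
    return MESH_DICT.get(condition.strip().lower())

print(map_condition("Breast Carcinoma"))  # ('D001943', 'Breast Neoplasms')
```

The same lookup normalizes synonyms ("breast cancer", "breast carcinoma") to one preferred heading, which is what enables the downstream grouping by MeSH hierarchy.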
To categorize studies by funder type, we first extended the agency classification available from
clinicaltrials.gov. Specifically, we used two main categories: industry and non-profit organizations;
and four sub-categories: Big Pharma, Small Pharma, NIH, and Other; the last category corresponding
to universities, hospitals, and other non-profit research organizations.[32, 33] Industrial
organizations were classified as “Big Pharma” and “Small Pharma” based on their sales in 2016 as
well as overall number of registered trials they sponsored. Specifically, an industrial organization was
classed as “Big Pharma” if it featured on the list of 50 companies from the 2016 Pharmaceutical
Executive ranking[34] or if it sponsored more than 100 clinical trials; otherwise it was classified as
“Small Pharma”; see Suppl. File 2. Next, we derived the likely source of funding for each trial based
on lead_sponsor and collaborators data fields, using a previously described method[16] extended to
include the distinction between big and small commercial organizations. Briefly, if NIH was listed
either as the lead sponsor of a trial or a collaborator, the study was classified as NIH-funded. If NIH
was not involved, but either the lead sponsor or a collaborator was from industry, the study was
classified as funded by either “Big Pharma” if it included any Big Pharma organization or “Small
Pharma” otherwise. Remaining studies were assigned “Other” as the funding source.
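The funding-source assignment follows a simple precedence, sketched below; the sponsor-class labels are illustrative (in practice they would be derived from the lead_sponsor and collaborators fields plus the Big/Small Pharma classification):

```python
def classify_funder(lead_sponsor_class, collaborator_classes):
    """Assign a funding category with the precedence described in the text:
    any NIH involvement wins, then Big Pharma, then Small Pharma, then Other."""
    classes = [lead_sponsor_class, *collaborator_classes]
    if "NIH" in classes:
        return "NIH"
    if "Big Pharma" in classes:
        return "Big Pharma"
    if "Small Pharma" in classes:
        return "Small Pharma"
    return "Other"

print(classify_funder("University", ["Small Pharma", "Big Pharma"]))  # Big Pharma
```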
Establishing links between trial registry records and published results
We linked the trial registry records to their disclosed results, either deposited directly on
clinicaltrials.gov or published as a journal article, to establish if the results of a particular trial had
been disseminated. We implemented the approach described by Powell-Smith and Goldacre[20] to
identify eligible trials that should have publicly disclosed results. These were studies completed after
January 2006 and by July 2015 (to allow sufficient time for publication of results) that had not filed an
application to delay submission of the results (based on the firstreceived_results_disposition_date
field in the registry).[20]
regardless of how many trials they funded. The selection process led to a dataset of 45,620 studies,
including 27,835 efficacy trials (Phase 2-4); summarized by the flow diagram in Suppl. File 1.
For each eligible trial, we searched for published results using two methods. Firstly, we searched for
structured reports submitted directly to the results database of clinicaltrials.gov (based on the
clinical_results field). Next, we queried the PubMed database with NCT numbers of eligible trials
using an approach that involves searching for the NCT number in the title, abstract, and Secondary
Source ID field of PubMed-indexed articles.[20, 35, 36] To exclude study protocols, commentaries,
and other non-relevant publication types, we used the broad “Therapy” filter – a standard validated
PubMed search filter for clinical queries[37] – as well as simple keyword matching for “study
protocol” in publication titles.[20] In addition, we excluded articles published before trial
completion.
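A per-trial query of this shape can be assembled as below; the field tags ([si] for Secondary Source ID, [Title/Abstract]) are standard PubMed syntax and the embedded filter follows PubMed's published broad "Therapy" clinical-queries filter, though the exact query used in the study may differ:

```python
# Broad "Therapy" clinical-queries filter, reproduced here for illustration.
THERAPY_FILTER = (
    "((clinical[Title/Abstract] AND trial[Title/Abstract]) OR "
    "clinical trials as topic[MeSH Terms] OR clinical trial[Publication Type] OR "
    "random*[Title/Abstract] OR random allocation[MeSH Terms] OR "
    "therapeutic use[MeSH Subheading])"
)

def build_pubmed_query(nct_id):
    """Search for one NCT number in the title/abstract or Secondary Source ID
    field, excluding articles titled as study protocols."""
    return (
        f"({nct_id}[si] OR {nct_id}[Title/Abstract]) "
        f'AND {THERAPY_FILTER} NOT "study protocol"[Title]'
    )

print(build_pubmed_query("NCT01234567"))
```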
We then determined the proportion of publicly disseminated trial reports for the studies covered by
mandate for result submission (efficacy trials: Phase 2-4) as well as for all completed drug trials
including those not covered by the FDA Amendment Act (Phase 0-1 trials and studies with phase set
to “n/a”). In addition to result dissemination rate, we calculated the dissemination lag for individual
studies to determine the proportion of reports disclosed within the required 12 months after
completion of a trial.
public disclosure of the results (whether through submission to clinicaltrials.gov or publication in a
journal).
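Dissemination lag as defined above can be computed directly from the relevant dates; a minimal sketch:

```python
from datetime import date

def dissemination_lag_days(completion_date, disclosure_dates):
    """Days from study completion to the earliest public disclosure
    (registry result posting or journal publication); None if undisclosed."""
    known = [d for d in disclosure_dates if d is not None]
    if not known:
        return None
    return (min(known) - completion_date).days

lag = dissemination_lag_days(date(2013, 1, 15), [date(2014, 3, 1), date(2013, 11, 20)])
print(lag, lag <= 365)  # 309 True
```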
Categorizing scientific journals
To investigate the relation between characteristics of published trials and quality metrics of scientific
journals, we classified journals as high or low impact based on journal metrics information from
SCImago Journal & Country Rank, a publicly available resource that includes journal quality
indicators developed from the information contained in the Scopus database.[38] Specifically, a
journal was classified as high-impact if its 2016 H-index was larger than 400 and as low-impact if
its H-index was below 50.
Statistical analysis and data visualization
We examined the study protocol data on 15 characteristics available from clinicaltrials.gov and from
our derived annotations. These were: (1) randomization; (2) interventional study design (e.g. parallel
assignment, sequential assignment or crossover studies); (3) masking (open-label, single-blinded,
double-blinded studies); (4) enrolment (total number of participants in completed studies or
estimated number of participants for ongoing studies); (5) arm types (placebo or active comparator);
(6) duration; (7) number of outcomes (clinical endpoints); (8) number of eligibility criteria; (9) study
phase; (10) funding source; (11) disease category; (12) countries; (13) number of locations; (14)
publication of results; (15) reported use of data monitoring committees (DMCs). Further
explanation and definitions of trial properties are available in Suppl. File 2.
Based on these characteristics, we examined the methodological quality of trials funded by
different types of organizations and compared studies published in higher and lower impact journals
and reported on clinicaltrials.gov. In addition, we analysed trial dissemination rates across funding
categories and disease areas and compared them with clinical success rates reported previously.[31]
We used descriptive statistics and data visualization methods to characterize trial categories. For
categorical data we report frequencies and percentages; for continuous variables we report medians
and interquartile ranges. To visualize multiple features of clinical trial design, we used radar plots with
each axis showing the fraction of trials with a different property, such as reported use of
randomization, blinding or active comparators. We converted three continuous variables to binary
categories: we classified trials as “small” if they enrolled fewer than 100 participants, as “short” if
they lasted less than 365 days, and as “multi-country” if trial recruitment centres were in more than
one country. Further details are available in supplementary tables. To facilitate comparisons
between several radar plots, we always standardized the order and length of axes. We used Python
for all data analysis tasks (scikit-learn, pandas, scipy, and numpy), for working with geographical
data (geonamescache, geopy, and Basemap), and for data processing and visualization
(matplotlib, Basemap, and seaborn).
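The binary conversions and per-axis fractions feeding the radar plots can be sketched as follows; the thresholds match those stated in the text, while the field names are illustrative:

```python
def radar_fractions(trials):
    """Fraction of trials with each binary design property, one value per
    radar-chart axis. Thresholds follow the text: <100 participants = small,
    <365 days = short, >1 country = multi-country."""
    axes = {
        "randomized": lambda t: t["randomized"],
        "small": lambda t: t["enrollment"] < 100,
        "short": lambda t: t["duration_days"] < 365,
        "multi_country": lambda t: t["n_countries"] > 1,
    }
    n = len(trials)
    return {name: sum(rule(t) for t in trials) / n for name, rule in axes.items()}

example = [
    {"randomized": True, "enrollment": 40, "duration_days": 200, "n_countries": 1},
    {"randomized": False, "enrollment": 250, "duration_days": 700, "n_countries": 3},
]
print(radar_fractions(example))
# {'randomized': 0.5, 'small': 0.5, 'short': 0.5, 'multi_country': 0.5}
```

Each returned fraction becomes the length of one standardized radar axis, which is what makes plots for different funder categories directly comparable.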
Patient involvement
Patients were not involved in any aspect of the study design, conduct, or in the development of the
research question. The study is an analysis of existing publicly available data and therefore there was
no active patient recruitment for data collection.
RESULTS
Overview of clinical trial activity by funder type
To date, 119,840 pharmaceutical trials have been registered with clinicaltrials.gov. As illustrated by
Fig 1A, the distribution of trials funded by different types of organizations has changed over time. In
particular, there has been a rise in the proportion of studies funded by organizations other than
industry or NIH – mainly universities, hospitals and other academic and non-profit agencies,
classified here as “Other”. Overall, these institutions funded 36.5% of all registered pharmaceutical
trials, followed by Big Pharma with 31%, and Small Pharma with 21.2% of studies. NIH funded 11.3%
of registered pharmaceutical trials (Fig 1B).
Amongst the 10 main study funders ranked by the overall number of trials, two belonged to the
“NIH” category (National Cancer Institute leading with 7,219 trials) and eight to the “Big Pharma”
category (GSK at the top with 3,171 trials). Out of 4,914 “Small Pharma” organizations, 73.6% had
fewer than five trials, whilst 43.1% funded only one study. Similarly, amongst the 6,468 funders
classified as “Other”, 79.2% had fewer than five trials, whilst 32.7% funded only one study.
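Per-funder shares of this kind can be computed from a per-trial table with a simple pandas aggregation; this is a hedged sketch on toy data (the `funder` column name is an assumption):

```python
import pandas as pd

# Toy registry extract: one row per trial with its lead funder (illustrative)
trials = pd.DataFrame({"funder": ["A", "A", "B", "C", "C", "C", "C", "C", "D"]})

counts = trials.groupby("funder").size()   # number of trials per funder
share_lt5 = (counts < 5).mean()            # fraction of funders with fewer than five trials
share_one = (counts == 1).mean()           # fraction of funders with a single study
# here: share_lt5 == 0.75, share_one == 0.5
```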
The approach outlined in the Methods section identified 45,620 clinical trials in our dataset that
should have published results available. The remaining analyses in the Results section focus on
this subset of studies.
Trial design characteristics
The dataset included various types of study designs from large randomized clinical trials (RCTs) to
small single-site studies, many of which lacked controls (Suppl. File 2). Comparative analysis showed
that methodological characteristics of trials varied substantially across clinical phases (Fig 2A),
disease areas (Fig 2B), and funding source (Fig 3). The following section focuses on a comparative
analysis by funder type, whilst a detailed overview of differences across phases and diseases is
available in Suppl. File 2.
Industrial organizations were more likely to fund large international trials and more often reported
the use of randomization, controls and blinding, compared to non-industrial funders. Whilst
differences were evident across all clinical phases (Suppl. File 2), the largest variability was observed
for Phase 2 trials, illustrated by Fig 3. For instance, 68.5% of all industry-funded Phase 2 trials were
randomized (compared to 39.4% and 50.6% of studies funded by NIH and other institutions,
respectively). Studies with industrial funding were also substantially larger (median 80 patients
compared with 43 and 48, respectively) and more likely to include international locations (31.3% for industry vs
4.0% for Other and 11.0% for NIH). Despite larger enrollment size, industry-funded studies were
shorter on average compared to trials by other funders (median 547 days compared to 1,431 days
for NIH and 941 for the “Other” funding category). Non-industrial funders were more likely to report the
use of data monitoring committees (DMCs): 41.9% of trials with NIH funding and 48.1% of studies
funded by other organizations involved DMCs, compared with 26.7% of trials funded by industry.
Design characteristics did not differ substantially between Big and Small Pharma (Suppl. File 2).
Trial design changed substantially over time, with the trends most evident in Phase 2. As shown in
Fig 4, the methodological differences between industry and non-industrial funders have generally
decreased over time. Although overall a higher proportion of industry-led studies reported the use
of randomization, the proportion of RCTs funded by non-profit organizations in Phase 2 has recently
increased (Fig 4). Notably, the use of blinding is still much lower in non-commercial than commercial
trials. Additional analysis showed that the average numbers of trial eligibility criteria and outcome
measures have increased over time, leading to a greater number of procedures performed in an
average study and hence to an increase in trial complexity (Suppl. File 2).
Trial results dissemination and characteristics of published trials
Of the 45,620 completed trials, 27,835 were efficacy studies in Phases 2-4 covered by the mandate
for results publication (Suppl. File 1). Of these, only 15,084 (54.2%) have disclosed their results
publicly. Consistent with previously reported trends,[39, 40] dissemination rates were lower for
earlier clinical phases (not covered by the publication mandate) as only 23.4% of trials in Phases 0-1
had publicly disclosed results (Suppl. File 2).
Detailed analysis of Phase 2-4 trials showed that only 25.2% of published studies disclosed their
results within the required period of 12 months after completion. The median time to first public
reporting, whether through direct submission to clinicaltrials.gov or publication in a journal, was
18.6 months. The dissemination lag was shorter for results submitted to clinicaltrials.gov (median 15.3
months) compared to results published in a journal (median 23.9 months).
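Dissemination-lag statistics of this kind can be derived from completion and first-disclosure dates; the sketch below uses invented dates, and the 30.44 days-per-month conversion is an assumption:

```python
import pandas as pd

# Illustrative completion and first-disclosure dates for three hypothetical trials
df = pd.DataFrame({
    "completed": pd.to_datetime(["2012-01-01", "2012-06-01", "2013-03-01"]),
    "disclosed": pd.to_datetime(["2013-07-15", "2013-04-01", "2014-01-20"]),
})

# Lag in months (approximating a month as 30.44 days)
lag_months = (df["disclosed"] - df["completed"]).dt.days / 30.44
median_lag = lag_months.median()
timely = (lag_months <= 12).mean()   # fraction disclosed within the 12-month window
```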
Of the Phase 2-4 studies, 10,554 (37.9%) disclosed their results through structured submission to
clinicaltrials.gov, 8,338 (29.95%) through a journal article, and 3,808 (13.7%) through both
routes. Scientific articles were published in 1,454 journals of diverse nature and varying impact
factor, ranging from prestigious general medical journals (BMJ, JAMA, Lancet, NEJM) to
narrower-focus specialist journals with no impact information available.
Journal publication status depended on study design. Trials reporting the use of randomization were
more likely to be published compared to non-randomized studies (34.9% vs 18.2%), and large trials
enrolling more than 100 participants were more likely to be published than studies with fewer than
100 participants (39.3% vs 20.5%).
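The design-stratified publication rates above can be sketched in a few lines of pandas; the per-trial flags below are hypothetical, whereas the real analysis uses the full registry extract:

```python
import pandas as pd

# Hypothetical per-trial enrollment sizes and journal-publication flags
df = pd.DataFrame({
    "enrollment": [250, 80, 150, 40, 500, 60],
    "published":  [True, False, True, False, True, False],
})

large = df["enrollment"] > 100                    # size threshold used in the comparison
rate_large = df.loc[large, "published"].mean()    # publication rate of large trials
rate_small = df.loc[~large, "published"].mean()   # publication rate of small trials
```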
In general, studies whose results were disclosed in the form of a scientific publication were more
likely to follow the ideal RCT design paradigms compared to those whose results were submitted to
clinicaltrials.gov only (Fig 5). Furthermore, a similar trend was observed for trials with results
published in higher vs. lower impact journals. For instance, 89.9% of trials published in journals with
H-index > 400 reported the use of randomization, compared to 79% of trials published in journals
with H-index < 50, and 60.4% of trials whose results were submitted to clinicaltrials.gov only (Fig 5).
The characteristics of trials disclosed solely through submission to clinicaltrials.gov did not
differ substantially from those of trials not disseminated in any form (Suppl. Files 2 and 3).
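The journal H-index stratification could be implemented with pandas binning along these lines (toy numbers; the band edges mirror the thresholds quoted above):

```python
import pandas as pd

# Hypothetical journal H-indices and randomization flags for published trials
df = pd.DataFrame({
    "h_index":    [450, 520, 30, 45, 120, 260],
    "randomized": [True, True, False, True, True, False],
})

# Bin journals by H-index and compute the fraction of randomized trials per band
bands = pd.cut(df["h_index"], bins=[0, 50, 400, float("inf")],
               labels=["<50", "50-400", ">400"])
rand_by_band = df.groupby(bands, observed=True)["randomized"].mean()
```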
Trial dissemination rates across funder types
Trial dissemination rates varied amongst the different funding source categories. In general,
industry-funded studies were more likely to disclose results compared to non-commercial trials
(59.3% vs 45.3% of all completed Phase 2-4 studies). They were also more likely to report the results
on clinicaltrials.gov, overall (45.6% vs 24.4%) and within the required period of 12 months after
completion (23.3% vs 10.3%). Notably, only a small difference was observed for journal publication
rates (30.98% vs 28.4%). Additional analysis showed substantial differences across more granular
funder categories (Fig 6A). Among non-profit funders, there was a gap between NIH-funded studies
and those funded by other academic and non-profit organizations, with 60% vs 40.6% of trials
disclosed in the two categories, respectively. Similarly, among the industry-funded clinical trials,
Big Pharma had a higher result dissemination rate compared to Small Pharma (66.7% vs 45.2% of
studies).
In addition to overall reporting rates, the analysis also highlighted differences in the nature and
timing of results dissemination across trial funding sources (Fig 6A-D). Big Pharma had the highest
dissemination rate, both in terms of clinicaltrials.gov submissions (53.2% of studies) and journal
publications (34.8%). By contrast, funders classified as “Other” showed the lowest percentage of
trials reported on clinicaltrials.gov (15.9%). Organizations from this category disclosed trial results
primarily through journal articles (29.1% of all studies) and were the only funder type with fewer
trials reported through clinicaltrials.gov submission than through journal publication. The time to
reporting trials at clinicaltrials.gov was fastest for industry funded trials and slowest for studies
funded by organizations classed as “Other”; publication lag for journal articles showed a reverse
profile (see Fig 6C-D and Suppl. File 2 for detailed statistics).
Analysis of the fraction of trials with results disclosed within the required 12 months after
completion showed that dissemination rates have grown steadily over time, with the largest
increase observed after 2007, when FDAAA came into effect.[9] Further analysis showed that the
observed growth was largely driven by increased timely reporting on clinicaltrials.gov, most evident
for studies funded by Big Pharma and NIH (Suppl. File 2).
Trial dissemination rates across disease categories
The disease areas with the highest result dissemination were autoimmune and infectious diseases,
whilst neurology and oncology showed the lowest dissemination rates (Fig 7). As shown in Fig 7A,
these differences were largely driven by differential trends in journal publication, with less variability
observed in rates of result reporting on clinicaltrials.gov. Oncology in particular consistently had the
lowest journal publication rate across all clinical phases and funder categories.
Two disease categories with the highest fraction of published trials (autoimmune and infectious
diseases) were also previously associated with the highest likelihood of new drug approval (LoA) (12.7%
and 16.7%, respectively). Similarly, two disease areas showing the lowest proportion of published
trials also had the lowest drug approval rates: neurology (LoA of 9.4%) and oncology (6.7%).[31]
Cardiovascular diseases had a relatively high publication rate, but low clinical trial success rate with
only 7.1% of tested compounds eventually approved for marketing.[31]
DISCUSSION
This analysis, enabled by automated data-driven analytics, provided several important insights into
the scope of research activity, design, visibility and reporting of clinical trials of pharmaceutical
interventions. Trial activity has grown and diversified, with a broader range of for-profit and
non-profit organizations undertaking clinical research across a wide range of disease
areas, and the analysis highlighted potentially important differences in trial quality and
dissemination.
In terms of trial design, industry-funded studies tended to use more robust methodology compared
to studies funded by non-profit institutions, although the methodological differences have been
decreasing with time. Result dissemination rates varied substantially depending on disease area as
well as the type and size of funding institution. In general, the results of industry-funded trials were
more likely to be disclosed compared to non-commercial studies, although small pharmaceutical
organizations lagged behind “Big Pharma”. Similarly, trials with NIH funding more often disclosed
their results compared to those funded by other academic and non-profit organizations. Although
overall dissemination rates improved with time, the trend was largely driven by increased results
reporting on clinicaltrials.gov for Big Pharma and NIH-funded studies. By contrast, the results of
trials funded by other academic and non-profit organizations were primarily published as journal
articles and were rarely submitted to clinicaltrials.gov. Finally, differences in dissemination rates
across medical specialties were mainly due to differential journal publication rates ranging from 42%
for autoimmune diseases to 20% for oncology.
Strengths and limitations
One of the major strengths of this work is the larger volume of clinical trials data and the greater
depth and detail of the analysis compared to previous studies in this area. Many factors were
analyzed for the first time in the context of both trial design quality and transparency of research.
These included the distinction between Big Pharma and smaller industrial organizations, analysis of
publication trends across distinct medical specialties, and differences between trials whose results
were disseminated through clinicaltrials.gov submission and publication in higher or lower impact
journals. In addition, the study provided a novel method for visualizing trial characteristics that
enables a rapid assimilation of similarities and differences across a variety of factors.
The study had several limitations related to the underlying dataset and the methodology used to
annotate and integrate the data. Firstly, it was limited to the clinicaltrials.gov database and hence
did not include unregistered studies. However, given that trial registration has been widely accepted
and enforced by regulators and journal editors,[8, 12] it is likely that most of the influential studies
have been captured in the analysis.[41] Additional limitations were due to the data quality issues
such as missing or incorrect entries, ambiguous terminology, and lack of semantic
standardization.[14, 16] Many trials were annotated as placebo or comparator-controlled based on
information mined from their unstructured titles and descriptions, as the correct annotation in the
structured arm type field was often missing. Although a subset of annotations was verified
manually, some trials may still have been misclassified by our automated workflow. In addition,
due to the non-standardized free-text format of the data, only a simple count of eligibility criteria
and outcome measures was used as an imperfect proxy for assessing trial complexity and the
homogeneity of enrolled population.
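The simple criteria count mentioned above might look like the sketch below; the hyphen-bulleted format is an assumption, and real clinicaltrials.gov entries vary considerably:

```python
# Hypothetical free-text eligibility section (format is an assumption)
criteria_text = """Inclusion Criteria:
- Age 18 years or older
- Histologically confirmed diagnosis
Exclusion Criteria:
- Prior treatment with the study drug
- Pregnancy"""

# Count bulleted lines as individual criteria (a crude complexity proxy)
n_criteria = sum(1 for line in criteria_text.splitlines()
                 if line.lstrip().startswith("-"))
# here n_criteria == 4
```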
Finally, our method for linking registry records to associated publications relied on the presence of
a registry identifier (NCT number) in the text or metadata of a journal article. Its absence from a trial
publication would lead to misclassified publication status and, consequently, underestimated
publication rates. However, owing to the rigour of the search methods, it is unlikely that the
publication status was misclassified in a systematic manner, e.g. depending on study funding source
or disease area. Hence, although absolute estimates might be affected, it is unlikely that the
comparative analysis based on automated record linkage is biased towards one of the categories.
Notably, reporting of registration numbers is encouraged by medical journals through ICMJE and the
CONSORT checklist and it can be argued that compliance is part of investigators’ responsibility to
ensure that their research is transparent and discoverable.[20, 35]
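Record linkage via NCT numbers can be sketched with a regular expression; the abstract text below is invented, and the only assumption about the identifier format is that it is "NCT" followed by eight digits:

```python
import re

# NCT identifiers: "NCT" followed by eight digits
NCT_RE = re.compile(r"\bNCT\d{8}\b")

abstract = ("Methods: this randomised trial (ClinicalTrials.gov: NCT01234567) "
            "enrolled 120 participants across 14 sites.")
ids = NCT_RE.findall(abstract)   # registry IDs found in the article text
```

An article with no matches would be treated as unlinked, which is exactly the failure mode discussed above when authors omit the registration number.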
Comparison with other studies
In addition to providing novel contributions, our analysis further confirmed and extended several
findings reported previously by others. Notably, larger sample size, wide inclusion criteria, and
additional mappings allowed us to calculate more accurate statistics and ask previously unexplored
questions in an unbiased way. For instance, Bala et al.[42] showed that trials published in higher
impact journals have larger enrollment sizes, based on a sample of 1,140 RCT reports identified by manual
search. Our study extended this analysis to the entire clinicaltrials.gov registry (not only RCTs) and to many
additional trial properties beyond enrollment size, including reported use of DMCs, duration of the
studies, and number of countries. Other previously reported trends confirmed and further examined
in this study included lower reporting rate observed for earlier clinical phases[39, 40] and academic
sponsors,[39, 40] longer dissemination lag for journal publications,[43] and overall improvements in
result dissemination.[44]
Several previous studies investigated the transparency of clinical research using various subsets of
clinicaltrials.gov and diverse approaches for classifying study results as published[17, 20, 36, 39, 40,
45, 46]; these are summarized in detail elsewhere.[47, 48] The general trends shown by our study were within
the range of previously reported estimates, and our estimate of 54% for the overall trial dissemination rate was in
agreement with two recent systematic reviews.[47, 48] Our study is methodologically most similar
to the work by Powell-Smith and Goldacre[20] with one important distinction: we did not restrict
the analysis to trials sponsored by organizations with more than 30 studies as this would exclude
small pharmaceutical companies and research centers, and bias the results towards large
institutions that routinely perform or fund clinical research. As a consequence, the analysis showed
analogous trends, but slightly lower overall reporting rates, compared with that study.[20]
Meaning and implications of the study
The higher methodological quality of industry funded trials may reflect the available financial and
organizational resources as well as better infrastructure and expertise in conducting clinical
trials.[49, 50] It is also possible that differential objectives of the industrial and non-profit
investigators (development of new medicines with a view to gaining a license for market access vs
mechanistic research and rapid publication of novel findings) contribute to the observed
differences.[49, 51]
The analysis suggests that trials funded by large institutions that regularly conduct clinical research
are more likely to disseminate their results publicly. Although industry led reporting overall, ‘Big
Pharma’ performed much better than smaller organizations, likely because many large
companies have developed explicit disclosure policies in recent years.[52-54] Similarly, NIH has
developed specific regulations for disseminating the results of the studies it funds. The most recent
such policy requires the publication of all NIH-funded trials, including (unprecedentedly) Phase 1
safety studies.[55] Whilst large trial sponsors seem to actively pursue better transparency,[39]
smaller organizations, particularly those in the non-profit sector, were less responsive to the FDAAA
publication mandate. It is possible that such institutions are less able, or less motivated, to allocate
time and resources required to prepare and submit the results, or may be less familiar with the
legislation.[39, 41]
Our study suggests that the FDA mandate for trial results dissemination and the establishment of
clinicaltrials.gov results database led to improved transparency of clinical research. The time to
results dissemination is faster for clinicaltrials.gov compared to traditional publication and the
database stores the results of many otherwise unpublished studies. Lack of journal publications
might be explained by the reluctance of editors to publish smaller studies based on suboptimal
design or studies with “negative” or non-significant outcomes that are unlikely to influence clinical
practice.[42, 56, 57] Our analysis showed that journal publication varied depending on trial design
and disease area, with certain medical specialties and smaller non-randomized studies more likely to
be disseminated solely through clinicaltrials.gov. The difference was particularly evident for
oncology. Only a fifth of completed cancer trials were published in a journal, perhaps due to the high
prevalence of small sample sizes, suboptimal trial designs, and negative outcomes in this field.[14,
31] The high clinicaltrials.gov dissemination rates, including for cancer trials and small studies, suggest
that the resource plays an important role in reducing publication bias. Thus, to ensure better
completeness of available evidence, the database should be used by clinical and drug discovery
researchers in addition to literature reviews.
Importantly, automated analyses and methods for identification of yet to be published studies could
be used by regulators and funders of clinical research to track trial investigators who fail to
disclose the results of their studies. Such a technology would be particularly useful when coupled
with automated notifications and additional incentives (or sanctions) for result dissemination.
Recently, the concept of automated ongoing audit of trial reporting was implemented in the
online TrialsTracker tool, which monitors major clinical trial sponsors with more than 30 studies.[20]
Improvements to the quality of clinical trial data could facilitate the development of robust
monitoring systems in the future. Firstly, data quality should be taken into account when registering
new trials and standardized nomenclature should be used whenever possible. Ideally, the user
interface of the clinicaltrials.gov registry itself should be designed to encourage and facilitate
normalization. In addition, stricter adherence to the CONSORT requirement for including the trial
registration number with each published article would improve the accuracy of automated mapping
of registry records to associated publications.[41]
In addition to improved monitoring, new regulations might be needed to ensure better transparency
and it is important that small industrial organizations and academic centres are reached by these
efforts as well. One possible solution was proposed by Anderson and colleagues[39]: journal editors
from ICMJE could call for submission of results to clinicaltrials.gov as a requirement for article
publication. Given the dramatic increase in trial registration rates that followed previous ICMJE
policies,[58, 59] it is likely that such a call would improve the reporting of ongoing clinical
research.[39]
Acknowledgements
We thank our colleagues from BenevolentAI, University College London, and the Farr Institute of Health
Informatics for many useful discussions and advice, in particular John Overington (currently at
Medicines Discovery Catapult), Nikki Robas, Bryn Williams-Jones, Chris Finan, and Peter Cox. We also
thank the clinicaltrials.gov team for their assistance and advice.
Funding
This study received no specific external funding from any institution in the public, commercial, or
not-for-profit sectors and was conducted by BenevolentBio as part of its normal business.
Competing interests
All authors have completed the ICMJE uniform disclosure form at
http://www.icmje.org/coi_disclosure.pdf and declare: no financial relationships with any
organisations that might have an interest in the submitted work and no other relationships or
activities that could appear to have influenced the submitted work.
Licence
This is an Open Access article distributed in accordance with the Creative Commons Attribution Non
Commercial (CC BY-NC 3.0) license, which permits others to distribute, remix, adapt, build upon this
work non-commercially, and license their derivative works on different terms, provided the original
work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-
nc/3.0/.
Contributorship
Contributors: MZ, MD, AH, and JH conceived and designed this study. MZ and MD acquired the
data. MZ, MD, AH, and JH analysed and interpreted the data. The initial manuscript was drafted by
MZ; all authors critically revised the manuscript and approved its final version. MZ takes
responsibility for the overall integrity of the data and the accuracy of the data analysis. JH is the
guarantor.
Transparency
The authors affirm that the manuscript is an honest, accurate, and transparent account of the study
being reported; that no important aspects of the study have been omitted; and that any
discrepancies from the study as planned (and, if relevant, registered) have been explained.
Data sharing
All the data used in the study are available from public resources. The dataset can be made available
on request from the authors.
Patient consent
Not required
Ethical approval
Not required
References
1. Sibbald B and Roland M. Understanding controlled trials. Why are randomised controlled
trials important? BMJ 1998;316(7126):201.
2. Lindsay MA. Target discovery. Nat Rev Drug Discov 2003;2(10):831.
3. Sackett DL, Rosenberg WMC, Gray JAM, et al. Evidence based medicine: what it is and what
it isn't. BMJ 1996;312:71.
4. Woolf SH. The meaning of translational research and why it matters. JAMA 2008;299(2):211-
213.
5. Kendall J. Designing a research project: randomised controlled trials and their principles.
Emerg Med J 2003;20(2):164-168.
6. Simes RJ. Publication bias: the case for an international registry of clinical trials. J Clin Oncol
1986;4(10):1529-1541.
7. Yamey G. Scientists who do not publish trial results are “unethical”. BMJ
1999;319(7215):939.
8. Food and Drug Administration Modernization Act (FDAMA) of 1997. Available from:
https://www.fda.gov/RegulatoryInformation/LawsEnforcedbyFDA/SignificantAmendmentst
otheFDCAct/FDAMA/default.htm; accessed 09/01/2018.
9. Food and Drug Administration Amendments Act (FDAAA) of 2007. Available from:
https://www.fda.gov/drugs/guidancecomplianceregulatoryinformation/guidances/ucm0649
98.htm; accessed 09/01/2018.
10. Zarin DA and Tse T. Moving towards transparency of clinical trials. Science
2008;319(5868):1340.
11. Zarin DA, Tse T, Williams RJ, et al. Trial reporting in ClinicalTrials.gov—the final rule. N Engl J
Med 2016;375(20):1998-2004.
12. ICMJE's clinical trial registration policy. Available from:
http://www.icmje.org/recommendations/browse/publishing-and-editorial-issues/clinical-
trial-registration.html; accessed 09/01/2018.
13. WHO International Clinical Trials Registry Platform (ICTRP): Primary Registries. Available
from: http://www.who.int/ictrp/network/primary/en/; accessed 09/01/2018.
14. Hirsch BR, Califf RM, Cheng SK, et al. Characteristics of oncology clinical trials: insights from a
systematic analysis of ClinicalTrials.gov. JAMA Intern Med 2013;173(11):972-979.
15. Zarin DA, Ide NC, Tse T, et al. Issues in the registration of clinical trials. JAMA
2007;297(19):2112-2120.
16. Califf RM, Zarin DA, Kramer JM, et al. Characteristics of clinical trials registered in
ClinicalTrials.gov, 2007-2010. JAMA 2012;307(17):1838-1847.
17. Chen R, Desai NR, Ross JS, et al. Publication and reporting of clinical trial results:
cross sectional analysis across academic medical centers. BMJ 2016;352:i637.
18. Hakala A, Kimmelman J, Carlisle B, et al. Accessibility of trial reports for drugs stalling in
development: a systematic assessment of registered trials. BMJ 2015;350:h1116.
19. Mathieu S, Boutron I, Moher D, et al. Comparison of registered and published primary
outcomes in randomized controlled trials. JAMA 2009;302(9):977-984.
20. Powell-Smith A and Goldacre B. The TrialsTracker: Automated ongoing monitoring of failure
to share clinical trial results by all major companies and research institutions. F1000Res
2016;5:2629.
21. Ross JS, Mulvey GK, Hines EM, et al. Trial publication after registration in ClinicalTrials.gov:
a cross-sectional analysis. PLoS Med 2009;6(9):e1000144.
22. Small drug companies grow as big pharma outsources. Available from:
http://www.telegraph.co.uk/business/2017/01/22/top-50-privately-owned-pharma-
companies-britain/; accessed 09/01/2018.
23. Barden CJ and Weaver DF. The rise of micropharma. Drug Discov Today 2010;15(3):84-87.
24. Munos B. Lessons from 60 years of pharmaceutical innovation. Nat Rev Drug Discov
2009;8(12):959-968.
25. Munos BH and Chin WW. How to revive breakthrough innovation in the pharmaceutical
industry. Sci Transl Med 2011;3(89):89cm16.
26. Pammolli F, Magazzini L, and Riccaboni M. The productivity crisis in pharmaceutical R&D.
Nat Rev Drug Discov 2011;10(6):428-438.
27. Thiers FA, Sinskey AJ, and Berndt ER. Trends in the globalization of clinical trials. Nat Rev
Drug Discov 2008;7(1):13.
28. Chondrogiannis E, Andronikou V, Tagaris A, et al. A novel semantic representation for
eligibility criteria in clinical trials. J Biomed Inform 2017;69:10-23.
29. Coletti MH and Bleich HL. Medical subject headings used to search the biomedical literature.
J Am Med Inform Assoc 2001;8(4):317-323.
30. Jensen LJ, Saric J, and Bork P. Literature mining for the biologist: from information retrieval
to biological discovery. Nat Rev Genet 2006;7(2):119-129.
31. Hay M, Thomas DW, Craighead JL, et al. Clinical development success rates for
investigational drugs. Nat Biotechnol 2014;32(1):40.
32. ClinicalTrials.gov Glossary of common site terms. Available from:
https://clinicaltrials.gov/ct2/about-studies/glossary; accessed 09/01/2018.
33. ClinicalTrials.gov Protocol registration data element definitions for interventional and
observational studies. Available from: http://prsinfo.clinicaltrials.gov/definitions.html; accessed
09/01/2018.
34. Top 50 Global Pharma Companies. Annual ranking published by Pharmaceutical Executive.
Available from: https://www.rankingthebrands.com/The-Brand-
Rankings.aspx?rankingID=370; accessed 09/01/2018.
35. Huser V and Cimino JJ. Linking ClinicalTrials.gov and PubMed to track results of
interventional human clinical trials. PLoS One 2013;8(7):e68409.
36. Ramsey S and Scoggins J. Commentary: practicing on the tip of an information iceberg?
Evidence of underpublication of registered clinical trials in oncology. Oncologist
2008;13(9):925-929.
37. Lokker C, Haynes RB, Wilczynski NL, et al. Retrieval of diagnostic and treatment studies for
clinical use through PubMed and PubMed's Clinical Queries filters. J Am Med Inform Assoc
2011;18(5):652-659.
38. SCImago, SJR — SCImago Journal & Country Rank. Available from: www.scimagojr.com;
accessed 09/01/2018.
39. Anderson ML, Chiswell K, Peterson ED, et al. Compliance with results reporting at
ClinicalTrials.gov. New Engl J Med 2015;372(11):1031-1039.
40. Prayle AP, Hurley MN, and Smyth AR. Compliance with mandatory reporting of clinical trial
results on ClinicalTrials.gov: cross sectional study. BMJ 2012;344:d7373.
41. Manzoli L, Flacco ME, D'Addario M, et al. Non-publication and delayed publication of
randomized trials on vaccines: survey. BMJ 2014;348:g3058.
42. Bala MM, Akl EA, Sun X, et al. Randomized trials published in higher vs. lower impact
journals differ in design, conduct, and analysis. J Clin Epidemiol 2013;66(3):286-295.
43. Riveros C, Dechartres A, Perrodeau E, et al. Timing and completeness of trial results posted
at ClinicalTrials.gov and published in journals. PLoS Med 2013;10(12):e1001566.
44. Miller JE, Wilenzick M, Ritcey N, et al. Measuring clinical trial transparency: an empirical
analysis of newly approved drugs and large pharmaceutical companies. BMJ Open
2017;7(12):e017917.
45. Jones CW, Handler L, Crowell KE, et al. Non-publication of large randomized clinical trials:
cross sectional analysis. BMJ 2013;347: f6104.
46. Ross JS, Tse T, Zarin DA, et al. Publication of NIH funded trials registered in
ClinicalTrials.gov: cross sectional analysis. BMJ 2012;344:d7292.
47. Chan AW, Song F, Vickers A, et al. Increasing value and reducing waste: addressing
inaccessible research. Lancet 2014;383(9913):257-266.
48. Schmucker C, Schell LK, Portalupi S, et al. Extent of non-publication in cohorts of studies
approved by research ethics committees or included in trial registries. PLoS One
2014;9(12):e114023.
49. Laterre PF and François B. Strengths and limitations of industry vs. academic randomized
controlled trials. Clin Microbiol Infect 2015;21(10):906-909.
50. Martin L, Hutchens M, Hawkins C, et al. How much do clinical trials cost? Nat Rev Drug
Discov 2017;16(6):381-382.
51. Ehlers MD. Lessons from a recovering academic. Cell 2016;165(5):1043-1048.
52. Novartis Position on Clinical Study Transparency – Clinical Study Registration, Results
Reporting and Data Sharing. Available from:
https://www.novartis.com/sites/www.novartis.com/files/clinical-trial-data-transparency.pdf;
accessed 09/01/2018.
53. Pfizer: Trial Data & Results. Available from:
https://www.pfizer.com/science/clinical-trials/trial-data-and-results; accessed 09/01/2018.
54. GSK: Data transparency. Available from:
https://www.gsk.com/en-gb/behind-the-science/innovation/data-transparency/; accessed
09/01/2018.
55. NIH Policy on the Dissemination of NIH-Funded Clinical Trial Information. Available from:
https://grants.nih.gov/grants/guide/notice-files/NOT-OD-16-149.html; accessed
09/01/2018.
56. Hopewell S, Loudon K, Clarke MJ, et al. Publication bias in clinical trials due to statistical
significance or direction of trial results. Cochrane Database Syst Rev 2009;1:MR000006.
57. Stern JM and Simes RJ. Publication bias: evidence of delayed publication in a cohort study of
clinical research projects. BMJ 1997;315(7109):640-645.
58. Laine C, Horton R, DeAngelis CD, et al. Clinical trial registration—looking back and moving
ahead. Lancet 2007;369(9577):1909-1911.
59. Viergever RF and Li K. Trends in global clinical trial registration: an analysis of numbers of
registered clinical trials in different parts of the world from 2004 to 2013. BMJ Open
2015;5(9):e008932.
Figure captions
Fig 1. Funding source distribution for all trials registered with clinicaltrials.gov. A: Trends in funding
source distribution for all pharmaceutical trials conducted between Jan 1, 1997 and July 19, 2017
(based on the start_date data field). B: Overall funding source distribution for all registered
pharmaceutical studies.
Fig 2. Trial design properties. Each radar chart illustrates the differences between three groups of
trials (represented by coloured polygons) with respect to fifteen trial protocol characteristics
(individual radial axes). Each axis shows the fraction of trials with a given property, such as reported
use of randomization, blinding, or data monitoring committees (DMC). A: Trial characteristics by
clinical phase. B: Characteristics of Phase 2 treatment-oriented trials for three representative
diseases. The profile of glioblastoma (with many small non-randomized studies) is typical of
oncology trials; see Suppl. File 2.
Fig 3. Phase 2 trial properties by funding source. Detailed statistics across the four funding
categories and the remaining clinical phases are available in Suppl. File 4.
Fig 4. Changes in clinical trial design over time. Plots compare the characteristics of all Phase 2 trials
(regardless of completion status) divided into four temporal subsets based on their starting year.
Additional results are available in Suppl. File 2.
Fig 5. Characteristics of trials whose results were published as a journal article or through a direct
submission to the clinicaltrials.gov results database. Trials published as an article are divided into
high- and low-impact classes based on the 2016 H-index of the corresponding journal; see Methods.
Amongst 945 trials published in high-impact journals (H-index > 400), 459 (48.6%) were funded by
Big Pharma, 178 by Other, 169 by Small Pharma, and 139 by NIH. Detailed statistics are available in
Suppl. File 3.
Fig 6. Trial results dissemination rates by trial funding category. A: Trial dissemination rates by
funder type: overall dissemination rate, journal publications, and submission of structured results to
clinicaltrials.gov. B: Time to first results dissemination (whether through submission to the
clinicaltrials.gov results database or publication of a journal article). The vertical line indicates the
12-month reporting deadline mandated by the FDAAA. C: Time to results reporting on
clinicaltrials.gov. D: Time to journal article publication.
Fig 7. Trial results dissemination rates and funding source distribution across seven medical
specialties. A: For each specialty, the plot shows the fraction of completed Phase 2-4 trials whose
results were publicly disseminated, either as a journal article or through direct submission to the
clinicaltrials.gov results database. B: Funding source distribution by medical specialty.