Confidential: For Review Only
Impact of funding source and disease studied on clinical trial dissemination and design: a large-scale integrative analysis of clinicaltrials.gov and PubMed data
Journal: BMJ
Manuscript ID BMJ.2017.041628.R1
Article Type: Research
BMJ Journal: BMJ
Date Submitted by the Author: 12-Jan-2018
Complete List of Authors:
Zwierzyna, Magdalena; BenevolentBio Ltd, Biomedical Informatics; University College London, Institute of Cardiovascular Science
Davies, Mark; BenevolentBio Ltd
Hingorani, Aroon; University College London, Institute of Cardiovascular Science; Farr Institute of Health Informatics
Hunter, Jackie; BenevolentBio Ltd; St George's Hospital Medical School
Keywords: clinical trials, research transparency, study design, data analysis, data integration, trial reporting, publication bias
https://mc.manuscriptcentral.com/bmj
BMJ
Impact of funding source and disease studied on clinical trial dissemination and design
A large-scale integrative analysis of clinicaltrials.gov and PubMed data
Magdalena Zwierzyna, biomedical data scientist1,2,*, Mark Davies, VP of biomedical informatics1, Aroon D Hingorani, professor, consultant2,3, Jackie Hunter, CEO, professor1,4
1BenevolentBio Ltd, London, United Kingdom
2Institute of Cardiovascular Science, University College London, London, United
Kingdom
3Farr Institute of Health Informatics, London, United Kingdom
4St George’s Hospital Medical School, London, United Kingdom
*Corresponding author
Magdalena Zwierzyna
Email: [email protected] (MZ)
Tel: +44 (0) 203 096 0720
BenevolentBio Ltd
40 Churchway
NW1 1LW London
United Kingdom
Word count: 5,084
Abstract
Objective: To investigate the distribution, design characteristics, and dissemination of clinical trials
by funding organisation and medical specialty.
Design: Cross-sectional descriptive analysis.
Data sources: Trial protocol information from clinicaltrials.gov, metadata of journal articles in which
trial results were published (PubMed), and quality metrics of associated journals from SCImago
Journal & Country Rank database.
Selection criteria: All 45,620 clinical trials evaluating small-molecule therapeutics, biologics,
adjuvants, and vaccines, completed after January 2006 and prior to July 2015, including randomized
controlled trials (RCTs) and non-randomized studies across all clinical phases.
Results: Industry was more likely to fund large international RCTs compared to non-profit funders,
although methodological differences have been decreasing with time. Amongst 27,835 completed
efficacy trials (Phase 2-4), 54.2% had disclosed their findings publicly. Industry was more likely to
disseminate trial results compared to non-profit trial funders (59.3% vs 45.3%), with large
pharmaceutical companies having higher disclosure rates than small ones (66.7% vs 45.2%). NIH-
funded trials were disseminated more often than those of other non-profit institutions (60.0% vs
40.6%). Big Pharma and NIH-funded study results were more likely to appear in clinicaltrials.gov
than those from non-profit funders, which were published mainly as journal articles. Trials reporting
the use of randomization were more likely to be published in a journal article compared to non-
randomized studies (34.9% vs 18.2%), and journal publication rates varied across disease areas,
ranging from 42% for autoimmune diseases to 20% for oncology.
Conclusions: Trial design and result dissemination varies substantially depending on the type and
size of funding institution as well as the disease area under study.
What this paper adds
What is already known on this subject:
• Suboptimal study design and slow dissemination of findings are common amongst registered
clinical trials, with almost half of all studies remaining unpublished years after completion.
• A wider range of organisations is now funding clinical trials.
• Reporting rates and other quality measures may vary by the type of funding organization.
What this study adds:
• Design characteristics (as a marker of methodological quality) vary by funder and medical
specialty, and associate with reporting rates and the likelihood of publication in high-impact
journals.
• Interventional trials with industrial funding are, on average, more robustly designed than
those funded by non-commercial organizations.
• Industry-funded trials are also more likely to be disclosed than those from other funders,
although small pharmaceutical organizations lag behind “Big Pharma”.
• Trials with NIH funding are more likely to disclose results compared to other academic and
non-profit organizations.
• Big Pharma and NIH-funded studies show high rates of reporting within the clinicaltrials.gov
results database.
• Trials funded by other academic and non-profit organizations were disclosed primarily as
journal articles, and rarely submitted to clinicaltrials.gov.
• Result dissemination rate varies by medical specialty, mainly due to differential journal
publication rates, ranging from 42% for autoimmune diseases to 20% for oncology.
INTRODUCTION
Well conducted clinical trials are widely regarded as the best source of evidence on the efficacy and
safety of medical interventions.[1] Trials of first-in-class drugs also provide the most rigorous test of
causal mechanisms in human disease.[2] Clinical trial findings inform regulatory approvals of novel
drugs, key clinical practice decisions and guidelines, and fuel the progress of translational
medicine.[3, 4] However, this model relies on trial activity being high-quality, transparent, and
discoverable.[5] Randomization, blinding, adequate power, and a clinically relevant patient
population are among the hallmarks of high-quality trials.[1, 5] In addition, timely reporting of
findings, including negative and inconclusive results, ensures fulfilment of ethical obligations to trial
volunteers and avoids unnecessary duplication of research and the creation of biases in the clinical knowledge base.[6, 7]
To improve the visibility and discoverability of clinical trials, the Food and Drug Administration (FDA)
mandated the registration of interventional efficacy trials and public disclosure of their results.[8-11]
Currently, 17 registries store records of clinical studies.[12, 13] Clinicaltrials.gov, the oldest and
largest of such platforms,[14, 15] serves as both a continually updated trial registry and a database
of trial results, thus offering an opportunity to explore, examine, and monitor the clinical research
landscape.[10, 14]
Previous studies of clinical research activity used registered trial protocols and associated published
reports to investigate trends in quality, reporting rate, and potential for publication bias (e.g.[14, 16-
21]). However, many previous studies relied on time-consuming manual literature searches, which
confined their scope to pivotal randomized controlled trials or research funded by major trial
sponsors. Large-scale analysis of relationships between a wide range of protocol characteristics and
result dissemination rates, and the differences between large and small organizations has not been
previously undertaken. This is important, given a reported growth in trials funded by small industrial
and academic institutions.[22-26]
To address this gap, we performed the most comprehensive analysis to date of all interventional
studies of small molecule and biologic therapies registered on clinicaltrials.gov, up to July 19, 2017.
We linked trial registry entries to the publication of findings in journal articles, and used descriptive
statistics, semantic enrichment, and data visualization methods to investigate and illustrate the
landscape and evolution of registered clinical trials. Specifically, we investigated the influence of
funder type and disease area on the design and publication rates of trials across a wide range of
medical journals; changes over time; and relationships to reported attrition rates in drug
development. To facilitate comparison across diverse study characteristics, we designed a novel
visualization method allowing rapid, intuitive inspection of basic trial properties.
METHODS
Our large-scale cross-sectional analysis and field synopsis of the current and historical clinical trial
landscape uses a combination of basic trial protocol information and published trial results. The
developed data processing and integration workflow manipulates and links data from
clinicaltrials.gov with information about journal articles in which trial results were published.
Clinicaltrials.gov data processing and annotation
We downloaded the clinicaltrials.gov database as of July 19, 2017 – nearly two decades after the
FDA Modernization Act (FDAMA) legislation requiring trial registration.[8] The primary dataset
encompassed 249,904 studies in total. We then identified records of all pharmaceutical and
biopharmaceutical trials, defined as clinical studies evaluating small-molecule therapeutics,
biologics, adjuvants, and vaccines.[27] To extract the relevant subset, the Study type field was
restricted to “interventional” and Intervention type was restricted to either “Drug” or “Biological”.
This led to a set of 119,840 clinical trials with a range of organizations, disease areas, and design
characteristics (Suppl. File 1).
To enable robust aggregate analyses, we processed and further annotated the dataset. Firstly, date
formats and age eligibility fields were harmonised. Next, wherever possible, missing categories were
inferred from available fields as described by Califf and colleagues[16]. For example, for single-arm
interventional trials with missing allocation and blinding type annotations, allocation was assigned as
“non-randomized” and blinding as “open label”.[16] The dataset was further enriched with derived
variables. For example, trial duration value was derived from available trial start and primary
completion date; trial locations were mapped to continents and flagged if they involved emerging
markets, and trials making use of placebo and/or comparator controls were labelled based on
information from intervention name, arm type, trial title, and description field. For instance, if a
study had an arm of type “Placebo Comparator” or an intervention whose name included words
“placebo”, “vehicle” or “sugar pill”, it was annotated as “placebo-controlled”; similarly, studies with
an arm of type “Active Comparator” were classified as “comparator-controlled”; see Suppl. File 2 for
details. Finally, each trial was annotated with the number of outcome measures and eligibility
criteria following parsing of the unstructured eligibility data field as described by Chondrogiannis et
al.[28]
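As an illustration, the control-arm labelling rules described above can be sketched in Python (a hypothetical re-implementation with assumed field names, not the actual pipeline code):

```python
# Hypothetical sketch of the rule-based control-arm labelling described
# in the text; the trial dict layout and field names are assumptions.
PLACEBO_WORDS = ("placebo", "vehicle", "sugar pill")

def annotate_controls(trial):
    """Label a trial dict as placebo- and/or comparator-controlled."""
    arm_types = {a.lower() for a in trial.get("arm_types", [])}
    names = " ".join(trial.get("intervention_names", [])).lower()
    placebo = ("placebo comparator" in arm_types
               or any(w in names for w in PLACEBO_WORDS))
    comparator = "active comparator" in arm_types
    return {"placebo_controlled": placebo, "comparator_controlled": comparator}
```

In the full workflow the same keyword matching is also applied to the trial title and description fields.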
To categorize clinical trials by medical specialty, we mapped reported disease condition names to
corresponding Medical Subject Headings (MeSH) terms (from both MeSH Thesaurus and MeSH
Supplementary Concept Records)[29] using a dictionary-based named entity recognition
approach.[30] A total of 21,884 distinct condition terms were annotated with MeSH identifiers,
accounting for 86.4% of all disease names from 88% of the clinical trials. The mapping enabled
normalization of various synonyms to preferred disease names as well as automatic classification of
diseases into distinct categories using the MeSH hierarchy. To compare our results with previously
published clinical trial success rates, we focused on seven major disease areas investigated in the
study by Hay et al.[31]: oncology, neurology, autoimmune, endocrine, respiratory, cardiovascular,
and infectious diseases. Trial conditions annotated with MeSH terms were mapped to these
categories based on matching terms from levels 2 and 3 of the MeSH term hierarchy; see Suppl. File
2 for details on MeSH term assignment and disease category mapping.
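The dictionary-based normalization at the heart of this step can be sketched as follows (the synonym dictionary is illustrative; in the actual analysis it was built from the MeSH Thesaurus and Supplementary Concept Records):

```python
# Minimal sketch of dictionary-based named entity recognition for
# normalizing reported condition names to MeSH; entries are illustrative.
MESH_DICT = {
    "heart attack": ("D009203", "Myocardial Infarction"),
    "myocardial infarction": ("D009203", "Myocardial Infarction"),
}

def map_condition(name):
    """Return (MeSH ID, preferred name) for a reported condition, or None."""
    return MESH_DICT.get(name.strip().lower())
```

Normalizing synonyms to a single identifier is what allows trials reporting "heart attack" and "myocardial infarction" to be counted within the same disease category.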
To categorize studies by funder type, we first extended the agency classification available from
clinicaltrials.gov. Specifically, we used two main categories: industry and non-profit organizations;
and four sub-categories: Big Pharma, Small Pharma, NIH, and Other; the last category corresponding
to universities, hospitals, and other non-profit research organizations.[32, 33] Industrial
organizations were classified as “Big Pharma” and “Small Pharma” based on their sales in 2016 as
well as overall number of registered trials they sponsored. Specifically, an industrial organization was
classed as “Big Pharma” if it featured on the list of 50 companies from the 2016 Pharmaceutical
Executive ranking[34] or if it sponsored more than 100 clinical trials; otherwise it was classified as
“Small Pharma”; see Suppl. File 2. Next, we derived the likely source of funding for each trial based
on lead_sponsor and collaborators data fields, using a previously described method[16] extended to
include the distinction between big and small commercial organizations. Briefly, if NIH was listed
either as the lead sponsor of a trial or a collaborator, the study was classified as NIH-funded. If NIH
was not involved, but either the lead sponsor or a collaborator was from industry, the study was
classified as funded by either “Big Pharma” if it included any Big Pharma organization or “Small
Pharma” otherwise. Remaining studies were assigned “Other” as the funding source.
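A minimal sketch of these funder-assignment rules (the organisation sets below are placeholders for the full lists used in the analysis):

```python
# Sketch of the funder-assignment rules described above; BIG_PHARMA and
# NIH_NAMES are placeholder sets, not the full lists used in the paper.
BIG_PHARMA = {"GSK", "Pfizer"}   # e.g. 2016 Pharm Exec top 50, or >100 trials
NIH_NAMES = {"NIH", "National Cancer Institute"}

def classify_funder(lead_sponsor, collaborators, industry_orgs):
    """Return one of 'NIH', 'Big Pharma', 'Small Pharma', 'Other'."""
    orgs = {lead_sponsor, *collaborators}
    if orgs & NIH_NAMES:                      # NIH as sponsor or collaborator
        return "NIH"
    involved_industry = orgs & industry_orgs  # any commercial organisation
    if involved_industry:
        return "Big Pharma" if involved_industry & BIG_PHARMA else "Small Pharma"
    return "Other"
```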
Establishing links between trial registry records and published results
We linked the trial registry records to their disclosed results, either deposited directly on
clinicaltrials.gov or published as a journal article, to establish if the results of a particular trial had
been disseminated. We implemented the approach described by Powell-Smith and Goldacre[20] to
identify eligible trials that should have publicly disclosed results. These were studies completed after January 2006 and by July 2015 (to allow sufficient time for publication) that had not filed an application to delay submission of the results (based on the firstreceived_results_disposition_date field in the registry).[20] We included studies from all organizations (even minor institutions)
regardless of how many trials they funded. The selection process led to a dataset of 45,620 studies,
including 27,835 efficacy trials (Phase 2-4); summarized by the flow diagram in Suppl. File 1.
For each eligible trial, we searched for published results using two methods. Firstly, we searched for
structured reports submitted directly to the results database of clinicaltrials.gov (based on the
clinical_results field). Next, we queried the PubMed database with NCT numbers of eligible trials
using an approach that involves searching for the NCT number in the title, abstract, and Secondary
Source ID field of PubMed-indexed articles.[20, 35, 36] To exclude study protocols, commentaries,
and other non-relevant publication types, we used the broad “Therapy” filter – a standard validated
PubMed search filter for clinical queries[37] – as well as simple keyword matching for “study
protocol” in publication titles.[20] In addition, we excluded articles published before trial
completion.
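The article-filtering step can be sketched as below (assuming candidate articles have already been retrieved; the dict field names are illustrative, not the actual PubMed record schema):

```python
# Sketch of the NCT-number matching and exclusion rules described above.
# Article dicts and their field names are assumptions for illustration.
from datetime import date

def links_trial(article, nct_id, completion_date):
    """True if an article plausibly reports results for the given trial."""
    text = (article["title"] + " " + article["abstract"]).lower()
    mentions_nct = (nct_id.lower() in text
                    or nct_id in article.get("secondary_ids", []))
    is_protocol = "study protocol" in article["title"].lower()
    published_after = article["pub_date"] > completion_date
    return mentions_nct and not is_protocol and published_after
```

In the actual workflow the PubMed "Therapy" clinical-query filter is also applied to exclude further non-relevant publication types.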
We then determined the proportion of publicly disseminated trial reports for the studies covered by
mandate for result submission (efficacy trials: Phase 2-4) as well as for all completed drug trials
including those not covered by the FDA Amendment Act (Phase 0-1 trials and studies with phase set
to “n/a”). In addition to result dissemination rate, we calculated the dissemination lag for individual
studies to determine the proportion of reports disclosed within the required 12 months after
completion of a trial. We defined dissemination lag as time between study completion and first
public disclosure of the results (whether through submission to clinicaltrials.gov or publication in a
journal).
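The lag computation amounts to the following sketch (dates and the 12-month window as defined above):

```python
# Sketch of the dissemination-lag computation: days from study completion
# to the earliest public disclosure, compared against the 12-month window.
from datetime import date

def dissemination_lag_days(completion, disclosure_dates):
    """Days to first disclosure, or None if the trial was never disclosed."""
    if not disclosure_dates:
        return None
    return (min(disclosure_dates) - completion).days

def within_12_months(lag_days):
    return lag_days is not None and lag_days <= 365
```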
Categorizing scientific journals
To investigate the relation between characteristics of published trials and quality metrics of scientific
journals, we classified journals as high or low impact based on journal metrics information from
SCImago Journal & Country Rank, a publicly available resource that includes journal quality
indicators developed from the information contained in the Scopus database.[38] Specifically, a
journal was classified as high-impact if its 2016 H-index was larger than 400 and as low-impact
if its H-index was below 50.
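As a sketch (the handling of journals between the two thresholds is our assumption; the text defines only the two extreme classes):

```python
# Journal classification by SCImago H-index, as described above.
# The "intermediate" label for journals between 50 and 400 is an
# assumption; the paper only defines the high- and low-impact classes.
def classify_journal(h_index):
    if h_index > 400:
        return "high-impact"
    if h_index < 50:
        return "low-impact"
    return "intermediate"
```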
Statistical analysis and data visualization
We examined the study protocol data on 15 characteristics available from clinicaltrials.gov and from
our derived annotations. These were: (1) randomization; (2) interventional study design (e.g. parallel
assignment, sequential assignment or crossover studies); (3) masking (open-label, single-blinded,
double-blinded studies); (4) enrolment (total number of participants in completed studies or
estimated number of participants for ongoing studies); (5) arm types (placebo or active comparator);
(6) duration; (7) number of outcomes (clinical endpoints); (8) number of eligibility criteria; (9) study
phase; (10) funding source; (11) disease category; (12) countries; (13) number of locations; (14)
publication of results; (15) reported use of DMCs (Data monitoring committees). Further
explanation and definitions of trial properties are available in Suppl. File 2.
Based on these characteristics, we examined the methodological quality of trials funded by
different types of organizations and compared studies published in higher and lower impact journals
and reported on clinicaltrials.gov. In addition, we analysed trial dissemination rates across funding
categories and disease areas and compared them with clinical success rates reported previously.[31]
We used descriptive statistics and data visualization methods to characterize trial categories. For
categorical data we report frequencies and percentages; for continuous variables, we report medians and interquartile ranges. To visualize multiple features of clinical trial design, we used radar plots with
each axis showing the fraction of trials with a different property, such as reported use of
randomization, blinding or active comparators. We converted three continuous variables to binary
categories: we classified trials as “small” if they enrolled fewer than 100 participants, as “short” if
they lasted less than 365 days, and as “multi-country” if trial recruitment centres were in more than
one country. Further details are available in supplementary tables. To facilitate comparisons
between several radar plots, we always standardized the order and length of axes. We used Python for all data analysis tasks (scikit-learn, pandas, scipy, and numpy libraries), for working with geographical data (geonamescache, geopy, and Basemap libraries), and for data processing and visualization (matplotlib, Basemap, and seaborn libraries).
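The binary flags and per-axis fractions can be sketched as follows (thresholds as given above; the trial dict field names are assumptions):

```python
# Sketch of the binary flags used as radar-plot axes, following the
# thresholds in the text (enrolment < 100, duration < 365 days, more
# than one recruitment country). Field names are illustrative.
def radar_flags(trial):
    return {
        "small": trial["enrollment"] < 100,
        "short": trial["duration_days"] < 365,
        "multi_country": len(set(trial["countries"])) > 1,
    }

def axis_fraction(trials, flag):
    """Fraction of trials with a given flag: one radar-plot axis value."""
    return sum(radar_flags(t)[flag] for t in trials) / len(trials)
```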
Patient involvement
Patients were not involved in any aspect of the study design, conduct, or in the development of the
research question. The study is an analysis of existing publicly available data and therefore there was
no active patient recruitment for data collection.
RESULTS
Overview of clinical trial activity by funder type
To date, 119,840 pharmaceutical trials have been registered with clinicaltrials.gov. As illustrated by
Fig 1A, the distribution of trials funded by different types of organizations has changed over time. In
particular, there has been a rise in the proportion of studies funded by organizations other than
industry or NIH – mainly universities, hospitals and other academic and non-profit agencies,
classified here as “Other”. Overall, these institutions funded 36.5% of all registered pharmaceutical
trials, followed by Big Pharma with 31%, and Small Pharma with 21.2% of studies. NIH funded 11.3%
of registered pharmaceutical trials (Fig 1B).
Amongst the 10 main study funders ranked by the overall number of trials, two belonged to the
“NIH” category (National Cancer Institute leading with 7,219 trials) and eight to the “Big Pharma”
category (GSK at the top with 3,171 trials). Out of 4,914 “Small Pharma” organizations, 73.6% had
fewer than five trials, whilst 43.1% funded only one study. Similarly, amongst the 6,468 funders
classified as “Other”, 79.2% had fewer than five trials, whilst 32.7% funded only one study.
The approach outlined in the Methods section identified 45,620 clinical trials in our dataset that
should also have available published results. The remaining analysis in the Results section focuses on
this subset of studies.
Trial design characteristics
The dataset included various types of study designs from large randomized clinical trials (RCTs) to
small single-site studies, many of which lacked controls (Suppl. File 2). Comparative analysis showed
that methodological characteristics of trials varied substantially across clinical phases (Fig 2A),
disease areas (Fig 2B), and funding source (Fig 3). The following section will focus on a comparative
analysis by funder type whilst detailed overview of differences across phases and diseases is
available in Suppl. File 2.
Industrial organizations were more likely to fund large international trials and more often reported
the use of randomization, controls and blinding, compared to non-industrial funders. Whilst
differences were evident across all clinical phases (Suppl. File 2), the largest variability was observed
for Phase 2 trials, illustrated by Fig 3. For instance, 68.5% of all industry-funded Phase 2 trials were
randomized (compared to 39.4% and 50.6% of studies funded by NIH and other institutions,
respectively). Studies with industrial funding were also substantially larger (median 80 patients
compared to 43 and 48) and more likely to include international locations (31.3% for industry vs
4.0% for Other and 11.0% for NIH). Despite larger enrollment size, industry-funded studies were
shorter on average compared to trials by other funders (median 547 days compared to 1,431 days
for NIH and 941 for “Other” funding category). Non-industrial funders were more likely to report the
use of data monitoring committees (DMCs): 41.9% of trials with NIH funding and 48.1% of studies
funded by other organizations involved DMCs, compared to 26.7% of trials funded by industry.
Design characteristics did not differ substantially between Big and Small Pharma (Suppl. File 2).
Trial design changed substantially over time, with the trends most evident in Phase 2. As shown in
Fig 4, the methodological differences between industry and non-industrial funders have generally
decreased over time. Although overall a higher proportion of industry-led studies reported the use
of randomization, the proportion of RCTs funded by non-profit organizations in Phase 2 has recently
increased (Fig 4). Notably, the use of blinding is still much lower in non-commercial than commercial
trials. Additional analysis showed that the average number of trial eligibility criteria and outcome
measures have increased over time, leading to a greater number of procedures performed in an
average study, and hence to increase in trial complexity (Suppl. File 2).
Trial results dissemination and characteristics of published trials
Of the 45,620 completed trials, 27,835 were efficacy studies in Phases 2-4 covered by the mandate
for results publication (Suppl. File 1). Of these, only 15,084 (54.2%) have disclosed their results
publicly. Consistent with previously reported trends,[39, 40] dissemination rates were lower for
earlier clinical phases (not covered by the publication mandate) as only 23.4% of trials in Phases 0-1
had publicly disclosed results (Suppl. File 2).
Detailed analysis of Phase 2-4 trials showed that only 25.2% of published studies disclosed their
results within the required period of 12 months after completion. The median time to first public reporting was 18.6 months, whether through direct submission to clinicaltrials.gov or publication in a
journal. Dissemination lag was smaller for results submitted to clinicaltrials.gov (median 15.3
months) compared to results published in a journal (median 23.9 months).
A total of 10,554 (37.9%) Phase 2-4 studies disclosed their results through structured submission to clinicaltrials.gov, 8,338 (29.95%) through a journal article, and 3,808 (13.7%) disclosed through both
resources. Scientific articles were published in 1,454 titles of diverse nature and varying impact
factor. These ranged from prestigious general medical journals (BMJ, JAMA, Lancet, NEJM) to
narrower-focus specialist journals with no impact information available.
Journal publication status depended on study design. Trials reporting the use of randomization were
more likely to be published compared to non-randomized studies (34.9% vs 18.2%), and large trials
enrolling more than 100 participants were more likely to be published than studies with fewer than
100 participants (39.3% vs 20.5%).
In general, studies whose results were disclosed in the form of a scientific publication were more
likely to follow the ideal RCT design paradigms compared to those whose results were submitted to
clinicaltrials.gov only (Fig 5). Furthermore, a similar trend was observed for trials with results
published in higher vs. lower impact journals. For instance, 89.9% of trials published in journals with
H-index > 400 reported the use of randomization, compared to 79% of trials published in journals
with H-index < 50, and 60.4% of trials whose results were submitted to clinicaltrials.gov only (Fig 5).
The characteristics of trials disclosed solely through submission to clinicaltrials.gov did not
substantially differ from those that were not disseminated in any form (Suppl. Files 2 and 3).
Trial dissemination rates across funder types
Trial dissemination rates varied amongst the different funding source categories. In general,
industry-funded studies were more likely to disclose results compared to non-commercial trials
(59.3% vs 45.3% of all completed Phase 2-4 studies). They were also more likely to report the results
on clinicaltrials.gov, overall (45.6% vs 24.4%) and within the required period of 12 months after
completion (23.3% vs 10.3%). Notably, only a small difference was observed for journal publication
rates (30.98% vs 28.4%). Additional analysis showed substantial differences across more granular
funder categories (Fig 6A). Among non-profit funders, there was a gap between NIH-funded studies
and those funded by other academic and non-profit organizations: 60% vs 40.6% of trials being
disclosed from the two categories, respectively. Similarly, among the industry-funded clinical trials,
Big Pharma had a higher result dissemination rate compared to Small Pharma (66.7% vs 45.2% of
studies).
In addition to overall reporting rates, the analysis also highlighted differences in the nature and
timing of results dissemination across trial funding sources (Fig 6A-D). Big Pharma had the highest
dissemination rate, both in terms of clinicaltrials.gov submission (53.2% of studies) and journal
publications (34.8%). By contrast, funders classified as “Other” showed the lowest percentage of
trials published on clinicaltrials.gov (15.9%). Organizations from this category disclosed trial results
primarily through journal articles (29.1% of all studies) and were the only funder type with fewer
trials reported through clinicaltrials.gov submission than through journal publication. The time to reporting on clinicaltrials.gov was shortest for industry-funded trials and longest for studies funded by organizations classed as "Other"; publication lag for journal articles showed the reverse
profile (see Fig 6C-D and Suppl. File 2 for detailed statistics).
Analysis of the fraction of trials with results disclosed within the required 12 months after
completion showed that dissemination rates have been growing steadily over time, with the largest
increase observed after 2007, when FDAAA came into effect.[9] Further analysis showed that the
observed growth was largely driven by increased timely reporting on clinicaltrials.gov, most evident
for studies funded by Big Pharma and NIH (Suppl. File 2).
Trial dissemination rates across disease categories
Disease areas with highest result dissemination were autoimmune diseases and infectious diseases,
whilst neurology and oncology showed the lowest dissemination rates (Fig 7). As shown in Fig 7A,
these differences were largely driven by differential trends in journal publication, with less variability
observed for rates of result reporting on clinicaltrials.gov. Oncology in particular had consistently the
lowest journal publication rate across all clinical phases and funder categories.
Two disease categories with the highest fraction of published trials (autoimmune and infectious
diseases) were also previously associated with highest likelihood of new drug approval (LoA) (12.7%
and 16.7%, respectively). Similarly, two disease areas showing the lowest proportion of published
trials also had the lowest drug approval rates: neurology (LoA of 9.4%) and oncology (6.7%).[31]
Cardiovascular diseases had a relatively high publication rate, but low clinical trial success rate with
only 7.1% of tested compounds eventually approved for marketing.[31]
DISCUSSION
This analysis, enabled by automated data-driven analytics, provided several important insights into
the scope of research activity, design, visibility and reporting of clinical trials of pharmaceutical
interventions. There has been a growth and diversification of trial activity with a more diverse range
of profit and non-profit organizations undertaking clinical research across a wide range of disease
areas, and the analysis highlighted potentially important differences in trial quality and
dissemination.
In terms of trial design, industry-funded studies tended to use more robust methodology compared
to studies funded by non-profit institutions, although the methodological differences have been
decreasing with time. Result dissemination rates varied substantially depending on disease area as
well as the type and size of funding institution. In general, the results of industry-funded trials were
more likely to be disclosed compared to non-commercial studies, although small pharmaceutical
organizations lagged behind “Big Pharma”. Similarly, trials with NIH funding more often disclosed
their results compared to those funded by other academic and non-profit organizations. Although
overall dissemination rates improved with time, the trend was largely driven by increased results
reporting on clinicaltrials.gov for Big Pharma and NIH-funded studies. By contrast, the results of
trials funded by other academic and non-profit organizations were primarily published as journal
articles and were rarely submitted to clinicaltrials.gov. Finally, differences in dissemination rates
across medical specialties were mainly due to differential journal publication rates ranging from 42%
for autoimmune diseases to 20% for oncology.
Strengths and limitations
One of the major strengths of this work is the larger volume of clinical trial data and the greater
depth and detail of the analysis compared with previous studies in this area. Many factors were
analyzed for the first time in the context of both trial design quality and transparency of research.
These included the distinction between Big Pharma and smaller industrial organizations, analysis of
publication trends across distinct medical specialties, and differences between trials whose results
were disseminated through clinicaltrials.gov submission and publication in higher or lower impact
journals. In addition, the study provided a novel method for visualizing trial characteristics that
enables a rapid assimilation of similarities and differences across a variety of factors.
The study had several limitations related to the underlying dataset and the methodology used to
annotate and integrate the data. Firstly, it was limited to the clinicaltrials.gov database and hence
did not include unregistered studies. However, given that trial registration has been widely accepted
and enforced by regulators and journal editors,[8, 12] it is likely that most of the influential studies
have been captured in the analysis.[41] Additional limitations stem from data quality issues
such as missing or incorrect entries, ambiguous terminology, and lack of semantic
standardization.[14, 16] Many trials were annotated as placebo- or comparator-controlled based on
information mined from their unstructured titles and descriptions, as the correct annotation in the
structured arm type field was often missing. Although a subset of annotations was checked
manually, some trials may still have been misclassified by our automatic workflow. In addition,
owing to the non-standardized free-text format of the data, only a simple count of eligibility criteria
and outcome measures was used as an imperfect proxy for assessing trial complexity and the
homogeneity of the enrolled population.
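As a minimal illustration of this kind of keyword-based annotation, the sketch below applies simple regular-expression rules to free-text trial titles; the patterns and labels here are hypothetical examples, not the study's actual rules:

```python
import re

# Hypothetical keyword rules approximating the annotation of control arms from
# unstructured titles and descriptions; the study's real workflow is not shown here.
PLACEBO = re.compile(r"\bplacebo\b", re.IGNORECASE)
COMPARATOR = re.compile(r"\bactive[- ]control(?:led)?\b|\bcomparator\b", re.IGNORECASE)

def classify_control(text: str) -> str:
    """Label a trial record as placebo- or comparator-controlled from free text."""
    if PLACEBO.search(text):
        return "placebo-controlled"
    if COMPARATOR.search(text):
        return "comparator-controlled"
    return "unknown"
```

Rules of this kind are easy to audit but, as noted above, inevitably misclassify some records, which is why a manually checked subset is needed.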
Finally, our method for linking registry records to associated publications relied on the presence of
a registry identifier (NCT number) in the text or metadata of a journal article. Its absence from a trial
publication would lead to misclassified publication status and, consequently, underestimated
publication rates. However, owing to the rigour of the search methods, it is unlikely that
publication status was misclassified in a systematic manner, e.g. depending on study funding source
or disease area. Hence, although absolute estimates might be affected, it is unlikely that the
comparative analysis based on automated record linkage is biased towards any one category.
Notably, reporting of registration numbers is encouraged by medical journals through the ICMJE
recommendations and the CONSORT checklist, and it can be argued that compliance is part of
investigators’ responsibility to ensure that their research is transparent and discoverable.[20, 35]
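The linkage step can be illustrated with a minimal sketch that assumes only the NCT identifier format (the letters "NCT" followed by eight digits); the function names are illustrative and the study's full search pipeline is not reproduced here:

```python
import re

# clinicaltrials.gov identifiers have the form "NCT" + 8 digits.
NCT_PATTERN = re.compile(r"\bNCT\d{8}\b")

def extract_nct_ids(article_text: str) -> set[str]:
    """Return every registry identifier mentioned in an article's text or metadata."""
    return set(NCT_PATTERN.findall(article_text))

def articles_mentioning(nct_id: str, articles: dict[str, str]) -> list[str]:
    """Return keys (e.g. PubMed IDs) of articles that cite the given trial."""
    return [pmid for pmid, text in articles.items() if nct_id in extract_nct_ids(text)]
```

As the limitation above notes, a linkage rule of this shape can only find identifiers that authors actually reported, so missing NCT numbers depress the measured publication rate.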
Comparison with other studies
In addition to providing novel contributions, our analysis confirmed and extended several findings
reported previously by others. Notably, a larger sample size, wide inclusion criteria, and additional
mappings allowed us to calculate more accurate statistics and to ask previously unexplored
questions in an unbiased way. For instance, Bala et al.[42] showed, based on a manually searched
sample of 1,140 RCT reports, that trials published in higher impact journals have larger enrollment.
Our study extended this analysis to the entire clinicaltrials.gov registry (not only RCTs) and to many
additional trial properties beyond enrollment size, including reported use of DMCs, trial duration,
and number of countries. Other previously reported trends confirmed and further examined in this
study included the lower reporting rates observed for earlier clinical phases[39, 40] and academic
sponsors,[39, 40] the longer dissemination lag for journal publications,[43] and overall improvements
in results dissemination.[44]
Several previous studies investigated the transparency of clinical research using various subsets of
clinicaltrials.gov and diverse approaches for classifying study results as published,[17, 20, 36, 39, 40,
45, 46] as summarized in detail elsewhere.[47, 48] The general trends shown by our study fell within
the range of estimates reported previously, and our estimate of 54% for the overall trial
dissemination rate agrees with two recent systematic reviews.[47, 48] Our study is methodologically most similar
to the work by Powell-Smith and Goldacre[20] with one important distinction: we did not restrict
the analysis to trials sponsored by organizations with more than 30 studies as this would exclude
small pharmaceutical companies and research centers, and bias the results towards large
institutions that routinely perform or fund clinical research. As a consequence, our analysis showed
analogous trends but slightly lower overall reporting rates than that study.[20]
Meaning and implications of the study
The higher methodological quality of industry-funded trials may reflect the available financial and
organizational resources, as well as better infrastructure and expertise in conducting clinical
trials.[49, 50] It is also possible that the differing objectives of industrial and non-profit
investigators (development of new medicines with a view to gaining a license for market access vs
mechanistic research and rapid publication of novel findings) contribute to the observed
differences.[49, 51]
The analysis suggests that trials funded by large institutions that regularly conduct clinical research
are more likely to disseminate their results publicly. Although industry led reporting overall, ‘Big
Pharma’ performed much better than smaller organizations, likely because many large companies
have developed explicit disclosure policies in recent years.[52-54] Similarly, the NIH has developed
specific regulations for disseminating the results of the studies it funds. The most recent such policy
requires the publication of all NIH-funded trials, including (unprecedentedly) Phase 1 safety
studies.[55] Whilst large trial sponsors seem to actively pursue better transparency,[39]
smaller organizations, particularly those in the non-profit sector, were less responsive to the FDAAA
publication mandate. It is possible that such institutions are less able, or less motivated, to allocate
time and resources required to prepare and submit the results, or may be less familiar with the
legislation.[39, 41]
Our study suggests that the FDA mandate for trial results dissemination and the establishment of
the clinicaltrials.gov results database have improved the transparency of clinical research. Time to
results dissemination is shorter for clinicaltrials.gov than for traditional publication, and the
database stores the results of many otherwise unpublished studies. The lack of journal publications
might be explained by the reluctance of editors to publish smaller studies with suboptimal designs,
or studies with “negative” or non-significant outcomes that are unlikely to influence clinical
practice.[42, 56, 57] Our analysis showed that journal publication varied with trial design and
disease area, with certain medical specialties and smaller non-randomized studies more likely to be
disseminated solely through clinicaltrials.gov. The difference was particularly evident for oncology:
only a fifth of completed cancer trials were published in a journal, perhaps due to the high
prevalence of small sample sizes, suboptimal trial designs, and negative outcomes in this field.[14,
31] The high clinicaltrials.gov dissemination rates, including for cancer trials and small studies,
suggest that the resource plays an important role in reducing publication bias. Thus, to ensure
better completeness of the available evidence, clinical and drug discovery researchers should search
the database in addition to reviewing the literature.
Importantly, automated analyses and methods for identifying yet-to-be-published studies could be
used by regulators and funders of clinical research to track investigators who have failed to disclose
the results of their studies. Such technology would be particularly useful when coupled with
automated notifications and additional incentives (or sanctions) for results dissemination. Recently,
the concept of automated ongoing audit of trial reporting was implemented in the online
TrialsTracker tool, which monitors major clinical trial sponsors with more than 30 studies.[20]
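The core of such an audit can be sketched as a single check, assuming the FDAAA-style 12-month reporting deadline mentioned elsewhere in this paper; the function and field names below are hypothetical and not taken from any existing tool:

```python
from datetime import date, timedelta

# Assumed reporting window: 12 months after trial completion (FDAAA-style).
DEADLINE = timedelta(days=365)

def is_overdue(completion_date: date, has_results: bool, today: date) -> bool:
    """Flag a completed trial whose results remain undisclosed past the deadline."""
    return not has_results and today > completion_date + DEADLINE
```

Run over a registry snapshot, a rule like this yields the list of sponsors to notify; in practice the deadline logic would also need to handle extension requests and missing completion dates.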
Improvements to the quality of clinical trial data could facilitate the development of robust
monitoring systems in the future. Firstly, data quality should be taken into account when registering
new trials and standardized nomenclature should be used whenever possible. Ideally, the user
interface of the clinicaltrials.gov registry itself should be designed to encourage and facilitate
normalization. In addition, stricter adherence to the CONSORT requirement for including the trial
registration number with each published article would improve the accuracy of automated mapping
of registry records to associated publications.[41]
In addition to improved monitoring, new regulations might be needed to ensure better
transparency, and it is important that these efforts also reach small industrial organizations and
academic centres. One possible solution was proposed by Anderson and colleagues[39]: journal editors
from the ICMJE could call for submission of results to clinicaltrials.gov as a requirement for article
publication. Given the dramatic increase in trial registration rates that followed previous ICMJE
policies,[58, 59] it is likely that such a call would improve the reporting of ongoing clinical
research.[39]
Acknowledgements
We thank our colleagues from BenevolentAI, University College London, and Farr Institute of Health
Informatics for much useful discussion and advice, in particular John Overington (currently
Medicines Discovery Catapult), Nikki Robas, Bryn Williams-Jones, Chris Finan, and Peter Cox. We also
thank the clinicaltrials.gov team for their assistance and advice.
Funding
This study received no specific external funding from any institution in the public, commercial, or
not-for-profit sectors and was conducted by BenevolentBio as part of its normal business.
Competing interests
All authors have completed the ICMJE uniform disclosure form at
http://www.icmje.org/coi_disclosure.pdf and declare: no financial relationships with any
organisations that might have an interest in the submitted work and no other relationships or
activities that could appear to have influenced the submitted work.
Licence
This is an Open Access article distributed in accordance with the Creative Commons Attribution Non
Commercial (CC BY-NC 3.0) license, which permits others to distribute, remix, adapt, build upon this
work non-commercially, and license their derivative works on different terms, provided the original
work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/3.0/.
Contributorship
Contributors: MZ, MD, AH, and JH conceived and designed this study. MZ and MD acquired the
data. MZ, MD, AH, and JH analysed and interpreted the data. The initial manuscript was drafted by
MZ; all authors critically revised the manuscript and approved its final version. MZ takes
responsibility for the overall integrity of the data and the accuracy of the data analysis. JH is the
guarantor.
Transparency
The authors affirm that the manuscript is an honest, accurate, and transparent account of the study
being reported; that no important aspects of the study have been omitted; and that any
discrepancies from the study as planned (and, if relevant, registered) have been explained.
Data sharing
All the data used in the study are available from public resources. The dataset can be made available
on request from the authors.
Patient consent
Not required
Ethical approval
Not required
References
1. Sibbald B and Roland M. Understanding controlled trials. Why are randomised controlled
trials important? BMJ 1998;316(7126):201.
2. Lindsay MA. Target discovery. Nat Rev Drug Discov 2003;2(10):831.
3. Sackett DL, Rosenberg WMC, Gray JAM, et al. Evidence based medicine: what it is and what
it isn't. BMJ 1996;312:71.
4. Woolf SH. The meaning of translational research and why it matters. JAMA 2008;299(2):211-
213.
5. Kendall J. Designing a research project: randomised controlled trials and their principles.
Emerg Med J 2003;20(2):164-168.
6. Simes RJ. Publication bias: the case for an international registry of clinical trials. J Clin Oncol
1986;4(10):1529-1541.
7. Yamey G. Scientists who do not publish trial results are “unethical”. BMJ
1999;319(7215):939.
8. Food and Drug Administration Modernization Act (FDAMA) of 1997. Available from:
https://www.fda.gov/RegulatoryInformation/LawsEnforcedbyFDA/SignificantAmendmentst
otheFDCAct/FDAMA/default.htm; accessed 09/01/2018.
9. Food and Drug Administration Amendments Act (FDAAA) of 2007. Available from:
https://www.fda.gov/drugs/guidancecomplianceregulatoryinformation/guidances/ucm0649
98.htm; accessed 09/01/2018.
10. Zarin DA and Tse T. Moving towards transparency of clinical trials. Science
2008;319(5868):1340.
11. Zarin DA, Tse T, Williams RJ, et al. Trial reporting in ClinicalTrials.gov — the final rule. N Engl J
Med 2016;375(20):1998-2004.
12. ICMJE's clinical trial registration policy. Available from:
http://www.icmje.org/recommendations/browse/publishing-and-editorial-issues/clinical-
trial-registration.html; accessed 09/01/2018.
13. WHO International Clinical Trials Registry Platform (ICTRP): Primary Registries. Available
from: http://www.who.int/ictrp/network/primary/en/; accessed 09/01/2018.
14. Hirsch BR, Califf RM, Cheng SK, et al. Characteristics of oncology clinical trials: insights from a
systematic analysis of ClinicalTrials.gov. JAMA Intern Med 2013;173(11):972-979.
15. Zarin DA, Ide NC, Tse T, et al. Issues in the registration of clinical trials. JAMA
2007;297(19):2112-2120.
16. Califf RM, Zarin DA, Kramer JM, et al. Characteristics of clinical trials registered in
ClinicalTrials.gov, 2007-2010. JAMA 2012;307(17):1838-1847.
17. Chen R, Desai NR, Ross JS, Zhang W, et al. Publication and reporting of clinical trial results:
cross sectional analysis across academic medical centers. BMJ 2016;352:i637.
18. Hakala A, Kimmelman J, Carlisle B, et al. Accessibility of trial reports for drugs stalling in
development: a systematic assessment of registered trials. BMJ 2015;350:h1116.
19. Mathieu S, Boutron I, Moher D, et al. Comparison of registered and published primary
outcomes in randomized controlled trials. JAMA 2009;302(9):977-984.
20. Powell-Smith A and Goldacre B. The TrialsTracker: Automated ongoing monitoring of failure
to share clinical trial results by all major companies and research institutions. F1000Res
2016;5:2629.
21. Ross JS, Mulvey GK, Hines EM, et al. Trial publication after registration in ClinicalTrials.gov:
a cross-sectional analysis. PLoS Med 2009;6(9):e1000144.
22. Small drug companies grow as big pharma outsources. Available from:
http://www.telegraph.co.uk/business/2017/01/22/top-50-privately-owned-pharma-
companies-britain/; accessed 09/01/2018.
23. Barden CJ and Weaver DF. The rise of micropharma. Drug Discov Today 2010;15(3):84-87.
24. Munos B. Lessons from 60 years of pharmaceutical innovation. Nat Rev Drug Discov
2009;8(12):959-968.
25. Munos BH and Chin WW. How to revive breakthrough innovation in the pharmaceutical
industry. Sci Transl Med 2011;3(89):89cm16.
26. Pammolli F, Magazzini L, and Riccaboni M. The productivity crisis in pharmaceutical R&D.
Nat Rev Drug Discov 2011;10(6):428-438.
27. Thiers FA, Sinskey AJ, and Berndt ER. Trends in the globalization of clinical trials. Nat Rev
Drug Discov 2008;7(1):13.
28. Chondrogiannis E, Andronikou V, Tagaris A, et al. A novel semantic representation for
eligibility criteria in clinical trials. J Biomed Inform 2017;69:10-23.
29. Coletti MH and Bleich HL. Medical subject headings used to search the biomedical literature.
J Am Med Inform Assoc 2001;8(4):317-323.
30. Jensen LJ, Saric J, and Bork P. Literature mining for the biologist: from information retrieval
to biological discovery. Nat Rev Genet 2006;7(2):119-129.
31. Hay M, Thomas DW, Craighead JL, et al. Clinical development success rates for
investigational drugs. Nat Biotechnol 2014;32(1):40.
32. ClinicalTrials.gov Glossary of common site terms. Available from:
https://clinicaltrials.gov/ct2/about-studies/glossary; accessed 09/01/2018.
33. ClinicalTrials.gov Protocol registration data element definitions
for interventional and observational studies. Available from:
http://prsinfo.clinicaltrials.gov/definitions.html; accessed 09/01/2018.
34. Top 50 Global Pharma Companies. Annual ranking published by Pharmaceutical Executive.
Available from: https://www.rankingthebrands.com/The-Brand-
Rankings.aspx?rankingID=370; accessed 09/01/2018.
35. Huser V and Cimino JJ. Linking ClinicalTrials.gov and PubMed to track results of
interventional human clinical trials. PLoS One 2013;8(7):e68409.
36. Ramsey S and Scoggins J. Commentary: practicing on the tip of an information iceberg?
Evidence of underpublication of registered clinical trials in oncology. Oncologist
2008;13(9):925-929.
37. Lokker C, Haynes RB, Wilczynski NL, et al. Retrieval of diagnostic and treatment studies for
clinical use through PubMed and PubMed's Clinical Queries filters. J Am Med Inform Assoc
2011;18(5):652-659.
38. SCImago, SJR — SCImago Journal & Country Rank. Available from: www.scimagojr.com;
accessed 09/01/2018.
39. Anderson ML, Chiswell K, Peterson ED, et al. Compliance with results reporting at
ClinicalTrials.gov. N Engl J Med 2015;372(11):1031-1039.
40. Prayle AP, Hurley MN, and Smyth AR. Compliance with mandatory reporting of clinical trial
results on ClinicalTrials.gov: cross sectional study. BMJ 2012;344:d7373.
41. Manzoli L, Flacco ME, D'Addario M, et al. Non-publication and delayed publication of
randomized trials on vaccines: survey. BMJ 2014;348:g3058.
42. Bala MM, Akl EA, Sun X, et al. Randomized trials published in higher vs. lower impact
journals differ in design, conduct, and analysis. J Clin Epidemiol 2013;66(3):286-295.
43. Riveros C, Dechartres A, Perrodeau E, et al. Timing and completeness of trial results posted
at ClinicalTrials.gov and published in journals. PLoS Med 2013;10(12):e1001566.
44. Miller JE, Wilenzick M, Ritcey N, et al. Measuring clinical trial transparency: an empirical
analysis of newly approved drugs and large pharmaceutical companies. BMJ Open
2017;7(12):e017917.
45. Jones CW, Handler L, Crowell KE, et al. Non-publication of large randomized clinical trials:
cross sectional analysis. BMJ 2013;347:f6104.
46. Ross JS, Tse T, Zarin DA, et al. Publication of NIH funded trials registered in ClinicalTrials.gov:
cross sectional analysis. BMJ 2012;344:d7292.
47. Chan AW, Song F, Vickers A, et al. Increasing value and reducing waste: addressing
inaccessible research. Lancet 2014;383(9913):257-266.
48. Schmucker C, Schell LK, Portalupi S, et al. Extent of non-publication in cohorts of studies
approved by research ethics committees or included in trial registries. PloS One
2014;9(12):e114023.
49. Laterre PF and François B. Strengths and limitations of industry vs. academic randomized
controlled trials. Clin Microbiol Infect 2015;21(10):906-909.
50. Martin L, Hutchens M, Hawkins C, et al. How much do clinical trials cost? Nat Rev Drug
Discov 2017;16(6):381-382.
51. Ehlers MD. Lessons from a recovering academic. Cell 2016;165(5):1043-1048.
52. Novartis Position on Clinical Study Transparency – Clinical Study Registration, Results
Reporting and Data Sharing. Available from:
https://www.novartis.com/sites/www.novartis.com/files/clinical-trial-data-
transparency.pdf; accessed 09/01/2018.
53. Pfizer: Trial Data & Results. Available from: https://www.pfizer.com/science/clinical-
trials/trial-data-and-results; accessed 09/01/2018.
54. GSK: Data transparency. Available from: https://www.gsk.com/en-gb/behind-the-
science/innovation/data-transparency/; accessed 09/01/2018.
55. NIH Policy on the Dissemination of NIH-Funded Clinical Trial Information. Available from:
https://grants.nih.gov/grants/guide/notice-files/NOT-OD-16-149.html; accessed
09/01/2018.
56. Hopewell S, Loudon K, Clarke MJ, et al. Publication bias in clinical trials due to statistical
significance or direction of trial results. Cochrane Database Syst Rev 2009;1:MR000006.
57. Stern JM and Simes RJ. Publication bias: evidence of delayed publication in a cohort study of
clinical research projects. BMJ 1997;315(7109):640-645.
58. Laine C, Horton R, DeAngelis CD, et al. Clinical trial registration—looking back and moving
ahead. Lancet 2007;369(9577):1909-1911.
59. Viergever RF and Li K. Trends in global clinical trial registration: an analysis of numbers of
registered clinical trials in different parts of the world from 2004 to 2013. BMJ Open
2015;5(9):e008932.
Figure captions
Fig 1. Funding source distribution for all trials registered with clinicaltrials.gov. A: Trends in funding
source distribution for all pharmaceutical trials conducted between Jan 1, 1997 and July 19, 2017
(based on the start_date data field). B: Overall funding source distribution for all registered
pharmaceutical studies.
Fig 2. Trial design properties. Each radar chart illustrates the differences between three groups of
trials (represented by coloured polygons) with respect to fifteen trial protocol characteristics
(individual radial axes). Each axis shows the fraction of trials with a given property, such as reported
use of randomization, blinding, or data monitoring committees (DMC). A: Trial characteristics by
clinical phase. B: Characteristics of Phase 2 treatment-oriented trials for three representative
diseases. The profile of glioblastoma (with many small non-randomized studies) is typical of
oncology trials; see Suppl. File 2.
Fig 3. Phase 2 trial properties by funding source. Detailed statistics across 4 funding categories and
remaining clinical phases are available in Suppl. File 4.
Fig 4. Changes in clinical trial design over time. Plots compare the characteristics of all Phase 2 trials
(regardless of completion status) divided into four temporal subsets based on their starting year.
Additional results are available in Suppl. File 2.
Fig 5. Characteristics of trials whose results were published as a journal article or through a direct
submission to the clinicaltrials.gov results database. Trials published as an article are divided into
high- and low-impact classes based on the 2016 H-index of the corresponding journal; see Methods.
Amongst 945 trials published in high-impact journals (H-index > 400), 459 (48.6%) were funded by
Big Pharma, 178 by Other, 169 by Small Pharma, and 139 by NIH. Detailed statistics are available in
Suppl. File 3.
Fig 6. Trial results dissemination rates by trial funding category. A: Trial dissemination rates by
funder type: overall dissemination rate, journal publications, and submission of structured results to
clinicaltrials.gov. B: Time to first results dissemination (whether through submission to
clinicaltrials.gov results database or publication in a journal article). The vertical line indicates the
12-month deadline for reporting mandated by FDAAA. C: Time to results reporting on
clinicaltrials.gov. D: Time to journal article publication.
Fig 7. Trial results dissemination rates and funding source distribution across seven medical
specialties. A: For each specialty, the plot shows the fraction of completed Phase 2-4 trials whose
results were publicly disseminated, either as a journal article or through direct submission to
clinicaltrials.gov results database. B: Funding source distribution by medical specialty.
[Study selection flow diagram]
249,904 clinical studies registered with clinicaltrials.gov until July 19, 2017
  Excluded studies (N=130,064): 50,111 non-interventional studies; 79,953 studies not evaluating drugs or biologics
119,840 interventional (bio)pharmaceutical clinical studies (overview analysis)
  Excluded studies (N=74,220): 14,873 studies with missing start or completion date information; 45,458 studies not yet completed; 7,241 studies completed before Jan 2006; 6,648 studies completed after July 19, 2015 (less than 24 months ago); 0 studies that filed an application to postpone results reporting
45,620 completed studies expected to publish their results
  Excluded studies (N=17,785): 4,149 studies with no phase assignment; 13,636 studies with phases other than 2-4
27,835 completed studies from clinical phases covered by the mandate for results publication (Phases 2-4) (main analysis: trial design and dissemination)
Appendix
Impact of funding source and disease studied on trial dissemination and design. A large integrative analysis.
Magdalena Zwierzyna1,2,*, Mark Davies1, Aroon Hingorani2,3, Jackie Hunter1,4
1BenevolentBio Ltd, London, United Kingdom; 2Institute of Cardiovascular Science, University College London, London, United Kingdom; 3Farr Institute of Health Informatics, London, United Kingdom; 4St George’s Hospital Medical School, London, United Kingdom

Methods
- Annotating controlled trials
- MeSH term assignment and disease category mapping
- Categorizing industrial trial sponsors
- Trial metadata: field definitions
Results
- Overall characteristics of studies registered with clinicaltrials.gov
- Trials across clinical phases
  - Table 1: Trial characteristics
  - Figure 1: Funding source distribution per clinical phase
- Trials across diseases
  - Figure 2: Diseases studied in clinical trials conducted over time
  - Figure 3: Maps of clinical trial locations for two disease areas
  - Figure 4: Trial characteristics: cancer vs non-cancer trials
- Trial design by funder type
  - Figure 5: Phase 1 trial characteristics by funding source
  - Figure 6: Phase 1 trial characteristics by funding source
  - Figure 7: Characteristics of trials sponsored by big and small industrial organizations (across all clinical phases)
- Trial dissemination rates
  - Figure 8: Characteristics of trials whose results were published only through a direct submission to the clinicaltrials.gov results database (no journal publication) and trials whose results were not published in any form
  - Figure 9: Trial results dissemination rates by clinical phase
  - Table 2: Dissemination rates and time to results disclosure by sponsor type
  - Figure 10: Trends in trial reporting on clinicaltrials.gov over time
Trialdisseminationrates......................................................................................................................10Figure8:Characteristicsoftrialswhoseresultswerepublishedonlythroughadirectsubmissiontotheclinicaltrials.govresultsdatabase(nojournalpublication)andtrialswhoseresultswerenotpublishedinanyform.......................................................................................................................10Figure9:Trialresultsdisseminationratesbyclinicalphase............................................................11Table2:Disseminationratesandtimetoresultsdisclosurebysponsortype..................................11Figure10:Trendsintrialreportingonclinicaltrials.govovertime..................................................11
Globalizationofclinicalresearch.........................................................................................................11Figure11:Trendsinclinicaltriallocations.......................................................................................12Figure12:Locationsoftrialsregisteredwithclinicaltrials.govforfourtemporalsubsets..............12
Changesintrialcomplexity..................................................................................................................12Figure13:Averagenumberofeligibilitycriteriaandoutcomemeasuresusedinclinicaltrials......13
References..................................................................................................................................14
Methods

Annotating controlled trials

Prior to the analysis, all trials were annotated as controlled or non-controlled based on the information in the clinicaltrials.gov registry. Due to inconsistencies in the trial records, information about the use of controls was sought in both structured data fields (intervention name, arm type) and unstructured data fields (brief title, official title, brief summary, detailed description). The keyword-based searches were developed following a manual analysis of a subset of 500 randomly selected clinical trials. An interventional trial was tagged as "placebo-controlled" if:
• It has an arm of type "Placebo comparator", "No intervention" or "Sham comparator"
• It has an intervention whose name includes any of the following phrases: "placebo", "vehicle", "sugar pill"
• Its brief title, official title, brief summary or detailed description contains any of the following phrases: "placebo", "vehicle", "sugar pill"

An interventional trial was tagged as "comparator-controlled" if:
• It has an arm of type "Active comparator"
• Its brief title, official title, brief summary or detailed description contains any of the following phrases: "active-controlled", "active comparator", "comparative study", "non-inferiority", "standard therapy", "standard of care", "standard treatment"

An interventional trial was tagged as "controlled" if:
• It has been tagged as either "placebo-controlled" or "comparator-controlled"
• It has an intervention whose name includes the word "control"
• Its brief title, official title, brief summary or detailed description contains any of the following phrases: "control", "controlled"

MeSH term assignment and disease category mapping

To enable analysis of clinical trials by disease categories, trial conditions were mapped to MeSH terms using a dictionary-based named entity recognition approach, with custom dictionaries created through integration of MeSH terms (branches C and F) with Disease Ontology. A total of 171,772 (86.04%) condition terms were assigned a MeSH term, corresponding to 21,884 (70.92%) of distinct disease strings from 105,453 (88%) of the trials. The annotations were manually evaluated on a subset of 500 randomly selected trial conditions. Many conditions were not grounded as they do not represent a disease, e.g. "Healthy volunteers", "Contraception", "Anesthesia", "Smoking cessation", etc. Following MeSH term assignment, the diseases were mapped to disease categories based on matching terms from levels 2 and 3 of the MeSH term hierarchy, as specified in the table below.
MeSH tree term(s)       Disease category
C15.378                 Hematology
C01; C02; C03           Infectious disease
C11                     Ophthalmology
C18                     Metabolic disease
C06                     Gastroenterology
C20.543                 Allergy
C19                     Endocrine
C08                     Respiratory disease
C12; C13                Urology
C20.111                 Autoimmune disease
C10                     Neurology
C14                     Cardiovascular disease
F01; F02; F03; F04      Psychiatry
C04                     Oncology
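The keyword-based control annotation rules described above are straightforward to express in code. The sketch below is ours, not the authors' pipeline: the field names (`arm_types`, `intervention_names`, etc.) are hypothetical stand-ins for the corresponding clinicaltrials.gov record fields, and the word-boundary match for "control"/"controlled" is one plausible reading of the final rule.

```python
import re

# Keyword lists taken from the rules in "Annotating controlled trials" above.
PLACEBO_TERMS = ("placebo", "vehicle", "sugar pill")
COMPARATOR_TERMS = ("active-controlled", "active comparator", "comparative study",
                    "non-inferiority", "standard therapy", "standard of care",
                    "standard treatment")

def _free_text(trial):
    # Concatenate the unstructured fields searched by the rules.
    fields = ("brief_title", "official_title", "brief_summary", "detailed_description")
    return " ".join(trial.get(f, "") for f in fields).lower()

def tag_controls(trial):
    """Return the set of control tags for one trial record (a dict)."""
    text = _free_text(trial)
    arms = {a.lower() for a in trial.get("arm_types", [])}
    interventions = " ".join(trial.get("intervention_names", [])).lower()
    tags = set()
    if (arms & {"placebo comparator", "no intervention", "sham comparator"}
            or any(t in interventions for t in PLACEBO_TERMS)
            or any(t in text for t in PLACEBO_TERMS)):
        tags.add("placebo-controlled")
    if ("active comparator" in arms
            or any(t in text for t in COMPARATOR_TERMS)):
        tags.add("comparator-controlled")
    # "Controlled" if either tag fired, or the keyword itself appears.
    if tags or "control" in interventions or re.search(r"\bcontrol(led)?\b", text):
        tags.add("controlled")
    return tags
```

For example, a record whose title mentions "placebo-controlled" and that has a "Placebo Comparator" arm is tagged both "placebo-controlled" and "controlled".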
Categorizing industrial trial sponsors

For detailed analysis of industry-led trials, industrial organizations were further classified as "Big Pharma" or "Small Pharma". We first generated a list of the top 50 pharmaceutical companies, based on the 2016 Pharmaceutical Executive ranking [1]. Next, we appended the list with 11 companies that were not included in the 2016 ranking, but sponsored more than 100 trials registered with clinicaltrials.gov. Next, for each company we developed a regular expression to match the "lead sponsor" field in the registry; the expressions might include official company names (e.g. GlaxoSmithKline), acronyms (GSK), and previous names (Glaxo Wellcome, SmithKline Beecham). The performance of the regular expressions was manually evaluated on a subset of the data.

Big Pharma companies include: Pfizer, Novartis, Roche, Merck, Sanofi, Johnson & Johnson, Gilead Sciences, GSK, AbbVie, Amgen, AstraZeneca, Allergan, Teva Pharmaceutical Industries, Bristol-Myers Squibb, Eli Lilly, Bayer, Novo Nordisk, Boehringer Ingelheim, Takeda, Celgene, Astellas Pharma, Shire, Mylan, Biogen Idec, Daiichi Sankyo, CSL, Merck KGaA, Valeant Pharmaceuticals International, Otsuka, Sun Pharmaceutical Industries, Eisai Co, Les Laboratoires Servier, Endo Health Pharmaceuticals, UCB, Abbott, Fresenius, Chugai Pharmaceuticals, Grifols, Regeneron Pharmaceuticals, Dainippon Sumitomo Pharma, Alexion Pharmaceuticals, Mallinckrodt Pharmaceuticals, Menarini, Mitsubishi Tanabe Pharma, Lupin, Actelion, Aspen Pharmacare, Kyowa Hakko Kirin, Ono Pharmaceutical, Ferring Pharmaceuticals, Janssen, Genentech, Alcon, Dr. Reddy's Laboratories, Forest Laboratories (now Allergan), Sunovion Pharmaceuticals, Actelion Pharmaceuticals, Leo Pharma, Ranbaxy Laboratories, MedImmune.

Trial metadata: field definitions

The following definitions were derived from the "Glossary of Common Site Terms" and "Protocol Registration Data Element Definitions" pages of the clinicaltrials.gov registry [2,3].
1. Randomization (Allocation)
A clinical trial design strategy used to assign participants to an arm of a study. Types of allocation include randomized and nonrandomized. Randomized: participants are assigned to intervention groups by chance; Nonrandomized: participants are expressly assigned to intervention groups through a non-random method, such as physician choice.

2. Intervention Model
The general design of the strategy for assigning interventions to participants in a clinical study. Types of intervention models include Single Group design, Parallel design, Cross-over design, and Factorial design.

3. Masking (Blinding)
A clinical trial design strategy in which one or more parties involved in the trial, such as the investigator or participants, do not know which participants have been assigned which interventions. Types of masking include Open Label, Single Blind Masking, and Double Blind Masking.

4. Enrolment
The number of participants in a clinical study (total number of participants for completed studies or estimated number of participants for ongoing studies).

5. Arm Types
Arm type is a general description of the clinical study arm. It identifies the role of the intervention that participants will receive. Types of arms include Experimental, Active Comparator, Placebo Comparator, Sham Comparator, and No Intervention.

6. Duration of the Study
Time (in months) between the study start date and primary completion date.

7. Number of Outcome Measures
An outcome measure is the planned measurement described in the protocol that is used to determine the effect of interventions on participants in a clinical trial. For observational studies, a measurement or observation that is used to describe patterns of diseases or traits, or associations with exposures, risk factors, or treatment. Types of outcome measures include Primary Outcome Measure and Secondary Outcome Measure.

8. Number of Eligibility Criteria
Eligibility criteria are the key standards that people who want to participate in a clinical study must meet or the characteristics they must have. Eligibility criteria include both inclusion criteria and exclusion criteria. For example, a study might only accept participants who are above or below certain ages.

9. Study Phase
Food and Drug Administration (FDA) descriptions of the clinical trial of a drug based on the study's characteristics, such as the objective and number of participants. There are five phases: Early Phase 1 (Phase 0), Phase 1, Phase 2, Phase 3, Phase 4.

10. Lead Sponsor
The organization or person who oversees the clinical study and is responsible for analyzing the study data.

11. Disease Category
Trial condition is the disease, disorder, syndrome, illness, or injury that is being studied. On ClinicalTrials.gov, conditions may also include other health-related issues, such as lifespan, quality of life, and health risks. Disease category is a main clinical specialty assigned to the trial condition based on the MeSH disease hierarchy, as described above.

12. Countries
Countries in which study locations are based.

13. Number of Locations
Location is an organization where the clinical study is being conducted.

14. Dissemination of Results
Public disclosure status of trial results; includes structured result submission to the clinicaltrials.gov results database and publication in a journal article.

15. DMC (Data Monitoring Committee)
A group of independent scientists who monitor the safety and scientific integrity of a clinical trial. The group can recommend to the study sponsor that the study be stopped if it is not effective, is harming participants, or is unlikely to serve its scientific purpose. Members are chosen based on the scientific skills and knowledge needed to monitor the particular trial. Also referred to as a data safety and monitoring board (DSMB).
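The sponsor categorization described under "Categorizing industrial trial sponsors" can be sketched as follows. The company patterns shown are illustrative examples of the kind of regular expressions the Methods describe (official name, acronym, previous names), not the study's actual expressions.

```python
import re

# Illustrative Big Pharma patterns: official name, acronym, and previous names.
BIG_PHARMA_PATTERNS = {
    "GSK": r"glaxo\s*smith\s*kline|\bgsk\b|glaxo\s*wellcome|smithkline\s*beecham",
    "Pfizer": r"\bpfizer\b",
    "Sanofi": r"\bsanofi\b",
}

def categorize_sponsor(lead_sponsor):
    """Map a clinicaltrials.gov 'lead sponsor' string to a Big Pharma company, if any."""
    name = lead_sponsor.lower()
    for company, pattern in BIG_PHARMA_PATTERNS.items():
        if re.search(pattern, name):
            return company
    return None  # not matched: small pharma, non-profit, or other sponsor
```

With such patterns, "GlaxoSmithKline", "GSK" and the previous name "SmithKline Beecham" all resolve to the same company, which is the point of including historical names in the expressions.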
Results

Overall characteristics of studies registered with clinicaltrials.gov

Table 1 reports basic characteristics of all trials in the dataset. Registered studies encompass a variety of designs and vary in terms of duration (IQR 335-1,218 days), size (IQR 26-157 patients), number of outcome measures, and eligibility criteria. Most registered studies are oriented towards treatment (78.5%) and prevention (8.6%). 14.3% of trials are international (involving locations in up to 60 countries), 31.7% reported multiple collaborating sponsors, and 34.9% reported involvement of data monitoring committees (DMCs). Overall, the majority of registered trials are small (with 62.4% of studies enrolling fewer than 100 patients) and single-site (62.6% of studies); small trials are common even in Phase 3, where almost a quarter of all studies report an enrolment size lower than 100. In addition, a relatively small fraction of studies reported the use of randomization (62.0%), a control group (64.2%), and blinding (42.9%). As reported in the next sections, these statistics vary substantially depending on trial phase, sponsor type, and disease area.

Trials across clinical phases

Table 1: Trial characteristics
Basic trial characteristics across phases. Medians and interquartile ranges are reported for continuous variables; counts and percentages are reported for categorical data (with total number of trials in a given category used as denominator).
                    | All trials             | Phase 1            | Phase 2              | Phase 3
Trial count         | 119,840                | 23,388             | 32,295               | 22,573
Non-randomized      | 41,266 (34.4%)         | 11,258 (48.1%)     | 13,273 (41.1%)       | 3,456 (15.3%)
Randomized          | 74,313 (62.0%)         | 10,880 (46.5%)     | 17,288 (53.5%)       | 18,701 (82.8%)
Allocation missing  | 4,261 (3.6%)           | 1,250 (5.3%)       | 1,734 (5.4%)         | 416 (1.8%)
Open Label          | 64,222 (53.6%)         | 15,688 (67.1%)     | 18,002 (55.7%)       | 8,750 (38.8%)
Single Blind        | 6,383 (5.3%)           | 905 (3.9%)         | 961 (3.0%)           | 992 (4.4%)
Double Blind        | 45,094 (37.6%)         | 5,701 (24.4%)      | 11,820 (36.6%)       | 12,102 (53.6%)
Masking missing     | 4,141 (3.5%)           | 1,094 (4.7%)       | 1,512 (4.7%)         | 729 (3.2%)
Multi-site          | 44,775 (37.4%)         | 6,356 (27.2%)      | 15,281 (47.3%)       | 12,678 (56.2%)
International       | 17,173 (14.3%)         | 1,804 (7.7%)       | 5,564 (17.2%)        | 7,078 (31.4%)
Enrollment          | 60 (IQR: 26-157)       | 30.0 (IQR: 18-50)  | 52.0 (IQR: 27-116)   | 252.0 (IQR: 99-550)
Duration (days)     | 701.0 (IQR: 335-1,218) | 366.0 (IQR: 92-974)| 792.0 (IQR: 427-1,370)| 730.0 (IQR: 396-1,249)
The dataset derived from the clinicaltrials.gov registry encompassed Phase 1 (n=23,388), Phase 2 (n=32,295), and Phase 3 studies (n=22,573). As summarized by Table 1, trials at different stages differ widely in terms of design characteristics and publication rates, to some extent reflecting the difference in their primary objectives, i.e. first-in-human safety studies in Phase 1 vs efficacy evaluation in later phases. As expected, Phase 1 studies had the smallest enrolment size (median 30) and duration (median 366 days), as well as the smallest fraction of RCTs, as shown by the lowest percentage of studies reporting the use of randomization (46.5%), blinding (28.2%), and control groups (46.5%).

Despite similar efficacy-oriented objectives, registered Phase 2 and Phase 3 trials have largely different characteristics. Whilst most Phase 3 studies follow RCT design paradigms, a relatively small fraction of Phase 2 trials reported the use of randomization, blinding, and control groups. Specifically, only 53.5% of Phase 2 trials are randomized (compared to 82.8% in Phase 3); 58.3% are controlled (compared to 79.9% in Phase 3); and 39.6% are blinded (compared to 58% in Phase 3). Phase 2 trials have smaller enrolment size, with a median of 52 vs 252 patients in Phase 3. However, despite smaller patient populations in Phase 2, both phases have comparable duration: median 792 days (IQR 427-1,370 days) in Phase 2 and median 730 days (IQR 396-1,249 days) in Phase 3. In addition, Phase 2 trials have more eligibility criteria (suggesting a much narrower target patient population) and are less commonly carried out in multiple locations (47.3% vs 56.2% in Phase 3) or across multiple countries (17.2% vs 31.4%), reflecting lower recruitment needs.

Funding source distribution varied across clinical phases. Industry was the main funding source of Phase 1 through Phase 3 trials. Non-profit sponsors (NIH and Other) were more likely to fund Phase 2 and Phase 4 studies; see Fig 1.
Figure 1: Funding source distribution per clinical phase. The graph shows the number of trials funded by four different types of institutions across clinical phases.
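The medians and interquartile ranges reported in Table 1 can be reproduced from per-trial records with the standard library alone; the enrollment figures below are toy values for illustration, not data from the registry.

```python
from statistics import median, quantiles

def median_iqr(values):
    """Return (median, Q1, Q3) for a list of numeric values."""
    q1, _, q3 = quantiles(values, n=4)  # quartile cut points (exclusive method)
    return median(values), q1, q3

# Toy enrollment figures for a handful of hypothetical Phase 1 trials.
enrollment = [18, 24, 30, 36, 50, 64]
med, q1, q3 = median_iqr(enrollment)
print(f"median {med} (IQR: {q1}-{q3})")
```

Grouping trial records by phase (or by sponsor type) and applying this helper per group yields summary rows in the format used throughout the appendix tables.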
Trials across diseases

Trials registered in clinicaltrials.gov cover multiple medical specialties. Oncology is the single most represented discipline with 29.1% of trials, followed by infectious diseases and cardiovascular diseases. Closer analysis of trials conducted over time revealed several interesting trends in trial indication focus. These included the decline of industry-funded studies in cardiovascular diseases (with an increasing number of such studies funded by universities and other non-profit organizations) as well as trends in studying particular conditions (Fig 2). For instance, high growth in the number of trials was observed for arthritis and psoriasis, whilst several CNS disorders, including depression and bipolar disorder, showed negative growth. By contrast, prostate cancer, lymphoma, and several other cancers showed no obvious trend.

Figure 2: Diseases studied in clinical trials conducted over time.

Since differences between diseases might impact the recruitment, design, and overall "success" of clinical trials [4-6], we next investigated the trial variability by 7 main disease areas defined in Hay et al.: oncology, neurology, autoimmune, endocrine, respiratory, cardiovascular, and infectious diseases [4]. As in the sponsor-centered analysis described in the main text, heterogeneity between trials by disease category was most evident in Phase 2. The analysis showed large heterogeneity as a function of disease category, including discrepancies in trial design characteristics, geographic locations (Fig 3), and publication rates (Fig 7 in the main manuscript). Trials in different disease areas varied considerably in duration and enrolment size. The longest Phase 2 trials were in oncology (median 1,157 days), the shortest in respiratory diseases (median 457 days). On the level of individual diseases, Waldenstrom macroglobulinemia (a type of non-Hodgkin lymphoma), splenic marginal zone lymphoma, and anaplastic astrocytoma had the longest trials on average, whilst allergic seasonal rhinitis, ocular hypertension, allergic conjunctivitis, and influenza were the diseases with the shortest trials. Similarly, Phase 2 trials conducted in different disease areas differed substantially in terms of enrolment size. The smallest trials were in oncology (median 51 patients), the largest in infectious diseases (median 117 patients), with influenza having the largest enrolment size in the entire dataset.

Figure 3: Maps of clinical trial locations for two disease areas. The choropleth maps show the number of trials conducted per country for psychiatry and infectious diseases, demonstrating differences in terms of geographic locations between clinical research in the two specialties.
In addition to size and length, trials varied in terms of design strategies. Cancer trials, in particular, had very different characteristics compared to other disease areas. For instance, they rarely followed RCT design in Phase 2, as only 25.8% of Phase 2 cancer studies reported the use of randomization, 34.5% were controlled, and 7.8% were blinded (compared to 83.3%, 85.2%, and 79% of trials conducted for respiratory diseases – the category with the largest proportion of RCTs). Fig 4 compares the characteristics of cancer trials to studies conducted in all other indications.

Figure 4: Trial characteristics: cancer vs non-cancer trials. Three radar charts compare the characteristics of clinical trials in oncology with trials in other specialties across three clinical phases.
[Figure 4: three radar charts (Phase 1, Phase 2, Phase 3), each with axes Blinded, Open label, Randomized, Non-randomized, Placebo, Comparator, Non-controlled, Large, Multi-country, Long, Small, Crossover, Single group, Short, and DMC, on a 20-100% scale; legend: Cancer vs Non-cancer.]
Trial design by funder type

Figure 5: Phase 1 trial characteristics by funding source.

Figure 6: Phase 3 trial characteristics by funding source.
Figure 7: Characteristics of trials sponsored by big and small industrial organizations (across all clinical phases).

Trial dissemination rates

Figure 8: Characteristics of trials whose results were published only through a direct submission to the clinicaltrials.gov results database (no journal publication) and trials whose results were not published in any form.
Figure 9: Trial results dissemination rates by clinical phase. Left: trial dissemination rates by trial funding source across clinical phases. Right: overall trial dissemination rates and distinction between journal publications and submission of structured results to clinicaltrials.gov.

Table 2: Dissemination rates and time to results disclosure by sponsor type.

Funder category | Fraction of publicly disclosed trials | Median dissemination lag | Median time to reporting on clinicaltrials.gov | Median time to journal publication | Number of trials
Big pharma      | 0.667 | 15.467 | 13.083 | 26.000 | 11,508
Small pharma    | 0.452 | 21.767 | 20.833 | 25.767 | 6,119
NIH             | 0.600 | 17.883 | 18.167 | 18.283 | 2,417
Other           | 0.406 | 18.933 | 17.767 | 19.933 | 7,649

Figure 10: Trends in trial reporting on clinicaltrials.gov over time.
Globalization of clinical research

Studies registered with clinicaltrials.gov have become more global with time, although there exist large discrepancies depending on the sponsor type; see Fig 11 and Fig 4d in the main manuscript. The overall number of countries represented in the registry increased from 84 in 2001 to 185 in 2017 and, in addition, the geographical distribution of trials has shifted over time; see Fig 12. For instance, when ranked by the number of registered clinical trials, China has jumped from 35th place in 2005 to 8th place in 2017. These changes in the number and composition of countries included in clinicaltrials.gov reflect the wider use of this primarily American registry as well as trends in the globalisation of clinical trials in general [7].

Figure 11: Trends in clinical trial locations.

Figure 12: Locations of trials registered with clinicaltrials.gov for four temporal subsets.
Changes in trial complexity

Analysis of clinicaltrials.gov data shows that the number of trial eligibility criteria and outcome measures used in trials has increased over time (Fig 13). As these reflect the number of procedures that have to be performed in a trial, this trend suggests that the complexity – and hence the cost – of running clinical trials has increased in recent years.

Figure 13: Average number of eligibility criteria and outcome measures used in clinical trials.
References
1. Top 50 Global Pharma Companies. Annual ranking published by Pharmaceutical Executive. Available from: https://www.rankingthebrands.com/The-Brand-Rankings.aspx?rankingID=370. Accessed: 21/06/2017.
2. ClinicalTrials.gov glossary of common site terms. Available from: https://clinicaltrials.gov/ct2/about-studies/glossary. Accessed: 19/12/2017.
3. ClinicalTrials.gov protocol registration data element definitions for interventional and observational studies. Available from: prsinfo.clinicaltrials.gov/definitions.html. Accessed: 19/12/2017.
4. Hay M, et al. Clinical development success rates for investigational drugs. Nature Biotechnology 2014;32(1):40.
5. Hwang TJ, et al. Failure of investigational drugs in late-stage clinical development and publication of trial results. JAMA Internal Medicine 2016;176(12):1826-1833.
6. Lagakos SW. Clinical trials and rare diseases. Massachusetts Medical Society, 2003.
7. Moses H, et al. The anatomy of medical research: US and international comparisons. JAMA 2015;313(2):174-189.
Trial characteristics and publication of trial results
The table shows the characteristics of published and unpublished trials, including studies whose results were published in high-impact journals (2016 H-index > 400), lower-impact journals (H-index < 50), as well as trials whose results were not published in the form of a journal article, but were submitted to the clinicaltrials.gov results database.
Medians and interquartile ranges are reported for continuous variables; counts and percentages are reported for categorical data (with totalnumber of trials in a given category used as denominator). Percentages might not sum to 100% as some categories are not mutually exclusive(for instance, some trials include both comparator and placebo control groups).
The information is based on the data from clinicaltrials.gov data and from our derived annotations.
• Information about controls used in trials was derived from intervention name, arm type, trial title, and description fields (see the Methodssection of the article).
• Trials were designated as “collaborative” if they were sponsored by more than one institution.
• Industrial organizations were classified as Big Pharma and Small Pharma based on their sales in 2016 as well as overall number ofregistered trials they sponsored (see Methods).
• Primary purpose of clinical trials was classified as “Other purpose” for the following types in clinicaltrials.gov: “Diagnostic”, “SupportiveCare”, “Health Services Research”, “Screening”, “Educational/Counseling/Training”, “Device Feasibility”, “Other”.
• Clinical phase was classified as “Other phase” for the following types in clinicaltrials.gov: Phase 0 (Early Phase 1), combined Phase1/Phase 2, combined Phase 2/Phase 3, and trials with no phase assigned.
• Trials were classified as “Large” if they enrolled more than 100 patients and as “Long” if their duration was longer than 365 days.
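The "Large" and "Long" categories in the notes above are simple threshold rules, which can be made explicit in a few lines; the function names are ours, introduced only for illustration.

```python
def size_category(enrollment):
    """'Large' if the trial enrolled more than 100 patients, else 'Small' (as defined above)."""
    return "Large" if enrollment > 100 else "Small"

def length_category(duration_days):
    """'Long' if the trial duration exceeded 365 days, else 'Short' (as defined above)."""
    return "Long" if duration_days > 365 else "Short"
```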
Table 1: Characteristics of published and unpublished clinical trials

                           | High-impact journals        | Low-impact journals        | clinicaltrials.gov         | Unpublished
Trial count                | 1126                        | 1548                       | 8999                       | 24926
Non-Randomized             | 108 (9.6%)                  | 322 (20.8%)                | 3526 (39.2%)               | 8494 (34.1%)
Randomized                 | 1009 (89.6%)                | 1223 (79.0%)               | 5459 (60.7%)               | 15909 (63.8%)
Allocation missing         | 9 (0.8%)                    | 3 (0.2%)                   | 14 (0.2%)                  | 523 (2.1%)
Open Label                 | 349 (31.0%)                 | 627 (40.5%)                | 5043 (56.0%)               | 13177 (52.9%)
Single Blind               | 41 (3.6%)                   | 122 (7.9%)                 | 451 (5.0%)                 | 1457 (5.8%)
Double Blind               | 725 (64.4%)                 | 796 (51.4%)                | 3500 (38.9%)               | 9740 (39.1%)
Masking missing            | 11 (1.0%)                   | 3 (0.2%)                   | 5 (0.1%)                   | 552 (2.2%)
Comparator-controlled      | 511 (45.4%)                 | 699 (45.2%)                | 3256 (36.2%)               | 8145 (32.7%)
Placebo-controlled         | 654 (58.1%)                 | 723 (46.7%)                | 3237 (36.0%)               | 9318 (37.4%)
Non-controlled             | 164 (14.6%)                 | 334 (21.6%)                | 3204 (35.6%)               | 9236 (37.1%)
Phase 1                    | 58 (5.2%)                   | 190 (12.3%)                | 904 (10.0%)                | 8003 (32.1%)
Phase 2                    | 232 (20.6%)                 | 290 (18.7%)                | 2960 (32.9%)               | 5890 (23.6%)
Phase 3                    | 617 (54.8%)                 | 497 (32.1%)                | 2157 (24.0%)               | 3395 (13.6%)
Phase 4                    | 105 (9.3%)                  | 344 (22.2%)                | 1629 (18.1%)               | 3466 (13.9%)
Other phase                | 65 (5.8%)                   | 92 (5.9%)                  | 601 (6.7%)                 | 1778 (7.1%)
Phase missing              | 49 (4.4%)                   | 135 (8.7%)                 | 748 (8.3%)                 | 2394 (9.6%)
Collaborative              | 446 (39.6%)                 | 416 (26.9%)                | 2748 (30.5%)               | 6930 (27.8%)
Big Pharma                 | 500 (44.4%)                 | 738 (47.7%)                | 4601 (51.1%)               | 8211 (32.9%)
Small Pharma               | 211 (18.7%)                 | 267 (17.2%)                | 1703 (18.9%)               | 6591 (26.4%)
NIH                        | 176 (15.6%)                 | 44 (2.8%)                  | 1129 (12.5%)               | 2007 (8.1%)
Other                      | 229 (20.3%)                 | 494 (31.9%)                | 1491 (16.6%)               | 7983 (32.0%)
U.S. Fed                   | 10 (0.9%)                   | 5 (0.3%)                   | 75 (0.8%)                  | 134 (0.5%)
DMC reported               | 645 (57.3%)                 | 487 (31.5%)                | 2565 (28.5%)               | 7018 (28.2%)
no DMC reported            | 227 (20.2%)                 | 762 (49.2%)                | 4882 (54.3%)               | 12409 (49.8%)
DMC missing                | 254 (22.6%)                 | 299 (19.3%)                | 1552 (17.2%)               | 5499 (22.1%)
Crossover Assignment       | 36 (3.2%)                   | 155 (10.0%)                | 846 (9.4%)                 | 3773 (15.1%)
Factorial Assignment       | 43 (3.8%)                   | 20 (1.3%)                  | 90 (1.0%)                  | 291 (1.2%)
Parallel Assignment        | 915 (81.3%)                 | 1053 (68.0%)               | 4886 (54.3%)               | 11977 (48.1%)
Sequential Assignment      | 0 (0.0%)                    | 0 (0.0%)                   | 0 (0.0%)                   | 3 (0.0%)
Single Group Assignment    | 114 (10.1%)                 | 313 (20.2%)                | 3163 (35.1%)               | 8081 (32.4%)
Intervention model missing | 18 (1.6%)                   | 7 (0.5%)                   | 14 (0.2%)                  | 801 (3.2%)
Basic Science              | 8 (0.7%)                    | 53 (3.4%)                  | 288 (3.2%)                 | 1576 (6.3%)
Prevention                 | 164 (14.6%)                 | 220 (14.2%)                | 790 (8.8%)                 | 2023 (8.1%)
Treatment                  | 922 (81.9%)                 | 1173 (75.8%)               | 7227 (80.3%)               | 18345 (73.6%)
Other purpose              | 17 (1.5%)                   | 41 (2.6%)                  | 335 (3.7%)                 | 1089 (4.4%)
Primary purpose missing    | 15 (1.3%)                   | 61 (3.9%)                  | 359 (4.0%)                 | 1893 (7.6%)
Enrollment                 | 387.5 (IQR: 137.0-1107.25)  | 104.0 (IQR: 42.0-323.0)    | 63.0 (IQR: 30.0-186.75)    | 48.0 (IQR: 24.0-109.0)
Large                      | 906 (80.5%)                 | 809 (52.3%)                | 3518 (39.1%)               | 6837 (27.4%)
Small                      | 214 (19.0%)                 | 734 (47.4%)                | 5480 (60.9%)               | 17652 (70.8%)
Size category missing      | 6 (0.5%)                    | 5 (0.3%)                   | 1 (0.0%)                   | 437 (1.8%)
Length                     | 1003.5 (IQR: 580.0-1522.0)  | 488.0 (IQR: 243.0-823.0)   | 669.0 (IQR: 334.0-1216.0)  | 488.0 (IQR: 184.0-1034.0)
Long                       | 946 (84.0%)                 | 921 (59.5%)                | 6329 (70.3%)               | 13522 (54.2%)
Short                      | 118 (10.5%)                 | 573 (37.0%)                | 2634 (29.3%)               | 9372 (37.6%)
Length category missing    | 62 (5.5%)                   | 54 (3.5%)                  | 36 (0.4%)                  | 2032 (8.2%)
Multi-site                 | 804 (71.4%)                 | 724 (46.8%)                | 4215 (46.8%)               | 8264 (33.2%)
Multi-country              | 522 (46.4%)                 | 312 (20.2%)                | 1655 (18.4%)               | 2698 (10.8%)
Multi-continent            | 391 (34.7%)                 | 203 (13.1%)                | 1081 (12.0%)               | 1580 (6.3%)
Number of locations        | 14.0 (IQR: 2.0-68.0)        | 2.0 (IQR: 1.0-17.0)        | 2.0 (IQR: 1.0-13.0)        | 1.0 (IQR: 1.0-3.0)
Number of eligibility criteria | 14.0 (IQR: 8.0-23.0)    | 12.0 (IQR: 8.0-20.0)       | 13.0 (IQR: 8.0-22.0)       | 12.0 (IQR: 6.0-20.0)
Number of outcomes         | 5.0 (IQR: 3.0-9.0)          | 4.0 (IQR: 2.0-7.0)         | 4.0 (IQR: 2.0-8.0)         | 3.0 (IQR: 2.0-5.0)
Trial characteristics across sponsors
The table shows the characteristics of clinical trials across four sponsor types available in clinicaltrials.gov (lead sponsor field). Each table covers trials of different clinical phases.
Medians and interquartile ranges are reported for continuous variables; counts and percentages are reported for categorical data (with totalnumber of trials in a given category used as denominator). Percentages might not sum to 100% as some categories are not mutually exclusive(for instance, some trials include both comparator and placebo control groups).
The information is based on the data from clinicaltrials.gov data and from our derived annotations.
• Information about controls used in trials was derived from intervention name, arm type, trial title, and description fields (see the Methodssection of the article).
• Trials were designated as “collaborative” if they were sponsored by more than one institution.
• Industrial organizations were classified as Big Pharma and Small Pharma based on their sales in 2016 as well as overall number ofregistered trials they sponsored (see Methods).
• Primary purpose of clinical trials was classified as “Other purpose” for the following types in clinicaltrials.gov: “Diagnostic”, “SupportiveCare”, “Health Services Research”, “Screening”, “Educational/Counseling/Training”, “Device Feasibility”, “Other”.
• Clinical phase was classified as “Other phase” for the following types in clinicaltrials.gov: Phase 0 (Early Phase 1), combined Phase1/Phase 2, combined Phase 2/Phase 3, and trials with no phase assigned.
• Trials were classified as “Large” if they enrolled more than 100 patients and as “Long” if their duration was longer than 365 days.
Table 1: Trial characteristics across sponsors (Phase 1)
Big pharma Small pharma NIH Other
Trial count 5123 3116 878 1300
Non-Randomized 1866 (36.4%) 1227 (39.4%) 470 (53.5%) 747 (57.5%)
Randomized 3224 (62.9%) 1862 (59.8%) 294 (33.5%) 524 (40.3%)
Allocation missing 33 (0.6%) 27 (0.9%) 114 (13.0%) 29 (2.2%)
Open Label 3243 (63.3%) 2007 (64.4%) 488 (55.6%) 874 (67.2%)
Single Blind 269 (5.3%) 126 (4.0%) 17 (1.9%) 73 (5.6%)
Double Blind 1596 (31.2%) 962 (30.9%) 209 (23.8%) 326 (25.1%)
Masking missing 15 (0.3%) 21 (0.7%) 164 (18.7%) 27 (2.1%)
Comparator-controlled 1108 (21.6%) 852 (27.3%) 134 (15.3%) 328 (25.2%)
Placebo-controlled 1704 (33.3%) 984 (31.6%) 237 (27.0%) 366 (28.2%)
Non-controlled 2490 (48.6%) 1439 (46.2%) 494 (56.3%) 645 (49.6%)
Collaborative 754 (14.7%) 620 (19.9%) 508 (57.9%) 338 (26.0%)
DMC reported 457 (8.9%) 690 (22.1%) 327 (37.2%) 648 (49.8%)
no DMC reported 3448 (67.3%) 2051 (65.8%) 93 (10.6%) 491 (37.8%)
DMC missing 1218 (23.8%) 375 (12.0%) 458 (52.2%) 161 (12.4%)
Crossover Assignment 1759 (34.3%) 920 (29.5%) 57 (6.5%) 187 (14.4%)
Factorial Assignment 22 (0.4%) 27 (0.9%) 17 (1.9%) 22 (1.7%)
Parallel Assignment 1618 (31.6%) 982 (31.5%) 257 (29.3%) 381 (29.3%)
Sequential Assignment 0 (0.0%) 3 (0.1%) 0 (0.0%) 0 (0.0%)
Single Group Assignment 1692 (33.0%) 1151 (36.9%) 358 (40.8%) 670 (51.5%)
Intervention model missing 32 (0.6%) 33 (1.1%) 189 (21.5%) 40 (3.1%)
Basic Science 899 (17.5%) 334 (10.7%) 38 (4.3%) 139 (10.7%)
Prevention 141 (2.8%) 149 (4.8%) 155 (17.7%) 143 (11.0%)
Treatment 3021 (59.0%) 2061 (66.1%) 611 (69.6%) 826 (63.5%)
Other purpose 108 (2.1%) 94 (3.0%) 39 (4.4%) 91 (7.0%)
Primary purpose missing 954 (18.6%) 478 (15.3%) 35 (4.0%) 101 (7.8%)
Enrollment 32.0 (IQR: 53.0-20.0) 30.0 (IQR: 48.0-20.0) 28.5 (IQR: 49.25-17.0) 21.0 (IQR: 38.5-12.0)
Large 350 (6.8%) 180 (5.8%) 63 (7.2%) 67 (5.2%)
Small 4738 (92.5%) 2909 (93.4%) 785 (89.4%) 1216 (93.5%)
Size category missing 35 (0.7%) 27 (0.9%) 30 (3.4%) 17 (1.3%)
Length 153.0 (IQR: 456.0-61.0) 153.0 (IQR: 485.0-61.0) 1127.0 (IQR: 1917.75-633.5) 638.0 (IQR: 1217.0-273.0)
Long 1418 (27.7%) 890 (28.6%) 675 (76.9%) 830 (63.8%)
Short 3470 (67.7%) 2124 (68.2%) 91 (10.4%) 407 (31.3%)
Length Category missing 235 (4.6%) 102 (3.3%) 112 (12.8%) 63 (4.8%)
Multi-site 1391 (27.2%) 775 (24.9%) 234 (26.7%) 169 (13.0%)
Multi-country 492 (9.6%) 161 (5.2%) 40 (4.6%) 17 (1.3%)
Multi-continent 324 (6.3%) 81 (2.6%) 21 (2.4%) 7 (0.5%)
Number of locations 1.0 (IQR: 2.0-1.0) 1.0 (IQR: 2.0-1.0) 1.0 (IQR: 2.0-1.0) 1.0 (IQR: 1.0-1.0)
Number of eligibility criteria 12.0 (IQR: 23.0-6.0) 17.0 (IQR: 26.0-9.0) 18.0 (IQR: 30.0-2.0) 15.0 (IQR: 23.0-9.0)
Number of outcomes 4.0 (IQR: 7.0-2.0) 3.0 (IQR: 5.0-2.0) 3.0 (IQR: 5.0-2.0) 2.0 (IQR: 4.0-2.0)
Table 2: Trial characteristics across sponsors (Phase 2)
Big pharma Small pharma NIH Other
Trial count 4158 3078 1677 2535
Non-Randomized 1464 (35.2%) 777 (25.2%) 859 (51.2%) 1194 (47.1%)
Randomized 2674 (64.3%) 2282 (74.1%) 661 (39.4%) 1283 (50.6%)
Allocation missing 20 (0.5%) 19 (0.6%) 157 (9.4%) 58 (2.3%)
Open Label 2031 (48.8%) 1108 (36.0%) 1061 (63.3%) 1531 (60.4%)
Single Blind 147 (3.5%) 134 (4.4%) 16 (1.0%) 112 (4.4%)
Double Blind 1968 (47.3%) 1824 (59.3%) 469 (28.0%) 848 (33.5%)
Masking missing 12 (0.3%) 12 (0.4%) 131 (7.8%) 44 (1.7%)
Comparator-controlled 1197 (28.8%) 926 (30.1%) 267 (15.9%) 812 (32.0%)
Placebo-controlled 1863 (44.8%) 1716 (55.8%) 483 (28.8%) 852 (33.6%)
Non-controlled 1471 (35.4%) 777 (25.2%) 964 (57.5%) 1057 (41.7%)
Collaborative 1188 (28.6%) 754 (24.5%) 1083 (64.6%) 697 (27.5%)
DMC reported 1038 (25.0%) 897 (29.1%) 702 (41.9%) 1219 (48.1%)
no DMC reported 1937 (46.6%) 1749 (56.8%) 407 (24.3%) 953 (37.6%)
DMC missing 1183 (28.5%) 432 (14.0%) 568 (33.9%) 363 (14.3%)
Crossover Assignment 282 (6.8%) 255 (8.3%) 61 (3.6%) 179 (7.1%)
Factorial Assignment 19 (0.5%) 37 (1.2%) 29 (1.7%) 33 (1.3%)
Parallel Assignment 2430 (58.4%) 2010 (65.3%) 582 (34.7%) 1097 (43.3%)
Single Group Assignment 1404 (33.8%) 755 (24.5%) 790 (47.1%) 1135 (44.8%)
Intervention model missing 23 (0.6%) 21 (0.7%) 215 (12.8%) 91 (3.6%)
Basic Science 36 (0.9%) 18 (0.6%) 18 (1.1%) 31 (1.2%)
Prevention 412 (9.9%) 165 (5.4%) 127 (7.6%) 186 (7.3%)
Treatment 3606 (86.7%) 2760 (89.7%) 1481 (88.3%) 2182 (86.1%)
Other purpose 40 (1.0%) 80 (2.6%) 38 (2.3%) 89 (3.5%)
Primary purpose missing 64 (1.5%) 55 (1.8%) 13 (0.8%) 47 (1.9%)
Enrollment 83.0 (IQR: 196.0-40.0) 76.5 (IQR: 159.0-36.0) 48.0 (IQR: 87.0-26.0) 43.0 (IQR: 75.0-25.0)
Large 1877 (45.1%) 1275 (41.4%) 342 (20.4%) 432 (17.0%)
Small 2263 (54.4%) 1771 (57.5%) 1244 (74.2%) 2061 (81.3%)
Size category missing 18 (0.4%) 32 (1.0%) 91 (5.4%) 42 (1.7%)
Length 638.0 (IQR: 1095.0-336.0) 456.0 (IQR: 789.0-243.0) 1431.0 (IQR: 2010.0-914.0) 942.0 (IQR: 1461.0-546.0)
Long 2866 (68.9%) 1660 (53.9%) 1470 (87.7%) 1978 (78.0%)
Short 1087 (26.1%) 1243 (40.4%) 73 (4.4%) 373 (14.7%)
Length Category missing 205 (4.9%) 175 (5.7%) 134 (8.0%) 184 (7.3%)
Multi-site 2841 (68.3%) 1978 (64.3%) 779 (46.5%) 644 (25.4%)
Multi-country 1507 (36.2%) 758 (24.6%) 184 (11.0%) 101 (4.0%)
Multi-continent 1034 (24.9%) 419 (13.6%) 88 (5.2%) 27 (1.1%)
Number of locations 9.0 (IQR: 26.0-2.0) 5.0 (IQR: 17.0-1.0) 1.0 (IQR: 8.0-1.0) 1.0 (IQR: 2.0-1.0)
Number of eligibility criteria 14.0 (IQR: 24.0-8.0) 16.0 (IQR: 24.0-9.0) 14.0 (IQR: 27.0-0.0) 13.0 (IQR: 19.0-7.0)
Number of outcomes 5.0 (IQR: 8.0-2.0) 3.0 (IQR: 6.0-2.0) 3.0 (IQR: 5.0-2.0) 3.0 (IQR: 5.0-2.0)
Table 3: Trial characteristics across sponsors (Phase 3)
Big pharma Small pharma NIH Other
Trial count 4867 2186 428 1692
Non-Randomized 1019 (20.9%) 396 (18.1%) 18 (4.2%) 148 (8.7%)
Randomized 3828 (78.7%) 1787 (81.7%) 398 (93.0%) 1541 (91.1%)
Allocation missing 20 (0.4%) 3 (0.1%) 12 (2.8%) 3 (0.2%)
Open Label 1956 (40.2%) 695 (31.8%) 152 (35.5%) 638 (37.7%)
Single Blind 148 (3.0%) 102 (4.7%) 17 (4.0%) 134 (7.9%)
Double Blind 2749 (56.5%) 1379 (63.1%) 234 (54.7%) 873 (51.6%)
Masking missing 14 (0.3%) 10 (0.5%) 25 (5.8%) 47 (2.8%)
Comparator-controlled 1972 (40.5%) 995 (45.5%) 195 (45.6%) 935 (55.3%)
Placebo-controlled 2179 (44.8%) 1164 (53.2%) 219 (51.2%) 849 (50.2%)
Non-controlled 1202 (24.7%) 349 (16.0%) 75 (17.5%) 226 (13.4%)
Collaborative 825 (17.0%) 464 (21.2%) 362 (84.6%) 520 (30.7%)
DMC reported 965 (19.8%) 673 (30.8%) 309 (72.2%) 803 (47.5%)
no DMC reported 2147 (44.1%) 1181 (54.0%) 37 (8.6%) 621 (36.7%)
DMC missing 1755 (36.1%) 332 (15.2%) 82 (19.2%) 268 (15.8%)
Crossover Assignment 192 (3.9%) 75 (3.4%) 14 (3.3%) 105 (6.2%)
Factorial Assignment 27 (0.6%) 21 (1.0%) 23 (5.4%) 50 (3.0%)
Parallel Assignment 3678 (75.6%) 1675 (76.6%) 325 (75.9%) 1271 (75.1%)
Single Group Assignment 947 (19.5%) 398 (18.2%) 31 (7.2%) 192 (11.3%)
Intervention model missing 23 (0.5%) 17 (0.8%) 35 (8.2%) 74 (4.4%)
Basic Science 7 (0.1%) 4 (0.2%) 0 (0.0%) 13 (0.8%)
Prevention 715 (14.7%) 221 (10.1%) 55 (12.9%) 245 (14.5%)
Treatment 4037 (82.9%) 1840 (84.2%) 320 (74.8%) 1321 (78.1%)
Other purpose 51 (1.0%) 81 (3.7%) 52 (12.1%) 83 (4.9%)
Primary purpose missing 57 (1.2%) 40 (1.8%) 1 (0.2%) 30 (1.8%)
Enrollment 350.0 (IQR: 674.0-144.0) 280.0 (IQR: 516.0-129.0) 255.0 (IQR: 607.0-129.5) 110.0 (IQR: 300.25-50.0)
Large 4011 (82.4%) 1780 (81.4%) 342 (79.9%) 923 (54.6%)
Small 841 (17.3%) 376 (17.2%) 77 (18.0%) 749 (44.3%)
Size category missing 15 (0.3%) 30 (1.4%) 9 (2.1%) 20 (1.2%)
Length 607.0 (IQR: 973.0-365.0) 518.0 (IQR: 822.0-303.0) 1795.0 (IQR: 2517.75-1218.0) 881.0 (IQR: 1461.0-456.0)
Long 3324 (68.3%) 1373 (62.8%) 390 (91.1%) 1217 (71.9%)
Short 1235 (25.4%) 696 (31.8%) 8 (1.9%) 308 (18.2%)
Length Category missing 308 (6.3%) 117 (5.4%) 30 (7.0%) 167 (9.9%)
Multi-site 3511 (72.1%) 1301 (59.5%) 292 (68.2%) 490 (29.0%)
Multi-country 2141 (44.0%) 661 (30.2%) 127 (29.7%) 125 (7.4%)
Multi-continent 1623 (33.3%) 386 (17.7%) 63 (14.7%) 62 (3.7%)
Number of locations 25.0 (IQR: 66.0-5.0) 10.0 (IQR: 36.0-1.0) 10.0 (IQR: 42.0-1.0) 1.0 (IQR: 2.0-1.0)
Number of eligibility criteria 11.0 (IQR: 20.0-7.0) 13.0 (IQR: 21.0-8.0) 11.0 (IQR: 22.25-0.0) 11.0 (IQR: 17.0-7.0)
Number of outcomes 6.0 (IQR: 10.0-3.0) 3.0 (IQR: 6.0-2.0) 4.0 (IQR: 7.0-2.0) 3.0 (IQR: 5.0-2.0)
Impact of funding source and disease studied on clinical trial
dissemination and design
A large-scale integrative analysis of clinicaltrials.gov and PubMed data
Print abstract
Study question: How do the distribution, design characteristics, and dissemination of clinical trials
differ by funding organisation and medical specialty?
Methods: Data for 45,620 clinical trials evaluating small molecules, biologics and vaccines were
downloaded from clinicaltrials.gov, processed, categorized by funder type and disease area, and
linked to disclosed study results. Design characteristics and result dissemination rates were
examined for studies funded by different types of organizations and across major medical
specialties.
Study answer and limitations: Industry was more likely to fund large randomized clinical trials and
to disseminate trial results compared to non-profit trial funders (59.3% vs 45.3%), with large
pharmaceutical companies having higher disclosure rates than small ones (66.7% vs 45.2%). Trials
reporting the use of randomization were more likely to be published in a journal article compared to
non-randomized studies (34.9% vs 18.2%), and journal publication rates varied across disease areas,
ranging from 42% for autoimmune diseases to 20% for oncology. The analysis was limited to studies
registered with clinicaltrials.gov, and our method for linking registry records to associated
publications relied on the presence of a registry identifier (NCT number) in the text or metadata of a
journal article.
What this study adds: Design characteristics (as a marker of methodological quality) vary by funder
and medical specialty, and associate with reporting rates and the likelihood of publication in high-
impact journals.
Funding, competing interests, data sharing: This study received no specific external funding. The
authors declare no competing interests. All the data used in the study are available from public
resources.
Impact of funding source and disease studied on clinical trial
dissemination and design
A large-scale integrative analysis of clinicaltrials.gov and PubMed data
Magdalena Zwierzyna1,2,*, Mark Davies1, Aroon D Hingorani2,3, Jackie Hunter1,4
1BenevolentBio Ltd, London, United Kingdom
2Institute of Cardiovascular Science, University College London, London, United
Kingdom
3Farr Institute of Health Informatics, London, United Kingdom
4St George’s Hospital Medical School, London, United Kingdom
*Corresponding author
Magdalena Zwierzyna
Email: [email protected] (MZ)
Tel: +44 (0) 203 096 0720
BenevolentBio Ltd
40 Churchway
NW1 1LW London
United Kingdom
Word count: 5,084
Abstract
Objective: To investigate the distribution, design characteristics, and dissemination of clinical trials
by funding organisation and medical specialty.
Design: Cross-sectional descriptive analysis.
Data sources: Trial protocol information from clinicaltrials.gov, metadata of journal articles in which
trial results were published (PubMed), and quality metrics of associated journals from SCImago
Journal & Country Rank database.
Selection criteria: All 45,620 clinical trials evaluating small-molecule therapeutics, biologics,
adjuvants, and vaccines, completed after January 2006 and prior to July 2015, including randomized
controlled trials (RCTs) and non-randomized studies across all clinical phases.
Results: Industry was more likely to fund large international RCTs compared to non-profit funders,
although methodological differences have been decreasing with time. Amongst 27,835 completed
efficacy trials (Phase 2-4), 54.2% had disclosed their findings publicly. Industry was more likely to
disseminate trial results compared to non-profit trial funders (59.3% vs 45.3%), with large
pharmaceutical companies having higher disclosure rates than small ones (66.7% vs 45.2%). NIH-
funded trials were disseminated more often than those of other non-profit institutions (60.0% vs
40.6%). Big Pharma and NIH-funded study results were more likely to appear in clinicaltrials.gov
than those from non-profit funders, which were published mainly as journal articles. Trials reporting
the use of randomization were more likely to be published in a journal article compared to non-
randomized studies (34.9% vs 18.2%), and journal publication rates varied across disease areas,
ranging from 42% for autoimmune diseases to 20% for oncology.
Conclusions: Trial design and result dissemination vary substantially depending on the type and
size of the funding institution as well as the disease area under study.
What this paper adds
What is already known on this subject:
• Suboptimal study design and slow dissemination of findings are common amongst registered
clinical trials; almost half of all studies remain unpublished years after completion.
• A wider range of organisations is now funding clinical trials.
• Reporting rates and other quality measures may vary by the type of funding organization.
What this study adds:
• Design characteristics (as a marker of methodological quality) vary by funder and medical
specialty, and associate with reporting rates and the likelihood of publication in high-impact
journals.
• Interventional trials with industrial funding are, on average, more robustly designed than
those funded by non-commercial organizations.
• Industry-funded trials are also more likely to be disclosed than those from other funders,
although small pharmaceutical organizations lag behind “Big Pharma”.
• Trials with NIH funding are more likely to disclose results compared to other academic and
non-profit organizations.
• Big Pharma and NIH-funded studies show high rates of reporting within the clinicaltrials.gov
results database.
• Trials funded by other academic and non-profit organizations were disclosed primarily as
journal articles, and rarely submitted to clinicaltrials.gov.
• Result dissemination rate varies by medical specialty, mainly due to differential journal
publication rates, ranging from 42% for autoimmune diseases to 20% for oncology.
INTRODUCTION
Well conducted clinical trials are widely regarded as the best source of evidence on the efficacy and
safety of medical interventions.[1] Trials of first-in-class drugs also provide the most rigorous test of
causal mechanisms in human disease.[2] Clinical trial findings inform regulatory approvals of novel
drugs, key clinical practice decisions and guidelines, and fuel the progress of translational
medicine.[3, 4] However, this model relies on trial activity being high-quality, transparent, and
discoverable.[5] Randomization, blinding, adequate power, and a clinically relevant patient
population are among the hallmarks of high-quality trials.[1, 5] In addition, timely reporting of
findings, including negative and inconclusive results, ensures fulfilment of ethical obligations to trial
volunteers and avoids unnecessary duplication of research or the creation of biases in the clinical
knowledge base.[6, 7]
To improve the visibility and discoverability of clinical trials, the Food and Drug Administration (FDA)
mandated the registration of interventional efficacy trials and public disclosure of their results.[8-11]
Currently, 17 registries store records of clinical studies.[12, 13] Clinicaltrials.gov, the oldest and
largest of such platforms,[14, 15] serves as both a continually updated trial registry and a database
of trial results, thus offering an opportunity to explore, examine, and monitor the clinical research
landscape.[10, 14]
Previous studies of clinical research activity used registered trial protocols and associated published
reports to investigate trends in quality, reporting rate, and potential for publication bias (e.g.[14, 16-
21]). However, many previous studies relied on time-consuming manual literature searches, which
confined their scope to pivotal randomized controlled trials or research funded by major trial
sponsors. Large-scale analysis of relationships between a wide range of protocol characteristics and
result dissemination rates, and the differences between large and small organizations has not been
previously undertaken. This is important, given a reported growth in trials funded by small industrial
and academic institutions.[22-26]
To address this gap, we performed the most comprehensive analysis to date of all interventional
studies of small molecule and biologic therapies registered on clinicaltrials.gov, up to July 19, 2017.
We linked trial registry entries to the publication of findings in journal articles, and used descriptive
statistics, semantic enrichment, and data visualization methods to investigate and illustrate the
landscape and evolution of registered clinical trials. Specifically, we investigated the influence of
funder type and disease area on the design and publication rates of trials across a wide range of
medical journals; changes over time; and relationships to reported attrition rates in drug
development. To facilitate comparison across diverse study characteristics, we designed a novel
visualization method allowing rapid, intuitive inspection of basic trial properties.
METHODS
Our large-scale cross-sectional analysis and field synopsis of the current and historical clinical trial
landscape uses a combination of basic trial protocol information and published trial results. Our
data processing and integration workflow links records from clinicaltrials.gov with information
about the journal articles in which trial results were published.
Clinicaltrials.gov data processing and annotation
We downloaded the clinicaltrials.gov database as of July 19, 2017 – nearly two decades after the
FDA Modernization Act (FDAMA) legislation requiring trial registration.[8] The primary dataset
encompassed 249,904 studies in total. We then identified records of all pharmaceutical and
biopharmaceutical trials, defined as clinical studies evaluating small-molecule therapeutics,
biologics, adjuvants, and vaccines.[27] To extract the relevant subset, the Study type field was
restricted to “interventional” and Intervention type was restricted to either “Drug” or “Biological”.
This led to a set of 119,840 clinical trials with a range of organizations, disease areas, and design
characteristics (Suppl. File 1).
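This selection step can be sketched in Python; the record layout and field names below are illustrative stand-ins for the registry export, not the exact clinicaltrials.gov schema:

```python
# Sketch of the cohort selection: keep interventional studies whose
# interventions include a drug or biological. Field names are illustrative.
trials = [
    {"nct_id": "NCT001", "study_type": "Interventional", "intervention_types": ["Drug"]},
    {"nct_id": "NCT002", "study_type": "Observational", "intervention_types": ["Drug"]},
    {"nct_id": "NCT003", "study_type": "Interventional", "intervention_types": ["Biological"]},
    {"nct_id": "NCT004", "study_type": "Interventional", "intervention_types": ["Behavioral"]},
]

def is_pharma_trial(trial):
    """True for interventional studies evaluating a drug or biological."""
    return trial["study_type"] == "Interventional" and any(
        t in ("Drug", "Biological") for t in trial["intervention_types"]
    )

pharma_trials = [t["nct_id"] for t in trials if is_pharma_trial(t)]
print(pharma_trials)  # ['NCT001', 'NCT003']
```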
To enable robust aggregate analyses, we processed and further annotated the dataset. Firstly, date
formats and age eligibility fields were harmonised. Next, wherever possible, missing categories were
inferred from available fields as described by Califf and colleagues[16]. For example, for single-arm
interventional trials with missing allocation and blinding type annotations, allocation was assigned as
“non-randomized” and blinding as “open label”.[16] The dataset was further enriched with derived
variables. For example, trial duration value was derived from available trial start and primary
completion date; trial locations were mapped to continents and flagged if they involved emerging
markets, and trials making use of placebo and/or comparator controls were labelled based on
information from intervention name, arm type, trial title, and description field. For instance, if a
study had an arm of type “Placebo Comparator” or an intervention whose name included words
“placebo”, “vehicle” or “sugar pill”, it was annotated as “placebo-controlled”; similarly, studies with
an arm of type “Active Comparator” were classified as “comparator-controlled”; see Suppl. File 2 for
details. Finally, each trial was annotated with the number of outcome measures and eligibility
criteria following parsing of the unstructured eligibility data field as described by Chondrogiannis et
al.[28]
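The control-arm labelling rules described above can be expressed as a small function; a minimal sketch, assuming arm types and intervention names have already been extracted from the registry record:

```python
PLACEBO_WORDS = ("placebo", "vehicle", "sugar pill")

def annotate_controls(arm_types, intervention_names):
    """Label a trial as placebo- and/or comparator-controlled based on
    arm types and intervention names, per the rules in the text."""
    names = " ".join(intervention_names).lower()
    placebo = "Placebo Comparator" in arm_types or any(
        word in names for word in PLACEBO_WORDS
    )
    comparator = "Active Comparator" in arm_types
    return {"placebo_controlled": placebo, "comparator_controlled": comparator}

print(annotate_controls(["Experimental", "Placebo Comparator"], ["Drug X", "Matching placebo"]))
# {'placebo_controlled': True, 'comparator_controlled': False}
```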
To categorize clinical trials by medical specialty, we mapped reported disease condition names to
corresponding Medical Subject Headings (MeSH) terms (from both MeSH Thesaurus and MeSH
Supplementary Concept Records)[29] using a dictionary-based named entity recognition
approach.[30] A total of 21,884 distinct condition terms were annotated with MeSH identifiers,
accounting for 86.4% of all disease names from 88% of the clinical trials. The mapping enabled
normalization of various synonyms to preferred disease names as well as automatic classification of
diseases into distinct categories using the MeSH hierarchy. To compare our results with previously
published clinical trial success rates, we focused on seven major disease areas investigated in the
study by Hay et al.[31]: oncology, neurology, autoimmune, endocrine, respiratory, cardiovascular,
and infectious diseases. Trial conditions annotated with MeSH terms were mapped to these
categories based on matching terms from levels 2 and 3 of the MeSH term hierarchy; see Suppl. File
2 for details on MeSH term assignment and disease category mapping.
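The dictionary-based lookup at the heart of this mapping can be sketched as below; the two MeSH headings shown are real, but the dictionary itself is a toy stand-in for the full MeSH Thesaurus and Supplementary Concept Records:

```python
# Toy dictionary-based named-entity-recognition lookup mapping free-text
# condition names to (MeSH ID, preferred term). Illustrative only.
MESH_DICT = {
    "breast cancer": ("D001943", "Breast Neoplasms"),
    "breast carcinoma": ("D001943", "Breast Neoplasms"),
    "type 2 diabetes": ("D003924", "Diabetes Mellitus, Type 2"),
}

def map_condition(condition):
    """Normalize a reported condition name to its preferred MeSH term,
    or return None if the name is not in the dictionary."""
    return MESH_DICT.get(condition.strip().lower())

print(map_condition("Breast Carcinoma"))  # ('D001943', 'Breast Neoplasms')
```

The same lookup normalizes synonyms ("breast cancer", "breast carcinoma") to one preferred heading, which is what enables the downstream grouping by MeSH hierarchy.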
To categorize studies by funder type, we first extended the agency classification available from
clinicaltrials.gov. Specifically, we used two main categories: industry and non-profit organizations;
and four sub-categories: Big Pharma, Small Pharma, NIH, and Other; the last category corresponding
to universities, hospitals, and other non-profit research organizations.[32, 33] Industrial
organizations were classified as “Big Pharma” and “Small Pharma” based on their sales in 2016 as
well as overall number of registered trials they sponsored. Specifically, an industrial organization was
classed as “Big Pharma” if it featured on the list of 50 companies from the 2016 Pharmaceutical
Executive ranking[34] or if it sponsored more than 100 clinical trials; otherwise it was classified as
“Small Pharma”; see Suppl. File 2. Next, we derived the likely source of funding for each trial based
on lead_sponsor and collaborators data fields, using a previously described method[16] extended to
include the distinction between big and small commercial organizations. Briefly, if NIH was listed
either as the lead sponsor of a trial or a collaborator, the study was classified as NIH-funded. If NIH
was not involved, but either the lead sponsor or a collaborator was from industry, the study was
classified as funded by either “Big Pharma” if it included any Big Pharma organization or “Small
Pharma” otherwise. Remaining studies were assigned “Other” as the funding source.
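The funding-source assignment follows a simple precedence, sketched below; the sponsor-class labels are illustrative (in practice they would be derived from the lead_sponsor and collaborators fields plus the Big/Small Pharma classification):

```python
def classify_funder(lead_sponsor_class, collaborator_classes):
    """Assign a funding category with the precedence described in the text:
    any NIH involvement wins, then Big Pharma, then Small Pharma, then Other."""
    classes = [lead_sponsor_class, *collaborator_classes]
    if "NIH" in classes:
        return "NIH"
    if "Big Pharma" in classes:
        return "Big Pharma"
    if "Small Pharma" in classes:
        return "Small Pharma"
    return "Other"

print(classify_funder("University", ["Small Pharma", "Big Pharma"]))  # Big Pharma
```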
Establishing links between trial registry records and published results
We linked the trial registry records to their disclosed results, either deposited directly on
clinicaltrials.gov or published as a journal article, to establish if the results of a particular trial had
been disseminated. We implemented the approach described by Powell-Smith and Goldacre[20] to
identify eligible trials that should have publicly disclosed results. These were studies completed after
January 2006 and by July 2015 (to allow sufficient time for publication of results) that had not filed an
application to delay submission of the results (based on the firstreceived_results_disposition_date
field in the registry).[20]
regardless of how many trials they funded. The selection process led to a dataset of 45,620 studies,
including 27,835 efficacy trials (Phase 2-4); summarized by the flow diagram in Suppl. File 1.
For each eligible trial, we searched for published results using two methods. Firstly, we searched for
structured reports submitted directly to the results database of clinicaltrials.gov (based on the
clinical_results field). Next, we queried the PubMed database with NCT numbers of eligible trials
using an approach that involves searching for the NCT number in the title, abstract, and Secondary
Source ID field of PubMed-indexed articles.[20, 35, 36] To exclude study protocols, commentaries,
and other non-relevant publication types, we used the broad “Therapy” filter – a standard validated
PubMed search filter for clinical queries[37] – as well as simple keyword matching for “study
protocol” in publication titles.[20] In addition, we excluded articles published before trial
completion.
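A per-trial query of this shape can be assembled as below; the field tags ([si] for Secondary Source ID, [Title/Abstract]) are standard PubMed syntax and the embedded filter follows PubMed's published broad "Therapy" clinical-queries filter, though the exact query used in the study may differ:

```python
# Broad "Therapy" clinical-queries filter, reproduced here for illustration.
THERAPY_FILTER = (
    "((clinical[Title/Abstract] AND trial[Title/Abstract]) OR "
    "clinical trials as topic[MeSH Terms] OR clinical trial[Publication Type] OR "
    "random*[Title/Abstract] OR random allocation[MeSH Terms] OR "
    "therapeutic use[MeSH Subheading])"
)

def build_pubmed_query(nct_id):
    """Search for one NCT number in the title/abstract or Secondary Source ID
    field, excluding articles titled as study protocols."""
    return (
        f"({nct_id}[si] OR {nct_id}[Title/Abstract]) "
        f'AND {THERAPY_FILTER} NOT "study protocol"[Title]'
    )

print(build_pubmed_query("NCT01234567"))
```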
We then determined the proportion of publicly disseminated trial reports for the studies covered by
mandate for result submission (efficacy trials: Phase 2-4) as well as for all completed drug trials
including those not covered by the FDA Amendment Act (Phase 0-1 trials and studies with phase set
to “n/a”). In addition to result dissemination rate, we calculated the dissemination lag for individual
studies to determine the proportion of reports disclosed within the required 12 months after
completion of a trial.
public disclosure of the results (whether through submission to clinicaltrials.gov or publication in a
journal).
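Dissemination lag as defined above can be computed directly from the relevant dates; a minimal sketch:

```python
from datetime import date

def dissemination_lag_days(completion_date, disclosure_dates):
    """Days from study completion to the earliest public disclosure
    (registry result posting or journal publication); None if undisclosed."""
    known = [d for d in disclosure_dates if d is not None]
    if not known:
        return None
    return (min(known) - completion_date).days

lag = dissemination_lag_days(date(2013, 1, 15), [date(2014, 3, 1), date(2013, 11, 20)])
print(lag, lag <= 365)  # 309 True
```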
Categorizing scientific journals
To investigate the relation between characteristics of published trials and quality metrics of scientific
journals, we classified journals as high or low impact based on journal metrics information from
SCImago Journal & Country Rank, a publicly available resource that includes journal quality
indicators developed from the information contained in the Scopus database.[38] Specifically, a
journal was classified as high-impact if its 2016 H-index was larger than 400 and as low-impact if
its H-index was below 50.
Statistical analysis and data visualization
We examined the study protocol data on 15 characteristics available from clinicaltrials.gov and from
our derived annotations. These were: (1) randomization; (2) interventional study design (e.g. parallel
assignment, sequential assignment or crossover studies); (3) masking (open-label, single-blinded,
double-blinded studies); (4) enrolment (total number of participants in completed studies or
estimated number of participants for ongoing studies); (5) arm types (placebo or active comparator);
(6) duration; (7) number of outcomes (clinical endpoints); (8) number of eligibility criteria; (9) study
phase; (10) funding source; (11) disease category; (12) countries; (13) number of locations; (14)
publication of results; (15) reported use of data monitoring committees (DMCs). Further
explanation and definitions of trial properties are available in Suppl. File 2.
Based on these characteristics, we examined the methodological quality of trials funded by
different types of organizations and compared studies published in higher and lower impact journals
and reported on clinicaltrials.gov. In addition, we analysed trial dissemination rates across funding
categories and disease areas and compared them with clinical success rates reported previously.[31]
We used descriptive statistics and data visualization methods to characterize trial categories. For
categorical data we report frequencies and percentages; for continuous variables we report medians
and interquartile ranges. To visualize multiple features of clinical trial design, we used radar plots with
each axis showing the fraction of trials with a different property, such as reported use of
randomization, blinding or active comparators. We converted three continuous variables to binary
categories: we classified trials as “small” if they enrolled fewer than 100 participants, as “short” if
they lasted less than 365 days, and as “multi-country” if trial recruitment centres were in more than
one country. Further details are available in supplementary tables. To facilitate comparisons
between several radar plots, we always standardized the order and length of axes. We used Python
for all data analysis tasks (scikit-learn, pandas, scipy, and numpy), for working with geographical
data (geonamescache, geopy, and Basemap), and for data processing and visualization
(matplotlib, Basemap, and seaborn).
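The binary conversions and per-axis fractions feeding the radar plots can be sketched as follows; the thresholds match those stated in the text, while the field names are illustrative:

```python
def radar_fractions(trials):
    """Fraction of trials with each binary design property, one value per
    radar-chart axis. Thresholds follow the text: <100 participants = small,
    <365 days = short, >1 country = multi-country."""
    axes = {
        "randomized": lambda t: t["randomized"],
        "small": lambda t: t["enrollment"] < 100,
        "short": lambda t: t["duration_days"] < 365,
        "multi_country": lambda t: t["n_countries"] > 1,
    }
    n = len(trials)
    return {name: sum(rule(t) for t in trials) / n for name, rule in axes.items()}

example = [
    {"randomized": True, "enrollment": 40, "duration_days": 200, "n_countries": 1},
    {"randomized": False, "enrollment": 250, "duration_days": 700, "n_countries": 3},
]
print(radar_fractions(example))
# {'randomized': 0.5, 'small': 0.5, 'short': 0.5, 'multi_country': 0.5}
```

Each returned fraction becomes the length of one standardized radar axis, which is what makes plots for different funder categories directly comparable.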
Patient involvement
Patients were not involved in any aspect of the study design, conduct, or in the development of the
research question. The study is an analysis of existing publicly available data and therefore there was
no active patient recruitment for data collection.
RESULTS
Overview of clinical trial activity by funder type
To date, 119,840 pharmaceutical trials have been registered with clinicaltrials.gov. As illustrated by
Fig 1A, the distribution of trials funded by different types of organizations has changed over time. In
particular, there has been a rise in the proportion of studies funded by organizations other than
industry or NIH – mainly universities, hospitals and other academic and non-profit agencies,
classified here as “Other”. Overall, these institutions funded 36.5% of all registered pharmaceutical
trials, followed by Big Pharma with 31%, and Small Pharma with 21.2% of studies. NIH funded 11.3%
of registered pharmaceutical trials (Fig 1B).
Amongst the 10 main study funders ranked by the overall number of trials, two belonged to the
“NIH” category (National Cancer Institute leading with 7,219 trials) and eight to the “Big Pharma”
category (GSK at the top with 3,171 trials). Out of 4,914 “Small Pharma” organizations, 73.6% had
fewer than five trials, whilst 43.1% funded only one study. Similarly, amongst the 6,468 funders
classified as “Other”, 79.2% had fewer than five trials, whilst 32.7% funded only one study.
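Per-funder shares of this kind can be computed from a per-trial table with a simple pandas aggregation; this is a hedged sketch on toy data (the `funder` column name is an assumption):

```python
import pandas as pd

# Toy registry extract: one row per trial with its lead funder (illustrative)
trials = pd.DataFrame({"funder": ["A", "A", "B", "C", "C", "C", "C", "C", "D"]})

counts = trials.groupby("funder").size()   # number of trials per funder
share_lt5 = (counts < 5).mean()            # fraction of funders with fewer than five trials
share_one = (counts == 1).mean()           # fraction of funders with a single study
# here: share_lt5 == 0.75, share_one == 0.5
```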
The approach outlined in the Methods section identified 45,620 clinical trials in our dataset that
should have published results available. The remaining analyses in the Results section focus on
this subset of studies.
Trial design characteristics
The dataset included various types of study designs from large randomized clinical trials (RCTs) to
small single-site studies, many of which lacked controls (Suppl. File 2). Comparative analysis showed
that methodological characteristics of trials varied substantially across clinical phases (Fig 2A),
disease areas (Fig 2B), and funding source (Fig 3). The following section focuses on a comparative
analysis by funder type, whilst a detailed overview of differences across phases and diseases is
available in Suppl. File 2.
Industrial organizations were more likely to fund large international trials and more often reported
the use of randomization, controls and blinding, compared to non-industrial funders. Whilst
differences were evident across all clinical phases (Suppl. File 2), the largest variability was observed
for Phase 2 trials, illustrated by Fig 3. For instance, 68.5% of all industry-funded Phase 2 trials were
randomized (compared to 39.4% and 50.6% of studies funded by NIH and other institutions,
respectively). Studies with industrial funding were also substantially larger (median 80 patients
compared with 43 and 48, respectively) and more likely to include international locations (31.3% for industry vs
4.0% for Other and 11.0% for NIH). Despite larger enrollment size, industry-funded studies were
shorter on average compared to trials by other funders (median 547 days compared to 1,431 days
for NIH and 941 for the “Other” funding category). Non-industrial funders were more likely to report the
use of data monitoring committees (DMCs): 41.9% of trials with NIH funding and 48.1% of studies
funded by other organizations involved DMCs, compared with 26.7% of trials funded by industry.
Design characteristics did not differ substantially between Big and Small Pharma (Suppl. File 2).
Trial design changed substantially over time, with the trends most evident in Phase 2. As shown in
Fig 4, the methodological differences between industry and non-industrial funders have generally
decreased over time. Although overall a higher proportion of industry-led studies reported the use
of randomization, the proportion of RCTs funded by non-profit organizations in Phase 2 has recently
increased (Fig 4). Notably, the use of blinding is still much lower in non-commercial than commercial
trials. Additional analysis showed that the average numbers of trial eligibility criteria and outcome
measures have increased over time, leading to a greater number of procedures performed in an
average study and hence to an increase in trial complexity (Suppl. File 2).
Trial results dissemination and characteristics of published trials
Of the 45,620 completed trials, 27,835 were efficacy studies in Phases 2-4 covered by the mandate
for results publication (Suppl. File 1). Of these, only 15,084 (54.2%) have disclosed their results
publicly. Consistent with previously reported trends,[39, 40] dissemination rates were lower for
earlier clinical phases (not covered by the publication mandate) as only 23.4% of trials in Phases 0-1
had publicly disclosed results (Suppl. File 2).
Detailed analysis of Phase 2-4 trials showed that only 25.2% of published studies disclosed their
results within the required period of 12 months after completion. The median time to first public
reporting, whether through direct submission to clinicaltrials.gov or publication in a journal, was
18.6 months. The dissemination lag was shorter for results submitted to clinicaltrials.gov (median 15.3
months) compared to results published in a journal (median 23.9 months).
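Dissemination-lag statistics of this kind can be derived from completion and first-disclosure dates; the sketch below uses invented dates, and the 30.44 days-per-month conversion is an assumption:

```python
import pandas as pd

# Illustrative completion and first-disclosure dates for three hypothetical trials
df = pd.DataFrame({
    "completed": pd.to_datetime(["2012-01-01", "2012-06-01", "2013-03-01"]),
    "disclosed": pd.to_datetime(["2013-07-15", "2013-04-01", "2014-01-20"]),
})

# Lag in months (approximating a month as 30.44 days)
lag_months = (df["disclosed"] - df["completed"]).dt.days / 30.44
median_lag = lag_months.median()
timely = (lag_months <= 12).mean()   # fraction disclosed within the 12-month window
```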
Of the Phase 2-4 studies, 10,554 (37.9%) disclosed their results through structured submission to
clinicaltrials.gov, 8,338 (29.95%) through a journal article, and 3,808 (13.7%) through both
routes. Scientific articles were published in 1,454 journals of diverse nature and varying impact
factor, ranging from prestigious general medical journals (BMJ, JAMA, Lancet, NEJM) to
narrower-focus specialist journals with no impact information available.
Journal publication status depended on study design. Trials reporting the use of randomization were
more likely to be published compared to non-randomized studies (34.9% vs 18.2%), and large trials
enrolling more than 100 participants were more likely to be published than studies with fewer than
100 participants (39.3% vs 20.5%).
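The design-stratified publication rates above can be sketched in a few lines of pandas; the per-trial flags below are hypothetical, whereas the real analysis uses the full registry extract:

```python
import pandas as pd

# Hypothetical per-trial enrollment sizes and journal-publication flags
df = pd.DataFrame({
    "enrollment": [250, 80, 150, 40, 500, 60],
    "published":  [True, False, True, False, True, False],
})

large = df["enrollment"] > 100                    # size threshold used in the comparison
rate_large = df.loc[large, "published"].mean()    # publication rate of large trials
rate_small = df.loc[~large, "published"].mean()   # publication rate of small trials
```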
In general, studies whose results were disclosed in the form of a scientific publication were more
likely to follow the ideal RCT design paradigms compared to those whose results were submitted to
clinicaltrials.gov only (Fig 5). Furthermore, a similar trend was observed for trials with results
published in higher vs. lower impact journals. For instance, 89.9% of trials published in journals with
H-index > 400 reported the use of randomization, compared to 79% of trials published in journals
with H-index < 50, and 60.4% of trials whose results were submitted to clinicaltrials.gov only (Fig 5).
The characteristics of trials disclosed solely through submission to clinicaltrials.gov did not
differ substantially from those of trials not disseminated in any form (Suppl. Files 2 and 3).
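The journal H-index stratification could be implemented with pandas binning along these lines (toy numbers; the band edges mirror the thresholds quoted above):

```python
import pandas as pd

# Hypothetical journal H-indices and randomization flags for published trials
df = pd.DataFrame({
    "h_index":    [450, 520, 30, 45, 120, 260],
    "randomized": [True, True, False, True, True, False],
})

# Bin journals by H-index and compute the fraction of randomized trials per band
bands = pd.cut(df["h_index"], bins=[0, 50, 400, float("inf")],
               labels=["<50", "50-400", ">400"])
rand_by_band = df.groupby(bands, observed=True)["randomized"].mean()
```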
Trial dissemination rates across funder types
Trial dissemination rates varied amongst the different funding source categories. In general,
industry-funded studies were more likely to disclose results compared to non-commercial trials
(59.3% vs 45.3% of all completed Phase 2-4 studies). They were also more likely to report the results
on clinicaltrials.gov, overall (45.6% vs 24.4%) and within the required period of 12 months after
completion (23.3% vs 10.3%). Notably, only a small difference was observed for journal publication
rates (30.98% vs 28.4%). Additional analysis showed substantial differences across more granular
funder categories (Fig 6A). Among non-profit funders, there was a gap between NIH-funded studies
and those funded by other academic and non-profit organizations, with 60% vs 40.6% of trials
disclosed in the two categories, respectively. Similarly, among the industry-funded clinical trials,
Big Pharma had a higher result dissemination rate compared to Small Pharma (66.7% vs 45.2% of
studies).
In addition to overall reporting rates, the analysis also highlighted differences in the nature and
timing of results dissemination across trial funding sources (Fig 6A-D). Big Pharma had the highest
dissemination rate, both in terms of clinicaltrials.gov submissions (53.2% of studies) and journal
publications (34.8%). By contrast, funders classified as “Other” showed the lowest percentage of
trials reported on clinicaltrials.gov (15.9%). Organizations from this category disclosed trial results
primarily through journal articles (29.1% of all studies) and were the only funder type with fewer
trials reported through clinicaltrials.gov submission than through journal publication. The time to
reporting trials at clinicaltrials.gov was fastest for industry funded trials and slowest for studies
funded by organizations classed as “Other”; publication lag for journal articles showed a reverse
profile (see Fig 6C-D and Suppl. File 2 for detailed statistics).
Analysis of the fraction of trials with results disclosed within the required 12 months after
completion showed that dissemination rates have grown steadily over time, with the largest
increase observed after 2007, when FDAAA came into effect.[9] Further analysis showed that the
observed growth was largely driven by increased timely reporting on clinicaltrials.gov, most evident
for studies funded by Big Pharma and NIH (Suppl. File 2).
Trial dissemination rates across disease categories
The disease areas with the highest result dissemination were autoimmune and infectious diseases,
whilst neurology and oncology showed the lowest dissemination rates (Fig 7). As shown in Fig 7A,
these differences were largely driven by differential trends in journal publication, with less variability
observed in rates of result reporting on clinicaltrials.gov. Oncology in particular consistently had the
lowest journal publication rate across all clinical phases and funder categories.
Two disease categories with the highest fraction of published trials (autoimmune and infectious
diseases) were also previously associated with the highest likelihood of new drug approval (LoA) (12.7%
and 16.7%, respectively). Similarly, two disease areas showing the lowest proportion of published
trials also had the lowest drug approval rates: neurology (LoA of 9.4%) and oncology (6.7%).[31]
Cardiovascular diseases had a relatively high publication rate, but low clinical trial success rate with
only 7.1% of tested compounds eventually approved for marketing.[31]
DISCUSSION
This analysis, enabled by automated data-driven analytics, provided several important insights into
the scope of research activity, design, visibility and reporting of clinical trials of pharmaceutical
interventions. Trial activity has grown and diversified, with a broader range of for-profit and
non-profit organizations undertaking clinical research across a wide range of disease
areas, and the analysis highlighted potentially important differences in trial quality and
dissemination.
In terms of trial design, industry-funded studies tended to use more robust methodology compared
to studies funded by non-profit institutions, although the methodological differences have been
decreasing with time. Result dissemination rates varied substantially depending on disease area as
well as the type and size of funding institution. In general, the results of industry-funded trials were
more likely to be disclosed compared to non-commercial studies, although small pharmaceutical
organizations lagged behind “Big Pharma”. Similarly, trials with NIH funding more often disclosed
their results compared to those funded by other academic and non-profit organizations. Although
overall dissemination rates improved with time, the trend was largely driven by increased results
reporting on clinicaltrials.gov for Big Pharma and NIH-funded studies. By contrast, the results of
trials funded by other academic and non-profit organizations were primarily published as journal
articles and were rarely submitted to clinicaltrials.gov. Finally, differences in dissemination rates
across medical specialties were mainly due to differential journal publication rates ranging from 42%
for autoimmune diseases to 20% for oncology.
Strengths and limitations
One of the major strengths of this work is the larger volume of clinical trials data and the greater
depth and detail of the analysis compared to previous studies in this area. Many factors were
analyzed for the first time in the context of both trial design quality and transparency of research.
These included the distinction between Big Pharma and smaller industrial organizations, analysis of
publication trends across distinct medical specialties, and differences between trials whose results
were disseminated through clinicaltrials.gov submission and publication in higher or lower impact
journals. In addition, the study provided a novel method for visualizing trial characteristics that
enables a rapid assimilation of similarities and differences across a variety of factors.
The study had several limitations related to the underlying dataset and the methodology used to
annotate and integrate the data. Firstly, it was limited to the clinicaltrials.gov database and hence
did not include unregistered studies. However, given that trial registration has been widely accepted
and enforced by regulators and journal editors,[8, 12] it is likely that most of the influential studies
have been captured in the analysis.[41] Additional limitations were due to the data quality issues
such as missing or incorrect entries, ambiguous terminology, and lack of semantic
standardization.[14, 16] Many trials were annotated as placebo or comparator-controlled based on
information mined from their unstructured titles and descriptions, as the correct annotation in the
structured arm type field was often missing. Although a subset of annotations was verified
manually, some trials may still have been misclassified by our automated workflow. In addition,
due to the non-standardized free-text format of the data, only a simple count of eligibility criteria
and outcome measures was used as an imperfect proxy for assessing trial complexity and the
homogeneity of enrolled population.
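The simple criteria count mentioned above might look like the sketch below; the hyphen-bulleted format is an assumption, and real clinicaltrials.gov entries vary considerably:

```python
# Hypothetical free-text eligibility section (format is an assumption)
criteria_text = """Inclusion Criteria:
- Age 18 years or older
- Histologically confirmed diagnosis
Exclusion Criteria:
- Prior treatment with the study drug
- Pregnancy"""

# Count bulleted lines as individual criteria (a crude complexity proxy)
n_criteria = sum(1 for line in criteria_text.splitlines()
                 if line.lstrip().startswith("-"))
# here n_criteria == 4
```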
Finally, our method for linking registry records to associated publications relied on the presence of
a registry identifier (NCT number) in the text or metadata of a journal article. Its absence from a trial
publication would lead to misclassified publication status and, consequently, underestimated
publication rates. However, owing to the rigour of the search methods, it is unlikely that the
publication status was misclassified in a systematic manner, e.g. depending on study funding source
or disease area. Hence, although absolute estimates might be affected, it is unlikely that the
comparative analysis based on automated record linkage is biased towards one of the categories.
Notably, reporting of registration numbers is encouraged by medical journals through ICMJE and the
CONSORT checklist and it can be argued that compliance is part of investigators’ responsibility to
ensure that their research is transparent and discoverable.[20, 35]
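Record linkage via NCT numbers can be sketched with a regular expression; the abstract text below is invented, and the only assumption about the identifier format is that it is "NCT" followed by eight digits:

```python
import re

# NCT identifiers: "NCT" followed by eight digits
NCT_RE = re.compile(r"\bNCT\d{8}\b")

abstract = ("Methods: this randomised trial (ClinicalTrials.gov: NCT01234567) "
            "enrolled 120 participants across 14 sites.")
ids = NCT_RE.findall(abstract)   # registry IDs found in the article text
```

An article with no matches would be treated as unlinked, which is exactly the failure mode discussed above when authors omit the registration number.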
Comparison with other studies
In addition to providing novel contributions, our analysis further confirmed and extended several
findings reported previously by others. Notably, larger sample size, wide inclusion criteria, and
additional mappings allowed us to calculate more accurate statistics and ask previously unexplored
questions in an unbiased way. For instance, Bala et al.[42] showed that trials published in higher
impact journals have larger enrollment sizes, based on a sample of 1,140 RCT reports identified by manual
search. Our study extended this analysis to the entire clinicaltrials.gov registry (not only RCTs) and to many
additional trial properties beyond enrollment size, including reported use of DMCs, duration of the
studies, and number of countries. Other previously reported trends confirmed and further examined
in this study included lower reporting rate observed for earlier clinical phases[39, 40] and academic
sponsors,[39, 40] longer dissemination lag for journal publications,[43] and overall improvements in
result dissemination.[44]
Several previous studies investigated the transparency of clinical research using various subsets of
clinicaltrials.gov and diverse approaches for classifying study results as published[17, 20, 36, 39, 40,
45, 46]; these are summarized in detail elsewhere.[47, 48] The general trends shown by our study were within
the range of previously reported estimates, and our estimate of 54% for the overall trial dissemination rate was in
agreement with two recent systematic reviews.[47, 48] Our study is methodologically most similar
to the work by Powell-Smith and Goldacre[20] with one important distinction: we did not restrict
the analysis to trials sponsored by organizations with more than 30 studies as this would exclude
small pharmaceutical companies and research centers, and bias the results towards large
institutions that routinely perform or fund clinical research. As a consequence, the analysis showed
analogous trends, but slightly lower overall reporting rates, compared with that study.[20]
Meaning and implications of the study
The higher methodological quality of industry funded trials may reflect the available financial and
organizational resources as well as better infrastructure and expertise in conducting clinical
trials.[49, 50] It is also possible that differential objectives of the industrial and non-profit
investigators (development of new medicines with a view to gaining a license for market access vs
mechanistic research and rapid publication of novel findings) contribute to the observed
differences.[49, 51]
The analysis suggests that trials funded by large institutions that regularly conduct clinical research
are more likely to disseminate their results publicly. Although industry led reporting overall, ‘Big
Pharma’ performed much better than smaller organizations, likely because many large
companies have developed explicit disclosure policies in recent years.[52-54] Similarly, NIH has
developed specific regulations for disseminating the results of the studies it funds. The most recent
such policy requires the publication of all NIH-funded trials, including (unprecedentedly) Phase 1
safety studies.[55] Whilst large trial sponsors seem to actively pursue better transparency,[39]
smaller organizations, particularly those in the non-profit sector, were less responsive to the FDAAA
publication mandate. It is possible that such institutions are less able, or less motivated, to allocate
time and resources required to prepare and submit the results, or may be less familiar with the
legislation.[39, 41]
Our study suggests that the FDA mandate for trial results dissemination and the establishment of
clinicaltrials.gov results database led to improved transparency of clinical research. The time to
results dissemination is faster for clinicaltrials.gov compared to traditional publication and the
database stores the results of many otherwise unpublished studies. Lack of journal publications
might be explained by the reluctance of editors to publish smaller studies based on suboptimal
design or studies with “negative” or non-significant outcomes that are unlikely to influence clinical
practice.[42, 56, 57] Our analysis showed that journal publication varied depending on trial design
and disease area, with certain medical specialties and smaller non-randomized studies more likely to
be disseminated solely through clinicaltrials.gov. The difference was particularly evident for
oncology. Only a fifth of completed cancer trials were published in a journal, perhaps due to the high
prevalence of small sample sizes, suboptimal trial designs, and negative outcomes in this field.[14,
31] The high clinicaltrials.gov dissemination rates, including for cancer trials and small studies, suggest
that the resource plays an important role in reducing publication bias. Thus, to ensure better
completeness of available evidence, the database should be used by clinical and drug discovery
researchers in addition to literature reviews.
Importantly, automated analyses and methods for identification of yet to be published studies could
be used by regulators and funders of clinical research to track trial investigators who fail to
disclose the results of their studies. Such a technology would be particularly useful when coupled
with automated notifications and additional incentives (or sanctions) for result dissemination.
Recently, the concept of automated ongoing audit of trial reporting was implemented in the
online TrialsTracker tool, which monitors major clinical trial sponsors with more than 30 studies.[20]
Improvements to the quality of clinical trial data could facilitate the development of robust
monitoring systems in the future. Firstly, data quality should be taken into account when registering
new trials and standardized nomenclature should be used whenever possible. Ideally, the user
interface of the clinicaltrials.gov registry itself should be designed to encourage and facilitate
normalization. In addition, stricter adherence to the CONSORT requirement for including the trial
registration number with each published article would improve the accuracy of automated mapping
of registry records to associated publications.[41]
In addition to improved monitoring, new regulations might be needed to ensure better transparency
and it is important that small industrial organizations and academic centres are reached by these
efforts as well. One possible solution was proposed by Anderson and colleagues[39]: journal editors
from ICMJE could call for submission of results to clinicaltrials.gov as a requirement for article
publication. Given the dramatic increase in trial registration rates that followed previous ICMJE
policies,[58, 59] it is likely that such a call would improve the reporting of ongoing clinical
research.[39]
Acknowledgements
We thank our colleagues from BenevolentAI, University College London, and the Farr Institute of Health
Informatics for many useful discussions and advice, in particular John Overington (currently at
Medicines Discovery Catapult), Nikki Robas, Bryn Williams-Jones, Chris Finan, and Peter Cox. We also
thank the clinicaltrials.gov team for their assistance and advice.
Funding
This study received no specific external funding from any institution in the public, commercial, or
not-for-profit sectors and was conducted by BenevolentBio as part of its normal business.
Competing interests
All authors have completed the ICMJE uniform disclosure form at
http://www.icmje.org/coi_disclosure.pdf and declare: no financial relationships with any
organisations that might have an interest in the submitted work and no other relationships or
activities that could appear to have influenced the submitted work.
Licence
This is an Open Access article distributed in accordance with the Creative Commons Attribution Non
Commercial (CC BY-NC 3.0) license, which permits others to distribute, remix, adapt, build upon this
work non-commercially, and license their derivative works on different terms, provided the original
work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-
nc/3.0/.
Contributorship
Contributors: MZ, MD, AH, and JH conceived and designed this study. MZ and MD acquired the
data. MZ, MD, AH, and JH analysed and interpreted the data. The initial manuscript was drafted by
MZ; all authors critically revised the manuscript and approved its final version. MZ takes
responsibility for the overall integrity of the data and the accuracy of the data analysis. JH is the
guarantor.
Transparency
The authors affirm that the manuscript is an honest, accurate, and transparent account of the study
being reported; that no important aspects of the study have been omitted; and that any
discrepancies from the study as planned (and, if relevant, registered) have been explained.
Data sharing
All the data used in the study are available from public resources. The dataset can be made available
on request from the authors.
Patient consent
Not required
Ethical approval
Not required
References
1. Sibbald B and Roland M. Understanding controlled trials. Why are randomised controlled
trials important? BMJ 1998;316(7126):201.
2. Lindsay MA. Target discovery. Nat Rev Drug Discov 2003;2(10):831.
3. Sackett DL, Rosenberg WMC, Gray JAM, et al. Evidence based medicine: what it is and what
it isn't. BMJ 1996;312:71.
4. Woolf SH. The meaning of translational research and why it matters. JAMA 2008;299(2):211-
213.
5. Kendall J. Designing a research project: randomised controlled trials and their principles.
Emerg Med J 2003;20(2):164-168.
6. Simes RJ. Publication bias: the case for an international registry of clinical trials. J Clin Oncol
1986;4(10):1529-1541.
7. Yamey G. Scientists who do not publish trial results are “unethical”. BMJ
1999;319(7215):939.
8. Food and Drug Administration Modernization Act (FDAMA) of 1997. Available from:
https://www.fda.gov/RegulatoryInformation/LawsEnforcedbyFDA/SignificantAmendmentst
otheFDCAct/FDAMA/default.htm; accessed 09/01/2018.
9. Food and Drug Administration Amendments Act (FDAAA) of 2007. Available from:
https://www.fda.gov/drugs/guidancecomplianceregulatoryinformation/guidances/ucm0649
98.htm; accessed 09/01/2018.
10. Zarin DA and Tse T. Moving towards transparency of clinical trials. Science
2008;319(5868):1340.
11. Zarin DA, Tse T, Williams RJ, et al. Trial reporting in ClinicalTrials.gov—the final rule. N Engl J
Med 2016;375(20):1998-2004.
12. ICMJE's clinical trial registration policy. Available from:
http://www.icmje.org/recommendations/browse/publishing-and-editorial-issues/clinical-
trial-registration.html; accessed 09/01/2018.
13. WHO International Clinical Trials Registry Platform (ICTRP): Primary Registries. Available
from: http://www.who.int/ictrp/network/primary/en/; accessed 09/01/2018.
14. Hirsch BR, Califf RM, Cheng SK, et al. Characteristics of oncology clinical trials: insights from a
systematic analysis of ClinicalTrials.gov. JAMA Intern Med 2013;173(11):972-979.
15. Zarin DA, Ide NC, Tse T, et al. Issues in the registration of clinical trials. JAMA
2007;297(19):2112-2120.
16. Califf RM, Zarin DA, Kramer JM, et al. Characteristics of clinical trials registered in
ClinicalTrials.gov, 2007-2010. JAMA 2012;307(17):1838-1847.
17. Chen R, Desai NR, Ross JS, et al. Publication and reporting of clinical trial results:
cross sectional analysis across academic medical centers. BMJ 2016;352:i637.
18. Hakala A, Kimmelman J, Carlisle B, et al. Accessibility of trial reports for drugs stalling in
development: a systematic assessment of registered trials. BMJ 2015;350:h1116.
19. Mathieu S, Boutron I, Moher D, et al. Comparison of registered and published primary
outcomes in randomized controlled trials. JAMA 2009;302(9):977-984.
20. Powell-Smith A and Goldacre B. The TrialsTracker: Automated ongoing monitoring of failure
to share clinical trial results by all major companies and research institutions. F1000Res
2016;5:2629.
21. Ross JS, Mulvey GK, Hines EM, et al. Trial publication after registration in ClinicalTrials.gov:
a cross-sectional analysis. PLoS Med 2009;6(9):e1000144.
22. Small drug companies grow as big pharma outsources. Available from:
http://www.telegraph.co.uk/business/2017/01/22/top-50-privately-owned-pharma-
companies-britain/; accessed 09/01/2018.
23. Barden CJ and Weaver DF. The rise of micropharma. Drug Discov Today 2010;15(3):84-87.
24. Munos B. Lessons from 60 years of pharmaceutical innovation. Nat Rev Drug Discov
2009;8(12):959-968.
25. Munos BH and Chin WW. How to revive breakthrough innovation in the pharmaceutical
industry. Sci Transl Med 2011;3(89):89cm16.
26. Pammolli F, Magazzini L, and Riccaboni M. The productivity crisis in pharmaceutical R&D.
Nat Rev Drug Discov 2011;10(6):428-438.
27. Thiers FA, Sinskey AJ, and Berndt ER. Trends in the globalization of clinical trials. Nat Rev
Drug Discov 2008;7(1):13.
28. Chondrogiannis E, Andronikou V, Tagaris A, et al. A novel semantic representation for
eligibility criteria in clinical trials. J Biomed Inform 2017;69:10-23.
29. Coletti MH and Bleich HL. Medical subject headings used to search the biomedical literature.
J Am Med Inform Assoc 2001;8(4):317-323.
30. Jensen LJ, Saric J, and Bork P. Literature mining for the biologist: from information retrieval
to biological discovery. Nat Rev Genet 2006;7(2):119-129.
31. Hay M, Thomas DW, Craighead JL, et al. Clinical development success rates for
investigational drugs. Nat Biotechnol 2014;32(1):40.
32. ClinicalTrials.gov Glossary of common site terms. Available from:
https://clinicaltrials.gov/ct2/about-studies/glossary; accessed 09/01/2018.
33. ClinicalTrials.gov Protocol registration data element definitions for interventional and
observational studies. Available from: http://prsinfo.clinicaltrials.gov/definitions.html; accessed
09/01/2018.
34. Top 50 Global Pharma Companies. Annual ranking published by Pharmaceutical Executive.
Available from: https://www.rankingthebrands.com/The-Brand-
Rankings.aspx?rankingID=370; accessed 09/01/2018.
35. Huser V and Cimino JJ. Linking ClinicalTrials.gov and PubMed to track results of
interventional human clinical trials. PLoS One 2013;8(7):e68409.
36. Ramsey S and Scoggins J. Commentary: practicing on the tip of an information iceberg?
Evidence of underpublication of registered clinical trials in oncology. Oncologist
2008;13(9):925-929.
37. Lokker C, Haynes RB, Wilczynski NL, et al. Retrieval of diagnostic and treatment studies for
clinical use through PubMed and PubMed's Clinical Queries filters. J Am Med Inform Assoc
2011;18(5):652-659.
38. SCImago, SJR — SCImago Journal & Country Rank. Available from: www.scimagojr.com;
accessed 09/01/2018.
39. Anderson ML, Chiswell K, Peterson ED, et al. Compliance with results reporting at
ClinicalTrials.gov. New Engl J Med 2015;372(11):1031-1039.
40. Prayle AP, Hurley MN, and Smyth AR. Compliance with mandatory reporting of clinical trial
results on ClinicalTrials.gov: cross sectional study. BMJ 2012;344:d7373.
41. Manzoli L, Flacco ME, D'Addario M, et al. Non-publication and delayed publication of
randomized trials on vaccines: survey. BMJ 2014;348:g3058.
42. Bala MM, Akl EA, Sun X, et al. Randomized trials published in higher vs. lower impact
journals differ in design, conduct, and analysis. J Clin Epidemiol 2013;66(3):286-295.
43. Riveros C, Dechartres A, Perrodeau E, et al. Timing and completeness of trial results posted
at ClinicalTrials.gov and published in journals. PLoS Med 2013;10(12):e1001566.
44. Miller JE, Wilenzick M, Ritcey N, et al. Measuring clinical trial transparency: an empirical
analysis of newly approved drugs and large pharmaceutical companies. BMJ Open
2017;7(12):e017917.
45. Jones CW, Handler L, Crowell KE, et al. Non-publication of large randomized clinical trials:
cross sectional analysis. BMJ 2013;347: f6104.
46. Ross JS, Tse T, Zarin DA, et al. Publication of NIH funded trials registered in
ClinicalTrials.gov: cross sectional analysis. BMJ 2012;344:d7292.
47. Chan AW, Song F, Vickers A, et al. Increasing value and reducing waste: addressing
inaccessible research. Lancet 2014;383(9913):257-266.
48. Schmucker C, Schell LK, Portalupi S, et al. Extent of non-publication in cohorts of studies
approved by research ethics committees or included in trial registries. PLoS One
2014;9(12):e114023.
49. Laterre PF and François B. Strengths and limitations of industry vs. academic randomized
controlled trials. Clin Microbiol Infect 2015;21(10):906-909.
50. Martin L, Hutchens M, Hawkins C, et al. How much do clinical trials cost? Nat Rev Drug
Discov 2017;16(6):381-382.
51. Ehlers MD. Lessons from a recovering academic. Cell 2016;165(5):1043-1048.
52. Novartis Position on Clinical Study Transparency – Clinical Study Registration, Results
Reporting and Data Sharing. Available from:
https://www.novartis.com/sites/www.novartis.com/files/clinical-trial-data-transparency.pdf;
accessed 09/01/2018.
53. Pfizer: Trial Data & Results. Available from:
https://www.pfizer.com/science/clinical-trials/trial-data-and-results; accessed 09/01/2018.
54. GSK: Data transparency. Available from:
https://www.gsk.com/en-gb/behind-the-science/innovation/data-transparency/; accessed
09/01/2018.
55. NIH Policy on the Dissemination of NIH-Funded Clinical Trial Information. Available from:
https://grants.nih.gov/grants/guide/notice-files/NOT-OD-16-149.html; accessed
09/01/2018.
56. Hopewell S, Loudon K, Clarke MJ, et al. Publication bias in clinical trials due to statistical
significance or direction of trial results. Cochrane Database Syst Rev 2009;1:MR000006.
57. Stern JM and Simes RJ. Publication bias: evidence of delayed publication in a cohort study of
clinical research projects. BMJ 1997;315(7109):640-645.
58. Laine C, Horton R, DeAngelis CD, et al. Clinical trial registration—looking back and moving
ahead. Lancet 2007;369(9577):1909-1911.
59. Viergever RF and Li K. Trends in global clinical trial registration: an analysis of numbers of
registered clinical trials in different parts of the world from 2004 to 2013. BMJ Open
2015;5(9):e008932.
Figure captions
Fig 1. Funding source distribution for all trials registered with clinicaltrials.gov. A: Trends in funding
source distribution for all pharmaceutical trials conducted between Jan 1, 1997 and July 19, 2017
(based on the start_date data field). B: Overall funding source distribution for all registered
pharmaceutical studies.
Fig 2. Trial design properties. Each radar chart illustrates the differences between three groups of
trials (represented by coloured polygons) with respect to fifteen trial protocol characteristics
(individual radial axes). Each axis shows the fraction of trials with a given property, such as reported
use of randomization, blinding, or data monitoring committees (DMC). A: Trial characteristics by
clinical phase. B: Characteristics of Phase 2 treatment-oriented trials for three representative
diseases. The profile of glioblastoma (with many small non-randomized studies) is typical of
oncology trials; see Suppl. File 2.
Fig 3. Phase 2 trial properties by funding source. Detailed statistics across the four funding
categories and the remaining clinical phases are available in Suppl. File 4.
Fig 4. Changes in clinical trial design over time. Plots compare the characteristics of all Phase 2 trials
(regardless of completion status) divided into four temporal subsets based on their starting year.
Additional results are available in Suppl. File 2.
Fig 5. Characteristics of trials whose results were published as a journal article or through a direct
submission to the clinicaltrials.gov results database. Trials published as an article are divided into
high- and low-impact classes based on the 2016 H-index of the corresponding journal; see Methods.
Amongst 945 trials published in high-impact journals (H-index > 400), 459 (48.6%) were funded by
Big Pharma, 178 by Other, 169 by Small Pharma, and 139 by NIH. Detailed statistics are available in
Suppl. File 3.
Fig 6. Trial results dissemination rates by trial funding category. A: Trial dissemination rates by
funder type: overall dissemination rate, journal publications, and submission of structured results to
clinicaltrials.gov. B: Time to first results dissemination (whether through submission to the
clinicaltrials.gov results database or publication of a journal article). The vertical line indicates the
12-month reporting deadline mandated by the FDAAA. C: Time to results reporting on
clinicaltrials.gov. D: Time to journal article publication.
Fig 7. Trial results dissemination rates and funding source distribution across seven medical
specialties. A: For each specialty, the plot shows the fraction of completed Phase 2-4 trials whose
results were publicly disseminated, either as a journal article or through direct submission to the
clinicaltrials.gov results database. B: Funding source distribution by medical specialty.