Office of Portfolio Analysis

Artificial intelligence/machine learning at the National Institutes of Health (AI/ML at the NIH)

Predicting translational progress in biomedical research

George Santangelo, Ph.D.
Ian Hutchins, Ph.D.
Office of Portfolio Analysis, NIH



Mission of the NIH Office of Portfolio Analysis (OPA)

• Consult & Collaborate
• Develop New Analytical Methods
• Build Tools
• Data Cleaning & Analysis
• Support data-driven decision-making
• Disseminate Best Practices: Classroom Training (Custom Classes Available), Online Training, Web Resources (Case Studies, FAQs), Office Hours, Symposia

OPA website: https://dpcpsi.nih.gov/opa/index

Support data-driven decision-making

• Enable NIH research administrators and decision-makers to evaluate and prioritize current and emerging areas of research that will advance scientific knowledge and improve human health

• Help ensure that the NIH research portfolio
  ― is balanced
  ― is free of unnecessary duplication
  ― takes advantage of collaborative, cross-cutting research
  ― stimulates the emergence of transformative ideas

AI/ML, Bibliometrics, Content analysis, Emerging areas


OPA developed the Relative Citation Ratio (RCR) metric to meet the need for a new and thoroughly validated way to measure the influence of any/all biomedical research papers

Public website to retrieve RCR data: iCite.od.nih.gov


OPA AI to track and predict the impact of NIH decision-making

• Detection of overlapping proposals submitted to different funders
    Hoppe et al. Science Advances 2019 5:eaaw7238

Track and parameterize:

• Influence using bibliometric data: the Relative Citation Ratio (RCR)
    Hutchins BI et al. PLoS Biology 2016 14:e1002541
    Hutchins BI et al. PLoS Biology 2017 15:e2003552
    Santangelo GM Mol. Biol. Cell 2017 28:1401-1408

• Translational progress / clinical trials (CTs) and tech transfer / patents: the triangle of biomedicine, APT scores
    Hutchins BI et al. PLoS Biology 2019 17(10):e3000416

• Development of drugs and devices: disambiguated drug and lead compound name, FDA data

• Rate of scientific progress and emergence

The publicly available OPA tool: iCite 2.0 (Influence module, Translation module, Open Citation Collection)
    Hutchins et al. PLOS Biology 2019 17:e3000385


Publicly available web tool: iCite 2.0

Track influential publications

Track and predict translational impact

The Open Citation Collection


The Translation module of iCite: tracking bench-to-bedside progress


[Figure: Development of cancer immunotherapy, 1996 to 2009. Panels: Citation pattern; iCite Translation. Labels: Fundamental Research, Translational Research, Clinical Research (human); Fundamental, Translational/Clinical. Color scale: article density, low to high.]


Total citations are a poor predictor of translational progress (citation by one or more clinical articles*)

Seminal articles leading to Nobel Prizes in Physiology or Medicine (2003 to 2015)

Topic | Prize year | Pub date | RCR | RCR %ile | Total cites | Cites by clinical articles | % by clinical articles
Treatment of parasitic diseases | 2015 | 1979 | 6.6 | 96.1 | 132 | 7 | 5.30%
In vitro fertilization | 2010 | 1978 | 34.4 | 99.8 | 680 | 33 | 4.85%
Helicobacter pylori & ulcers | 2005 | 1984 | 95.8 | 99.9 | 2285 | 100 | 4.38%
MRI | 2003 | 1984 | 14.0 | 99.1 | 232 | 7 | 3.02%
HPV/Cervical Cancer & HIV/AIDS | 2008 | 1983 | 102.4 | 99.9 | 2849 | 32 | 1.12%
Positioning system in the brain | 2014 | 1971 | 60.4 | 99.9 | 1773 | 14 | 0.79%
Olfactory biology | 2004 | 1991 | 60.2 | 99.9 | 1953 | 14 | 0.72%
Activation of the immune system | 2011 | 1998 | 113.4 | 99.9 | 3976 | 21 | 0.53%
Telomeres and telomerase | 2009 | 1978 | 10.9 | 98.5 | 407 | 2 | 0.49%
Reprogramming mature cells to pluripotency | 2012 | 2007 | 238.8 | 99.9 | 7890 | 17 | 0.22%
RNA interference | 2006 | 1998 | 163.5 | 99.9 | 6500 | 7 | 0.11%
Introducing specific gene modifications in mice | 2007 | 1987 | 30.0 | 99.8 | 1180 | 1 | 0.08%
Vesicle trafficking | 2013 | 1993 | 51.6 | 99.9 | 1776 | 1 | 0.06%

*a publication that describes a clinical trial or clinical guideline
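
The last column of the table can be recomputed from the two citation counts. The short Python sketch below does this for the rows above and checks that the most-cited article is not the one with the largest clinical share. The names `ROWS` and `clinical_share` are illustrative and are not part of iCite or any OPA tool.

```python
# Rows transcribed from the table above: (topic, total cites, cites by clinical articles).
ROWS = [
    ("Treatment of parasitic diseases", 132, 7),
    ("In vitro fertilization", 680, 33),
    ("Helicobacter pylori & ulcers", 2285, 100),
    ("MRI", 232, 7),
    ("HPV/Cervical Cancer & HIV/AIDS", 2849, 32),
    ("Positioning system in the brain", 1773, 14),
    ("Olfactory biology", 1953, 14),
    ("Activation of the immune system", 3976, 21),
    ("Telomeres and telomerase", 407, 2),
    ("Reprogramming mature cells to pluripotency", 7890, 17),
    ("RNA interference", 6500, 7),
    ("Introducing specific gene modifications in mice", 1180, 1),
    ("Vesicle trafficking", 1776, 1),
]


def clinical_share(total_cites, clinical_cites):
    """Percentage of all citations that come from clinical articles."""
    return 100.0 * clinical_cites / total_cites


# Clinical share per topic, keyed by topic name.
shares = {topic: clinical_share(total, clin) for topic, total, clin in ROWS}

# The article with the most citations overall ...
most_cited = max(ROWS, key=lambda row: row[1])[0]
# ... versus the article with the largest share of clinical citations.
highest_share = max(shares, key=shares.get)

for topic, total, clin in ROWS:
    print(f"{topic}: {shares[topic]:.2f}% of {total} citations are clinical")
```

In this table the two rankings disagree: `most_cited` and `highest_share` pick different articles, which is the point the slide title makes.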


Non-uniform probability of being cited by a clinical article

[Figure panels: Low, Mid, High]


Training a machine learning system to predict future translation


*Random Forests outperformed Support Vector Machines, Neural Networks, logistic regression, and other methods.


Validation of machine learning predictions of future translation

Accurate predictions can be made two years after publication


Machine learning outperforms experts rating the clinical impact of publications


Increase in APT score is responding to new information in the citing networks

Approximate Potential to Translate (APT) scores are stable but can change over time


“Genetic” analysis of machine learning predictions of future translation


Summary

• The Office of Portfolio Analysis at NIH develops tools and new methods, including AI/ML approaches, to improve data-driven decision-making

• Predicting translational progress in biomedical research has the potential to accelerate scientific advances that improve human health

• We built and validated an ML system that accurately determines, within two years post-publication, the likelihood that any given paper will be cited by a clinical article (a paper that describes a clinical trial or guideline)

• Making accurate predictions requires both the features of the paper in question and the features of the papers that cite it

• Analyzing citation patterns indicates that knowledge flows through different domains before percolating into the clinical arena
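
The point about needing features from both the paper and its citing papers can be illustrated with a small Python sketch. It is not the OPA feature set: the keys `human`, `animal`, and `molecular_cellular`, the `build_features` helper, and the choice of a simple mean over citing papers are all assumptions made for illustration.

```python
from statistics import mean

# Hypothetical per-paper descriptors, each a value between 0 and 1.
KEYS = ("human", "animal", "molecular_cellular")


def build_features(paper, citing_papers):
    """Combine a paper's own descriptors with averages over its citing papers.

    `paper` and every entry of `citing_papers` are dicts holding the
    hypothetical KEYS above. The result is one flat feature dict that a
    classifier could consume.
    """
    features = {f"own_{k}": paper[k] for k in KEYS}
    for k in KEYS:
        values = [citing[k] for citing in citing_papers]
        # With no citing papers yet, fall back to 0.0 for the averaged features.
        features[f"citing_mean_{k}"] = mean(values) if values else 0.0
    features["n_citing"] = len(citing_papers)
    return features


# Toy example: a bench-science paper cited by two more human-focused papers.
paper = {"human": 0.0, "animal": 0.5, "molecular_cellular": 0.5}
citing = [
    {"human": 1.0, "animal": 0.0, "molecular_cellular": 0.0},
    {"human": 0.5, "animal": 0.5, "molecular_cellular": 0.0},
]
features = build_features(paper, citing)
```

In this toy case the paper's own `human` value is 0, while the citing papers average 0.75, so the citing network carries a signal that the paper's own descriptors do not.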


Temporal dynamics of translation


Using fractional in place of binary counting of MeSH terms
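
A minimal Python sketch of the difference between the two counting schemes follows. It assumes each MeSH term on an article has already been assigned to one of three categories (Human, Animal, Molecular/Cellular); the category names, the helper functions, and the toy article are illustrative and are not taken from the OPA implementation.

```python
from collections import Counter

# Assumed category labels; each MeSH term is taken to be pre-assigned to one of them.
CATEGORIES = ("human", "animal", "molecular_cellular")


def binary_counts(term_categories):
    """Binary counting: 1.0 if any term falls in the category, else 0.0."""
    present = set(term_categories)
    return {c: 1.0 if c in present else 0.0 for c in CATEGORIES}


def fractional_counts(term_categories):
    """Fractional counting: share of the article's categorized terms per category."""
    counts = Counter(c for c in term_categories if c in CATEGORIES)
    total = sum(counts.values())
    if total == 0:
        return {c: 0.0 for c in CATEGORIES}
    return {c: counts[c] / total for c in CATEGORIES}


# Hypothetical article with three Human terms and one Animal term.
example = ["human", "human", "human", "animal"]

binary = binary_counts(example)          # human and animal both count as 1.0
fractional = fractional_counts(example)  # human 0.75, animal 0.25
```

Binary counting treats this article as equally Human and Animal, while fractional counting records that most of its terms are Human.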


Temporal dynamics of translation: low, mid, or high fraction of Human MeSH terms