www.dreamchallenges.org
DREAM: What is it?
DIALOGUE FOR REVERSE ENGINEERING ASSESSMENT AND METHODS
• A crowdsourcing effort that poses questions (Challenges) about biology, modeling and data analysis:
– Transcriptional networks
– Signaling networks
– Predictions of response to perturbations
– Translational research
Benefits of crowd-sourcing
1. Performance Evaluation
– Unbiased, consistent, and rigorous method assessment
– Discover the best methods
– Determine the solvability of a scientific question
2. Sampling of the space of methods
– Understand the diversity of methodologies presently being used to solve a problem
Benefits of crowd-sourcing, cont’d
3. Acceleration of Research
– The community of participants can do in 4 months what would take any single group 10 years
4. Community Building
– Make high-quality, well-annotated data accessible.
– Foster community collaborations on fundamental research questions.
– Determine robust solutions through community consensus: “The Wisdom of the Crowds.”
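One way to picture the "Wisdom of the Crowds" idea is rank averaging: each team's predictions are converted to ranks, and the ranks are averaged across teams. The sketch below is illustrative only. The function names and the example scores are hypothetical, and the slides do not state which aggregation scheme any given Challenge actually used.

```python
from statistics import mean

def ranks(scores):
    """Return 1-based ranks (1 = highest score); tied scores share their average rank."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    result = [0.0] * len(scores)
    i = 0
    while i < len(order):
        # Extend j over the run of items tied with the item at position i
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result

def consensus_ranking(team_predictions):
    """Average each item's rank across teams; a lower mean rank means stronger consensus."""
    per_team = [ranks(p) for p in team_predictions]
    n_items = len(team_predictions[0])
    return [mean(r[i] for r in per_team) for i in range(n_items)]

# Hypothetical confidence scores from three teams for the same four candidate items
teams = [
    [0.9, 0.1, 0.5, 0.3],
    [0.8, 0.2, 0.4, 0.6],
    [0.7, 0.3, 0.9, 0.1],
]
consensus = consensus_ranking(teams)
best_item = min(range(len(consensus)), key=lambda i: consensus[i])
```

Averaging ranks rather than raw scores keeps a team with an unusual score scale from dominating the consensus.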
DREAM Challenges
Building communities of data experts since 2006
• Six Years of DREAM Challenge Seasons
– 34 DREAM Challenges opened
– More than 500 team submissions
– 1000 cumulative conference attendees
– 60 papers written using DREAM Challenges, two edited books and a Special paper in PLoS One
– Community email list includes > 7,000 participants
How Sage/DREAM Nurtures Challenge Communities
• Challenge webinars for live interaction between participants and organizers
• Community forums where participants can learn from each other
• Leaderboards on Synapse to motivate continuous participation
• Incentives to code-share: evolving models never before possible (machine learning + clinical insights)
• Annual DREAM Conference to celebrate and discuss Challenge outcomes
DREAM Challenge Leaderboard
Structure of a Challenge
Synapse and DREAM Challenges
• Cloud-based (Amazon)
• IRB-approved data repository
• Central hub for all DREAM Challenges
• Registration and messaging services
• Real-time Challenge leaderboards
• Provenance tools for data reproducibility
• Living archive of DREAM methods and winning source code
… beyond a data repository …
CASE STUDY: Breast Cancer Prognosis Challenge
Goal: use crowdsourcing to forge a computational model that accurately predicts breast cancer survival
• Training data set: genomic and clinical data from 2000 women with breast cancer
• Data access and analysis tools: Synapse
• Compute resources: each participant provided with a standardized virtual machine donated by Google
• Model scoring: models submitted to Synapse for scoring on a real-time leaderboard
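To make the scoring step concrete, here is a minimal sketch of how a survival model could be scored and placed on a leaderboard. It assumes the concordance index as the metric, which these slides do not name. The function name, the team names and all data values are hypothetical, and this is not Synapse's scoring code.

```python
def concordance_index(times, events, risk_scores):
    """Fraction of comparable patient pairs whose predicted risk order matches
    the observed survival order (0.5 is random, 1.0 is perfect)."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i has an observed event
            # (not censored) strictly before patient j's recorded time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0   # higher risk died earlier: correct order
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Hypothetical held-out data: follow-up time, event flag (1 = death observed)
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]

# Hypothetical risk predictions submitted by two teams
submissions = {
    "team_a": [0.9, 0.6, 0.7, 0.1],
    "team_b": [0.1, 0.2, 0.3, 0.4],
}

# Leaderboard: teams sorted by score, best first
scores = {team: concordance_index(times, events, risk)
          for team, risk in submissions.items()}
leaderboard = sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Because the score depends only on the ordering of predicted risks, teams can submit risk scores on any scale.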
Unique Attributes of the Sage Bionetworks/DREAM Breast Cancer Prognosis Challenge
Open source with code-sharing:
– Synapse’s computational infrastructure enables participants to use code submitted by others in their own model building
– Winning code must be reproducible
New dataset for validation of winning model:
– Derived from approx. 200 breast cancer samples
– Data generation funded by Avon
– Winning model: the one that, having been trained using Metabric data, is most accurate for survival prediction when applied to a brand new dataset
Challenge-assisted peer review:
– Overall winner submitted a pre-accepted article to Science Translational Medicine
DREAM Participation
[Chart: Number of Teams per DREAM season, DREAM 2 through DREAM 8.5+9; vertical axis 0 to 400]
2014: DREAM Challenge participation continues to increase
DREAM 8.5 + 9 Challenges, totals:
• Registered Users: 1,780
• Leaderboard: 11,459
• Forum Entries: 669
• Unique Submissions: 368
• Unique Teams: 159
2015 DREAM9.5 and 10 Challenges… So Far
WHAT WILL HAPPEN BEYOND 2015
Challenges with clinical impact
Ensemble methods that make use of best submissions to be tested in the clinic (grant under review)
Digital Mammography Challenge: Reduce the false negative rate in mammography screening
Modeling and simulation-based Challenges?
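For reference, the false negative rate targeted by the Digital Mammography Challenge is FN / (FN + TP): the share of true cancers that screening misses. A minimal sketch follows, with a hypothetical function name and made-up labels; the slides do not describe how the Challenge itself will be scored.

```python
def false_negative_rate(y_true, y_pred):
    """FN / (FN + TP): the fraction of actual positives predicted as negative."""
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    if fn + tp == 0:
        raise ValueError("no positive cases in y_true")
    return fn / (fn + tp)

# Hypothetical screening results: 1 = cancer present / flagged, 0 = absent / not flagged
y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0]
fnr = false_negative_rate(y_true, y_pred)
```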
Acknowledgements
Sage Bionetworks: Stephen Friend, Thea Norman, Andrew Trister, Lara Mangravite, Mike Kellen, Mette Peters, Arno Klein, Solly Sieberts, Abhi Pratap, Chris Bare, Bruce Hoff
IBM: Erhan Bilal, Kely Norel, Elise Blaese, Pablo Meyer Rojas, Kahn Rhrissorrakrai
EBI: Julio Saez Rodriguez, Thomas Cokelaer, Federica Eduati, Michael Menden
L. Maximilians University: Robert Kueffner
Univ Colorado, Denver: Jim Costello
OHSU: Joe Gray, Adam Margolin, Mehmet Gonen, Laura Heiser
Prize4Life: Melanie Leitner, Neta Zach
NCI: Dinah Singer, Dan Gallahan
ISMMS: Eli Stahl, Gaurav Pandey
Columbia University: Andrea Califano, Mukesh Bansal, Chuck Karan
Rice University: Amina Qutub, David Noren, Byron Long
MD Anderson: Steven Kornblau
Broad Institute: Bill Hahn, Barbara Weir, Aviad Tsherniak
Merck: Robert Plenge
BYU: Keoni Kauwe
OICR: Paul Boutros
UCSC: Josh Stuart