Center for Data to Health EAB and All Hands Meeting May 9 through May 11, 2018 Mt. Washington Conference Center 5801 Smith Ave. Baltimore, MD 21209 410-735-7964



CD2H EAB and All Hands Meeting May 9 through May 11, 2018

Baltimore, Maryland

Table of Contents:

1. Logistics
2. Meeting Agenda
3. Roster of Participants
4. Workgroup Charters
5. Project Team Deliverable Descriptions


Logistics

Airport: Baltimore/Washington International Thurgood Marshall Airport (BWI)

Transportation From Airport: We recommend a BWI airport taxi, Uber or Lyft, which will take approximately 35 minutes.

Location: Mt. Washington Conference Center 5801 Smith Ave. Baltimore, MD 21209 410-735-7964

Parking: Complimentary

Directions and Map of Mt. Washington Conference Center:

● May 8th – EAB presentation and dinner are located in the Octagon building.
● May 9th – meeting registration and plenary sessions are located in the McAuley building.


CD2H EAB and All Hands Meeting, May 9-11, Baltimore, Maryland

May 8th

5:30 pm – 8:30 pm External Advisory Board Reception and Dinner with Talk from Melissa Haendel The Octagon

May 9th

7:30 am - 8:30 am Registration and Breakfast MacAuley Lobby

Plenary Session: Welcome to the CD2H

8:30 am - 8:35 am Welcome Chris Chute Pullen Plaza

8:35 am - 9:05 am Introduction Melissa Haendel

9:05 am - 9:35 am Overview of CD2H Sarah Biber

9:35 am - 9:55 am Four corners Ice-Breaker All

9:55 am - 10:10 am Break Chesapeake Galley

10:10 am - 10:30 am Showcase example - the power of DREAM challenges Justin Guinney Pullen Plaza

Plenary Session: Workgroup Visions - where we are and where we are headed

10:30 am - 10:45 am Data Workgroup Overview Chunlei Wu Pullen Plaza

10:45 am - 11:00 am Software Workgroup Overview Sean Mooney

11:00 am - 11:15 am Ontologies Workgroup Overview Peter Robinson

11:15 am - 11:30 am Education Workgroup Overview Shannon McWeeney

11:30 am - 11:45 am People Workgroup Overview David Eichmann

11:45 am - 12:00 pm Evaluation Workgroup Overview Kristi Holmes

Plenary Session: CD2H Lunch Talks - Initial Project Highlights

12:00 pm - 12:20 pm Lunch - Pick up lunch and return to Pullen Plaza for Project Highlight Talks Chesapeake Galley

12:20 pm - 12:25 pm HPO2LOINC Peter Robinson Pullen Plaza

12:25 pm - 12:30 pm Maturity model Adam Wilcox

12:30 pm - 12:35 pm Self service clinical data in the cloud demo Kari Stephens

12:35 pm - 12:40 pm Licensing Melissa Haendel

12:40 pm - 12:45 pm Strategic FHIR-based data harmonization Chris Chute

12:45 pm - 1:15 pm Finish lunch and announcements about breakouts


Thematic Breakout Sessions: Implementing the change à la CD2H

1:15 pm - 2:25 pm Theme: Data: Sharing and reuse

● Licensing and Policy to Promote Sharing and Reuse
  Related CD2H projects: The Data Licensing Initiative
  Facilitator: Sarah Biber. Panelist for May 10: John Wilbanks. Note taker: Connor Cook. Room: C-42
● Semantic data interoperability across the translational divide
  Related CD2H projects: LOINC2HPO
  Facilitator: Jennifer Sprecher. Panelist for May 10: Peter Robinson. Note taker: Tricia Francis. Room: C-31
● Creating a Healthy Metadata Ecosystem
  Related CD2H projects: Data Index
  Facilitator: Kari Stephens. Panelist for May 10: Chunlei Wu. Note taker: TBD. Room: C-33
● Clinical Data Models and Sharing Landscape
  Related CD2H projects: FHIR-based data harmonization; Aligning and strategizing data sharing networks within CTSA hubs
  Facilitator: Nicole Weiskopf. Panelist for May 10: Chris Chute. Note taker: TBD. Room: C-18

2:25 pm - 2:40 pm Break Central Break Area

2:40 pm - 3:50 pm Theme: Tools: Collaborative building, accessibility, and interoperability

● Supporting a collaborative environment for projects and data
  Related CD2H projects: Synapse and CIELO Connectivity
  Facilitator: Beth Johnson. Panelist for May 10: Philip Payne. Note taker: Connor Cook. Room: C-18
● Leveraging the Cloud for Translational Research
  Related CD2H projects: Cloud Demonstration of Data EDW/EHR Sharing
  Facilitator: Jennifer Sprecher. Panelist for May 10: Kari Stephens. Note taker: Tricia Francis. Room: C-31
● CD2H-DREAM Challenges
  Related CD2H projects: Framework for supporting CD2H-DREAM Challenges
  Facilitator: Sarah Biber. Panelist for May 10: Justin Guinney. Note taker: TBD. Room: C-33
● Visual analytics to gain actionable insights
  Related CD2H projects: Dashboard for Hub Level common metrics
  Facilitator: Robin Champieux. Panelist for May 10: Keith Herzog. Note taker: TBD. Room: C-42


3:50 pm - 4:50 pm Break Central Break Area

3:50 pm - 5:00 pm (70 min) Theme: Community engagement and culture change

● Finding people & leveraging expertise
  Related CD2H projects: CTSAsearch People Finder
  Facilitator: Kari Stephens. Panelist for May 10: David Eichmann. Note taker: TBD. Room: C-18
● Attribution: giving credit where credit is due
  Related CD2H projects: Contribution Role & Research Outputs Ontologies; Stakeholders & culture
  Facilitator: Jennifer Sprecher. Panelist for May 10: Kristi Holmes. Note taker: Connor Cook. Room: C-42
● Innovative training strategies
  Related CD2H projects: Biodata Club Starter Kit; Entrepreneurial training module
  Facilitator: Robin Champieux. Panelist for May 10: Shannon McWeeney. Note taker: Tricia Francis. Room: C-31
● Taking action to improve your organization’s ability to do open science
  Related CD2H projects: Maturity Model for governance, policy, and data
  Facilitator: Sarah Biber. Panelist for May 10: Adam Wilcox. Note taker: TBD. Room: C-33

6:00 pm – 7:00 pm Reception (Breakout topics for the afternoon of May 10 (2 - 5 pm) will be announced) South Dining Terrace

7:00 pm - Dinner - sign ups or freestyle


May 10th

Panel Report Backs from Breakout Sessions on May 9 - Key takeaways

8:00 am - 8:05 am Introduction Melissa Haendel Pullen Plaza

8:05 am - 8:15 am Overview of day two Sarah Biber

8:15 am - 8:45 am Panel: Data: sharing and reuse. Panelists: John Wilbanks, Peter Robinson, Chunlei Wu, Chris Chute. Moderator: Sarah Biber

8:45 am - 9:15 am Panel: Tools: Collaborative building, accessibility, and interoperability. Panelists: Philip Payne, Kari Stephens, Justin Guinney, Keith Herzog. Moderator: Sean Mooney

9:15 am - 9:45 am Panel: Community engagement and culture change. Panelists: David Eichmann, Kristi Holmes, Shannon McWeeney, Adam Wilcox. Moderator: John Wilbanks

9:45 am - 10:00 am Break Chesapeake Galley

Idea-to-Implementation Workshop - What should the first CD2H Dream Challenge be?

10:00 am - 10:45 am I2I Brainstorming Session Everyone - large group Pullen Plaza

10:45 am - 11:15 am Breakouts to Vet and Refine ideas Everyone - small groups

11:15 am - 12:30 pm I2I Pitches and Voting Everyone - large group

12:30 pm – 1:30 pm

● Lunch: Everyone but EAB, Pullen Plaza
● Working Lunch: Closed External Advisory Board meeting: EAB, C-18

1:30 pm - 2:00 pm

● EAB meets with CD2H Leadership for debriefing: PDs + site PIs, C-18
● Provide Input for the CD2H Style Board: Anyone interested, Pullen Plaza
● Free time, ad hoc meetings: C-31 & C-42 available

2:00 pm - 2:30 pm Put yourself in someone else’s shoes in the future - team building exercise Everyone


Attendees’ Choice CD2H Breakout Sessions*

2:30 pm - 3:15 pm

● FDA-NCATS-BRIDG harmonization effort and CD2H. Facilitator: Kari Stephens. Note taker: TBD. Room: C-18
● Guidebook to enable data reusability. Facilitator: TBD. Note taker: TBD. Room: C-31
● Educational Module delivery. Facilitator: Bill Hersh. Note taker: TBD. Room: C-33

3:15 pm - 4:00 pm

● NIH IT planning and infrastructure and website growth and implementation, CD2H labs instantiation, Seminar series platforms. Facilitator: Sean Mooney. Note taker: TBD. Room: C-18
● Strategies for a nimble CD2H. Facilitator: TBD. Note taker: TBD. Room: C-31

Wrap Up

4:00 pm - 5:00 pm Leadership Debrief PDs + Site PIs C-18

May 11th

Optional: Specific Workgroup Working Sessions

8 am - TBD Contact your workgroup lead for more details

*Other potential topics for attendees’ choice breakout sessions on May 10, 2:30 - 4:00 pm:

● Guidebook to enable data reusability
● Strategies for a nimble CD2H
● Design and execute program evaluation & dissemination for the CD2H
● Work facilitation and planning
● Seminar Series platform
● CD2H labs instantiation - how to implement them most effectively
● Evaluation and strategic management of projects
● Educational module delivery

Suggest topics here


CD2H All Hands Meeting May 9 - 10, 2018 Participant Roster

Organization | Name | Role
Johns Hopkins University | Blair Anton | CD2H
Johns Hopkins University | Toni Cheeks-Shaw | CD2H
Johns Hopkins University | Chris Chute | CD2H
Johns Hopkins University | Tricia Francis | CD2H
Johns Hopkins University | Young-Joo Lee | CD2H
Johns Hopkins University | Harold Lehmann | CD2H
Johns Hopkins University | Anne Seymour | CD2H
Johns Hopkins University | Richard Zhu | CD2H
Northwestern University | Matt Carson | CD2H
Northwestern University | Sara Gonzales | CD2H
Northwestern University | Piotr Hebal | CD2H
Northwestern University | Keith Herzog | CD2H
Northwestern University | Kristi Holmes | CD2H
Northwestern University | Austin Sharp | CD2H
Oregon Health and Science University | Sarah Biber | CD2H
Oregon Health and Science University | Robin Champieux | CD2H
Oregon Health and Science University | Connor Cook | CD2H
Oregon Health and Science University | David Dorr | CD2H
Oregon Health and Science University/Oregon State University | Melissa Haendel | CD2H
Oregon Health and Science University | Bill Hersh | CD2H
Oregon Health and Science University | Ted Laderas | CD2H
Oregon Health and Science University | Shannon McWeeney | CD2H
Oregon Health and Science University | Justin Ramsdill | CD2H
Oregon Health and Science University | Rose Relevo | CD2H
Oregon Health and Science University | Nicole G. Weiskopf | CD2H
Oregon Health and Science University | Beth Wilmot | CD2H
Oregon Health and Science University | Amy Yates | CD2H
Sage Bionetworks | James Eddy | CD2H
Sage Bionetworks | Justin Guinney | CD2H
Sage Bionetworks | John Wilbanks | CD2H
Scripps Research Institute | Ali Torkamani | CD2H
Scripps Research Institute | Chunlei Wu | CD2H
The Jackson Laboratory | Peter Robinson | CD2H
University of Iowa | Dave Eichmann | CD2H
University of Iowa | Chaoqun Ni | CD2H
University of Washington | Tim Bergquist | CD2H
University of Washington | Pascal Brandt | CD2H
University of Washington | Nic Dobbins | CD2H
University of Washington | Fred Dowd | CD2H
University of Washington | Sean Mooney | CD2H
University of Washington | Jason Morrison | CD2H
University of Washington | Justin Prosser | CD2H
University of Washington | Sicheng Song | CD2H
University of Washington | Jennifer Sprecher | CD2H
University of Washington | Kari Stephens | CD2H
University of Washington | Adam Wilcox | CD2H
Washington University in St. Louis / AcademyHealth | Beth Johnson | CD2H
Washington University in St. Louis | Thomas M. Maddox | CD2H
Washington University in St. Louis | Philip R.O. Payne | CD2H
University of Pittsburgh | Michael J. Becich | EAB
University of Virginia | Phil Bourne | EAB
Regenstrief Institute | Peter J. Embi | EAB
University of Pennsylvania | John Holmes | EAB
University of Colorado Denver | Larry Hunter | EAB
University of Minnesota | Genevieve Melton-Meaux | EAB
Duke University | Rachel Richesson | EAB
National Center for Advancing Translational Sciences | Kenneth Gersing | NCATS
National Center for Advancing Translational Sciences | Erica Rosemond | NCATS


CD2H Workgroup Charters

Data Interest Group Charter

Mission

The mission of the CD2H Data Interest Group (CD2H-Data) is to help data providers ensure that data is made FAIR-TLC across CTSA hubs for facilitating clinical and translational research. This mission will be manifest in activities such as:

● working closely with CTSA iDTF/ACT groups to build:
  ○ a consensus on shared data model and ontologies
  ○ a clinical data dictionary conformant with consensus specifications
● building a data inventory and API registry to support discovery and translational knowledge integration of basic science and omic resources for translational research
  ○ provide data modeling and dissemination of best practices
  ○ support and contribute to community standards such as the GA4GH, bioschemas, and the ontology community
● help fill the interoperability gaps between clinical data resources and translational knowledge sources to address critical translational questions
● support robust data sharing via technologies such as Synapse
● provide licensing and data use guidance in a data use agreement repository
● develop data quality assessment standards

To fulfill this mission, the CD2H-Data will be guided by the following principles:

● We will be open to all who are interested in participating and we will be inclusive to all standards and platforms
● We will help adopt single platforms and encourage harmonization between platforms
● We will serve both the clinical and translational data communities and will act as a bridge between them
● We will enhance data sharing by supporting all components of the FAIR-TLC guidelines
● We will encourage our activities to focus on new innovative data types, models, and standards all while encouraging the highest level of quality for all usable data

Year 1 Group Activities

During the first 12 months of activity we will focus on the following activities (more detail in timeline):

● Engaging CTSA community to inventory and assess clinical and basic science data resources and sharing readiness
● Help develop and apply data standards, common data elements, and data models (FHIR, OHDSI)
● Perform landscape analysis of infrastructure, training, and collaborative environments to determine how to increase ease and extent of data sharing
● Build initial catalog of FHIR objects across CTSAs
● Build API-level interoperability across resources in the context of the NCATS Data Translator, BioThings, Bio-Link, and other initiatives


● Initialize a data use agreement repository
● Expand and communicate community data standards efforts such as GA4GH, bioschemas.org, etc. within the CTSA community

Education and Learning Innovation (ELI) Interest Group Charter

Mission

The mission of the CD2H Education and Learning Innovation Workgroup (CD2H-ELI) is to stimulate the use of cutting-edge biomedical research informatics and data science education for CTSA Program researchers, in coordination with existing community efforts. This mission will be manifest in activities such as:

● Development of an Open, Modular, Dynamic Training Library that encompasses existing community efforts and provides a forum for assessing the value of these products.
● Assessment and harmonization of Informatics & Data Science Competencies, leveraging the existing efforts in this area by other groups (e.g., CTSA, AMIA, NLM/BD2K, ASA, ISCB, etc.)
● Provision of training opportunities in under-addressed areas such as interdisciplinary collaboration, dissemination, and entrepreneurial strategies
● Creation of a Mentoring Network for translational informatics

In pursuit of this mission and constituent activities, the CD2H-ELI will be guided by the following principles:

● Follow an open science approach to make best practices, training and educational materials and opportunities accessible to a diverse array of learners, thereby improving reproducibility, rigor, and efficiency.
● Increase the relevance, effectiveness, and accessibility of training and education in translational informatics.
● Support the development of responsive environments and platforms that engage learners.
● Interact with and include other partners and community efforts to extend our knowledge base and jointly explore effective approaches.

Year 1 Group Activities

During the first 12 months of activity of the CD2H-ELI, we will focus on the satisfaction of the following milestones:

1. Aggregation and Harmonization (1st pass) of existing Data Science Competencies
2. Landscape assessment of existing data science and informatics educational materials
3. Development of the initial aggregation of these resources (known as the Collaboratory Curriculum Library (Version 0.1))
4. Assessment of existing mentoring networks and recruitment
5. Initial Gap assessment based on the harmonized competencies with available materials in the Collaboratory library.


6. Prototype using a DREAM Challenge as an interactive learning platform for data science (Version 0.1)

Engagement Interest Group Charter

Mission

The mission of the CD2H Engagement Workgroup (CD2H-EWG) is to engage key stakeholders in the broader CTSA ecosystem in specific CD2H projects, both by working to define projects that connect to CTSA goals and by working to increase awareness of those projects in the CTSA network.

Our thesis is that engagement can go one of two ways: as a form of marketing, or as a form of project engagement. Our experience at Sage is that marketing fails unless the project itself is the marketing. Thus the first and most important activity of the WG is proposed to be shaping pilots and projects that have true demand on the user side. Broad engagement often fails in science, whereas targeted engagement of the form “We need persons X, Y, and Z from institutions A, B, and C in project N” works extremely well, both to recruit the correct people and to attain critical recruiting mass once the project is showing value.

This mission will be manifest in activities such as:

● Pilot and project definition and shaping, particularly in connection with the Data Workgroup
● Identification of and outreach to recruit specific individuals and key stakeholders on the CTSA informatics side who might be interested in specific pilots and projects

In pursuit of this mission and constituent activities, the CD2H-EWG will be guided by the following principles:

● All pilots and projects should address a specific pain point, but solve that pain point with the most generalizable solution that addresses the specifics
● All pilots and projects start small to keep the early days manageable - and to create a “pull” for individuals to join and help shape them - “there’s only two seats left” is a powerful engagement tool
● All pilots and projects are “joinable” once they start to demonstrate value, which allows them to scale and engagement to begin self-propagating

Year 1 Group Activities

During the first 12 months of activity of the CD2H-EWG, we will focus on the satisfaction of the following milestones:

1. Work closely with other working groups to help select key projects and pilots that meet the requirements
2. Identify and individually recruit key participants to projects
3. Leverage workshops, panels, and more (e.g., AMIA) to promote the pilots and projects and identify potential members
4. Provide robust communications to the community regarding CD2H activities
5. Support public-private partnerships in pilots and other product development


6. Support dissemination and education activities of project outcomes to new stakeholders

Evaluation

KPIs for this working group are complex, as the group’s impact in the early months of the project will be directly tied to other WGs.

Successful application of heuristics to project selection: From our work at Sage Bionetworks, we know a certain set of parameters that can correlate with collaborative work: a small set of scientists and technologies (ideally meeting the “two pizza” rule for feeding the whole group in the early days), a very specific scientific problem to be solved, a wide space of generalizable solutions (process, culture, technology, scientific method), and low friction from a governance perspective. When these factors are combined, the project itself becomes the primary factor of engagement to the selected audience and, if successful, over time acts as marketing for the project and CD2H generally. Applying these heuristics faithfully - but not so faithfully as to miss the individual characteristics of a single project - will be essential. KPIs should look at group size and problem scope but also might look to surveys of group members.

Successful recruitment of participants to projects: Rather than attempt broad-scale engagement, an essential KPI for the group will be the quantity and quality of the specific individuals recruited to projects. Quantity should be appropriate given the heuristics and the project, and quality will be quite subjective. A side benefit of this approach should be to create at least some measure of urgency to join - if there is a limited number of seats available, and the project is desirable enough, competition to join will be a strong indicator of success.

Evaluation & Analytics Working Group Charter

Mission

The mission of the CD2H Evaluation & Analytics Interest Group is twofold: Evaluate and perform Continuous Quality Improvement (CQI) of CD2H and Develop Opportunities for CTSA Analytics & Data. We will accomplish this through a pragmatic and collaborative approach, operationalized workflows, common tools and processes, and an enhanced culture of data-driven continuous improvement across the program. We will leverage existing tools and methods [1-4] while documenting and disseminating our own processes and lessons learned.

Evaluate and perform CQI of CD2H: carry out continuous process- and outcomes-based evaluation across all program activities using proven data-driven approaches to track and monitor progress and impact of project activities.

Develop Opportunities for CTSA Analytics & Data: build and support several tools, workflows, and processes to support more robust, data-driven evaluation for CTSA Program hubs and collaboratively develop tools to empower better accountability and review across the CTSA Program.

● Expand the informatics evaluation capacity of the CTSA Consortium.
● Develop a modular, dynamic evaluation library.
● Support consortium-wide peer review.


● Build a framework for collaborative innovation.
● Create tools and processes to support accountability, attribution, dissemination, and discoverability.

This mission will be guided by a commitment to the following principles:

● All program activities will be carried out as open and collaborative processes, be driven by evidence, and support routine continuous quality improvement processes
● We will leverage best practices where available for team science, program evaluation, and innovation, and document and share with the community the best practices that we learn through our own work.
● We will support the development, testing, and deployment of operationalized approaches across all program areas.
● We will share the Evaluation & Analytics Interest Group outputs and lessons learned openly, leveraging best practices for documentation, dissemination, implementation, and engagement across CTSA Program hubs and beyond.

The CD2H will use Results-Based Accountability (RBA) to identify meaningful performance measures and monitor performance for the program. The RBA framework is structured to be a simple, common sense process that everyone can understand. Using the RBA framework can help groups surface and challenge assumptions that can be barriers to innovation, build collaboration and consensus, and leverage data and transparency to ensure accountability for performance. The RBA framework is currently being leveraged by the Clinical and Translational Science Awards Program at NIH as part of their Common Metrics initiative at over 50 institutions. This, combined with the collaborative, stakeholder-engaged and outcomes-oriented approach of RBA, led CD2H to use this framework to monitor performance of the program and partnerships. CD2H program operations and interest groups are assessed against their own goals and objectives, based on quantitative and qualitative outputs, outcomes, and impacts.

Year 1 Group Activities

During the first 12 months of activity of the CD2H Evaluation & Analytics IG, we will focus on the following milestones:

Develop Opportunities for CTSA Analytics & Data: build and support several tools, workflows, and processes to support more robust, data-driven evaluation for CTSA Program hubs and collaboratively develop tools to empower better accountability and review across the CTSA Program.

1. Expand the informatics evaluation capacity of the CTSA Consortium.
2. Complete an inventory of tools and methods for data analysis and reporting, disseminate to the CTSA Consortium
3. Begin development of dashboards to enable hubs to visualize key metrics from CTSA Common Metrics and for routine performance metrics to drive strategic management
4. Support consortium-wide peer review. Begin development of NUCATS’ Competitions software to support competitions nationwide. Competitions will be enhanced and expanded with a range of features (multi-site authentication, bidding, multi-level review, etc.), providing a Consortium-level merit-based review tool for projects and events; it will be used to decide the I2I Pipeline projects and all other internal competitions on the project.


5. Begin to develop framework for Collaborative Innovation. Complete landscape analysis and literature search of topic domain area.
6. Create tools and processes to support accountability, attribution, dissemination, and discoverability.
7. Develop prototype data index and repository.
8. Landscape analysis of research outputs and contribution roles in the literature and specifications for commercial and open source research information systems in preparation for development of the Contribution Roles Ontology (CRO) and Research Outputs Ontology (ROO) data models in Year 2.
9. Develop a model to optimize informatics and data dissemination.

Evaluate and perform CQI of CD2H: carry out continuous process- and outcomes-based evaluation across all program activities using proven data-driven approaches to track and monitor progress and impact of project activities.

1. Establish the two prongs (Program evaluation, analytics) of CD2H Evaluation & Analytics Interest Group, set up calls, set meeting times, and determine priorities.
2. Convene members for a face-to-face CD2H Evaluation & Analytics Interest Group kick-off meeting in Chicago.
3. Work with each Interest Group and with the program leadership team to complete a streamlined RBA exercise to identify program priorities and performance indicators.
4. Identify performance measures to monitor progress/success of the CD2H Program.
5. Through conversations with each Interest Group, the program leadership team, and other stakeholders, identify any evaluation- or dissemination-related dependencies.
6. Develop strategy to support accountability to NIH, CTSA Program hubs, and the broader community. Targeted reporting to stakeholders will be accomplished through openly-available dashboards and monthly scorecards to communicate progress on Aims, team achievements, events, etc.
7. Make an inventory of tools and methods available to the CD2H to support program evaluation. Note tools and methods needed to complete program review (any surveys, visualizations, accountability or analysis tools).
8. Establish tools and processes for reporting to stakeholders (i.e., open dashboards and quarterly scorecards to communicate progress on Aims, team achievements, and opportunities for engagement such as events and collaborative calls).
9. Make note of any best practices or lessons learned and share these broadly.

References

1. Kathryn E. R. Graham, Heidi L. Chorzempa, Pamela A. Valentine, Jacques Magnan; Evaluating health research impact: Development and implementation of the Alberta Innovates – Health Solutions impact framework, Research Evaluation, Volume 21, Issue 5, 1 December 2012, Pages 354–367, https://doi.org/10.1093/reseval/rvs027

2. Centers for Disease Control and Prevention. Framework for program evaluation in public health. MMWR 1999;48(No.RR-11):1-42. https://www.cdc.gov/eval/materials/frameworksummary.pdf

3. Trying Hard is Not Good Enough: How to Produce Measurable Improvements for Customers and Communities. Victoria, B.C., Canada: Trafford Press, 2005.

4. Fogarty International Center, Division of International Science Policy, Planning and Evaluation (DISPPE), NIH. Framework for Evaluation. Updated September 2016. Available: https://www.fic.nih.gov/About/Staff/Policy-Planning-Evaluation/Pages/evaluation-framework.aspx


Lifespan Interest Group Charter

Mission

The mission of the CD2H Lifespan Interest Group is to leverage the CD2H and the CTSA Network to enhance research, using biomedical informatics, into how health and healthcare depend on the age and lifespan of the patient. Much progress has been made on understanding how treatments should differ based on patient age and healthspan both clinically and molecularly. The focus of the group is to build ‘Idea 2 Implementation’ demonstration projects that leverage both CD2H and CTSA resources in biomedical informatics. This mission will be manifest in activities such as:

● Enabling collaboration between interest groups for the purpose of enhancing research in lifespan transitions, basic research in aging and informatics.

● Acting as a ‘Driving Biology/Medicine’ project for the working groups, as a resource for application of outputs.

● Enabling both I2I and DREAM Challenges in the Lifespan domain.

This will be guided by the following principles:

● Bridges CD2H Interest Groups and CTSA DTFs

● Enhances both lifespan and biomedical informatics

● Develops and supports work efforts related to I2Is and other projects that will have direct measurable benefits to CTSAs

● Open to all, including membership of stakeholders who fall outside of traditional CTSA boundaries

Year 1 Group Activities

During the first 12 months of activity of the CD2H Lifespan IG, we will focus on the satisfaction of the following milestones:

1. Engage the CTSA community and the community outside of the network in the domain, and convene a ‘thinktank’ of leaders in the area to propose and prioritize high-priority demonstration projects

2. Identify resources required for specific projects as well as timelines and milestones

3. Begin implementation

4. Engage stakeholders outside of the IG including other CD2H IGs, relevant DTFs and stakeholders outside of the CTSA community.

Ontologies & Standards Interest Group Charter

Mission

The mission of the CD2H Ontologies Interest Group (CD2H-Ontologies) is to establish and maintain a robust community-of-practice, spanning CTSA hubs, that will enhance the development and use of standard ontologies capable of supporting the full spectrum of clinical and translational research. This mission will be manifest in activities such as:


● Convening the community of ontologists, users, and other stakeholders to build robust and comprehensive descriptions of important domains useful for research on human health

● Identify standard ontologies and champion the harmonization of ontologies where semantic concepts overlap

● Assist in the connection of research stakeholders with ontologies for the purposes of retrospective data harmonization with ontologies and prospective data collection using the most appropriate ontologies

● Utilize ontologies for indexing of educational materials, data, software, and data use agreements and licensing to maximize search and discovery

● Promote a broader and deeper connection to the value of ontologies in research

In pursuit of this mission and constituent activities, the CD2H-Ontologies will be guided by the following principles:

● We are inclusive to all stakeholders and promote specific standards sparingly when there is not a clear choice

● We facilitate the construction of new and innovative terminologies and will support enhancements to established standards

● We are focused on the standard, the technology to use the standard, and the application of the standard to harmonize data, to support search and discovery, and semantic analytics

● We aim to contribute to and develop practical, application-based ontological strategies

Year 1 Group Activities

During the first 12 months of activity of the CD2H-Ontologies, we will focus on the following activities:

1. Convene stakeholders for each WG and I2I project to identify needed ontology components. Landscape/requirements analysis to identify key ontology-related efforts and opportunities across the Interest Groups:

a. Specification, development, and implementation of the Contribution Roles Ontology (CRO) and Research Outputs Ontology (ROO) data models, engaging the community for input and feedback

b. Extension of ontologies for indexing educational materials, such as from the N-lighten project

c. Development of a Medical Action Ontology for treatments related to rare disease

2. Evaluation of existing data use agreement ontologies (GA4GH, RDA, etc.) and participation within these groups

3. Prototype Java library & framework for transforming LOINC-encoded data into ontology (HPO)

4. Begin facilitation of the expansion of ontologies to improve and increase impact of the use of biomedical ontologies in basic and translational research
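The LOINC-to-HPO transformation named in item 3 can be pictured as a table lookup. The charter plans a Java library; the Python sketch below only illustrates the general idea and is not that library. It assumes a mapping keyed on a LOINC code plus an interpretation flag (low, normal, high). The names `LOINC_TO_HPO` and `lab_to_hpo` and the flag values are hypothetical, and the single mapping shown (serum potassium) should be verified against current LOINC and HPO releases before any real use.

```python
from typing import Optional, Tuple

# Hypothetical mapping table: (LOINC code, interpretation) -> (HPO id, label).
# Only one lab test is shown; a real table would be curated by ontologists.
LOINC_TO_HPO = {
    ("2823-3", "H"): ("HP:0002153", "Hyperkalemia"),
    ("2823-3", "L"): ("HP:0002900", "Hypokalemia"),
}


def lab_to_hpo(loinc_code: str, interpretation: str) -> Optional[Tuple[str, str]]:
    """Return the (HPO id, label) for an abnormal lab result, or None.

    `interpretation` is assumed to be "L" (low), "N" (normal) or "H" (high).
    Normal results and unmapped tests yield None rather than a guess.
    """
    return LOINC_TO_HPO.get((loinc_code, interpretation.upper()))


if __name__ == "__main__":
    print(lab_to_hpo("2823-3", "H"))
    print(lab_to_hpo("2823-3", "N"))
```

Returning `None` for anything unmapped keeps the sketch conservative: a result only becomes a phenotype term when a curated mapping exists.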


People, Expertise and Attribution Interest Group

Mission

The mission of the CD2H People, Expertise, and Attribution Workgroup (CD2H-PEA) is to support discovery of the rich landscape of expertise available within the CTSAs and beyond, and to provide a robust infrastructure for supporting attribution of the diverse types of contributions needed to perform translational team science. To achieve this, we will leverage effective strategies and inventive approaches to build connections within and beyond the CTSA Consortium. We will adapt and expand our existing research profiling infrastructure using open collaborative approaches. We will develop tools to identify, track, disseminate, and understand the contribution and impact of software, data, informatics, and other non-traditional scholarly products and activities to properly attribute credit. Finally, this extended knowledge about expertise across the CTSAs will be applied to assist in the creation and success of community-wide collaborative functions. This mission will be manifest in activities such as:

● Extending representation of expertise and related services across the CTSA consortium;

● Extending institutional adoption of research profiling platforms in a platform-agnostic manner;

● Developing a practical, scalable model of contribution and fostering its adoption across the CTSA consortium;

● Provisioning a CTSA-wide UBER index of all resource types utilized across the full spectrum of CD2H activities;

● Creation of expertise visualizations and services for adoption and assimilation by CTSA hubs for use in their local environments.

In pursuit of this mission and constituent activities, the CD2H-PEA will be guided by the following principles:

● Our top priority will be the design and delivery of efficient and accessible systems-level solutions that capture the full range of CTSA capabilities in a single searchable discovery framework that is relevant to the broadest possible clinical and translational science environment;

● We will collaborate broadly, working together with a diverse range of partners and stakeholders to execute our mission in a transparent manner;

● We will always seek to adhere to and champion FAIR-TLC guidelines and frameworks;

● We will strive to strategize, engineer and implement tools that make the CTSA network more efficient, impactful and innovative;

● We will constantly seek input and feedback from the CTSA community as well as provide it back; and

● We will create a culture of continuous improvement and transparency.

Year 1 Group Activities

During the first 12 months of activity of the CD2H-PEA IG, we will focus on the following milestones:


1. Refactoring of the existing CTSAsearch architecture in preparation for extension of the data model and incremental distribution of components into local CTSA hub environments.

2. Development and deployment of first versions of the CD2H-PEA query and services APIs.

3. Integration of additional research profiling systems into the CTSAsearch framework, in particular the multiple Elsevier Pure sites for which we currently do not have harvesting credentials.

4. Integration of additional external identity authorities, particularly GRID and ORCiD.

5. Development of a GitHub metadata harvester and integration of that metadata (people, organizations, repositories) into the CTSAsearch framework.

6. Establishment of a development instance of a VIVO-compatible research profiling system for eventual use by CTSA consortium personnel with no local profiling system.

7. The initiation of a broader environmental scan across and between CTSA hubs to identify prevailing best practices for the sharing of software, tools, and algorithms, as well as any collaborations conducted in this context that extend beyond the immediate CTSA hubs (e.g., with other research organizations and/or industry). This environmental scan will serve as the basis for a follow-on gap analysis and planning process as it relates to creating a knowledge-base concerned with such topics. (Note that this task is in common with the CD2H-STA.)

8. Begin development of the Contribution Roles Ontology (CRO) and Research Outputs Ontology (ROO) data models, engaging the community for input and feedback (in the Evaluation & Analytics IG).
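The GitHub metadata harvester in milestone 5 could start from the public GitHub REST API. The sketch below is a minimal illustration, not the CTSAsearch implementation: it reads the first page of an organization's public repositories and reduces each repository object to a few fields covering the repository and its owner. The function names and the choice of fields are assumptions; a working harvester would also need authentication, pagination, and rate-limit handling.

```python
import json
import urllib.request

GITHUB_API = "https://api.github.com"


def fetch_org_repos(org: str) -> list:
    """Fetch the first page of public repositories for a GitHub organization."""
    request = urllib.request.Request(
        f"{GITHUB_API}/orgs/{org}/repos",
        headers={"Accept": "application/vnd.github+json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def to_records(repos: list) -> list:
    """Reduce GitHub repository objects to the fields a profile index might keep."""
    records = []
    for repo in repos:
        owner = repo.get("owner") or {}
        records.append({
            "repository": repo.get("full_name"),
            "url": repo.get("html_url"),
            "owner": owner.get("login"),
            "owner_type": owner.get("type"),  # "User" or "Organization"
            "language": repo.get("language"),
        })
    return records
```

Keeping the network call (`fetch_org_repos`) separate from the transformation (`to_records`) lets the mapping be tested offline against saved API responses.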

Rare Disease Interest Group Charter

Mission

Rare diseases are individually rare but collectively common, estimated to affect 5-8% of the population. Many people with rare disease wait years to get the correct diagnosis, and rare diseases are overrepresented in hospital patient populations but are not well characterized in hospital IT systems. Overall, CD2H will contribute to a socio-technical framework, collaborate with the CTSAs and rare disease communities worldwide, and build software that will organize the information available in EHR systems and enable analytics, sharing, and integration with genomic data as appropriate. Our approaches will be inclusive of patient participation, basic research science, and drug repurposing and discovery, as well as clinical interpretation and care and clinician education. Our work will also support translational collaborations such as providing information about specialists and special treatments available within the CTSA network, and local experts for functional validation of candidate disease-causing variants found in patients using biomedical informatics as an enabling platform.

In pursuit of this mission and constituent activities, the CD2H-RARE will be guided by the following principles:


● That all patients deserve the best quality of care possible, and that informatics can help improve rare disease diagnosis, prognosis determination, and selection of treatments

● We will collaborate broadly, working together with a diverse range of partners and stakeholders to execute our mission in a transparent manner;

● We will always seek to maintain patient privacy

● We will facilitate empowerment of open patient data sharing

● We will strive to engineer and implement tools that make the CTSA network more efficient at identifying and caring for rare disease patients

● We will constantly seek input and feedback from the CTSA community as well as provide it back

● We will create a culture of continuous improvement and transparency

Year 1 Group Activities

Definition of the first Idea-to-Implementation (I2I) projects (note that the intention is to kick-start these with the community; we do not intend to define the I2I projects solely by ourselves). The activities below fall mostly in Year 1. Note that no DREAM challenge has yet been defined.

Rare Disease I2I-A. N-of-1 patient matchmaking. Here we aim to support increased n-of-1 patient matchmaking across the CTSAs. This will be achieved by taking a “phenosnapshot” of a patient with suspected rare disease and sending it to other relevant software.

I2I-A Phase I:

● Find expert clinicians and diagnosticians at each CTSA (Engagement IG). Provide training (Education IG) to local clinicians and clinical informaticians on how to perform the deep phenotyping and share cases.

● Set up a matchbox instance for connecting n-of-1 patients to the GA4GH Matchmaker Exchange at as many CTSAs as possible (Software IG). At first, entry of the phenotype terms for matching and the candidate variants will be manual. A second phase will include tools to extract phenotypic information directly from the EHR (see below).

I2I-A Phase II:

● Implement GA4GH phenopackets in the matchmaking tools (Ontologies IG)

● Consent patients for open data sharing that they can achieve themselves, which enables their cases to be included in open global efforts to support matchmaking in the context of diagnostic tools and labs, the literature, patient communities, registries, etc. (Engagement IG)

I2I-A Phase III:

● Provide a backend for more robust data sharing of additional contexts across sites using the Synapse platform (Software and Data IGs). This would include additional clinical notes, lab results, imaging documents, etc.


● The directory structure and access permissions would support patient and clinician controlled access to fully open content such as a phenopacket, or more strictly controlled access to additional clinical or personal materials. (Data and Software IGs)

I2I-A Phase IV:

● Match clinicians with n-of-1 or rare disease patients against local collaborators for functional validation studies. This leverages both research profiling data and literature, filtered to local institution/region (People IG).
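One simple way to picture phenotype-based matchmaking is to compare the sets of HPO terms recorded in two phenosnapshots. The sketch below uses Jaccard similarity, chosen here purely for illustration. It is not the algorithm used by matchbox or the Matchmaker Exchange, and the function names, case identifiers, and example term sets are hypothetical.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two term sets: |a & b| / |a | b| (0.0 if both empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_matches(query: set, candidates: dict) -> list:
    """Rank candidate cases by phenotype overlap with the query, best first."""
    scored = [(case_id, jaccard(query, terms)) for case_id, terms in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical phenosnapshots expressed as sets of HPO identifiers.
    query = {"HP:0001250", "HP:0001263", "HP:0001252"}
    candidates = {
        "case-A": {"HP:0001250", "HP:0001263"},
        "case-B": {"HP:0002153"},
    }
    for case_id, score in rank_matches(query, candidates):
        print(case_id, round(score, 2))
```

A set-overlap score ignores the ontology hierarchy, so two closely related terms count as a mismatch; production matchmaking tools typically use measures that account for term relationships.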

Rare Disease I2I-B. Rare disease EHR data extraction. Here we aim to introduce a software framework for capturing data relevant to rare disease directly from the EHR. These data can be used for additional analytics or to populate data sharing efforts as described in I2I-A.

I2I-B Phase I

● We will utilize a FHIR interface and provide a SMART on FHIR tool that will allow clinicians to make a “snapshot” of their patients that will include Human Phenotype Ontology terms. (Software IG)

● The software will contain two modules: a) a LOINC to HPO module (Ontology IG); and b) a text mining module that will extract HPO terms from clinical notes as well as radiology and pathology reports. The SMART on FHIR app will provide a summary of the data mining and allow clinicians to “vet” the results using checkboxes, etc. (Software IG)

● An additional software component (i.e., non SMART) will be developed for use in software pipelines and EHR systems directly (e.g., if the software concludes that a child in a pediatrics clinic has more than an X% probability of having an unrecognized rare disease, it will emit a warning). (Software IG)
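The text mining module and the threshold warning described above can be sketched in a few lines. This is a toy illustration under stated assumptions, not the planned software: the phrase dictionary `PHRASE_TO_HPO` is hypothetical and tiny, matching is plain keyword search with no negation handling, the probability is assumed to come from some upstream model that is not shown, and the threshold is a parameter because the charter leaves X unspecified.

```python
import re
from typing import Optional

# Hypothetical phrase dictionary; a real module would draw on the full HPO
# vocabulary and handle negation, synonyms and context.
PHRASE_TO_HPO = {
    "seizure": "HP:0001250",
    "global developmental delay": "HP:0001263",
    "hypotonia": "HP:0001252",
}


def extract_hpo_terms(note: str) -> list:
    """Return candidate HPO ids found in a note, each flagged as not yet vetted."""
    text = note.lower()
    found = []
    for phrase, hpo_id in PHRASE_TO_HPO.items():
        # Match the phrase as a whole word, allowing a simple plural "s".
        if re.search(r"\b" + re.escape(phrase) + r"s?\b", text):
            found.append({"hpo_id": hpo_id, "phrase": phrase, "vetted": False})
    return found


def rare_disease_warning(probability: float, threshold: float) -> Optional[str]:
    """Return a warning string when the probability exceeds the chosen threshold."""
    if probability > threshold:
        return f"Possible unrecognized rare disease (p={probability:.2f})"
    return None
```

The `vetted` flag mirrors the clinician review step: extracted terms start unconfirmed and would only be marked as vetted after a clinician ticks the corresponding checkbox.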

Rare Disease I2I-C. Rare disease treatment knowledge. Here we aim to create a rare disease medical action ontology to help clinicians and patients understand the landscape of treatments that currently exist. Such an ontology will also support analytics that aim to generate candidate drugs for repurposing.

I2I-C Phase I

● Inventory and extract GeneReviews treatments for rare diseases

● Inventory and extract GARD treatments for rare diseases

● Inventory and extract Orphanet treatments for rare diseases

● Create a data model that will support informatics applications as per above (Ontology IG)

See this document for granular milestone planning.


Software, Tools, and Algorithms Interest Group Charter

Mission

The mission of the CD2H Software, Tools and Algorithms Interest Group (CD2H-STA) is to establish and maintain a robust community-of-practice, spanning CTSA hubs, that will predispose and enable the sharing, discovery, adoption, and adaptation of software, tools, and algorithms capable of supporting the full spectrum of clinical and translational research. This mission will be manifest in activities such as:

● Maintaining an inventory of, and metadata for, the software, tools, and algorithms that are of interest to and/or being developed and utilized at CTSA hubs, and ensuring that they adhere to FAIR-TLC principles;

● Developing best practices and standards that maximize efficiency, minimize duplication, and maximize adherence to the principles of the open software community;

● Creating strategies surrounding CD2H vendor engagement and hub technology transfer in a way that encourages partnership and collaboration, manages conflicts of interest, and follows all applicable policy and law; and

● Supporting the CTSA network with the capacity to address emergent and important challenges and tasks surrounding software, tools and algorithms.

In pursuit of this mission and constituent activities, the CD2H-STA will be guided by the following principles:

● Our top priority will be the design and delivery of efficient and accessible systems-level solutions that predispose and enable the discovery and sharing of software, tools, and algorithms that are relevant to the broadest possible clinical and translational science environment;

● We will always seek to adhere to and champion FAIR-TLC guidelines and frameworks;

● We will strive to strategize, engineer and implement tools that make the CTSA network more efficient, impactful and innovative;

● We will constantly seek input and feedback from the CTSA community as well as provide it back;

● We will create a culture of continuous measurement, improvement, and transparency

Year 1 Group Activities

During the first 12 months of activity of the CD2H-STA, we will focus on the satisfaction of the following milestones:

1. Identification and engagement of key stakeholders in the CTSA-affiliated community to inform the creation of virtual and physical communities-of-practice focused on the sharing of software, tools, and algorithms;

2. Development and deployment of a beta version of the CIELO data and tool sharing platform for use by CTSA hubs;

3. The development of use cases for CIELO, in conjunction with Sage Synapse where applicable, to enable collaborative activities that engage CTSA hubs in the design, adoption, and adaptation of software, tools, and/or algorithms of common interest to our community-of-practice.

4. The initiation of a broader environmental scan across and between CTSA hubs to identify prevailing best practices for the sharing of software, tools, and algorithms, as well as any collaborations conducted in this context that extend beyond the immediate CTSA hubs.

5. Engage our community-of-practice to better understand priorities and methods that can improve the usability, efficiency, innovation and features of collaboratively developed software, tools, and algorithms.

Key Performance Indicators

All of the preceding activities will be conducted in a manner that adheres to and supports an integrative view of the implementation, processes, and outcomes associated with the CD2H program, as illustrated in Figure 1. Specific implementation methods to be informed and applied in light of the output of this evaluative framework include:

1. Continuous evaluation and process/outcome improvement: All programmatic activities will be designed, conducted and managed in a manner that emphasizes comprehensive and systematic quantitative and qualitative “instrumentation” of technologies and processes, coupled with integrative analytical platforms (including the use of both Google Analytics for all web-facing applications, and the Pentaho open-source reporting workbench). Such evaluation and process/outcome improvement will be tightly aligned with the overall CD2H evaluation program, and be responsive to both the CD2H leadership and External Advisory Board (EAB). A CD2H-CTSA program leadership team consisting of the committee chairs will conduct day-to-day oversight of the CD2H-STA with regards to such evaluation and process/outcome improvement activities.

2. Agile project management: All CD2H-STA related projects and programs will be executed using an agile project management methodology, emphasizing the rapid and iterative development, deployment and evaluation of minimum viable products (MVPs). Such an approach, particularly with regard to software development, will ensure that continuous stakeholder engagement and feedback is utilized to ensure the resource-efficient design and deployment of platforms and tools with minimum barriers to acceptance. A suite of web-based project and task management tools will be used to track and manage such agile methods, and such tooling will also serve to support/enable the previously described continuous evaluation and process/outcome improvement measures.

3. Clear recommendations of software tools that improve CTSA network efficiency: The CD2H-STA will identify and prioritize opportunities for improvement within the CTSA network surrounding software platforms. Prioritized tools will then be assessed for opportunities to use new technologies, such as the cloud, to improve efficiency and impact within the network.

4. Open-access dissemination of methods and technologies: The CD2H-STA will engage in the systematic dissemination of all major methods and technologies developed under the auspices of the CD2H, utilizing an open-source/access distribution mechanism where applicable (adhering to FAIR-TLC principles) and made accessible via dedicated code repositories linked to the overall CD2H web portal (including the aforementioned CIELO platform). In addition, a best effort will be made at all times to adopt/adapt CTSA consortium-wide standards and models, such that technologies and methods developed by CD2H-STA will be interoperable and shareable at a national level.

Figure 1: Overview of process and outcomes evaluation framework to be used by the CD2H-STA to implement and evaluate all activities associated with the core's specific aims. The data to be used in supporting/enabling such evaluative efforts will be generated via comprehensive portfolio and resource management tools.


CD2H Project Team Deliverable Descriptions

Data Workgroup 4/17/18

Deliverable #1: Clinical Data Model Landscape

Point Person: Christopher Chute, [email protected]

Problem: Clinical data sharing is integral to CTSA functioning and to CD2H coordination. A critical path requirement for this is a shared or targetable data model. The question is, as always, which one among the many contenders. CD2H should lead the community to a resolution, but we should start with a landscape analysis.

Proposed Solution: Collect existing data and design a possible survey. The existing data would arise from:

● TriNetX sharing their CTSA customer base (active and signed) [done]

● CTSA members who share PCORI data (via PCORNet) [in process]

● CTSA ACT sharing their current and promised members [done]

● Retrieving historical profiles collected for CTSA reporting metrics [in process]

● Design potential “dynamic spreadsheet” on CD2H page, encourage self report and update of:

  ○ Confirm network status

  ○ Add information about other i2b2 or related repositories

  ○ Report on use of EHR vendor warehouse

    ■ Local implementation

    ■ Part of Vendor community network

  ○ Add metadata

    ■ whether network is CTSA or other university

    ■ contact information

Output (3-6 months): We anticipate a series of outputs:

● Summary slide of CTSA informatics network participation (by EAB)

● Design of data gathering instrument with review (2 months)

● Population of full instrument (4 months and ongoing)

Benefit: Inform discussions about downstream data harmonization efforts leading to strategic goals of CTSA-wide clinical data interoperability.

Deliverable #2: Data Index

Point Person: Chunlei Wu - [email protected]


Problem: CTSAs host and access a large spectrum of data. To achieve the vision of data integration within, between, among, and beyond CTSAs, cataloging and indexing this data is a first step.

Proposed Solution: Creating an index is the obvious solution, though it has many steps which can be sequentially pursued. We will start by cataloguing the data at the nine CD2H CTSA hubs following these steps:

● Work with PIs at the nine CD2H CTSAs to develop a REDCap survey that encompasses all data types that their CTSAs house. The survey should record all of the metadata (the types of data, what the data contain, and how the data are disseminated).

● Distribute the REDCap survey to the nine CTSAs and compile the results into a table.

● Coordinate with the Ontology Workgroup to standardize the dataset ontology across the different types of data so that we can effectively query metadata across CTSA sites (make sure that everyone is using a standardized vocabulary for the same types of data).

● Build a user friendly interface (API and web tool) for conducting data queries across CTSA sites.

Later:

● Extend the BioLink data (Translator) model to encompass clinical data types and build connections between clinical data sets and the translational knowledge base (gene, variant, drug, disease, etc.). Integrating clinical and translational research findings will allow us to more effectively conduct translational research.
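The steps above (survey, compile into a table, standardize vocabulary, query across sites) can be sketched as a small in-memory index. This is only an illustration of why a shared vocabulary matters for cross-site queries. The `DatasetEntry` fields, the `STANDARD_TYPE` synonym table, and the site names in the usage note are hypothetical stand-ins for the survey results and for the vocabulary the Ontology Workgroup would supply.

```python
from dataclasses import dataclass

# Hypothetical synonym table standing in for the standardized vocabulary.
STANDARD_TYPE = {
    "ehr": "clinical records",
    "clinical records": "clinical records",
    "biospecimen": "biospecimens",
    "biospecimens": "biospecimens",
}


@dataclass
class DatasetEntry:
    site: str            # CTSA hub that houses the dataset
    name: str            # dataset name as reported in the survey
    data_type: str       # data type as reported (free text)
    dissemination: str   # how the data are shared


def normalize(data_type: str) -> str:
    """Map a reported data type onto the standard vocabulary (unchanged if unknown)."""
    key = data_type.strip().lower()
    return STANDARD_TYPE.get(key, key)


def query(index: list, data_type: str) -> list:
    """Return every entry, across sites, whose normalized type matches the request."""
    wanted = normalize(data_type)
    return [entry for entry in index if normalize(entry.data_type) == wanted]
```

With entries from "Site A" reporting "EHR" and "Site C" reporting "clinical records", `query(index, "ehr")` returns both, which is the practical effect of everyone using the same vocabulary for the same type of data.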

Output (3-6 months): Landscape of options for each step, preliminary choices for each. Alignment with related indices.

● Conduct a survey for available datasets within the nine CD2H CTSA sites (work with the engagement team) to have a base landscape of what is available.

● Demo the connections built between clinical datasets and the translational knowledgebase (e.g. from Translator API ecosystem). The current candidate datasets include:

  ○ Scripps Wellderly dataset

  ○ JHU patient-level synthetic datasets

  ○ OHSU biospecimens datasets

● Evaluate existing data indexing solutions (e.g. Northwestern framework, BioCADDIE), and determine the future CD2H strategy of data indexing models and repository.


Benefit: Provide the landscape of the CTSA datasets and promote the data integration across CTSA sites and different dataset types. This will advance the CTSA agenda by supporting both scientific discovery and clinical practice via the data integration across the full translational spectrum.

Deliverable #3: Data Licensing Best Practices

Point Person: Melissa Haendel - [email protected]

Problem: Current licensing practices for data are highly variable or nonexistent and very often require complex legal negotiations for use in aggregate or distributed forms where the data may be most valuable. The main reasons for this are twofold. First, many datasets either have no license or have unclear licenses due to a lack of explicit declarations and language. Second, licensing is often used as an attribution proxy or leveraged for cost recovery, which can significantly limit scientific use.

Proposed Solution: Coalesce licensing best practices into a community recommended framework and guidelines to facilitate data reuse, commercialization, and efficient navigation of permissions.

Output (3-6 months):

1. Create a 1-page engagement flyer to announce two new national partners: an Association of University Technology Managers (AUTM) committee that will create the legal framework, and the Data Licensing Initiative, which will be the community partner for supplying use cases and evaluating the new framework.

2. Create short “best practices” documents, one to distribute to VPRs and Organizational leaders, and the other for TTOs. These documents will create awareness and engagement to assist in evaluation of the AUTM framework.

3. Initialize the AUTM committee, similar to the successful approach used for the UBMTA agreement, to create a national framework for data licensing.

4. Create data use cases in the new Data Licensing Initiative that illustrate the requirements to the AUTM committee and to TTOs.

5. Continue to populate the ReusableData.org site with curated licenses from popular data repositories to facilitate public awareness and improvement tracking over time.

6. Launch a demonstration project across the 9 CD2H sites and recruit additional interested partners from the CTSA Program hubs.


Benefit: This work will assist CTSA institutions to better declare licensing for their data sources in support of data sharing and reuse to facilitate new innovative uses of our collective data to realize collective benefits and opportunities. Further, it supports new opportunities for commercial or public-private partnerships to be more readily formed due to the lessened legal burden.

Deliverable #4: Maturity Models

Point Person: David Dorr, [email protected]

Problem/opportunity: Understand how to change the current state of limited adoption of CD2H principles at healthcare systems and universities. Studies of maturity models have shown them to be effective at promoting self-assessment and organizational learning, especially in distributed structures. Since the CTSAs are heavily distributed, this would be an appropriate method for advancing organizational capabilities. Maturity models in clinical research informatics have been recently introduced by others, but their development is nascent and more specific adaptation is needed to represent the current and potential data capabilities of CTSAs. We need maturity models for CTSA use of data defined to a level that individual CTSAs could unambiguously be assessed, and important gaps in maturity and capability could be identified. Such a model could then be recommended for NCATS to promote through direct incentives (especially in scoring grant applications). We will be collaborating with Peter Embi and his team to make sure that efforts are coordinated.

Proposed solution: Identify best practices and infrastructures related to governance/leadership, policy, and data that are aligned with and will fulfill the realization of CD2H aims, by looking at the groups perceived to be most advanced; compare these structures between more and less advanced groups; and create an initial definition of the components of different stages for groups based on these best practices. Get a clear understanding of what key steps organizations have taken to be successful in these areas.

Areas of focus:

Governance/Leadership - David D.


● Identify questions that demonstrate the commitment to and action on FAIR-TLC and other principles from CD2H (e.g., leadership expressed support; governance has a model to approve requests and allocate resources in these areas)

Policy - Robin

● Identify institutional policies and policy components that facilitate or hinder reusable data sharing.

Data - Adam

● FAIR-TLC

○ How tools and capabilities are made sharable

○ Value reuse over new development

● Identifying which data capabilities are most associated with successful support of clinical research activities.

Steps:

1. Generate or adapt targeted questions that elicit maturity and deployment in these areas.

2. Query 6-8 (half high performing, half median performing) institutions with these questions in semi-structured interviews.

3. Analyze data to generate potential definitions and assessments for maturity levels, adapting current work where appropriate.

4. Recommend a set of potential parsimonious metrics to help institutions understand their maturity and where they can improve.

Output in 3-6 months:

● Current state in 6-8 institutions; assessment questions; and metrics.

● Create a 100-day challenge for one focus area:

○ Outline steps that organizations can take to improve in these areas over a 100-day period

● Later (post 6 months):

○ Expand to other areas

○ Focus on effective solutions that institutions without a lot of resources have come up with

Benefit: The model could be used by institutions to improve maturity and deployment, and to draw on these successes in CTSA creation, renewal, and reporting processes. Applicants and existing sites would be required to assess their maturity level and demonstrate improvement.


Education and Learning Innovation Working Group

Deliverable #1: Competency Mapping and Gap Assessment

Point Person: Shannon McWeeney - [email protected]

Problem: Education and training are integral to the CTSA mission and a key area of CD2H coordination. Competencies in data science and informatics have been proposed by groups within the CTSA as well as externally. Dynamic changes in both data types and methods make it challenging for individual programs to assess the landscape of both the competencies necessary for their diverse groups of learners and the associated training materials already developed, often leading to duplication of efforts.

Proposed Solution: Identification of existing competencies as well as attribution and recognition of the groups who have developed them to allow for a more inclusive, collaborative environment to facilitate education and learning innovation. Mapping and initial harmonization of competencies to allow alignment of existing materials and initial gap assessment.

Output: Collaborative white paper on competency harmonization and community guidance on current best practices with regard to competency re-evaluation. Prioritized areas for educational materials and training development.

Benefit: There is a clear need for this from a variety of stakeholders, and it is highly synergistic with other efforts underway. This builds a roadmap for focused educational output across diverse learners in CTSAs. This is a critical step for gap analysis and indexing of resources by CD2H.

Deliverable #2: Module template for material development as well as process for discovery of existing resources: Use case - entrepreneurship.

Point Person: Shannon McWeeney - [email protected]

Problem: A number of areas were identified as high-impact, non-traditional areas of training. One of these was entrepreneurship. Within CD2H, we need to begin to assess best frameworks for development, delivery, dissemination and evaluation of educational materials, as well as develop the community process for discovery of existing resources to prevent duplication of efforts and ensure attribution.

Proposed Solution: We propose to identify, align, and, as needed, develop new materials for a module on Introduction to Entrepreneurship. This would cover (1) learning how to develop your ideas and value proposition; (2) how to evaluate their potential and conduct a market analysis; (3) how to recognize the barriers to success; (4) elements of a business plan and making a pitch; and (5) discussion of personal vs. professional competencies needed in this area.

Output: Introduction to Entrepreneurship training module; Template for evaluation with respect to other educational materials. Initial process for resource discovery and re-use.

Benefit: Education is key to facilitating a culture of innovation and entrepreneurship within the health informatics scientific community. This initial use case will provide critical data on processes for discovery and aggregation of existing materials, development of new materials, and evaluation.

Deliverable #3: Biodata Club Starter Kit: non-traditional educational training meet-up model and resources which facilitate interdisciplinary discussion

Point Person: Ted Laderas - [email protected]

Problem: Interdisciplinary efforts are becoming more critical for scientific discovery and translational research efforts. Non-traditional learning approaches provide a neutral and inclusive mechanism for facilitating interdisciplinary collaboration and communication. The best mechanisms to disseminate these types of approaches, and the resources needed to do so, are unknown.

Solution: We will leverage a highly successful model developed in collaboration with the Mozilla Foundation which is focused on skill sharing, co-working and community building. Templates, code base and materials will be packaged to create a “starter kit” to assess how well this can be duplicated at other sites.

Output: GitHub repo with instructions and examples; Implementation and evaluation by minimum of 1 other partner site.

Benefit: Education is key to facilitating a culture of interdisciplinary collaboration and communication. This initial use case will provide critical data on the resources needed to disseminate non-traditional frameworks like this one, as well as the resources needed on site to ensure success and the ability to translate across institutions.

Deliverable #4: Extreme Team Science Admin Bootcamp

Point Person: Julie McMurry - [email protected]

Problem: A number of areas were identified as “high impact”, non-traditional areas of training. One of these was practical operational support for team science. Within CD2H, we need to begin to assess best frameworks for development, delivery, dissemination and evaluation of educational materials, as well as develop the community process for discovery of existing resources to prevent duplication of efforts and ensure attribution. Within the CTSA program, there is enthusiasm for collaborative team science projects, but few people have the deep operational experience necessary to run them efficiently.

Solution: We will aggregate and develop materials focused on (1) effective and efficient operations, (2) transparent communications, and (3) emergent properties of teams.

Output: Materials will be produced in two formats for dissemination: in person (tutorials on process, communications, and “code of conduct”) and online (5-10 minute videos on the same topics).

Benefit: This would provide a unique educational resource with respect to team science and the processes around management of these activities for the CTSAs. This will also support I2I projects coordinated by CD2H.

Engagement Workgroup 4/17/18

Deliverable #1: Framework for supporting CD2H DREAM Challenges

Point Person: Justin Guinney - [email protected]

Problem: In supporting DREAM Challenges, we intend to simultaneously address several key goals of the CD2H program: (i) cross-CTSA data models that support federated model training and evaluation; (ii) development of a shareable tool repository; (iii) development of domain-specific benchmarks.

Proposed Solution: Support a series of CTSA-initiated, CD2H-supported data challenges.

Steps: Establish the structure and mechanisms for soliciting data challenge ideas, reviewing proposals, and running challenges.

Output (3-6 months): The following deliverables will be completed within 90 days:

● We will define a process for soliciting proposals (DRAFTED LETTER IN PROGRESS).

● We will define a governance structure for reviewing and vetting challenge proposals based on (i) clinical and/or technological priorities; (ii) feasibility to address challenge questions; and (iii) institutional support.

● We will circulate the first “Request for Proposals” to CLIC, iDTF, and other CTSA-associated entities and define a deadline for receipt of proposals.


Benefit: Through CTSA DREAM Challenges, we hope to foster the development of a data and model ecosystem in which teams can benchmark methods against multiple, external data sources. We also hope to engage the broader statistical and machine learning community in developing innovative clinical models.

Deliverable #2: Develop demonstration pilot to assess and showcase feasibility for “model 2 data” challenge

Point Person: Justin Guinney - [email protected]

Problem: Building a platform ecosystem that supports algorithm deployment across heterogeneous data and compute environments is challenging and complex. Therefore, we intend to develop a pilot project to establish feasibility.

Proposed Solution: Develop a pilot “challenge” among 2 or more sites that demonstrates the feasibility of deploying containerized models using a common data model, and the ability to conduct model assessment against data accrued longitudinally.

Steps: We will need to work with multiple stakeholders - spanning data governance, IT and data managers, and algorithm developers - to develop a strategy for oversight and implementation.

Output (3-6 months):

● Selection of a prediction problem as the focus of the pilot (e.g., overall mortality, 30-day readmission, diagnosis).

● Selection of a data model upon which algorithms will operate.

● Selection and engagement of additional sites willing to participate in the pilot.

Benefit: A successful demonstration of a pilot program will serve as a potential blueprint for future challenges. Lessons learned from the pilot will allow future Challenges to operate more efficiently.
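The “model 2 data” pattern described for this pilot can be illustrated with a minimal Python sketch. It is not the pilot's design: the `Site` class, the `run_challenge` function, the `prior_admissions` field, and the toy model are all hypothetical names introduced here, and a plain Python function stands in for a containerized model. The point it shows is that the model travels to each site, is scored against data that stays local, and only aggregate metrics are returned.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Record = Dict[str, float]
Model = Callable[[Record], float]  # maps one patient record to a predicted probability


@dataclass
class Site:
    """One participating site (hypothetical); its records stay inside this object."""
    name: str
    records: List[Record]
    outcomes: List[int]  # observed outcome per record (1 = event occurred)

    def evaluate(self, model: Model) -> Dict[str, object]:
        """Run the submitted model locally and return aggregate metrics only."""
        if not self.records:
            raise ValueError(f"site {self.name} has no records to score")
        predictions = [model(record) for record in self.records]
        n = len(predictions)
        # Brier score: mean squared difference between prediction and outcome.
        brier = sum((p - y) ** 2 for p, y in zip(predictions, self.outcomes)) / n
        # Accuracy at a fixed 0.5 decision threshold.
        correct = sum((p >= 0.5) == (y == 1) for p, y in zip(predictions, self.outcomes))
        return {"site": self.name, "n": n, "brier": brier, "accuracy": correct / n}


def run_challenge(model: Model, sites: List[Site]) -> List[Dict[str, object]]:
    """Send the model to every site and collect the per-site score cards."""
    return [site.evaluate(model) for site in sites]


def toy_readmission_model(record: Record) -> float:
    """Hypothetical stand-in for a submitted model; the rule is illustrative only."""
    return min(1.0, 0.2 * record["prior_admissions"])


if __name__ == "__main__":
    # Synthetic records, invented for this example.
    sites = [
        Site("site_a", [{"prior_admissions": 0}, {"prior_admissions": 5}], [0, 1]),
        Site("site_b", [{"prior_admissions": 3}, {"prior_admissions": 1}], [1, 0]),
    ]
    for card in run_challenge(toy_readmission_model, sites):
        print(card)
```

Because each score card carries only counts and summary metrics, the same loop could later be repeated as data accrue longitudinally without any patient-level data leaving a site.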

Evaluation & Analytics Workgroup 4/17/18

Deliverable #1: Design & execute program evaluation & dissemination for the CD2H

Point Person: Kristi Holmes - [email protected]

Need: This work establishes the processes and tools for evaluating and tracking CD2H-related efforts in order to monitor progress, provide accountability, and support dissemination of CD2H efforts.


Proposed Solution: Program Evaluation: Create a meaningful strategy to evaluate the CD2H and understand and communicate the impact of CD2H efforts and outputs.

● Set up tracking workflows for CD2H project outputs (e.g., publications, datasets, data models, software tools and algorithms) (completed)

● Identify key metrics for assessing program performance via abbreviated Results-Based Accountability exercise (completed)

● Identify key metrics for assessing project and program performance via abbreviated Results-Based Accountability exercise for all Working Groups

● Develop yearly satisfaction/stakeholder survey

● Identify and develop (in cooperation with key champion) use cases (in progress)

● Develop yearly impact report for broad dissemination in addition to detailed reporting (Fall 2018)

● To determine: frequency and details of reporting to various stakeholder groups; identify dashboarding platforms and other tools that might be able to support accountability

Output (3-6 months): plan & resources to support program level evaluation, accountability, reporting

Dissemination:

● Organize “author resources” for the CD2H project, such as recommended language for citing the grant, a link to the ICMJE authorship guidelines, and a CD2H corporate author identity.

● Establish event workflows for the project including list of conferences CD2H persons are attending and presentation details, as well as any outreach opportunities, site visits, etc. to enable proactive engagement

● Work with project librarians to develop materials and webinar to support “dissemination best practices” for data, software, etc. (Fall 2018)

Output (3-6 months): plan & resources to support authorship conversations across dispersed, multidisciplinary teams and better facilitate scholarly communications

Benefit: Solid program evaluation workflows will help CD2H with analysis, accountability, advocacy, and allocation needs of the project internally and will support accountability to CTSA Program hubs, NCATS, broader informatics and data science community; proactive dissemination of project efforts to the broader CTSA community also helps support accountability and engagement.


Select Supporting References:

● Guthrie S, Wamae W, Diepeveen S, Wooding S, and Grant J. Measuring research: A guide to research evaluation frameworks and tools. MG-1217-AAMC, 2012 (available at www.rand.org/pubs/monographs/MG1217)

● Friedman M. Trying Hard is Not Good Enough: How to Produce Measurable Improvements for Customers and Communities. Victoria, B.C., Canada: Trafford Press; 2005. 179 p.

● Defining the Role of Authors and Contributors. International Committee of Medical Journal Editors. Available at http://bit.ly/ICJME-authorship

Deliverable #2: Scoping review of the literature on evaluation of translational informatics

Point Person: Adrienne Zell, [email protected]

Problem: As informatics increases in importance across CTSAs, evaluators without informatics expertise are being asked to evaluate tools, initiatives, and products such as datasets and software. Moreover, translational informaticians may not be aware of the full complement of methods and tools for assessment and/or quality improvement strategies for computational or data projects. At this time, there is no published review of translational informatics evaluation methods or case studies. This review will provide a first step in collating and disseminating literature (manuscripts and other products) that evaluators can use to build their methodology. This exercise will also illuminate gaps in the literature, identifying areas where CD2H can contribute to the field.

Proposed Solution:

● Steps:

○ Define the initial search parameters to be used by a CD2H expert in scoping reviews, along with librarians. (completed)

○ Solicit feedback on the initial search from the CD2H evaluation team assigned to this activity, as well as others who may want to contribute. This feedback includes identification of additional keywords, a screen in/screen out recommendation for each publication, and suggestions for search expansion or limitation. (started)

○ Continue these iterations (previous two bullets) as necessary until the review is complete

○ Format the literature search

○ Disseminate the literature search

Output (3-6 months): webinar, paper, distribution to CTSAs through CTSA evaluator calls, TRE TIG (American Evaluation Association), ACTS Evaluation SIG, etc.

Benefits to the CTSA Program and the CD2H:

● Resource for hub informatics teams and/or evaluation teams

● Better perspective of existing best practices and gaps in assessment of key areas (generally tools, data, programs, training, informatics investments & activities, etc.)

● Benefit to the CD2H by giving us another tool to support local work - also a great way to act on ideas that the working group discussed and was excited about - engage WG members (original idea by Sean Yu)

● Identifies opportunities for CD2H evaluators to contribute to the field

● Increases the comfort level for CTSA evaluators around evaluation of informatics initiatives and activities.

Supporting references:

● Armstrong R, Hall BJ, Doyle J, Waters E. Cochrane Update. 'Scoping the scope' of a cochrane review. J Public Health (Oxf). 2011 Mar;33(1):147-50. doi: 10.1093/pubmed/fdr015. PubMed PMID: 21345890.

● Hilary Arksey & Lisa O'Malley (2007) Scoping studies: towards a methodological framework, International Journal of Social Research Methodology, 8:1, 19-32, DOI: 10.1080/1364557032000119616

● Peters MD, Godfrey CM, Khalil H, McInerney P, Parker D, Soares CB. Guidance for conducting systematic scoping reviews. Int J Evid Based Healthc. 2015 Sep;13(3):141-6. doi: 10.1097/XEB.0000000000000050. PubMed PMID: 26134548.

Ontology Workgroup 4/17/18

Deliverable #1: LOINC2HPO project

Point Person: Peter Robinson - [email protected]

Problem: Laboratory tests and results are encoded using LOINC codes in most hospitals, but LOINC is difficult to use for computation because the codes are not easily integrated with other data due to their granularity and composition of use. In some systems, LOINC codes are transmitted as FHIR observations. In essence, this provides us with a bundle of information about the test performed (e.g., eosinophil count, encoded using LOINC) and its result (encoded in the FHIR message).

Proposed Solution: The results of many laboratory tests can be coded using Human Phenotype Ontology (HPO) and thereby be made available for use in applications such as the Exomiser, a valuable tool for rare disease genomic diagnostics. We propose an approach that will map the combination of LOINC codes and FHIR-encoded results to HPO terms. In this way, a set of results can be transformed into a set of HPO codes (including negated HPO codes for normal test results, e.g., NOT Abnormal eosinophil count). Such encodings will provision improved analytics that leverage summary-level information about the phenotypic characteristics represented by the laboratory assays.

Work program:

a) Develop a JavaFX app for accurate and efficient biocuration. The app will show the LOINC codes together with other relevant LOINC fields such as the name, scale, and component (analyte) of the test. The app will use a text mining approach to propose HPO terms that represent close matches. If correct HPO terms are found, then the user can easily drag them to the corresponding fields for biocuration or conveniently create a GitHub issue on the HPO tracker to request a new term. We have currently developed a working prototype of this tool and have begun to annotate LOINC, with currently about 50 annotated terms.

b) Develop a Java library to transform FHIR/LOINC messages into HPO terms. This part of the project is intended to be used in EHR environments. It will exploit the biocurated annotation file from part (a) to transform a FHIR Observation containing a LOINC code into the corresponding HPO code in an efficient fashion. We have currently developed a prototype that works correctly for the LOINC codes with Qn (quantitative) scale. We will extend the code and the unit tests to more codes and to the Ordinal scale types.

c) Develop a SMART on FHIR app to demonstrate the library in (b). We are just beginning with this part of the project.

Documentation:
http://loinc2hpo.readthedocs.io/en/latest/
https://github.com/monarch-initiative/loinc2hpo

Output (3-6 months):

● Annotations for roughly 750 terms (selected from the “top 2000 LOINC codes”) at 6 months.

● Working SMART on FHIR app at 6 months. We expect that the software will have attained sufficient functionality at the end of 3 months that we will seek collaboration from the Hopkins and U Wash groups to begin to test and validate the code in an EHR environment.

● Work with others involved in CD2H to integrate the software into Synapse and/or CIELO; once it is there, it will be interesting to compare the patterns of LOINC annotations at different CTSA sites to assess consistency of laboratory data representation.

Benefit: This software library will be a first step towards a larger software library that will semantically encode phenotype data from EHRs using HPO and possibly other terms. This will enable heterogeneous clinical data to be exchanged between CTSA centers, and will enable certain types of analysis that cannot be performed on “raw” EHR-based data. For instance, in order to use phenotype data for clinical genomics software, it needs to be coded using ontology terms--usually HPO terms, a manual process that could be substantially accelerated by tools such as the one proposed here.
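The mapping idea in this deliverable (a LOINC code plus a FHIR-encoded result yields an HPO term, negated when the result is normal) can be sketched in a few lines of Python. This is an illustration only and not the loinc2hpo library, which is written in Java. The annotation table, the identifiers `LOINC-EXAMPLE-EOS` and `HP:EXAMPLE-*`, and the reference range are placeholders invented for the example, and the observation is a simplified subset of a FHIR Observation with a quantitative value.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class HpoTerm:
    term_id: str
    label: str
    negated: bool = False  # True means "NOT <label>", used for normal results


# Placeholder identifiers: these are NOT real LOINC or HPO codes.
EOSINOPHIL_TEST = "LOINC-EXAMPLE-EOS"

# Hypothetical curated annotations: (LOINC code, result category) -> HPO term.
# Categories: "L" = below range, "H" = above range, "N" = within range.
ANNOTATIONS: Dict[Tuple[str, str], HpoTerm] = {
    (EOSINOPHIL_TEST, "L"): HpoTerm("HP:EXAMPLE-LOW", "Decreased eosinophil count"),
    (EOSINOPHIL_TEST, "H"): HpoTerm("HP:EXAMPLE-HIGH", "Increased eosinophil count"),
    (EOSINOPHIL_TEST, "N"): HpoTerm("HP:EXAMPLE-ABN", "Abnormal eosinophil count", negated=True),
}


def categorize(value: float, low: float, high: float) -> str:
    """Classify a quantitative result against its reference range."""
    if value < low:
        return "L"
    if value > high:
        return "H"
    return "N"


def observation_to_hpo(
    observation: dict,
    annotations: Dict[Tuple[str, str], HpoTerm] = ANNOTATIONS,
) -> Optional[HpoTerm]:
    """Map a simplified FHIR-style Observation to an HPO term, if annotated."""
    loinc_code = next(
        (c["code"] for c in observation["code"]["coding"]
         if c.get("system") == "http://loinc.org"),
        None,
    )
    if loinc_code is None:
        return None
    value = observation["valueQuantity"]["value"]
    reference = observation["referenceRange"][0]
    category = categorize(value, reference["low"]["value"], reference["high"]["value"])
    return annotations.get((loinc_code, category))


if __name__ == "__main__":
    observation = {
        "code": {"coding": [{"system": "http://loinc.org", "code": EOSINOPHIL_TEST}]},
        "valueQuantity": {"value": 0.9},
        # Illustrative range only, not a clinical reference range.
        "referenceRange": [{"low": {"value": 0.0}, "high": {"value": 0.5}}],
    }
    print(observation_to_hpo(observation))
```

Keeping the annotations in a lookup table separate from the transformation code mirrors the split between the biocuration work in part (a) and the library in part (b): curators can extend the table without changing the mapping logic.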

Deliverable #2: Contribution Role and Outputs Ontologies

Point Person: Kristi Holmes - [email protected]

Problem: Research is changing: no longer are scientists measured simply from the perspective of the number of papers written, citations garnered, and grant dollars awarded. There has been a fundamental shift that recognizes both the interdisciplinary, team-based approach to science as well as the more fine-grained characterization and contextualization of the hundreds and thousands of contributions of varying types and intensities that are necessary to move science forward. Unfortunately, little infrastructure exists to identify, aggregate, present, and (ultimately) assess the impact of these contributions. These significant problems are technical as well as social. They require an approach that assimilates cultural and social aspects of these problems in an open and community-driven manner.

Proposed Solution: Work in support of this problem statement will produce two complementary data models: (1) a Contribution Role Ontology to represent the types of contributions that a person makes, whether at a micro or macro scale, and (2) a Research Outputs Ontology to represent the outputs that people create during the research process. The two components together will support “roll up” or transitivity of contributions.

Work done to date: We have piloted an early version of a contribution ontology in the OpenVIVO platform. We have also “ontologized” the existing CRediT taxonomy in this context in support of extension and interoperability. We have completed early work mapping the different schema for the ROO. We have hosted a number of community engagement and stakeholder relationship-building events to define requirements.


Work program

Contribution Role Ontology (CRO)

● Represent role attributes that can exist for research outputs as larger program or project-level roles.

● Extend CRO using existing community contributions we’ve gathered in workshops on the types of activities for which people wish to receive credit

● Iterate internally with project members, confirm final version

● Release V1 ontology (end of 3 months)

● Implement in OpenVIVO (end of 6 months)

Research Outputs Ontology (ROO)

● Use the NISO research output work as a base document to map research output schema from various research information systems (ResearchFish, Web of Science, InCites, Scopus, Symplectic Elements, CASRAI, figshare, etc.)

● Add additional output concepts identified in prior workshops (end of 3 months)

● Identify “low-hanging fruit”: are there any common concepts that persist across the various schemas that should be included in the first version of the ROO?

● Release V1 ontology (end of 6 months)

General workflows that need to be better understood/established

● Identify a process by which key stakeholders and the community are engaged at a larger level (regular calls or an event, etc.)

● Engage community in a process to identify and promote concepts to include.

● Develop a community-driven process for input before the release of updates. Perhaps model this after the community input process NISO uses for each release.

Output (3-6 months):

Contribution Role Ontology (CRO)

● Release V1 ontology (end of 3 months)

● Implement in OpenVIVO (end of 6 months)

Research Outputs Ontology (ROO)

● Release V1 ontology (end of 6 months)

Benefit: Improved representation of roles and outputs in systems will enable better recognition and crediting of work and improve our ability to make more meaningful connections between people, their roles and work, the outputs, and outcome/impacts.
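The “roll up” or transitivity of contributions mentioned in the Proposed Solution can be illustrated with a small Python sketch. It is an assumption-laden toy, not the CRO or ROO data model: the `part_of` mapping, the people, the outputs, and the project names are all invented for the example, and the role labels are used only as illustrative strings. It shows how a role recorded against one research output could be credited at every project or program level that contains that output.

```python
from typing import Dict, Iterable, Optional, Set, Tuple

# (person, role, output): a fine-grained contribution recorded against one output.
Contribution = Tuple[str, str, str]


def roll_up(
    contributions: Iterable[Contribution],
    part_of: Dict[str, Optional[str]],
) -> Dict[str, Dict[str, Set[str]]]:
    """Credit each contribution to its output and to every ancestor container.

    `part_of` maps an output or project to its parent (hypothetical structure).
    Returns {container: {person: {roles}}}.
    """
    rolled: Dict[str, Dict[str, Set[str]]] = {}
    for person, role, output in contributions:
        node: Optional[str] = output
        visited: Set[str] = set()  # guards against accidental cycles in part_of
        while node is not None and node not in visited:
            visited.add(node)
            rolled.setdefault(node, {}).setdefault(person, set()).add(role)
            node = part_of.get(node)
    return rolled


if __name__ == "__main__":
    # Invented example data.
    part_of = {"tool_1": "project_x", "dataset_1": "project_x", "project_x": "program_p"}
    contributions = [
        ("alice", "Software", "tool_1"),
        ("bob", "Data curation", "dataset_1"),
        ("alice", "Visualization", "dataset_1"),
    ]
    for container, people in roll_up(contributions, part_of).items():
        print(container, people)
```

With this shape, a program-level view can be derived from output-level records without anyone re-entering their roles at each level.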


Select references:

1. Outputs of the NISO Alternative Assessment Metrics Project. National Information Standards Organization. (2016). [Recommended Practice RP-25-2016]. Available from: https://goo.gl/J5ypcV

2. Ilik V, Conlon M, Triggs G, White M, Javed M, Brush M, Gutzman K, Essaid S, Friedman P, Porter S, Szomszor M, Haendel MA, Eichmann D and Holmes KL (2018) OpenVIVO: Transparency in Scholarship. Front. Res. Metr. Anal. 2:12. doi: 10.3389/frma.2017.00012

3. K Gutzman, M White, M Brush, V Ilik, M Conlon, M Haendel, KL Holmes. (2016) Contribution Ontology: representation of a person's role in research processes and outputs. Data model available at https://github.com/openrif/contribution-ontology & poster describing work available at https://goo.gl/XVSg4D .

People, Expertise & Attribution Workgroup 4/18/18

Deliverable #1: Reskin the CTSAsearch discovery service for the CD2H website

Point Person: David Eichmann - [email protected]

Problem: Identifying particular expertise in support of a project becomes extremely challenging when that expertise is diffusely distributed over dozens of CTSA hubs. Additionally, existing research networking systems (e.g., VIVO and Harvard Profiles) focus mainly on grants and publications rather than the full spectrum of professional and scholarly activities. A primary focus for CD2H is the informatics personnel at the respective CTSA hubs, which requires additional information sources and discovery services.

Proposed Solution: The CD2H discovery service is leveraging our existing CTSAsearch tool in support of identification of skills and expertise across the CTSA consortium and the broader biomedical research community. However, our existing expertise tool is tucked away in a corner of the Iowa CTSA’s website. This activity repositions CTSAsearch as the CD2H portal to expertise within the consortium and begins the process of extending the nature of expertise supported. CTSAsearch is the core of the planned CD2H discovery service, as it already contains profile data on the investigators affiliated with roughly half of the CTSA hubs (and a comparable number of non-CTSA institutions). Additional elements (e.g., metadata on software and data) will be added to the aggregate model and user interface as they become available. See Deliverable #2 below for an example already in process.


Steps:

● Transition to a new server architecture to increase performance and capacity

● Begin mapping in some of the planned extensions from the proposal

● Rebrand as CD2H and design a new look-and-feel.

Output (3-6 months): A prototype CD2H discovery service running in the labs.cd2h.org domain.

Benefit: CTSA consortium members will be able to discover and assess expertise across a broad spectrum of professional and scholarly areas. The CD2H discovery service will integrate information from diverse sources (particularly those created by the Data and Software Working Groups) into a single interface providing focused and up-to-date information.

Deliverable #2: Integrate GitHub data into the discovery service’s aggregate data model and CTSAsearch

Point Person: David Eichmann - [email protected]

Problem: As noted in Deliverable #1, much of the CTSA consortium’s informatics expertise is expressed not in grants and publications (the core of the research networking tools’ models), but rather in source code repositories, slide decks, etc.

Proposed Solution: Extend the CTSAsearch information harvesting framework with connections to the open source repositories, starting with GitHub. Additional sources will be tackled once we have a GitHub harvester completely integrated.

Output (3-6 months): GitHub repositories and users represented in the CD2H discovery service.

Benefit: The CTSA consortium’s ability to identify key technical personnel will be significantly enhanced, as the CD2H discovery service will support discovery of informatics personnel not appearing as PIs on grants or as coauthors on papers. This work extends our expertise coverage beyond traditional (post hoc) resources to include current activity in open software repositories.
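As a rough illustration of what folding GitHub data into an aggregate model could involve, the Python sketch below turns already-downloaded repository and contributor records into per-developer profiles. It is not the CTSAsearch harvester: `DeveloperProfile` and `build_profiles` are hypothetical names, the sample records are invented, and no network calls are made. The input dictionaries use a small subset of the fields that the GitHub REST API returns for repositories (`full_name`, `language`) and contributors (`login`, `contributions`).

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DeveloperProfile:
    """Hypothetical profile record for one GitHub user."""
    login: str
    repositories: List[str] = field(default_factory=list)
    contributions_by_language: Dict[str, int] = field(default_factory=dict)
    total_contributions: int = 0


def build_profiles(
    repos: List[dict],
    contributors_by_repo: Dict[str, List[dict]],
) -> Dict[str, DeveloperProfile]:
    """Aggregate repository-level records into one profile per developer."""
    profiles: Dict[str, DeveloperProfile] = {}
    for repo in repos:
        name = repo["full_name"]
        language = repo.get("language") or "unknown"  # language may be null
        for contributor in contributors_by_repo.get(name, []):
            login = contributor["login"]
            count = contributor["contributions"]
            profile = profiles.setdefault(login, DeveloperProfile(login))
            profile.repositories.append(name)
            profile.contributions_by_language[language] = (
                profile.contributions_by_language.get(language, 0) + count
            )
            profile.total_contributions += count
    return profiles


if __name__ == "__main__":
    # Invented sample records shaped like GitHub API responses.
    repos = [
        {"full_name": "example-org/etl-tools", "language": "Python"},
        {"full_name": "example-org/web-portal", "language": None},
    ]
    contributors = {
        "example-org/etl-tools": [{"login": "dev_a", "contributions": 40}],
        "example-org/web-portal": [
            {"login": "dev_a", "contributions": 5},
            {"login": "dev_b", "contributions": 12},
        ],
    }
    for profile in build_profiles(repos, contributors).values():
        print(profile)
```

Profiles of this kind capture people who never appear as PIs or coauthors, which is the gap this deliverable targets.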

Deliverable #3: Visualize the REDCap GitHub Community


Point Person: David Eichmann - [email protected]

Problem: Identification of technical expertise for projects, particularly short-term projects, is challenging at best. Many of the software platforms commonly used by the CTSA consortium are supported by a complex mesh of interlocking software developers, repositories and organizations. Driven by remarks in the iDTF breakout relating to the challenges in identifying REDCap power developers, this deliverable explores, as an early milestone, what we can provide to the community in support of an expressed need.

Proposed Solution: Use the data harvested in Deliverable #2 to build out a visualization of the REDCap GitHub community. This is a standalone visualization for now. In the longer term (still within the 90-day timeframe), this will be integrated into the CD2H discovery service.

Output (3-6 months): A visualization of a commonly used software platform (REDCap) within the GitHub environment. For now, this is intended primarily as a means of identifying expertise relating to specific tools.

Benefit: Providing visualizations of these communities, together with the supporting profiles of these experts, will allow more ready identification of key capabilities and resources. REDCap was chosen as the initial visualization target due to its ubiquity in the CTSA consortium.
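One way to prepare harvested repository data for a community visualization is to derive a developer-to-developer network from shared repositories. The Python sketch below is a hypothetical illustration under that assumption, not the deliverable's actual design: the repository and developer names are invented, and the functions only compute the edge weights and a simple ranking that a separate graph-drawing tool could then render.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, List, Tuple


def co_contribution_edges(
    repo_contributors: Dict[str, List[str]],
) -> Dict[Tuple[str, str], int]:
    """Weight each developer pair by the number of repositories they share."""
    edges: Counter = Counter()
    for developers in repo_contributors.values():
        # Sort so each pair is stored once in a consistent order.
        for pair in combinations(sorted(set(developers)), 2):
            edges[pair] += 1
    return dict(edges)


def rank_developers(repo_contributors: Dict[str, List[str]]) -> List[Tuple[str, int]]:
    """Rank developers by how many repositories they contribute to."""
    counts = Counter(
        developer
        for developers in repo_contributors.values()
        for developer in set(developers)
    )
    # Most repositories first; ties broken alphabetically for stable output.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    # Invented example data; real input would come from harvested GitHub records.
    community = {
        "example-org/redcap-module-a": ["dev_1", "dev_2"],
        "example-org/redcap-module-b": ["dev_2", "dev_3"],
        "example-org/redcap-module-c": ["dev_1", "dev_2", "dev_3"],
    }
    print(rank_developers(community))
    print(co_contribution_edges(community))
```

The ranking gives a first, crude signal of who the “power developers” might be, while the weighted edges describe how tightly the community is connected.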

Software, Tools, and Algorithms Workgroup 4/17/18

Deliverable #1: Cloud Data Sharing Demonstration Project

Point Person: Kari Stephens ([email protected])

Problem: CTSAs use federated networks within and between themselves with little to no ability to interoperate, and access to aggregated and raw datasets is bottlenecked by human analysts. CTSAs are struggling to leverage cloud-based data sharing architectures to support federated ownership of datasets, both technologically and socially, through proper scalable governance solutions aimed at research use. Front end tools are generally not scoped to fit cloud-based back end solutions, creating the need for scalable APIs. CD2H is positioned to discover and disseminate data sharing solutions that improve internal CTSA hub and cross-CTSA hub data sharing, independent of data model and front end tooling.

Proposed Solution: Create a front end tool demonstration across at least two CTSA partner organizations, centralizing data in the cloud with federated governance, to serve as a scalable model for the CTSA Program. The following high-level steps are involved: 1) establish a relationship with a cloud vendor, 2) configure two harmonized datasets in the cloud environment, 3) deploy an existing front end tool with the cloud back end, with an eye towards engineering a scalable solution for future front end tools (i.e., exploring incorporation of an API layer), 4) establish a pathway to governance that allows limited use of the datasets in a combined fashion, and 5) release use of the front end tool to a limited set of users to access their own data.

Output (3-6 months): In three months, create a plan and begin foundational work, to include: beginning self-service tool adaptation, leveraging UW’s proprietary LEAF tool, using a Windows VM in the cloud (the tool is currently not cloud enabled or congruent with OMOP); selecting a cloud vendor and scoping services (likely AWS); establishing a demo dataset in the cloud (i.e., OHDSI’s synpuf dataset); laying out governance requirements (i.e., DUAs, MOUs, BAAs, etc.) to share the cloud environment and tool usage across 2 CD2H sites (UW and Wash U); and defining one or more key use cases to outline the value proposition of bringing the repositories together across sites (e.g., medication / LOINC related opioid data refinement).
In six months, create a demonstration cloud platform and self-service tool by: executing governance across the sites; uploading OMOP site proprietary data to the cloud from UW and Wash U; completing front end tool adaptation and making the tool available to a small user test group at each site (exploring scalable API solutions); and planning the next phase (e.g., demonstrating tool scalability to a different or expanded data model, solidifying API solutions, and an ACT interface for the front end tool).

Benefit: The demonstration project will provide opportunities for CD2H to discover a pathway to: 1) data sharing governance structures that could be a model for the CTSA Program, 2) cloud infrastructure that could be adapted for use by multiple CTSAs operating multiple data models, and 3) cloud vendor partnerships. It will scale an existing front end tool for self-service against an OMOP repository that can be co-opted by other CTSA institutes in the near term. This project will create groundwork for building scalable solutions that allow more nimble data sharing within and across CTSAs (i.e., API solutions for multiple self-service tools to operate against multiple shared data models like OMOP, i2b2, and FHIR).

Deliverable #2: Architecture and Proof-of-Concept for Synapse-CIELO Interoperability

Point Person: Philip Payne ([email protected])

Problem: The Sage Synapse platform provides an environment for the conduct of shared software, tool, and algorithm development tasks in either a collaborative or competitive (e.g., challenge) context. In a complementary manner, the CIELO platform provides an “app store” like environment in which a variety of stakeholders can share, discover, and interact with “bundles” consisting of analytical code or executable applications and associated data that can be used to demonstrate or otherwise evaluate the functionality of such items. In order to enable the efficient and accessible conduct of collaborative software, tool, and/or algorithm development efforts spanning multiple CTSA hubs, we will link these two platforms, such that well validated and high quality software, tools, and/or algorithms created in Synapse can be published to CIELO and subsequently made available for discovery, sharing, and reuse by CTSA hubs not immediately involved in initial development projects. This will ultimately create an efficient and virtuous open-source software lifecycle for these types of activities.

Proposed Solution: To address this problem, we will create the architecture for, and a proof-of-concept implementation of, an interoperability solution spanning the Synapse and CIELO platforms. This solution will support the “bundling” of software, tools, or algorithms, along with accompanying reference data sets and documentation, in the Synapse environment, followed by the semantic annotation (e.g., assignment of ontology-anchored descriptive “tags”) and “publication” of those bundles to the CIELO environment, such that they are discoverable by individuals not engaged in the shared task being conducted in Synapse. Underlying this activity will be the deployment of a CIELO beta version for use by the CTSA community, the specification of necessary API interfaces in both Synapse and CIELO, and the selection of appropriate ontologies to enable the annotation of said content.

Output (3-6 months): The following bulleted list outlines the major outputs of these efforts over the next 3-6 months:

● Selection of a cloud vendor for the CIELO public beta and deployment of that platform using a three-tiered architecture (development, testing, deployment);

● The release of CIELO beta to members of the Software, Tools, and Algorithms WG and the collection of end-user requirements for optimization of CIELO beta;

● Architecture, design, and testing of proof-of-concept API-level interfaces for the integration of Synapse and CIELO as described above;

● The selection of necessary and appropriate domain ontologies for the annotation of bundles;

● The loading of exemplary projects and bundles, contributed by CD2H sites/investigators, into Synapse and CIELO, as needed to evaluate the preceding outputs;

● End-user driven acceptance and usability testing of all of the aforementioned platforms and their existing and/or expanded functional components.
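
To make the idea of an annotated "bundle" more concrete, the sketch below models one as a small data structure that carries code, data, documentation, and ontology-anchored tags, and that refuses to produce a publication payload until at least one tag is present. This is a hypothetical illustration only: the class names, fields, JSON layout, and the placeholder ontology term are assumptions, and the sketch does not use or reflect the actual Synapse or CIELO interfaces.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class OntologyTag:
    ontology: str  # hypothetical ontology short name
    term_id: str   # placeholder term identifier
    label: str     # human-readable tag text


@dataclass
class Bundle:
    """Hypothetical bundle: code plus reference data and documentation."""
    name: str
    code_files: list
    data_files: list
    documentation: list
    tags: list = field(default_factory=list)

    def annotate(self, ontology, term_id, label):
        """Attach an ontology-anchored descriptive tag to the bundle."""
        self.tags.append(OntologyTag(ontology, term_id, label))

    def to_publication_payload(self):
        """Serialize the bundle as JSON; untagged bundles are rejected."""
        if not self.tags:
            raise ValueError("bundle needs at least one ontology tag before publication")
        return json.dumps(asdict(self), sort_keys=True)


# Made-up example bundle with a placeholder ontology term.
bundle = Bundle(
    name="example-risk-model",
    code_files=["model.py"],
    data_files=["reference_data.csv"],
    documentation=["README.md"],
)
bundle.annotate("EXAMPLE", "EX:0000001", "risk prediction")
payload = json.loads(bundle.to_publication_payload())
```

Requiring a tag before serialization is one simple way to keep untagged, and therefore hard to discover, bundles out of a catalog; the real annotation rules would depend on the ontologies eventually selected.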

Benefit: The implementation of interoperability between Synapse and CIELO, along with appropriate tools to bundle and annotate software, tools, and algorithms, along with corresponding data sets and documentation, will have multiple benefits, including but not limited to:

● enabling the conduct of shared task or challenge activities that span CTSA hubs and result in reusable software components, thus enabling greater economies of scale associated with those efforts;

● facilitating the publication of the output of such shared tasks or challenges to an easy-to-use and accessible “app store” like environment, thus encouraging discovery and uptake of said components by individuals and hubs not directly involved in the initial shared task or challenge events; and

● creating a modern and agile test-bed for collaborative and full “life-cycle” software, tool, and algorithm development efforts.

Deliverable #3: Documented framework for assessing quality of software tools that are released and branded CD2H.

Point Person: Sean Mooney - [email protected]

Problem: CD2H will deploy many software tools and technologies that will adhere to both branding and quality guidelines. A process is needed to ensure quality. We are developing a process for onboarding and putting into production CD2H-branded systems of high quality.

Proposed Solution: We will develop a process for putting into production software tools that are branded CD2H. This process will consist of strategy, process and checklist documents that will be used when accepting CD2H software. As part of this assessment, we will identify relevant software quality assurance frameworks that are applicable to CD2H development tools and platforms and integrate these into a set of standards that include:

○ Architectural review
○ Unit testing
○ User acceptance testing
○ Security
○ Quality and CD2H branding assessment

● Develop technical architecture and deployment plans for CD2H automated build and testing environment

● Establish and communicate standard product review processes and governance for CD2H related software products

Output (3-6 months): We will develop Standard Operating Procedures (SOPs) focusing on the process surrounding deployment of applications for CD2H.

Benefit: This will ensure that CD2H maintains software quality and branding standards across the consortium.
