Client Logo caBIG Vocabularies and Common Data Elements Workspace Meeting January 19, 2012 Harmonizing Pediatric Terminology Steven Hirschfeld, MD PhD Captain, U.S. Public Health Service Associate Director for Clinical Research Acting Director, National Children’s Study Eunice Kennedy Shriver National Institute of Child Health and Human Development


caBIG Vocabularies and Common Data Elements Workspace Meeting

January 19, 2012

Harmonizing Pediatric Terminology

Steven Hirschfeld, MD PhD
Captain, U.S. Public Health Service
Associate Director for Clinical Research
Acting Director, National Children’s Study
Eunice Kennedy Shriver National Institute of Child Health and Human Development


Why terminology?

• Outcomes in research are based on concepts
  – For example, man with pneumonia

• Concepts require terms that are specific to describe them and differentiate them from other concepts
  – For example, man with respiratory inflammation, influenza, tuberculosis or silicosis

• Terminology is the tool for precision to allow consistency and multiple analyses


What are current options?

• Systematized Nomenclature of Medicine (SNOMED): in use for medical records and research

• International Classification of Diseases (ICD): in use for epidemiology and reimbursement. Several versions are in use concurrently

• Medical Dictionary for Regulatory Activities (MedDRA): in use for therapeutic and diagnostic product development and registration

• Multiple subspecialty and niche terminologies


What is the dilemma?

• The major terminologies do not readily map to one another

• None are robust for child health and development, particularly at the youngest ages

• All are episodic in that they describe a single circumstance and do not relate concepts across a developmental time line


SNOMED: Systematized Nomenclature of Medicine
CDC: Centers for Disease Control and Prevention
AAP: American Academy of Pediatrics
CDISC: Clinical Data Interchange Standards Consortium
ICH-E11: International Conference on Harmonization
EPA: Environmental Protection Agency


What is new and different?

• The NICHD terminology system differs from other terminology systems by incorporating into all concepts a dimension of time and position along a developmental scale to relate concepts to one another


Rationale for terminology initiative

The NICHD has an ongoing effort to establish, through stakeholder consensus, a core library of consistent and harmonized pediatric terms. Reaching stakeholder consensus on terminology will benefit pediatric clinical researchers in the following ways:

• Provide the infrastructure necessary to compare and aggregate data and information.

• Prevent misinterpretation.

• Improve precision of data sharing.

• Permit more robust meta-analysis.

• Establish consistency with the health care delivery system across the NICHD’s clinical research portfolio, across the portfolios of other NIH Institutes/Centers, as well as with the broader research community.


Harmonization Process

• The terminology harmonization process involves identifying relevant concepts, identifying terms and definitions to describe the concepts, and graphically depicting the structure of and relationships between the concepts.


The NICHD Pediatric Terminology Metastructure provides a common information model associated with various child life stages

[Figure: metastructure diagram. Labels recovered from the figure are listed below; which relationship connects which pair of concepts is not recoverable from the extracted text.]

Child life stages: Embryonic; Fetal; Neonatal (0-27 days); Infant (0 days – 12 mos); Toddler (13-24 mos); Early Childhood (2-5 years); Middle Childhood (6-11 years); Early Adolescence (12-18 years); Late Adolescence (19-21 years)

Relationships: hasLifeStage, occursIn, resultsIn, associatedWith, evidenceOf, affects

Concepts: Person; Child (Newborn, Infant, Toddler, Adolescent); Child Life Stage; Healthcare Activity; Research Activity; Preventative Pediatric Care; Intervention or Procedure; Observation; Observational Results; Signs and Symptoms; Finding; Health Condition; Disorder; Disease; Organismal Process; Physical Development; Neurological Development; Motor Development; Cognitive Development; Behavioral Development; Behavior; Sleep; Nutrition
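
As a rough illustration of how the developmental time dimension might be attached to a concept, the sketch below looks up child life stages by age. It uses the age ranges printed on the slide; the names `LifeStage`, `LIFE_STAGES` and `life_stages_for`, the day-based conversion constants, and the treatment of each upper bound as "through the end of" that unit are all assumptions made for this example and are not part of the NICHD metastructure. The embryonic and fetal stages are omitted because the slide gives no ages for them.

```python
from dataclasses import dataclass

DAYS_PER_MONTH = 30.4375  # assumption: average month length in days
DAYS_PER_YEAR = 365.25    # assumption: average year length in days


def months(n):
    return n * DAYS_PER_MONTH


def years(n):
    return n * DAYS_PER_YEAR


@dataclass(frozen=True)
class LifeStage:
    name: str
    min_days: float  # inclusive lower bound, in days since birth
    max_days: float  # exclusive upper bound, in days since birth


# Ranges as printed on the slide. They overlap in places (for example
# Neonatal and Infant), so a lookup can return more than one stage.
LIFE_STAGES = [
    LifeStage("Neonatal", 0, 28),                          # 0-27 days
    LifeStage("Infant", 0, months(13)),                    # 0 days - 12 mos
    LifeStage("Toddler", months(13), months(25)),          # 13-24 mos
    LifeStage("Early Childhood", years(2), years(6)),      # 2-5 years
    LifeStage("Middle Childhood", years(6), years(12)),    # 6-11 years
    LifeStage("Early Adolescence", years(12), years(19)),  # 12-18 years
    LifeStage("Late Adolescence", years(19), years(22)),   # 19-21 years
]


def life_stages_for(age_days):
    """Return the names of every life stage whose range contains the age."""
    return [s.name for s in LIFE_STAGES if s.min_days <= age_days < s.max_days]
```

Returning a list rather than a single stage keeps the overlaps in the published ranges visible instead of silently picking one.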


Current Activities

• Focus on neonatal terminology because:
  – Largest gaps in major terminology schema
  – Existence of robust research networks

• Multi-Step Process
  – Identify general domains
  – Align concepts
  – Map concepts to a common resource
  – Agree on mapping
  – Publish map


Advantages to Current Process

• Retention of legacy tools

• Ability to pool data and perform meta-analyses

• Systematic identification of knowledge gaps and opportunities

• Path forward for further harmonization and consensus terminology incorporating model and framework


The National Children’s Study as a case study


Overview of the National Children’s Study (NCS)

• The NCS is mandated by the U.S. Congress and implemented by the Eunice Kennedy Shriver National Institute of Child Health and Human Development of the National Institutes of Health with advice and input from other NIH components, the Centers for Disease Control and Prevention and the Environmental Protection Agency

• It is a multi-year research study that will examine the effects of environmental influences on the health and development of more than 100,000 children across the United States, following them from before birth until age 21 years

• The goal of the Study is to improve the health and well-being of children and contribute to understanding the influence of various factors on health and disease


The NCS is an integrated system of activities

• The Study is an integrated system of activities that include
  – a pilot Study, which began in January 2009 with the goal of determining the feasibility, acceptability and cost of Study activities,
  – a Main Study scheduled to begin in calendar year 2012 to determine exposure-response relationships and
  – various substudies and formative research projects to examine specific methodological questions

• The pilot Study, also known as the Vanguard Study, will run for 21 years, enroll about 4000 families, and precede Main Study activities by about 3 years so that every aspect of the Main Study is field tested prior to scale up and implementation


NCS Vanguard Study Goals

• The Vanguard Study is designed to evaluate:
  – Feasibility (technical performance)
  – Acceptability (impact on participants, study personnel, and infrastructure)
  – Cost (personnel, time, effort, money)

• for each of the following:
  – Study recruitment
  – Logistics and operations
  – Study visits and study visit assessments


The National Children’s Study takes an informatics approach that is flexible to support innovation and accommodate evolving technology

• The approach to informatics for the National Children’s Study is informed by several trends in informatics, including:
  – modular architecture
  – use of standardized terminology with curation
  – semantic awareness
  – scalability
  – defined transmission standards
  – open architecture and open source platforms with development communities
  – vertical and horizontal integration of process
  – interoperability


The National Children’s Study informatics approach is standards-based

• During the Vanguard phase of the NCS, multiple informatics platforms and tools are in the field to determine the performance characteristics of each

• This approach entails the use of NCS specifications with which each potential informatics solution must comply, plus a systematic evaluation scheme to compare performance

• Use of such standards complements an interoperable approach that allows support for common interfaces and data exchange specifications

• Such standards include:
  – Data Documentation Initiative (DDI)
  – Clinical Data Acquisition Standards Harmonization (CDASH)
  – CDISC Operational Data Model (ODM)
  – ISO 11179 / 21090
  – CRoss-Industry Standard Process for Data Mining (CRISP-DM)


Standards + Modular = Flexibility

• The NCS emphasis on interoperable modular architecture means that any component of a data system can accurately and efficiently communicate with other data systems, while adhering to international data standards such as ones developed by the Clinical Data Interchange Standards Consortium (www.cdisc.org), such that its components can be reused or adapted for other studies


NCS Data Life Cycle

• From concept to archive, the NCS has a consistent approach to the data life cycle

• Description can be found in the NCS Data Life Cycle Concept of Operations

http://www.nationalchildrensstudy.gov/about/overview/Pages/NCS_concept_of_operations_04_28_11.pdf


The NCS incorporates Operational Data Elements

• Operational Data Elements are defined as data elements that capture the research process. In some contexts the term paradata is used

• The Operational Data Elements will allow systematic and objective evaluation of how the study is conducted and provide a basis for continuous improvement of efficiency

• The NCS developed a catalog or code list of about 500 Operational Data Elements for various study operations

• The NCS would like to contribute to the establishment of standards for Operational Data Elements


Metadata derived from harmonized terminology for the Study provides a layer of semantic interoperability across the data life cycle

• The NCS data life cycle follows data from study approach through data acquisition to data analysis, maximizing transparency and the understanding of NCS data

• Study data elements are guided by the NICHD Pediatric Terminology framework developed across many sources in the research, healthcare delivery, and standards development spectrum

• Consistent metadata will assure:
  – Semantic interoperability and compliance with international data standards
  – Syntactic interoperability between NCS information management systems as they exchange data in line with the data plan


Various semantic schemas are harmonized so that data may be accurately exchanged and analyzed among pre-existing systems

• A bridging schema, or metastructure, provides a mapping among concepts and codes from the individual terminology schemas used by networks or research endeavors

• The metastructure is publicly available; each source terminology schema remains the property of, and is maintained by, its original owner

• As the National Children's Study proceeds, all developmental stages through age 21 years will be covered in a Pediatric Terminology Metastructure and many fields of research will be included
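
The bridging idea on this slide can be pictured as a lookup table keyed by source terminology and code. The following is a minimal sketch under stated assumptions: the terminology names, codes and `PT:` concept identifiers are invented placeholders, and `BRIDGE`, `to_concept` and `equivalents` are hypothetical names, not the published metastructure or any real vocabulary content.

```python
# Hypothetical bridge: (source terminology, source code) -> shared concept ID.
# Every value here is a placeholder, not a real code from any vocabulary.
BRIDGE = {
    ("TERMINOLOGY_A", "A-001"): "PT:0001",
    ("TERMINOLOGY_B", "B-917"): "PT:0001",
    ("TERMINOLOGY_A", "A-002"): "PT:0002",
}


def to_concept(terminology, code):
    """Return the shared concept ID for a source code, or None if unmapped."""
    return BRIDGE.get((terminology, code))


def equivalents(terminology, code):
    """Return the other source codes that map to the same shared concept."""
    concept = to_concept(terminology, code)
    if concept is None:
        return []
    return sorted(
        key for key, value in BRIDGE.items()
        if value == concept and key != (terminology, code)
    )
```

Because the bridge stores only identifiers, the source terminologies themselves stay with their owners, which matches the ownership arrangement described above.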


A metadata model has emerged that meets the semantic and syntactic requirements of the Study

• The Data Documentation Initiative (DDI) is a metadata specification and international standard for describing data from the social, behavioral and economic sciences

• The DDI model is aligned with CDISC SDTM vocabularies and the CDISC BRIDG protocol definition

• The CDISC family of standards (including BRIDG, SDTM and CDASH) includes objects useful in describing health research that are not found in DDI

DDI Combined Life Cycle Model


Ongoing and Future Collaboration

• As the National Children’s Study evolves, it will continue to seek input and partnership from willing collaborators, and will adhere to and inform international data standards to the greatest extent possible

• The National Children’s Study aims to connect people, data and diverse systems to exchange and use information and to work together as a platform for innovative research and analysis, to ultimately improve the health and well-being of children


Summary and Plans

• The NCS uses standards from multiple sources to ensure an open source, sustainable and interoperable informatics environment

• The NCS is field testing several tools and platforms concurrently in the Vanguard (pilot) phase, in a systematic fashion, to determine performance characteristics

• The integration of multiple standards and models allows exploration of metadata analyses, operational data elements and project management across all study activities

• The NCS will publicly disseminate findings as rapidly as possible and actively seeks collaborators


For more information on the National Children’s Study

• Please visit the main website: http://www.nationalchildrensstudy.gov/

• Organizations, groups or individuals that are interested in contributing to the effort or learning more are encouraged to contact:

Steven Hirschfeld, MD [email protected]


Appendix

More information: NCS data lifecycle


The DDI-based metadata repository supports cyclical processes for both pre-analytical datasets and analytical datasets

• Pre-analytical datasets can be produced and repurposed for new uses such as support for additional performance metrics or linkage with extant datasets

• Data analysis may uncover recruitment, retention and/or compliance problems that lead to protocol changes


The DDI-based metadata repository for the NCS is an end-to-end solution that allows scoping and incremental development

• The CRoss-Industry Standard Process for Data Mining (CRISP-DM) can be used to standardize project management and to maximize data transparency and eventual analysis of NCS data

[Figure: CRISP-DM process diagram. Phase labels and activity text recovered from the figure are listed below; the assignment of activities to phases is not recoverable from the extracted text.]

Phases: Business Understanding; Data Understanding; Data Preparation; Analysis Preparation; Analysis Execution; Results Evaluation

Activities, in the order they appear:

• Identify key business objectives
• Identify key constraints & assumptions
• Translate business objectives into metrics or questions
• Identify potential data sources
• Assess suitability of each data source for analysis
• Extract data
• Describe data
• Explore data
• Assess data quality
• Match and merge data
• Clean data
• Reformat data for analysis
• Select analytic algorithms
• Code algorithms
• Validate algorithms
• Validate assumptions
• Execute algorithms
• Capture and interpret results
• Iteratively improve any discrepancies / shortcomings
• Translate results into business metrics or answers to questions
• Present results including detailed documentation of entire process


DDI covers the entire NCS data lifecycle from protocol definition and sampling strategy through data collection, analysis, and distribution


In the NCS data lifecycle, forms and questionnaires are first specified

• Domain groups define measures to capture study operations and the child development life cycle

• Forms and questionnaires are specified around the measures

• These specifications occur early in the NCS data life cycle and are captured by the NCS end-to-end metadata repository

An NCS Incident Report is captured in the DDI model


Data elements corresponding to the questions are typed

• Form and questionnaire code lists are typically composed without regard to common data elements and standard code lists

• Instead they are responsive to context and the exigencies of form and questionnaire design

• This has led us to the approach that corresponds to DDI classifications in which there are categories and codes or, in other words, master code lists and specific code lists


Master and specific code lists are maintained for internal (Study-specific) and external harmonization

• Form and questionnaire specific code lists almost always include missing values

• In NCS the Incident Report is not restricted to adverse events, but encompasses other classifications

• All of these classifications are captured in the Incident Type code list which goes with the question “What category best describes the incident (mark one)?”


Specific code lists are the product of master code lists

• A code list for a question is compositional

• The composition of question specific code lists typically includes a subset from a category/master collection of missing values

• Mixing and matching questions across many categories leads to better comparison of answers across forms and across time in a longitudinal study
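
One way to picture the compositional code lists described on this slide is as a question-specific list assembled from subsets of two master lists, one for categories and one for missing values. This is a sketch only: the codes and labels in `MASTER_INCIDENT_TYPES` and `MASTER_MISSING`, and the function name `compose_code_list`, are hypothetical and do not reproduce the actual NCS code lists.

```python
# Hypothetical master lists (code -> label); not the actual NCS content.
MASTER_INCIDENT_TYPES = {
    1: "Example category A",
    2: "Example category B",
    3: "Example category C",
}
MASTER_MISSING = {
    -1: "Example missing value: refused",
    -2: "Example missing value: unknown",
    -3: "Example missing value: not applicable",
}


def compose_code_list(master_categories, master_missing,
                      category_codes, missing_codes):
    """Build a question-specific code list from subsets of two master lists."""
    specific = {}
    selections = (
        (master_categories, category_codes),
        (master_missing, missing_codes),
    )
    for source, wanted in selections:
        for code in wanted:
            if code not in source:
                raise KeyError(f"code {code!r} is not in the master list")
            if code in specific:
                raise ValueError(f"code {code!r} was selected more than once")
            specific[code] = source[code]
    return specific


# A specific list for one question: two categories plus one missing value.
incident_type_codes = compose_code_list(
    MASTER_INCIDENT_TYPES, MASTER_MISSING,
    category_codes=[1, 2], missing_codes=[-1],
)
```

Because every specific list draws its codes and labels from the same masters, answers recorded on different forms or at different visits stay comparable.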


Metadata tagging provides a path for mapping to external references

• Data elements are associated with concepts and, through unique identifiers, with external references

• External references can be made to an ISO 21090 Concept Descriptor and/or an OpenEHR archetype

• In these external references each value of a code list might be linked to a concept

• This is our path to ISO 11179 compliance and, in the case of incident type, an NCS code list that combines code lists from many vocabularies
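
The tagging described on this slide can be sketched as a data element that carries a reference to a concept, plus optional references for individual code list values. The classes below are an assumption-laden illustration: `ConceptRef` is only loosely inspired by the idea of a coded concept reference and does not implement ISO 21090 or OpenEHR, and every identifier and value shown is a hypothetical placeholder.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ConceptRef:
    """Simplified external reference; field names are illustrative only."""
    code: str          # identifier of the concept in the external system
    code_system: str   # identifier of the external vocabulary
    display_name: str = ""


@dataclass
class DataElement:
    element_id: str                       # unique identifier of the element
    question: str
    concept: Optional[ConceptRef] = None  # concept for the element as a whole
    value_concepts: Dict[int, ConceptRef] = field(default_factory=dict)

    def link_value(self, code: int, ref: ConceptRef) -> None:
        """Attach an external concept reference to one code list value."""
        self.value_concepts[code] = ref

    def unmapped_values(self, code_list: List[int]) -> List[int]:
        """Return the code list values that have no external reference yet."""
        return [c for c in code_list if c not in self.value_concepts]


incident_type = DataElement(
    element_id="NCS.EXAMPLE.INCIDENT_TYPE",  # hypothetical identifier
    question="What category best describes the incident (mark one)?",
)
incident_type.link_value(
    1, ConceptRef("EX-0001", "EXAMPLE-VOCAB", "Example concept")
)
```

A helper such as `unmapped_values` gives a simple way to see which values of a combined code list still lack an external reference.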


The NCS data life cycle reaches the production of analysis datasets that conform to the CDISC ODM interchange standard

• Variables are packaged into logical records

• Physical dataset definitions are constructed that document the various datasets researchers will request and receive
