
Multidisciplinary Data Operations Approach · 2013-05-28


Multidisciplinary Data Operations Approach

Page 1

May 17, 2013

Multidisciplinary Data Operations Approach for Clinical/Translational Research

Brad H. Pollock, MPH, PhD

Professor and Chairman, Department of Epidemiology and Biostatistics, School of Medicine, University of Texas Health Science Center at San Antonio

Disclosures (Brad Pollock)

• No financial conflicts of interest

• Director, Biomedical Informatics Core, Clinical Translational Science Award (UL1 RR025767)

• Director, Biostatistics, Epidemiology, Research Design (BERD) Core (UL1 RR025767)

• Director, Biostatistics and Informatics Shared Resource, Cancer Therapy & Research Center, University of Texas Health Science Center at San Antonio (P30 CA054174)

• Principal Investigator, Children’s Oncology Group (COG) Community Clinical Oncology Program (CCOP) Research Base (U10 CA095861)

• Clinical Science Course Faculty, Society of Clinical Research Associates (SoCRA)

• Data quality

• Venues for data operations

• Data management

• Tools for data operations

• Data plans

– General considerations

– Data plans

• Local approach in San Antonio

• Other considerations


“May 6, 2010 Flash Crash”

• On May 6, 2010, a mishandled trading order temporarily sent US stocks plummeting:

– The Dow Jones Industrial Average (DJIA) plunged about 1,000 points (about 9%), only to recover those losses within minutes.

– It was the 2nd largest point swing and the biggest one-day point decline, 998.5 points, on an intraday basis in DJIA history.

• A small error strikingly upset an important data-heavy system.

DATA QUALITY

The first critical element in the reproducible research chain

Criteria for Reproducible Research*

Research Component: Requirement

Data: Analytical data set is available.

Methods: Computer code underlying figures, tables, and other principal results is made available in a human-readable form. In addition, the software environment necessary to execute that code is available.

Documentation: Adequate documentation of the computer code, software environment, and analytical data set is available to enable others to repeat the analyses and to conduct other similar ones.

Distribution: Standard methods of distribution are used for others to access the software, data, and documentation.

*from Peng, Dominici, Zeger. Am J Epidemiol 2006;163:783–789

Little emphasis on how we get to this point!



HOW DO YOU GET GOOD DATA FOR CLINICAL TRANSLATIONAL RESEARCH?

You need outstanding research information technology (IT) resources and a multidisciplinary team, all of whom are deeply invested in the research.

Data Operations “Truths”

• Research information technology (IT) is a vital and non-trivial core resource

• Many examples of investigations being derailed by IT deficits

• Quality and cost-efficiencies have been gained by excellent research IT

Data Operations “Truths” (continued)

• Good IT operations are very resource intensive

• Requires knowledgeable and dedicated staff

• Sustained financial support is a prerequisite for long-term ROI


Where do data operations take place to support biomedical research?

Data Operation Venues

• Biostatistics units/cores

• Central university IT

• Hospital IT departments

• Other academic departments, e.g., computer science, biomedical informatics

Who should define, manage, and oversee clinical translational research data operations?


Who Should Coordinate Data Operations?

• Biostatistics has played a central role in managing NIH Data Coordinating Centers

• With some exceptions, computation in biostatistics has been heavily focused on analysis.

Biostatistics Core Functions

• Design studies

– Clarify hypotheses and objectives

– Select study design

– Define data elements/endpoints

– Sample size/power calculations

– Develop analytic plans

• Monitor studies

– Efficacy/futility

– Safety/quality

• Analyze studies

– Statistical analysis

– Writing reports/manuscripts

[Side label spanning the list: Computation]

Who Should Coordinate Data Operations? (continued)

• With the advent of the CTSAs, coordination has broadened to include other disciplines, including biomedical informatics.


Biomedical Informatics

Focus areas:

– Ontologies

– Vocabulary/terminology

– Machine learning, human-machine interfaces

– Natural language processing

– Electronic health records

– Data repositories

– Tool development (e.g., CTSA IKFC)

Who should define, manage, and oversee clinical translational research data operations?

It depends…

Answer:

• No-brainer if you:

– Are an NCI cooperative group statistician

– Run an NIH-funded Data Coordinating Center (DCC)

– Direct a structured biostatistics core, e.g., Centers for AIDS Research (CFAR), Alzheimer’s Disease Core Centers, etc.

• Often a requirement of the RFA


Answer:

• Less clear if you:

– Direct a CTSA BERD unit

– Direct a CCSG P30 Biostatistics Core

– Direct an institutional biostatistics support unit with a separate informatics group

– Are part of a National Children’s Study center

Data Operation Venues

• Biostatistics units/cores

• Central university IT

• Hospital IT departments

• Other academic departments, e.g., computer science, biomedical informatics

Some of these choices place great faith in others to appropriately manage study data

Biostatistics and Information Management

• A symbiotic relationship:

– Biostatistics helps define the “What,” i.e., endpoints (What do we collect?)

– Information Management defines the “How,” i.e., how to manage data operations (How do we manage?)


Biostatistics vs. Informatics Perspectives

Medical Informatics

1. Tool development

2. Messy data from integrated data repositories

– EHRs

3. Hypothesis-generating orientation

Biostatistics

1. Develop data models to address specific study hypotheses

2. Cleaner data sources:

– Registries

– Protocol-generated

3. Hypothesis-testing orientation

Optimal Configuration

• Who should be at the table?

– Biostatistics/epidemiology

– Research IT

– Data management

– Regulatory personnel

– Project managers (PMP)

– Biomedical informatics

• Ideally, use a multidisciplinary approach

Data Management

• The development, execution and supervision of plans, policies, programs and practices that control, protect, deliver, and enhance the value of data and information assets*

*Data Management Association, Data Management Body of Knowledge (DAMA-DMBOK), 2008


Who’s Involved in Data Management

• Subjects / Participants / Patients

• Investigators / Clinicians / Research Staff / Clinical Staff

• Statisticians / Epidemiologists / Analytic Staff

• Central IT / CIO / ISO / SNO

• Research IT / Analysts / Programmers / DBAs

End-to-End Process

Data Management within the Research Process

[Diagram labels: Protocol Development; Data Management Process; Final Statistical Analysis; IT Involvement]

Data Management Changing Within the Research Process

[Diagram labels: Protocol Development; Data Management Process; Final Statistical Analysis]

• Data management considerations are beginning to influence the science

• Storage and long-term utilization affect the data long after the protocol’s final analysis


Data Management Responsibilities

• Maintain a functional, flexible, scalable, cost-efficient resource to handle a variety of data (demographic, clinical/laboratory/imaging, bioinformatics, environmental)

• Data quality and compliance with regulatory requirements

– HIPAA

– 21 CFR Part 11

– FISMA

• Planning for:

– Long time horizons (e.g., National Children’s Study)

– Interoperability and federation (e.g., caTissue Suite, caGRID, OpenMDR)

TOOLS FOR DATA OPERATIONS

Initial Planning Process

• What is an investigator to do about the data operations?

– The investigator only wants to be able to easily do their research.

– They don’t want a lot of barriers put in the way.

• Solution: Let’s use Excel…


“Database Management” Software

Microsoft Excel

Excel Characteristics

• Good Points

– Easy to work with

• Quick start-up, low costs

– Potentially can force data types

• Bad Points

– Too easy to work with

• Doesn’t require you to clearly define your needs

– “Interprets” data

• Will not allow you to override
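
To illustrate the “interprets data” problem: a spreadsheet may silently turn an identifier with leading zeros into a number, or a gene symbol such as SEPT2 into a date. The following is a minimal Python sketch, assuming a hypothetical CSV export whose column names (`subject_id`, `gene`, `visit_date`) and values are made up for illustration. It reads every field as text so that nothing is reinterpreted on load.

```python
import csv
import io

# Hypothetical export; column names and values are illustrative only.
RAW = "subject_id,gene,visit_date\n00123,SEPT2,2013-05-17\n00456,MARCH1,2013-05-18\n"


def read_as_text(handle):
    """Read CSV rows without type inference: every field stays a string."""
    return list(csv.DictReader(handle))


rows = read_as_text(io.StringIO(RAW))
for row in rows:
    # Leading zeros and gene symbols are preserved exactly as entered.
    print(row["subject_id"], row["gene"], row["visit_date"])
```

Any conversion to numbers or dates can then be done explicitly and deliberately, field by field, rather than being guessed by the tool.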


“Database Management” Software


REDCap Characteristics

• Good Points

– Easy to set up, not resource intensive

– Requires a data dictionary

– Central server model (security & data integrity)

– Web front-end

• Less than Good Points

– Display interface not very customizable

• Layout, skip patterns, etc.

– Each application is a separate instance

– Adverse events monitoring difficult

– Not truly relational

– No data curation, electronic data collection only
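
The value of requiring a data dictionary can be sketched in a few lines of Python. The dictionary layout below (field name mapped to a type and validation settings) is a hypothetical simplification for illustration only; it is not REDCap's actual data dictionary format, and all field names are assumptions.

```python
from datetime import date

# Hypothetical data dictionary: field -> (type name, validation settings).
DATA_DICTIONARY = {
    "subject_id": ("text", {"required": True}),
    "age_years": ("integer", {"min": 0, "max": 120}),
    "sex": ("choice", {"choices": {"F", "M", "U"}}),
    "enroll_date": ("date", {}),
}


def validate_record(record):
    """Return a list of human-readable problems found in one record."""
    problems = []
    for field, (kind, rules) in DATA_DICTIONARY.items():
        value = record.get(field, "")
        if value == "":
            if rules.get("required"):
                problems.append(f"{field}: missing required value")
            continue
        if kind == "integer":
            try:
                number = int(value)
            except ValueError:
                problems.append(f"{field}: {value!r} is not an integer")
                continue
            if number < rules["min"] or number > rules["max"]:
                problems.append(f"{field}: {number} is out of range")
        elif kind == "choice":
            if value not in rules["choices"]:
                problems.append(f"{field}: {value!r} is not an allowed choice")
        elif kind == "date":
            try:
                date.fromisoformat(value)
            except ValueError:
                problems.append(f"{field}: {value!r} is not an ISO date")
    # Fields not declared in the dictionary are flagged, not silently kept.
    for field in record:
        if field not in DATA_DICTIONARY:
            problems.append(f"{field}: not defined in the data dictionary")
    return problems


good = {"subject_id": "00123", "age_years": "7", "sex": "F", "enroll_date": "2013-05-17"}
bad = {"subject_id": "", "age_years": "seven", "sex": "X",
       "enroll_date": "05/17/2013", "notes": "n/a"}
print(validate_record(good))
for problem in validate_record(bad):
    print(problem)
```

The design point is that the definition of each field exists before any data are entered, so errors are caught at collection time rather than at analysis time.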

“Database Management” Software

How Are Data Handled?

• Paper forms (CRFs) and keypunch

• Client-server DBMS and networked DBMS

• Web front-end DBMS

– Pediatric Oncology Group replaced paper in 1998

• Web front-end

• Oracle back-end

• Clinical Trials Management System (CTMS)

Advancing Technology


Clinical Trials Management Systems (CTMS)

IMPACT® CTMS

• Uses:

– Planning, preparation, monitoring and reporting of clinical trials

– Administrative/financial capabilities

– Electronic case report forms (eCRFs)

– ± Interoperate with other systems

IDEAS

DATA PLANS FOR CLINICAL TRANSLATIONAL RESEARCH

Researchers

I.T. Staff

Problems Can Start Early Without Statistical Input


Required Data Elements for a Study Plan

• Design studies

– Clarify hypotheses and objectives

– Select study design

– Define data elements/endpoints

– Sample size/power calculations

– Develop analytic plans

• Monitor studies

– Efficacy/futility

– Safety/quality

• Analyze studies

– Statistical analysis

– Writing reports/manuscripts

[Side label spanning the list: Computation]
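
As one concrete instance of the “Sample size/power calculations” item, a commonly used normal-approximation formula for comparing two means with equal group sizes is n = 2 (z_{1-alpha/2} + z_{1-beta})^2 sigma^2 / delta^2 per group. The Python sketch below illustrates that formula under stated assumptions (two-sided test, equal and known standard deviation in both groups). It is not the procedure of any particular study, and the function and parameter names are hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Normal-approximation sample size per group for a two-sided,
    two-sample comparison of means.

    delta: smallest difference in means worth detecting
    sigma: assumed common standard deviation
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = std_normal.inv_cdf(power)           # quantile for the target power
    n = 2 * (z_alpha + z_beta) ** 2 * sigma ** 2 / delta ** 2
    return math.ceil(n)  # round up to a whole participant


print(n_per_group(delta=0.5, sigma=1.0))
```

Exact methods based on the t distribution give slightly larger numbers, so a result like this is best treated as a planning starting point.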

Data Plan: Study Design

• Select study design

– Prospective assessment

– Retrospective assessment

– Cross-sectional assessment

• Define data elements/endpoints

– Demographics

– Baseline characteristics (clinical, laboratory, imaging)

– Interventional characteristics

– Outcomes (clinical, laboratory, imaging, PRO)

Data Plan: Monitoring

• Efficacy/futility

– Interim stopping rules

– Group sequential methods

– Bayesian approaches (e.g., adaptive randomization)

• Safety/quality

– Safety stopping rules

– Ongoing remediation of quality problems

• Visualization


Data Plan: Analysis

• Statistical analysis

– Data access

• Direct: SQL (e.g., PROC SQL)

• Export to other format (e.g., Stat/Transfer 11)

– Writing reports/manuscripts

• Print

• Web/Internet

– Data sharing
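
The “Direct: SQL” access path can be sketched with Python's built-in `sqlite3` module standing in for the study database. The table name `enrollment` and its columns are hypothetical, and a production system would use its own database driver and credentials.

```python
import sqlite3


def mean_outcome_by_arm(conn):
    """Pull an analysis-ready summary straight from the database with SQL."""
    query = """
        SELECT arm, COUNT(*) AS n, AVG(outcome) AS mean_outcome
        FROM enrollment
        GROUP BY arm
        ORDER BY arm
    """
    return conn.execute(query).fetchall()


# In-memory database with made-up rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE enrollment (subject_id TEXT PRIMARY KEY, arm TEXT, outcome REAL)"
)
conn.executemany(
    "INSERT INTO enrollment VALUES (?, ?, ?)",
    [("001", "A", 1.0), ("002", "A", 3.0), ("003", "B", 4.0)],
)
summary = mean_outcome_by_arm(conn)
print(summary)
conn.close()
```

Querying the source database directly avoids the intermediate export files that the second option relies on, at the cost of requiring database access for the analyst.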

Human Studies Database Project

Human Studies Database (HSDB) Project

• A CTSA multi-institutional project to federate study design descriptors and results of the human research portfolio over a grid-based architecture.

• Uses:

– Inform the design of new studies

– Facilitate systematic reviews/meta-analyses

– Identify potential collaborators

– Aid in research management


Ontology of Clinical Research (OCRe)

• HSDB was developed using the Ontology of Clinical Research (OCRe)

• Focus on:

– Study design (Study Design Classifier), interventions, exposures, and analytic methods of individual-human studies

– Any design type, for any intent, in any clinical domain

– Federation across CTSAs

HOW DID WE ADDRESS THESE CHALLENGES IN SAN ANTONIO?

University of Texas Health Science Center at San Antonio


University of Texas Health Science Center at San Antonio BERD Unit

• Initial sit down with a faculty biostatistician or epidemiologist

• Follow-up meeting will add:

– Information Services Director or Co-Director (IDEAS)

– Masters-level public health staff to help co-develop the REDCap application

• We have been doing this for 10+ years.

INFORMATICS DATA EXCHANGE AND ACQUISITION SYSTEM (IDEAS)

University of Texas Health Science Center at San Antonio

Complexity Encapsulation

• Object-based templates

• Common business objects

• Custom object libraries

• Standard interfaces

[Diagram labels: User Interface; Data; Business Rules; Web Programmers; Domain experts and Informatics analysts; DBA]

Informatics Data Exchange and Acquisition System

The IDEAS Framework: An interwoven structure of interdependent components

[Diagram labels: Security Application; Data Collection (Web, Interface, Batch); Database; Pathology & Genetics; Security; Protocols; Patient]

IDEAS: Three Tier MVC Framework


IDEAS Features

• Application Development: Custom Meta-data generator

• Optional asynchronous operation

• Interoperability:

– Shibboleth: Federated Single Sign-On Authentication Service

– Patient Study Calendar (PSC)

– Qualtrics

– caTissue Suite / Freezerworks

– Velos*

*future feature

Other Considerations of UT HSC San Antonio Data Operations

• Standard Operating Procedures (SOPs)

• Disaster recovery

• Version control (Surround SCM)

• Audit

• Separation of duties – DBAs, analysts, statisticians

• Electronic sign-offs (Editor, Monitor, PI)

• Honest broker role (PHI-related)

Unique Challenge:

Integrating Practice-Based Research Networks (PBRNs) into IDEAS

1. StarNet (family practice) PBRN

2. Psychiatry PBRN

3. Dental PBRN

4. VA PBRN


Integrating Omics Data: Genetics and Biology of Liver Tumorigenesis in Children*

• Bioinformatics and Biostatistics Core (BIBSC)

• Brought together disparate data:

– Pediatric Oncology Group, Children’s Cancer Group, Children’s Oncology Group, the Cooperative Human Tissue Network (CHTN), Baylor pathology reference lab

– Bioinformatics data from a range of high-throughput platforms: Illumina, Affy, NextGen Sequencing, etc.

– Demographic, clinical and outcome information

*Cancer Prevention Research Institute of Texas – MIRA RP101195

Need for Ongoing Quality Improvement

This is an important enough topic that Chris Lindsell and I launched the Data Management and Quality (DMQ) Working Group for the Biostatistics/Epidemiology/Research Design (BERD) Key Function Committee (KFC).


Other Considerations for Data Operations

• Future-proofing database designs for repurposing

• Open source vs. commercial solutions

• Imaging informatics

• mHealth technologies

Summary

• Computation technologies are at the heart of data-driven research

• High quality data are fundamental to reproducible research and study validity

• Good data management → High quality data

• High quality data → Analytic quality

Summary (continued)

• Multidisciplinary team should be involved:

– Biostatistician/epidemiologist

– Research IT

– Data management

– Regulatory personnel

– Biomedical informatics, computer science


Summary (continued)

• Core competencies in data operations should be a requirement for academic clinical translational research training programs.

• Technologies for managing data are changing faster than technologies for analysis.

• Selection of software tools depends on their capability as well as sustainability over the long haul.

“Ultimate Goal” To Advance Clinical Translational Research

• Seamless integration of data operations using complementary innovative technologies.

Promise for the Future

• The gap between clinical data and research data will be bridged

– CTMS and EHR interoperability

• Precision medicine will be driven by information infrastructure

• Disparate data sources will be federated

– e.g., I-SPY 2 Breast Cancer Trial


Promise for the Future (continued)

• Data liquidity: the rapid, seamless, secure exchange of useful, standards-based information among authorized individual and institutional senders and recipients.

• Information systems will simultaneously address research, financial, and regulatory needs

Thank you very much