NEW ENHANCEMENTS IN THE SYNTHETIC DERIVATIVE AND WHAT THAT MEANS FOR THE RESEARCHER
Jacqueline Kirby, June 7th, 2013


Page 1: New Enhancements in the Synthetic Derivative and What that Means for the Researcher

NEW ENHANCEMENTS IN THE SYNTHETIC DERIVATIVE AND WHAT THAT MEANS FOR THE RESEARCHER

Jacqueline Kirby
June 7th, 2013

Page 2: New Enhancements in the Synthetic Derivative and What that Means for the Researcher

Resources

• StarPanel
  • Identified clinical data; designed for clinical use
• Record Counter
  • De-identified clinical data; sophisticated phenotype searching
  • Returns a number: record counts and aggregate demographics
• Synthetic Derivative
  • De-identified clinical data; sophisticated phenotype searching
  • Returns record counts AND de-identified narratives, test values, medications, etc., for review and creation of study data sets
• Research Derivative
  • Identified clinical data
  • Programmer (human) supported
• BioVU
  • Genotype data
  • De-identified clinical data; sophisticated phenotype searching
  • Able to link phenotype information to biological sample

Page 3: New Enhancements in the Synthetic Derivative and What that Means for the Researcher

What is the RecordCounter?

The Synthetic Derivative Record Counter (RecordCounter) provides exploratory figures and counts to members of the VU research community for research planning and feasibility assessment.

• Available to ANYONE with a VUNET ID
• Allows the user to input basic medical data, such as ICD-9 codes or text keywords (e.g., lung cancer), as well as demographic information, and then search the Synthetic Derivative database to determine the approximate number of records that meet those criteria

Page 4: New Enhancements in the Synthetic Derivative and What that Means for the Researcher

What is the Synthetic Derivative (SD)?

• Rich, multi-source database of de-identified clinical and demographic data
• User interface tool that can be used for access and analysis
• Services are available to help deliver results for non-standard queries (temporal queries, controls matching, etc.)
• Contains ~2.3 million records
  • ~1 million with detailed longitudinal data
  • averaging 100k bytes in size
  • an average of 27 codes per record
• Records are updated over time and are current through December 2012
  • Soon to be current through 5/31/2013

Page 5: New Enhancements in the Synthetic Derivative and What that Means for the Researcher

The RecordCounter vs. the SD

The RecordCounter: Users can use search criteria to return exploratory counts. (The results returned are not exact and are meant for a high-level assessment of the available data.)

The SD: Users can use search criteria to return exact counts and the associated longitudinal data for review.

Page 6: New Enhancements in the Synthetic Derivative and What that Means for the Researcher

What is BioVU?

• The move towards personalized medicine requires very large sample sets for discovery and validation
• BioVU: biobank intended to support a broad view of biology and enable personalized medicine
• Contains de-identified DNA extracted from leftover blood after clinically-indicated testing of Vanderbilt patients who have not opted out
• Linked to Synthetic Derivative: de-identified EMR
• Current sample number: 166,397
  o 147,292 adult samples
  o 19,220 pediatric samples

Page 7: New Enhancements in the Synthetic Derivative and What that Means for the Researcher

Synthetic Derivative vs. BioVU

Page 8: New Enhancements in the Synthetic Derivative and What that Means for the Researcher

Synthetic Derivative Data Types

Documents, such as:
• Clinical Notes
• Discharge Summaries
• History and Physicals
• Problem Lists
• Surgical Reports
• Progress Notes
• Letters

Diagnostic Codes, Procedural Codes
Forms (intake, assessment)
Reports (pathology, ECGs, echocardiograms)
Clinical Communications
Lab Values and Vital Signs
Medication Orders
TraceMaster (ECGs)
Tumor Registry

Page 9: New Enhancements in the Synthetic Derivative and What that Means for the Researcher

Technology + policy

De-identification
• Derivation of a 128-character identifier (RUI) from the MRN, generated by the Secure Hash Algorithm (SHA-512)
• HIPAA identifiers removed using a combination of custom techniques and established de-identification software

Date Shift
• Our algorithm shifts the dates within a record by a time period (up to 364 days backwards) that is consistent within each record, but differs across records

Restricted access & continuous oversight
• Access restricted to VU; not a public resource
• IRB approval for study (non-human)
• Data Use Agreement
• Audit logs of all searches and data exports
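To make the two technical steps on this slide concrete, here is a minimal Python sketch. It is not the SD's actual implementation. The slide states only that the RUI is a 128-character value derived from the MRN with SHA-512, and that dates move back by up to 364 days, consistently within a record. Everything else is an assumption made for illustration: the function names, the secret prefix, and the choice to derive the shift from the hash.

```python
import hashlib
from datetime import date, timedelta

MAX_SHIFT_DAYS = 364  # upper bound stated on the slide


def research_id(mrn: str, secret: str = "hypothetical-site-secret") -> str:
    """Return a 128-character hex identifier for an MRN.

    SHA-512 produces 512 bits, which is 128 hexadecimal characters.
    The secret prefix is an assumption, not something the slide describes.
    """
    return hashlib.sha512((secret + mrn).encode("utf-8")).hexdigest()


def shift_days(rui: str) -> int:
    """Pick a per-record backward shift of 1..364 days.

    Deriving the shift from the identifier is an assumption; it simply
    guarantees that every date in the same record gets the same shift.
    """
    return int(rui, 16) % MAX_SHIFT_DAYS + 1


def shift_date(original: date, rui: str) -> date:
    """Move a date backwards by the record's fixed shift."""
    return original - timedelta(days=shift_days(rui))


if __name__ == "__main__":
    rui = research_id("0001234")  # made-up MRN
    admit, discharge = date(2012, 3, 1), date(2012, 3, 15)
    print(len(rui))
    print(shift_date(discharge, rui) - shift_date(admit, rui))
```

Because the shift is fixed per record, intervals between events in one record (here, the 14 days between admission and discharge) are preserved even though the absolute dates change.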

Page 10: New Enhancements in the Synthetic Derivative and What that Means for the Researcher

The New SD…

Synthetic Derivative 3.0 was launched on February 25, 2013. SD 3.0 leverages the power of an IBM Netezza data warehouse appliance to provide faster, near-immediate counts as users build their search criteria, plus new review features that include enhanced data visualization and covariate annotation capabilities.

SEARCH: Counts are provided for each search item in real time as you build your algorithm, letting you adjust your criteria immediately. Modifiers for ICD-9 codes allow searches to require 2 or more instances of a code.

REVIEW: Filter and highlight documents, medications and labs to make review efficient.

ANNOTATE: Create your own set-based annotations that are sharable across the study team.

Page 11: New Enhancements in the Synthetic Derivative and What that Means for the Researcher

General algorithm for determining a phenotype

• Definition of phenotype for cases and controls is critical
  • May require consultation with experts
• Basic understanding of data elements, and of the uses and limitations of particular data points, is important
• Reviewing records manually to make case determination (or even to calculate the PPV of the search methodology) will be somewhat time consuming
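The PPV (positive predictive value) mentioned above is the fraction of records returned by a search that manual review confirms as true cases. A minimal Python sketch, assuming the review results are held as a mapping from a subject ID to the reviewer's call (the function name and data layout are illustrative, not part of the SD):

```python
from typing import Mapping


def positive_predictive_value(review: Mapping[str, bool]) -> float:
    """PPV = confirmed cases / all records the search returned and a reviewer checked.

    `review` maps each reviewed subject ID to True (confirmed case)
    or False (not a case on manual review).
    """
    if not review:
        raise ValueError("no reviewed records")
    confirmed = sum(1 for is_case in review.values() if is_case)
    return confirmed / len(review)


if __name__ == "__main__":
    # Made-up review of four records returned by a search.
    calls = {"S1": True, "S2": True, "S3": False, "S4": True}
    print(positive_predictive_value(calls))  # 3 confirmed of 4 reviewed -> 0.75
```

In practice the reviewed records would be a random sample of the search results, so the estimate reflects the whole set.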

Page 12: New Enhancements in the Synthetic Derivative and What that Means for the Researcher

The problem with ICD9 codes

• ICD9 codes give both false negatives and false positives
• False negatives:
  • Outpatient billing limited to 4 diagnoses/visit
  • Outpatient billing done by physicians (e.g., takes too long to find the unknown ICD9)
  • Inpatient billing done by professional coders:
    • omit codes that don’t pay well
    • can only code problems actually explicitly mentioned in documentation
• False positives:
  • Diagnoses evolve over time: physicians may initially bill for suspected diagnoses that are later determined to be incorrect
  • Billing the wrong code (perhaps it is easier to find for a busy clinician)
  • Physicians may bill for a different condition if it pays for a given treatment
    • Example: Anti-TNF biologics (e.g., infliximab) originally not covered for psoriatic arthritis, so rheumatologists would code the patient as having rheumatoid arthritis

Page 13: New Enhancements in the Synthetic Derivative and What that Means for the Researcher

Lessons from preliminary phenotype development

• Eliminating negated and uncertain terms:
  • “I don’t think this is MS”, “uncertain if multiple sclerosis”
• Delineating section tag of the note
  • “FAMILY MEDICAL HISTORY: Mother had multiple sclerosis.”
• Adding requirements for further signs of “severity of disease”
  • For MS: an MRI with T2 enhancement, myelin basic protein or oligoclonal bands on lumbar puncture, etc.
  • This could potentially miss patients with outside work-ups, however
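The first two lessons can be sketched as a simple text filter. This is a rough illustration only, assuming a small hand-written cue list and notes whose sections start with an upper-case header followed by a colon. It is not the method used in the SD, and the function name and cue list are invented for the example.

```python
import re
from typing import List

# Hypothetical cue list; a real phenotype algorithm would need a far richer one.
NEGATION_OR_UNCERTAINTY = re.compile(
    r"\b(?:don['’]t think|do not think|not|no evidence of|denies|uncertain|possible|rule out)\b",
    re.IGNORECASE,
)
EXCLUDED_SECTIONS = {"FAMILY MEDICAL HISTORY"}
SECTION_HEADER = re.compile(r"^([A-Z][A-Z ]+):\s*(.*)$")


def affirmed_mentions(note: str, terms: List[str]) -> List[str]:
    """Return sentences that mention a term, skipping excluded sections
    and sentences containing a negation or uncertainty cue."""
    term_pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(t) for t in terms) + r")\b", re.IGNORECASE
    )
    hits: List[str] = []
    section = None
    for line in note.splitlines():
        line = line.strip()
        header = SECTION_HEADER.match(line)
        if header:
            # A new section starts; the rest of the line is its first text.
            section, line = header.group(1).strip(), header.group(2)
        if section in EXCLUDED_SECTIONS:
            continue
        for sentence in re.split(r"(?<=[.!?])\s+", line):
            if term_pattern.search(sentence) and not NEGATION_OR_UNCERTAINTY.search(sentence):
                hits.append(sentence.strip())
    return hits


if __name__ == "__main__":
    note = (
        "HISTORY OF PRESENT ILLNESS: Patient with multiple sclerosis on interferon. "
        "I don't think this is MS flare.\n"
        "FAMILY MEDICAL HISTORY: Mother had multiple sclerosis.\n"
        "ASSESSMENT: Uncertain if multiple sclerosis explains the new symptoms."
    )
    print(affirmed_mentions(note, ["multiple sclerosis", "MS"]))
```

On the made-up note, only the first sentence survives: the second is negated, the third sits in an excluded section, and the fourth is flagged as uncertain.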

Page 14: New Enhancements in the Synthetic Derivative and What that Means for the Researcher

Once you have logged in…

The New SD gives a cleaner Home page interface with aggregate SD graphs.

New features for the Investigator:
• A welcome and announcement section to give the Investigator any immediate information/help when accessing the SD
• Overall SD/BioVU population demographics to give up-to-date population details of the resource

Page 15: New Enhancements in the Synthetic Derivative and What that Means for the Researcher

Improved Search Features

Once you have selected “Start a New Search”, you will go to the Search Interface. Users can select search criteria to see record counts by dragging and dropping Search Criteria (e.g. ICD codes, Labs, Document Keywords, Medications) into the Search box.

New Search Features include:
• Counts for each specific criteria element, shown on the right-hand side of the search box (circled in red); summary counts for combined criteria (this OR that), indicated at the bottom of the group box (circled in blue); and a final Total count at the right corner of your search (circled in green)
• Limit Search To BioVU Records, Non-compromised BioVU Samples, or only BioVU Samples available for external assay
• Limit your search based on the number of ICD code occurrences in the subject record, to require multiple instances of an ICD code
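The occurrence modifier in the last bullet can be illustrated with a short Python sketch. It assumes coded events are available as (subject ID, ICD-9 code) pairs, one per occurrence. The function name and data layout are hypothetical, and the SD itself performs this step inside its search interface.

```python
from collections import Counter
from typing import Iterable, Set, Tuple


def subjects_with_min_occurrences(
    code_events: Iterable[Tuple[str, str]], code: str, min_count: int = 2
) -> Set[str]:
    """Return subject IDs whose record contains `code` at least `min_count` times.

    `code_events` holds one (subject_id, icd9_code) pair per coded occurrence.
    """
    counts = Counter(subject for subject, c in code_events if c == code)
    return {subject for subject, n in counts.items() if n >= min_count}


if __name__ == "__main__":
    # Made-up events; 340 is the ICD-9 code for multiple sclerosis.
    events = [
        ("A", "340"), ("A", "340"),
        ("B", "340"),
        ("C", "250.00"), ("C", "340"), ("C", "340"), ("C", "340"),
    ]
    print(sorted(subjects_with_min_occurrences(events, "340")))  # ['A', 'C']
```

Requiring two or more instances drops subject B, whose single code could be a suspected diagnosis that was later ruled out.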

Page 16: New Enhancements in the Synthetic Derivative and What that Means for the Researcher

Improved Set Review

After you have built your set, you can begin reviewing your records. The New SD has both a Summary view, which gives a high-level graphic view of a subject, AND a Detail view, which lets you customize your view with a new Tabular view.

What’s new in Review:
• Subject IDs listed on the left-hand side to move easily through the records
• Tabular view of the different data elements with custom sorting of tabs
• Radio buttons for determining Subject status

Page 17: New Enhancements in the Synthetic Derivative and What that Means for the Researcher

New Data Visualization Features

In the Summary tab and in the Vitals view, the new SD has new data visualization features that allow a reviewer to get a quick view of a subject’s longitudinal data.

Page 18: New Enhancements in the Synthetic Derivative and What that Means for the Researcher

Improved Document View

Documents are divided into three tabs:
• High Value Documents
• Other Documents
• Problem Lists

On each Document tab, you can:
1. Filter based on Keywords, Document Type, Subtypes
2. Filter keyword searches and display only the context
3. Highlight based on Keywords and display either the full documents or the word(s) in context

Page 19: New Enhancements in the Synthetic Derivative and What that Means for the Researcher

New Medications and Labs Display

The Medication and Lab views now have two displays for easier review. The Summary view displays aggregate mentions of meds/labs with beginning and end dates. The Details view shows each instance of a med/lab in full detail, with the ability to filter by data element.

Page 20: New Enhancements in the Synthetic Derivative and What that Means for the Researcher

Improved Annotations

Annotations allow for easier identification and saving of covariate information during set review. Create your own set-based annotations that are sharable across the study team. These can be exported to Excel when performing your data analysis.

Page 21: New Enhancements in the Synthetic Derivative and What that Means for the Researcher

What’s Next?
• Data Export into REDCap
• Adding PheWAS to the search criteria
• Predict Labs in the Lab view
• Custom and Timeline View
• ….

The SD has evolved greatly in the past six months and this is largely due to suggestions and needs from its users. Please let us know what YOU would like in the SD so that the SD can continue to evolve.

Page 22: New Enhancements in the Synthetic Derivative and What that Means for the Researcher

SD Access Protocol

• Researcher requests IRB exemption
• Signs DUA
• Researcher accesses SD
• SD staff verify/access granted
• Enters StarBRITE to complete electronic application (IRB status is in StarBRITE)

Page 23: New Enhancements in the Synthetic Derivative and What that Means for the Researcher

Leveraging VICTR Resources

• Record Counter (RC): part of SD but open to anyone with a VUNET ID: https://biovu.vanderbilt.edu/RC/RC.html
• SD (/BioVU): Erica Bowton (via StarBRITE)
• RD: email or call me, or fill out a Request form at https://starbrite.vanderbilt.edu/ (https://starbrite.vanderbilt.edu/managedata/datarequest.html)

Page 24: New Enhancements in the Synthetic Derivative and What that Means for the Researcher

Questions or Comments?

SD User Group Sessions will be held the fourth Wednesday of each month at 1 pm. All are welcome.
Time: 1:00-2:00 PM
Location: Light Hall, Room 439

If you have any questions or feedback about the new SD, please contact us by email: [email protected]

THANK YOU!