An Approach to Using Controlled-Vocabularies in Clinical Information Systems

Jeff Wilcke, DVM, MSc, DACVCP and Art Smith, M.S.



Lists of words…

• Nomenclature
– The system or set of names for things, etc., commonly employed by a person or community (Petchamp, SNVDO, SNOMED)
• Vocabulary
– A collection or list of words with explanations of their meanings (SNOMED)
• Classification
– The result of classifying; a systematic distribution, allocation, or arrangement, in a class or classes; esp. of things which form the subject-matter of a science or of a methodic inquiry. (SNOMED)

What do we need?

• Nomenclature ONLY
– Provides a simple list for data entry
• Vocabulary / Classification
– We can be CERTAIN that the “term” (description in SNOMED) means what we think it means.
– We can develop rules that allow us to combine concepts to express ideas more complicated than those contained in the nomenclature.
– We can use the knowledge base supported by the vocabulary/classification to search, retrieve and analyze our data.

Vocabulary Suitability

• Adequate Content

• Multiple granularities

• Functional Subsetting

• Rich Semantic Structure

Adequate Content

• Lower boundary?
– Values for all patient-care context(s) in the medical record system.
– Values to allow for patient and patient-care specific specializations.
  • left, severe, chronic, etc.
• Upper boundary?
– Adequate content IS NOT the same thing as “any conceivable medical utterance”.
– Some content belongs to specialized vocabularies
  • Pharmacy (e.g., specific brand name items)

Multiple granularities

• Granularities appropriate for various patient care settings.
– Problem list
  • Fractured femur
– Surgery report summary
  • Closed spiral fracture of the midshaft of the femur

Functional Sub-setting

• We only need PORTIONS of SNOMED for any one part of a Clinical Information System (CIS)

• We need DIFFERENT portions of SNOMED for different parts of CIS.

• We must be able to use ALL of SNOMED to search, retrieve, analyze data produced using sub-sets.

Medical Record Semantics

• CIS Data Structure
– Meaning of fields carried in the data dictionary for the CIS
• Meta-semantics
– Internal Vocabulary Semantics
• Instance-semantics
– Structurally identical to Meta-semantics
– Some attributes from the Vocabulary
– Some attributes used ONLY in instance semantics

CIS Data structure

• Simple
– A single free-text box: “Write everything about the patient in the box”
• Complex
– A separate required field for each attribute: “Each field must have at least one entry before you proceed to the next screen.”
– Fields: BodySystem, Organ, Tissue, PathProcess, Etiology, Duration, EpisodicNature, Severity, Vector

Find a Balance

• A free-text box (“Tell about Patient here.”) alongside the structured fields
– Fields: BodySystem, Organ, Tissue, PathProcess, Etiology, Duration, EpisodicNature, Severity, Vector

Middle Ground

• CIS “Content”
– Problem list
– Rule out
– Final diagnosis
– Treatments
– Surgical Procedures
– Diagnostic Procedures
• Vocabulary “Content”
– Body system
– Morphology
– Etiology
– Approach
– Instrument
– Generic drug name

Finding Middle Ground

• Option A – Single “findings” field
– “Problem diabetes mellitus”
– “Final diagnosis diabetes mellitus”
– “Tentative diagnosis diabetes mellitus”
– “Rule-out diabetes mellitus”

Option A puts all the complexity of nomenclature management at the interface level.

Finding Middle Ground

• Option B – Multiple “findings” fields
– Problem list field – value = “diabetes mellitus”
– Final diagnosis field – value = “diabetes mellitus”
– Rule-out field – value = “diabetes mellitus”

Option B shifts complexity to the medical record structure.

Messages re-introduce nomenclature complexity.

SNOMED RT Definition

“Closed fracture of shaft of femur (disorder)”

Attribute               Value
Is a                    Fracture of shaft of femur
Is a                    Closed fracture of femur
Associated morphology   Fracture, closed
Associated topography   Shaft of femur

Meta-semantics

• The built-in semantic structure of the nomenclature system itself.

• Object – attribute – value triples
– Defined and sanctioned attributes
  • Associated Morphology
  • Associated Topography
  • Associated Etiology
– Allowed value sets

Defined Attribute(s)

• “ASSOC-ETIOLOGY names the direct causative agent (organism, toxin, force) of a disease or disorder. It does not include vectors (such as the mosquito that transmits malaria). It also does not include method or mechanism by which the etiology is introduced to the body”.

Instance semantics

• Instance-semantics are used to express a particular occurrence of a concept by allowing the addition of details.

• Object – attribute – value triples
– Instance attributes
  • Has severity
  • Has laterality
  • Has duration

Defined Attribute(s)

• “HAS LATERALITY* names the specific organ when that organ exists as left and right pairs (such as left and right femur)”

*NOT sanctioned at this time. Meta-semantics will require this attribute.

Medical Record “Instance”

“Closed fracture of shaft of the LEFT femur”

Object – Attribute – Value

Closed fracture of shaft of femur (disorder) Has laterality – Left

Choosing the correct object

Closed fracture of shaft of femur (disorder)

Has laterality – Left

It is NOT a left FRACTURE

It is NOT a left SHAFT

It IS a left femur

Choosing the correct object: Refineability (SNOMED)

• It is NOT a left fracture
– Fracture of shaft of femur is not refineable by laterality, but has associated topography shaft of femur.
• It is NOT a left shaft
– Shaft of femur is not refineable by laterality, but “is a” femur structure.
• It IS a left femur
– Femur is refineable.

Refineability (SNOMED)

• The instance semantics need not include the femur itself to establish laterality, but must be processed against the meta-semantics.

• Rule (sic) – “When laterality is processed against a finding, assume that it is assigned to the topography. If the first occurrence of topography is not refineable by laterality, find a parent that is refineable.”
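
The rule can be sketched in a few lines. This is a minimal illustration only: the three lookup tables below (`ASSOCIATED_TOPOGRAPHY`, `PARENT`, `REFINEABLE_BY_LATERALITY`) are hypothetical, hand-built stand-ins for the SNOMED relationship data, populated with just the femur example from the slides.

```python
# Hypothetical fragments of the meta-semantics (not real SNOMED tables).
ASSOCIATED_TOPOGRAPHY = {
    "Closed fracture of shaft of femur": "Shaft of femur",
}
PARENT = {  # parent links between anatomy concepts, per the slide
    "Shaft of femur": "Femur",
}
REFINEABLE_BY_LATERALITY = {"Femur"}


def laterality_target(finding):
    """Return the anatomy concept that a laterality on `finding` applies to."""
    # Laterality on a finding is assumed to belong to its topography.
    concept = ASSOCIATED_TOPOGRAPHY.get(finding)
    # Climb from the first topography to the nearest refineable parent.
    while concept is not None and concept not in REFINEABLE_BY_LATERALITY:
        concept = PARENT.get(concept)
    return concept  # None if no refineable topography exists


print(laterality_target("Closed fracture of shaft of femur"))  # Femur
```

A real implementation would presumably read these links from the Relationships table and would have to cope with concepts that have several parents.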

SNOMED Structure

• Concepts are linked to other concepts by specific named Relationships (which are also concepts in SNOMED).

• The full linkage of associated concepts can be staggering -- not just a tree, but rather a complex network of relationships.

• Concepts and Relationships are stored in separate relational tables.

Complicated!

[Figure: network of concepts linked to “Closed Fracture of the Shaft of the Femur”.]

• Findings concepts (IS-A connections): Closed Fracture of the Shaft of the Femur, Closed Fracture of the Femur, Fracture of the Shaft of Femur, Fracture of the Femur, Fracture of bone, Fracture of lower limb, Injury of thigh
• Anatomy concepts (Associated topography connections, with Part-of connections between anatomy concepts): Shaft of Femur, Femur, Thigh, Long Bone, Bone of lower extremity, Bone of extremity, Bone
• Morphology concepts (Associated morphology connections): Fracture, Closed; Fracture; Traumatic Abnormality

Powerful!

SNOMED Tables (two of them, anyway)

• Concepts
– Concept ID
– SNOMED ID
– Status
– Fully Specified Name
• Relationships
– Concept ID 1
– Relationship ID
– Concept ID 2
– Refineability (SNOMED CT only)

SNOMED Tables

Concepts Table

Concept ID   SNOMED ID   Concept Name
71620000     DD-13100    Fracture of Femur (Disorder)
116676008    G-C504      Associated Morphology
72704001     M-12000     Fracture (morphologic abnormality)

Relationship Table

Concept ID   Relationship ID   Concept ID
71620000     116676008         72704001
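
As a rough illustration of how the two tables work together, the sketch below loads the three concepts and one relationship from the slide into an in-memory SQLite database and resolves the relationship row back to names. The table and column names are assumptions made for this example (the Status column is omitted), and the order of the two concept IDs in the relationship row is read from the slide as "Fracture of Femur, Associated Morphology, Fracture".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE concepts (concept_id INTEGER PRIMARY KEY, snomed_id TEXT, name TEXT);
    CREATE TABLE relationships (concept_id1 INTEGER, relationship_id INTEGER, concept_id2 INTEGER);
""")
conn.executemany("INSERT INTO concepts VALUES (?, ?, ?)", [
    (71620000, "DD-13100", "Fracture of Femur (Disorder)"),
    (116676008, "G-C504", "Associated Morphology"),
    (72704001, "M-12000", "Fracture (morphologic abnormality)"),
])
conn.execute("INSERT INTO relationships VALUES (?, ?, ?)",
             (71620000, 116676008, 72704001))

# The relationship type is itself a concept, so the concepts table is joined three times.
triples = conn.execute("""
    SELECT c1.name, r.name, c2.name
    FROM relationships AS rel
    JOIN concepts AS c1 ON c1.concept_id = rel.concept_id1
    JOIN concepts AS r  ON r.concept_id  = rel.relationship_id
    JOIN concepts AS c2 ON c2.concept_id = rel.concept_id2
""").fetchall()

for subject, attribute, value in triples:
    print(subject, "|", attribute, "|", value)
```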

Instance Semantics Structure

• Must follow the same pattern as the SNOMED structure to allow seamless searching.

• The table structure used to represent the nomenclature should also be used to represent controlled vocabulary entries.

• There are some differences, though….

Differences in Semantics

Meta Semantics

• Each concept appears just once (abstract concepts).

• Linkage can be and usually is a complex network

• Abstract (defining) relationships (e.g., IS-A, Part-of, Associated Topography)

Instance Semantics

• Concepts may occur more than once (concrete instances).

• Linkage is a simple tree structure (modeling a noun phrase)

• Concrete (qualifying) relationships (e.g., Has laterality, Has severity, Associated-topography)

Consolidation of caudal and middle lobes of the right lung. (Tree Structure)

Instance Semantics

D2-50020 Consolidation of lung
  G-C505 Associated topography: T-28A20 Caudal lobe of lung
    G-C220 Has laterality: G-A100 Right
  G-C505 Associated topography: T-28A25 Middle lobe of lung
    G-C220 Has laterality: G-A100 Right

Meta Semantics

T-28000 Lung (Refineable – left, right, both)
T-28A20 Caudal lobe of lung (Not refineable)
T-28A25 Middle lobe of lung (Not refineable)

Three Sources of Information

• The field in the system:
– Data entered under the “Discharge diagnosis” field is semantically different from the same data in the “Rule-out list” field.
• The data entered in that field:
– Either a single SNOMED concept or a phrase constructed using explicit instance semantics.
• The nomenclature system:
– The related SNOMED concepts determined by the implicit meta-semantics.

Searching the Table Using Semantics

• When searching the table, automatically expand all concepts to include their IS-A descendants (children, grandchildren, etc.).
• A match on any of those is considered a match on the parent.
• Consider an example:

“Find all diagnoses of lung disease occurring in the caudal lobe.”

• Search for a diagnosis field entry with both D2-50000 (Disease of Lung) AND T-28A20 (Caudal Lobe of Lung) in the Value column.
• Expand with IS-A descendants
– D2-50000 has 41 IS-A children including D2-61010 (Abscess of Lung). Many of these children have IS-A children which are also included.
– T-28A20 has no IS-A children.
• Search for: (D2-50000 or D2-61010 or…) and T-28A20
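
The expansion step can be sketched as a simple graph walk. This is an illustration under stated assumptions: `IS_A_CHILDREN` is a hypothetical, heavily truncated child map (only two of the 41 children of D2-50000 are listed), and `expand` and `entry_matches` are names invented for this example.

```python
from collections import deque

# Hypothetical, truncated IS-A child links for the lung example.
IS_A_CHILDREN = {
    "D2-50000": ["D2-61010", "D2-50020"],  # Disease of lung
    "D2-61010": [],                        # Abscess of lung
    "D2-50020": [],                        # Consolidation of lung
    "T-28A20": [],                         # Caudal lobe of lung
}


def expand(code):
    """Return `code` together with all of its IS-A descendants."""
    seen = {code}
    queue = deque([code])
    while queue:
        for child in IS_A_CHILDREN.get(queue.popleft(), []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def entry_matches(entry_values, criteria):
    """True if every criterion, or one of its descendants, appears in the entry."""
    values = set(entry_values)
    return all(expand(code) & values for code in criteria)


criteria = ["D2-50000", "T-28A20"]
print(entry_matches({"D2-61010", "T-28A20"}, criteria))  # True
print(entry_matches({"D2-61010"}, criteria))             # False
```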

“Tight” vs. “Loose” Searches

• Tight Search:
– Target known to match search criteria.
– Find all diagnoses of known lung disease that are known to occur in the caudal lobe of the lung.
• Loose Search:
– Target might match search criteria.
– Find all diagnoses that may be lung disease that may be located in the caudal lobe of the lung.
• Generally we want a “tight” search.
• Looking at our example search…

“Tight” vs. “Loose” Searches

• Consider a diagnosis of simply D2-61010 Abscess of Lung
• A “Tight” search would not find this diagnosis.
– It does not match the criteria or the IS-A descendants of the criteria (i.e., not known to be caudal lobe).
• A “Loose” search would find this diagnosis.
– It could match the criteria or the IS-A descendants of the criteria (i.e., it might be in the caudal lobe).
• “Tight” searches match criteria and their IS-A descendants.
• “Loose” searches match criteria, their IS-A descendants AND their IS-A ancestors.
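
The difference between the two search modes can be sketched as follows. Two assumptions are made loudly here: the link from T-28A20 (Caudal lobe of lung) up to T-28000 (Lung) is treated as an ancestor link for the purpose of the loose search, and the entry for a plain "Abscess of lung" is taken to carry the topography Lung implied by its definition. The names `IS_A_PARENTS`, `closure` and `matches` are illustrative only.

```python
# Hypothetical parent links, limited to the lung example.
IS_A_PARENTS = {
    "D2-61010": ["D2-50000"],  # Abscess of lung -> Disease of lung
    "T-28A20": ["T-28000"],    # assumed: Caudal lobe of lung -> Lung
}

# Derive the child links by inverting the parent links.
IS_A_CHILDREN = {}
for child, parents in IS_A_PARENTS.items():
    for parent in parents:
        IS_A_CHILDREN.setdefault(parent, []).append(child)


def closure(code, links):
    """Return every concept reachable from `code` by following `links`."""
    found, stack = set(), [code]
    while stack:
        for nxt in links.get(stack.pop(), []):
            if nxt not in found:
                found.add(nxt)
                stack.append(nxt)
    return found


def matches(entry_values, criteria, loose=False):
    """Tight: criteria and descendants. Loose: ancestors are accepted as well."""
    values = set(entry_values)
    for code in criteria:
        accepted = {code} | closure(code, IS_A_CHILDREN)
        if loose:
            accepted |= closure(code, IS_A_PARENTS)
        if not accepted & values:
            return False
    return True


# Plain "Abscess of lung", with the topography (Lung) implied by its definition.
abscess_only = {"D2-61010", "T-28000"}
criteria = ["D2-50000", "T-28A20"]
print(matches(abscess_only, criteria))              # False (tight)
print(matches(abscess_only, criteria, loose=True))  # True  (loose)
```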

Pre-coordinated vs. Post-coordinated Concepts

• Pre-coordinated concept:
– DD-13152 = Closed fracture of shaft of femur
• Post-coordinated concept phrase:
– DD-13100 = Fracture of femur
  • Associated Morphology – Fracture, Closed
  • Associated topography – Shaft of Femur

Pre-coordinated vs. Post-coordinated Concepts

Pre-coordinated (meta only)

Closed fracture of shaft of femur
  Is a: Fracture of shaft of femur
  Is a: Closed fracture of femur
  Associated morphology: Fracture, closed
  Associated topography: Shaft of femur

Post-coordinated (meta + instance)

Fracture of femur
  Is a: Fracture of lower limb
  Is a: Injury of thigh
  Associated morphology: Fracture
  Associated topography: Femur
  Associated morphology: Fracture, closed
  Associated topography: Shaft of femur

Are these computational equivalents?

Instance Template

Main Concept
  Attribute (1) : Value (1)
    Attribute (1.1) : Value (1.1)
    Attribute (1.n) : Value (1.n)
  Attribute (2) : Value (2)
    Attribute (2.1) : Value (2.1)
  Attribute (n) : Value (n)

Coding an Instance (why?)

• We need a compact yet unambiguous format for representing an instance structure.
– Transmission of records
– Storage of record data for presentation
– NOT for searches or statistical reports.
• Two forms – verbose and terse
– Terse: concepts, attributes & values are just codes.
  • T-12710
– Verbose: concepts, attributes & values contain English
  • T-12710[Femur]

Coding an Instance (how?)

• Concept
• Concept ( attribute : value )
• Concept ( attribute1 : value1 ; attribute2 : value2 )
• Concept ( attribute1 : value1 ( attribute1.1 : value1.1 ) ; attribute2 : value2 )
• Concept ( attribute1 : value1 ( attribute1.1 : value1.1 ; attribute1.2 : value1.2 ) )
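
A small encoder shows how the patterns above nest. This is a sketch, not the authors' implementation: the nested-tuple representation and the `encode` function are assumptions for illustration, whitespace is treated as insignificant, and the codes and names are borrowed from Example 1 of this deck.

```python
NAMES = {  # display names as used in Example 1
    "DD-13152": "Closed fracture of shaft of femur",
    "G-C504": "ASSOCIATED-MORPHOLOGY",
    "M-12030": "Fracture, spiral",
    "G-C220": "HAS-LATERALITY",
    "G-A101": "Left",
}


def encode(node, verbose=False):
    """Encode (concept, [(attribute, child_node), ...]) as a coded phrase."""
    def label(code):
        # Verbose form appends the English name in square brackets.
        return f"{code}[{NAMES[code]}]" if verbose else code

    concept, modifiers = node
    text = label(concept)
    if modifiers:
        parts = [f"{label(attr)}:{encode(value, verbose)}" for attr, value in modifiers]
        text += "(" + ";".join(parts) + ")"
    return text


fracture = ("DD-13152", [
    ("G-C504", ("M-12030", [])),
    ("G-C220", ("G-A101", [])),
])
print(encode(fracture))                # DD-13152(G-C504:M-12030;G-C220:G-A101)
print(encode(fracture, verbose=True))
```

Because a value is itself a node, the recursion handles attribute-on-value nesting (the fourth and fifth patterns) without any special cases.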

Storing an instance (why?)

• We need a way to store the data in a relational database that will facilitate searches and statistics
– Must allow for representation of structure.
– Must fit the relational model (tables).
– Must allow easy searching for concepts.
– Must be efficient
  • Simple concepts take little space.
  • More complicated instances take more space.

Storing an instance: Five columns in Table

• Key – unique identifier for the row.
• Entry – unique identifier for the instance.
– Single concept instance is one row.
– Complicated instances take multiple rows.
• Parent – for attribute/value modifiers.
– Key for concept that they modify.
– Empty for main concept.
• Attribute
– Shows attribute (relationship type) for attribute/value modifiers.
– Empty for main concept
• Value
– Shows main concept or value of an attribute/value modifier.
– Searchable field (indexed). Match pulls all rows with same Entry.
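
The five-column layout can be produced by a depth-first walk over the instance tree. The sketch below is illustrative only: the nested-tuple input and the `to_rows` helper are assumptions, and keys restart at 1 for each call, whereas a real table would need keys that are unique across all entries. The sample data is the lung abscess instance from Example 2.

```python
from itertools import count


def to_rows(node, entry, keys=None, parent=None, attribute=None):
    """Flatten (concept, [(attribute, child), ...]) into
    (Key, Entry, Parent, Attribute, Value) rows."""
    keys = keys if keys is not None else count(1)
    concept, modifiers = node
    key = next(keys)
    # The main concept has no Parent and no Attribute (None stands for [Empty]).
    rows = [(key, entry, parent, attribute, concept)]
    for attr, child in modifiers:
        rows.extend(to_rows(child, entry, keys, key, attr))
    return rows


abscess = ("D2-61010", [
    ("G-C505", ("T-28A20", [("G-C220", ("G-A100", []))])),
    ("G-C503", ("L-22803", [])),
])

for row in to_rows(abscess, entry=1):
    print(row)
# (1, 1, None, None, 'D2-61010')
# (2, 1, 1, 'G-C505', 'T-28A20')
# (3, 1, 2, 'G-C220', 'G-A100')
# (4, 1, 1, 'G-C503', 'L-22803')
```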

Example 1

Medical Record Statement = Closed spiral fracture of shaft of the left femur
Most specific SNOMED Leaf = Closed fracture of shaft of femur
Missing additional modifiers = Fracture, spiral; Left (femur)

Closed spiral fracture of shaft of the left femur (Tree Structure)

DD-13152 Closed fracture of shaft of femur
  G-C504 Associated morphology: M-12030 Fracture, spiral
  G-C220 Has laterality: G-A101 Left

Closed spiral fracture of shaft of the left femur (Coding)

Verbose:
DD-13152[Closed fracture of shaft of femur](G-C504[ASSOCIATED-MORPHOLOGY]:M-12030[Fracture, spiral]; G-C220[HAS-LATERALITY]:G-A101[Left])

Terse:
DD-13152 (G-C504:M-12030;G-C220:G-A101)

Closed spiral fracture of shaft of the left femur (Suggested Storage Form)

Key  Entry  Parent   Attribute                        Value
1    1      [Empty]  [Empty]                          DD-13152 [Closed fracture of shaft of femur]
2    1      1        G-C504 [ASSOCIATED MORPHOLOGY]   M-12130 [Fracture, closed, spiral]
3    1      1        G-C220 [HAS LATERALITY]          G-A101 [Left]

Closed spiral fracture of shaft of the left femur (Computed definition, partial)

Attribute               Value
Is a                    Fracture of shaft of femur
Is a                    Closed fracture of femur
Is a                    Fracture of Femur
Is a                    Fracture of lower limb
Is a                    Fracture of bone
Associated morphology   Fracture, closed
Associated morphology   Fracture, spiral
Associated morphology   Traumatic abnormality
Associated morphology   Fracture
Associated topography   Thigh
Associated topography   Shaft of Femur
Associated topography   Femur
Has laterality          Left (x.1)

Annotations on the slide: “From ‘root’ definition”; “Assignment computed”.

Example 2

Medical Record Statement = Abscess of caudal lobe of right lung due to Mannheimia haemolytica
Most specific SNOMED Leaf = Abscess of Lung
Missing additional modifiers = Caudal lobe of lung; Right; Mannheimia haemolytica

Abscess of caudal lobe of right lung due to Mannheimia haemolytica (Tree Structure)

D2-61010 Abscess of lung
  G-C505 Associated topography: T-28A20 Caudal lobe of lung
    G-C220 Has laterality: G-A100 Right
  G-C503 Associated etiology: L-22803 Mannheimia haemolytica

Abscess of caudal lobe of right lung due to Mannheimia haemolytica (Coding)

Verbose:
D2-61010[Abscess of Lung](G-C505[ASSOCIATED-TOPOGRAPHY]:T-28A20[Caudal Lobe of lung](G-C220[HAS-LATERALITY]:G-A100[Right]); G-C503[ASSOCIATED-ETIOLOGY]:L-22803[Mannheimia haemolytica])

Terse:
D2-61010 (G-C505:T-28A20 (G-C220:G-A100); G-C503:L-22803)

Abscess of caudal lobe of right lung due to Mannheimia haemolytica (Relational Table)

Key  Entry  Parent   Attribute                        Value
1    1      [Empty]  [Empty]                          D2-61010 [Abscess of Lung]
2    1      1        G-C505 [Associated topography]   T-28A20 [Caudal lobe of lung]
3    1      2        G-C220 [Has laterality]          G-A100 [Right]
4    1      1        G-C503 [Associated etiology]     L-22803 [Mannheimia haemolytica]

Abscess of caudal lobe of right lung due to Mannheimia haemolytica (Computed Definition, partial)

Attribute               Value
Is a                    Inflammatory disorder of lower respiratory tract
Is a                    Disease of lung
Is a                    Abscess of thorax
Is a                    Disease of lower respiratory tract
Is a                    Disease of thorax
Is a                    Disease of respiratory tract
Associated morphology   Abscess
Associated topography   Lung
Associated topography   Caudal lobe of lung
Has laterality          Right
Associated etiology     Mannheimia haemolytica
Associated etiology     Family Pasteurellaceae

Annotations on the slide: “Assignment stated”; “From ‘root’ definition”.

Example 3

Medical Record Statement = Consolidation of caudal lobe of right lung and middle lobe of left lung.
Most specific SNOMED Leaf = Consolidation of Lung
Missing additional modifiers = Caudal lobe of right lung; Middle lobe of left lung

Consolidation of caudal lobe of right lung and middle lobe of left lung. (Tree Structure)

D2-50020 Consolidation of lung
  G-C505 Associated topography: T-28A20 Caudal lobe of lung
    G-C220 Has laterality: G-A100 Right
  G-C505 Associated topography: T-28A25 Middle lobe of lung
    G-C220 Has laterality: G-A101 Left

Consolidation of caudal lobe of right lung and middle lobe of left lung. (Coding)

Verbose:
D2-50020[Consolidation of Lung](G-C505[ASSOCIATED-TOPOGRAPHY]:T-28A20[Caudal Lobe of lung](G-C220[HAS-LATERALITY]:G-A100[Right]); G-C505[ASSOCIATED-TOPOGRAPHY]:T-28A25[Middle Lobe of lung](G-C220[HAS-LATERALITY]:G-A101[Left]))

Terse:
D2-50020 (G-C505:T-28A20 (G-C220:G-A100); G-C505:T-28A25 (G-C220:G-A101))

Consolidation of caudal lobe of right lung and middle lobe of left lung. (Relational Table)

Key  Entry  Parent   Attribute                        Value
1    1      [Empty]  [Empty]                          D2-50020 [Consolidation of Lung]
2    1      1        G-C505 [Associated topography]   T-28A20 [Caudal lobe of lung]
3    1      2        G-C220 [Has laterality]          G-A100 [Right]
4    1      1        G-C505 [Associated topography]   T-28A25 [Middle lobe of lung]
5    1      4        G-C220 [Has laterality]          G-A101 [Left]

Consolidation of caudal lobe of right lung and middle lobe of left lung. (Computed Definition, partial)

Attribute               Value
Is a                    Disease of lung
Is a                    Disease of lower respiratory tract
Is a                    Disease of thorax
Is a                    Disease of respiratory tract
Associated morphology   Consolidation
Associated topography   Lung
Associated topography   Caudal lobe of lung
Has laterality          Right
Associated topography   Middle lobe of lung
Has laterality          Left

Annotations on the slide: “Assignment stated”; “From ‘root’ definition”.

Assignment: retrieve “Disorders of left lung (all lobes)? ”
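
One possible way to approach this retrieval against the five-column table is sketched below. It is not offered as the intended answer. The sets `LUNG_DISEASES` and `LUNG_STRUCTURES` are hand-written stand-ins for what a real system would compute from the meta-semantics (the IS-A descendants of Disease of lung, and Lung together with its lobes), the function name is invented, and the sample rows restate Examples 3 and 2 as entries 1 and 2 with keys renumbered so they are unique.

```python
LEFT, HAS_LATERALITY = "G-A101", "G-C220"
LUNG_DISEASES = {"D2-50000", "D2-50020", "D2-61010"}  # partial, assumed
LUNG_STRUCTURES = {"T-28000", "T-28A20", "T-28A25"}   # Lung and lobes, assumed

# (Key, Entry, Parent, Attribute, Value)
ROWS = [
    (1, 1, None, None, "D2-50020"),      # Consolidation of lung
    (2, 1, 1, "G-C505", "T-28A20"),
    (3, 1, 2, "G-C220", "G-A100"),
    (4, 1, 1, "G-C505", "T-28A25"),
    (5, 1, 4, "G-C220", "G-A101"),
    (6, 2, None, None, "D2-61010"),      # Abscess of lung
    (7, 2, 6, "G-C505", "T-28A20"),
    (8, 2, 7, "G-C220", "G-A100"),
    (9, 2, 6, "G-C503", "L-22803"),
]


def left_lung_entries(rows):
    """Return entries that record a lung disorder with left laterality."""
    by_key = {row[0]: row for row in rows}
    hits = set()
    for key, entry, parent, attribute, value in rows:
        if attribute != HAS_LATERALITY or value != LEFT:
            continue
        site = by_key[parent]  # the row that the laterality qualifies
        if site[2] is None:
            # Laterality attached directly to the main concept.
            ok = site[4] in LUNG_DISEASES
        else:
            # Laterality attached to a topography under the main concept.
            ok = site[4] in LUNG_STRUCTURES and by_key[site[2]][4] in LUNG_DISEASES
        if ok:
            hits.add(entry)
    return sorted(hits)


print(left_lung_entries(ROWS))  # [1]
```

Entry 1 is returned because its middle lobe is marked Left, even though the laterality is stored on the lobe rather than on Lung itself. Entry 2 is right-sided only and is not returned.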

Requirements for Incorporation in a Clinical Information System

• System must provide appropriate sub-vocabularies for each C.V. field.
– Controlled Vocabulary is a data type, like date/time.
• User interface must allow selection from sub-vocabulary.
• User interface must provide a mechanism for constructing phrases.
• System must support the semantic structure (both meta-semantics and instance-semantics) to allow for searches and statistical reports.

Cost to the Vendors (dollars, time, and space)

• SNOMED-RT license (probably passed-on).
• Construction of sub-vocabularies for each field. This includes main concepts, relationships (attributes) and modifiers (values).
• Concept and Relationship tables for every term in the sub-vocabularies and all their parents (by any relationship).
• Phrase tables for each field (not just a single entry).
• Automatic expansion of searches to include all IS-A descendants.

Cost to the Users (YOU!)

• Attention to precise definitions of SNOMED concepts. (G.I.G.O.)
• Construction of meaningful phrases for an appropriate granularity (whatever that is).
– If it’s not entered, you can’t find it.
– Constructing phrases will never be as easy as typing a sentence. How close it comes depends on the vendor’s interface.
• Passed-on development and licensing costs.
• Disk space for vocabulary tables.

Benefits to the Users (YOU!)

• Accurate and complete searches.
– Can search on concepts not directly entered, but implied by the meta-semantics.
– Use to locate specific cases, do retrospective studies, or generate outcome-based reviews.
• Accurate and complete statistics.
– Know what cases you are really seeing.
– Know what referrals you are really making.
• Pooled data for research (e.g., VMDB)
– How does your practice compare with others?
– Distribution of diagnoses, procedures, etc.

Will the Vendors Do It?

Only if YOU demand it!

• Are the benefits meaningful to you?
• Are you willing to take the time to enter data with a controlled vocabulary?
• Are you willing to pay the passed-on costs from the vendors?
– That cost is dependent on how many of you demand it.