Page 1: L&C’s LinkBase: a multi-lingual Hub to medical terminologies

Dr. W. Ceusters

Dir R&D

Language & Computing nv

www.landc.be

Page 2: Presentation overview

• Short history of L&C

• L&C’s integrated approach to medical natural language understanding
  – Focus on medical terminology management

• Position in the international market

• Relevant demonstrations
  – LinkFactory
  – Ontology Browser

Page 3: Goal of Language & Computing nv

To provide users and developers of systems for knowledge management with tools and services for efficient and accurate data-entry and retrieval by exploiting the full power of automated (medical) natural language understanding.

We hereby declare ...

Page 4: Language Engineering

[Diagram. Labels: speech recognition, TTS, natural language understanding, text generation; speech, text, semantic representations; speech models, language models, semantic models, dialogue models; information processing]

Page 5: The three pillars of Healthcare IT

[Diagram. Pillars: EHCRS, Language, Terminology. Annotation groups:]

• Individual patient care; Seamless care; Historical overview; ...

• Comparability of data; Crossborder care; Decision support; Abstraction / grouping; ...

• Faithful data recording; Sufficient level of detail; ...

Domain of discourse: healthcare

Page 6: History of R&D in L&C

[Chart: years 1990 to 2001 on the horizontal axis, vertical axis from 0 to 40. Legend: Employees, Share value (1000 Euro), R/D ratio. Projects marked: Anthem, Multi-Tale, Dome, GIU, Select, C-Care, Liquid, Mobidev]

Page 7: L&C’s integrated approach

Page 8: The L&C integrated solution

• Data structure and function library for language understanding

• Medical and linguistic knowledge required for language understanding

• NLU enabling tools for knowledge supported data-entry and -retrieval

Page 9: The L&C Linguistic Concept Factory

Linguistic-semantic Function Library

Storage Functions

C-DEFINE(c-meningitis, c-inflammation HAS-LOC c-meninges)

T-DEFINE(“méningite”, french, c-meningitis)

Retrieval Functions

GET-TERMS(c-meningitis, {french, dutch})

returns: “méningite”, “hersenvliesontsteking”
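The storage and retrieval calls above can be mimicked with a small in-memory store. This is only a sketch: the class `ConceptStore`, its method names and the reading of C-DEFINE as "parent concept plus one link criterion" are assumptions made for illustration, not the actual function library.

```python
from typing import Dict, List, Optional, Tuple


class ConceptStore:
    """Toy stand-in for the function library (hypothetical, for illustration)."""

    def __init__(self) -> None:
        # concept id -> list of (link type, target concept) criteria
        self.definitions: Dict[str, List[Tuple[str, str]]] = {}
        # concept id -> language -> list of terms
        self.terms: Dict[str, Dict[str, List[str]]] = {}

    def c_define(self, concept: str, parent: str,
                 link: Optional[str] = None, target: Optional[str] = None) -> None:
        """Mimics C-DEFINE: `concept` is a `parent` with an optional link criterion."""
        criteria = [("IS-A", parent)]
        if link is not None and target is not None:
            criteria.append((link, target))
        self.definitions[concept] = criteria

    def t_define(self, term: str, language: str, concept: str) -> None:
        """Mimics T-DEFINE: attach a term in one language to a concept."""
        self.terms.setdefault(concept, {}).setdefault(language, []).append(term)

    def get_terms(self, concept: str, languages: List[str]) -> List[str]:
        """Mimics GET-TERMS: terms for a concept, in the order of `languages`."""
        by_language = self.terms.get(concept, {})
        return [t for lang in languages for t in by_language.get(lang, [])]


store = ConceptStore()
store.c_define("c-meningitis", "c-inflammation", "HAS-LOC", "c-meninges")
store.t_define("méningite", "french", "c-meningitis")
store.t_define("hersenvliesontsteking", "dutch", "c-meningitis")
print(store.get_terms("c-meningitis", ["french", "dutch"]))
```

Terms hang off the concept rather than the other way round, so adding a language never touches the concept definition.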

Page 10: Architectural overview

[Architecture diagram. Labels: LinkFactory Workbench; PC, Mac, Unix Workstation; LAN, WAN, Internet; RMI, Corba, Soap; LinkFactory Server; Java; Server Business Objects; JDBC; LinkBase Database; Concept tree, Linktype tree, Criteria / Full definitions, Translate, ...]

Page 11: Client Graphical Objects

Page 12: Built-in quality control

• Knowledge entered is immediately used to check validity of subsequent entries

• Version management

• User-management with:
  – Allowed actions based on experience
  – Personal audit trail

• Clear and formal separation with 3rd party systems to avoid copying mistakes such as:
  – UMLS’ cyclical ISA relationships
  – SNOMED-RT’s “very usual = always” modelling
  – Most systems’ overloaded hierarchical relations
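One way to picture "knowledge entered is immediately used to check validity of subsequent entries" is a hierarchy that refuses any new IS-A link that would close a cycle, the kind of mistake attributed to UMLS above. The sketch below is an assumption about how such a check could look; the class `IsaHierarchy` and the concept "disease" are hypothetical and do not describe LinkFactory's internals.

```python
from typing import Dict, Set


class IsaHierarchy:
    """Toy IS-A store that rejects links which would create a cycle (hypothetical)."""

    def __init__(self) -> None:
        self.parents: Dict[str, Set[str]] = {}  # child -> direct parents

    def _reaches(self, start: str, goal: str) -> bool:
        """True if `goal` is `start` itself or one of its IS-A ancestors."""
        stack, seen = [start], set()
        while stack:
            node = stack.pop()
            if node == goal:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.parents.get(node, ()))
        return False

    def add_isa(self, child: str, parent: str) -> None:
        # The new link closes a cycle exactly when `child` is already
        # reachable upward from `parent` (this also rejects self-loops).
        if self._reaches(parent, child):
            raise ValueError(f"{child} IS-A {parent} would create a cycle")
        self.parents.setdefault(child, set()).add(parent)


hierarchy = IsaHierarchy()
hierarchy.add_isa("meningitis", "inflammation")
hierarchy.add_isa("inflammation", "disease")
try:
    hierarchy.add_isa("disease", "meningitis")
except ValueError as error:
    print("rejected:", error)
```

Checking at entry time keeps the stored hierarchy acyclic at every moment, so later classification never has to guard against loops.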

Page 13: The L&C Linguistic Concept Database

[Diagram. Labels: Formal Domain Ontology; Cassandra Linguistic Ontology; Language A (Lexicon, Grammar); Language B (Lexicon, Grammar); MEDRA, ICD, SNOMED, ICPC, Others ...; Proprietary Terminologies]

Page 14: A formal terminology

• Separation of terms and concepts

• To be used by machines, not people

• All information is explicit in the structure, not implicit in the terms

• Clean subsumption hierarchies

• Formal, “computable” definitions of concepts

• Internal, automated quality control

Page 15: Expl: Joint anatomy

• joint HAS-HOLE joint space

• joint capsule IS-OUTER-LAYER-OF joint

• meniscus
  – IS-INCOMPLETE-FILLER-OF joint space
  – IS-TOPO-INSIDE joint capsule
  – IS-NON-TANGENTIAL-MATERIAL-PART-OF joint

• joint
  – IS-CONNECTOR-OF bone X
  – IS-CONNECTOR-OF bone Y

• synovia
  – IS-INCOMPLETE-FILLER-OF joint space

• synovial membrane IS-BONAFIDE-BOUNDARY-OF joint space
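The joint-anatomy statements above are plain (concept, link type, concept) triples, which makes them easy to query mechanically. A minimal sketch, assuming a flat list of triples; the helper names `subjects` and `objects` are hypothetical.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (source concept, link type, target concept)

# The statements from the slide, written out one triple per line.
TRIPLES: List[Triple] = [
    ("joint", "HAS-HOLE", "joint space"),
    ("joint capsule", "IS-OUTER-LAYER-OF", "joint"),
    ("meniscus", "IS-INCOMPLETE-FILLER-OF", "joint space"),
    ("meniscus", "IS-TOPO-INSIDE", "joint capsule"),
    ("meniscus", "IS-NON-TANGENTIAL-MATERIAL-PART-OF", "joint"),
    ("joint", "IS-CONNECTOR-OF", "bone X"),
    ("joint", "IS-CONNECTOR-OF", "bone Y"),
    ("synovia", "IS-INCOMPLETE-FILLER-OF", "joint space"),
    ("synovial membrane", "IS-BONAFIDE-BOUNDARY-OF", "joint space"),
]


def subjects(link: str, target: str) -> List[str]:
    """Concepts that stand in relation `link` to `target`."""
    return sorted(s for s, l, t in TRIPLES if l == link and t == target)


def objects(source: str, link: str) -> List[str]:
    """Concepts that `source` points to via `link`."""
    return sorted(t for s, l, t in TRIPLES if s == source and l == link)


print(subjects("IS-INCOMPLETE-FILLER-OF", "joint space"))  # ['meniscus', 'synovia']
print(objects("joint", "IS-CONNECTOR-OF"))                 # ['bone X', 'bone Y']
```

Because every fact lives in the link structure, a question such as "what fills the joint space?" needs no text processing at all.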

Page 16: Expl: Relative spatial localisation

[Diagram: hierarchy of spatial link types. Link types shown:]

• IS-TOPO-INSIDE-OF
• IS-GEO-INSIDE-OF
• IS-INSIDE-CONVEX-HULL-OF
• IS-PARTLY-IN-CONVEX-HULL-OF
• IS-OUTSIDE-CONVEX-HULL-OF
• HAS-DISCONNECTED-REGION
• HAS-EXTERNAL-CONNECTING-REGION
• HAS-DISCRETED-REGION
• HAS-TANG.-SPAT.-PART
• HAS-NON-TANG.-SPAT.-PART
• IS-SPAT.-EQUIV.-OF
• IS-TANG.-SPAT.-PART-OF
• IS-NON-TANG.-SPAT.-PART-OF
• HAS-PARTIAL-SPATIAL-OVERLAP
• HAS-PROPER-SPATIAL-PART
• IS-PROPER-SPAT.-PART-OF
• HAS-SPATIAL-PART
• IS-SPATIAL-PART-OF
• HAS-OVERLAPPING-REGION
• HAS-CONNECTING-REGION
• HAS-SPATIAL-POINT-REFERENCE

Page 17: Expl: Patient at risk (risk patient)

[Diagram: concept graph with numbered steps 1 to 4.
Concepts: Having a healthcare phenomenon, Generalised Possession, Healthcare phenomenon, Human, Patient, Patient at risk, Risk Factor, Patient at risk for osteoporosis, Risk factor for osteoporosis, Osteoporosis.
Link types: IS-A, Has-possessor, Has-possessed, Is-possessor-of, Has-Healthcare-phenomenon, Is-Risk-Factor-Of]

Page 18: LinkBase size as of 01-04-2001

• 920.000 (850.000) concepts

• 2.300.000 terms

• 320 link-types

• 2.000.000 link instances

• 300.000 links to 3rd party systems

• But:
  – Never finished!
  – Quality sufficient for current applications

Page 19: L&C Linguistic components

[Diagram. Labels: Text, Result, Processor, Domain representation, Goal representation, Linguistic Knowledge, Task Knowledge, Formal domain ontology]

Page 20: L&C application servers

• Coding tools: FastCode

• Semantic indexers: Tessi

• Spell checkers and type ahead: FastType

• Semi controlled language parsers in restricted domains: FreePharma

• Ontology browser

• Stochastic dependency-based indexer: C-Link

• (Ir)relevant document classifier for very low prevalence data sets

Page 21: Integrated coding approach

[Diagram. Labels: FastCodeGenerator, LinC-Factory, Formal representation of Classification system, LinCBase, Mapping data, Domain+Linguistic ontology, FastCode client, FastCode server, Coding data]

Page 22: Benefits of formal multi-lingual terminology management

Page 23: Semi-automatic mapping (ICPC-ICD10)

[Diagram: two classification entries decomposed into concepts and links.
Zenker’s diverticulum (D84): concepts diverticulum, esophagus, pressure, intraluminal; links HAS-LOC, HAS-CAUSE, HAS-ORIG.
Acquired diverticulum of esophagus (K22.5): concept Acquired; links HAS-LOC, HAS-AcqMode]
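A rough idea of how formal definitions could drive mapping proposals is sketched below. Several things here are assumptions rather than content of the slide: the exact triples assigned to D84 and K22.5 are one possible reading of the diagram, the Jaccard overlap score is a stand-in technique chosen for illustration, and the function names and threshold are hypothetical. The "semi-automatic" part is that the output is a ranked list for a human to confirm or reject.

```python
from typing import Dict, List, Set, Tuple

Criteria = Set[Tuple[str, str]]  # set of (link type, target concept)

# Assumed reading of the diagram, for illustration only.
ICPC: Dict[str, Criteria] = {
    "D84": {("IS-A", "diverticulum"), ("HAS-LOC", "esophagus"),
            ("HAS-CAUSE", "pressure")},
}
ICD10: Dict[str, Criteria] = {
    "K22.5": {("IS-A", "diverticulum"), ("HAS-LOC", "esophagus"),
              ("HAS-AcqMode", "acquired")},
}


def overlap(a: Criteria, b: Criteria) -> float:
    """Jaccard overlap between two sets of criteria (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def propose_mappings(source: Dict[str, Criteria], targets: Dict[str, Criteria],
                     threshold: float = 0.3) -> List[Tuple[str, str, float]]:
    """Rank source/target code pairs by definition overlap, for human review."""
    proposals = []
    for s_code, s_def in source.items():
        for t_code, t_def in targets.items():
            score = overlap(s_def, t_def)
            if score >= threshold:
                proposals.append((s_code, t_code, score))
    return sorted(proposals, key=lambda p: p[2], reverse=True)


print(propose_mappings(ICPC, ICD10))  # [('D84', 'K22.5', 0.5)]
```

The two entries share the core concept and the location but differ in the remaining criterion, which is why the pair is proposed rather than asserted as equivalent.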

Page 24: Reclassify: FOOT EXARTICULATION

• Definitions given by domain-expert:
  – ( (FOOT EXARTICULATION) { [IS_A] (EXARTICULATION) } { [HAS_THEME] (FOOT) } )
  – ( (AMPUTATION OF FOOT) { [IS_A] (AMPUTATION) } { [HAS_THEME] (FOOT) } )
  – ( (EXARTICULATION) { [IS_A] (AMPUTATION) } { [HAS_SOURCE] (JOINT) } )

• Redefinition by automatic classifier:
  – ( (FOOT EXARTICULATION) { [IS_A] (AMPUTATION OF FOOT) } { [IS_A] (EXARTICULATION) } )
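The reclassification above can be reproduced with a small structural subsumption check. This is a sketch under stated assumptions, not the LinkFactory classifier: a defined concept is taken to subsume another when each of its criteria is met by the other's stated or inherited criteria, and concepts without criteria (AMPUTATION, FOOT, JOINT) are treated as primitives that subsume only through stated IS_A links. All function names are hypothetical.

```python
from typing import Dict, Set, Tuple

Definition = Set[Tuple[str, str]]  # set of (link type, target concept)

DEFINITIONS: Dict[str, Definition] = {
    "AMPUTATION": set(), "FOOT": set(), "JOINT": set(),  # assumed primitives
    "EXARTICULATION": {("IS_A", "AMPUTATION"), ("HAS_SOURCE", "JOINT")},
    "AMPUTATION OF FOOT": {("IS_A", "AMPUTATION"), ("HAS_THEME", "FOOT")},
    "FOOT EXARTICULATION": {("IS_A", "EXARTICULATION"), ("HAS_THEME", "FOOT")},
}


def ancestors(concept: str, defs: Dict[str, Definition]) -> Set[str]:
    """All concepts reachable through stated IS_A links."""
    seen: Set[str] = set()
    stack = [concept]
    while stack:
        for link, target in defs.get(stack.pop(), ()):
            if link == "IS_A" and target not in seen:
                seen.add(target)
                stack.append(target)
    return seen


def criteria(concept: str, defs: Dict[str, Definition]) -> Definition:
    """Non-IS_A criteria stated on the concept or inherited from its ancestors."""
    found: Definition = set()
    for c in {concept} | ancestors(concept, defs):
        found |= {(l, t) for l, t in defs.get(c, ()) if l != "IS_A"}
    return found


def subsumes(parent: str, child: str, defs: Dict[str, Definition]) -> bool:
    """True if every criterion of a defined `parent` is met by `child`."""
    if parent == child or not defs[parent]:
        return False
    child_ancestors, child_criteria = ancestors(child, defs), criteria(child, defs)
    return all(
        target in child_ancestors if link == "IS_A" else (link, target) in child_criteria
        for link, target in defs[parent]
    )


def reclassify(concept: str, defs: Dict[str, Definition]) -> Set[str]:
    """Most specific parents: stated ancestors plus inferred subsumers."""
    candidates = ancestors(concept, defs) | {p for p in defs if subsumes(p, concept, defs)}

    def is_above(a: str, b: str) -> bool:
        return a in ancestors(b, defs) or subsumes(a, b, defs)

    return {c for c in candidates
            if not any(is_above(c, other) for other in candidates if other != c)}


print(sorted(reclassify("FOOT EXARTICULATION", DEFINITIONS)))
# ['AMPUTATION OF FOOT', 'EXARTICULATION']
```

AMPUTATION is dropped from the result because both remaining parents already imply it, which matches the redefinition shown on the slide.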

Page 25: Detection of missing terms

Page 26: Resolving conflicting views

[Diagram: terms from MESH-2001 and Snomed-RT linked to L&C concepts.
MESH-2001: “Seizures”, “Convulsions”
Snomed-RT: “Convulsion”, “Seizure”
L&C: Convulsion, Seizure, Health crisis, Epileptic convulsion
Link types: IS-A, ISA, IS-narrower-than, Has-CCC]

Page 27: Position in the market

Page 28: Main business model

[Diagram. Labels:]

• Software developers, Integrators

• Hospitals, Internet Service Providers, Pharmaceutical companies, Research Organisations, Medical Publishers, Government, Healthcare Insurance Companies, MCO

Page 29: Project-based product development

[Diagram. Labels: Service Component, Product Component, Project Definition, Corpus analysis, Set up service, Product development, Workbench development, Teach and deliver]

Page 30: Current major partners/clients

• Coding tools
  – Several hospitals using ICD-9-CM FastCode

• Terminology management services + NLU based data entry
  – IDEWE: largest Belgian occupational medicine services provider
  – First Databank UK
  – Belgian military medical service

• Semantic indexing
  – Belgian Professional Association of Pharmaceutical industry

Page 31: Academic Competitors/Colleagues

• Main characteristics:
  – Prototypes with very small coverage
  – No professional support

• Relevant examples:
  – OpenGalen (VUMAN):
    • Very small “LinkBase”
    • “Toy”-link to language (language ignored as medium)
  – Protégé (Stanford):
    • Ontology editor
  – Several DL-systems: FaCT, Cyclop, LOOM, ...
    • Tested with very small (tiny) ontologies
    • More powerful reasoning mechanisms than LinkFactory but totally intractable on ontologies with more than a few thousand distinct concept classes

Page 32: Commercial competitors/colleagues

• Health Language Inc.

• Apelon Inc.
  – Ontyx
  – Lexical Technologies

Page 33: L&C’s strong position

• Multi-lingual and multi-cultural approach

• Modelling independent from specific languages but not from language as communication medium

• Proven scalability of our approach

• Support at all levels
  – Services to migrate existing client dictionaries
  – Large tool set for terminology development, maintenance, and/or use

• Only company with in-house expertise in medicine, computational linguistics in many languages, formal ontologies and informatics