LIS 663 Basic Database Searching Using controlled vocabulary and database thesauri. Index browsing, term mapping and clustering Session 3. Péter Jacsó Fall 2015

LIS 663 Basic Database Searching

Using controlled vocabulary and database thesauri. Index browsing, term mapping and clustering

Session 3.

Péter Jacsó

Fall 2015

Jacsó

• Inherited from the limits of the card catalog & print world

• Organization of documents by classification codes and subject headings

• Space, time & cost limits and tree saving of the print era

• Only the limits of human indexing remain in the digital era

• Human indexing vs. automatic indexing

• The unsung horrors of uncontrolled vocabularies for AU, JN, CS, DT, LA, CY data elements

• The good, the bad and the ugly of controlled subject vocabularies

• The appealing and appalling software tools for controlled vocabulary searching

• The indifference and incompetence of staff at datafile producers

• Here today, gone tomorrow for $0.50 more

• The myth and reality of looking up controlled vocabulary terms

• The convenience and ease of free-text search in TI, DE, AB

• I teach and preach but you decide

• Not looking up AU, JN is reckless negligence, malpractice

• You MUST browse

Bi-lingual thesauri

Brunei? East Timor? Burma?

Blatant and pervasive misspellings in the descriptor field in MHA

Jacsó 10-03

• Nice idea - brutal reality

• Increase recall OR precision?

• Increase recall AND precision idea

• Garden of synonyms as descriptors

• Learn to love synonyms, but mind implications

• Whose thesaurus is it, anyway?

• How many Englishes are there?

• Chances of guessing, agony of look-up

• Convenience governs the search

• Descriptor changes across time in single database

• Cross-database searching nightmare

British Education Index thesaurus

European Education Thesaurus (what about the Brits?)

Pupils and students have behaviour, but students have no attitudes in Europe?

This side of the ocean - ERIC

• Same flavor of English is no panacea

• APAIS - domestic violence

• ATED - family violence

• BEI - family violence

• Whitaker - domestic violence

• CBCA - family violence

• Canadian News - family violence & domestic violence

• PsycINFO - family violence

• e-psyche - domestic abuse (lead-in)

• LCSH - conjugal violence

• Gale - conjugal violence

• PAIS (all but DIALOG) - spousal abuse

• PAIS (DIALOG) - still no spousal abuse

• This is user abuse by thesaurus
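
The list above shows why a single descriptor cannot be carried from one database to the next. A minimal Python sketch of the consequence, assuming the database and descriptor pairs are exactly those on the slides; the names `DESCRIPTORS`, `distinct_terms` and `union_query`, and the quoted OR syntax, are illustrative only and are not the query language of any particular host:

```python
# Descriptors for one concept, copied from the slides above.
# Note: in e-psyche "domestic abuse" is a lead-in term, and the
# PAIS entry applies to all implementations except DIALOG.
DESCRIPTORS = {
    "APAIS": ["domestic violence"],
    "ATED": ["family violence"],
    "BEI": ["family violence"],
    "Whitaker": ["domestic violence"],
    "CBCA": ["family violence"],
    "Canadian News": ["family violence", "domestic violence"],
    "PsycINFO": ["family violence"],
    "e-psyche": ["domestic abuse"],
    "LCSH": ["conjugal violence"],
    "Gale": ["conjugal violence"],
    "PAIS": ["spousal abuse"],
}


def distinct_terms(table):
    """Return every descriptor once, in order of first appearance."""
    seen = []
    for variants in table.values():
        for term in variants:
            if term not in seen:
                seen.append(term)
    return seen


def union_query(table):
    """Join all distinct descriptors into one generic OR expression."""
    return " OR ".join(f'"{term}"' for term in distinct_terms(table))


if __name__ == "__main__":
    print(len(distinct_terms(DESCRIPTORS)), "different descriptors")
    print(union_query(DESCRIPTORS))
```

A cross-database search on this concept would have to OR together every variant, which is the extra work the thesaurus was supposed to save.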

So some preferred/related terms were assigned 0 times?

Champagne promises

From fake rigor...

...to wishful thinking...

...to delusion

Who are you fooling?

Reality check

How many duplicates, triplicates, quadruplicates? ...

Fatal obesity

In spite of “rigorous quality control”...

Rigor mortis sets in

• The Myth of Cross References

• Few lead-in terms

• Marcia Bates' side-of-the-barn principle

• Keep dreaming, but …

• PubMed UMLS MetaThesaurus

• My plea to include (invisibly) the commonly misspelled variants
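
The plea in the last bullet can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the misspellings in `HIDDEN_VARIANTS` are made up for the example, the names `PREFERRED`, `HIDDEN_VARIANTS` and `lookup` are hypothetical, and the fuzzy fallback with `difflib` is an addition of this sketch, not something any of the databases discussed here is known to do.

```python
import difflib

# Preferred descriptors, keyed by their lowercase form.
PREFERRED = {
    "sciatica": "Sciatica",
    "family violence": "Family violence",
}

# Hypothetical hidden lead-ins: common misspellings that users never see
# but that silently point to the correctly spelled descriptor.
HIDDEN_VARIANTS = {
    "siatica": "sciatica",
    "sciatca": "sciatica",
    "familiy violence": "family violence",
}


def lookup(term):
    """Map a user's term to a preferred descriptor, or None if no match."""
    key = " ".join(term.lower().split())  # normalize case and spacing
    key = HIDDEN_VARIANTS.get(key, key)   # resolve known misspellings first
    if key in PREFERRED:
        return PREFERRED[key]
    # Fallback (assumption of this sketch): accept a very close spelling.
    close = difflib.get_close_matches(key, list(PREFERRED), n=1, cutoff=0.8)
    return PREFERRED[close[0]] if close else None


if __name__ == "__main__":
    for query in ["Siatica", "sciatika", "family  violence", "zebra"]:
        print(query, "->", lookup(query))
```

The explicit table handles the misspellings an editor already knows about; the similarity fallback is a guess and would need a conservative cutoff to avoid wrong matches.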

"Battered females" - mapping

What English is used by UN?

Variety of Thesaurus Terms

Look up but don’t search by thesaurus terms alone

• By now you know that

- Not all thesauri are born equal

- Not all implementations of the same thesaurus are born equal

• National anthem

- Same tune, same lyrics

- Delivery by Whitney Houston vs. Roseanne Barr

People do not use a thesaurus unless they must

• Extra step

• "Good enough without it" attitude

• Archaic, stale content, uneasy ESL structure

• Aging crooner at retirement home party

• Very few are Tina Turners

• Fewer than 10 percent of databases have a thesaurus implemented on DIALOG, Ovid, OCLC, ProQuest

• You can tell the horse that there are rivers

• You can tell the horse to go to the river

• You can spur the horse to go to the river

• You can lure the horse to the river

But will it drink the water?

Will it like the muddy and stale water?

How to lure end users into using CV

• Bring the water from the river to the horse

But the water still may be muddy and/or lukewarm

• Mapping users' terms automatically into thesaurus terms

• Once again, it depends on who does the delivery

Excuses

• Attitude remained the same

• I wanna search not browse

• Real men don't ask for directions

• They just drive around the block as if for fun

• Give up finding the gourmet restaurant and go to BK

• Only hope: even the “machoest” machos look at road signs

• Use term mapping (but call it term translation when talking to a male patron)

Wife beating

Ovid is still much better for MeSH

It is a wonderful British & American equivalency (at first sight) in MeSH in Ovid

when you enter the British term

it directly maps to the American exact MeSH term (note the e and the A)

the trick is in the scope note where haemophilia and hemophilia are listed as UF

but you can't count on British terms consistently mapping directly into MeSH (it is a MeSH deficiency, not a software one)

and the American version directly maps to the same
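
The haemophilia example above boils down to a "used for" (UF) lookup. A minimal sketch in Python, assuming only what the slides state: the preferred term is "Hemophilia A" and its scope note lists haemophilia and hemophilia as UF. The names `Entry`, `build_index` and `map_term` are hypothetical and do not reflect how Ovid implements mapping.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """One thesaurus record: a preferred term plus its UF lead-ins."""
    preferred: str
    used_for: list = field(default_factory=list)


# The single record described on the slides.
THESAURUS = [Entry("Hemophilia A", ["haemophilia", "hemophilia"])]


def build_index(entries):
    """Index every preferred term and every UF term, case-insensitively."""
    index = {}
    for entry in entries:
        index[entry.preferred.lower()] = entry.preferred
        for lead_in in entry.used_for:
            index[lead_in.lower()] = entry.preferred
    return index


def map_term(term, index, mapping_on=True):
    """Return the preferred term; leave the input as-is if mapping is off
    or if the thesaurus has no lead-in for it."""
    if not mapping_on:
        return term
    return index.get(term.strip().lower(), term)


INDEX = build_index(THESAURUS)

if __name__ == "__main__":
    print(map_term("haemophilia", INDEX))
    print(map_term("hemophilia", INDEX))
    print(map_term("haemophilia", INDEX, mapping_on=False))
```

The mapping is only as good as the UF list: a British spelling that the thesaurus editors did not record as a lead-in falls through unchanged, whatever the software does.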

What if mapping is not on?

What if mapping is set to ON by default (as it should be in Ovid)? It suggests sciatica

Is sciatica really an equivalent term for ischialgia? Look at the scope note. It does not appear as a synonym, but its definition seems a really good match

and it is assigned to 2,625 records, so well worth checking further

What does MEDLINE's arch-rival EMBASE say? The term is certainly used in European literature

and it maps to the exact EMTree term (MeSH equivalent)

and even better, sciatica is mapped to ischialgia. BINGO, isn't it?

and here is its scope note (sorely missing a medical definition, but the UFs are clearly convincing).
