Workshop on Computer-assisted Indexing
Alexander Nevyjel
International Atomic Energy Agency
34th Consultative Meeting of INIS Liaison Officers (34th ILO Meeting), 2-5 November 2008, Vienna, Austria

Agenda

• Review CAI procedures (workflow, formats, conventions)

• Thesaurus extension: Hidden terms tables

• Problems and how to overcome them

• Discussion and exchange of experiences

• Hands-on training by INIS Subject Specialists (in their offices, open-ended, this afternoon): tips, tricks, recommendations

Objectives of Computer-assisted Indexing

Maintaining database quality

Saving of subject analysis manpower

Improving indexing consistency

CAI Workflow

[Diagram: CAI workflow. Boxes: Input from Member States (Full Indexing; Proposed Terms/No Indexing); Electronic Records from Publishers (Proposed Terms/No Indexing); CAI Offline/Batch; Records with CAI-suggested Descriptors; INIS Subject Analysis Module; CAI Interactive; Training of CAI; Records with Full Indexing; INIS Verification and Production System. Legend: Interactive CAI Processing; Batch Mode; Conventional Processing.]

CAI Batch and Online Processing

• Input: MemSt-CC-yymmdd-xxxxxxxxxxx

• Output: _MemSt-CC-yymmdd-xxxxxxxxxxx

• MemSt is a standard prefix (meaning “member state”)

• CC is the country code

• yymmdd is the date when the file was generated

• xxxxxxxxxxx is any additional identification

• Examples:
  • MemSt-AR-041203-thisismytestfile
  • MemSt-FR-041212-fileidentification
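
The naming convention above can be checked mechanically. The following is a minimal sketch, not part of the CAI system: the function names are hypothetical, and it assumes that CC is two uppercase letters and that the identification part contains no whitespace.

```python
import re
from datetime import datetime

# Pattern for the convention on this slide: MemSt-CC-yymmdd-identification.
# Assumptions (not stated on the slide): CC is two uppercase letters and the
# identification part contains no whitespace.
FILENAME_RE = re.compile(r"^(_?)MemSt-([A-Z]{2})-(\d{6})-(\S+)$")


def parse_cai_filename(name):
    """Split a file name into its parts; raise ValueError if it does not match."""
    match = FILENAME_RE.match(name)
    if match is None:
        raise ValueError(f"not a CAI file name: {name!r}")
    underscore, country, yymmdd, ident = match.groups()
    return {
        "is_output": underscore == "_",  # output files carry a leading underscore
        "country": country,
        "date": datetime.strptime(yymmdd, "%y%m%d").date(),
        "identification": ident,
    }


def output_name(input_name):
    """Return the output file name that corresponds to an input file name."""
    if parse_cai_filename(input_name)["is_output"]:
        raise ValueError("already an output file name")
    return "_" + input_name


info = parse_cai_filename("MemSt-AR-041203-thisismytestfile")
print(info["country"], info["date"], info["identification"])
print(output_name("MemSt-FR-041212-fileidentification"))
```

A file name that fails the check raises an error here; whether the real system rejects such files is not stated in the presentation.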

CAI Batch Processing

• Output: _MemSt-CC-yymmdd-xxxxxxxxxxx

• These files carry the CAI-suggested descriptors in tag 800, preceded by the string ##CAI suggestions##;

• Example:
  • 800^##CAI suggestions##; DESCRIPTOR1; DESCRIPTOR2; DESCRIPTOR3; …….

• The files are sent back to the Member State for reviewing
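
A reviewer's script could split such a tag 800 line into its parts. This is a sketch under assumptions read off the example above: "^" separates the tag from its value and "; " separates descriptors. The function names are hypothetical.

```python
MARKER = "##CAI suggestions##"


def parse_tag_800(line):
    """Return (has_marker, descriptors) for one tag 800 line."""
    tag, sep, value = line.partition("^")
    if tag.strip() != "800" or not sep:
        raise ValueError("not a tag 800 line")
    parts = [p.strip() for p in value.split(";")]
    parts = [p for p in parts if p]  # drop empty pieces left by a trailing ';'
    has_marker = bool(parts) and parts[0] == MARKER
    if has_marker:
        parts = parts[1:]
    return has_marker, parts


def format_tag_800(descriptors):
    """Rebuild the line without the marker, e.g. once reviewing is finished."""
    return "800^" + "; ".join(descriptors)


has_marker, descriptors = parse_tag_800(
    "800^##CAI suggestions##; DESCRIPTOR1; DESCRIPTOR2; DESCRIPTOR3"
)
print(has_marker, descriptors)
print(format_tag_800(descriptors))
```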

CAI Online

• File loaded to CAI online

• All files of a Member State appear on the queue page as batch MemSt-XX

• Please open only your own batch; do not touch other queues

• Files in a queue are opened one after the other, in the order in which they were loaded

CAI Batch Processing: Reviewing Process

• Delete all suggested descriptors which are too general

• Add relevant descriptors which were not found
  • numerical values, e.g. pressure ranges, temperature ranges, etc.
  • nuclear reactions
  • chemical compounds, alloys, etc.

• CAI is cleaning up BT/NTs; clean up BT/NTs from manual additions

• Clean up suggestions from homographic terms

• Delete “##CAI suggestions##”

• Submit file to “INIS Input Box”

CAI Online: Reviewing Process

• Delete all suggested descriptors which are too general

• Add relevant descriptors which were not found
  • numerical values, e.g. pressure ranges, temperature ranges, etc.
  • nuclear reactions
  • chemical compounds, alloys, etc.

• CAI is cleaning up BT/NTs and will give warnings for BT/NTs from manual additions

• Clean up suggestions from homographic terms

• Export file when finished

• File will be exported to INIS Production System (or sent back to the Member State for reviewing, if requested)
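
The kind of BT/NT warning mentioned above can be illustrated with a toy hierarchy. This is only a sketch: the `BROADER` table is a hypothetical excerpt whose hierarchy is assumed from the nesting on later slides, and it does not reproduce how CAI actually stores or checks the thesaurus.

```python
# Hypothetical excerpt: narrower term -> its broader terms (BT).
# The hierarchy is assumed from the nesting shown on later slides.
BROADER = {
    "ALPHA DECAY": {"NUCLEAR DECAY"},
    "BETA DECAY": {"NUCLEAR DECAY"},
    "MEV RANGE 01-10": {"MEV RANGE"},
    "MEV RANGE": {"ENERGY RANGES"},
}


def ancestors(term):
    """All broader terms of a descriptor, following BT links upwards."""
    seen = set()
    stack = [term]
    while stack:
        for broader in BROADER.get(stack.pop(), ()):
            if broader not in seen:
                seen.add(broader)
                stack.append(broader)
    return seen


def bt_nt_warnings(descriptors):
    """List (broader, narrower) pairs that are both present in one record."""
    present = set(descriptors)
    pairs = []
    for narrower in descriptors:
        for broader in sorted(ancestors(narrower) & present):
            pairs.append((broader, narrower))
    return pairs


print(bt_nt_warnings(["NUCLEAR DECAY", "ALPHA DECAY", "MEV RANGE 01-10"]))
```

The sketch only reports the pairs; which of the two terms should be removed is left to the reviewer.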

CAI Thesaurus extension

“Hidden terms” are character patterns representing the different forms in which a concept, indexed by one or more descriptors, appears in free text.

• handled similarly to “forbidden terms”, with one or more USE relations

• CAI internal only

• not exported to INIS production system

• not exported to FIBRE

• not printed in any appearance of the thesaurus

• support identification of descriptors in the free text
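
As an illustration of the idea, a hidden-terms table can be thought of as a mapping from text patterns to descriptors. The sketch below uses a few pairs taken from this presentation; the dictionary layout and the function name are assumptions, not the CAI implementation.

```python
import re

# Hidden term -> descriptors it points to (USE relations). The pairs come from
# examples in this presentation; the data structure itself is an assumption.
HIDDEN_TERMS = {
    "bottom quarks": ["B QUARKS"],
    "iron dibromide": ["IRON BROMIDES"],
    "kdv equation": ["KORTEWEG-DE VRIES EQUATION"],
    "spacetime": ["SPACE-TIME"],
}


def suggest_descriptors(text):
    """Return the descriptors whose hidden terms occur in the free text."""
    found = []
    lowered = text.lower()
    for pattern, descriptors in HIDDEN_TERMS.items():
        # \b keeps a pattern from matching inside a longer word
        if re.search(r"\b" + re.escape(pattern) + r"\b", lowered):
            for descriptor in descriptors:
                if descriptor not in found:
                    found.append(descriptor)
    return found


print(suggest_descriptors("Soliton solutions of the KdV equation in curved spacetime"))
```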

Hidden Terms: Compounds

Descriptor             hidden term       free text
MAGNESIUM BORIDES      MgB_2             MgB2
MAGNESIUM CARBONATES   MgCO_3            MgCO3
MAGNESIUM HYDRIDES     MgH_2             MgH2
MAGNESIUM HYDROXIDES   Mg(OH)_2          Mg(OH)2
IRON BROMIDES          iron dibromide
IRON BROMIDES          iron tribromide
ARSENIC IONS           As"3"-            As3-
ACETYLENE              C_2H_2            C2H2
ACETALDEHYDE           C_2H_4O           C2H4O
ACETIC ACID            C_2H_4O_2         C2H4O2

approx. 2000 hidden terms (expected 3000)

Hidden Terms: Isotopes

Descriptor    hidden term    free text
CESIUM 137                   Cesium 137, Cesium-137
              "1"3"7cs       137Cs
              137 caesium    137 Caesium, 137-Caesium
              caesium 137    Caesium 137, Caesium-137
              137 cesium     137 Cesium, 137-Cesium
              137 cs         137 Cs, 137-Cs
              137cs          137Cs
              cs 137         Cs 137, Cs-137
              cs"1"3"7       Cs137
              cs137          Cs137
CESIUM 138    "1"3"8"mcs     138mCs
              cs"1"3"8"m     Cs138m

approx. 26,000 hidden terms
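
The isotope variants listed above follow a regular pattern, so they could in principle be generated rather than typed. The sketch below is an illustration only: the function names are hypothetical, and it assumes that hyphens in the free text are treated like spaces and that a quote mark precedes each superscripted character, as the rows above suggest.

```python
import re


def isotope_variants(names, symbol, mass):
    """Generate lower-case spelling variants for one isotope."""
    a = str(mass)
    sym = symbol.lower()
    # Superscript notation as it appears on the slide: a quote before each character.
    sup = "".join('"' + ch for ch in a)
    variants = []
    for label in [n.lower() for n in names] + [sym]:
        variants.append(f"{label} {a}")  # "cesium 137", "cs 137"
        variants.append(f"{a} {label}")  # "137 cesium", "137 cs"
    variants += [f"{a}{sym}", f"{sym}{a}", f"{sup}{sym}", f"{sym}{sup}"]
    return variants


def matches_isotope(text, variants):
    """True if any variant occurs in the text; hyphens are treated as spaces."""
    normalized = text.lower().replace("-", " ")
    return any(
        re.search(r"(?<!\w)" + re.escape(v) + r"(?!\w)", normalized)
        for v in variants
    )


variants = isotope_variants(["cesium", "caesium"], "Cs", 137)
print(variants)
print(matches_isotope("Deposition of Caesium-137 in soil", variants))
```

With roughly ten variants per isotope, a count in the tens of thousands for the whole table is plausible, although the presentation does not say how the table was built.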

Hidden Terms: Elementary Particles

Descriptor                  hidden term     free text
B QUARKS                    bottom quarks
T QUARKS                    top quarks
ELECTRON NEUTRINOS          #nu#_e          νe
MUON NEUTRINOS              #nu#_#mu#       νμ
TAU NEUTRINOS               #nu#_#tau#      ντ
RHO-770 MESONS              #rho#(770)      ρ(770)
RHO-770 MESONS              #rho#-770       ρ-770
OMEGA-782 MESONS            #omega#(782)    ω(782)
OMEGA-782 MESONS            #omega#-782     ω-782
KAONS NEUTRAL               K"0             K0
KAONS NEUTRAL SHORT-LIVED   K"0_S           K0S
KAONS NEUTRAL LONG-LIVED    K"0_L           K0L

approx. 300 hidden terms

Hidden Terms: UK/US Spellings

Descriptor                  hidden term
A CENTERS                   a centres
ACTIVITY METERS             activity metres
ANALOG COMPUTERS            analogue computers
ANALOG SYSTEMS              analogue systems
ANESTHESIA                  anaesthesia
ARCHAEOLOGY                 archeology
AUSTRIAN ORGANIZATIONS      austrian organisations
BALLISTIC MISSILE DEFENSE   ballistic missile defence
BAYARD-ALPERT GAGES         bayard-alpert gauges
BEAM ANALYZERS              beam analysers
BEHAVIOR                    behaviour
CATALOGS                    catalogues

approx. 800 hidden terms

Hidden Terms: Diacritics and Countries

Descriptor                 hidden term

Diacritics:
BAECKLUND TRANSFORMATION   backlund transformation
BRUECKNER METHOD           bruckner method
BRUECKNER MODEL            bruckner model
BRUNSBUETTEL REACTOR       brunsbuttel reactor
MOESSBAUER EFFECT          mossbauer effect

Country Names:
CAMBODIA                   kampuchea
COTE D'IVOIRE              ivory coast
GREECE                     hellas
MYANMAR                    burma
SYRIA                      syrian arab republic
THAILAND                   siam

approx. 250 hidden terms
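
The diacritics entries above cover text in which accented letters were reduced to their base letters. One way to reach such hidden terms from text that still carries the accents is to fold the accents first. This is a sketch with hypothetical function names, using Python's standard `unicodedata` module; it is not a description of how CAI handles diacritics.

```python
import unicodedata


def fold_diacritics(text):
    """Lower-case the text and drop combining marks: 'Mössbauer' -> 'mossbauer'."""
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.lower()


# Hidden term -> descriptor, taken from the diacritics rows above.
HIDDEN = {
    "backlund transformation": "BAECKLUND TRANSFORMATION",
    "bruckner method": "BRUECKNER METHOD",
    "brunsbuttel reactor": "BRUNSBUETTEL REACTOR",
    "mossbauer effect": "MOESSBAUER EFFECT",
}


def lookup(phrase):
    """Return the descriptor for a phrase, or None if it is not a known hidden term."""
    return HIDDEN.get(fold_diacritics(phrase))


print(lookup("Mössbauer effect"))
print(lookup("Bäcklund transformation"))
```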

Hidden Terms: Other Spellings

Descriptor                 hidden term

Singular/Plural:
FUNGI                      fungus
FUNGI                      funguses
G MATRIX                   g matrices
G MATRIX                   g matrixes

Reverse Sequence:
ATOM-MOLECULE COLLISIONS   molecule-atom collisions
ATOM-MOLECULE COLLISIONS   atom-molecule scattering
ATOM-MOLECULE COLLISIONS   molecule-atom scattering
ATOM-MOLECULE COLLISIONS   atom-molecule reactions
ATOM-MOLECULE COLLISIONS   molecule-atom reactions
ATOM-MOLECULE COLLISIONS   atom-molecule interactions
ATOM-MOLECULE COLLISIONS   molecule-atom interactions

approx. 900 hidden terms

Hidden Terms: Other Spellings

Descriptor            hidden term

Grammatical Variations:
PERIODICITY           periodic
PERIODICITY           periodical
PERIODICITY           periodically

Phrases versus compound terms:
RADIOWAVE RADIATION   radio wave
SPACE-TIME            spacetime
WAVE FUNCTIONS        wavefunction

Terminology:
GAMMA SPECTROMETERS   #gamma#ray spectrometer
GAMMA SPECTROMETERS   #gamma#-ray spectrometer
GAMMA SPECTROMETERS   gammaray spectrometer
GAMMA SPECTROMETERS   gamma-ray spectrometer

Hidden Terms: Other Spellings

Descriptor                   hidden term

Terminology:
SU-2 GROUPS                  su(2) theory
SU-2 GROUPS                  su(2) symmetry
SU-3 GROUPS                  su(3) theory
SU-3 GROUPS                  su(3) symmetry

Abbreviations:
CARBON DIOXIDE LASERS        CO_2 laser
CARBON DIOXIDE LASERS        CO2 laser
KOBAYASHI-MASKAWA MATRIX     CKM matrix
KORTEWEG-DE VRIES EQUATION   kdv equation

Numerical Values:
KEV RANGE                    kev
MEV RANGE                    mev
GEV RANGE                    gev

CAI Thesaurus Extension

Terminological Knowledge Base

• Thesaurus
  • Valid Descriptors 21,147
  • Forbidden Terms 9,114

• CAI
  • Hidden Terms 34,105

• Total 64,366

Terms which need special attention: Numerical values, ranges

• ENERGY RANGES
  • MEV RANGE
    • MEV RANGE 01-10
    • MEV RANGE 10-100
    • MEV RANGE 100-1000

• PRESSURE RANGES
  • Recognize pressure ranges
  • Translate from atm, bar, torr to Pascal

• TEMPERATURE RANGES
  • Recognize temperature ranges
  • Translate from Celsius, Fahrenheit to Kelvin
  • Attention: the forbidden term (since 1992) “high temperature USE TEMPERATURE RANGE 0400-1000 K” often leads to wrong results
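
The unit translations named above are simple arithmetic. The sketch below shows them with standard conversion factors; the function names are hypothetical, and treating the bounds of the 0400-1000 K range as inclusive is an assumption.

```python
# Standard conversion factors to pascal.
PASCAL_PER_UNIT = {
    "pa": 1.0,
    "bar": 1.0e5,
    "atm": 101325.0,
    "torr": 101325.0 / 760.0,
}


def to_pascal(value, unit):
    """Convert a pressure given in Pa, bar, atm or torr to pascal."""
    return value * PASCAL_PER_UNIT[unit.lower()]


def to_kelvin(value, unit):
    """Convert a temperature given in K, C (Celsius) or F (Fahrenheit) to kelvin."""
    unit = unit.upper()
    if unit == "K":
        return float(value)
    if unit == "C":
        return value + 273.15
    if unit == "F":
        return (value - 32.0) * 5.0 / 9.0 + 273.15
    raise ValueError(f"unknown temperature unit: {unit!r}")


def in_range_0400_1000_k(value, unit):
    """True if the temperature lies in the range named by TEMPERATURE RANGE 0400-1000 K."""
    kelvin = to_kelvin(value, unit)
    return 400.0 <= kelvin <= 1000.0  # inclusive bounds are an assumption


print(to_pascal(1, "atm"), to_pascal(2, "bar"))
print(to_kelvin(500, "C"), in_range_0400_1000_k(500, "C"))
print(in_range_0400_1000_k(100, "C"))
```

A text that calls 100 °C a "high temperature" would fall outside 0400-1000 K after conversion, which may be one reason why the forbidden-term mapping gives wrong suggestions.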

Terms which need special attention: Multi-meaning

• “+” and “-” signs
  • K+ → KAONS PLUS, KAONS MINUS, POTASSIUM IONS

• Case sensitivity
  • TiN → TIN (instead of TITANIUM NITRIDES)
  • “… this can be …” → CALCIUM NITRIDES (CaN)
  • gas → GALLIUM SULFIDES (GaS)
  • “… who is the …” → WHO (World Health Organization)

• Verbs versus Nouns
  • “… this leads us to …” → LEAD
  • “… this leaves it …” → LEAVES
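
The case-sensitivity problems above arise when chemical formulas and ordinary words are compared without regard to case. The sketch below shows one possible remedy, matching formulas case-sensitively and words case-insensitively. It illustrates the problem only; the tables and the function name are hypothetical and do not describe what CAI does.

```python
import re

# Chemical formulas are matched with exact case, ordinary words without.
FORMULAS = {
    "TiN": "TITANIUM NITRIDES",
    "CaN": "CALCIUM NITRIDES",
    "GaS": "GALLIUM SULFIDES",
}
WORDS = {"tin": "TIN"}


def suggest(text):
    """Suggest descriptors token by token, respecting case for formulas."""
    found = []
    for token in re.findall(r"[A-Za-z]+", text):
        if token in FORMULAS:  # exact case must match
            found.append(FORMULAS[token])
        elif token.lower() in WORDS:
            found.append(WORDS[token.lower()])
    return found


print(suggest("TiN coatings on tin substrates"))
print(suggest("this can be a gas"))
```

Text written entirely in capitals, such as some titles, still cannot be resolved this way and needs a reviewer.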

Terms which need special attention: Multi-meaning

• MPA
  • MAXIMUM PERMISSIBLE ACTIVITY
  • Megapascal (MPa)

• GDP
  • GROSS DOMESTIC PRODUCT
  • GADOLINIUM PHOSPHIDES (GdP)

• COBRA → SNAKES
  • COBRA REACTOR → KBR-1 REACTOR

• “… in isotopes …” → INDIUM ISOTOPES

• “… at 195 deg K …” → ASTATINE 195

Terms which need special attention

• Homographic terms
  • Solutions → SOLUTIONS or MATHEMATICAL SOLUTIONS
  • Color → COLOR, COLOR CENTRES, COLOR MODEL
  • Flavor → FLAVOR, FLAVOR MODELS
  • Tunnel → TUNNELS, TUNNELING, TUNNEL EFFECT

• Nuclear Reactions, e.g. 14N(γ,α)10B
  • Targets
  • Beams
  • Reactions
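
A reaction written in the compact notation above can be split into target, incoming particle, outgoing particle and residual nucleus, which are the pieces an indexer needs when choosing target, beam and reaction descriptors. This is a sketch with a hypothetical function name; it assumes the plain form A-symbol(x,y)A-symbol with no superscript markup, and it does not choose descriptors.

```python
import re

# Compact reaction notation: target(projectile,ejectile)residual, e.g. 14N(γ,α)10B.
REACTION_RE = re.compile(
    r"(?P<target_a>\d+)(?P<target_el>[A-Z][a-z]?)"
    r"\((?P<projectile>[^,()]+),(?P<ejectile>[^,()]+)\)"
    r"(?P<residual_a>\d+)(?P<residual_el>[A-Z][a-z]?)"
)


def parse_reaction(text):
    """Return the parts of the first reaction found in the text, or None."""
    match = REACTION_RE.search(text)
    if match is None:
        return None
    return {
        "target": (match["target_el"], int(match["target_a"])),
        "projectile": match["projectile"].strip(),
        "ejectile": match["ejectile"].strip(),
        "residual": (match["residual_el"], int(match["residual_a"])),
    }


print(parse_reaction("cross sections for 14N(γ,α)10B"))
```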

Terms which need special attention: Terms which are often wrong

• Production
  • BEAM PRODUCTION
  • HEAT PRODUCTION
  • HYDROGEN PRODUCTION
  • ISOTOPE PRODUCTION
  • PARTICLE PRODUCTION
  • PLASMA PRODUCTION
  • PRODUCTION

• Transport
  • AIR TRANSPORT
  • ATOM TRANSPORT
  • BEAM TRANSPORT
  • CHARGED-PARTICLE TRANSPORT
  • ENVIRONMENTAL TRANSPORT
  • PHOTON TRANSPORT
  • RADIOACTIVITY TRANSPORT
  • TRANSPORT

• Decay
  • NUCLEAR DECAY
    • ALPHA DECAY
    • BETA DECAY
    • …….
  • PARTICLE DECAY
    • ELECTROMAGNETIC…
    • HADRONIC…
    • RADIATIVE…
    • WEAK…

CAI Hands-on training by Subject Specialists

Physics        Marija Sejmenova-Gichevska   A2477
Chemistry      Christine Krieger-Levine     A2478
Reactors       Neviana Rashkova             A2479
Life Science   Bekele Negeri                A2480