XML and General Dutch Dictionary (ANW)
Van der Kamp, Lexical databases and digital tools, April 29th, 2005
Peter van der Kamp
www.inl.nl
Topics
• Characteristics
• Schema
• XML Dictionary Editor
• Problems to be solved
Characteristics
Online dictionary, no printed version
Dutch language (incl. Flanders) from 1970 - 2018
Based on a corpus of 100 million words
Elaborate microstructure
XML
Schema characteristics
• Divided into 12 subschemas
• Currently all elements: zero or more occurrences, except headword
• Currently 186 atomic elements
• Many enumerations (378, to be used as controlled vocabulary)
• Some elements allowed at different levels
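The schema characteristics above can be illustrated with a minimal sketch of an entry check. It is not the ANW schema: the element names (entry, headword, pos, register) and the vocabulary values are hypothetical, and the rule that an entry holds exactly one headword is an assumption drawn from "zero or more occurrences, except headword".

```python
import xml.etree.ElementTree as ET

# Hypothetical controlled vocabulary: element name -> allowed values.
# The real schema is said to contain 378 enumerations; these two are invented.
CONTROLLED_VOCABULARY = {
    "pos": {"noun", "verb", "adjective"},
    "register": {"formal", "informal"},
}


def check_entry(entry_xml):
    """Return a list of problems found in one dictionary entry."""
    entry = ET.fromstring(entry_xml)
    problems = []

    # Assumption: headword is the only mandatory element and occurs once.
    headwords = entry.findall("headword")
    if len(headwords) != 1 or not (headwords[0].text or "").strip():
        problems.append("entry needs exactly one non-empty <headword>")

    # Every other element is optional, but enumerated elements
    # may only carry values from the controlled vocabulary.
    for element in entry.iter():
        allowed = CONTROLLED_VOCABULARY.get(element.tag)
        if allowed is None:
            continue
        value = (element.text or "").strip()
        if value not in allowed:
            problems.append(
                f"<{element.tag}> value {value!r} is not in the controlled vocabulary"
            )
    return problems
```

An entry such as `<entry><headword>fiets</headword><pos>noun</pos></entry>` yields no problems; leaving out the headword or misspelling a value adds one message each.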
Schema
[Schema diagram: Entry, PoS, Sense]
XML Dictionary Editor
User requirements:
• Don’t want to work with tags
• Tags invisible
XML Dictionary Editor (cont’d)
User requirements:
• Form-like input
• Use of predefined lists (controlled vocabulary)
XML Dictionary Editor (cont’d)
User requirements:
• Inserting, adding and removing elements must be easy
XML Dictionary Editor (cont’d)
User requirements:
• Hide/show elements
Technical requirements:
• Subschema enabled
XML Dictionary Editor (cont’d)
XML editor, but…
…which one?
XMLWriter
XML Dictionary Editor (cont’d)
Currently the best possible solution:
Authentic (free XML content editor from Altova)
StyleVision (e-forms and stylesheet designer from Altova)
(http://www.altova.com)
XML Dictionary Editor: problems
Problem: hide element = delete element
Hide element important due to size of entry
Solution (to be implemented):
• Extra element <hide> in schema
• Checkbox as ‘data entry device’
• When unchecked: perform hide
Disadvantage: <hide> is noise in dictionary entry
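One possible way to limit this disadvantage is to strip the <hide> flags when an entry is exported, so they remain an editor-only device. The sketch below assumes that <hide> is stored as a direct child of the element it controls; the function name strip_hide_flags and the sample element names are hypothetical.

```python
import xml.etree.ElementTree as ET


def strip_hide_flags(entry):
    """Remove every <hide> child in place; return how many were removed."""
    removed = 0
    # Iterate over a snapshot so that removing children is safe.
    for parent in list(entry.iter()):
        for flag in parent.findall("hide"):
            parent.remove(flag)
            removed += 1
    return removed


# Hypothetical entry as the editor might store it.
entry = ET.fromstring(
    "<entry><headword>fiets</headword><hide>true</hide>"
    "<sense><hide>false</hide><definition>x</definition></sense></entry>"
)
count = strip_hide_flags(entry)
```

After the call the entry keeps its headword, sense and definition, and no <hide> element is left in the exported tree.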
XML Dictionary Editor: problems
Problem: visualize difference between container elements and atomic elements.
Current implementation requires some schema knowledge
Conclusion / future work
Developing forms is easy
Current implementation is satisfactory
Database solution (relational vs. XML)
Retrieval
Easy use of (X)Query language
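To give an idea of the kind of retrieval meant here, the sketch below selects headwords by part of speech. It uses the limited XPath subset of Python's standard library as a stand-in, since that library has no XQuery support; the sample data and element names are hypothetical. In XQuery the same request might read: for $e in //entry[pos = "noun"] return $e/headword.

```python
import xml.etree.ElementTree as ET

# Hypothetical mini-dictionary; the element names are invented.
SAMPLE = """<dictionary>
  <entry><headword>fiets</headword><pos>noun</pos></entry>
  <entry><headword>fietsen</headword><pos>verb</pos></entry>
  <entry><headword>huis</headword><pos>noun</pos></entry>
</dictionary>"""


def headwords_with_pos(dictionary_xml, pos):
    """Return the headwords of all entries with the given part of speech."""
    root = ET.fromstring(dictionary_xml)
    # [pos='noun'] keeps entries that have a <pos> child with exactly that text.
    matches = root.findall(f"entry[pos='{pos}']")
    return [entry.findtext("headword") for entry in matches]


nouns = headwords_with_pos(SAMPLE, "noun")  # ['fiets', 'huis']
```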