NON-TECHNICAL COMPUTER THESAURUS VERSUS SPECIALIZED COMPUTER THESAURUS

DESCRIPTION

This presentation is devoted to a comparative analysis of the Computer Thesaurus of Ukrainian Verbs and the Specialized Thesaurus of Computer Ideography. These two dictionaries are representative examples of a general language (non-technical) computer thesaurus and a specialized computer thesaurus. We focus our attention on the entries of each thesaurus, its macrostructure, microstructure, compilation and use.

1. NON-TECHNICAL COMPUTER THESAURUS VERSUS SPECIALIZED COMPUTER THESAURUS

Olena Siruk, Laboratory for Computational Linguistics, Institute of Philology, National Taras Schevchenko University of Kyiv, Ukraine, olebosi@gmail.com

2. 1. Topicality of the research

  • Compilation of general and specialised (terminological) thesauri
  • Ukrainian lexicography development
  • Users' requirements for integrated information
  • Development of computer technologies
  • Development of formalised principles of thesauri modelling
  • Systematisation of terms
  • Standardisation of definitions

3.

  • Non-technical (Computer Thesaurus of Ukrainian Verbs): tested on the semantic field of speech
  • Specialized (Specialized Thesaurus of Computer Ideography): the thesaurus groups terms on the conceptual principle

4. 2. CT units (CT of CI versus CT of UV)

  • Quantity: 75 terms (considered complete) / about 2000 units in the semantic field of speech
  • Type: nouns, noun-noun and noun-adjective compounds / verbs
  • Amount: from 1 to 4 words in a term / LSV (lexico-semantic variant)
  • Content: from highly specialised terms to terms related to other linguistic disciplines / verbs of the semantic field of speech

5. 3. CT of Nouns versus CT of Verbs

  • It is precisely the noun that takes first place in ideographical dictionaries of different languages.
  • The basis for the semantic scheme of nouns is adopted from objective extralinguistic reality.
  • Verbs are included in the different types of thesauri considerably less often than nouns, and especially seldom in terminological thesauri.
  • Significative semantics prevails in the meaning of a verb.

6. 4. CT of Nouns

  • Consequently, for a noun:
  • 1) an external, denotative choice of concepts is characteristic;
  • 2) a deductive approach to structuring the material is mostly applied;
  • 3) word-formation and the valency potential of a noun are not very important for the creation of the synoptic scheme;
  • 4) whole-part relations are substantial, and taxonomy is prevalent.
  • It is precisely the noun that takes first place in ideographical dictionaries of different languages.

7. 5. CT of Verbs

  • In light of this, for verbs:
  • 1) an internal, significative concept selection strategy based on the analysis of meaning is more acceptable;
  • 2) an inductive approach to ordering lexemes is more adequate;
  • 3) relations based on word-formation type (derivational hyponymy) and valency potential (a basis for connections between parts of speech) are essential;
  • 4) taxonomy and whole-part relations are irrelevant.

8. 6. CT macrostructure

  • Synoptic scheme represented as a term index
  • Maximum depth: 6 levels of hierarchy
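The two macrostructure points above (a synoptic scheme shown as a term index, with a bounded hierarchy depth) can be sketched in code. This is only a minimal illustration: the `Node` class, the `term_index` and `depth` functions, and every rubric label are hypothetical and are not taken from either thesaurus.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """One rubric of a synoptic scheme (labels used below are hypothetical)."""
    label: str
    children: list[Node] = field(default_factory=list)


def term_index(node: Node, level: int = 0) -> list[tuple[int, str]]:
    """Flatten the scheme into (level, label) pairs in depth-first order."""
    rows = [(level, node.label)]
    for child in node.children:
        rows.extend(term_index(child, level + 1))
    return rows


def depth(node: Node) -> int:
    """Number of hierarchy levels, counting the root as one level."""
    return 1 + max((depth(child) for child in node.children), default=0)


# Illustrative scheme only; not the real scheme of either dictionary.
scheme = Node("speech", [
    Node("speaking", [
        Node("manner of speaking", [
            Node("speaking loudly"),
            Node("speaking quietly"),
        ]),
    ]),
    Node("communication"),
])

# Print the scheme as an indented term index with the level after each label.
for level, label in term_index(scheme):
    print("  " * level + f"{label} ({level})")
```

Under this counting, a scheme whose deepest rubric sits at level 5 has a depth of 6, since the root is level 0.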

9.

  • Both types of CT have certain common, analogous and uniting features:
  • 1) both dictionaries represent more or less completely the relations between units;
  • 2) both dictionaries either have an explicit synoptic scheme, that is, a division of the universe into thematic classes, or such a scheme is present implicitly;
  • 3) the rubric (a class of synonymous words in non-technical thesauri and a descriptor article in specialized thesauri) serves as interpretation, or as context, in both dictionaries;
  • 4) there are cross-references between entries in both dictionaries.
  • The features of the lexical semantics of verbs account for the difference between an ideographical dictionary of nouns and an analogous dictionary of verbs in the organization of the external structure (macrostructure). Verbs have been categorized primarily on a semantic basis, using component analysis and stepwise identification of verbal meanings.

10. Synoptic scheme of the CT (hierarchy diagram, levels 0 to 5; node labels not recoverable)

11. 12. CT fragment (online version)

13. Synoptic scheme of speech verbs in the CT (hierarchy diagram, levels 0 to 4; node labels not recoverable)

14. 15. 7. CT microstructure (CT of CI versus CT of UV)

  • Title: term / verb
  • Definition: genus-species (for a term) or close to encyclopaedic (for a concept) / interpretation
  • Relations: genus-species and synonymic / the same, plus manner-of-action relations and relations between the verb and other parts of speech
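The microstructure comparison above can be read as one entry record whose permitted relation types depend on the thesaurus. The sketch below assumes this reading; the `Entry` class, the relation identifiers, and the sample entries are hypothetical names chosen for illustration, not the format used by either dictionary.

```python
from dataclasses import dataclass, field

# Relation types named on the slide; the identifier spellings are assumptions.
CI_RELATIONS = {"genus", "species", "synonym"}
UV_RELATIONS = CI_RELATIONS | {"manner_of_action", "other_part_of_speech"}


@dataclass
class Entry:
    """A thesaurus entry: title, definition (or interpretation), relations."""
    title: str
    definition: str
    kind: str  # "CI" for the specialized thesaurus, "UV" for the verb thesaurus
    relations: dict = field(default_factory=dict)

    def add_relation(self, relation: str, target: str) -> None:
        """Record a relation, rejecting types the thesaurus kind does not use."""
        allowed = UV_RELATIONS if self.kind == "UV" else CI_RELATIONS
        if relation not in allowed:
            raise ValueError(f"{relation!r} is not used in a {self.kind} entry")
        self.relations.setdefault(relation, []).append(target)


# Hypothetical sample entries, not quoted from the dictionaries.
term = Entry("thesaurus", "a dictionary organised by concepts", "CI")
term.add_relation("genus", "dictionary")

verb = Entry("whisper", "to speak very quietly", "UV")
verb.add_relation("genus", "speak")
verb.add_relation("other_part_of_speech", "whisper (noun)")
```

Keeping one record type with a per-thesaurus whitelist makes the slide's point visible: the verb entry is a superset of the term entry as far as relations go.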

16. 17. 8. Semantic relations in CT

    • Hierarchical interverbal relations, or hyponymy (derivational hyponymy in particular), represented by hyperonyms, hyponyms, and verbs of manner of action (VMA).
    • Same-level interverbal relations, i.e., synonymy (represented by complete (absolute) synonyms, in particular by phonetic variants of verbs, and by stylistic and derivational synonyms) as well as antonymy (represented by antonyms).
    • Relations between the verb and other parts of speech, based on verbal derivation within parts of speech and the valency potential of the verb.
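The hierarchical and same-level relations listed above can be sketched as a small labelled graph in which every link also stores its inverse: hyperonym and hyponym mirror each other, while synonymy and antonymy are symmetric. The `RelationGraph` class, its method names, and the English sample verbs are assumptions made for illustration; relations to other parts of speech and VMA are left out to keep the sketch short.

```python
from collections import defaultdict

# Each relation and the relation that must hold in the opposite direction.
INVERSE = {
    "hyperonym": "hyponym",
    "hyponym": "hyperonym",
    "synonym": "synonym",
    "antonym": "antonym",
}


class RelationGraph:
    def __init__(self):
        # verb -> relation name -> set of related verbs
        self._links = defaultdict(lambda: defaultdict(set))

    def add(self, source, relation, target):
        """State that `target` is the `relation` of `source`, plus the inverse."""
        self._links[source][relation].add(target)
        self._links[target][INVERSE[relation]].add(source)

    def related(self, verb, relation):
        """Alphabetically sorted verbs linked to `verb` by `relation`."""
        return sorted(self._links[verb][relation])

    def hyperonym_chain(self, verb):
        """Follow the first hyperonym upwards until the top (or a cycle)."""
        chain, seen, current = [], {verb}, verb
        while True:
            parents = self.related(current, "hyperonym")
            if not parents or parents[0] in seen:
                return chain
            current = parents[0]
            seen.add(current)
            chain.append(current)


# Hypothetical English verbs standing in for units of the speech field.
graph = RelationGraph()
graph.add("whisper", "hyperonym", "speak")
graph.add("shout", "hyperonym", "speak")
graph.add("speak", "hyperonym", "communicate")
graph.add("whisper", "antonym", "shout")

print(graph.hyperonym_chain("whisper"))
print(graph.related("speak", "hyponym"))
```

Storing the inverse at insertion time means a cross-reference entered in one entry is automatically visible from the other entry.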

18. Example of a CT entry

19. Example of a CT entry

20. 9. Application of CT

  • As an inquiry system
  • For teaching purposes

21. 10. Audience

  • The Specialised Thesaurus of Computer Ideography is intended for:
  • Specialists in philology
  • Students of philology
  • The Computer Thesaurus of Ukrainian Verbs has a wider audience: owing to its specific character, it can be used as a multi-level information system and as a base for further linguistic research.

22. 11. How to use CT

  • Computer program
  • Paper project
  • Computer version on the linguistic portal MOVA.info in the dictionary section

23. Thank you! Contact information: Olena Siruk, Laboratory for Computational Linguistics, National Taras Schevchenko University of Kyiv, olebosi@gmail.com