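The Specification of the Semantic Knowledge-base of Contemporary Chinese (现代汉语语义词典规格说明书) - ccl.pku.edu.cn/doubtfire/papers/2003_semdict_specification... · "...Chinese Semantic Dictionary for Information Processing" (陈力为, 袁琦,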

  • Journal of Chinese Language and Computing, 13 (2) 159-176 159

    [email protected]; [email protected]; [email protected]

    Submitted on 8 March 2003; revised and accepted on 16 May 2003

    (SKCC)

    6.6

    1.

    80

    WordNet (Fellbaum, 1998), MindNet (Richardson, 1998), FrameNet (Fillmore, 1998), EDR

    973 Program: G1998030507-4, G1998030507-1; 863 Program: 2002AA117010-08

  • Hui Wang, Weidong Zhan, Shiwen Yu 160

    SenseWeb 905 (, , 1995)Hownet, 1999CCD, , 2002Chodorow 1985

    Ide 1993 1998

    1994

    SKCC1996

    1998 863

    4.9

    1998 IBM, Intel, Fujitsu, Toshiba, NTT, Canon, Sail-labs 20

    4 2001 11 973

    6.6 SKCC

    2.

    2.1

    SKCC 2003

  • The Specification of The Semantic Knowledge-base of Contemporary Chinese 161

    14,663 1993

    1.7

    6.5

    Microsoft FoxPro 6.0 1

    11

    8 15 16 1

    4

    Entries    Attribute fields
    37485      15
    567        15
    185        15
    203        15
    235        15
    20798      16
    3557       15
    315        15
    993        15
    996        11
    108        11
    65442      8

    1 SKCC
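
A quick arithmetic check on the figures listed for Table 1: the eleven smaller counts add up to the final figure, 65442, which suggests that the last row is a total over the other rows. The snippet below is only a sanity check on the numbers as they appear here. Reading the first number of each pair as an entry count is an assumption, and the variable names are illustrative.

```python
# First number of each pair in Table 1, in the order listed (assumed to be entry counts).
sub_database_counts = [37485, 567, 185, 203, 235, 20798, 3557, 315, 993, 996, 108]

# Last figure in the table, assumed to be the overall total.
reported_total = 65442

computed_total = sum(sub_database_counts)
print(computed_total, computed_total == reported_total)
```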

    2.2 SKCC

    19831987

    199819961998(1998)19981999

    SKCC

    4

    Wordnet CCD

    :

    A

    5 2

    Wordnet

    B Wordnet

    C 7 5 29

    Wordnet synset Wordnet

    synset Artifact: 18 synsets (antiquity, block, covering, creation, decoration, drug, enclosure, excavation, excavation, facility, fixture, float, instrumentality, toy, way, keepsake, notion, prize); Substance: 18 synsets (material, allergen, mixture, atom, molecule, chemical_element, activator, inhibitor, compound, fuel, medium, leaven, fluid, sludge, refrigerant, residue, poison, chemical_irritant, solid, emanation)

    6.6

    2.2.1 Noun

    1 entity
      1.1 organism
        1.1.1 person
          1.1.1.1 individual
            1.1.1.1.1 profession
            1.1.1.1.2 identity
            1.1.1.1.3 relation
            1.1.1.1.4 name
          1.1.1.2 group
            1.1.1.2.1 organization
            1.1.1.2.2 society
        1.1.2 animal
          1.1.2.1 beast
          1.1.2.2 bird
          1.1.2.3 fish
          1.1.2.4 insect
          1.1.2.5 reptile
        1.1.3 plant
          1.1.3.1 tree
          1.1.3.2 grass
          1.1.3.3 flower
          1.1.3.4 crop
        1.1.4 microbe
      1.2 object
        1.2.1 artifact
          1.2.1.1 building
          1.2.1.2 clothes
          1.2.1.3 food
          1.2.1.4 drug
          1.2.1.5 works
          1.2.1.6 software
          1.2.1.7 asset
          1.2.1.9 bill
          1.2.1.10 certificate
          1.2.1.11 symbol
          1.2.1.12 material
          1.2.1.13 instrument
            1.2.1.13.1 tool
            1.2.1.13.2 vehicle
            1.2.1.13.3 weapon
            1.2.1.13.4 furniture
            1.2.1.13.5 musical-instrument
            1.2.1.13.6 electricity
            1.2.1.13.7 stationery
            1.2.1.13.8 sports-instrument
        1.2.2 natural object
          1.2.2.1 celestial body
          1.2.2.2 weather
          1.2.2.3 geography
            1.2.2.3.1 land
            1.2.2.3.2 water
          1.2.2.4 mineral
          1.2.2.5 element
          1.2.2.6 substance
        1.2.3 excrement
        1.2.4 shape
      1.3 part
        1.3.1 body-part
        1.3.2 object-part
    2 abstraction
      2.1 attribute
        2.1.1 measurable
        2.1.2
          2.1.2.1 property_of_human
          2.1.2.2 description_of_event
          2.1.2.3 property_of_object
        2.1.3 color
      2.2 information
      2.3 field
      2.4 rule
      2.5 physiological_state
      2.6 psychological feature
        2.6.1 feelings
        2.6.2 cognition
      2.7 motivation
    3 process
      3.1 event
      3.2 natural phenomenon
        3.2.1 visible phenomenon
        3.2.2 audible phenomenon
    4 time
      4.1 specific time
      4.2 relative time
    5 space
      5.1 location
      5.2 direction
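
The dotted codes in the noun classification encode the hierarchy directly: the parent of a class is obtained by dropping the last component of its code. A minimal Python sketch of how such a code could be resolved into its chain of ancestor classes follows. The labels are a small subset copied from the noun classification in this section; the names `NOUN_CLASSES`, `ancestors` and `is_a` are hypothetical and are not part of SKCC itself.

```python
# A small subset of the noun classes, keyed by dotted code.
NOUN_CLASSES = {
    "1": "entity",
    "1.1": "organism",
    "1.1.2": "animal",
    "1.1.2.2": "bird",
    "1.2": "object",
    "1.2.1": "artifact",
    "1.2.1.1": "building",
    "1.2.1.13": "instrument",
    "1.2.1.13.2": "vehicle",
}


def ancestors(code, classes=NOUN_CLASSES):
    """Return the class labels from the root down to `code`."""
    parts = code.split(".")
    # Every prefix of the dotted code names one ancestor.
    return [classes[".".join(parts[:i])] for i in range(1, len(parts) + 1)]


def is_a(code, ancestor_code):
    """True if `code` is `ancestor_code` itself or lies below it."""
    # Compare whole components, so "1.2.1.13" is not treated as a child of "1.2.1.1".
    return code == ancestor_code or code.startswith(ancestor_code + ".")


print(ancestors("1.2.1.13.2"))
print(is_a("1.1.2.2", "1.1"))
```

Keeping the codes as strings and comparing on whole components avoids the prefix trap in which `1.2.1.13` (instrument) would otherwise look like a descendant of `1.2.1.1` (building).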

    2.2.2 Adjective

    1 description of event
    2 property of object
      2.1 measurable value
        2.1.1 concentration
        2.1.2 temperature
        2.1.3 speed
        2.1.4 length
        2.1.5 height
        2.1.6 width
        2.1.7 depth
        2.1.8 thickness
        2.1.9 rigidity
        2.1.10 humidity
        2.1.11 degree of finish
        2.1.12 degree of tightness
        2.1.13 size
        2.1.14 value
      2.2 unmeasurable value
        2.2.1 vision
        2.2.2 tactility
        2.2.3 tone
        2.2.4 taste
        2.2.5 quality
        2.2.6 content
        2.2.7 shape
      2.3 color
    3 property of human
      3.1 age
      3.2 character
      3.3 relation
      3.4 condition
    4 property of space
      4.1 one dimension
      4.2 two dimension
      4.3 three dimension
    5 property of time

    2.2.3 Verb

    1 state
    2 emotion/cognition
    3 event
      3.1 change
      3.2 weather
      3.3 bodily care and functions
      3.4 perception
      3.5 consumption
      3.6 motion
      3.7 creation
      3.8 contact
      3.9 possession
      3.10 communication
      3.11 competition
      3.12 social behavior
      3.13 other event

    2.2.4 Adverb

    1 degree
    2 range
    3 time
    4 location
    5 frequency
    6 manner
    7 negation

    2.2.5 Numeral

    1 cardinal number
      1.1
      1.2
    2 ordinal number
    3 amount
    4 auxiliary

    2.3

    2.3.1

    8 1~4

    24 123455

    chang2shi2

    chi3zi5
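
The two examples above, chang2shi2 and chi3zi5, suggest that the pronunciation field writes each syllable in lowercase letters followed by a single tone digit. A minimal sketch of splitting such a string into (syllable, tone) pairs is given below. It assumes digits 1 to 4 mark the four tones and 5 marks the neutral tone, which is the common numbered-pinyin convention and is inferred here from chi3zi5 rather than stated in the surviving text. The function name `split_pinyin` is hypothetical.

```python
import re

# One syllable: lowercase letters followed by a tone digit 1-5 (assumed convention).
SYLLABLE = re.compile(r"([a-z]+)([1-5])")


def split_pinyin(text):
    """Split a numbered-pinyin string into (syllable, tone) pairs."""
    return [(m.group(1), int(m.group(2))) for m in SYLLABLE.finditer(text)]


print(split_pinyin("chang2shi2"))
print(split_pinyin("chi3zi5"))
```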

    2 nva

    2 IN

    LV

    2 ()

    (huo4)(he2)

    (he2ji4)

    (he2ji5)

    (he2)(huo4)

    ABC

    1

    23 AB

    C ABC

    123

    A1A2B1B2

    4 v

    n

    2

    1

    2

    10

    20

    /

    /

    WORD 40 quiet

    dirty and messy

    Ecat 40

    A!A+C+!A(!)

    20 ~

    /

    WORD Ecat42002 3

    2.3.2

    2 1 2 3

    (

    ,19951998)

    3 1 2

    3

    1 1

    0

    20 /

    agent

    /

    20

    /

    object

    /

    /

    20

    / 30

    / /

    // ~~ **

    4

    2.3.3

    2 1 2

    19941995

    1/

    *

    *

    2

    /

    /

    /

    20

    20

    /

    20

    2.3.4

    2 12

    1

    /

    2 /

    20

    /

    /

    /

    20

    /

    /

    3

    SKCC NLP

    SKCC 6.6

    973

    [1] Baker, C. F., C. J. Fillmore, and J. B. Lowe, 1998, The Berkeley FrameNet Project, In Proceedings of COLING'98, pp. 86-90.

    [2] Fellbaum, Christiane, ed., 1998, WordNet: An Electronic Lexical Database, Cambridge, Mass.: MIT Press.

    [3] Richardson, Stephen D., 1998, MindNet: Acquiring and Structuring Semantic Information from Text, In Proceedings of COLING'98, pp. 1098-1102.

    [4] ..1995.. :

    [5] . 1996.. . :

    [6] 1998. 2

    [7] . 1998. . :

    [8] . 1998.. .. 3

    [9] . 1999. Hownet. http://www.keenage.com

    [10] . 1987... :

    [11] . 1998. .. 3

    [12] ..1995.. . : . pp. 1-7

    [13] . 1983..

    [14] .. 1998. .1998 . . pp. 361-367

    [15] .. 2002.. CCD . No. 4. pp. 12-20

    [16] . 2003.. 2 .

    [17] . 1994. .. 4

    [18] . 1995.. .. . pp. 29-58

    [19] . 1998...

    WORD

    v 1 sprout

    v 2 2 build

    v 3 present

    v 3 tell

    2 SKCC
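
In the verb samples above, the first number after the part-of-speech tag `v` rises from 1 (sprout) through 2 (build) to 3 (present, tell), which is consistent with a count of the verb's arguments. The sketch below groups the four samples by that number. Treating it as a valence count is an assumption, based only on these samples and on the "valence information" listed among the paper's key words; the original column headings are not preserved here. Only the English glosses are used, and the names `VERB_SAMPLES` and `group_by_valence` are hypothetical.

```python
from collections import defaultdict

# (English gloss, first number after the POS tag), copied from the verb samples.
# Interpreting the number as valence is an assumption.
VERB_SAMPLES = [("sprout", 1), ("build", 2), ("present", 3), ("tell", 3)]


def group_by_valence(samples):
    """Group glosses by their (assumed) valence value."""
    groups = defaultdict(list)
    for gloss, valence in samples:
        groups[valence].append(gloss)
    return dict(groups)


print(group_by_valence(VERB_SAMPLES))
```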

    WORD

    n 0 tiger

    n 1 1 1 / leg

    n 2 2 1 leg

    n 2 / opinion

    3 SKCC

    WORD

    a 1 1 big

    a 2 1 / heavy

    a 1 / crowded

    a 2 warm

    4 SKCC

    The Specification of The Semantic Knowledge-base of Contemporary Chinese

    Hui Wang1, Weidong Zhan2, Shiwen Yu1

    1Institute of Computational Linguistics, Peking University, Beijing 100871
    2Dept. of Chinese Language & Literature, Peking University, Beijing 100871

    [email protected]; [email protected]; [email protected]

    Abstract: The Semantic Knowledge-base of Contemporary Chinese (SKCC) is a large machine-readable dictionary developed by the Institute of Computational Linguistics and the Chinese Department of Peking University. It provides a large amount of semantic information, such as semantic hierarchy and collocation features, for 66,539 Chinese words and their English counterparts. Its semantic classification system represents the latest progress in Chinese linguistics and language engineering. The descriptions of semantic attributes are fairly thorough, comprehensive and authoritative. The paper introduces the outline and specification of SKCC and indicates that, as a large-scale fundamental semantic resource for Chinese, SKCC will not only provide valuable semantic knowledge for Chinese language processing but also play an important role in research on Chinese lexical semantics and computational lexicography.

    Key words: semantic knowledge-base, lexical semantics, computational lexicography, semantic hierarchy, valence information, Chinese language processing
