Comparing syntactic semantic patterns and passages in Interactive Cross Language Information Access (iCLEF at the University of Alicante)

Borja Navarro, Fernando Llopis, Miguel Ángel Varó {borja, llopis, mvaro}@dlsi.ua.es

Departamento de Lenguajes y Sistemas Informáticos, Universidad de Alicante

Page 1: Borja Navarro, Fernando Llopis, Miguel Ángel Varó {borja, llopis, mvaro}@dlsi.ua.es

Comparing syntactic semantic patterns and passages in Interactive Cross Language Information Access (iCLEF at the University of Alicante)

Borja Navarro, Fernando Llopis, Miguel Ángel Varó

{borja, llopis, mvaro}@dlsi.ua.es

Departamento de Lenguajes y Sistemas Informáticos

Universidad de Alicante

Page 2:

iCLEF 2003 2

Outline

1. Introduction and objectives

2. Method of interaction I: passages

3. Method of interaction II: syntactic semantic patterns

4. Description of the experiment

5. Results and conclusions

Page 3:

Introduction and objectives

Page 4:

Introduction

• An important aspect of Interactive Cross Language Information Access is the way in which the system shows the relevant information to the user
  – With only this information, the user must decide whether the document is relevant or not

• This is a key point for the correct selection of documents and for future refinements of the query

Page 5:

Introduction

• Problem:
  – Multilingualism:
    • The language of the query and the language of the documents are different

• Main solutions:
  – To show the information in the language of the query
  – To show the information in the language of the document

Page 6:

Introduction

• iCLEF 2002 (Llopis et al. 2002):
  – A passage-based system was used for the interaction with the user
  – This approach is better than the use of the whole document
  – Main problem: many passages were unreadable for the users due to problems with the machine translation of the passages

Page 7:

Introduction

• Our aim at iCLEF 2003:
  – To improve this approach in two aspects:
    • The interaction speed: the time spent by the user between the loading of the passage and the decision about its relevance
    • The recall and precision in the selection of the relevant documents
  – But avoiding the use of Machine Translation
  – We have defined an interactive approach based on syntactic-semantic patterns (Navarro et al. 2003)

Page 8:

Objectives at iCLEF 2003

• To find out whether a user can decide if a document is relevant or not using only the syntactic semantic patterns extracted from the passages

• To find out whether interaction based on syntactic semantic patterns is better than interaction based on passages only

• To find out whether the use of syntactic semantic patterns is better than the machine translation of the passages

Page 9:

Method I: Approach based on passages

Page 10:

Method I: passages

• Developed and presented at iCLEF 2002 (Llopis et al. 2002)

• Passage: a relevant piece of text of a document

• With the use of passages, only the most relevant information of a document is shown to the user
  – Not the whole document

Page 11:

Method II: Approach based on syntactic semantic patterns

Page 12:

Syntactic-semantic pattern

• Linguistic pattern formed by three components:
  – A verb with one sense (necessary)
  – The subcategorization frame of the sense
  – The selectional preferences of each argument (semantic features)
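As a rough illustration of the three components, the sketch below models a pattern as a small Python data structure. The class and field names, the split into subject and complements, and the feature label "human" are assumptions made for this example; the slides do not specify a representation or a sense inventory. The instance reproduces one of the patterns listed on the Example slide.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Argument:
    """One argument of the verb (field names are illustrative)."""
    head: str                          # head noun, e.g. "Russia"
    preposition: Optional[str] = None  # introducing preposition, if any
    # Selectional preferences of this argument (labels are hypothetical).
    semantic_features: List[str] = field(default_factory=list)

    def text(self) -> str:
        return f"{self.preposition} {self.head}" if self.preposition else self.head


@dataclass
class Pattern:
    verb: str                          # the only required component
    sense: Optional[str] = None        # sense label; inventory not given in the slides
    subject: Optional[Argument] = None
    # Remaining slots of the subcategorization frame, in surface order.
    complements: List[Argument] = field(default_factory=list)

    def render(self) -> str:
        """Flatten the pattern into the one-line form shown to the searcher."""
        parts = [self.subject.text()] if self.subject else []
        parts.append(self.verb)
        parts.extend(c.text() for c in self.complements)
        return " ".join(parts)


example = Pattern(
    verb="punish",
    subject=Argument("Primakov", semantic_features=["human"]),
    complements=[Argument("Russia"), Argument("its stance", preposition="for")],
)
print(example.render())  # prints: Primakov punish Russia for its stance
```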

Page 13:

Automatic extraction of pattern

• Parser: MiniPar

• Steps:
  – Look for a verb
  – Look for a noun to the left of the verb
  – Look for a noun, or a preposition plus noun, to the right of the verb
  – Look for a noun, or a preposition plus noun, to the right of the previous noun
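The four steps can be sketched over a list of hand-tagged tokens. This is a simplified stand-in, not the authors' implementation: it does not call MiniPar, the tag set ("N", "V", "P", "O") is invented for the example, and a real dependency parse would resolve subjects across clauses, which this linear scan does not.

```python
from typing import List, Optional, Tuple

# (lemma, tag) pairs; tags are a made-up coarse set:
# "N" noun, "V" verb, "P" preposition, "O" anything else.
Token = Tuple[str, str]


def noun_left_of(tokens: List[Token], verb_idx: int) -> Optional[str]:
    """Step 2: nearest noun to the left of the verb, if any."""
    for lemma, tag in reversed(tokens[:verb_idx]):
        if tag == "N":
            return lemma
    return None


def complement_right_of(tokens: List[Token], start: int) -> Tuple[Optional[str], int]:
    """Steps 3 and 4: next noun, or preposition plus noun, from `start` on.

    Returns the complement text (or None) and the index just past it.
    The scan stops at the next verb so that clauses are not mixed.
    """
    prep = None
    for i in range(start, len(tokens)):
        lemma, tag = tokens[i]
        if tag == "V":
            break
        if tag == "P":
            prep = lemma
        elif tag == "N":
            return (f"{prep} {lemma}" if prep else lemma), i + 1
    return None, len(tokens)


def extract_patterns(tokens: List[Token]) -> List[str]:
    patterns = []
    for idx, (lemma, tag) in enumerate(tokens):
        if tag != "V":  # Step 1: look for a verb
            continue
        parts = []
        subject = noun_left_of(tokens, idx)
        if subject:
            parts.append(subject)
        parts.append(lemma)
        first, next_idx = complement_right_of(tokens, idx + 1)
        if first:
            parts.append(first)
            second, _ = complement_right_of(tokens, next_idx)
            if second:
                parts.append(second)
        patterns.append(" ".join(parts))
    return patterns


# Hypothetical, hand-lemmatized input: "The journalist sent a report to the editor."
sentence = [("the", "O"), ("journalist", "N"), ("send", "V"), ("a", "O"),
            ("report", "N"), ("to", "P"), ("the", "O"), ("editor", "N")]
print(extract_patterns(sentence))  # prints: ['journalist send report to editor']
```

Working on lemmas rather than surface forms mirrors the slide example, where "journalists who attended" is shown as "journalist attend".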

Page 14:

Example

Primakov suggested that the Administration was using the Ames arrest to score domestic political points, to punish Russia for its independent stance on the conflict in Bosnia-Herzegovina and to provide convenient excuse for cutting American aid to Russia, according to journalists who attended.

• Primakov suggest Administration

• administration use Ames arrest

• administration score domestic point

• Primakov punish Russia for its stance

• Primakov provide convenient excuse for

• Primakov cut American aid to Russia according to journalist

• journalist attend

Page 15:

Automatic extraction of pattern

• The patterns are extracted from the passages

• The patterns show only the basic information of each sentence:
  – the most important words: the verb and the arguments,
  – the syntactic and semantic relations between them

• This is enough to know the topic of a document and to decide about its relevance

Page 16:

Automatic extraction of pattern

• Hypothesis:
  – It is possible to decide about the relevance of a document using only the patterns
  – For a searcher with passive language abilities in the foreign language, it is easier to process the patterns than the complete passage, because the searcher pays attention only to the main words of each sentence

Page 17:

Description of the experiment

Page 18:

Experiment

• Cross-language document selection

• Search group: Spanish speakers with passive language abilities in English

• Information Retrieval System: IR-n system (Llopis 2003)
  – It uses the complete query
  – For each query, it extracts 25 (possibly) relevant documents

Page 19:

Experiment

• Each retrieved document is shown to the user:
  – System 1 shows only passages (in English)
  – System 2 shows the patterns extracted from the passages (in English)

• With this, the user must decide if the document is relevant or not

• Through an HTML interface, we save the relevance judgment and the time spent
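How the interface stored its data is not described in the slides. The sketch below shows one plausible way to record a relevance judgment together with the time between display and decision. Every name in it (`Judgment`, `JudgmentLog`, `show`, `decide`, the document identifiers) is hypothetical, and the clock is injectable only so the logic can be checked without waiting.

```python
import time
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Judgment:
    searcher: int
    system: str      # "passages" or "patterns"
    doc_id: str
    relevant: bool
    seconds: float   # time between display and decision


class JudgmentLog:
    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._shown_at = 0.0
        self.records: List[Judgment] = []

    def show(self) -> None:
        """Call when the passage or the patterns are displayed."""
        self._shown_at = self._clock()

    def decide(self, searcher: int, system: str, doc_id: str, relevant: bool) -> Judgment:
        """Call when the searcher submits the relevance decision."""
        elapsed = self._clock() - self._shown_at
        judgment = Judgment(searcher, system, doc_id, relevant, elapsed)
        self.records.append(judgment)
        return judgment

    def total_time(self, system: str) -> float:
        """Total decision time accumulated for one system."""
        return sum(r.seconds for r in self.records if r.system == system)
```

A monotonic clock is used by default because wall-clock time can jump during a session, which would distort short intervals.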

Page 20:

Results and conclusions

Page 21:

F-alpha average

SYSTEM      F-alpha average
Passages    0.45416703125
Patterns    0.43622984375
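The slides do not define F-alpha or state the value of alpha. The sketch below assumes the usual van Rijsbergen formulation, a weighted harmonic mean of precision and recall. The default alpha = 0.8, which weights precision more heavily than recall, is an assumption for illustration only.

```python
def f_alpha(precision: float, recall: float, alpha: float = 0.8) -> float:
    """Weighted harmonic mean: 1 / (alpha / P + (1 - alpha) / R).

    alpha = 0.5 gives the balanced F-measure; values above 0.5
    favor precision. Returns 0.0 when either input is zero.
    """
    if precision <= 0.0 or recall <= 0.0:
        return 0.0
    return 1.0 / (alpha / precision + (1.0 - alpha) / recall)
```

Under this definition, the averages in the table would be means of per-searcher, per-topic scores, although the slides do not say how the averaging was done.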

Page 22:

Time consumed

[Chart: time consumed by each searcher (searchers 1 to 8); vertical axis from 0 to 12000, unit not stated; series: Total, Passages, Patterns]

Page 23:

Conclusions

• Using only the syntactic semantic patterns, it is possible to decide about the relevance of a document in a foreign language (if the searcher has passive abilities in this language)

• In most cases, the time spent on the relevance judgment is lower with the patterns than with the passages

• With the syntactic semantic patterns and/or passages, it is possible to avoid the use of machine translation systems for users with passive abilities in the language of the document.

Page 24:

Comparing syntactic semantic patterns and passages in Interactive Cross Language Information Access (iCLEF at the University of Alicante)

Borja Navarro, Fernando Llopis, Miguel Ángel Varó

{borja, llopis, mvaro}@dlsi.ua.es

Departamento de Lenguajes y Sistemas Informáticos

Universidad de Alicante