Pragmatikako erlaziozko diskurtso-egitura: deskribapena

Page 1: Pragmatikako erlaziozko diskurtso-egitura: deskribapena

Pragmatikako erlaziozko diskurtso-egitura: deskribapena eta bere ebaluazioa hizkuntzalaritza konputazionalean

A description of pragmatics rhetorical structure and its evaluation in computational linguistics

Candidate: Mikel Iruskieta Quintian

Advisors: Arantza Díaz de Ilarraza Sánchez, Mikel Lersundi Ayestaran

University of the Basque Country (UPV/EHU)

February 26, 2014

Outline

1 Introduction
  Aims
  RS-structure in Basque studies and in CLs

2 Methodology
  Preparation phase
  Segmentation
  Central unit
  Rhetorical relations
  Signaling the RRs
  Delivery phase

3 Results
  Segmentation
  Central unit
  Rhetorical relations
  Signaling the RRs

4 Delivery phase

5 Conclusions and future work

Introduction

Discourse structure phenomena in CL

• CL works on discourse structure:
  ▸ Referential: co-reference disambiguation (Mitkov, 2002; Recasens et al., 2010); in Basque (IXA group) (Goenaga et al., 2012; Ceberio et al., 2009)
  ▸ Relational: rhetorical annotation (Asher and Lascarides, 2003; Mann and Thompson, 1988); in Basque (Barrutieta et al., 2002, 2001) and in the IXA group (Iruskieta et al., 2013b, 2011b)

• Can we explain discourse structure with only explicit and semantic relations? Examples from van Dijk (1980b)

(1) Tiketa erosi dut eta nire aulkira joan naiz.
    I bought a ticket and went to my seat. (Macro-structure)

(2) #Peter zinemara joan zen. Berak begi urdinak ditu.
    #Peter went to the cinema. He has blue eyes. (Improvable)

(3) John gaixorik dago. Gripea dauka.
    John is sick. He has the flu. (Semantic)

(4) Johnek ezin du etorri. Gaixorik dago.
    John can't come. He is sick. (Semantic, Pragmatic)


Theories of discourse structures in CL (Stede, 2008)

a. Strong formalization based on syntactic or semantic theories
  ▸ Based on the sentence level, with little analysis of corpora of real texts
    • SDRT (Asher and Lascarides, 2003)
    • D-LTAG (Forbes et al., 2003)
    • LDM (Polanyi, 1988)

b. Real text corpora and analysis of different phenomena
  ▸ Shortcomings in formalization
    • RST (Mann and Thompson, 1987)
    • PDTB (Miltsakaki et al., 2004)


Why an RST TreeBank for Basque?

• General reasons (Taboada and Mann, 2006)
  ▸ Linguistic description
  ▸ Real texts in different languages
    • RST TB, SFU Corpus (Taboada and Renkema, 2011), RST Spanish TB (da Cunha et al., 2011a), Potsdam Corpus (Stede, 2004), TCC (Pardo and Nunes, 2006) and Rhetalho corpus (Pardo and Seno, 2005)
  ▸ Several applications based on RST:
    • automatic text creation (Bouayad-Agha, 2000)
    • automatic text summarization (Marcu, 2000b)
    • machine translation (Ghorbel et al., 2001)
    • assessment of written texts (Burstein et al., 2003)
    • information retrieval (Haouam and Marir, 2003)

• Specific to this thesis:
  ▸ No annotation needed at other linguistic levels
  ▸ Free and available tools for annotation and evaluation
    • (RSTTool, RhetDB, RSTeval)
  ▸ Building (automatically) (Marcu, 2000b) and evaluating RS-trees is easier than graphs

Introduction Aims

Main goals

Three main goals:

i) To describe a rhetorical structure of Basque texts by means of corpus annotation

ii) To establish an annotation method

iii) To validate the annotation method and analyze typical cases of annotators' disagreement


Other aims

• Methodological decisions:
  − to analyze the influence of the macro-structure on the micro-structure
  − to avoid circularity
  − to study a qualitative evaluation
  − to propose some guidelines for the resolution of annotation disagreements

• Gold Standard:
  ▸ in segmentation:
    − for a Basque segmenter
  ▸ in macro-structure:
    − to analyze indicators
  ▸ in rhetorical relations:
    − to signal annotation

• Disseminating the results

Introduction RS-structure in Basque studies and in CLs

Rhetorical structure in Basque

Explicit RRs Implicit RRs

Formal Euskaltzaindia (1990, 1994); Aierbe (2008); Urrutia (2008)

Discourse Esnal (2008); Alberdi and Landa (2013); García (2010); Ibarra (2013); Larringan (1995)

• Little attention to implicit RRs in Basque
  ▸ Implicit RRs are necessary to describe RS and carry out some tasks in CL
  ▸ Most of the RRs are implicit (Taboada, 2006)
    • 66.67% implicit RRs in GMB0401

• In CL it is very important to describe all the RRs so that they can be used in several applications (main goal of the IXA group)


Abstracts of a scientific text [GMB0401]

ORIGINAL

Perfil del usuario de la zona ambulatoria del Servicio de Urgencias del Hospital de Galdakao
The profile of the users from the emergency department from Galdakao´s Hospital

I. Bengoetxea Martínez

Médico de Familia.

RESUMEN

El número de asistencias urgentes crece constantemente, en España el ritmo de crecimiento se ha establecido en torno al 4% anual. Se estima que el 80% de los usuarios acuden por iniciativa propia a los servicios de urgencia y que el 70% de las consultas son consideradas leves por el personal sanitario. Realizar estudios epidemiológicos que describan las características de los usuarios y los motivos de la sobreutilización de los servicios de urgencia hospitalarios pueden resultar interesante desde el punto de vista de la planificación sanitaria. Por lo que hemos creído oportuno realizar un estudio para conocer el perfil del usuario de urgencias del hospital de Galdakao.
Resultados: El perfil del usuario sería el de un varón (51,4%) de mediana edad (43,2 años) que consulta por patología traumática (50,5%) y procede de la comarca sanitaria cercana al hospital.
Palabras clave: Usuarios de urgencias, sobreutilización, perfil de usuario.

SUMMARY

The number of urgent cares grows continuosly, the rate of growth in Spain has been set around the 4% annually. According to the estimates, the 80% of the users, go by their own initiative to the emergency department, and the 70% of the surgeries are considered slights by the health staff. It could be interesting from the sanitary planning poin of view, to carry out epidemiological studies which describe the users characteristics, and the reasons for the overuse of the hospital emergency department. We have seen convenient to archieve a study to know the profile of the users from the emergency department from Galdakao’s Hospital.
Results: The general profile of users would be, man (51.4%) of middle age (43.2%) who consults because of traumatologic phatologies (50.5%) and who comes from the sanitary area near the hospital.
Key words: Emergency department users, overuse, users profile.

LABURPENA

Larrialdi zerbitzuetako asistentzia medikuen kopurua gehituz doa etengabe, estatu españolean igoera hau urteko %4an kokatzen da. Erabiltzaileen %80ak bere kabuz erabakitzen dute larrialdi zerbitzu batetara jotzea eta kontsulta hauen %70a larritasun gutxikotzat jotzen dituzte zerbitzu hauetako medikuek. Zerbitzu hauen perfila azaltzen duten ikerketa epidemiologikoak egitea baliagarria izan daiteke osasun planifikazioaren aldetik, hau dela eta, Galdakaoko ospitaleko larrialdi zerbitzuaren erabiltzaileen perfil deskriptibo bat egitea aproposa iruditu zaigu.
Emaitzak: Erabiltzaileen perfil orokorra ondokoa dela esan daiteke: gizonezkoa (%51,4), heldua (43,2 urteko media) eta patologia traumatologikoagatik kontsultatzen duena (%50,5). Galdakao inguruko herrietatik datorrelarik gehiengoa.
Hitz garrantzitsuak: Larrialdi zerbitzuen erabiltzaileak, gainerabilpena, erabiltzaileen perfila.

Introducción

El número de asistencias urgentes crece constantemente. Se ha estimado que más de la mitad de la población utiliza alguna vez los servicios de urgencia a lo largo de un año (1). En España el ritmo de crecimiento se ha establecido en torno al 4% anual (2). Dicho crecimiento también queda patente en el territorio de la Comunidad Autónoma Vasca.
Los motivos propuestos para explicar este crecimiento constante son: el envejecimiento de la población, la accesibilidad a los servicios de urgencia, la confianza en la atención hospitalaria, la demora de la atención especializada y la cultura de la inmediatez entre otros (3).
Se estima que el 80% de los usuarios acuden por iniciativa propia a los servicios de urgencia y que el 70% de las consultas son consideradas procesos leves por el personal sanitario (4).
Diversos estudios han constatado que ciertos determinantes externos como el nivel socioeconómico, los cambios atmosféricos, las epidemias de gripe, los niveles de contaminación y/o polinización ambiental, los ciclos lunares o los eventos deportivos televisados condicionan una fluctuación de la demanda asistencial (5).
Realizar estudios epidemiológicos que describan las características de los usuarios y los motivos de la sobreutilización de los servicios de urgencia hospitalarios puede resultar interesante desde el punto de vista de la planificación sanitaria. Hasta la fecha no se dispone de estudios similares en nuestro medio laboral, por lo que se ha creído oportuno realizar un estudio que describa las características de los usuarios que acuden a los servicios de urgencia y se etiquetan como " de poca gravedad" por el personal de triaje, ya que son en principio la causa del aumento asistencial anteriormente citado.
El objetivo general es conocer el perfil del usuario de la zona ambulatoria (pacientes etiquetados como "no graves" en el con-

[7] Gac Med Bilbao 2004; 101: 115-120 115

Correspondencia:
Dra. Itsaso Bengoetxea Martínez
Atutxa Saiburua, 2 - 3º
48330 - LEMOA - Bizkaia
Enviado 23/01/2004. Aceptado 8/09/2004


Multilingual texts extraction [GMB0401]


Segmentation of discourse units (EDUs) [GMB0401]

• Adjunct verb clause-based segmentation (Tofiloski et al., 2009)

Rhetorical structure of a text [GMB0401]

• A modular and incremental annotation (Pardo, 2005)

• Is there any correlation between the CU and the RRs?

Methodology Preparation phase

Problems and solutions for RS annotation

• Discourse annotation is complex (Hovy, 2010)
  ▸ Solution in CL: corpus annotation
    • Consistent: enough to support machine learning
    • Descriptive: enough to work with advanced NLP applications


The corpus

• The Basque RST TreeBank (Iruskieta et al., 2013a):
  ▸ Short texts, but with complex RS
  ▸ Abstracts: structured texts (Swales, 1990; Ripple et al., 2011)
  ▸ Different domains
  ▸ Parallel texts (Iruskieta et al., 2014a; da Cunha and Iruskieta, 2010; Iruskieta and da Cunha, 2010b)

Domain | Sub-corpus | Texts | Sentences | Words
Medicine | GMB | 20 | 198 | 3010
Terminology | TERM | 20 | 253 | 5664
Science | ZTF | 20 | 352 | 6892
Total | | 60 | 803 | 15566


Description of the annotators and the super-annotator

• 4 linguists who had experience annotating texts at other language levels (morphological, syntactic and semantic)
  ▸ RST and RSTTool were introduced to 3 linguists
  ▸ No previous training phase and no signal-based manual provided
    • To avoid circularity between RS and signaling
    • Because qualitative description was more important than reliability
    • Triple- (80%) and double-annotated (20%) corpus

• Is there any way to gain reliability if previous training and manuals are avoided?
  ▸ A "super-annotator" (Hovy, 2010)
    • Experienced in RST
    • Criteria to harmonize annotations were established


Different interpretations of GMB0401

Our annotation method

Methodology Segmentation

Segmentation guidelines

• Segmentation guidelines conflate RST and Basque clause combining (Tofiloski et al., 2009; Salaburu, 2012; Artiagoitia et al., 2003)

Clause type | EDU | Example
Perpaus independentea 'an independent sentence' | Yes | [Whipple (EW) gaixotasunak hesteei eragiten die bereziki.]1 GMB0503
Perpaus nagusi koordinatua 'a main clause, part of sentence' | Yes | [pT1 tumoreko 13 kasuetan ez zen gongoila inbasiorik hauteman;]1 [aldiz, pT1 101 tumoretatik 19 kasutan (18.6%) inbasioa hauteman zen, eta pT1c tumoreen artetik 93 kasutan (32.6%).]2
Aditz jokatudun adjuntu perpausa 'finite adjunct clauses' | Yes | [Haien sailkapena egiteko hormona hartzaileen eta c-erb-B2 onkogenearen gabeziaz baliatu gara,]1 [ikerketa anatomopatologikoetan erabili ohi diren zehaztapenak direlako.]2 GMB0702
Aditz jokatugabedun adjuntu perpausa 'non-finite adjunct clauses' | Yes | [Ohiko tratamendu motek porrot eginez gero,]1 [gizentasun erigarriaren kirurgia da epe luzera egin daitekeen tratamendu bakarra.]2 GMB0502
Erlatibo ez-murriztailea 'non-restrictive relative clause' | Yes | [Dublin Hiriko Unibertsitateko atal bat da Fiontar,]1 [zeinak Ekonomia, Informatika eta Enpresa-ikasketetako Lizentziatura ematen baitu, irlanderaren bidez.]2 TERM23
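
The examples in this table mark each EDU with square brackets followed by its index. As a minimal sketch of how that notation could be read programmatically, the snippet below extracts the numbered units from one of the example strings. The function name `parse_edus` and the regular expression are illustrative assumptions; they are not part of RSTTool or of any tool named in the presentation.

```python
import re

# Matches "[text]N": a bracketed unit followed by its EDU index.
# Assumption: brackets are never nested inside an EDU.
EDU_PATTERN = re.compile(r"\[([^\[\]]+)\](\d+)")


def parse_edus(annotated: str) -> list[tuple[int, str]]:
    """Return (index, text) pairs for every bracketed EDU in the string."""
    return [(int(num), text.strip()) for text, num in EDU_PATTERN.findall(annotated)]


# Example taken from the non-finite adjunct clause row (GMB0502).
example = (
    "[Ohiko tratamendu motek porrot eginez gero,]1 "
    "[gizentasun erigarriaren kirurgia da epe luzera egin daitekeen "
    "tratamendu bakarra.]2 GMB0502"
)
edus = parse_edus(example)
for index, text in edus:
    print(index, text)
```

The text identifier that follows the last bracket (here GMB0502) is ignored by the pattern, so only the two discourse units are returned.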

Methodology Central unit

Harmonization of CU and its indicators

• Following van Dijk (1980a), texts ought to be coherent at
  ▸ local level: (between words and) between clauses (or RRs)
  ▸ global level: main topic (CU) with other thematic events (RRs)

• But the coherence of the CU with other units (or RRs) is not considered in RST
  ▸ in the annotation guidelines (Carlson et al., 2001)
  ▸ in the evaluation method (Marcu, 2000a)

• CU annotation guidelines
  i) Topic or thesis statement
  ii) Purpose
  iii) Method
  iv) Results
  v) Conclusions

• Description of some CU "indicators" (Paice, 1980)
  ▸ Verb clustering with the SUMO category from the MCR synset
  ▸ Noun clustering with the WordNet synset

Methodology Rhetorical relations

Our evaluation method: qualitative by Iruskieta et al. (2014a)

• Quantitative RS-tree evaluation method (Marcu, 2000a) by means of EDUs, spans, nuclearity and RRs
  ▸ Shortcomings
    • Evaluated factors (nuclearity and RRs) are not independent (van der Vliet, 2010)
    • RRs are not (well) compared (Iruskieta et al., 2013b)
  ▸ But well formalized (automated by Maziero and Pardo (2009))

• Appropriate qualitative measurement
  ▸ Independent factors
  ▸ Qualitative description of
    • agreement (RCA, RA, RC and R)
    • disagreement (annotators' interpretations and language forms)

• Measurement of RS
  ▸ within the same language: Basque-Basque (Iruskieta et al., 2013a)
  ▸ in parallel texts: Basque-Spanish (da Cunha and Iruskieta, 2010) and Basque-English-Spanish (Iruskieta et al., 2014a)
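
To make the quantitative side concrete, here is a rough sketch of a span-based comparison in the spirit of the method attributed above to Marcu (2000a): two annotations are treated as sets of units and scored with precision, recall and F1. The data, the tuple layout `(first EDU, last EDU, relation)` and the function name are hypothetical; this is not the RSTeval implementation and it leaves out nuclearity.

```python
def precision_recall_f1(reference: set, candidate: set) -> tuple[float, float, float]:
    """Score a candidate annotation against a reference one, both given as sets."""
    matched = len(reference & candidate)
    precision = matched / len(candidate) if candidate else 0.0
    recall = matched / len(reference) if reference else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical annotations of the same text: (first EDU, last EDU, relation).
annotator_a = {(1, 2, "ELABORATION"), (3, 4, "PURPOSE"), (1, 4, "BACKGROUND")}
annotator_b = {(1, 2, "ELABORATION"), (3, 4, "MEANS"), (1, 4, "BACKGROUND")}

# Spans only: the relation label is dropped before comparing.
spans_a = {(start, end) for start, end, _ in annotator_a}
spans_b = {(start, end) for start, end, _ in annotator_b}
span_scores = precision_recall_f1(spans_a, spans_b)

# Spans plus relation label: a mismatch on the label counts as an error.
relation_scores = precision_recall_f1(annotator_a, annotator_b)
print(span_scores, relation_scores)
```

In this toy case the annotators agree on every span but not on every relation. Because a relation can only match where its span matches, the relation score depends on the span score, which is the kind of dependence between evaluated factors that the shortcomings above point to.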


Our evaluation method: decision trees

• Qualitative agreement

• Qualitative disagreement

RR harmonization guidelines: a proposal

Relation by relation | Text by text
Distinguish annotators, search consistency | Top-down revision (CU and nuclearity)
But cannot edit the RRs | First, the RRs linked to CU
Cannot decide nuclearity | Then, incremental and modular

• From confusion matrix (3)
  ▸ Scale of informativeness: ELABORATION 47.21%
    • Scale of informativeness is necessary (Kortmann, 1991)
  ▸ Not all the relations needed (Mol, 2005) and some of them were adapted

Most informative RRs:
  ANTITHESIS
  CONCESSION
  CONTRAST
↑ CONDITION
↑ MEANS ENABLEMENT PURPOSE MOTIVATION
↑ CAUSE RESULT
↑ SEQUENCE SOLUTIONHOOD
↑ JUSTIFY INTERPRETATION EVALUATION EVIDENCE
↑ ELABORATION BACKGROUND RESTATEMENT SUMMARY
↑ LIST DISJUNCTION
↑ CONJUNCTION
Least informative RRs:
  CIRCUMSTANCE
  PREPARATION
  JOINT

Methodology Signaling the RRs

Signaling the RRs

• Portuguese (B): 100 texts, 4100 RRs (Pardo and Nunes, 2004)
• Spanish: 84 texts, 1275 RRs (da Cunha, 2013)
• English: 40 texts, 1304 RRs (Taboada and Das, 2013)
• Basque: 60 texts, 1292 RRs
  ▸ Annotation tool: Rhetorical Data-Base (Pardo, 2005)
    • Relation by relation
    • Searches can be done to maintain consistency
  ▸ What is signaling?
    a) DM annotation
    b) Annotation of the frequent forms (Taboada and Das, 2013)

Types | Examples | Translations
DMs | hala ere, horrenbestez, izan ere | however, thus, in fact
Morphological | -t(z)eko, -lako, -gatik, bait- | "purpose morpheme", "cause morphemes"
Lexical | lortu, helburu, emaitza | to achieve, aim, result
Entity | "izen bereziak" | "proper names"
Semantic | handitu-txikitu, ugari-gutxi, irtenbide-arazo, patologia-gaixotasun, legamia-onddo | to increase-to decrease, abundant-few, solution-problem, pathology-disease, leaven-fungus
Syntactic | "Galde-perpausa eta erantzun perpausa" | "question sentence and reply"
Graphical-numerical | 1. 2., a) b) |
Double signals | -t(z)eko asmoz, era honetan (...) lortu | with the intention of, to achieve (...) in this way

  ▸ If signals can be from any language form, is annotation more reliable?
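
As a toy illustration of how heterogeneous these signal types are, the sketch below looks for two of them in a Basque string: multi-word discourse markers (matched as substrings) and cause morphemes (matched as word endings). The lists contain only forms from the table; the matching strategy and the name `find_signals` are assumptions made for illustration, far cruder than real morphological analysis, and are not how the signals were annotated in the Rhetorical Data-Base.

```python
import re

# Forms copied from the table of signal types.
DISCOURSE_MARKERS = ("hala ere", "horrenbestez", "izan ere")
CAUSE_SUFFIXES = ("lako", "gatik")  # from the morphemes -lako and -gatik


def find_signals(text: str) -> list[tuple[str, str]]:
    """Return (signal type, matched form) pairs found by naive string matching."""
    lowered = text.lower()
    found = [("DM", marker) for marker in DISCOURSE_MARKERS if marker in lowered]
    for token in re.findall(r"\w+", lowered):
        for suffix in CAUSE_SUFFIXES:
            # Require a stem before the suffix so the bare suffix is not a hit.
            if token.endswith(suffix) and len(token) > len(suffix):
                found.append(("Morphological", token))
    return found


# Second EDU of the GMB0702 segmentation example, which ends in "direlako".
signals = find_signals(
    "ikerketa anatomopatologikoetan erabili ohi diren zehaztapenak direlako."
)
print(signals)
```

A lookup like this can only find explicit forms; semantic, syntactic or entity signals would need other kinds of analysis.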

Methodology Delivery phase

Delivery phase (Iruskieta et al., 2013a)

• First rhetorical structure annotated corpus in Basque

• The Basque RST TreeBank's delivery phase (Ide and Pustejovsky, 2010)

• Innovations: a number of operations can be carried out with this annotated corpus

Results Segmentation

Segmentation results

 | Measure | State of the Art | Basque
Manual annotation | Kappa | > 0.8 | 0.6337
Segmenter | F-score | 73% - 85% | 57%

• Discourse parsers: EDUs (F1)
  ▸ Machine learning (French): 73% (Afantenos et al., 2010)
  ▸ DiSeg, rule based (Spanish): 80% (da Cunha et al., 2010a)

• Preliminary results in Basque: end boundaries (F1)
  ▸ Transformed segmenter: 66.94% (Iruskieta et al., 2011a)
  ▸ Constraint Grammar-based rules: 69.69%
  ▸ Syntactic dependency based heuristics: 80.68%
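
The manual annotation row reports agreement as Kappa. The slide does not say which variant was used, so the following is only a sketch assuming Cohen's kappa between two annotators, with segmentation treated as a yes/no boundary decision after each token. The boundary vectors are invented for illustration.

```python
def cohen_kappa(labels_a: list, labels_b: list) -> float:
    """Cohen's kappa for two annotators who labelled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item list")
    n = len(labels_a)
    # Observed agreement: share of items given the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's own label distribution.
    categories = set(labels_a) | set(labels_b)
    expected = sum(
        (labels_a.count(c) / n) * (labels_b.count(c) / n) for c in categories
    )
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical decisions: 1 = EDU boundary after this token, 0 = no boundary.
annotator_a = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]
annotator_b = [1, 0, 0, 1, 0, 0, 0, 1, 1, 0]
kappa = cohen_kappa(annotator_a, annotator_b)
print(round(kappa, 4))
```

Here the annotators agree on 8 of 10 decisions, but chance agreement is 0.52 because most tokens are not boundaries, so kappa is noticeably lower than the raw agreement.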


Granularity and RR agreement

• Less agreement at the intra-sentential level than at the sentential level (−13.74%), but more agreement in RRs (+14.19%) and more robust (RCA +9.5%) (Iruskieta et al., 2011b)
  ▸ Parallelism: syntax-discourse (Marcu and Echihabi, 2002)
  ▸ Some RRs can be derived from syntax (Soricut and Marcu, 2003)
  ▸ Simpler constituents (C) and fewer attachment points (A)
  ▸ Parsers are more reliable (Pardo and Nunes, 2008; Soricut and Marcu, 2003)

Results Central unit

CU annotation results (Iruskieta et al., 2014b)

 | Texts | Annotators | Measure | Results
Burstein et al. (2001) | 100 | 2 professionals | F-score | 71%
Basque | 60 | 4 non-professionals | F-score | 61%

• CU annotation by 2 non-professionals:
  ▸ Extracted from RS-tree: 65% (GMB)
  ▸ First CU: 85% (TERM and ZTF)

• When the CU is the same, greater agreement in RRs (+5.04%, T-test: 0.013)

• When an RR is linked to the CU, greater agreement (+17.29%, T-test: 0.001)


CU and RRs: the IMRaD structure (Swales, 1990)

• Within the RRs linked to the CU, those with an IMRaD structure appear most frequently (except ELABORATION)

GMB TERM ZTF Corpus
RRs SN NS SN NS SN NS SN NS
PREPARATION 22 24 22 68
ELABORATION 6 15 28 49
BACKGROUND 13 15 16 44
MEANS 1 14 5 6 1 25
PURPOSE 2 1 6 9 3 15
RESULT 10 2 12
SUMMARY 4 3 7
CIRCUMSTANCE 2 3 1 6
INTERPRETATION 5 5
CAUSE 2 1 1 4
JUSTIFY 1 2 3
CONCESSION 1 2 1 2
SOLUTIONHOOD 3 3
Total 39 44 45 39 39 48 123 131


Verb and noun indicators of the CU and their strength

• Different verb groups and indicator strength in each domain

• Some nouns' synsets are good indicators

SUMO | GMB % | TERM % | ZTF %
Reasoning IPP | 46.15 | 22.73 | 8.70
Comparing IPP | 26.92
Communication SI | 3.85 | 45.45 | 4.35
Predicate | 3.85 | 4.55 | 34.78

Verbs strength
Lemma | MCR | SUMO | GMB % | TERM % | ZTF %
aztertu | analyze1 | Reasoning IPP | 58.82 | 25.00 | 2.78
aurkeztu | present2 | Communication SI | 50.00 | 25.00
izan, ukan | | Predicate | 0.47 | 4.29 | 2.95

Nouns strength
Noun (indicator counts by domain, as extracted) | Indicator total | Noun total | Strength % | WN 3.0
ikerkuntza3 (1, 3); ikerketa2 (1, 6); azterlan3 (6, 1); ikerlan3 (1) | 19 | 28 | 67.86 | research2
lan3 (2, 4, 7) | 13 | 45 | 28.89 | work2
xede1 (2); helburu2 (2, 8) | 12 | 39 | 30.77 | goal1
komunikazio (10) | 10 | 15 | 66.67 | paper5
bide2 (2, 6, 1) | 9 | 15 | 60.00 | means1
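
The strength percentages in the noun table are consistent with a simple ratio: the first count in each group divided by the second, times 100 (for instance 19 / 28 = 67.86%). The sketch below reproduces that arithmetic. Reading the first count as "occurrences acting as a CU indicator" and the second as "all occurrences of the noun group" is an assumption drawn from the column labels; the function and variable names are illustrative.

```python
def indicator_strength(as_indicator: int, total: int) -> float:
    """Share (in %) of a noun group's occurrences that act as a CU indicator."""
    if total <= 0:
        raise ValueError("total occurrences must be positive")
    return round(100 * as_indicator / total, 2)


# (indicator count, total count) pairs as they appear in the noun table,
# keyed by the WordNet 3.0 synset shown for each group.
counts = {
    "research2": (19, 28),
    "work2": (13, 45),
    "goal1": (12, 39),
    "paper5": (10, 15),
    "means1": (9, 15),
}
strengths = {synset: indicator_strength(*pair) for synset, pair in counts.items()}

# Keep the groups at or above a 50% strength threshold.
good_indicators = sorted(s for s, value in strengths.items() if value >= 50.0)
print(strengths)
print(good_indicators)
```

With a 50% cut-off, research2, paper5 and means1 are kept, while work2 and goal1 fall below it.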


A list of indicators for Basque (verbs and nouns)

• New indicators from our corpus in gray for an automatic detection of the CU in Basque
  ▸ New indicators (Paice, 1980) in gray
  ▸ Good indicators in blue (≥50.00%)
  ▸ But analysis of other categories is needed to detect the CU

Verbs
BSQ | ENG MCR
aztertu | examine1
analizatu | examine1
oinarritu | base1
baloratu | value2
azaldu | recount1
aurkeztu | present2
aipatu | present2
berri eman | present2
jardun | present2
plazaratu | present2
izan / ukan |
erabili | use1
ikertu | investigate1

Nouns
BSQ | ENG MCR
abiapuntu1 | starting_point1
arlo1 | subject_field1
artikulu7 | article1
asmo2 | purpose1
bide2 | means1
gai6 | topic1
ikerkuntza3, ikerketa2, azterlan3, ikerlan3 | research2
arazo3 | problem2
irtenbide2 | resolution4
komunikazio | paper5
hitzaldi2 | speech1
lan3 | work2
lan-ildo | −−
lerro11, ikerketa-lerro | line8
proiektu2, ikerketa-proiektu | project2
talde1, ikerketa-talde | group1
xede1, helburu2 | goal1

Results Rhetorical relations

RR annotation results

N | RCA | RC | RA | R | RR agreement
81.73% | 47.76% | 6.27% | 3.41% | 4.03% | 61.47%

RR disagreement: 38.53%
No-Match | Nuclearity | N/N-N/S | Attachment | Constituent
0.23% | 6.73% | 8.90% | 0.08% | 0.15%
Relation | R-Similar | R-MissMatch | R-Specificity | Segmentation
13.62% | 5.88% | 2.01% | 0.93% | 0.15%

• The Basque RST TreeBank (Iruskieta et al., 2013a)
  ▸ 0.568 κ or 61.47% F1 (2 annotators, 60 texts: 1470 EDUs)

• The Dutch TreeBank (van der Vliet et al., 2011)
  ▸ 0.57 κ (2 annotators, 4 texts)

• The RST TreeBank (Carlson et al., 2001)
  ▸ from 0.5973 to 0.7921 κ (2 annotators, 30 texts: 1918 EDUs)
  ▸ from 0.6017 κ to 0.7555 κ (3 trained professionals, 4/5 texts, 515/343 EDUs)

• The Spanish RST TreeBank (da Cunha et al., 2010b)
  ▸ 77.64% F1 (2 trained annotators: 84 texts, 694 EDUs)


RR confusion matrix

Columns: a b c d e f g h i j k l ll m n ñ o p q r s t u v w x y z, then the row total (empty cells were lost in extraction, so values appear in row order only)

ENABLEMENT a 1 2 3
ANTITHESIS b 1 1 1 1 1 5
SOLUTIONHOOD c 1 3 9 13
CONDITION d 14 2 1 3 3 23
JOINT e
RESTATEMENT f 4 2 1 1 8
DISJUNCTION g 1 1 2
EVALUATION h 1 3 1 2 1 8
EVIDENCE i 3 3 1 1 1 1 10
ELABORATION j 8 1 162 2 5 1 6 18 2 14 13 2 15 4 49 302
UNCONDITIONAL k 1 1
NO-EDU l 1 1 2
PURPOSE ll 10 88 1 1 1 1 2 1 1 2 108
INTERPRETATION m 4 9 1 10 1 25
JUSTIFY n 1 2 11 1 1 2 18
CAUSE ñ 1 4 24 8 37
CONJUNCTION o 2 3 27 1 14 5 3 1 1 57
CONTRAST p 2 5 1 12 5 5 2 1 2 35
CONCESSION q 3 1 1 3 2 26 1 1 38
SUMMARY r 3 1 1 5
LIST s 2 12 1 15 2 1 125 1 2 2 3 166
MEANS t 17 1 3 1 63 2 5 92
MOTIVATION u 1 1 1 3
RESULT v 1 12 3 1 1 1 1 39 1 60
PREPARATION w 12 1 79 15 107
SEQUENCE x 1 2 1 4 9 3 16 1 37
BACKGROUND y 1 4 2 2 5 1 1 54 1 71
CIRCUMSTANCE z 1 2 3 4 1 4 41 56
Total 4 15 17 4 1 2 6 267 91 30 3 52 74 19 32 6 171 95 4 99 80 27 145 48 1292

• To go back to RR harmonization (4)


Reliability of RRs, agreement: Fleiss (1971) Kappa

RRs               Kappa    p-value

PURPOSE           0.872    <0.001
PREPARATION       0.836    <0.001
CIRCUMSTANCE      0.772    <0.001
CONCESSION        0.743    <0.001
CONDITION         0.733    <0.001
LIST              0.710    <0.001
DISJUNCTION       0.666    <0.001
RESTATEMENT       0.665    <0.001
MEANS             0.633    <0.001

SEQUENCE          0.556    <0.001
CAUSE             0.527    <0.001
RESULT            0.458    <0.001
ELABORATION       0.448    <0.001
BACKGROUND        0.448    <0.001
CONTRAST          0.416    <0.001
CONJUNCTION       0.404    <0.001

EVIDENCE          0.371    <0.001
INTERPRETATION    0.313    <0.001
ANTITHESIS        0.220    <0.001
EVALUATION        0.178    <0.001
SUMMARY           0.178    <0.001

JUSTIFY          -0.008    0.760
JOINT            -0.007    0.803
SOLUTIONHOOD     -0.005    0.857
MOTIVATION       -0.003    0.923
ENABLEMENT       -0.001    0.967
UNCONDITIONAL     0.001    0.989

• Strong agreement (above average) in 9 RRs

• Weak agreement (below average) in 7 RRs

• Bad agreement in 5 RRs (marked in red on the slide)

• Not enough data for 6 RRs
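A per-relation kappa such as the one tabulated above can be obtained from the category-wise formula in Fleiss (1971): κ_j = 1 − Σ_i n_ij (n − n_ij) / (N n (n − 1) p_j q_j), where n_ij is the number of raters who chose relation j for item i, n the number of raters, N the number of items, p_j the overall share of relation j and q_j = 1 − p_j. The sketch below is only an illustration of that formula on invented toy ratings. The function name and data layout are assumptions, not the evaluation script used for the corpus.

```python
def fleiss_kappa_for_category(ratings, category):
    """Category-wise Fleiss kappa.

    ratings: list of items, each a list with one label per rater.
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(item) != n_raters for item in ratings):
        raise ValueError("every item needs the same number (>= 2) of ratings")
    # n_ij: how many raters assigned `category` to item i.
    counts = [item.count(category) for item in ratings]
    p_j = sum(counts) / (n_items * n_raters)
    q_j = 1.0 - p_j
    if p_j in (0.0, 1.0):
        return float("nan")  # kappa is undefined without variation
    disagreement = sum(c * (n_raters - c) for c in counts)
    return 1.0 - disagreement / (n_items * n_raters * (n_raters - 1) * p_j * q_j)


# Toy ratings by two annotators (hypothetical, not corpus data).
toy_ratings = [
    ["PURPOSE", "PURPOSE"],
    ["PURPOSE", "MEANS"],
    ["LIST", "LIST"],
    ["LIST", "LIST"],
]
kappa_purpose = fleiss_kappa_for_category(toy_ratings, "PURPOSE")
kappa_list = fleiss_kappa_for_category(toy_ratings, "LIST")
```

A relation that is almost never chosen makes p_j q_j tiny, which is one plausible reason why the rarely used relations in the table end up near zero with non-significant p-values.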

47 / 75

Page 50: Pragmatikako erlaziozko diskurtso-egitura: deskribapena

Results Rhetorical relations

Relevant RR disagreement: confusion matrix

RRs (confused pair)                 #     Subtotal

ELABORATION / BACKGROUND            50
MEANS / ELABORATION                 30
LIST / CONJUNCTION                  29
ELABORATION / RESULT                27
ELABORATION / LIST                  26
ELABORATION / CONJUNCTION           21    183

INTERPRETATION / RESULT             13
PREPARATION / ELABORATION           12
PURPOSE / ELABORATION               12
JUSTIFY / CAUSE                     11
SEQUENCE / LIST                     11
MEANS / BACKGROUND                  10    69

SOLUTIONHOOD / BACKGROUND           9
ELABORATION / INTERPRETATION        9
ELABORATION / JOINT                 8
CONJUNCTION / RESULT                8
CAUSE / RESULT                      7
CONTRAST / CONCESSION               7
CONTRAST / LIST                     7
CONTRAST / ELABORATION              5     60

Total                                     312

• One of the two RRs is the most widely used RR: 47.21% (ELABORATION-X)

• Similar RRs: 4.1% (LIST-CONJUNCTION, JUSTIFY-CAUSE, INTERPRETATION-RESULT)
  ▸ Different nuclearity: 0.54% (CAUSE-RESULT)

• Not used by one of the annotators: 0.7% (SOLUTIONHOOD-BACKGROUND)

48 / 75


Page 52: Pragmatikako erlaziozko diskurtso-egitura: deskribapena

Results Signaling the RRs

CAUSE subgroup signaling agreement

• If signals can be any kind of linguistic form, is the annotation more reliable?

Annotators    CAUSE%    RESULT%    PURPOSE%

A1-A2         71.43     59.70      90.00
A1-A4         67.86     50.75      80.91
A2-A4         73.21     37.31      78.18

A1-A2-A4      58.93     37.31      75.45

  ▸ Annotating signals is more ambiguous than annotating DMs
    • DMs' disagreement: 15.27%
    • Other signals' disagreement: 68.13%
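The pairwise and three-way percentages in the table can be read as the share of relation instances for which the annotators marked the same signal. The slides do not define the measure, so this reading is an assumption. A minimal sketch with invented annotations:

```python
from itertools import combinations


def percent_agreement(annotations, annotators):
    """Percentage of items on which all listed annotators chose the same signal."""
    rows = list(zip(*(annotations[name] for name in annotators)))
    same = sum(len(set(row)) == 1 for row in rows)
    return 100.0 * same / len(rows)


# Hypothetical signal choices for four CAUSE relations (not corpus data).
signals = {
    "A1": ["bait-", "-gatik", "izan ere", "-nez"],
    "A2": ["bait-", "-gatik", "arrazoia", "-nez"],
    "A4": ["bait-", "eta", "izan ere", "-nez"],
}

pairwise = {
    pair: percent_agreement(signals, pair)
    for pair in combinations(sorted(signals), 2)
}
all_three = percent_agreement(signals, ("A1", "A2", "A4"))
```

Three-way agreement can never exceed the lowest pairwise value, which matches the pattern in the A1-A2-A4 row of the table.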

50 / 75

Page 53: Pragmatikako erlaziozko diskurtso-egitura: deskribapena

Results Signaling the RRs

CAUSE subgroup signals (≥ 3 occurrences)

• CAUSE signals:
  ▸ bait-, -gatik, -nez, -ela eta (CAUSE morphemes)
  ▸ -ren ondorioz 'as a consequence of', arrazoia 'reason'
  ▸ izan ere 'in fact'

• RESULT signals:
  ▸ ondorioz 'consequence' and emaitza 'result'
  ▸ lortu 'to achieve', ekarri 'to bring'
  ▸ eta 'and', beraz 'thus'
  ▸ -tuz '-ing'

• PURPOSE signals:
  ▸ -t(z)eko, -t(z)eko asmoz 'with the intention of', -t(z)eko helburuarekin 'with the aim of'
  ▸ helburu 'aim', asmo 'intention'
  ▸ "Subjuntiboko aditzak" (subjunctive verbs)

51 / 75

Page 54: Pragmatikako erlaziozko diskurtso-egitura: deskribapena

Results Signaling the RRs

Results of the RRs and their signals

Rhetorical relations | RRs | Signal | % | DU1 | DU2 | DU1/2 | N | S | S/N

Empty cells are omitted, so the position counts after the % column are listed in column order.

Presentational (pragmatic)
PREPARATION 110 2 1.82 2 2
BACKGROUND 75 16 21.33 12 4 4 12
ENABLEMENT 6 6 100.00 6 1 5
MOTIVATION 5 5 100.00 3 2 3 2
EVIDENCE 11 7 63.64 1 6 1 6
JUSTIFY 14 13 92.86 1 11 1 12 1
ANTITHESIS 5 4 80.00 1 1 2 2 2
CONCESSION 40 39 97.50 11 26 2 7 30 2
RESTATEMENT 10 7 70.00 7 7
SUMMARY 10 5 50.00 5 5

Subject-matter (semantic)
ELABORATION 286 84 29.37 82 2 82 2
MEANS 93 81 87.10 19 62 81
CIRCUMSTANCE 57 53 92.98 44 9 1 52
SOLUTIONHOOD 10 9 90.00 3 3 3 3 3 3
CONDITION 20 19 95.00 12 5 2 17 2
UNCONDITIONAL 1 1 100.00 1 1
INTERPRETATION 28 22 78.57 3 17 2 20 2
EVALUATION 11 10 90.91 10 10
CAUSE 56 53 94.64 23 21 9 3 41 9
RESULT 67 57 85.07 1 55 1 2 54 1
PURPOSE 110 109 99.09 40 68 1 3 105 1

Multinuclear
LIST 166 87 52.41 3 53 31
SEQUENCE 32 21 65.63 2 15 4
CONJUNCTION 50 38 76.00 37 1
CONTRAST 40 33 82.50 2 23 8
DISJUNCTION 2 2 100.00 2

Total 1315 783 59.54 180 532 71 25 550 27

52 / 75

Page 55: Pragmatikako erlaziozko diskurtso-egitura: deskribapena

Results Signaling the RRs

RRs and signals: interpretation of the results

• Signaling tendencies:
  ▸ Low signaling (≤ 25%):
    • PREPARATION, BACKGROUND
  ▸ Middle signaling (> 25% and < 75%):
    • EVIDENCE, RESTATEMENT, SUMMARY, ELABORATION, LIST, SEQUENCE
  ▸ High signaling (≥ 75%):
    • ENABLEMENT, MOTIVATION, JUSTIFY, ANTITHESIS, CONCESSION, MEANS, CIRCUMSTANCE, CONDITION, SOLUTIONHOOD, UNCONDITIONAL, INTERPRETATION, EVALUATION, CAUSE, RESULT, PURPOSE, CONTRAST, CONJUNCTION, DISJUNCTION

• The 4 most annotated RRs (48.44% of all RRs) are signaled in only 29.20% of cases
  ▸ ELABORATION, LIST, PREPARATION, BACKGROUND
    • General RRs (not very informative)

• The other 22 RRs are signaled with a high frequency: 86.28%
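The banding into low, middle and high signaling follows directly from the per-relation counts. A minimal sketch, using a handful of (RRs, signaled) pairs copied from the results table. The function name and the exact handling of the band boundaries are assumptions for illustration:

```python
def signaling_band(total, signaled, low=25.0, high=75.0):
    """Return (signaling rate in %, band name) for one rhetorical relation."""
    rate = 100.0 * signaled / total
    if rate <= low:
        band = "low"
    elif rate >= high:
        band = "high"
    else:
        band = "middle"
    return round(rate, 2), band


# (annotated RRs, signaled RRs) taken from the results table.
counts = {
    "PREPARATION": (110, 2),
    "BACKGROUND": (75, 16),
    "ELABORATION": (286, 84),
    "LIST": (166, 87),
    "PURPOSE": (110, 109),
    "CONJUNCTION": (50, 38),
}
bands = {relation: signaling_band(*pair) for relation, pair in counts.items()}
```

With these thresholds PREPARATION and BACKGROUND fall in the low band, ELABORATION and LIST in the middle band, and PURPOSE and CONJUNCTION in the high band, as on the slide.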

53 / 75

Page 56: Pragmatikako erlaziozko diskurtso-egitura: deskribapena

Results Signaling the RRs

Signaling and RR ambiguity (≥3 occurrences)

Ambiguous signals

Signal          Translation        #
eta             and                34
-nez            given              15
-tuz            -ing               11
baina           but                11
bait-           because            10
ba-             if                 10
bestalde        moreover           9
era berean      likewise           8
izan ere        in fact            8
gainera         furthermore        6
berriz          whereas            5
alde batetik    on the one hand    5
-ta             -ed                5

Non-ambiguous signals and RRs

Signal                                   Translation                    #     RR
-tzeko                                   purpose morpheme               27    PURPOSE
erabiliz                                 used                           8     MEANS
-tzean                                   -ing                           8     CIRCUMSTANCE
helburu                                  purpose                        8     PURPOSE
adibidez                                 for example                    6     ELABORATION
ondoren                                  then                           6     SEQUENCE
hala ere                                 however                        6     CONCESSION
-ela eta                                 cause morpheme                 5     CAUSE
arazo                                    problem                        4     SOLUTIONHOOD
izan arren                               despite                        4     CONCESSION
-tu ondoren                              then                           4     CIRCUMSTANCE
-nean                                    when                           4     CIRCUMSTANCE
nahiz eta                                although                       3     CONCESSION
lortutako emaitzek baieztatzen dute      the results obtained confirm   3     INTERPRETATION
hau da                                   that is to say                 3     RESTATEMENT
1.                                       1.                             3     LIST

• Detection of some RRs based on non-ambiguous signals
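Detection from non-ambiguous signals can be pictured as a plain lookup. The sketch below covers only a few free-standing word signals from the table above. Morphemes such as -tzeko or -nean would need morphological analysis and are left out. The function, the toy sentences and the matching strategy are illustrative assumptions, not the detection method used in the thesis.

```python
import re

# Non-ambiguous multi-word or free-standing signals and their relation,
# copied from the table; ambiguous signals such as "eta" are not listed.
UNAMBIGUOUS_SIGNALS = {
    "erabiliz": "MEANS",
    "adibidez": "ELABORATION",
    "hala ere": "CONCESSION",
    "izan arren": "CONCESSION",
    "nahiz eta": "CONCESSION",
    "hau da": "RESTATEMENT",
}


def guess_relation(edu_text):
    """Return the relation of the first non-ambiguous signal found, else None."""
    # Lower-case, drop punctuation and pad with spaces for whole-word matching.
    normalised = " " + " ".join(re.findall(r"\w+", edu_text.lower())) + " "
    for signal, relation in UNAMBIGUOUS_SIGNALS.items():
        if f" {signal} " in normalised:
            return relation
    return None


concession = guess_relation("Hala ere, emaitzak onak dira.")
restatement = guess_relation("Estomatitisa, hau da, ahoko patologia bat")
undecided = guess_relation("Emaitzak onak dira eta luzeak.")
```

Returning None for "eta" reflects the table: an ambiguous signal alone does not decide the relation.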

54 / 75


Page 58: Pragmatikako erlaziozko diskurtso-egitura: deskribapena

Delivery phase

Importance of the delivery phase

• The delivery phase is of paramount importance (Hovy, 2010), since it opens the way to further studies
  ▸ But it is often forgotten
  ▸ Not so in the RST Spanish Treebank (da Cunha et al., 2011b)

• Extract RRs from the corpus (to analyze RR patterns)

• Is there room for improvement?
  1. The SEARCH section, based on word-form, lemma and POS features

# | Doc. | EDU Id | Word | CU | EDU

1 | TERM50 | sent2 | taldeek / helburua | BAI | [. . .] Hitzaldi honek azken hiru urteotan lau unibertsitate hauen taldeek egindako ikerkuntzaren ondorioetako batzuk azaltzeko helburua izango luke.
  |        |       | groups / aim | YES | [. . .] The aim of this talk is to present some of the results of the research carried out by groups from these four universities over the last three years.

2 | ZTF13 | sent1 | taldearen / helburu | BAI | [. . .] Gure ikerkuntza taldearen helburu nagusia, [. . .]
  |       |       | group's / aim | YES | [. . .] Our research group's principal aim, [. . .]

3 | ZTF13 | sent17 | taldearen / helburu | EZ | Alor honetan, gure ikerkuntza taldearen helburu nagusiak bi dira.
  |       |        | group's / aim | NO | In this field, our research group has two main aims.

1 | ZTF15 | sent7 | helburu / talde | EZ | [. . .] bestelako galdera zailagoei ere erantzutea dute helburu, hala nola, espezieen biogeografia, taldearen filogenia, eta abar.
  |       |       | aim / group | NO | [. . .] the aim is to answer other such difficult questions, such as species biogeography, group phylogeny, etc.
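A search over word-form, lemma and POS features, optionally restricted to central units, can be sketched as a filter over annotated EDUs. Everything in this block is hypothetical: the `Token` and `Edu` classes, the `search` function and the lemma and POS values are assumptions made for illustration and do not describe the actual online interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    form: str
    lemma: str
    pos: str


@dataclass(frozen=True)
class Edu:
    doc: str
    edu_id: str
    is_central_unit: bool
    tokens: tuple


def search(edus, lemmas, central_only=False):
    """Return the EDUs that contain every requested lemma."""
    wanted = set(lemmas)
    hits = []
    for edu in edus:
        found = {token.lemma for token in edu.tokens}
        if wanted <= found and (edu.is_central_unit or not central_only):
            hits.append(edu)
    return hits


# Toy corpus modelled on the result rows above (assumed lemmas and POS tags).
corpus = [
    Edu("TERM50", "sent2", True,
        (Token("taldeek", "talde", "NOUN"), Token("helburua", "helburu", "NOUN"))),
    Edu("ZTF13", "sent17", False,
        (Token("taldearen", "talde", "NOUN"), Token("helburu", "helburu", "NOUN"))),
    Edu("ZTF15", "sent7", False,
        (Token("helburu", "helburu", "NOUN"), Token("taldearen", "talde", "NOUN"))),
]

all_hits = search(corpus, ["talde", "helburu"])
central_hits = search(corpus, ["talde", "helburu"], central_only=True)
```

Searching by lemma rather than by word form is what lets one query match taldeek and taldearen alike, which matters for a highly inflected language such as Basque.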

56 / 75


Page 60: Pragmatikako erlaziozko diskurtso-egitura: deskribapena

Delivery phase

EDUs and CUs in RS-trees: SEGMENTS section

• Extra advanced functionalities:
  2. CU and RRs linked to CU
  3. Annotator's info

EDU | Segment: GMB0301-GS.rs3 (7) | Annotator | CU

1 | Estomatitis Aftosa Recurrente (I): Epidemiologia, etiopatogenia eta aspektu klinikopatologikoak. | GS |
  | Recurrent aphthous stomatitis (I): epidemiologic, etiologic and clinical features.

2 | "Estomatitis aftosa recurrente" deritzon patologia, ahoan agertzen den ugarienetako bat da. | GS |
  | "Recurrent aphthous stomatitis" is one of the most frequent oral pathologies.

3 | tamainu, kokapena eta iraunkortasuna aldakorra izanik. | GS |
  | having a variable size, location and duration.

4 | Honen etiologia eztabaidagarria da. | GS |
  | It has a controversial etiology.

5 | Ultzera mingarri batzu bezela agertzen da, | GS |
  | It is characterized by the apparition of painful ulcers,

6 | Hauek periodiki beragertzen dira. | GS |
  | These ulcers appear recurrently.

7 | Lan honetan patologia arrunt honetan ezaugarri epidemiologiko, etiopatogeniko eta klinikopatologiko garrantsitsuenak analizatzen ditugu. | GS | See
  | In this paper we analyze the most important epidemiological, etiological, pathological and clinical features of this common oral pathology.

57 / 75

Page 61: Pragmatikako erlaziozko diskurtso-egitura: deskribapena

Delivery phase

RELATIONS section

• Extra advanced functionalities:
  4. Specific RRs and the search of their signals

Relation: Kausa 'Cause' (27)

Left span | NS | Right span | Relation | Ref.

Left span:  Aurreko hamarkadetan, serbierako zientzia-arloko ikertzaile askok joera bat nabaritu dute eta horren berri eman dute: ingeleseko unita[. . .]
NS:         <-
Right span: Izan ere, iritzi ezberdinetako zientzialari serbiarrek adostasuna lortu dute eta aurreko hamarkadetan ingelesari eman diote [. . .]
Relation:   Cause
Ref.:       TERM18
Left span (translation):  In recent decades, many Serbian researchers working in different scientific fields have noticed a tendency and this is outlined here: the English unit [. . .]
Right span (translation): Indeed, Serbian scientists from different schools of thought have reached a consensus and have given English [. . .]

Left span:  Terminologiak berak ere, uztartu egin behar ditu joera orokor horiek, eransten zaizkien beste batzuekin batera, hala nola: teknologien [. . .]
NS:         <-
Right span: gizartearekin lotuta dagoen jarduera denez,
Relation:   Cause
Ref.:       TERM19
Left span (translation):  Terminology itself must seek to unite these general trends, along with others related to them, for example: technology
Right span (translation): since it is an activity linked to society,

58 / 75

Page 62: Pragmatikako erlaziozko diskurtso-egitura: deskribapena

Delivery phase

SIGNALS section

• Extra advanced functionalities:
  5. Search for the RRs in which a specific signal occurs

Signal: baina 'but'

Left span:  Gainerakoan, prokasu adierazle egokiak daude,
Relation:   Kontzesioa
Right span: baina altan dagoen gaixoaren ahalmen funtzionalaren erregistro urria antzematen da,
Ref.:       GMB0504
Left span (translation):  With respect to the other aspects, the indicators of process are good
Relation (translation):   Concession
Right span (translation): but there is poor recording of the patient's functional capacity on discharge,

Left span:  Bestalde, Euskaltzaindiak hitz elkartuen bidea (1995eko urtarrilaren 27an onartutako araua) proposatzen du adjektibo erreferentzialak itzultzeko,
Relation:   Kontrastea
Right span: baina arauan bertan esaten denez, ". . . ahal den guztian. . . ",
Ref.:       TERM22
Left span (translation):  Euskaltzaindia proposed a mechanism of compound words (in a standard approved on January 27th 1995) for the translation of referential adjectives.
Relation (translation):   Contrast
Right span (translation): However the academy also confirmed,. . . "whenever possible",

59 / 75


Page 64: Pragmatikako erlaziozko diskurtso-egitura: deskribapena

Conclusions and future work

Goal 1: describe the RS of a Basque corpus

• The Basque RST TreeBank (1,315 RRs, 783 signals)

• Adjunct verb clause-based segmentation (81.14% F1 in pairs)
  ▸ A prototype of intra-sentential discourse segmentation (57.81% F1)

• CU decision tree (61.42% F1 in threes) and indicators (verbs and nouns) for CU detection

• RR (61.81% F1 in pairs) and signal (76.82% F1 in pairs) description
  ▸ 22 RRs (51.16%) signaled with a high frequency (86.28%)
  ▸ Signals in DU2 (67.94%) and in the satellite unit (91.36%)
  ▸ IMRaD structure was observed in the frequency of RRs linked to the CU
  ▸ Consistent harmonization of the cause subgroup for its detection

61 / 75

Page 65: Pragmatikako erlaziozko diskurtso-egitura: deskribapena

Conclusions and future work

Goal 2: to establish an annotation method

• A new annotation phase (global coherence before local)

• A new method to gain reliability while avoiding circularity (harmonizing RRs)

• A qualitative evaluation method for RS-trees

• A robust and innovative delivery phase for different theoretical studies (consistency, patterns, ambiguity)

62 / 75

Page 66: Pragmatikako erlaziozko diskurtso-egitura: deskribapena

Conclusions and future work

Goal 3: validate annotation method and analyze disagreement cases

• Incremental annotation: intra-sentential segmentation was 13.74% lower than sentential, but 14.19% higher for RRs.

• Macro-structure (CU) before micro-structure (RRs)
  ▸ Higher CU agreement (from 10% to 30%), even though the probability was smaller
  ▸ Higher RR agreement when the same CU is annotated (6.17%, t-test: p < 0.013)
  ▸ Higher RR agreement when RRs are linked to the CU (11.52%, t-test: p < 0.001)

• Signaling RRs to avoid implicit RRs is more ambiguous than marking them with DMs
  ▸ Signal evaluation therefore needs more attention

• Relevant disagreements in the RR confusion matrix:
  ▸ Most widely used RR (ELABORATION) in 47.21%
  ▸ Not well understood RRs: EVIDENCE, INTERPRETATION, ANTITHESIS, EVALUATION and SUMMARY
  ▸ RR not used: SOLUTIONHOOD

63 / 75

Page 67: Pragmatikako erlaziozko diskurtso-egitura: deskribapena

Conclusions and future work

Future work

• To measure the adequacy of the segmentation criteria and of the RR harmonization criteria

• To extend the corpus to other genres and domains
  ▸ The Reference Corpus for the Processing of Basque (EPEC) (Aduriz et al., 2006), manually annotated at different language levels
  ▸ From the abstracts to their full articles: summarization (da Cunha, 2008)

• To apply the results in advanced applications
  ▸ Discourse segmenter
  ▸ Detection of CU via indicators: summarization
  ▸ IMRaD structure detection: assessment of written abstracts
  ▸ Qualitative evaluation of RS-trees
  ▸ Detection of the cause subgroup: discourse structure analysis
  ▸ Re-annotate the signals of some RRs with more annotators, to gain reliability and detect RRs

64 / 75

Page 68: Pragmatikako erlaziozko diskurtso-egitura: deskribapena

Conclusions and future work

Publications

Papers | Topic

Iruskieta (2012) | Explanation of RST
Iruskieta et al. (2011a) | Automatic segmentation
Iruskieta et al. (2014b) | Central unit
Iruskieta et al. (2013b) | The drawbacks of quantitative evaluation
Iruskieta et al. (2011b) | Relation and segmentation levels
da Cunha and Iruskieta (2010) | Qualitative evaluation of relations
Iruskieta et al. (2014a) | Qualitative evaluation of relations
Iruskieta et al. (2009) | DM for signals
Iruskieta and da Cunha (2010b) | DM for signals (Spanish and Basque)
Iruskieta and da Cunha (2010a) | Using DMs and RRs to discriminate domains (medicine and terminology)
Iruskieta et al. (2008) | Study of DM and its ambiguity
Garcia and Iruskieta (2013) | DMs of reformulation
Iruskieta et al. (2013a) | The RST Basque TreeBank

Annotated corpus: http://ixa2.si.ehu.es/diskurtsoa/

Thesis-report in Basque: http://ixa2.si.ehu.es/~jibquirm/tesia/tesi_txostena.pdf

Abbreviated translation of the thesis-report in English: http://ixa2.si.ehu.es/~jibquirm/tesia/tesi_txostena_itzulita.pdf

65 / 75


Page 70: Pragmatikako erlaziozko diskurtso-egitura: deskribapena

Conclusions and future work

References I

Aduriz, I., Aranzabe, M. J., Arriola, J., Atutxa, A., Diaz de Ilarraza, A., Ezeiza, N., Gojenola, K., Oronoz, M., Soroa, A., and Urizar, R. (2006). Methodology and steps towards the construction of EPEC, a corpus of written Basque tagged at morphological and syntactic levels for automatic processing. Language and Computers, 56(1):1-15.

Afantenos, S. D., Denis, P., Muller, P., and Danlos, L. (2010). Learning recursive segments for discourse parsing. In Seventh conference on International Language Resources and Evaluation, pages 3578-3584, Paris, France.

Aierbe, A. (2008). Birformulazio-estrategiak eta komunikagarritasuna administrazioko testuetan. Technical report.

Alberdi, X. and Landa, J. (2013). EUDIMA corpusetik adibideak erauzteko lan-tresna: diskurtso unitate fraseologikoak aztertzeko lanabesa. ASJU, 45(2).

Artiagoitia, X., Oyharçabal, B., Hualde, J. I., and de Urbina, J. O. (2003). Subordination, pages 632-844. A grammar of Basque. Mouton de Gruyter, Berlin-New York.

Asher, N. and Lascarides, A. (2003). Logics of conversation. Cambridge University Press, Cambridge.

Barrutieta, G., Abaitua, J., and Díaz, J. (2001). Grossgrained RST through XML metadata for multilingual document generation. In MT Summit VIII, pages 39-42, Santiago de Compostela, Spain.

Barrutieta, G., Abaitua, J., and Díaz, J. (2002). An XML/RST-based approach to multilingual document generation for the web. Procesamiento del lenguaje natural, 29:247-253.

Bouayad-Agha, N. (2000). Using an abstract rhetorical representation to generate a variety of pragmatically congruent texts. In 38th Annual Meeting ACL, volume 38, pages 16-22, Hong Kong.

Burstein, J. C., Marcu, D., Andreyev, S., and Chodorow, M. S. (2001). Towards automatic classification of discourse elements in essays. In Proceedings of the 39th annual Meeting on Association for Computational Linguistics, pages 98-105. Association for Computational Linguistics.

Burstein, J. C., Marcu, D., and Knight, K. (2003). Finding the write stuff: Automatic identification of discourse structure in student essays. IEEE Intelligent Systems, 18(1):32-39.

67 / 75

Page 71: Pragmatikako erlaziozko diskurtso-egitura: deskribapena

Conclusions and future work

References II

Carlson, L., Marcu, D., and Okurowski, M. E. (2001). Building a discourse-tagged corpus in the framework of rhetorical structure theory. In 2nd SIGDIAL Workshop on Discourse and Dialogue, Eurospeech 2001, page 10, Aalborg, Denmark. Association for Computational Linguistics.

Ceberio, K., Aduriz, I., Diaz de Ilarraza, A., and García, I. (2009). Empirical study of the relevance of semantic information for anaphora resolution: the case of adverbial anaphora. In 7th Discourse Anaphora and Anaphor Resolution Colloquium (DAARC09), pages 56-63, Goa, India.

da Cunha, I. (2008). Hacia un modelo lingüístico de resumen automático de artículos médicos en español. PhD thesis, IULA, Universitat Pompeu Fabra.

da Cunha, I. (2013). A symbolic corpus-based approach to detect and solve the ambiguity of discourse markers. In 14th International Conference on Intelligent Text Processing and Computational Linguistics, Samos, Greece.

da Cunha, I. and Iruskieta, M. (2010). Comparing rhetorical structures in different languages: The influence of translation strategies. Discourse Studies, 12(5):563-598.

da Cunha, I., SanJuan, E., Torres-Moreno, J.-M., Lloberes, M., and Castellón, I. (2010a). Discourse segmentation for Spanish based on shallow parsing. In 9th Mexican international conference on Advances in artificial intelligence: Part I, pages 13-23, Pachuca, Mexico. Springer-Verlag.

da Cunha, I., SanJuan, E., Torres-Moreno, J.-M., Lloberes, M., and Castellón, I. (2010b). Diseg: Un segmentador discursivo automatico para el español. Procesamiento de Lenguaje Natural, 45.

da Cunha, I., Torres-Moreno, J.-M., and Sierra, G. (2011a). On the Development of the RST Spanish Treebank. In 5th Linguistic Annotation Workshop (LAW V '11), pages 1-10, Portland, USA. Association for Computational Linguistics.

da Cunha, I., Torres-Moreno, J.-M., Sierra, G., Cabrera-Diego, L.-A., and Castro-Rolón, B.-G. (2011b). The RST Spanish Treebank On-line Interface. In International Conference Recent Advances in NLP, Bulgaria.

Esnal, P. (2008). Testu-antolatzaileen erabilera estrategikoa, volume 51. Euskaltzaindia, Bilbo.

Euskaltzaindia (1990). Euskal gramatika. Lehen urratsak III (Lokailuak). Euskaltzaindia, Bilbo.

68 / 75

Page 72: Pragmatikako erlaziozko diskurtso-egitura: deskribapena

Conclusions and future work

References III

Euskaltzaindia (1994). Euskal gramatika: lehen urratsak (EGLU) VI (juntagailuak). Euskaltzaindia, Bilbo.

Fleiss, J. L. (1971). Measuring nominal scale agreement among many raters. Psychological Bulletin, 76(5):378-382.

Forbes, K., Miltsakaki, E., Prasad, R., Sarkar, A., Joshi, A., and Webber, B. L. (2003). D-LTAG system: Discourse parsing with a lexicalized tree-adjoining grammar. Journal of Logic, Language and Information, 12(3):261-279.

Garcia, J. and Iruskieta, M. (2013). Birformulatzaile zuzentzaileak testu idatzietan. Eridenen du zerzaz kontenta. Sailkideen omenaldia Henrike Knörr irakasleari (1947-2008). EHU, Bilbo.

García, I. M. (2010). Estrategias textuales y discursivas en el aprendizaje de la exposición oral de dos materias distintas, pages 155-162. Modos y formas de la comunicación humana. AESLA, Vigo.

Ghorbel, H., Ballim, A., and Coray, G. (2001). Rosetta: Rhetorical and semantic environment for text alignment. In Corpus Linguistics, pages 224-233, Lancaster University (UK).

Goenaga, I., Arregi, O., Ceberio, K., Diaz de Ilarraza, A., and Jimeno, A. (2012). Automatic Coreference Annotation in Basque. In Eleventh International Workshop on Treebanks and Linguistic Theories, Portugal.

Haouam, K. and Marir, F. (2003). SEMIR: Semantic indexing and retrieving web document using Rhetorical Structure Theory. In 4th International Conference on Intelligent Data Engineering and Automated Learning (IDEAL), pages 596-604, Hong Kong.

Hovy, E. (2010). Annotation: A tutorial. In 48th Annual Meeting of the Association for Computational Linguistics, Uppsala, Sweden.

Ibarra, O. (2013). Sobre estrategias discursivas del lenguaje de los jóvenes vascoparlantes: aspectos pragmáticos y discursivos (conectores, marcadores). ASJU, pages 395-411.

Ide, N. and Pustejovsky, J. (2010). What Does Interoperability Mean, Anyway? Toward an Operational Definition of Interoperability for Language Technology. In 2nd Int. Conf. Global Interoperability Lang. Res, Hong Kong.

69 / 75

Page 73: Pragmatikako erlaziozko diskurtso-egitura: deskribapena

Conclusions and future work

References IV

Iruskieta, M. (2012). Pragmatika. http://www.ehu.es/seg/hizk/1/6.

Iruskieta, M., Aranzabe, M. J., Diaz de Ilarraza, A., Gonzalez, I., Lersundi, M., and de la Calle, O. L. (2013a). The RST Basque TreeBank: an online search interface to check rhetorical relations. In 4th Workshop "RST and Discourse Studies", Brasil.

Iruskieta, M. and da Cunha, I. (2010a). El potencial de las relaciones retóricas para la discriminación de textos especializados de diferentes dominios en euskera y español. Calidoscópio, 8(3):181-202.

Iruskieta, M. and da Cunha, I. (2010b). Marcadores y relaciones discursivas en el ámbito médico: un estudio en español y euskera. In XXVIII Congreso Internacional AESLA: Analizar datos > Describir variación, pages 13-159, Vigo. Servicio de Publicaciones.

Iruskieta, M., da Cunha, I., and Taboada, M. (2014a). A qualitative comparison method for rhetorical structures: Identifying different discourse structures in multilingual corpora. Submitted to Language Resources and Evaluation.

Iruskieta, M., Diaz de Ilarraza, A., and Lersundi, M. (2008). Análisis de los marcadores del discurso para el euskera: denominación, clases, relaciones semánticas y tipos de ambigüedad. In Proceedings of 26th AESLA International Conference, pages 1271-1282.

Iruskieta, M., Diaz de Ilarraza, A., and Lersundi, M. (2009). Correlaciones en euskera entre las relaciones retóricas y los marcadores del discurso. In Proceedings of 27th AESLA International Conference, pages 963-971.

Iruskieta, M., Diaz de Ilarraza, A., and Lersundi, M. (2011a). Bases para la implementación de un segmentador discursivo para el euskera. In 8th Brazilian Symposium in Information and Human Language Technology (STIL 2011).

Iruskieta, M., Diaz de Ilarraza, A., and Lersundi, M. (2011b). Unidad discursiva y relaciones retóricas: un estudio acerca de las unidades de discurso en el etiquetado de un corpus en euskera. Procesamiento del Lenguaje Natural, 47:144.

Iruskieta, M., Diaz de Ilarraza, A., and Lersundi, M. (2013b). Establishing criteria for RST-based discourse segmentation and annotation for texts in Basque. Corpus Linguistics and Linguistic Theory, 0(0):1-32.

70 / 75

Page 74: Pragmatikako erlaziozko diskurtso-egitura: deskribapena

Conclusions and future work

References V

Iruskieta, M., Diaz de Ilarraza, A., and Lersundi, M. (2014b). Detecting the central unit in rhetorical structure trees: A key step in annotating rhetorical relations. Submitted to Coling.

Kortmann, B. (1991). Free adjuncts and absolutes in English: Problems of control and interpretation. Psychology Press, New York.

Larringan, L. (1995). Testu-antolatzaileak bi testu motatan: testu informatiboa eta argudiapenezkoa. PhD thesis, Euskal Herriko Unibertsitatea, Gasteiz.

Mann, W. C. and Thompson, S. A. (1987). Rhetorical Structure Theory: A Theory of Text Organization. Text, 8(3):243-281.

Mann, W. C. and Thompson, S. A. (1988). Rhetorical Structure Theory: Toward a functional theory of text organization. Text-Interdisciplinary Journal for the Study of Discourse, 8(3):243-281.

Marcu, D. (2000a). The rhetorical parsing of unrestricted texts: A surface-based approach. Computational Linguistics, 26(3):395-448.

Marcu, D. (2000b). The theory and practice of discourse parsing and summarization. The MIT Press, Cambridge.

Marcu, D. and Echihabi, A. (2002). An unsupervised approach to recognizing discourse relations. In Proceedings of the 40th Annual Meeting on Association for Computational Linguistics, pages 368-375. Association for Computational Linguistics.

Maziero, E. G. and Pardo, T. A. S. (2009). Metodologia de avaliação automática de estruturas retóricas. In 7th Brazilian Symposium in Information and Human Language Technology (STIL 2009).

Miltsakaki, E., Prasad, R., Joshi, A., and Webber, B. L. (2004). Annotating discourse connectives and their arguments. In HLT/NAACL Workshop on Frontiers in Corpus Annotation, pages 9-16, Boston, USA.

Mitkov, R. (2002). Anaphora resolution, volume 134. Longman, London.

Mol, S. (2005). Causality in a cross-linguistic perspective: So, therefore, and thus versus så, derfor, and således. Technical Report 27.

71 / 75

Page 75: Pragmatikako erlaziozko diskurtso-egitura: deskribapena

Conclusions and future work

References VI

Paice, C. D. (1980). The automatic generation of literature abstracts: an approach based on the identification of self-indicating phrases. In 3rd annual ACM conference on Research and development in information retrieval, pages 172-191, Cambridge. Butterworth and Co.

Pardo, T. A. S. (2005). Métodos para análise discursiva automática. Master's thesis.

Pardo, T. A. S. and Nunes, M. G. V. (2004). Relações retóricas e seus marcadores superficiais: Análise de um corpus de textos científicos em português do Brasil [Rhetorical relations and its surface markers: an analysis of scientific texts corpus in Portuguese of Brazil]. Technical Report NILC-TR-04-03.

Pardo, T. A. S. and Nunes, M. G. V. (2006). Review and Evaluation of DiZer - An Automatic Discourse Analyzer for Brazilian Portuguese. In International Workshop on Computational Procesing of Written and Spoken Portuguese, pages 180-189. Springer.

Pardo, T. A. S. and Nunes, M. G. V. (2008). On the development and evaluation of a Brazilian Portuguese discourse parser. Revista de Informática Teórica e Aplicada, 15(2):43-64.

Pardo, T. A. S. and Seno, E. R. M. (2005). Rhetalho: um corpus de referência anotado retoricamente. Anais do V Encontro de Corpora, pages 24-25.

Polanyi, L. (1988). A formal model of the structure of discourse. Journal of Pragmatics, 12(5-6):601-638.

Recasens, M., Màrquez, L., Sapena, E., Martí, M. A., Taulé, M., Hoste, V., Poesio, M., and Versley, Y. (2010). Semeval-2010 task 1: Coreference resolution in multiple languages. In 5th International Workshop on Semantic Evaluation, pages 1-8, Sweden. Association for Computational Linguistics.

Ripple, A. M., Mork, J. G., Knecht, L. S., and Humphreys, B. L. (2011). A retrospective cohort study of structured abstracts in MEDLINE, 1992-2006. Journal of the Medical Library Association: JMLA, 99(2):160.

Salaburu, P. (2012). Menderakuntza eta menderagailuak (Sareko Euskal Gramatika: SEG). http://www.ehu.es/seg/morf/5/2/2/2.

72 / 75

Page 76: Pragmatikako erlaziozko diskurtso-egitura: deskribapena

Conclusions and future work

References VII

Soricut, R. and Marcu, D. (2003). Sentence level discourse parsing using syntactic and lexical information. In 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology, volume 1, pages 149-156. Association for Computational Linguistics.

Stede, M. (2004). The Potsdam Commentary Corpus. In 2004 ACL Workshop on Discourse Annotation, pages 96-102, Barcelona, Spain. Association for Computational Linguistics.

Stede, M. (2008). Disambiguating rhetorical structure. Research on Language and Computation, 6(3):311-332.

Swales, J. M. (1990). Genre analysis: English in academic and research settings. Cambridge University Press, Cambridge, UK.

Taboada, M. (2006). Discourse markers as signals (or not) of rhetorical relations. Journal of Pragmatics, 38(4):567-592.

Taboada, M. and Das, D. (2013). Annotation upon annotation: Adding signalling information to a corpus of discourse relations. Dialogue and Discourse, 4(2):249-281.

Taboada, M. and Mann, W. C. (2006). Rhetorical Structure Theory: looking back and moving ahead. Discourse Studies, 8(3):423-459.

Taboada, M. and Renkema, J. (2011). Discourse relations reference corpus. http://www.sfu.ca/rst/06tools/discourse_relations_corpus.html.

Tofiloski, M., Brooke, J., and Taboada, M. (2009). A syntactic and lexical-based discourse segmenter. In 47th Annual Meeting of the Association for Computational Linguistics, pages 77-80, Suntec, Singapore. ACL.

Urrutia, A. (2008). Legeen eta administrazioaren hizkera, testu-antolatzaileen ikuspegitik. Euskera: Euskaltzaindiaren lan eta agiriak, 53(2):525-546.

van der Vliet, N. (2010). Inter annotator agreement in discourse analysis. http://www.let.rug.nl/~nerbonne/teach/rema-stats-meth-seminar/.

73 / 75

Page 77: Pragmatikako erlaziozko diskurtso-egitura: deskribapena

Conclusions and future work

References VIII

van der Vliet, N., Berzlánovich, I., Bouma, G., Egg, M., and Redeker, G. (2011). Building a discourse-annotated Dutch text corpus. Bochumer Linguistische Arbeitsberichte, 3:157-171.

van Dijk, T. A. (1980a). Macrostructures: An interdisciplinary study of global structures in discourse, interaction, and cognition. L. Erlbaum Associates, Hillsdale, NJ.

van Dijk, T. A. (1980b). The semantics and pragmatics of functional coherence in discourse. Speech act theory: Ten years later, Versus, 26(27):49-65.

74 / 75
