Methods and Techniques for the Analysis of Parliamentary Records: Two Case Studies on Italian Simonetta Montemagni Istituto di Linguistica Computazionale “A. Zampolli” ILC-CNR (Pisa, Italy)


Page 1

Methods and Techniques for the

Analysis of Parliamentary Records:

Two Case Studies on Italian

Simonetta Montemagni

Istituto di Linguistica Computazionale “A. Zampolli”

ILC-CNR (Pisa, Italy)

Page 2

Natural Language Processing and Knowledge extraction

Domain Knowledge Extraction
◦ Extraction of Named Entities
◦ Extraction of semantic relations
◦ Extraction of domain-relevant entities
◦ Extraction of temporal expressions
◦ Graph-based Knowledge Representation

Linguistic Knowledge Extraction
◦ Linguistic profiling of texts
◦ Textual genre assessment
◦ Readability level assessment
◦ Native Language Identification
◦ Monitoring of variation across language varieties

Page 3

T2K system

[Diagram: T2K system architecture]
Phases: Linguistic pre-processing, Knowledge extraction
Tool groups: Linguistic Analysis Tools, Information Extraction Tools, Knowledge Graph Tools
Components: Named Entity tagger, Domain-specific Entities extractor, Relation extractor, Semantic annotator, Indexer, Graph creator, Graph Visualizer
Other labels: Annotated corpus, Linguistic Profiling, Semantic annotation, Index of Content, Knowledge graph

Text-to-Knowledge (T2K)

T2K combines a battery of tools for Natural Language Processing (NLP), statistical text analysis and machine learning, which are dynamically integrated to provide an accurate representation of the domain-specific context of text corpora in different domains.

Page 4

Case Study 1 – Knowledge Extraction and Semantic Indexing

Focus on different types of parliamentary data:
◦ legislative texts (draft bills)
◦ parliamentary reports (work just starting)

The challenges
◦ The peculiarity of legal language and its impact on NLP tools
   Legal syntax is “convoluted and unnatural” (McCarty, NaLEA 2009) compared with ordinary language
   State-of-the-art NLP tools perform much worse on legislative texts
   Need for Domain Adaptation of NLP tools
◦ “Twofold” terminology
   Domain-specific terms of law or parliamentary procedure are intermixed with terms from the regulated or discussed domain (world knowledge)
   e.g. autorità competente (“competent authority”, a legal term) vs sostanza pericolosa (“dangerous substance”, a term from the regulated domain)

Page 5

Methods and Techniques for the

Analysis of Parliamentary Records:

Case Study 2 –

Stylistic analysis of Italian Political

Speeches

Page 6

A Readability Analysis of Campaign Speeches from the 2016 US Presidential Campaign

Elliot Schumacher, Maxine Eskenazi

Page 7

From linguistic annotation …

Morpho-syntactic annotation (PoS tagger developed by Dell’Orletta, 2009)
◦ EVALITA 2009: accuracy = 96.34% (without reference lexicon)
◦ State-of-the-art for Italian

Dependency syntactic annotation (DeSR parser, Attardi & Dell’Orletta, 2009)
◦ CoNLL-2007: 81.3% LAS (labelled attachment score)
◦ EVALITA 2009: 83.38% LAS
◦ State-of-the-art for Italian
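
The LAS figures above count the tokens whose predicted head and dependency label both match the gold annotation. A minimal sketch of that calculation follows, assuming each token is represented as a (head index, label) pair; the function name, the toy sentence and its labels are hypothetical, and shared-task scorers apply their own conventions (for example for punctuation) that this sketch ignores.

```python
from typing import List, Tuple

# Each token is (head_index, dependency_label); head 0 is the artificial root.
Arc = Tuple[int, str]


def attachment_scores(gold: List[Arc], predicted: List[Arc]) -> Tuple[float, float]:
    """Return (UAS, LAS) as percentages for two aligned token sequences."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must cover the same tokens")
    if not gold:
        raise ValueError("cannot score an empty sequence")
    # Unlabelled score: only the head has to match.
    head_hits = sum(g[0] == p[0] for g, p in zip(gold, predicted))
    # Labelled score: head and label both have to match.
    labelled_hits = sum(g == p for g, p in zip(gold, predicted))
    total = len(gold)
    return 100.0 * head_hits / total, 100.0 * labelled_hits / total


# Toy sentence (hypothetical annotation): "Il governo approva la legge"
gold = [(2, "det"), (3, "subj"), (0, "root"), (5, "det"), (3, "obj")]
pred = [(2, "det"), (3, "subj"), (0, "root"), (5, "det"), (3, "subj")]
uas, las = attachment_scores(gold, pred)
print(f"UAS = {uas:.1f}%, LAS = {las:.1f}%")
```

In the toy sentence every head is correct but one label is wrong, so the unlabelled score is higher than the labelled one.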

Page 8

… to linguistic profiling and readability assessment

Automatically parsed corpus
◦ Automatic extraction of linguistic features (linguistic profiling)
◦ Automatic readability assessment and detection of complex text passages

READ-IT
http://www.italianlp.it/demo/read-it/
NLP-based automatic readability assessment software for the Italian language (Dell’Orletta et al. 2011)
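
As an illustration of what extracting linguistic features from a parsed corpus can look like, the sketch below computes two simple profile features: average sentence length and average dependency tree depth. It assumes each sentence is given as a list of 1-based head indices (0 marks the root); the function names and the choice of features are hypothetical and are not READ-IT's actual feature set.

```python
from typing import Dict, List


def tree_depth(heads: List[int]) -> int:
    """Depth of a dependency tree given 1-based head indices (0 = root)."""
    def steps_to_root(token: int) -> int:
        steps = 0
        while token != 0:
            token = heads[token - 1]
            steps += 1
            if steps > len(heads):
                raise ValueError("cycle in head indices")
        return steps

    return max((steps_to_root(i) for i in range(1, len(heads) + 1)), default=0)


def profile(sentences: List[List[int]]) -> Dict[str, float]:
    """Average sentence length (in tokens) and average parse tree depth."""
    if not sentences:
        raise ValueError("need at least one sentence")
    n = len(sentences)
    return {
        "avg_sentence_length": sum(len(s) for s in sentences) / n,
        "avg_tree_depth": sum(tree_depth(s) for s in sentences) / n,
    }


# Two toy parses (hypothetical): a five-token and a two-token sentence.
parsed = [[2, 3, 0, 5, 3], [2, 0]]
result = profile(parsed)
print(result)
```

Features of this kind can be computed for a whole text or for single sentences, which is one way a tool could point to the more complex passages.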

Page 9

READ-IT results

Page 10

Italian political speeches: yesterday

De Gasperi (1953), Berlinguer (1977), Craxi (1993)

Page 11

Italian political speeches: yesterday

De Gasperi (1953), Berlinguer (1977), Craxi (1993)

A selection of grammatical features

Page 12

Italian political speeches: yesterday

De Gasperi (1953), Berlinguer (1977), Craxi (1993)

A selection of grammatical features

Usage of verbal morpho-syntactic features (number and person)
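
One way to measure the usage of verbal person and number is to count the person/number combinations of finite verbs in a morpho-syntactically annotated text. The sketch below assumes tokens are dictionaries with `pos`, `person` and `number` keys; these key names, the tag values and the toy tokens are hypothetical and do not reflect the tagset of the tools used in the slides.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def verb_person_number_distribution(
    tokens: Iterable[Dict[str, str]],
) -> Dict[Tuple[str, str], float]:
    """Percentage of each (person, number) pair among verbs that carry both."""
    counts = Counter(
        (tok["person"], tok["number"])
        for tok in tokens
        if tok.get("pos") == "VERB" and "person" in tok and "number" in tok
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {key: 100.0 * n / total for key, n in counts.items()}


# Toy annotated tokens (hypothetical annotation scheme).
tokens = [
    {"form": "vogliamo", "pos": "VERB", "person": "1", "number": "plur"},
    {"form": "dobbiamo", "pos": "VERB", "person": "1", "number": "plur"},
    {"form": "credo", "pos": "VERB", "person": "1", "number": "sing"},
    {"form": "Italia", "pos": "NOUN"},
    {"form": "cambiare", "pos": "VERB"},  # infinitive: no person or number
]
dist = verb_person_number_distribution(tokens)
for (person, number), share in sorted(dist.items()):
    print(f"person {person}, {number}: {share:.1f}%")
```

Comparing such distributions across speakers shows, for instance, how often each one speaks in the first person plural rather than the first person singular.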

Page 13

Italian political speeches: yesterday

De Gasperi (1953), Berlinguer (1977), Craxi (1993)

Usage of verbal morpho-syntactic features (number and person)

Speaker       % of word types in the Basic Italian Vocabulary (De Mauro)   Type/Token ratio
De Gasperi    73.88                                                        0.60
Berlinguer    77.60                                                        0.68
Craxi         61.93                                                        0.72
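
The two lexical measures in the table can be sketched as follows. This is a minimal illustration, assuming both are computed over lower-cased word forms; the slides do not say whether forms or lemmas were used, and the tiny vocabulary set below is a hypothetical stand-in for De Mauro's Basic Italian Vocabulary.

```python
from typing import Iterable, Set


def type_token_ratio(tokens: Iterable[str]) -> float:
    """Number of distinct word types divided by the number of tokens."""
    lowered = [t.lower() for t in tokens]
    if not lowered:
        raise ValueError("need at least one token")
    return len(set(lowered)) / len(lowered)


def basic_vocabulary_coverage(tokens: Iterable[str], basic_vocabulary: Set[str]) -> float:
    """Percentage of word types that belong to the given basic vocabulary."""
    types = {t.lower() for t in tokens}
    if not types:
        raise ValueError("need at least one token")
    return 100.0 * len(types & basic_vocabulary) / len(types)


# Toy input; the vocabulary is a hypothetical sample, not De Mauro's list.
speech = ["il", "governo", "e", "il", "parlamento", "del", "paese"]
basic = {"il", "e", "del", "governo", "paese"}
ttr = type_token_ratio(speech)
coverage = basic_vocabulary_coverage(speech, basic)
print(f"TTR = {ttr:.2f}, basic vocabulary coverage = {coverage:.2f}%")
```

A higher type/token ratio indicates a more varied vocabulary, while a higher coverage value indicates that more of the distinct words are everyday ones.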

Page 14

Italian political speeches: today

Renzi: speech at the Leopolda (28 October 2013)
Berlusconi: Piazza del Popolo (23 March 2013)
Grillo: Porta a Porta (21 May 2014)

Page 15

What if the audience changes?

Speech to the Senate (24 February 2014)
E-news: Jobs Act (January 2014)
Speech at the Leopolda (28 October 2013)

Page 16

What if the audience changes?

Figure labels: high readability (alta leggibilità), low readability (bassa leggibilità)

Speech to the Senate (24 February 2014)
E-news: Jobs Act (January 2014)
Speech at the Leopolda (28 October 2013)

Page 17

What if the audience changes?

A selection of grammatical features

Page 18

What if the audience changes?

Usage of verbal morpho-syntactic features (number and person)

Page 19

What if the audience changes?

Usage of verbal morpho-syntactic features (number and person)

Text              % of word types in the Basic Italian Vocabulary (De Mauro)   Type/Token ratio
Renzi-Leopolda    73.37                                                        0.63
Renzi-Jobs Act    65.78                                                        0.73
Renzi-Senate      76.01                                                        0.63

Page 20

Within the same text

Slogans

Rhetoric

Page 21

Conclusion

Promising results

To be refined through
◦ Domain Adaptation of NLP tools

To be extended in different directions
◦ By widening
   the set of monitored linguistic features
   the dimensions of variation (genre, political parties, etc.)
◦ By taking into account the characterizing concepts and their evolution across speakers, political parties, genres and time
◦ By dealing with other languages besides Italian

Page 22

The ItaliaNLP Lab

People:

Dominique Brunato

Andrea Cimino

Felice Dell’Orletta

Simonetta Montemagni

Giulia Venturi
