Fahima Bouzit & Mohamed Tayeb Laskri, Rencontres sur la Recherche en Informatique, June 12-14, 2011


PLAN

Introduction

Analysis Levels

Machine Translation

Proposed Approach:

Fillmore Theory

Conceptual Dependency

Semantic Traits of Chafe

Frame Based Representation

Conclusion & Perspectives


Introduction

Language

Natural Language Processing (NLP)


Introduction

Linguistic Approaches

Probabilistic (Statistical) Approaches

NLP Schools


Analysis Levels


Morphology

Syntax

Semantics

Pragmatics


Machine Translation


Translation

Source language

Target language

Machine


The challenge in machine translation is how to program a computer that will "understand" a text as a person does, and that will "create" a new text in the target language that "sounds" as if it had been written by a person. This problem may be approached in a number of ways.


Translation

Decoding the meaning of the source text

Re-encoding this meaning in the target language


Basic Model of a Machine Translation System

وجدت الطفلة الصغيرة الصفحة المطلوبة

1. a trouvé / la fille / petit / la page / demandé

2. la petit fille a trouvé la page demandé

3. la petite fille a trouvé la page demandée
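The word-for-word, reordering and agreement stages shown on this slide can be sketched as a toy program. This is only an illustration under stated assumptions: the lexicon, the `PRENOMINAL` and `FEMININE` tables and every function name are hypothetical, they cover this single sentence, and nothing here is the authors' implementation.

```python
# Toy lexicon (assumption): Arabic word -> (part of speech, French gloss, gender)
LEXICON = {
    "وجدت": ("verb", "a trouvé", None),
    "الطفلة": ("noun", "la fille", "f"),
    "الصغيرة": ("adj", "petit", None),
    "الصفحة": ("noun", "la page", "f"),
    "المطلوبة": ("adj", "demandé", None),
}
PRENOMINAL = {"petit"}  # adjectives placed before the noun (assumption)
FEMININE = {"petit": "petite", "demandé": "demandée"}

SENTENCE = "وجدت الطفلة الصغيرة الصفحة المطلوبة"


def word_for_word(arabic):
    """Stage 1: look up every word, keeping the Arabic order."""
    return [LEXICON[word] for word in arabic.split()]


def noun_phrases(entries):
    """Attach each adjective to the noun that precedes it."""
    phrases = []
    for pos, french, gender in entries:
        if pos == "noun":
            phrases.append({"noun": french, "gender": gender, "adjs": []})
        elif pos == "adj" and phrases:
            phrases[-1]["adjs"].append(french)
    return phrases


def render(phrase, agree):
    """Stages 2 and 3 inside a noun phrase: place adjectives, then agree."""
    article, noun = phrase["noun"].split(" ", 1)
    feminine = agree and phrase["gender"] == "f"
    pairs = [(a, FEMININE.get(a, a) if feminine else a) for a in phrase["adjs"]]
    before = [form for base, form in pairs if base in PRENOMINAL]
    after = [form for base, form in pairs if base not in PRENOMINAL]
    return " ".join([article, *before, noun, *after])


def translate(arabic, agree=True):
    """Reorder verb-subject-object into subject-verb-object."""
    entries = word_for_word(arabic)
    verb = next(french for pos, french, _ in entries if pos == "verb")
    subject, obj = noun_phrases(entries)
    return " ".join([render(subject, agree), verb, render(obj, agree)])
```

Calling `translate(SENTENCE, agree=False)` reproduces the second line of the slide, and `translate(SENTENCE)` reproduces the third.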


Proposed Architecture

Arabic sentence -> Analysis -> Frame in Arabic -> Translation -> Frame in French -> Construction -> French sentence
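A minimal sketch of this architecture, assuming a frame is simply a verb plus a mapping from case roles to fillers. The `Frame` class, the `AR_FR` dictionary and the three function names are hypothetical, and the analysis step is deliberately naive (it assumes verb-subject-object order with a one-word verb and subject); the system described in the slides is certainly richer.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    """A verb (the kernel) and its case roles (the peripherals)."""
    verb: str
    roles: dict = field(default_factory=dict)


# Hypothetical bilingual dictionary covering one example sentence.
AR_FR = {
    "نسخ": "a copié",
    "المهندس": "l'ingénieur",
    "قاعدة المعلومات": "la base de données",
}

ARABIC = "نسخ المهندس قاعدة المعلومات"


def analyse(sentence: str) -> Frame:
    """Arabic sentence -> Arabic frame (toy rule: verb, agent, then object)."""
    verb, agent, *rest = sentence.split()
    return Frame(verb, {"AGENT": agent, "OBJECT": " ".join(rest)})


def translate_frame(frame: Frame) -> Frame:
    """Arabic frame -> French frame, slot by slot."""
    roles = {role: AR_FR[filler] for role, filler in frame.roles.items()}
    return Frame(AR_FR[frame.verb], roles)


def construct(frame: Frame) -> str:
    """French frame -> French sentence in subject-verb-object order."""
    text = f"{frame.roles['AGENT']} {frame.verb} {frame.roles['OBJECT']}"
    return text[0].upper() + text[1:]


french = construct(translate_frame(analyse(ARABIC)))
```

The point of the design is that translation operates on frames rather than on word strings, so word order is decided only in the construction step.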


Proposed Approach

Fillmore theory

Conceptual Dependency (Schank)

Nouns Classification (Chafe)

Frame based representation (Minsky)


Fillmore theory


The sentence:

Verb = kernel

Other components of the sentence = peripherals

Typological nature of verbs


The case AGENT: syntactic case = subject.

The case OBJECT: syntactic case = object complement; or syntactic case = subject and verb mode = passive.

The case INSTRUMENT: grammatical case = dative; preposition = بـ, باستعمال, بواسطة.


The case SOURCE: grammatical case = dative; preposition = مِنْ. Or a place noun playing the role of direct object complement of certain known verbs, such as ترك and غادر, e.g. الموقع in غادر الطفل الموقع.


DESTINATION: grammatical case = dative; preposition = نحو, إلى, لـ, باتجاه, صوب. Or a place noun playing the role of direct object complement of certain known verbs, such as قصد, e.g. الموقع in قصد المسافر الموقع.


FURNISHER: syntactic case = indirect object complement; animation = animated; kind of verb = transfer verb; particle = مِنْ or مِنْ عند, e.g. الأستاذ in استلم الطفل رسالة مِنْ الأستاذ.


BENEFICIARY: syntactic case = direct object complement; animation = animated; kind of verb = transfer verb, such as استلم, سلّم, أرسل, حصل, تسلّم, أعطى; particle = إلى or لـ, e.g. الأستاذ in أرسل الطفل الرسالة الإلكترونية إلى الأستاذ.

Or: syntactic case = indirect object complement; animation = animated; kind of verb = transfer verb, e.g. طارق in أعطى الأب هدية لطارق.
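The case rules listed on these slides could be encoded as a small rule-based function. The sketch below is an assumption-laden illustration, not the authors' code: the `Constituent` fields, the `assign_case` signature and the order in which rules are tried are hypothetical, and the particles are written without diacritics for simplicity.

```python
from dataclasses import dataclass
from typing import Optional

# Particle and verb lists copied from the slides (diacritics dropped).
INSTRUMENT_PREPS = {"ب", "باستعمال", "بواسطة"}
SOURCE_PREPS = {"من"}
DESTINATION_PREPS = {"نحو", "إلى", "ل", "باتجاه", "صوب"}
FURNISHER_PARTICLES = {"من", "من عند"}
BENEFICIARY_PARTICLES = {"إلى", "ل"}
SOURCE_VERBS = {"ترك", "غادر"}
DESTINATION_VERBS = {"قصد"}


@dataclass
class Constituent:
    syntactic_case: str                # "subject", "direct_object", ...
    preposition: Optional[str] = None  # governing particle, if any
    animated: bool = False             # semantic trait "Animated"
    is_place: bool = False             # True for place nouns


def assign_case(c: Constituent, verb: str = "", transfer_verb: bool = False,
                passive: bool = False) -> Optional[str]:
    """Return the case role of a constituent, or None if no rule fires."""
    if c.preposition is not None:
        # Animated fillers of transfer verbs are tried first (assumed order).
        if c.animated and transfer_verb:
            if c.preposition in FURNISHER_PARTICLES:
                return "FURNISHER"
            if c.preposition in BENEFICIARY_PARTICLES:
                return "BENEFICIARY"
        if c.preposition in INSTRUMENT_PREPS:
            return "INSTRUMENT"
        if c.preposition in SOURCE_PREPS:
            return "SOURCE"
        if c.preposition in DESTINATION_PREPS:
            return "DESTINATION"
        return None
    if c.syntactic_case == "subject":
        return "OBJECT" if passive else "AGENT"
    if c.syntactic_case == "direct_object":
        if c.is_place and verb in SOURCE_VERBS:
            return "SOURCE"
        if c.is_place and verb in DESTINATION_VERBS:
            return "DESTINATION"
        return "OBJECT"
    return None
```

For instance, a subject of an active verb is labelled AGENT, while the same subject with `passive=True` is labelled OBJECT, as stated in the rule for the OBJECT case.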


Traits of Chafe

[Figure: Chafe's semantic traits (Animated, Human, Feminine, Unique, Concrete, Potent, Countable), each marked (+) or (-), for the nouns المستعمل (the user) and الشاشة (the screen).]
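One way to store such trait bundles is a boolean table per noun. In the sketch below the trait names come from the slide, but the (+)/(-) values are assumptions chosen for illustration, since the marks on the slide did not survive extraction; the helper names are hypothetical as well.

```python
TRAITS = ("animated", "human", "feminine", "unique",
          "concrete", "potent", "countable")

USER = "المستعمل"    # "the user"
SCREEN = "الشاشة"    # "the screen"

# Assumed values, NOT read from the slide.
NOUN_TRAITS = {
    USER: dict(animated=True, human=True, feminine=False, unique=False,
               concrete=True, potent=True, countable=True),
    SCREEN: dict(animated=False, human=False, feminine=True, unique=False,
                 concrete=True, potent=False, countable=True),
}


def has_trait(noun: str, trait: str) -> bool:
    """Look up one trait of a noun, rejecting unknown trait names."""
    if trait not in TRAITS:
        raise ValueError(f"unknown trait: {trait}")
    return NOUN_TRAITS[noun][trait]


def can_be_beneficiary(noun: str) -> bool:
    """The BENEFICIARY and FURNISHER rules require an animated noun."""
    return has_trait(noun, "animated")
```

A lookup of this kind is what lets a case rule such as "animation = animated" be checked mechanically.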


Frames


Fig 1. General Frame


Fig 2. Specialized Frame


Conceptual Dependency


PROPEL: apply a force to something

MOVE: move a body part

GRASP: catch an object

INGEST: ingest, for a moving object

EXPEL: physically expel, for a moving object

PTRANS: move a physical object

ATRANS: modify an abstract relationship, such as possession

SPEAK: produce a sound; supports an action such as "communicate"

ATTEND: direct one's attention to a perception or stimulus

MTRANS: transfer information

MBUILD: create a new thought
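The primitive acts listed on this slide map naturally onto an enumeration. The sketch below assumes a verb lexicon that links each verb to one primitive; the `VERB_PRIMITIVES` entries and the `primitive_for` helper are hypothetical illustrations, not taken from the slides.

```python
from enum import Enum
from typing import Optional


class Primitive(Enum):
    """Conceptual Dependency primitive acts, with the slide's glosses."""
    PROPEL = "apply a force to something"
    MOVE = "move a body part"
    GRASP = "catch an object"
    INGEST = "ingest, for a moving object"
    EXPEL = "physically expel, for a moving object"
    PTRANS = "move a physical object"
    ATRANS = "modify an abstract relationship, such as possession"
    SPEAK = "produce a sound"
    ATTEND = "direct one's attention to a perception or stimulus"
    MTRANS = "transfer information"
    MBUILD = "create a new thought"


# Illustrative verb-to-primitive lexicon (assumed entries).
VERB_PRIMITIVES = {
    "أعطى": Primitive.ATRANS,   # "give": possession changes hands
    "غادر": Primitive.PTRANS,   # "leave": a physical change of location
}


def primitive_for(verb: str) -> Optional[Primitive]:
    """Return the primitive attached to a verb, or None if it is unlisted."""
    return VERB_PRIMITIVES.get(verb)
```

Reducing verbs to a small set of primitives means that verbs of different languages can meet on the same primitive when frames are translated.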


Re-organization in French

Le livre est vendu / beau

La revue est vendue / belle

Les livres sont vendus / beaux

Les revues sont vendues / belles

Less so in English
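The agreement pattern in these examples can be written as a lookup keyed by gender and number. This is a minimal sketch using only the words on the slide; the table layout and the function names `french` and `english` are assumptions made for illustration.

```python
# Gender of the two nouns used on the slide.
NOUNS = {"livre": "m", "revue": "f"}

# Participle and adjective forms, indexed by (gender, number).
FORMS = {
    "vendu": {("m", "sg"): "vendu", ("f", "sg"): "vendue",
              ("m", "pl"): "vendus", ("f", "pl"): "vendues"},
    "beau": {("m", "sg"): "beau", ("f", "sg"): "belle",
             ("m", "pl"): "beaux", ("f", "pl"): "belles"},
}


def french(noun: str, number: str, predicate: str) -> str:
    """Build 'article noun copula predicate' with full agreement."""
    gender = NOUNS[noun]
    if number == "pl":
        article, subject, copula = "Les", noun + "s", "sont"
    else:
        article = "Le" if gender == "m" else "La"
        subject, copula = noun, "est"
    return f"{article} {subject} {copula} {FORMS[predicate][(gender, number)]}"


def english(noun: str, number: str, predicate: str) -> str:
    """In English only the noun and the copula vary; the predicate does not."""
    plural = number == "pl"
    return f"The {noun}{'s' if plural else ''} {'are' if plural else 'is'} {predicate}"
```

The French predicate needs four forms where English needs one, which is why the re-organization step matters more for French output.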


Examples

أعاد المستعمل تسمية الملف

طبع الطفل النص بالطابعة

طبع الطفل النص بسرعة

طبع النص بالطابعة

نسخ المهندس قاعدة المعلومات

أرسل الطفل رسالة إلكترونية إلى الأستاذ

أرسل الطفل رسالة إلكترونية إلى أستاذه


L'utilisateur a re-nommé le fichier

L’enfant a imprimé le texte avec l’imprimante

Enfant a imprimé le texte rapidement

Le texte a été imprimé

L’ingénieur a copié la base de données

L’enfant a envoyé un email à l'enseignant

Envoyer un email à son enseignant

The user re-named the file

The Child printed the text with the printer

The Child printed text

The Printed text printer

The Engineer copied the database

The child sended an email to the teacher

The child sended an email to his teacher


Examples

L'utilisateur de renommer le fichier

Imprimante enfant du texte imprimé

Enfant texte imprimé rapidement

Imprimé imprimante texte

Ingénieur de base de données de copie

Envoyer un email à l'enfant de l'enseignant

Envoyer un email à l'enfant mentor


أعاد المستعمل تسمية الملف

طبع الطفل النص بالطابعة

طبع الطفل النص بسرعة

طبع النص بالطابعة

نسخ المهندس قاعدة المعلومات

أرسل الطفل رسالة إلكترونية إلى الأستاذ

أرسل الطفل رسالة إلكترونية إلى أستاذه


Examples

User re-naming the file

Child printer printed text

Child printed text quickly

Printed text printer

Copy Database Engineer

Send an email to the child the teacher

Send an email to the child mentor


أعاد المستعمل تسمية الملف

طبع الطفل النص بالطابعة

طبع الطفل النص بسرعة

طبع النص بالطابعة

نسخ المهندس قاعدة المعلومات

أرسل الطفل رسالة إلكترونية إلى الأستاذ

أرسل الطفل رسالة إلكترونية إلى أستاذه


Our translation system, some modules of which were presented in this paper, performs semantic processing of texts using purely linguistic tools and relies on the DCF method as its basis.

This method has proved well suited to the Arabic language and to its syntactic and semantic particularities [3][4][6].


Enrich dictionaries to cover other domains

Multilingual system

Fusion of linguistic and probabilistic approaches


Enrich dictionaries to cover other domains

Rich dictionaries + well-defined rules -> better translation


Multilingual system

Sentence in Italian / French / English / Arabic -> Internal representation ("meaning") -> Sentence in Italian / French / English / Arabic
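The appeal of a shared internal representation can be shown with a small count and a toy pivot translator. Everything below is an illustrative assumption: the one-entry lexicons, the `FILE` meaning symbol and the function names are hypothetical and only demonstrate the idea of translating through a common "meaning".

```python
LANGUAGES = ["Arabic", "French", "English", "Italian"]


def direct_translators(n: int) -> int:
    """Directed language pairs needed without a pivot: n * (n - 1)."""
    return n * (n - 1)


def pivot_modules(n: int) -> int:
    """One analyser plus one generator per language: 2 * n."""
    return 2 * n


# Toy analysers: surface form -> meaning symbol (assumed entries).
TO_MEANING = {
    "Arabic": {"الملف": "FILE"},
    "French": {"le fichier": "FILE"},
    "English": {"the file": "FILE"},
    "Italian": {"il file": "FILE"},
}

# Toy generators: meaning symbol -> surface form, derived by inversion.
FROM_MEANING = {
    lang: {meaning: text for text, meaning in lexicon.items()}
    for lang, lexicon in TO_MEANING.items()
}


def translate(text: str, source: str, target: str) -> str:
    """Analyse into the internal representation, then generate the target."""
    meaning = TO_MEANING[source][text]
    return FROM_MEANING[target][meaning]
```

With the four languages on the slide, a pivot design needs 8 modules instead of 12 directed translators, and the gap widens as languages are added.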


Fusion of linguistic and probabilistic approaches

Linguistic approach + statistical approach -> fusion -> better translation
