English to Bangla Translation

Presentation on English to Bangla Text Conversion

Saugata Bose
Assistant Professor
Department of Computer Science and Engineering, ULAB

Flow of our Session

What happens earlier

What qualifies me for applying for PhD

Text Translation
Machine Translation
Significance

Introductory Ideas

Machine Translation
Direct
Indirect
Word by Word, Phrase by Phrase
Requirements: Bilingual Dictionary, Rearrangement Rule
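The direct approach above (word-by-word lookup in a bilingual dictionary plus a rearrangement rule) can be sketched roughly as follows. This is only a minimal illustration: the three dictionary entries and the single subject-verb-object to subject-object-verb rule are assumptions chosen for the example, not the dictionary or rule set used in the presentation.

```python
# Toy bilingual dictionary (hypothetical entries, for illustration only).
BILINGUAL_DICT = {
    "i": "আমি",
    "eat": "খাই",
    "rice": "ভাত",
}


def rearrange_svo_to_sov(words):
    """Assumed rearrangement rule: in a 3-word S-V-O clause, move the verb to the end."""
    if len(words) == 3:
        subject, verb, obj = words
        return [subject, obj, verb]
    return words


def translate_direct(sentence):
    """Reorder the words, then look each one up; unknown words pass through unchanged."""
    words = sentence.lower().rstrip(".").split()
    reordered = rearrange_svo_to_sov(words)
    return " ".join(BILINGUAL_DICT.get(w, w) for w in reordered)


print(translate_direct("I eat rice"))
```

A real direct system would need a far larger dictionary and many more reordering rules; the sketch only shows where each of the two requirements enters the process.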

All information necessary for the generation of the target text, without looking back to the original text
SL analysis
SL to TL transfer
TL generation
[Diagram: source -> logical form of SOURCE -> logical form of TARGET -> target]
Empirical

Empirical System
Statistical Approach
Example-Based Approach
  Source text == stored example translation
  Requirements: Bilingual corpus, Best Match algorithm
  Grammar is not the major focus
  Good quality of bilingual data in a very large corpus
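The example-based idea above (match the source text against stored example translations with a best-match algorithm) might look something like the sketch below. The two stored examples are hypothetical stand-ins for a bilingual corpus, and token-level `difflib.SequenceMatcher` similarity is just one plausible choice of matcher; the slides do not say which best-match algorithm is meant.

```python
from difflib import SequenceMatcher

# Hypothetical stored example translations (a stand-in for a bilingual corpus).
EXAMPLES = {
    "i eat rice": "আমি ভাত খাই",
    "he goes to school": "সে স্কুলে যায়",
}


def best_match(source, examples=EXAMPLES):
    """Return (stored_source, stored_translation, score) for the closest stored example."""
    source_tokens = source.lower().split()

    def score(candidate):
        # Similarity ratio over word tokens, in the range 0.0 to 1.0.
        return SequenceMatcher(None, source_tokens, candidate.split()).ratio()

    best = max(examples, key=score)
    return best, examples[best], score(best)


print(best_match("I eat fish"))
```

Note that grammar plays no role here: quality depends entirely on how close the nearest stored example is, which is why a large, good-quality bilingual corpus is listed as a requirement.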

Previous Works
EMBT approach by Dr. Mumit Khan and Anwarus Salam
  Tagging and parsing the English sentence
  Translate from the source language to the target language following some sentence rules
CYK-CNF approach by Sajib Dasgupta, Abu Wasif and Sharmin Azam
  Same as the EMBT approach
  Apply CNF to convert the parse tree to normal form
  Transfer the English parse tree to a Bangla one
  Generate the Bangla sentence

Am I ready???
A Framework for Detecting External Plagiarism from Monolingual Documents: Use of Shallow NLP and N-gram Frequency Comparison Approach. Presented at the 2nd International Conference on Information and Communication Technology for Competitive Strategies (ICTCS-2016) (Paper ID: 89), March 2016. [Conference Proceedings by ACM ICPS, Proceedings Volume ISBN 978-1-4503-3962-9]

Propose a framework
Investigate the role of machine learning in the proposed framework

Scope
External Plagiarism

Text Pre-processing & NLP Techniques

Comparison Methodologies

Feature Vector

Suspicious Documents

Original Documents

Corpus
Lowercasing
Chunking
Punctuations
Stop Words
Stemming/Lemmatizing
1 gram, 2 gram, 3 gram, 4 gram, 5 gram
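A rough sketch of part of the pre-processing chain listed above: lowercasing, punctuation removal, stop-word removal, and 1-gram to 5-gram extraction. Chunking and stemming/lemmatizing are left out to keep the example dependency-free, and the stop-word list is a tiny assumed sample rather than the list used in the experiments.

```python
import string
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"a", "an", "the", "is", "of", "and", "to", "in"}


def preprocess(text):
    """Lowercase, strip punctuation, and drop stop words."""
    text = text.lower().translate(str.maketrans("", "", string.punctuation))
    return [tok for tok in text.split() if tok not in STOP_WORDS]


def ngram_counts(tokens, n):
    """Count contiguous word n-grams of length n."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


tokens = preprocess("The cat sat on the mat.")
# One frequency table per n-gram size, from 1 gram to 5 gram.
features = {n: ngram_counts(tokens, n) for n in range(1, 6)}
print(tokens)
print(features[2])
```

When a document has fewer than n tokens after cleaning, the corresponding n-gram table is simply empty.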

Feature Selection

Reduced Feature Set

Train Classifier

Apply Classifier on Test Data

Plagiarism Detection

Experimental Setup (cont.)
Comparison Methodologies

Machine learning algorithm

N-gram frequency-based similarity measure
J48 Classifier, Naive Bayes Classifier
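One way an n-gram frequency-based similarity measure could feed a classifier is sketched below: for each n from 1 to 5, compare the n-gram frequency tables of a suspicious and an original document and use the five scores as a feature vector. Cosine similarity is an assumption made for this illustration; the slides do not state which similarity formula was used, and the classifiers themselves (J48, Naive Bayes) are not reproduced here.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Frequency table of contiguous word n-grams."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def cosine_similarity(a, b):
    """Cosine of the angle between two n-gram frequency vectors (0.0 if either is empty)."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_features(suspicious, original, max_n=5):
    """One similarity score per n-gram size, usable as a classifier feature vector."""
    s, o = suspicious.lower().split(), original.lower().split()
    return [cosine_similarity(ngrams(s, n), ngrams(o, n)) for n in range(1, max_n + 1)]


print(similarity_features("the quick brown fox jumps", "the quick brown dog jumps"))
```

Higher-order n-grams drop off quickly when wording changes, so the longer sizes are the stricter signals of copied text.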

Experiment and Findings-1

Generating Decision Tree
95 instances, 121 attributes
Selecting Features
Build train model
95 instances, 27 attributes
Accuracy: 94.6809 % on J48
Accuracy: 65.9574 % on Naive Bayes
Accuracy: 71.2766 % on Naive Bayes
Accuracy: 93.617 % on J48

Accuracy: 89.0052 % on J48

Accuracy: 86.3874 % on Naive Bayes

Thank You
