What machine translation developers are doing to make post-editors happy

Page 1: What machine translation developers are doing to make post-editors happy

What MT developers are doing… …to make post-editors happy

John Tinsley, CEO and Co-founder

WPTP4 @ MT Summit. Miami. 3rd November 2015

Page 2: What machine translation developers are doing to make post-editors happy

We provide Machine Translation solutions with Subject Matter Expertise

An MT solutions and services provider, specialising in customised solutions with subject matter expertise for specific technical sectors such as patents/IP, life sciences, and finance.

Page 3: What machine translation developers are doing to make post-editors happy

MT Application Areas

MT for Information Purposes
•  Development focuses on improving key information translation
•  Terminology is important
•  Evaluation driven by “usability”

MT for Post-editing Productivity
•  Development focuses on reducing edits required
•  Feedback loop is crucial
•  Evaluation through practical translation tasks

Page 4: What machine translation developers are doing to make post-editors happy

Use cases in practice

•  Product descriptions to open new markets
•  MT for post-editing productivity across industries
•  Developer and user for web content
•  Tens of thousands of people using online tools daily

Page 5: What machine translation developers are doing to make post-editors happy

TRANSLATION

Page 6: What machine translation developers are doing to make post-editors happy

“Four Pillars of Happiness”

QUALITY: Ensuring the output is the highest quality possible!

EVALUATION: Letting users know how good to expect the output to be

INTEGRATION: Making sure the MT fits seamlessly into the workflow

FEEDBACK: Bringing the translator into the loop to effect change

Page 7: What machine translation developers are doing to make post-editors happy

Quality

There’s no silver bullet when it comes to improving MT quality

Page 8: What machine translation developers are doing to make post-editors happy

Quality

•  What is being done to improve MT*
a)  on a broader, technology level?
b)  on a lower level for specific languages / domains?

*not with the express purpose of making post-editors happy :)

Page 9: What machine translation developers are doing to make post-editors happy

Quality

•  What is being done to improve MT*
a)  on a broader, technology level?
b)  on a lower level for specific languages / domains?

*not with the express purpose of making post-editors happy :)

–  Neural networks and deep learning
•  something new, totally different, the future?

–  Online adaptive MT
•  improving specific engines rapidly [feedback]

–  Syntax-based MT (tree-to-string, etc.)
•  incorporating elements of linguistics

Page 10: What machine translation developers are doing to make post-editors happy

Quality

•  What is being done to improve MT*
a)  on a broader, technology level?
b)  on a lower level for specific languages / domains?

*not with the express purpose of making post-editors happy :)

–  Chinese
•  segmentation (see the sketch after this list), 的 (de) particle

–  German
•  long-distance verb movements, compound splitting / joining

–  Irish
•  more fundamental: data collection, resource development
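To see why segmentation matters: Chinese is written without spaces between words, so an MT engine must first decide where words begin and end, and different splits lead to different translations. A quick illustration using the open-source jieba segmenter (an example tool, not necessarily what any given engine uses):

```python
# Chinese word segmentation demo: the raw string carries no word boundaries,
# so the segmenter must infer them from its dictionary and statistics.
import jieba  # pip install jieba

text = "机器翻译系统"  # "machine translation system", written with no spaces
print(jieba.lcut(text))  # e.g. ['机器翻译', '系统'] -- the split depends on the dictionary
```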


Page 15: What machine translation developers are doing to make post-editors happy

Quality

•  What is being done to improve MT*
a)  on a broader, technology level?
b)  on a lower level for specific languages / domains?

*not with the express purpose of making post-editors happy :)

–  MT for User Generated Content @ …
•  how to handle misspellings, text speak, etc.

–  Patent-focused MT @ Iconic
•  concentrating on the mix of technical language and style

–  MT for online course materials @ TraMOOC
•  European H2020 project

Page 16: What machine translation developers are doing to make post-editors happy

Evaluation

•  Objectively provide stakeholders with information such as:
a)  general quality expectations of an MT engine
b)  how it’s impacting individual translators’ performance
c)  what specific areas could be improved

Page 17: What machine translation developers are doing to make post-editors happy

MT Evaluation – where do we start!?

Lots of different ways to do evaluation:
–  automatic scores
•  BLEU, METEOR, GTM, TER
–  fluency, adequacy, comparative ranking
–  task-based evaluation
•  error analysis, post-edit productivity

Different metrics, different intelligence:
–  what does each type of metric tell us?
–  which ones are usable at which stage of evaluation?

e.g. can we really use automatic scores to assess productivity?
e.g. does productivity delta really tell us how good the output is?
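To make these metrics concrete, here is a minimal sketch of segment-level TER in Python. Real TER (Snover et al., 2006) also counts block shifts; this simplification uses plain word-level edit distance normalised by reference length, which is enough to show why TER maps naturally onto post-editing effort.

```python
# Simplified segment-level TER: how many word edits (insert / delete /
# substitute) turn the MT output into the reference, per reference word.
# Real TER also allows block shifts; this sketch omits them.

def edit_distance(hyp_tokens, ref_tokens):
    """Word-level Levenshtein distance."""
    m, n = len(hyp_tokens), len(ref_tokens)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i
    for j in range(n + 1):
        dp[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if hyp_tokens[i - 1] == ref_tokens[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution
    return dp[m][n]

def segment_ter(hypothesis, reference):
    """0.0 means the MT output already matches the reference exactly."""
    hyp, ref = hypothesis.split(), reference.split()
    return edit_distance(hyp, ref) / max(len(ref), 1)

print(segment_ter("the patent claims a new method",
                  "the patent claims a novel method"))  # 1 edit / 6 words ~ 0.17
```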

Page 18: What machine translation developers are doing to make post-editors happy

Evaluation Case Study – RWS

- UK-headquartered public company
- Founded 1958
- 9th largest LSP (CSA 2013 report)
- Leader in specialist IP translations

Problem
Large Chinese to English patent translation project. Challenging content and language.

Question
What efficiencies, if any, can machine translation add to the workflow of RWS translators?

How we applied different types of MT evaluation at different stages in the process, at various go/no-go points, to help RWS assess whether MT is viable for this project.

Page 19: What machine translation developers are doing to make post-editors happy

Step 1: Are the engines any good?
Can we improve our baseline engines through customisation?

[Chart: BLEU and TER scores (0–0.8) for the Iconic Baseline vs. Iconic Customised engines]

-  Huge improvement
-  Intuitively, the scores reflect well but don’t really say anything
-  Let’s dig deeper

What next?
How good is the output relative to the task, i.e. post-editing?
- fluency/adequacy not going to tell us
- let’s start with segment-level TER

Page 20: What machine translation developers are doing to make post-editors happy

If we look deeper, what can we learn?

Translation Edit Rate: correlates well with practical evaluations

INTELLIGENCE
• Proportion of full matches (i.e. big savings)
• Proportion of close matches (i.e. faster than fuzzy matches)
• Proportion of poor matches

ACTIONABLE INFORMATION
• Type of sentence with high/low matches
• Weaknesses and gaps
• Segments to compare and analyse in translation memory
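This “intelligence” can be computed directly from segment-level TER scores. A minimal sketch; the band thresholds (0.0 for a full match, 0.3 for a close match) are illustrative assumptions, not figures from the talk.

```python
# Bucket segment-level TER scores into the match bands described above.
from collections import Counter

def band(ter_score):
    if ter_score == 0.0:
        return "full match"   # no edits needed: big savings
    if ter_score <= 0.3:      # assumed threshold for "light post-editing"
        return "close match"  # faster than working from fuzzy matches
    return "poor match"       # heavy editing: flag these segments for analysis

ter_scores = [0.0, 0.12, 0.25, 0.8, 0.0, 0.45]  # hypothetical engine output
counts = Counter(band(s) for s in ter_scores)
for name in ("full match", "close match", "poor match"):
    print(f"{name}: {counts[name] / len(ter_scores):.0%}")
```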

Page 21: What machine translation developers are doing to make post-editors happy

Step 2: Are they any good for post-editing?

[Chart: distribution of segment-level TER scores plotted against segment length]

Page 22: What machine translation developers are doing to make post-editors happy

Step 3: Quantifying with ACTUAL translators

Productivity Test

With MT experience and previous MT integration, productivity testing can be run in the production environment. In this case, we used the TAUS Dynamic Quality Framework.

Page 23: What machine translation developers are doing to make post-editors happy

Productivity Test

Page 24: What machine translation developers are doing to make post-editors happy

Step 3: Productivity testing

Beware the variables!

•  Translators: different experience, speed, perceptions of MT
–  24 translators: senior, staff, and interns

•  Test sets: not representative; particularly difficult
–  2 test sets, comprising 5 documents, and cross-fold validation

•  Environment and task: inexperience and unfamiliarity
–  Training materials, videos, and “dummy” segments

Page 25: What machine translation developers are doing to make post-editors happy

Findings and Learnings

Overall average: 25% productivity gain
-  Correlates with TER

By Translator Profile
-  Experienced: 22%
-  Staff: 23%
-  Interns: 30%
-  Rollout with junior staff for more immediate impact on the bottom line?

By Test Set
-  Test set 1.1: 25%
-  Test set 1.2: 35%
-  Test set 2.1: 6%
-  Test set 2.2: 35%
-  Don’t be overly concerned by outliers. Use data to facilitate source content profiling?
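For context, the productivity delta behind figures like these is a plain throughput comparison: words per hour when post-editing MT versus translating from scratch, grouped by translator profile. A minimal sketch; the session records and numbers below are hypothetical, and a real test (e.g. run through the TAUS Dynamic Quality Framework) would log time at segment level.

```python
# Compute per-profile productivity gain from (hypothetical) session logs.
from collections import defaultdict

# (translator_profile, condition, words_translated, hours)
sessions = [
    ("intern", "scratch", 1800, 1.0), ("intern", "post-edit", 2340, 1.0),
    ("staff",  "scratch", 2400, 1.0), ("staff",  "post-edit", 2950, 1.0),
]

throughput = defaultdict(lambda: defaultdict(list))
for profile, condition, words, hours in sessions:
    throughput[profile][condition].append(words / hours)

for profile, conds in throughput.items():
    scratch = sum(conds["scratch"]) / len(conds["scratch"])
    post_edit = sum(conds["post-edit"]) / len(conds["post-edit"])
    gain = (post_edit - scratch) / scratch
    print(f"{profile}: {gain:+.0%} productivity gain")  # intern: +30%, staff: +23%
```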

Page 26: What machine translation developers are doing to make post-editors happy

Evaluation

•  Objectively provide stakeholders with information such as:
a)  general quality expectations of an MT engine ✔
b)  how it’s impacting individual translators’ performance ✔
c)  what specific areas could be improved ✔

Now we actually talk to the translators to get their feedback on the task and the MT output, and start that virtuous loop… we’ll come back to this.

Metrics
•  WMT metrics shared task
•  New(er) metrics designed to correlate with post-editing effort
•  Optimising MT engines on new / different metrics

Page 27: What machine translation developers are doing to make post-editors happy

Quality Estimation and other features

Estimating the quality of MT output in real time, at runtime:

•  Binary classification (good/bad)
•  Multi-label classification, scores
•  Word level, error categorisation
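As a rough illustration of the binary (good/bad) case, the sketch below trains a classifier on shallow features of the source and MT output, with no reference translation in sight. Real QE systems use far richer features and much more data; the feature set, training examples, and labels here are illustrative assumptions only.

```python
# Toy binary quality estimation: predict whether an MT output is good enough
# to hand to a post-editor, without seeing a reference translation.
from sklearn.linear_model import LogisticRegression

def features(source, mt_output):
    src, tgt = source.split(), mt_output.split()
    return [
        len(src),                      # source length
        len(tgt) / max(len(src), 1),   # target/source length ratio
        sum(t == s for t, s in zip(tgt, src)) / max(len(tgt), 1),  # copied-word proxy
    ]

# Hypothetical training data: 1 = good enough to post-edit, 0 = bad
train = [
    ("das Patent beschreibt ein Verfahren", "the patent describes a method", 1),
    ("das Patent beschreibt ein Verfahren", "das Patent beschreibt ein Verfahren", 0),
    ("ein neues System", "a new system", 1),
    ("ein neues System", "ein neues system new", 0),
]

X = [features(src, mt) for src, mt, _ in train]
y = [label for _, _, label in train]
clf = LogisticRegression().fit(X, y)

print(clf.predict([features("ein Verfahren", "a method")]))  # e.g. [1] = good
```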

Page 28: What machine translation developers are doing to make post-editors happy

Feedback

Engaging end users (post-editors, LSPs) both directly and indirectly, taking feedback on board for the betterment of MT.

Direct Feedback
•  talking to the translators (imagine!)
•  collecting structured feedback
–  error categorisation
–  correction
–  severity
•  commenting on error types and actions

Establish a relationship and understanding to foster acceptance.

Page 29: What machine translation developers are doing to make post-editors happy

Understanding the MT developer

The machine translation engine will never be 100% perfect. Certain types of sentences will always lend themselves better to MT than others. Our joint goal is to get the machine translation quality to a level where the majority of sentences are translated well, and the process of post-editing is faster and more efficient than piecing together translations from a combination of fuzzy matches, terminology, and reference translations.

There are certain types of MT output errors that can be fixed quickly and easily, while others are more fundamental issues that will be fixed through general improvement of the engines and the technology itself over time. Here are some examples of each:

Quick Fixes
- Technical terminology
- Frequent, consistent set phrases
- Stylistic/formatting errors

Fixed Over Time
- General grammatical errors
- Sentence-level disfluency
- Noun phrase ordering

If we encounter an error that is just a “minor” mistake and, in general, the context around it is OK, sometimes the best approach is to simply leave it for post-editing.

Page 30: What machine translation developers are doing to make post-editors happy

Feedback

Indirect Feedback
•  terminology management
•  automatic post-edit rules (sketched below)
•  templates for generalisation

Empowering the translator to effect change themselves.
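Automatic post-edit rules are a concrete example of this kind of indirect feedback: deterministic corrections harvested from repeated translator edits and applied to the raw MT output before it ever reaches the post-editor. A minimal sketch; the rules themselves are hypothetical.

```python
# Apply deterministic post-edit rules (terminology, style, spacing) to MT output.
import re

# (pattern, replacement) pairs, e.g. enforced terminology or client style rules
POST_EDIT_RULES = [
    (re.compile(r"\bpatent right\b", re.IGNORECASE), "patent"),  # terminology fix
    (re.compile(r"\bfigure (\d+)\b"), r"Fig. \1"),               # client style guide
    (re.compile(r"\s+([,.;:])"), r"\1"),                         # no space before punctuation
]

def apply_rules(mt_output):
    for pattern, replacement in POST_EDIT_RULES:
        mt_output = pattern.sub(replacement, mt_output)
    return mt_output

print(apply_rules("The patent right covers figure 3 , as claimed ."))
# -> "The patent covers Fig. 3, as claimed."
```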

Page 31: What machine translation developers are doing to make post-editors happy

Integration

•  Make MT fit as seamlessly as possible into the translator workflow
a)  directly into existing CAT tools
b)  new CAT tools
c)  what else would you like? :)

•  Most CAT tools have MT plugins for most MT vendors (see the sketch below)
–  Studio, MemoQ, Wordfast, MultiTrans

•  Matecat making MT more central
–  facilitating online learning technology too

•  Highlighting, instrumentation, TM / MT cooperation
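Under the hood, a CAT-tool MT plugin mostly just requests a translation for the current segment over an API and shows it alongside TM matches. A minimal sketch; the endpoint, parameters, and response shape below are hypothetical, not any particular vendor’s API.

```python
# Fetch an MT suggestion for the segment currently open in the editor.
import requests

def fetch_mt_suggestion(segment, engine_id):
    response = requests.post(
        "https://mt.example.com/api/translate",  # hypothetical MT API endpoint
        json={"engine": engine_id, "source": segment},
        timeout=5,
    )
    response.raise_for_status()
    return response.json()["translation"]  # assumed response field

# The suggestion would appear next to fuzzy matches, ideally with a quality
# estimate attached so the post-editor knows what to expect.
suggestion = fetch_mt_suggestion("本发明涉及一种方法", "zh-en-patents")
```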

Page 32: What machine translation developers are doing to make post-editors happy

“The biggest room in the world is the room for improvement”

Page 33: What machine translation developers are doing to make post-editors happy

Thank You! [email protected]

@IconicTrans