Translating the classics: An automated system for translating Dutch uniform classical music titles INGMAR VROOMEN & CASPER KARREMAN, MUZIEKWEB




Project manager: Ingmar Vroomen

Senior developer: Casper Karreman


Introducing Muziekweb


Muziekweb: a short introduction

• Founded in 1961 as Stichting Centrale Discotheek (CDR)


Collection 2018:

• ± 600,000 CDs

• 300,000 LPs

• 30,000 music DVDs

• Historical audio formats: wax cylinders, shellac, Pathé records, Edison Diamond Discs


Music library of the Netherlands

• Bibliothèque nationale de France

• British Library Sound Archive

• Deutsches Musikarchiv

• Muziekweb


Projects

• Music and science

• Internationalisation


(International) Collaborations

• Scientific research: sharing data and contributing to research, e.g. with TU Delft or Utrecht University

• All public libraries in the Netherlands and Flanders

• The Dutch Royal Library, the national library of the Netherlands

• Foreign music libraries such as the Deutsches Musikarchiv (DMA) and the British Library Sound Archive (BLSA), but all our data is in Dutch!


Project: automated translation

• Translate our website muziekweb.nl for international visitors

• Share data with foreign (non-Dutch speaking) libraries

• Enable easier linking of our database to other international music services and databases


Translating 1000 titles by hand


The need for an automated solution


Steps in the translation process

• What do we translate?

  • Generic title or identifying name

  • Instruments and voices

  • Identifying opus or catalogue number

  • Key (A major, d minor)


Steps in the translation process

• What do we translate?

Dutch (Muziekweb): Wolfgang Amadeus Mozart, Requiem voor soli [4], koor en orkest KV.626 in d kl.t.

English: Wolfgang Amadeus Mozart, Requiem for soloists [4], choir, orchestra KV.626 in d minor
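
As a rough illustration of translating a generic title component by component, the sketch below splits a Dutch uniform title into key, catalogue number and remaining words, then translates the fixed vocabulary with lookup tables. The tables, regular expressions and function names are assumptions made for this example and not Muziekweb's actual implementation; `kl.t.` and `gr.t.` are the Dutch abbreviations for minor and major.

```python
import re

# Hypothetical lookup tables; a real system needs a much larger vocabulary.
WORDS_NL_EN = {
    "voor": "for",
    "en": "and",
    "soli": "soloists",
    "koor": "choir",
    "orkest": "orchestra",
}
MODES_NL_EN = {"kl.t.": "minor", "gr.t.": "major"}

# Key at the end of the title, e.g. " in d kl.t."
KEY_RE = re.compile(r"\s+in\s+(?P<tonic>\S+)\s+(?P<mode>kl\.t\.|gr\.t\.)$")
# Catalogue or opus number at the end of what remains, e.g. " KV.626" or " op.68"
CAT_RE = re.compile(r"\s+(?P<cat>(?:KV|BWV|op)\.\s?\d+\w*)$")


def _translate_word(token: str) -> str:
    """Translate one token, keeping trailing punctuation such as a comma."""
    core = token.rstrip(",;")
    tail = token[len(core):]
    return WORDS_NL_EN.get(core, core) + tail


def translate_uniform_title(title_nl: str) -> str:
    """Translate a Dutch uniform title to English, component by component."""
    rest = title_nl.strip()
    key = ""
    match = KEY_RE.search(rest)
    if match:
        key = f" in {match.group('tonic')} {MODES_NL_EN[match.group('mode')]}"
        rest = rest[: match.start()]
    cat = ""
    match = CAT_RE.search(rest)
    if match:
        cat = " " + match.group("cat")  # catalogue numbers stay unchanged
        rest = rest[: match.start()]
    # Unknown tokens (names, "[4],", "Requiem") pass through untouched.
    words = " ".join(_translate_word(token) for token in rest.split())
    return words + cat + key


if __name__ == "__main__":
    print(translate_uniform_title(
        "Requiem voor soli [4], koor en orkest KV.626 in d kl.t."))
    print(translate_uniform_title("Schoppenvrouw, op.68"))
```

A lookup table like this only covers generic vocabulary. An identifying name such as "Schoppenvrouw" passes through untranslated, which is why external datasets are needed for those titles.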


Steps in the translation process

• What do we translate?

Pjotr Iljitsj Tsjaikovski, Schoppenvrouw, op.68

            English          French         German
Google:     Spade woman      Pelle femme    Spaten Frau
Muziekweb:  Queen of spades  Dame de pique  Pique dame


Steps in the translation process

• What do we translate?

• Research for other datasets


Research for other datasets

• ISNI for names of creators / collaborators

• WorldCat for library collections

• DDEX for music distribution

None of these focuses on the musical composition itself


Research for other datasets

• Cantorion - Focus on classical music, concerts and sheet music

• MusicBrainz - Open music encyclopedia

• Wikidata - Open structured dataset, interacts well with machines and humans

Find datasets with overlapping content in different languages


Steps in the translation process

• What do we translate?

• Research for other datasets

• Analyze the data


Analyze the data

• Find out how each dataset describes a musical work

• What information is in the data; data is not information!

• Each dataset contains different information and presents it differently


Analyze the data: Cantorion example


Analyze the data: Wikidata example


Analyze the data: Muziekweb


Steps in the translation process

• What do we translate?

• Research for other datasets

• Analyze the data

• Query the datasets


Query the datasets

• Every resource has its own interface

• Results depend on the question asked, so ask the right questions


Query the datasets: Cantorion


Query the datasets: Wikidata
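
A query against Wikidata could look like the following sketch, which builds a SPARQL query for the English, French and German labels of any item whose Dutch label equals a given title, and groups the labels in a standard SPARQL JSON result by language. The function names and the exact query shape are assumptions for illustration; the slides do not show the query Muziekweb actually runs. The HTTP request itself is left out so the example runs offline.

```python
WIKIDATA_SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"


def build_label_query(title_nl: str, languages=("en", "fr", "de")) -> str:
    """Build a SPARQL query for labels of items whose Dutch label is title_nl."""
    # Escape backslashes and double quotes so the title is a valid literal.
    escaped = title_nl.replace("\\", "\\\\").replace('"', '\\"')
    lang_list = ", ".join(f'"{code}"' for code in languages)
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT ?item ?label WHERE {\n"
        f'  ?item rdfs:label "{escaped}"@nl .\n'
        "  ?item rdfs:label ?label .\n"
        f"  FILTER(LANG(?label) IN ({lang_list}))\n"
        "}"
    )


def labels_by_language(sparql_json: dict) -> dict:
    """Group label values by language tag from a SPARQL JSON result document."""
    found = {}
    for row in sparql_json["results"]["bindings"]:
        label = row["label"]
        found.setdefault(label["xml:lang"], set()).add(label["value"])
    return found


if __name__ == "__main__":
    print(build_label_query("Schoppenvrouw"))
```

Matching on the exact Dutch label is the simplest possible question to ask. It can return several unrelated items with the same label, so the results still need to be rated before a translation is accepted.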


Steps in the translation process

• What do we translate?

• Research for other datasets

• Analyze the data

• Query the datasets

• Rating the results / deciding when to translate


Rating the results / deciding when to translate

• Rate the probability that each result is a correct match

• When several sources give the same translation, treat it as correct
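
One simple way to turn agreement between sources into a decision is a majority vote. The sketch below is an assumption about how such a rating could work, not the scoring used in the actual system: it normalises each proposed title, counts how many sources agree, and only accepts a translation backed by at least `min_sources` sources. The candidate titles in the usage example are made up.

```python
from collections import Counter


def normalise(title: str) -> str:
    """Make titles comparable: ignore case and extra whitespace."""
    return " ".join(title.casefold().split())


def rate_candidates(candidates: dict, min_sources: int = 2):
    """Pick the translation that most sources agree on.

    candidates maps a source name to the translation it proposed.
    Returns (translation or None, score), where score is the share of
    sources that agree with the most common translation.
    """
    if not candidates:
        return None, 0.0
    votes = Counter(normalise(title) for title in candidates.values())
    winner, count = votes.most_common(1)[0]
    score = count / len(candidates)
    if count < min_sources:
        return None, score  # not enough agreement: leave untranslated
    # Return the first original spelling that matches the winning form.
    for title in candidates.values():
        if normalise(title) == winner:
            return title, score
    return None, score


if __name__ == "__main__":
    proposals = {  # made-up candidates for illustration
        "wikidata": "The Queen of Spades",
        "musicbrainz": "the queen of  spades",
        "cantorion": "Pique Dame",
    }
    print(rate_candidates(proposals))
```

Titles that fall below the threshold are returned as `None` so they can be left for manual review instead of being published with a doubtful translation.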


Steps in the translation process

• What do we translate?

• Research for other datasets

• Analyze the data

• Query the datasets

• Rating the results / deciding when to translate

• Store proposed translation including decision attributes
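
Storing the decision attributes next to the proposed translation makes it possible to review or re-rate a title later. The record layout below is a hypothetical sketch; the field names are assumptions and the slides do not describe the real storage format.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, Optional


@dataclass
class ProposedTranslation:
    """A proposed translation plus the attributes the decision was based on."""

    title_nl: str                 # original Dutch uniform title
    language: str                 # target language code, e.g. "en"
    translation: Optional[str]    # None when no translation was accepted
    score: float                  # rating of the winning candidate
    accepted: bool                # whether the translation may be published
    sources: Dict[str, str] = field(default_factory=dict)  # source -> candidate


def to_json(record: ProposedTranslation) -> str:
    """Serialise a record; non-ASCII characters are kept readable."""
    return json.dumps(asdict(record), ensure_ascii=False, sort_keys=True)


if __name__ == "__main__":
    record = ProposedTranslation(
        title_nl="Schoppenvrouw, op.68",
        language="fr",
        translation="Dame de pique",
        score=1.0,  # made-up value for illustration
        accepted=True,
        sources={"example-source": "Dame de pique"},  # made-up source name
    )
    print(to_json(record))
```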


System design


Results

• 10,000 most popular titles: 95% accuracy

• Translation was spread over 16 hours to avoid overloading the remote systems
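
Taking these figures at face value, 10,000 titles in 16 hours works out to roughly one title every 5.76 seconds. The sketch below shows that arithmetic and a fixed-delay loop as one possible way to pace requests; whether the real system uses a fixed delay is not stated, and the function names are assumptions.

```python
import time


def delay_per_title(total_hours: float, n_titles: int) -> float:
    """Seconds available per title when spreading n_titles over total_hours."""
    return total_hours * 3600 / n_titles


def run_throttled(titles, handle, delay_seconds, sleep=time.sleep):
    """Call handle(title) for every title, pausing between consecutive calls."""
    results = []
    for index, title in enumerate(titles):
        if index > 0:
            sleep(delay_seconds)  # no pause needed before the first title
        results.append(handle(title))
    return results


if __name__ == "__main__":
    print(delay_per_title(16, 10_000))  # seconds per title
    # Demo with a tiny delay so the example finishes quickly.
    print(run_throttled(["a", "b", "c"], str.upper, 0.01))
```

Passing `sleep` as a parameter keeps the loop testable without real waiting.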


Thank you

Ingmar Vroomen, MUZIEKWEB

[email protected]

Casper Karreman, MUZIEKWEB

[email protected]