Vector Quantized Neural Networks for Acoustic Unit Discoverynortje+kamper... · 2021. 2. 15. ·...

Vector Quantized Neural Networks for Acoustic Unit

Discovery

Benjamin van Niekerk, Leanne Nortje, Herman Kamper

The Generative Factors of Speech

Content:● Discrete phonetic units.● ≅44 phonemes in English.

HH / Y / UW / M / ER

Prosody:● Rhythm● Intonation● Stresses

Timbre:● Quality of a particular voice.● Characterized by frequency

spectrum.

HUMOUR

spectrum.

HUMOUR

spectrum.

HUMOUR

spectrum.

HUMOUR

HH / Y UW/ M/ ER/

spectrum.

HH / Y UW/ M/ ER/

spectrum.

HH / Y UW/ M/ ER/

spectrum.

HH / Y UW/ M/ ER/

spectrum.

What is Acoustic Unit Discovery?

The goal is to learn discrete representations of speech that separate phonetic content from the other factors.…all without any labels or annotations!

Encoder

Applications

Bootstrap training of low-resource speech systems:

Automatic speech recognition

Text-to-speech

Non-parallel voice conversion

Applications

Text-to-speech

Applications

Text-to-speech

Applications

Text-to-speech

But, how do we learn discrete representations using neural networks?

A. van den Oord, O. Vinyals, and K. Kavukcuoglu. “Neural discrete representation learning.” Advances in Neural Information Processing Systems. 2017.

Vector Quantization Layer

Codebook

Encoder

Codebook

Encoder

Codebook

Encoder

Codebook

Encoder

Codebook

Encoder

Codebook

Encoder

Codebook

Encoder

Codebook

Encoder

Codebook

Our contribution: we propose and compare two models for acoustic unit discovery in the ZeroSpeech 2020 Challenge.

A Vector-Quantized Variational Autoencoder (VQ-VAE)1. A combination of Vector-Quantization and

Contrastive Predictive Coding (VQ-CPC)2.

Encoder

Decoder

Inspired by: J. Chorowski, et al. “Unsupervised speech representation learning using wavenet autoencoders.” IEEE/ACM transactions on audio, speech, and language processing. 2019.

A Vector-Quantized Variational Autoencoder (VQ-VAE)1. A combination of Vector-Quantization and

Contrastive Predictive Coding (VQ-CPC)2.

Encoder

Decoder

A combination of Vector-Quantization and Contrastive Predictive Coding (VQ-CPC)2.A Vector-Quantized Variational

Autoencoder (VQ-VAE)1.

Encoder

Decoder

Inspired by: A. van den Oord, et al. “Representation Learning with Contrastive Predictive Coding.” 2018.

Vector-Quantized Variational Autoencoder

Encoder

Decoder

minimize reconstruction error

Encoder

Decoder

Encoder

Decoder

Information bottleneck

Encoder

Decoder

Speaker

Encoder

Decoder

SpeakerPowerful autoregressive model

Vector-Quantized Contrastive Predictive Coding

Prediction

Encoder

VQ layer

Encoder

VQ layer

Context model

Encoder

VQ layer

Context model

Predictions

Context vector

Positive example

Context vector

Positive example

Negative examples

Context vector

Positive example

Negative examples

Context vector

Positive example

Negative examples

Evaluation - Voice Conversion

Encoder

Decoder

Evaluation Metrics:● Speaker similarity (1-5 scale).● Intelligibility (character error rate).● Mean opinion score (1-5 scale).

Encoder

Decoder

Evaluation Metrics:● Speaker similarity (1-5 scale).● Intelligibility (character error rate).● Mean opinion score (1-5 scale).

Source Converted Target Other Conversion

Evaluation - ABX Score

Triphone A:

Encoder

Triphone A:

Triphone B:

Encoder Encoder

Triphone A:

Triphone B:

Triphone X:

Encoder Encoder Encoder

Triphone A:

Triphone B:

Triphone X:

Encoder Encoder Encoder

Questions?

Vector Quantized Variational Autoencoder

log-Mel spec

conv3(768)batchnorm

conv4stride2(768)batchnorm

conv3(768)batchnorm

Encoder linear(64) VQ(512)

Bottleneck

jitter(0.5) embedding

Decoder

concat

upsample

biGRU(128)biGRU(128)

upsample

GRU(896)

linear(256)ReLU

softmaxsamplemu-law

embedding

speaker

Vector Quantized Neural Networks for Acoustic Unit Discoverynortje+kamper... · 2021. 2. 15. ·...

Documents

Vocabularytree - MICC · 2013-12-18 · positions. A descriptor vector is computed for each region. The descriptor vector is then hierarchically quantized by the vocabulary tree

Quantized Resistance - The Budker Groupbudker.berkeley.edu/Physics141_2013/Zhifan He and Huimin Yang... · Quantized resistance is quantized hall resistance. Quantized resistance

GEORGE KAMPER PORTFOLIO

Quantized Energy of Light

Stabilizing Nonuniformly Quantized

The West by George Kamper

SICILIJA – april, maj 2008 (Rosa) - Kamper Riki

Notes5 Bandstructure Quantized States

George Kamper Realestate Book

Stabilizing Nonuniformly Quantized - arXiv

Bohr quantized the atom…

ALMANAK 2012 Kamper Almanak inhoud.pdf · 2016. 1. 13. · 2012 2012 Cultuur Historisch Jaarboek Kamper Almanak. Kamper Almanak 2012 ... Henk Dorgelo: Witte ooievaars rondom Kampen

Product Quantized Collaborative Filtering

Training Quantized Nets: A Deeper Understandingpapers.nips.cc/paper/7163-training-quantized-nets-a-deeper-understanding.pdfTraining Quantized Nets: A Deeper Understanding Hao Li 1,

Quantized (or not quantized) thermal Hall effect and

KAMPER KRONIEK 1971 - Start pagina - De Kamper Almanakkamperalmanak.nl/downloads/almanak1972/kamper_almanak... · 2016-01-06 · KAMPER KRONIEK 1971 10 september. De kranten bespreken

Quantized electric multipole insulators

George Kamper Lifestyle Portfolio

Photon and Quantized Enrgy

Quantized Hall effect