RTTH Summer School on Speech Technology July 6th -9th...

RTTH Summer School on Speech Technology: A Deep Learning Perspective

July 6th - 9th, 2015, Barcelona, Spain

TALP Research Center, Signal Theory and Communications Department

Universitat Politècnica de Catalunya - BarcelonaTech

Outline

o State-of-the-art in Speaker Recognition

o Deep Learning

o DNNs for Modeling i-Vectors in Speaker Recognition

o Experimental Results

o Conclusions

Deep Neural Networks for Speaker Recognition

Background: State-of-the-art

What is Supervector? What is i-Vector?

Speech feature vectors (33) → MAP-adapted GMM (512) → Supervector (33×512) → i-Vector (400)

An i-vector is a compact representation of a speech signal.

How to go from a supervector to an i-vector?

- Using a factor analysis approach:

$m = m' + Tw$

where m is the supervector, T is the total variability matrix, and w is the i-vector.

o m is assumed to be normally distributed with mean vector $m'$ and covariance matrix $TT^t$

o w is a hidden variable which can be defined by the mean of its posterior distribution conditioned on the Baum-Welch statistics for a given utterance.
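The posterior mean mentioned above can be sketched in NumPy. This is a minimal illustration using the standard closed form from the i-vector literature, not necessarily the exact extractor used for these slides. The diagonal covariance `sigma_diag`, the variable names, and the toy dimensions are assumptions introduced for illustration; `m_prior` plays the role of the mean vector in the slide.

```python
import numpy as np

def extract_ivector(T, sigma_diag, N_c, F_c, m_prior, feat_dim):
    """Posterior mean of w given Baum-Welch statistics (illustrative sketch).

    T          : (C*D, R) total variability matrix
    sigma_diag : (C*D,)   diagonal covariance supervector (assumed diagonal)
    N_c        : (C,)     zeroth-order statistics, one count per Gaussian
    F_c        : (C, D)   first-order statistics, one vector per Gaussian
    m_prior    : (C*D,)   mean supervector
    """
    # Center the first-order statistics around the mean supervector.
    F_centered = (F_c - N_c[:, None] * m_prior.reshape(-1, feat_dim)).ravel()
    # Expand the per-Gaussian counts to supervector size.
    N_expanded = np.repeat(N_c, feat_dim)
    T_scaled = T / sigma_diag[:, None]  # Sigma^{-1} T for a diagonal Sigma
    # Posterior precision: I + T^t Sigma^{-1} N T
    precision = np.eye(T.shape[1]) + T.T @ (N_expanded[:, None] * T_scaled)
    # Posterior mean: precision^{-1} T^t Sigma^{-1} F_centered
    return np.linalg.solve(precision, T_scaled.T @ F_centered)

# Toy dimensions (hypothetical): 4 Gaussians, 3-dim features, 2-dim i-vector.
rng = np.random.default_rng(0)
C, D, R = 4, 3, 2
T = rng.normal(size=(C * D, R))
sigma_diag = np.ones(C * D)
m_prior = rng.normal(size=C * D)
N_c = np.array([5.0, 3.0, 0.0, 2.0])
F_c = N_c[:, None] * m_prior.reshape(C, D) + rng.normal(size=(C, D))
w = extract_ivector(T, sigma_diag, N_c, F_c, m_prior, D)
print(w.shape)  # (2,)
```

With the dimensions shown earlier in the slides, T would map a 400-dimensional w to a 33×512-dimensional supervector; the toy sizes here only keep the example fast.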

Background: Deep Learning

Deep Neural Network (DNN):

A feed-forward artificial neural network with multiple layers of hidden units

How to train?

Back-propagation algorithm, given input vectors and a class label for each input

How to initialize?

o Small random numbers

o Deep Belief Network (DBN) parameters

o Auto-encoders, …

Deep Belief Network (DBN):

A probabilistic generative model composed of many layers of hidden units above a visible input layer

How to train?

Using Restricted Boltzmann Machines (RBMs), layer by layer

How to initialize?

Small random numbers

DBN Training / DNN Pre-Training:

o Every two adjacent layers are considered as an RBM

o Outputs of the first RBM are given to the next RBM as inputs

o The process is repeated until the top two layers are reached
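The layer-by-layer procedure above can be sketched as follows. This is a toy illustration under stated assumptions: the per-layer trainer is a very small full-batch CD-1 loop for binary units, and the names `train_rbm` and `pretrain_dbn`, the layer sizes, and the hyperparameters are hypothetical. Real-valued inputs such as i-vectors would normally need a Gaussian visible layer, which is not shown here.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_rbm(data, n_hidden, epochs=5, lr=0.05, rng=None):
    """Tiny full-batch CD-1 trainer for a binary RBM (toy version)."""
    if rng is None:
        rng = np.random.default_rng(0)
    n_visible = data.shape[1]
    W = 0.01 * rng.normal(size=(n_visible, n_hidden))
    b_vis, b_hid = np.zeros(n_visible), np.zeros(n_hidden)
    for _ in range(epochs):
        h_prob = sigmoid(data @ W + b_hid)
        h_sample = (rng.random(h_prob.shape) < h_prob).astype(float)
        v_recon = sigmoid(h_sample @ W.T + b_vis)
        h_recon = sigmoid(v_recon @ W + b_hid)
        W += lr * (data.T @ h_prob - v_recon.T @ h_recon) / len(data)
        b_vis += lr * (data - v_recon).mean(axis=0)
        b_hid += lr * (h_prob - h_recon).mean(axis=0)
    return W, b_hid

def pretrain_dbn(data, layer_sizes):
    """Train one RBM per pair of adjacent layers, from bottom to top."""
    params, layer_input = [], data
    for n_hidden in layer_sizes:
        W, b_hid = train_rbm(layer_input, n_hidden)
        params.append((W, b_hid))
        # Outputs of this RBM become the inputs of the next one.
        layer_input = sigmoid(layer_input @ W + b_hid)
    return params

rng = np.random.default_rng(1)
toy_data = (rng.random((20, 8)) > 0.5).astype(float)
dbn_params = pretrain_dbn(toy_data, layer_sizes=[6, 4])
print([W.shape for W, _ in dbn_params])  # [(8, 6), (6, 4)]
```

The returned weights and hidden biases are what a DNN of the same shape could be initialized with before supervised training.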

Restricted Boltzmann Machine (RBM):

A generative undirected model constructed from two layers of hidden and visible units

How to train?

o Stochastic gradient descent

o Gradient is approximated by the Contrastive Divergence (CD) algorithm

[Figure: RBM training. CD-1 steps and the network parameter update]
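A single CD-1 update can be sketched as below. This follows the commonly used form of the algorithm for binary units and is an illustration rather than a reproduction of the slide's equations. The function name `cd1_update`, the learning rate, and the toy data are assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(v0, W, b_vis, b_hid, lr, rng):
    """One CD-1 step on a minibatch v0 of shape (batch, n_visible)."""
    # Positive phase: hidden probabilities and a binary sample given the data.
    h0_prob = sigmoid(v0 @ W + b_hid)
    h0 = (rng.random(h0_prob.shape) < h0_prob).astype(float)
    # Negative phase: one Gibbs step down to the visible layer and up again.
    v1 = sigmoid(h0 @ W.T + b_vis)
    h1_prob = sigmoid(v1 @ W + b_hid)
    # Parameter update: data statistics minus reconstruction statistics.
    batch = v0.shape[0]
    W = W + lr * (v0.T @ h0_prob - v1.T @ h1_prob) / batch
    b_vis = b_vis + lr * (v0 - v1).mean(axis=0)
    b_hid = b_hid + lr * (h0_prob - h1_prob).mean(axis=0)
    return W, b_vis, b_hid

# Toy usage on random binary data (hypothetical sizes).
rng = np.random.default_rng(0)
v0 = (rng.random((16, 6)) > 0.5).astype(float)
W = 0.01 * rng.normal(size=(6, 4))
b_vis, b_hid = np.zeros(6), np.zeros(4)
for _ in range(10):
    W, b_vis, b_hid = cd1_update(v0, W, b_vis, b_hid, lr=0.1, rng=rng)
print(W.shape)  # (6, 4)
```

Using the reconstruction after one Gibbs step instead of samples from the model's equilibrium distribution is what makes the gradient an approximation.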

Background: Deep Learning (Summary)

[Figure: summary diagram linking RBM (RBM training), DBN (DBN training / DNN pre-training), and DNN]

Deep Learning for Modeling i-Vectors

Goal:

Training a discriminative model for each target speaker

What do we have?

o One i-vector (single session) or a few i-vectors (multi session) per target speaker

o A large number of background i-vectors (impostors)

Problems:

o Unbalanced data: bias towards the majority class

o Few data: overfitting

Our Proposal:

o Balanced training

- Impostor selection and clustering

- Distributing impostor and target samples equally among minibatches

o DBN Adaptation

- Take advantage of unsupervised learning of a DBN on the whole background data, called the Universal DBN (UDBN)

- Adapt the UDBN to the few data of each speaker

[Figure: block diagram of the proposed system. Step 1 (Balanced Training): impostor selection and clustering on the background i-vectors, followed by minibatch balance with the target i-vectors. Step 2 (Adaptation): DBN adaptation starting from the UDBN trained on the background i-vectors. Step 3: DNN trained with target/impostor labels, giving the discriminative speaker model.]

Step 1: Balanced Training

Problem:

o A large number of impostor data (negative samples)

o Very few target data (positive samples)

Solutions:

o Global impostor selection

o Clustering using K-means (cosine distance criterion)

o Equally distributing positive and negative samples among minibatches
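The clustering step can be sketched as K-means on length-normalized vectors, where the assignment uses cosine similarity. This is a minimal sketch and not the authors' implementation: the initialization, the iteration count, the toy sizes, and the function name `cosine_kmeans` are assumptions, and the impostor selection step that precedes clustering is not shown.

```python
import numpy as np

def cosine_kmeans(vectors, n_clusters, n_iter=20, seed=0):
    """K-means with a cosine distance criterion (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    # Length-normalize so that a dot product equals cosine similarity.
    unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    centroids = unit[rng.choice(len(unit), size=n_clusters, replace=False)]
    for _ in range(n_iter):
        # Assign each vector to the centroid with the highest cosine similarity.
        labels = np.argmax(unit @ centroids.T, axis=1)
        for k in range(n_clusters):
            members = unit[labels == k]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                mean = members.mean(axis=0)
                centroids[k] = mean / np.linalg.norm(mean)
    # Final assignment, consistent with the returned centroids.
    labels = np.argmax(unit @ centroids.T, axis=1)
    return centroids, labels

# Toy usage: 200 random "impostor" vectors of size 400 reduced to 10 centroids.
rng = np.random.default_rng(0)
impostors = rng.normal(size=(200, 400))
centroids, labels = cosine_kmeans(impostors, n_clusters=10)
print(centroids.shape)  # (10, 400)
```

The centroids then stand in for the impostor class, which reduces the number of negative samples to something comparable with the few target samples.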


Step 2: Adaptation

o Universal DBN (unsupervised learning using background i-vectors)

o Unsupervised adaptation

- Initialize networks with the UDBN parameters

- Unsupervised learning using balanced data with few iterations

Step 3: Fine-Tuning

o Supervised learning given impostor and target labels, adapted DBN, and balanced data

Minibatch Balance

In each minibatch, we show the network the same target samples but different impostor centroids.
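A minimal sketch of this minibatch construction follows. It assumes that each minibatch pairs all target samples with an equally sized, different group of impostor centroids, and that targets are labeled 1 and impostors 0. The function name `balanced_minibatches` and the toy sizes are hypothetical.

```python
import numpy as np

def balanced_minibatches(target_vecs, impostor_centroids):
    """Yield (inputs, labels) minibatches with equal target and impostor counts."""
    n_target = len(target_vecs)
    # Walk through the centroids in consecutive groups, one group per minibatch.
    for start in range(0, len(impostor_centroids) - n_target + 1, n_target):
        impostors = impostor_centroids[start:start + n_target]
        # Same target samples every time, different impostor centroids.
        inputs = np.vstack([target_vecs, impostors])
        labels = np.concatenate([np.ones(n_target), np.zeros(n_target)])
        yield inputs, labels

# Toy usage: 2 target i-vectors and 10 impostor centroids of size 400.
rng = np.random.default_rng(0)
targets = rng.normal(size=(2, 400))
centroids = rng.normal(size=(10, 400))
batches = list(balanced_minibatches(targets, centroids))
print(len(batches), batches[0][0].shape)  # 5 (4, 400)
```

Because every minibatch is half target and half impostor, the gradient is not dominated by the majority class.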

DBN Adaptation

DBN adaptation sets a speaker-specific initial point for each speaker model

Experimental Setup

o Databases

NIST SRE 2006 core test condition (Single session)

816 target speakers, 51,068 trials

NIST SRE 2006, multi session task (8 samples per target speaker)

699 target speakers, 31,080 trials

o i-vector size = 400

o Post-processing on i-vectors

Mean Normalization + Whitening

o Hidden layer size = 512
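The i-vector post-processing listed above (mean normalization followed by whitening) can be sketched as below. It assumes the mean and the whitening transform are estimated on a set of background vectors, which the slides do not state explicitly. The function names and the toy data are hypothetical.

```python
import numpy as np

def fit_whitening(background):
    """Estimate the mean and a whitening transform from background vectors."""
    mean = background.mean(axis=0)
    cov = np.cov(background, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Scale each eigenvector by 1/sqrt(eigenvalue); the floor avoids division by zero.
    whitener = eigvecs / np.sqrt(np.maximum(eigvals, 1e-10))
    return mean, whitener

def apply_whitening(vectors, mean, whitener):
    """Subtract the mean, then rotate and scale to unit covariance."""
    return (vectors - mean) @ whitener

# Toy usage with correlated 5-dimensional data (i-vectors would be 400-dimensional).
rng = np.random.default_rng(0)
mixing = rng.normal(size=(5, 5)) + 3.0 * np.eye(5)
background = rng.normal(size=(500, 5)) @ mixing
mean, whitener = fit_whitening(background)
whitened = apply_whitening(background, mean, whitener)
print(whitened.shape)  # (500, 5)
```

After this transform the background vectors have zero mean and an identity covariance matrix.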

Experimental Results (Single Session Task)

Baseline: i-Vector + cosine (EER = 7.18, minDCF = 324)

Experimental Results (Multi Session Task)

Baseline: i-Vector + cosine (EER = 4.20, minDCF = 191)
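The cosine scoring used by the baselines compares a target i-vector with a test i-vector through the cosine of the angle between them. A minimal sketch follows; the function name and the toy vectors are assumptions, and how multiple target i-vectors are combined in the multi session task is not covered here.

```python
import numpy as np

def cosine_score(target_ivector, test_ivector):
    """Cosine similarity between two i-vectors; higher means more similar."""
    numerator = target_ivector @ test_ivector
    denominator = np.linalg.norm(target_ivector) * np.linalg.norm(test_ivector)
    return float(numerator / denominator)

same_direction = cosine_score(np.array([1.0, 0.0]), np.array([2.0, 0.0]))
orthogonal = cosine_score(np.array([1.0, 0.0]), np.array([0.0, 3.0]))
print(same_direction, orthogonal)  # 1.0 0.0
```

A trial is accepted when the score exceeds a decision threshold, and sweeping that threshold is what yields metrics such as EER and minDCF.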

Conclusion

o Discriminative modeling of target and impostor i-vectors using DNNs

o Adaptation of network parameters of each speaker from a background model called UDBN

o Decreasing the number of impostor i-vectors by the proposed impostor selection method

o The proposed systems outperform the baselines by more than 8% and 17% in the single and multi session tasks, respectively

Omid Ghahabi

omid.ghahabi@upc.edu
