Deep Learning for Speech Recognition

Vikrant Tomar

Founder, Fluent.ai

vt@fluent.ai

We are hiring!

Outline

- Introduction

- General overview of speech recognition framework

- Conventional GMM-HMM based systems

- Deep neural networks in speech

- ConvNets

- RNNs/LSTMs and End-to-end learning

- New interesting stuff

Intro 1: What is speech recognition?

- Dream: A machine should be able to develop a functional equivalent of the speaker’s intended message as effortlessly as humans can.

- In other words: The goal is to find the most likely sequence of symbols, such as words or sub-word speech units, from a stream of acoustic data.

Intro 2: How is deep learning for speech different from deep learning for images?

- Speech is a temporal signal: the order of the samples carries information

- A one-dimensional signal that carries many kinds of information:

- Speaker

- Accent and language

- Age and health

- Environment

- Issues:

- Noise and background conditions

- Accents

- Recording devices

Overview: Statistical Framework for speech recognition

- Formally, an ASR system maps the sequence of observation vectors, X, to the optimum sequence of words, Ŵ:
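
A common way to write this mapping, assuming the usual Bayes decomposition into an acoustic model p(X | W) and a language model P(W) (these two factor names are the standard textbook ones and are an assumption here, not taken from the slide):

```latex
\hat{W} = \arg\max_{W} P(W \mid X)
        = \arg\max_{W} \frac{p(X \mid W)\, P(W)}{p(X)}
        = \arg\max_{W} p(X \mid W)\, P(W)
```

The denominator p(X) can be dropped in the last step because it does not depend on W.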

Overview 2: System Architecture

System Architecture: Feature extraction & spectrogram
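
As a rough illustration of this stage, the sketch below frames a signal, applies a Hamming window, and takes the log power of each frame's DFT to form a spectrogram. The function names, the frame length of 64 samples, and the hop of 32 samples are toy assumptions for illustration, not the settings used in the talk; a practical front end would use an FFT library and typically add mel filtering.

```python
import cmath
import math


def frame_signal(samples, frame_len, hop):
    """Split a list of samples into overlapping frames of frame_len samples."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def log_power_spectrum(frame):
    """Hamming-window one frame and return its log power at bins 0..N/2."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        # Naive O(N^2) DFT, kept for readability; real systems use an FFT.
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(windowed))
        # Small constant avoids log(0) on silent bins.
        spectrum.append(math.log(abs(coeff) ** 2 + 1e-10))
    return spectrum


def spectrogram(samples, frame_len=64, hop=32):
    """Return one log power spectrum per frame (a time x frequency grid)."""
    return [log_power_spectrum(f)
            for f in frame_signal(samples, frame_len, hop)]


# Toy input: a pure tone that sits exactly on DFT bin 8 of a 64-sample frame.
tone = [math.sin(2 * math.pi * 8 * t / 64) for t in range(256)]
spec = spectrogram(tone)
# Index of the strongest frequency bin in each frame.
peak_bins = [max(range(len(row)), key=row.__getitem__) for row in spec]
```

Each row of `spec` is one time frame and each column one frequency bin, which is the layout a spectrogram image displays.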

GMM-HMM based systems
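
In a GMM-HMM system the GMMs score how well each frame fits each HMM state, and decoding searches for the best state sequence. A minimal sketch of that search (Viterbi decoding in the log domain) is given below. The function name `viterbi`, the two-state left-to-right model, and all probabilities are made-up toy values; the emission scores stand in for what per-state GMMs would produce.

```python
import math


def safe_log(p):
    """Natural log that maps probability 0 to -inf instead of raising."""
    return math.log(p) if p > 0 else float("-inf")


def viterbi(log_init, log_trans, log_emit):
    """Return the most likely state path and its log score.

    log_init[s]     : log P(first state = s)
    log_trans[r][s] : log P(next state = s | current state = r)
    log_emit[t][s]  : log p(frame t | state s), e.g. from a per-state GMM
    """
    n_states = len(log_init)
    score = [log_init[s] + log_emit[0][s] for s in range(n_states)]
    backptr = []
    for t in range(1, len(log_emit)):
        new_score, pointers = [], []
        for s in range(n_states):
            # Best predecessor of state s at time t.
            best_prev = max(range(n_states),
                            key=lambda r: score[r] + log_trans[r][s])
            pointers.append(best_prev)
            new_score.append(score[best_prev] + log_trans[best_prev][s]
                             + log_emit[t][s])
        score = new_score
        backptr.append(pointers)
    # Trace back from the best final state.
    last = max(range(n_states), key=score.__getitem__)
    path = [last]
    for pointers in reversed(backptr):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, score[last]


# Toy left-to-right HMM: start in state 0, may move to state 1, never back.
init = [safe_log(p) for p in (1.0, 0.0)]
trans = [[safe_log(p) for p in row] for row in ((0.6, 0.4), (0.0, 1.0))]
emit = [[safe_log(p) for p in row]
        for row in ((0.9, 0.1), (0.8, 0.2), (0.2, 0.7), (0.1, 0.9))]

best_path, best_log_score = viterbi(init, trans, emit)
```

With these toy numbers the best path stays in state 0 for two frames and then moves to state 1, matching the frames where each state's emission score is highest.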

Deep neural networks in speech

- A few different approaches

- Tandem

- Hybrid

- End-to-end

- Old but new

Tandem DNN: DNN -- GMM -- HMM

Hybrid DNN - HMM

- Good source:

Hinton et al., Deep neural networks for acoustic modeling in speech recognition, 2012.
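
In a hybrid system the DNN outputs state posteriors P(s | x), while the HMM decoder expects likelihoods p(x | s). A common bridge, sketched below, is to divide each posterior by the state prior to get a scaled likelihood. The function name `scaled_log_likelihoods`, the floor value, and the example numbers are hypothetical and chosen only for illustration.

```python
import math


def scaled_log_likelihoods(posteriors, priors, floor=1e-10):
    """Convert DNN state posteriors P(s|x) into log scaled likelihoods.

    By Bayes' rule p(x|s) = P(s|x) p(x) / P(s). The term p(x) is the same
    for every state at a given frame, so log P(s|x) - log P(s) is enough
    for comparing states during decoding.
    """
    return [math.log(max(post, floor)) - math.log(max(prior, floor))
            for post, prior in zip(posteriors, priors)]


# Hypothetical softmax output for one frame over three HMM states, and
# hypothetical state priors (in practice estimated from training alignments).
frame_posteriors = [0.70, 0.20, 0.10]
state_priors = [0.80, 0.15, 0.05]

scores = scaled_log_likelihoods(frame_posteriors, state_priors)
best_state = max(range(len(scores)), key=scores.__getitem__)
```

Here the third state scores highest even though its posterior is the lowest, because dividing by the prior removes the bias toward states that are simply frequent in the training data.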

Hybrid CNN - HMM

- Good source: Abdel-Hamid et al., Convolutional neural networks for speech recognition

Hybrid CNN - HMM -- Partial weight sharing

Some benchmarks

RNNs and End-to-end models

- RNN:

- A good fit because they are sequential models

- However, plain RNNs struggle to capture long-term dependencies

- Vanishing gradients

- Solutions: LSTMs and GRUs

- End-to-end models have a simpler overall architecture

- CTC : Connectionist temporal classification

A. Graves and N. Jaitly, “Towards End-to-End Speech Recognition with Recurrent Neural Networks”, 2014
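
To make the CTC output convention concrete, the sketch below shows the collapse rule (merge consecutive repeats, then drop blanks) and a greedy best-path decoder built on it. The `_` blank symbol, the function names, and the toy probabilities are assumptions for illustration; best-path decoding is only the simplest option, and beam search is commonly used instead.

```python
from itertools import groupby

BLANK = "_"  # assumed symbol for the CTC blank label


def ctc_collapse(labels, blank=BLANK):
    """Map a frame-level label sequence to its output string:
    merge consecutive repeats first, then drop blanks."""
    merged = [label for label, _ in groupby(labels)]
    return "".join(label for label in merged if label != blank)


def ctc_greedy_decode(frame_probs, alphabet, blank=BLANK):
    """Pick the most probable label in each frame, then collapse."""
    best = [alphabet[max(range(len(p)), key=p.__getitem__)]
            for p in frame_probs]
    return ctc_collapse(best, blank)


# A blank between the two "l" runs keeps the double letter in "hello".
collapsed = ctc_collapse("hh_e_ll_lo")

# Toy per-frame distributions over the alphabet [blank, "a", "b"].
alphabet = [BLANK, "a", "b"]
frame_probs = [
    [0.1, 0.8, 0.1],    # a
    [0.2, 0.7, 0.1],    # a (repeat, merged with the previous frame)
    [0.9, 0.05, 0.05],  # blank
    [0.1, 0.8, 0.1],    # a (kept, because a blank separates it)
    [0.1, 0.2, 0.7],    # b
]
decoded = ctc_greedy_decode(frame_probs, alphabet)
```

The blank label is what lets the network emit repeated characters and stay silent between symbols without needing a frame-level alignment in the training data.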

New interesting stuff

- Baidu Deep Speech: Uses bi-directional RNNs to map speech directly to characters

- IBM 2015/2016 and Microsoft 2016: Deep CNNs with 3 x 3 kernels, similar to VGG net etc.

- CLDNN: Conv + LSTMs + Fully Connected

Baidu Lab: Deep Speech, 2014, and Deep Speech 2, 2015

Sainath et al., Convolutional, Long Short-Term Memory, Fully Connected Deep Neural Networks, 2015

Xiong et al., The Microsoft 2016 Conversational Speech Recognition System, 2016

Saon et al., The IBM 2015/16 English Conversational Telephone Speech Recognition System, 2015/16

Conclusion and resources

- Lots of exciting stuff; most concepts are similar to those in other deep learning communities

- Good starting point: http://www.recognize-speech.com

- You can use any toolbox you like to start:

- TensorFlow, Torch, Theano etc.

- Kaldi, CURRENNT

- Older stuff: CMU-Sphinx, RWTH-ASR, HTK

- Free(-ish) datasets: http://www.openslr.org/resources.php

- Contact: vt@fluent.ai (Hiring Scientists)

