Goals: Downsample 20 kHz TIDigits data to 16 kHz

Temple University

Goals:

1. Downsample the 20 kHz TIDigits data to 16 kHz.

2. Use the downsampled data to run the regression test and compare the results with those posted on the Sphinx-4 website.

3. Based on the results, make decisions (issues with the microprocessor, floating point, etc.).
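
The first goal can be sketched as follows. This is a minimal sketch, assuming SoX is installed and can read the input files directly; the function names and file paths are hypothetical and not taken from the project. Going from 20 kHz to 16 kHz is a rational rate change of 4/5, and the script only builds and runs a standard `sox input -r 16000 output` command.

```python
from fractions import Fraction
import subprocess

SRC_RATE_HZ = 20_000  # original TIDigits sampling rate
DST_RATE_HZ = 16_000  # target sampling rate


def resample_ratio(src_hz=SRC_RATE_HZ, dst_hz=DST_RATE_HZ):
    """Return the rate change as a reduced fraction (output rate / input rate)."""
    return Fraction(dst_hz, src_hz)


def approx_output_length(n_samples, src_hz=SRC_RATE_HZ, dst_hz=DST_RATE_HZ):
    """Approximate sample count after resampling; the exact count depends on the resampler."""
    return (n_samples * dst_hz) // src_hz


def build_sox_command(src_path, dst_path, dst_hz=DST_RATE_HZ):
    """Build the SoX argument list; '-r' placed before the output file sets its rate."""
    return ["sox", src_path, "-r", str(dst_hz), dst_path]


def downsample(src_path, dst_path, dst_hz=DST_RATE_HZ):
    """Run SoX on one file (assumes 'sox' is on the PATH); raises if SoX fails."""
    subprocess.run(build_sox_command(src_path, dst_path, dst_hz), check=True)
```

Calling `downsample("in.wav", "out.wav")` would convert a single file; looping over the corpus directory is left out to keep the sketch short.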

By Jaykrishna Shukla, Mubin Ahmed and Cara Santin

Learned:

1. Cygwin is not effective for running SoX.

2. The Linux command-line interface is effective for building the application.

3. Easy to install.

Training an Acoustic Model Using SphinxTrain

URL:

http://www.isip.piconepress.com/publications/conferences/temple/2010/ks_prediction/


Introduction to Training

Q1. What is an acoustic model?
A1. A model used by a speech recognizer to decode the language spoken by a person. It represents numerically how the language sounds when spoken, in a form that can be stored on a computer.

Q2. What is training?
A2. A process that aims to converge on a solution yielding the most likely sequence of vectors for a given acoustic unit.

Q3. Why is training required?
A3. Generating a set of acoustic models for any audio data requires following a particular set of steps, collectively called training. Hence, training is required to generate an acoustic model.

Training an acoustic model using SphinxTrain 1.0: overview

• The flow chart for the training procedure [figure not reproduced]

SphinxTrain 1.0 & auto generation

• The new version of SphinxTrain has a build-all option that generates all the required files shown in the flow chart on the previous slide. However, to perform a task-specific function, one needs to modify the config file according to the purpose of the task.

This week's accomplishments

• The two major goals that I achieved this week were:

• Finished the complete training process for the an4 demo.

• Worked on generating the feature model for the TIDigits short test data.

• Sample output of a training process (the run took more than 20 minutes to complete)

Generating the feature vectors

• There are two main steps in generating the feature vectors:
• 1. Generate the .fileids file (it is just the list of paths to all the data files).
• 2. Modify make_feats (a Perl script) so that it reads in the correct data, and change the default settings that SphinxTrain comes with.
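
Step 1 can be sketched as below. This is a minimal sketch, not the script used in the project: it assumes the .fileids file lists one path per line, relative to the audio directory and with the file extension removed, as in the an4 tutorial data. The function names, the directory layout and the `.wav` extension are hypothetical.

```python
import os


def list_fileids(wav_dir, extension=".wav"):
    """Collect paths of audio files under wav_dir, relative and without extension."""
    ids = []
    for root, _dirs, files in os.walk(wav_dir):
        for name in files:
            stem, ext = os.path.splitext(name)
            if ext.lower() == extension.lower():
                rel = os.path.relpath(os.path.join(root, stem), wav_dir)
                # Use forward slashes so the list looks the same on every platform.
                ids.append(rel.replace(os.sep, "/"))
    return sorted(ids)


def write_fileids(wav_dir, out_path, extension=".wav"):
    """Write one file id per line to out_path and return how many were written."""
    ids = list_fileids(wav_dir, extension)
    with open(out_path, "w", encoding="utf-8") as handle:
        for file_id in ids:
            handle.write(file_id + "\n")
    return len(ids)
```

Sorting the list keeps the output stable between runs, which makes it easier to compare against the transcription file line by line.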

Conclusion and Future

• The main problem in feature generation is that the make_feats file has default settings for the an4 tutorial. Hence, to get it working we have to change the configuration of both the make_feats file and the SphinxTrain config file (because the config file determines what goes into the make_feats file). Follow the example below.
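
The original example is not preserved in this text. As a stand-in, here is a hedged sketch of how the an4 defaults in the config file could be overridden from a script. It assumes the config file is Perl-style text with assignments of the form `$NAME = 'value';`. The variable names `CFG_WAVFILE_EXTENSION` and `CFG_WAVFILE_TYPE` and the values shown are assumptions for illustration and should be checked against the actual config file. The helper `set_cfg_value` is hypothetical and handles only simple one-line assignments.

```python
import re

# Hypothetical excerpt of a config file with tutorial defaults (assumed names and values).
SAMPLE_CFG = """\
$CFG_WAVFILE_EXTENSION = 'sph';
$CFG_WAVFILE_TYPE = 'nist'; # one of nist, mswav, raw
"""


def set_cfg_value(cfg_text, name, value):
    """Replace the right-hand side of a one-line `$NAME = ...;` assignment."""
    pattern = re.compile(
        r"^([ \t]*\$" + re.escape(name) + r"[ \t]*=[ \t]*).*?;",
        re.MULTILINE,
    )
    # A function is used as the replacement so backslashes in value stay literal.
    new_text, count = pattern.subn(
        lambda match: match.group(1) + "'" + value + "';", cfg_text
    )
    if count == 0:
        raise KeyError(name)  # the setting was not found in the config text
    return new_text


patched = set_cfg_value(SAMPLE_CFG, "CFG_WAVFILE_EXTENSION", "wav")
patched = set_cfg_value(patched, "CFG_WAVFILE_TYPE", "mswav")
```

Editing the file by hand works just as well; a script like this only helps when the same changes must be repeated for several experiments.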

Jaykrishna Shukla, Mubin Ahmed and Cara Santin
Department of Electrical and Computer Engineering

Temple University

