Temple University. Goals: 1. Downsample the 20 kHz TIDigits data to 16 kHz. 2. Use the downsampled data to run the regression test and compare the results with those posted on the Sphinx-4 website. 3. Based on the results, make decisions (issues with the microprocessor, floating point, etc.). By Jaykrishna Shukla, Mubin Ahmed and Cara Santin

Goals: Downsample 20 kHz TIDigits data to 16 kHz


Temple University

Goals:

1. Downsample the 20 kHz TIDigits data to 16 kHz.

2. Use the downsampled data to run the regression test and compare the results with those posted on the Sphinx-4 website.

3. Based on the results, make decisions (issues with the microprocessor, floating point, etc.).

By Jaykrishna Shukla, Mubin Ahmed and Cara Santin
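As a rough illustration of the first goal, the sketch below computes the resampling ratio for 20 kHz to 16 kHz and assembles a SoX command line that requests a 16 kHz output. The function names and file names are hypothetical, and the sketch assumes SoX is installed and can read the input files directly; whether the TIDigits files need a format conversion first is not covered here.

```python
from fractions import Fraction


def resample_ratio(src_rate_hz, dst_rate_hz):
    """Return the output/input sample-rate ratio as a reduced fraction."""
    return Fraction(dst_rate_hz, src_rate_hz)


def build_sox_command(src_path, dst_path, dst_rate_hz=16000):
    """Build (but do not run) a SoX command that resamples src_path.

    Placing "-r <rate>" before the output file asks SoX to write the
    output at that sample rate.
    """
    return ["sox", src_path, "-r", str(dst_rate_hz), dst_path]


if __name__ == "__main__":
    # 16000 / 20000 reduces to 4/5: four output samples per five input samples.
    print(resample_ratio(20000, 16000))
    # Hypothetical file names, for illustration only.
    print(" ".join(build_sox_command("digits_20k.wav", "digits_16k.wav")))
```

To actually run the command, the returned list could be passed to `subprocess.run(cmd, check=True)` on a machine where SoX is available.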


Learned:

1. Cygwin is not effective for running SoX.

2. It is effective to use the Linux command-line interface to build the application.

3. Easy to install.

Introduction to Training

Q1. What is an acoustic model?
A1. A model used by a speech recognizer to decode the language spoken by a person. It represents numerically how the language sounds when spoken, in a form that can be stored on a computer.

Q2. What is training?
A2. A process that aims to converge on a solution yielding the most likely sequence of vectors for a given acoustic unit.

Q3. Why is training required?
A3. Generating a set of acoustic models for any audio data requires following a particular sequence of steps, and that sequence is called training. Training is therefore required to generate an acoustic model.


Training an acoustic model using SphinxTrain 1.0: overview

• [Figure: flow chart of the training procedure]

SphinxTrain 1.0 & auto generation

• The new version of SphinxTrain has a build-all option that generates all the required files shown in the flow chart on the previous slide. However, to perform task-specific functions, one needs to modify the config file according to the purpose of the task.

This week's accomplishments

• The two major goals that I achieved this week were:

• Finished the complete training process for the an4 demo.

• Worked on generating the feature vectors for the TIDigits short test data.

• Sample output of a training process (it took more than 20 min to compile this code)

Generating the feature vectors

• There are two main steps in generating the feature vectors:

• 1. Generate the .fileids file (it is just the list of paths to all the data files).

• 2. Modify make_feats (a Perl script) so that it reads in the correct data, and change the default settings that SphinxTrain comes with.
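The first step above can be sketched as a short script that walks a data directory and collects one entry per audio file. This is a minimal sketch, not the authors' script: the function name is hypothetical, and it assumes each entry should be the file's path relative to the data directory with the extension removed, which should be checked against what your SphinxTrain setup expects.

```python
import os


def build_fileids(wav_dir, extension=".wav"):
    """Return a sorted list of file IDs for every audio file under wav_dir.

    Assumption: a file ID is the path relative to wav_dir, without the
    file extension, using "/" as the separator.
    """
    entries = []
    for root, _dirs, files in os.walk(wav_dir):
        for name in files:
            if name.lower().endswith(extension):
                rel = os.path.relpath(os.path.join(root, name), wav_dir)
                entries.append(os.path.splitext(rel)[0].replace(os.sep, "/"))
    return sorted(entries)


def write_fileids(wav_dir, out_path, extension=".wav"):
    """Write one file ID per line to out_path."""
    with open(out_path, "w") as handle:
        for entry in build_fileids(wav_dir, extension):
            handle.write(entry + "\n")
```

Sorting the entries keeps the list stable between runs, which makes it easier to compare against a transcription file that lists utterances in the same order.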

Conclusion and Future

• The main problem in feature generation is that the make_feats file has default settings for the an4 tutorial. To get it working, we have to change the configuration in both the make_feats file and the SphinxTrain config file, because the config file determines what goes into the make_feats file.
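One way to make such configuration changes repeatable is to rewrite the relevant assignments in the config text with a small helper. The sketch below is an assumption-laden illustration: it treats the config as Perl-style `$NAME = value;` lines, and the variable names `CFG_WAVFILE_EXTENSION` and `CFG_WAVFILE_TYPE` in the sample text are assumptions that should be verified against your own SphinxTrain config file.

```python
import re


def set_cfg_value(cfg_text, name, value):
    """Replace the value assigned to $<name> in Perl-style config text.

    Only the text between "=" and the first ";" is replaced, so a
    trailing comment on the same line is kept. Raises KeyError if the
    variable is not assigned anywhere in cfg_text.
    """
    pattern = re.compile(
        r"^(\s*\$" + re.escape(name) + r"\s*=\s*)[^;]*;", re.MULTILINE
    )
    new_text, count = pattern.subn(
        lambda m: m.group(1) + "'" + value + "';", cfg_text
    )
    if count == 0:
        raise KeyError(name)
    return new_text


if __name__ == "__main__":
    # Illustrative sample only; variable names are assumed, not confirmed.
    sample = (
        "$CFG_WAVFILE_EXTENSION = 'sph';\n"
        "$CFG_WAVFILE_TYPE = 'nist'; # one of nist, mswav, raw\n"
    )
    print(set_cfg_value(sample, "CFG_WAVFILE_TYPE", "mswav"))
```

Raising an error when the variable is missing is a deliberate choice: a silent no-op would leave the an4 defaults in place, which is the problem described above.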


Training an Acoustic Model Using SphinxTrain

Jaykrishna Shukla, Mubin Ahmed & Cara Santin

Department of Electrical and Computer Engineering

Temple University

URL:
