Speech:Run Decode


Decode on Trained Data
Running a decode on trained data is a way to verify that your training is on the right track. Use it as a first check for problems in your training (i.e. model building). These problems could be actual errors in the train (something broke) or configuration parameters that do not match your data, leaving the models sub-optimal. Your goal here is a very low Word Error Rate (WER), which demonstrates that your model is properly trained.
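
For reference, WER is commonly defined as the number of word substitutions, deletions, and insertions needed to turn the decoder's hypothesis into the reference transcript, divided by the number of words in the reference. The following is a minimal sketch of that calculation for a single utterance. The function name word_error_rate is hypothetical, and the sketch assumes plain whitespace tokenization, so its numbers may differ from those reported by the scoring tools used in the experiments.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript is empty")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)
```

A decode on trained data that reproduces every reference transcript exactly would give 0.0 for each utterance. Because insertions count as errors, WER can exceed 1.0.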

Decode on Unseen Data
Run a decode on unseen data once you are confident your models have been properly trained. Your initial set of unseen decodes should use the development test set in your corpus. If you are satisfied with those results (i.e. you feel you cannot do better and no more parameter tuning is needed), you can verify them by decoding again with the same trained models on the evaluation test set.

Decode Error(s) and Solutions
If decode.log contains the following error:

/usr/local/bin/sphinx3_decode: error while loading shared libraries: libs3decoder.so.0: cannot open shared object file: No such file or directory

review the LDCONFIG notes here: https://foss.unh.edu/projects/index.php/Speech:Hardware_Errors
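
When a decode fails, it can be quicker to scan the log for this message than to read it by hand. The sketch below is one way to do that. It assumes the log text uses the same wording as the error quoted above, and the helper name missing_shared_libraries is hypothetical.

```python
import re
from typing import List

# Matches the dynamic loader message quoted above and captures the library name.
MISSING_LIB = re.compile(
    r"error while loading shared libraries: (\S+): cannot open shared object file"
)


def missing_shared_libraries(log_text: str) -> List[str]:
    """Return the sorted, de-duplicated library names reported as missing."""
    return sorted(set(MISSING_LIB.findall(log_text)))
```

For example, passing the contents of decode.log (read with open("decode.log").read(), assuming the log is in the current directory) returns the names of any libraries the loader could not find.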

Side Notes

 * The directories needed to run a decode on unseen data are: DECODE, LM, bin, etc, feat, wav.
 * As of August 2015 (this may no longer be true), there is no designated script to prepare a decode experiment. The process is not automated, so much of the setup has to be pieced together by hand.
 * Use prepareTrainExperiment.pl, even though this script generates some unnecessary files. It provides the directory structure needed to run the decode.
 * Edit the sphinxdecode.cfg file so that the acoustic models point to the train directory.
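
Because the setup is pieced together by hand, a quick check that the experiment directory contains everything listed above can save a failed run. The following is a minimal sketch. The function name missing_decode_dirs is hypothetical, and it assumes the six directories sit directly under the experiment directory.

```python
from pathlib import Path
from typing import List

# Directory names taken from the list above.
REQUIRED_DIRS = ("DECODE", "LM", "bin", "etc", "feat", "wav")


def missing_decode_dirs(experiment_dir: str) -> List[str]:
    """Return the required directory names not found under experiment_dir."""
    root = Path(experiment_dir)
    return [name for name in REQUIRED_DIRS if not (root / name).is_dir()]
```

An empty list means all six directories are present. Any names returned still need to be created or copied in before running the decode.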