Speech:Summer 2011 Training



Training
The steps below set up the task directory. Note that they have been run successfully on caesar.
 * setup task directory
 * 1) From the SphinxTrain directory, create a directory to store the task in: mkdir taskName
 * 2) * % cd /root/speechtools/SphinxTrain-1.0
 * 3) * % mkdir train1
 * 4) Move to that directory: cd taskName
 * 5) * % cd train1
 * 6) Execute the following command: ../scripts_pl/setup_SphinxTrain.pl -task taskName
 * 7) * % ../scripts_pl/setup_SphinxTrain.pl -task train1
 * 8) Copy the config file. Note: after copying, you must edit lines 5 and 6 (e.g. with vi) to change _TRAIN_ to taskName (e.g. train1)
 * 9) * % cd etc
 * 10) * % cp -i /root/DOCS/sphinx_train.cfg .
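
The edit to lines 5 and 6 of the config file can also be scripted instead of done by hand. The following is a minimal sketch, assuming a POSIX shell and a task name made up only of letters, digits, and underscores. The function name set_task_name is hypothetical and is not part of SphinxTrain or of the scripts in /root/SCRIPTS.

```shell
# Hypothetical helper: replace _TRAIN_ with the task name on lines 5 and 6 only.
# Usage: set_task_name <config-file> <taskName>
set_task_name() {
    cfg_tmp="$1.tmp.$$"
    # "5,6" limits the substitution to the two lines named in the step above.
    sed "5,6s/_TRAIN_/$2/g" "$1" > "$cfg_tmp" && mv "$cfg_tmp" "$1"
}
```

From the etc directory, `set_task_name sphinx_train.cfg train1` would then stand in for the manual edit. The sketch assumes _TRAIN_ still sits on those two lines, so it is worth checking the result, for example with `sed -n '5,6p' sphinx_train.cfg`.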

Two custom scripts are needed to perform a train: /root/SCRIPTS/genPhones.csh and /root/SCRIPTS/genTrans.pl. Copy both of these scripts into the etc directory of the task. The task also needs a dictionary, which is generally a subset of the main dictionary found in /root/DOCS/cmudict.06d.
 * Copy wav files into wavTemp directory
 * 1) Create the wavTemp directory in the root of the task directory (if you're in etc, go up one level first): mkdir wavTemp
 * 2) * % mkdir wavTemp
 * 3) Move into the wavTemp directory: cd wavTemp
 * 4) * % cd wavTemp
 * 5) Copy all sph files that will be used for this train into the wavTemp directory.
 * 6) * % cp -i /media/data/Switchboard/disk1/swb1/sw02001.sph .
 * 7) * % cp -i /media/data/ your/audio/files/...
 * Copy necessary scripts into the etc directory.
 * % cd ../etc
 * % cp -i /root/SCRIPTS/genPhones.csh .
 * % cp -i /root/SCRIPTS/genTrans.pl .
 * Copy dictionary into task etc directory with filename taskName.dic
 * % cp -i /somewhere/your/generated/dictionary  train1.dic
 * Copy transcript.
 * 1) Copy the raw training transcript you chose for training into the task etc directory.
 * 2) * % cp -i /somewhere/your/unedited/transcripts  trans_unedited.txt
 * Run genTrans.pl
 * 1) Make sure you are in the etc directory and execute genTrans.pl with two arguments. The first argument is the unedited transcript's filename. The second argument is the taskName.
 * 2) * % genTrans.pl trans_unedited.txt train1
 * Run genPhones.csh
 * 1) Run genPhones.csh with the taskName as its argument to generate the phonemes for this task.
 * 2) * % genPhones.csh train1
 * Copy filler file to etc directory.
 * 1) Copy filler file to the task etc directory.
 * 2) Be sure the name of the filler file is taskName.filler.
 * 3) * % cp -i /root/DOCS/transcripts.filler train1.filler
 * Run make_feats.pl:
 * 1) Be sure to go to the root of the task directory (if you're in etc, go up one level)
 * 2) * % cd ..
 * 3) Then execute: ./scripts_pl/make_feats.pl -ctl etc/taskName_train.fileids
 * 4) * % ./scripts_pl/make_feats.pl -ctl ./etc/train1_train.fileids
 * Run RunAll.pl:
 * 1) From the root of the task execute: ./scripts_pl/RunAll.pl
 * 2) * % ./scripts_pl/RunAll.pl
 * 3) If you encounter errors about missing dictionary words or missing phonemes, edit the corresponding file:
 * 4) * Edit the dictionary to add the missing words
 * 5) ** % vi etc/train1.dic
 * 6) * Edit the phoneme file to add the missing phones
 * 7) ** % vi etc/train1.phone
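
Missing dictionary words can also be listed before RunAll.pl is run. The sketch below makes several assumptions: the transcript has one utterance per line, it may contain <s> and </s> markers, each line ends with a file id in parentheses, and each dictionary line starts with the word itself. The function name missing_words is hypothetical. Words that are defined only in the filler file will be reported as well, since the sketch looks at the dictionary alone.

```shell
# Hypothetical helper: print each transcript word that has no dictionary entry.
# Usage: missing_words <transcript-file> <dictionary-file>
missing_words() {
    awk '
        # First file (the dictionary): remember the word in column 1.
        FILENAME == ARGV[1] { dict[$1] = 1; next }
        # Second file (the transcript): report unknown words once each.
        {
            for (i = 1; i <= NF; i++) {
                w = $i
                if (w == "<s>" || w == "</s>") continue   # utterance markers
                if (w ~ /^\(.*\)$/) continue              # file id tag, e.g. (sw02001)
                if (!(w in dict) && !(w in seen)) { seen[w] = 1; print w }
            }
        }
    ' "$2" "$1"
}
```

From the root of the task, `missing_words etc/train1_train.trans etc/train1.dic` would print one missing word per line. The comparison is case sensitive, so the transcript and dictionary need to use the same casing.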

You will now have a set of models in the model_parameters directory.
 * After completion, you have models!

Decoding
Testing on train means that we decode the same data we just trained on. It's a quick way to see whether our models are good, because the results should be highly accurate (we just trained on this audio data, so decoding it is the best case). First we need to create a language model. Then we can run the decode.
 * Quick and dirty test on train
 * 1) Create a language model directory (for now we will do so in the taskName directory, so be sure you're in it).
 * 2) * % mkdir LM
 * 3) * % cd LM
 * 4) Copy language model script found here: /root/SCRIPTS/lm_create.pl
 * 5) * % cp -i /root/SCRIPTS/lm_create.pl .
 * 6) Copy your transcripts from training into the LM directory. Note that we need to strip out the file id tags, so we use sed to help.
 * 7) * % sed "s/ (.*)//" ../etc/train1_train.trans > ./train1_lm.trans
 * 8) Now run the script to generate our language model (good idea to capture output in log file)
 * 9) * % lm_create.pl train1_lm.trans &> lm_create.log
 * 1) Create a decode directory (for now we will do so in the taskName directory, so change into it).
 * 2) * % cd ..
 * 3) * % mkdir DECODE
 * 4) * % cd DECODE
 * 5) Copy decode script found here: /root/SCRIPTS/run_decode.pl
 * 6) * % cp -i /root/SCRIPTS/run_decode.pl .
 * 7) Run it, giving it the task name as a parameter (note that it automatically captures output in the decode.log file).
 * 8) * % ./run_decode.pl train1
 * 9) Now look through your log to see what was recognized...
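
To see what the sed expression from the language model steps does, the sketch below applies it to a sample line. The sample text and file ids are made up for illustration. Because `.*` is greedy, the original expression removes everything from the first " (" to the last ")" on a line. The second function, strip_ids_eol, is a hypothetical stricter variant that only removes a parenthesized tag at the very end of the line, which may be safer if a transcript ever contains parentheses in the text itself.

```shell
# Expression used in the step above: strip " (fileid)" from a transcript line.
strip_ids() {
    sed "s/ (.*)//" "$@"
}

# Hypothetical stricter variant: only strip a tag at the very end of the line.
strip_ids_eol() {
    sed 's/ ([^()]*)$//' "$@"
}

# Made-up sample lines in the "<text> (fileid)" layout.
echo "HELLO WORLD (sw02001_0001)" | strip_ids
echo "HELLO (LAUGH) WORLD (sw02001_0001)" | strip_ids_eol
```

The first command prints `HELLO WORLD` and the second prints `HELLO (LAUGH) WORLD`. Passing the second sample through strip_ids instead would leave only `HELLO`, which is the behavior the stricter variant avoids.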

Decoding on a dev test set requires work to generate a set of transcripts with a dictionary and accompanying audio files. The decode steps outlined above are the same; it just takes some work to create the appropriate input. The inputs to change within the run_decode.pl script are as follows:
 * Decoding on a dev test set
 * 1) $DICT - this needs to be a dictionary that captures all the words in your test set
 * 2) $CTL - this would be a list of file ids for your incoming wave files to test
 * 3) $LM - perhaps a different language model than the one you used for the test on train above.
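
A first draft of the $CTL list for a new test set can be produced from the audio directory itself. The sketch below assumes that each file id is simply the audio file name without its extension, which fits names such as sw02001.sph but may not be what run_decode.pl expects in every setup. The function name list_fileids is hypothetical.

```shell
# Hypothetical helper: print one file id per audio file in a directory.
# Usage: list_fileids <audio-directory> <extension-without-dot>
list_fileids() {
    for f in "$1"/*."$2"; do
        [ -e "$f" ] || continue     # skip the unexpanded pattern when nothing matches
        basename "$f" ".$2"         # drop the directory and the extension
    done
}
```

For example, `list_fileids wavTemp sph` run from the root of the task prints one id per .sph file, and the output can be redirected into whatever file $CTL points to.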