Speech:Summer 2011 Notes

From Openitware
Revision as of 12:22, 17 February 2012 by Mcy59 (Talk | contribs)



NFS (Network File System) is a protocol that allows file systems to be shared over a network. File system sharing is especially important when multiple users work across multiple terminals. Without NFS and LDAP, users would need a separate user name and password on each machine they access. With NFS and LDAP, user information can be created from any client machine and is then stored on, and accessed from, the network's server. This gives networked computers access to specific non-local information and spares users from maintaining separate accounts.

Required NFS software using openSUSE

To configure your host as an NFS client, you do not need to install additional software. All needed packages are installed by default.

NFS server software is not part of the default installation. To install it, start YaST and select Software > Software Management. Then choose Filter > Patterns and select File Server, or use the Search option to search for NFS Server. Confirm the installation of the packages to finish the installation process.


Mounting allows authorized users to import and export file systems. Once mounting has been set up, a server can access specified client directories and vice versa. The easiest way to set this up is through YaST's GUI. For a full step-by-step walkthrough of how to do this, Click Here
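For readers who prefer a terminal to YaST, the client side of a mount can be sketched in shell. This is a minimal sketch, not the procedure these notes followed: the host name nfs-server, the export path /export/speech, the mount point /mnt/speech, and the helper names run and nfs_mount_plan are all hypothetical. The script only prints the commands unless DRY_RUN=0 is set, because actually running them needs root and a reachable NFS server.

```shell
#!/bin/sh
# Sketch of a manual NFS client mount. All names below are hypothetical
# placeholders, not values taken from these notes.
SERVER=${SERVER:-nfs-server}
EXPORT=${EXPORT:-/export/speech}
MOUNTPOINT=${MOUNTPOINT:-/mnt/speech}

# Print a command; execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# nfs_mount_plan SERVER EXPORT MOUNTPOINT
nfs_mount_plan() {
    run showmount -e "$1"            # list the directories the server exports
    run mkdir -p "$3"                # make sure the local mount point exists
    run mount -t nfs "$1:$2" "$3"    # attach the remote directory locally
}

nfs_mount_plan "$SERVER" "$EXPORT" "$MOUNTPOINT"
```

Run as is, the script prints the three commands prefixed with "+" so they can be reviewed before anything is changed on the machine.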


Set up task directory

To set up the task directory:

  1. From the SphinxTrain directory, create a directory to store the task in: mkdir taskName
    • % cd /root/speechtools/SphinxTrain-1.0; mkdir train1
  2. Move to that directory: cd taskName
    • % cd train1
  3. Execute the following command: ../scripts_pl/setup_SphinxTrain.pl -task taskName
    • % ../scripts_pl/setup_SphinxTrain.pl -task train1
Copy wav files into wavTemp directory
  1. Create the wavTemp directory: mkdir wavTemp
    • % mkdir wavTemp
  2. Move into the wavTemp directory: cd wavTemp
    • % cd wavTemp
  3. Copy all sph files that will be used for this train into the wavTemp directory.
    • % cp -i /media/data/Switchboard/disk1/swb1/sw02001.sph .
    • % cp -i /media/data/your/audio/files/...
Copy necessary scripts into the etc directory.
  • % cd ../etc

There are two custom scripts that are needed to perform a train. These are /root/SCRIPTS/genPhones.sh and /root/SCRIPTS/genTrans.pl. Copy both of these scripts into the etc directory of the task.

  • % cp -i /root/SCRIPTS/genPhones.sh .
  • % cp -i /root/SCRIPTS/genTrans.pl .
Copy dictionary into task etc directory with filename taskName.dic

This should be a subset of the main dictionary found in /root/DOCS/cmudict.06d.

  • % cp -i /somewhere/your/generated/dictionary train1.dic
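If the subset dictionary has not been generated yet, one way to build it is to keep only the main-dictionary entries whose word occurs in a transcript. The following is a minimal sketch, assuming one entry per line with the word first and alternate pronunciations written as WORD(2). The function name make_subset_dict and the sample words are hypothetical, and the demo runs on throwaway files rather than on /root/DOCS/cmudict.06d.

```shell
#!/bin/sh
# make_subset_dict TRANSCRIPT MAIN_DICT
# Prints the entries of MAIN_DICT whose head word occurs in TRANSCRIPT.
# Note: TRANSCRIPT must not be empty, or the NR == FNR test misfires.
make_subset_dict() {
    awk '
        # First file: collect every whitespace-separated transcript token.
        NR == FNR { for (i = 1; i <= NF; i++) want[toupper($i)] = 1; next }
        # Second file: keep an entry when its head word, minus a trailing
        # alternate-pronunciation marker such as "(2)", was collected.
        { word = $1; sub(/\([0-9]+\)$/, "", word); if (toupper(word) in want) print }
    ' "$1" "$2"
}

# Demo on made-up data.
tmp=$(mktemp -d)
printf 'HELLO WORLD\nHELLO AGAIN\n' > "$tmp/trans.txt"
printf 'AGAIN  AH G EH N\nAGAIN(2)  AH G EY N\nGOODBYE  G UH D B AY\nHELLO  HH AH L OW\nWORLD  W ER L D\n' > "$tmp/main.dic"
make_subset_dict "$tmp/trans.txt" "$tmp/main.dic"
rm -rf "$tmp"
```

In the demo the GOODBYE entry is dropped and both pronunciations of AGAIN are kept. Tokens that are not words (for example sentence markers) simply never match a dictionary entry.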
Copy transcript.
  1. Copy the raw training transcript you chose for training into the task etc directory.
    • % cp -i /somewhere/your/unedited/transcripts trans_unedited.txt
Run genTrans.pl
  1. Make sure you are in the etc directory, then execute genTrans.pl with two arguments. The first argument is the unedited transcript's filename. The second argument is the taskName.
    • % genTrans.pl trans_unedited.txt train1
Run genPhones.sh
  1. This needs to be customized for each project and should then be run to generate the phonemes.
    • % genPhones.sh
Copy filler file to etc directory.
  1. Copy filler file to the task etc directory.
  2. Be sure the name of the filler file is taskName.filler.
    • % cp -i /root/DOCS/transcripts.filler train1.filler
Run make_feats.pl
  1. Go to the root of the task directory (if you are in etc, go up one level).
    • % cd ..
  2. Then execute: ./scripts_pl/make_feats.pl -ctl etc/taskName_train.fileids
    • % ./scripts_pl/make_feats.pl -ctl ./etc/train1_train.fileids
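Before running make_feats.pl and the training itself, it can help to confirm that the file id list and the transcript line up. The sketch below assumes that SphinxTrain expects one transcript line per file id, so the two files should have the same number of lines. The function name check_counts is hypothetical, and the demo uses throwaway files.

```shell
#!/bin/sh
# check_counts FILEIDS TRANSCRIPT
# Succeeds when both files have the same number of lines.
check_counts() {
    ids=$(awk 'END { print NR }' "$1")
    trans=$(awk 'END { print NR }' "$2")
    if [ "$ids" -eq "$trans" ]; then
        echo "OK: $ids utterances"
    else
        echo "MISMATCH: $ids file ids, $trans transcript lines" >&2
        return 1
    fi
}

# Demo on throwaway files; for a real task, run it from the task root as
#   check_counts etc/train1_train.fileids etc/train1_train.trans
tmp=$(mktemp -d)
printf 'utt1\nutt2\n' > "$tmp/demo.fileids"
printf '<s> HELLO </s> (utt1)\n<s> WORLD </s> (utt2)\n' > "$tmp/demo.trans"
check_counts "$tmp/demo.fileids" "$tmp/demo.trans"
rm -rf "$tmp"
```

A mismatch is reported on standard error with a non-zero exit status, so the check can gate a longer script.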
Run RunAll.pl
  1. From the root of the task execute: ./scripts_pl/RunAll.pl
    • % ./scripts_pl/RunAll.pl
After completion, you have models!

You will now have a set of models in model_parameters.


Quick and dirty test on train

This means that we decode the same data we just trained on. It is a quick way to check whether the models are good, because the results should be highly accurate (we just trained on this audio data, so decoding it should be close to a best case). First we need to create a language model.

  1. Create a language model directory (for now we will do so in the taskName directory, so be sure you're in it).
    • % mkdir LM
    • % cd LM
  2. Copy the language model script from /root/SCRIPTS/lm_create.pl
    • % cp -i /root/SCRIPTS/lm_create.pl .
  3. Copy your transcripts from training into the LM directory. The file id tags need to be stripped out, so we use sed to help.
    • % sed "s/ (.*)//" ../etc/train1_train.trans > ./train1_lm.trans
  4. Now run the script to generate the language model (it is a good idea to capture the output in a log file).
    • % lm_create.pl train1_lm.trans &> lm_create.log
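To see what the sed expression in step 3 does, it can be tried on a single line. In sed's default (basic) regular expressions the parentheses are literal, so the pattern removes everything from the first " (" to the last ")" on the line. The sketch below wraps the same command in a function and adds a stricter variant that only removes a tag at the end of the line. The function names, the sample line, and its utterance ID are hypothetical, and the stricter variant is a suggestion rather than part of the original recipe.

```shell
#!/bin/sh
# The same substitution used in step 3: delete from the first " (" to the last ")".
strip_ids() {
    sed "s/ (.*)//"
}

# A stricter variant (an assumption, not part of the original recipe):
# only remove a parenthesised tag that ends the line.
strip_trailing_id() {
    sed 's/ ([^()]*)$//'
}

# Hypothetical transcript line; the utterance ID is made up.
printf '%s\n' '<s> HELLO WORLD </s> (sw02001_utt1)' | strip_ids
# prints: <s> HELLO WORLD </s>
```

The difference only matters if a transcript line contains parentheses before the tag: the original pattern would then cut the line at the first " (", while the stricter one leaves the earlier text alone.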

Now we can do our decode

  1. Create a decode directory (for now we will do so in the taskName directory, so change into it).
    • % cd ..
    • % mkdir DECODE
    • % cd DECODE
  2. Copy the decode script from /root/SCRIPTS/run_decode.pl
    • % cp -i /root/SCRIPTS/run_decode.pl .
  3. Run it, giving it the task name as a parameter and capturing its output in a log file.
    • % ./run_decode.pl train1 &> decode.log
  4. Now look through your log to see what was recognized...
Decoding on a dev test set

This requires work to generate a set of transcripts, with a dictionary and the accompanying audio files. The steps are the same as in the decode above; the work lies in creating appropriate input. The inputs to change within the run_decode.pl script are as follows:

  1. $DICT - this needs to be a dictionary that captures all the words in your test set
  2. $CTL - this would be a list of file IDs for the incoming wave files to test
  3. $LM - perhaps a different language model than what you used for test on train above.
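Since $DICT has to cover every word in the test set, a quick out-of-vocabulary check before decoding can save a failed run. The following is a minimal sketch, assuming the dictionary lists one word per line (alternates written as WORD(2)) and that transcript tokens starting with "<" or "(" are sentence markers or utterance tags rather than words. The function name list_oov and the sample data are hypothetical.

```shell
#!/bin/sh
# list_oov TRANSCRIPT DICT
# Prints each transcript word that has no entry in DICT, once,
# in order of first appearance.
list_oov() {
    awk '
        # First file (the dictionary): remember each head word, ignoring a
        # trailing alternate-pronunciation marker such as "(2)".
        NR == FNR { w = $1; sub(/\([0-9]+\)$/, "", w); known[toupper(w)] = 1; next }
        # Second file (the transcript): report unknown words.
        {
            for (i = 1; i <= NF; i++) {
                t = toupper($i)
                # Assumption: tokens starting with "<" or "(" are not words.
                if (t ~ /^</ || t ~ /^\(/) continue
                if (!(t in known) && !(t in seen)) { seen[t] = 1; print t }
            }
        }
    ' "$2" "$1"
}

# Demo on made-up data.
tmp=$(mktemp -d)
printf 'HELLO  HH AH L OW\nWORLD  W ER L D\n' > "$tmp/test.dic"
printf '<s> HELLO BRAVE NEW WORLD </s> (utt1)\n' > "$tmp/test.trans"
list_oov "$tmp/test.trans" "$tmp/test.dic"
rm -rf "$tmp"
```

In the demo the function reports BRAVE and NEW, the two words missing from the sample dictionary. An empty result means the dictionary covers the transcript.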