Speech:Spring 2015 Modeling Group


Group Member Logs

 * Zachery Boynton
 * Garrett Bryant
 * Samuel Sweet
 * Zebadiah Wood

Assigned machine is: TBD

Observations

 * The first thing we discovered was that the tutorials for running a train through decode were inaccurate. On top of that, there were three different versions of the Train tutorial.
 * Last semester's students (Spring 2014) made an automated train script that would set up the experiment directory with the appropriate directories and files, as well as tune Sphinx with the user's input Senone and Density values. This was great for first_5hr trains, but as the length of the train increased, the script would stop when an SSH pipe broke.
 * There are also between 2 and 10 versions of any given script in the /mnt/main/scripts/user directory. It was very confusing to figure out the differences between the scripts and why a particular one was being used. This was due to the lack of comments in the code and to the newest code no longer being added to the wiki.
 * A major problem was discovered in /mnt/main/scripts/user/genTrans.pl. It should have been creating soft links to the corpus directory it was pulling its utterances from. Instead, it was overwriting the corpus data that was already there and copying it over every time a new experiment was created. This caused the /mnt/main/Exp directory to take up too much storage space and caused genTrans.pl to run incredibly slowly.
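
One way to see whether an experiment's wav directory holds copies or soft links is to count each kind and add up the space used by the copies. The following is only a minimal Python sketch of that check, not one of the group's scripts; the function name summarize_wav_dir is hypothetical, and the directory to inspect is whatever path the caller passes in.

```python
import os


def summarize_wav_dir(wav_dir):
    """Count soft links and real files directly inside wav_dir.

    Returns a dict with the number of soft links, the number of real
    (copied) files, and the total size in bytes of the real files.
    The function name and return shape are hypothetical, for illustration.
    """
    links = 0
    copies = 0
    copied_bytes = 0
    with os.scandir(wav_dir) as entries:
        for entry in entries:
            if entry.is_symlink():
                # A soft link costs almost no space in the experiment directory.
                links += 1
            elif entry.is_file(follow_symlinks=False):
                # A real file is a full copy of the corpus data.
                copies += 1
                copied_bytes += entry.stat(follow_symlinks=False).st_size
    return {"links": links, "copies": copies, "copied_bytes": copied_bytes}
```

A directory written by the old copying behavior would be expected to report mostly copies and a large copied_bytes value, while a directory of soft links would report zero copied bytes.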

Changes

 * The first thing that needed to be done was to fix the training scripts so people could successfully run trains again.
 * The following scripts have been cleaned up to match current directory paths:

/mnt/main/scripts/user/pruneDictionary.pl
/mnt/main/scripts/user/genTrans.pl
/mnt/main/scripts/user/prepareExperiment.pl

 * Changes were also made to genTrans.pl to have it create soft links in /mnt/main/Exp//etc/wav


 * The automated script created by Spring 2014 students is now DEPRECATED. This script should not be used, as it does not add any benefit other than setting Senone and Density values for Sphinx.
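
The soft-link behavior described for genTrans.pl can be illustrated with a short sketch. This is written in Python rather than Perl and is not the actual genTrans.pl code; the function name link_utterances and both directory arguments are hypothetical. It assumes the corpus files sit directly inside one directory, creates one soft link per file, and never writes to the corpus.

```python
import os


def link_utterances(corpus_dir, wav_dir):
    """Create one soft link in wav_dir for each regular file in corpus_dir.

    Nothing in corpus_dir is modified, and entries that already exist in
    wav_dir are left alone. Returns the number of links created.
    Hypothetical helper for illustration; not the real genTrans.pl logic.
    """
    os.makedirs(wav_dir, exist_ok=True)
    created = 0
    for name in sorted(os.listdir(corpus_dir)):
        source = os.path.abspath(os.path.join(corpus_dir, name))
        target = os.path.join(wav_dir, name)
        if not os.path.isfile(source):
            continue  # skip sub-directories and anything that is not a file
        if os.path.lexists(target):
            continue  # keep whatever is already in the experiment directory
        os.symlink(source, target)  # link to the corpus file instead of copying it
        created += 1
    return created
```

Because only links are created, running the function again for the same experiment adds nothing new, and the space used in the experiment directory stays small no matter how large the corpus files are.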

 * The scripts directory /mnt/main/scripts/user has been cleaned up so multiple versions of scripts are not floating around in the main directory. There are now sub-directories that contain all versions of any given script.
 * Also, the following changes have been made to switchboard:

Removed /mnt/main/corpus/switchboard//clean
Fixed /mnt/main/corpus/switchboard/train to match the structure of the removed directory clean


 * These changes were made because Professor Jonas did not want clean to be part of the corpus. The structure of clean, however, was more logical than that of train, so train was updated to match clean's directory structure.