Speech:Spring 2017 Modeling Group



Groups


Group Member Logs


Tasks

March 1

Updated instructions for testing on unseen data.

1. Run addExp.pl.
2. Run a training experiment on the training data as you normally would. The existing documentation is confusing here: it includes a step to make a train for the test data, but that step belongs in the "run test on decode" instructions.
3. Build the language model as you normally would.
4. Create a new test directory for your unseen data test.
5. Run "makeTest.pl <flag> <corpus> <train directory> <decode directory>". Use the -t flag for the 30hr test data. Example: makeTest.pl -t switchboard/30hr 0295/015 0295/016
6. Run "genFeats.pl -d".
7. cd into the etc folder.
8. Run "nohup run_decode.pl <train exp> <test exp> <senone count> &".
9. Convert the decode.log file to a hyp.trans file by running "parseDecode.pl decode.log hyp.trans".
10. Copy over <Exp#>/etc/<Exp#>_train.trans from the training experiment.
11. Run "sclite -r <exp#>_train.trans -h hyp.trans -i swb >> scoring.log".
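
The decode-side commands above can be previewed before running anything. The sketch below only prints the commands in the order listed; it does not execute them. The function name print_unseen_test_steps is made up for illustration, and the <Exp#> placeholders are left unfilled because the exact file name depends on your experiment.

```shell
# Print (do not run) the decode-side commands for an unseen-data test.
# Arguments: <corpus> <train exp> <test exp> <senone count>
# print_unseen_test_steps is a hypothetical helper, not one of the wiki scripts.
print_unseen_test_steps() {
    corpus=$1
    train_exp=$2
    test_exp=$3
    senones=$4
    echo "makeTest.pl -t $corpus $train_exp $test_exp"
    echo "genFeats.pl -d"
    echo "cd etc"
    echo "nohup run_decode.pl $train_exp $test_exp $senones &"
    echo "parseDecode.pl decode.log hyp.trans"
    # The copy step is printed as a reminder only; fill in <Exp#> yourself.
    echo "# copy <Exp#>/etc/<Exp#>_train.trans from the training experiment"
    echo "sclite -r <exp#>_train.trans -h hyp.trans -i swb >> scoring.log"
}
```

For example, "print_unseen_test_steps switchboard/30hr 0295/015 0295/016 1000" prints the sequence for the experiments used in the makeTest.pl example; the senone count of 1000 is only a placeholder, so use the value from your own training experiment.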

Running LDA

Instructions for running LDA on unseen data.

IMPORTANT: Every time you log on, make sure that your 'python' is version 2.7.
If it is not, run: export PATH=/usr/local/miniconda/bin:$PATH
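
The version check can be scripted so it is not forgotten at login. This is a minimal sketch: the helper name is_python27 is made up for illustration, and it assumes that "python --version" prints a string of the form "Python X.Y.Z".

```shell
# Succeed (return 0) if the given "python --version" output reports 2.7.x.
# is_python27 is a hypothetical helper name.
is_python27() {
    case "$1" in
        "Python 2.7"|"Python 2.7."*) return 0 ;;
        *) return 1 ;;
    esac
}

# Python 2 prints its version to stderr, so merge stderr into stdout.
# "|| true" keeps the script going if no python is on the PATH at all.
current=$(python --version 2>&1 || true)
if ! is_python27 "$current"; then
    # Same fix as given above: put the miniconda python first on the PATH.
    export PATH=/usr/local/miniconda/bin:$PATH
fi
```

Putting these lines in a login script would apply the check automatically, but whether that suits your account setup is a judgment call.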


Add your experiment to the wiki (addExp.pl).
Make your train using (makeTrain.pl switchboard 30hr/train). If that does not work, add (/mnt/main/scripts/user/) to the beginning of the command.
Edit the sphinx_train.cfg file so that $CFG_LDA_MLLT = 'yes' and $CFG_LDA_DIMENSION = 32.
Train as you normally would.
Build the LM as you normally would.
Create a new sub-experiment. Make sure to add it to the wiki.
Run "makeTest.pl <flag> <corpus> <train directory> <decode directory>" inside the new sub-experiment. Use the -t flag for the 30hr test data.
cd into the etc folder.
Note the files named "est.<something>". Rename them to "test.<something>".
Run (genFeats.pl -d).
Run (nohup run_decode_lda.pl <train exp> <test exp> <senone count> &). Note the "lda" in the script name.
Run (parseDecode.pl decode.log hyp.trans).
Copy over <Exp#>/etc/<Exp#>_train.trans from the training experiment.
Run "sclite -r <exp#>_train.trans -h hyp.trans -i swb >> scoring.log".
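
Two of the manual steps above, editing sphinx_train.cfg and renaming the "est.<something>" files, are easy to get wrong by hand. The sketch below shows one way to script them. The function names enable_lda and rename_est_files are made up for illustration, and the sed patterns assume each option sits at the start of its own line in the form "$CFG_NAME = value;", so check your cfg file before relying on them.

```shell
# Set the two LDA options in a sphinx_train.cfg-style file, editing in place.
# A backup of the original is kept next to it with a .bak suffix.
# Assumption: each option starts its own line as "$CFG_NAME = value;".
enable_lda() {
    cfg=$1
    sed -i.bak \
        -e "s/^\\\$CFG_LDA_MLLT *=.*/\$CFG_LDA_MLLT = 'yes';/" \
        -e "s/^\\\$CFG_LDA_DIMENSION *=.*/\$CFG_LDA_DIMENSION = 32;/" \
        "$cfg"
}

# Rename every "est.<something>" file in a directory to "test.<something>".
rename_est_files() {
    dir=$1
    for f in "$dir"/est.*; do
        [ -e "$f" ] || continue        # the glob matched nothing
        mv "$f" "$dir/t${f##*/}"       # "est.x" becomes "test.x"
    done
}
```

A typical call would be "enable_lda etc/sphinx_train.cfg" in the training experiment before training, and "rename_est_files etc" in the new sub-experiment after makeTest.pl; both paths are assumptions about where the files sit relative to your current directory.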