Speech:Spring 2017 Modeling Group
- Information - General Project Information
- Experiments - List of speech experiments
Group Member Logs
Updated instructions for testing on unseen data.
- Run addExp.pl to add your experiment to the wiki.
- 1. Run a training experiment on the training data as you normally would. The existing documentation is confusing here: it includes a step to make a train for test data, but that step belongs in the "run test on decode" instructions.
- 2. Build the language model as you normally would.
- 3. Create a new test directory for your unseen data test.
- 4. Run "makeTest.pl <flag> <corpus> <train directory> <decode directory>". Use the -t flag for the 30hr test data. Example: makeTest.pl -t switchboard/30hr 0295/015 0295/016
- 6. cd into the etc folder.
- 8. Run "genFeats.pl -d".
- 9. Run "nohup run_decode.pl <train exp> <test exp> <senone count> &"
- 10. Convert the decode.log file into a hyp.trans file by running "parseDecode.pl decode.log hyp.trans"
- 11. Copy over the <Exp#>/etc/<Exp#>_train.trans from the training experiment.
- 12. Run "sclite -r <Exp#>_train.trans -h hyp.trans -i swb >> scoring.log"
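The commands in the steps above can be collected into a dry-run helper. This is only a minimal sketch: the function name print_unseen_steps and its argument order are hypothetical, it prints the commands rather than running them, and the use of cp with a relative path for the transcript copy is an assumption (the instructions only say to copy the file over).

```shell
#!/bin/sh
# Hypothetical helper: prints the commands from the steps above for review.
# Nothing is executed; all arguments are placeholders supplied by the caller.
print_unseen_steps() {
    corpus=$1      # corpus argument for makeTest.pl, e.g. switchboard/30hr
    train_exp=$2   # training experiment, e.g. 0295/015
    test_exp=$3    # new test (decode) experiment, e.g. 0295/016
    senones=$4     # senone count used when training
    exp_id=$5      # stands in for <Exp#> in the transcript file name

    echo "makeTest.pl -t $corpus $train_exp $test_exp"
    echo "cd etc"
    echo "genFeats.pl -d"
    echo "nohup run_decode.pl $train_exp $test_exp $senones &"
    echo "parseDecode.pl decode.log hyp.trans"
    # Assumption: plain cp with a relative path; adjust to where the
    # training experiment actually lives.
    echo "cp $exp_id/etc/${exp_id}_train.trans ."
    echo "sclite -r ${exp_id}_train.trans -h hyp.trans -i swb >> scoring.log"
}
```

For example, print_unseen_steps switchboard/30hr 0295/015 0295/016 "$SENONES" "$EXP_ID" prints the seven commands in order so they can be checked before being run by hand.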
Instructions for running LDA on unseen data.
- IMPORTANT: every time you log on, make certain that your 'python' is version 2.7.
- If it is not, run (export PATH=/usr/local/miniconda/bin:$PATH)
- Add your experiment to the wiki (addExp.pl)
- Make your train using (makeTrain.pl switchboard 30hr/train). If it doesn't work, add (/mnt/main/scripts/user/) to the beginning of the command.
- Alter the sphinx_train.cfg file so that $CFG_LDA_MLLT = 'yes' and $CFG_LDA_DIMENSION = 32.
- train as you normally would.
- build the LM as you normally would.
- create a new sub-experiment. Make sure to add it to the wiki.
- Run "makeTest.pl <flag> <corpus> <train directory> <decode directory>" inside the new subexp. Use the -t flag for the 30hr test data.
- cd into the etc folder.
- Notice the files named "est.<something>". Rename them to "test.<something>".
- run (genFeats.pl -d)
- run (nohup run_decode_lda.pl <train exp> <test exp> <senone count> &). Note the "lda" in the script name: this is a different script from run_decode.pl.
- run (parseDecode.pl decode.log hyp.trans)
- Copy over the <Exp#>/etc/<Exp#>_train.trans from the training experiment.
- run "sclite -r <Exp#>_train.trans -h hyp.trans -i swb >> scoring.log"
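The three manual preparation steps in the LDA instructions (checking the Python version, editing sphinx_train.cfg, and renaming the "est." files) could be scripted roughly as follows. This is a sketch under stated assumptions: the function names are hypothetical and not part of the project's scripts, and enable_lda assumes that each setting sits on its own line beginning with the variable name, for example $CFG_LDA_MLLT = 'no';.

```shell
#!/bin/sh
# Sketch only: these helper names are hypothetical.

# Prepend miniconda to PATH unless `python` already reports 2.7.
ensure_python27() {
    case "$(python --version 2>&1)" in
        "Python 2.7"*) ;;   # already 2.7, nothing to do
        *) PATH=/usr/local/miniconda/bin:$PATH; export PATH ;;
    esac
}

# Set $CFG_LDA_MLLT = 'yes' and $CFG_LDA_DIMENSION = 32 in the given cfg file.
# Assumes each setting starts at the beginning of its own line.
enable_lda() {
    cfg=$1
    sed -e "s/^[$]CFG_LDA_MLLT[[:space:]]*=.*/\$CFG_LDA_MLLT = 'yes';/" \
        -e 's/^[$]CFG_LDA_DIMENSION[[:space:]]*=.*/$CFG_LDA_DIMENSION = 32;/' \
        "$cfg" > "$cfg.tmp" && mv "$cfg.tmp" "$cfg"
}

# Rename every est.<something> in the given directory to test.<something>.
rename_est_files() {
    dir=$1
    for f in "$dir"/est.*; do
        [ -e "$f" ] || continue        # no matches: the glob stays literal
        mv "$f" "$dir/t${f##*/}"       # est.foo -> test.foo
    done
}
```

A typical use would be ensure_python27 after logging on, enable_lda etc/sphinx_train.cfg in the training experiment before training, and rename_est_files . from inside the test experiment's etc folder before running genFeats.pl -d.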