Title: Tiny Train w/Test on Train #3
Author: Cedric Woodbury
Date: August 02, 2012
Purpose: A third training w/decoding run to test modified scripts
I used the same transcript as in experiments 0010 and 0011. However, genTrans.pl was modified to run more efficiently and to read transcripts from the corpus/switchboard directory. A new script, parseDecode.pl, was also created. It reads the decode.log file, extracts the decoder's hypotheses, and writes them in the same format as the transcript used for training.
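The extraction step that parseDecode.pl performs can be sketched as follows. This is a Python illustration, not the Perl script itself, and both formats are assumptions: hypothesis lines in decode.log are taken to start with a `FWDVIT:` marker and end with the utterance ID in parentheses, and the transcript layout is taken to be `WORDS (utt_id)`. Adjust the pattern and the output format to match the actual files.

```python
import re

# Assumed log layout (not confirmed by the experiment notes):
#   FWDVIT: HELLO WORLD (utt_0001)
# Any line that does not match this shape is ignored.
HYP_LINE = re.compile(r"^FWDVIT:\s*(?P<words>.*?)\s*\((?P<utt_id>[^()\s]+)\)\s*$")


def parse_decode_log(lines):
    """Return (utt_id, words) pairs for every hypothesis line found."""
    results = []
    for line in lines:
        match = HYP_LINE.match(line.strip())
        if match:
            results.append((match.group("utt_id"), match.group("words")))
    return results


def to_transcript_lines(hypotheses):
    """Format pairs as 'WORDS (utt_id)', the assumed transcript layout."""
    return [f"{words} ({utt_id})".strip() for utt_id, words in hypotheses]
```

Passing an open file handle for decode.log to `parse_decode_log` works as well as a list, since the function only iterates over lines.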
The modified genTrans.pl script works and no longer needs a wavTemp directory. Rather than copying all the sph files over and then converting them, it reads them from the source directory and saves the converted files in the target wav folder. It also reuses a single temp.wav file for conversion, so it no longer leaves a lot of unneeded temp files behind. The parseDecode.pl script worked as intended.
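The convert-in-place flow described here might look like the sketch below. It is a Python illustration under assumptions, not the genTrans.pl code: `convert` is a hypothetical caller-supplied function wrapping whatever tool performs the sph-to-wav conversion, and the exact role of temp.wav (a scratch file that is renamed once conversion finishes) is a guess at how the script uses it.

```python
import os


def convert_corpus(source_dir, target_dir, convert):
    """Convert every .sph file in source_dir into a .wav file in target_dir.

    convert(src_path, dst_path) is a hypothetical callable supplied by the
    caller; it wraps the actual conversion tool. Source files are read where
    they are and never copied first.
    """
    os.makedirs(target_dir, exist_ok=True)
    # One scratch file is reused for every conversion (assumed behaviour).
    temp_path = os.path.join(target_dir, "temp.wav")
    written = []
    for name in sorted(os.listdir(source_dir)):
        base, ext = os.path.splitext(name)
        if ext.lower() != ".sph":
            continue  # skip anything that is not an sph audio file
        convert(os.path.join(source_dir, name), temp_path)
        final_path = os.path.join(target_dir, base + ".wav")
        os.replace(temp_path, final_path)  # move the scratch file into place
        written.append(final_path)
    return written
```

Because the scratch file is renamed after each conversion, the target folder ends up holding only the finished wav files.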
We now have a more efficient genTrans.pl script that pulls transcripts directly from the corpus/switchboard directory, so multiple experiments can share one setup. It is also now easier to set up additional transcripts: simply add a folder to the switchboard directory. Since the parseDecode.pl script works, we have what we need for the next step, which is scoring the hypotheses by comparing them to the original transcript.
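The notes do not say which metric the scoring step will use. One common choice for comparing a hypothesis to a reference transcript is word error rate, sketched below in Python as an assumption about what that step could look like. The function names are illustrative, and the inputs are plain word strings with any utterance IDs already stripped.

```python
def word_errors(reference, hypothesis):
    """Return the word-level edit distance between two transcripts.

    Counts the minimum number of substitutions, insertions, and deletions
    needed to turn the hypothesis into the reference.
    """
    ref, hyp = reference.split(), hypothesis.split()
    # previous[j] holds the distance between the reference words seen so far
    # and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1]


def word_error_rate(reference, hypothesis):
    """Return word errors divided by the number of reference words."""
    ref_len = len(reference.split())
    if ref_len == 0:
        raise ValueError("reference transcript is empty")
    return word_errors(reference, hypothesis) / ref_len
```

Since this is a test-on-train run, a score computed this way measures how well the models reproduce their own training data rather than how they perform on unseen speech.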