Speech:Exps 0305 001

Description
Author: Rose_Salemi

Date: 3-5-2018

Purpose: Establish a baseline WER using the older version of the scripts so I have something to compare subsequent trains/decodes to, and to see whether using different drone machines makes a difference.

Details: FAILED - SEE NOTES BELOW. Test the original makeTrain.pl, genTrans.pl, and parseLMTrans.pl as a baseline for comparison with a second 5hr train/decode, because Tri reports that he is getting different WERs despite using the same scripts, just on different machines. I wanted to see if I could duplicate his results, so for my first test I am using Idefix. My second test, 0305/033, is on Obelix.

ssh idefix

cd /mnt/main/Exp/0305/001

makeTrain.pl switchboard 5hr/train

genFeats.pl -t

nohup scripts_pl/RunAll.pl &

mkdir LM

cd LM

cp -i /mnt/main/corpus/switchboard/5hr/train/trans/train.trans trans_unedited

parseLMTrans.pl trans_unedited trans_parsed

lm_create.pl trans_parsed

cd ..

cd etc

awk '{print $1}' /mnt/main/corpus/switchboard/5hr/test/trans/train.trans >> /mnt/main/Exp/0305/001/etc/001_decode.fileids

nohup run_decode.pl 0305/001 0305/001 1000 &

Monitor the decode until it completes. In another terminal window (right-click on the toolbar and choose Duplicate Session), use the command tail -f /mnt/main/Exp/0305/003/etc/decode.log (replace the parent and sub-experiment numbers with yours). Do not run parseDecode.pl until the decode has finished.

parseDecode.pl decode.log hyp.trans

sclite -r 005_train.trans -h hyp.trans -i swb >> scoring.log

tail -10 scoring.log

Results: I got a WER of 27.6% on Idefix (the Err value in the Sum/Avg row below).

[ras1002@idefix etc]$ tail -10 scoring.log
| sw2013a |   1     17 | 47.1   52.9    0.0   17.6   70.6  100.0 |
|=================================================================|
| Sum/Avg |   15    275 | 76.7   18.5    4.7    4.4   27.6   86.7 |
|=================================================================|
|  Mean   |  1.3   22.9 | 72.6   23.6    3.8   10.0   37.4   83.3 |
| S.D.   |  0.5   25.9 | 23.0   22.8    5.5   19.1   27.5   38.9 |
| Median |  1.0   12.0 | 72.6   16.3    0.0    2.8   36.0  100.0 |
`-'

NOTE: This result is invalid. Most of the class wrongly concluded that when you run the command nohup run_decode.pl 0305/001 0305/001 1000 & and immediately get your prompt back in the Linux terminal, the decode is done and you can go on to the next command, parseDecode.pl decode.log hyp.trans, which takes the output of run_decode.pl and writes the hyp.trans transcript file.

In fact, the decode is still running in the background (which is what the "&" is for), and you can't run parseDecode.pl until it finishes or you will end up scoring only a few lines. See the Sum/Avg row: the first column shows only 15 lines (sentences), when the full count for a 5hr corpus should be 4172. A full decode should take about two hours.
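A quick way to catch a truncated run is to read the counts straight from the Sum/Avg row of scoring.log. The following is a minimal sketch, not part of the class scripts: the function name sum_avg is hypothetical, and it assumes the pipe-delimited row layout shown in the tail output above (sentence and word counts in the second cell; Corr, Sub, Del, Ins, Err, S.Err in the third).

```shell
# Print "<sentences> <words> <WER>" from the last Sum/Avg row of an sclite report.
# Assumes the pipe-delimited layout shown in the tail output; sum_avg is a
# hypothetical helper name, not one of the experiment scripts.
sum_avg() {
    awk -F'|' '
        /Sum\/Avg/ {
            split($3, counts, " ")   # sentence count, word count
            split($4, rates, " ")    # Corr Sub Del Ins Err S.Err
            out = counts[1] " " counts[2] " " rates[5]
        }
        END { if (out != "") print out }   # keep only the last row found
    ' "$1"
}
```

Running sum_avg scoring.log on the report above prints 15 275 27.6, which makes the shortfall against 4172 sentences obvious. Only the last Sum/Avg row is reported because the sclite command appends to scoring.log with >>, so earlier runs may still be in the file.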

To monitor/view the decoding process, you can open up another Terminal window (right-click on the toolbar and choose Duplicate Session) and run this command: tail -f /mnt/main/Exp/0305/001/etc/decode.log

(replace the numbers with your parent and sub-experiment numbers)
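Instead of watching the log by hand, the shell can be made to block until the background job exits. This is a sketch under assumptions: run_and_wait is a hypothetical helper, and it only works when the wait happens in the same shell session that launched the job.

```shell
# Start a command under nohup in the background, then block until it exits.
# "$!" is the PID of the most recent background job; "wait" returns that
# job's exit status. run_and_wait is a hypothetical helper name.
run_and_wait() {
    nohup "$@" &
    wait "$!"
}

# Hypothetical usage from the etc directory:
#   run_and_wait run_decode.pl 0305/001 0305/001 1000 && parseDecode.pl decode.log hyp.trans
```

With this pattern parseDecode.pl starts only after run_decode.pl has exited successfully. The trade-off is that the terminal stays occupied for the whole decode, so the tail -f command in a second window is still the way to watch progress.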

When it's done, you can check the number (count) of lines with the command grep FWDVIT decode.log | wc -l while in the etc directory. Omit the -l flag if you also want to see the word and byte counts. As of Spring 2018, the number of lines should be 4172 for a 5hr train.
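The count check can be turned into a guard that refuses to continue when the decode is incomplete. This is a minimal sketch: decode_complete is a hypothetical helper, and it assumes, as the note above implies, that decode.log contains one FWDVIT line per decoded utterance.

```shell
# Succeed only when the log contains the expected number of FWDVIT lines.
# decode_complete is a hypothetical helper; the expected count is passed in
# by the caller (4172 for the Spring 2018 5hr corpus, per the note above).
decode_complete() {
    log=$1
    expected=$2
    # grep -c prints the number of matching lines; "|| true" keeps a zero
    # count from being treated as a failure.
    found=$(grep -c FWDVIT "$log" || true)
    echo "decoded $found of $expected utterances"
    [ "$found" -eq "$expected" ]
}

# Hypothetical usage from the etc directory:
#   decode_complete decode.log 4172 && parseDecode.pl decode.log hyp.trans
```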

Professor Jonas discovered this issue when we reported getting varying results; he said they should be the same no matter which drone we were using.