Speech:Spring 2014 Avengers Exps 012

Final Result Experiment
Author: Colby Date: 4/20/2014

Experiment
The script prepareExperiment.pl was used to set up the needed files. It creates symbolic links to the audio files needed for the data set passed in as a parameter, rather than copying the files, in an effort to reduce disk usage. The dictionary, train.trans, phonelist, filler dictionary, and train.fileids files are created by this script. makeFeats was used to create the feats for the experiment.
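The linking step can be illustrated with a minimal Python sketch. This is not the code of prepareExperiment.pl; the function name `link_audio`, its arguments, and the `.sph` extension are assumptions made for illustration only.

```python
import os
from pathlib import Path


def link_audio(fileids, corpus_dir, exp_wav_dir, ext=".sph"):
    """Symlink each listed audio file into the experiment directory.

    Hypothetical helper: the names and the ".sph" extension are assumptions,
    not taken from prepareExperiment.pl.
    """
    corpus_dir = Path(corpus_dir).resolve()  # absolute target so links resolve
    exp_wav_dir = Path(exp_wav_dir)
    exp_wav_dir.mkdir(parents=True, exist_ok=True)
    linked = []
    for fid in fileids:
        src = corpus_dir / (fid + ext)
        dst = exp_wav_dir / (fid + ext)
        # Skip entries that are already linked so the step can be re-run.
        if not dst.is_symlink() and not dst.exists():
            os.symlink(src, dst)
        linked.append(dst)
    return linked
```

A link only stores a path to the original audio, so the experiment directory stays small no matter how many hours of data it references.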


 * Train files
 * 125+ Hours of data from Conversation 3170 - the end of disk 22

The testing data was created using genTrans10.pl. Using this script, I generated fileids and trans files for two sets (test and test2).

 * Test files
 * 5 hours of testing data for the final result
 * Test data contains no bracketed words
 * Test2
 * 300 lines of the test data
 * Small subset for tuning purposes

I added the following lines to the filler dictionary:
 * [NOISE] +noise+
 * [LAUGHTER] +laugh+
 * [VOCALIZED-NOISE] +vocalnoise+

I added the following lines to the phonelist in alphabetical order:
 * +laugh+
 * +noise+
 * +vocalnoise+

Acoustic Model
The acoustic model was created using the 3170/train2 data set, located at /mnt/main/corpus/switchboard/3170/train, which contains 125+ hours of audio files. The dictionary used was the Switchboard corpus dictionary, located at /mnt/main/corpus/dist/custom/switchboard.dic. In the Sphinx configuration file I changed the density to 64 and the senone value to 8000.

Language Model
The language model was created using the test transcription created during setup. The 5 hours of recordings resulted in a trigram language model with a 3683-word vocabulary.

Decode
When decoding, I started with the default params to act as my baseline. I then tuned various beams and other parameters until I achieved an optimized WER and xRT ratio for our hardware setup. I used the 64 density AM when decoding and watched for a 1:1 ratio between loss of xRT and WER. I.e., if I gain 1% accuracy after a parameter change, I have to lose an xRT factor of 1. Using this method I plan to reduce xRT significantly while sacrificing only a small amount of accuracy.
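One possible reading of the 1:1 rule is that one percentage point of accuracy is exchanged against an xRT factor of 1. The sketch below encodes that reading only. The function names are hypothetical, and the rule as actually applied during tuning may have been looser or judged by eye.

```python
def exchange(base_acc, base_xrt, new_acc, new_xrt):
    """Return (accuracy change in points, xRT saved) relative to the baseline."""
    acc_change = new_acc - base_acc  # positive means more accurate
    xrt_saved = base_xrt - new_xrt   # positive means faster
    return acc_change, xrt_saved


def worth_keeping(base_acc, base_xrt, new_acc, new_xrt):
    """Assumed 1:1 rule: one accuracy point trades against one unit of xRT."""
    acc_change, xrt_saved = exchange(base_acc, base_xrt, new_acc, new_xrt)
    return acc_change + xrt_saved >= 0


# Default-Test2 vs. Pruned08, using accuracy and the first log part's total xClk.
print(worth_keeping(71.98, 8.40, 71.62, 5.55))
```

With the Default-Test2 and Pruned08 figures from the results on this page, about 0.36 points of accuracy are given up for about 2.85 xRT, so this reading of the rule would keep the change.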

Results
Each result will display the following information:
 * Non-default parameters used during Decode
 * Summary line of the 2 Log files from decoded parts
 * Score from .align file
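Since every run reports the same SUMMARY line format, the frame count and total xCPU/xClk figures can be pulled out of the logs automatically. The following is a minimal sketch that assumes the line layout shown in the logs on this page. `parse_summaries` is a hypothetical helper, not part of Sphinx.

```python
import re

# Matches the frame count and the trailing "tot:" figures of a SUMMARY line.
SUMMARY_RE = re.compile(
    r"SUMMARY:\s*(?P<frames>\d+) fr;.*?"
    r"tot:\s*(?P<xcpu>[\d.]+) xCPU,\s*(?P<xclk>[\d.]+) xClk"
)


def parse_summaries(text):
    """Return one dict per SUMMARY entry found in the given log text."""
    return [
        {
            "frames": int(m.group("frames")),
            "xcpu": float(m.group("xcpu")),
            "xclk": float(m.group("xclk")),
        }
        for m in SUMMARY_RE.finditer(text)
    ]
```

Because it uses `finditer`, the helper also copes with two SUMMARY entries that end up on the same line.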

Default-Test2
Density 64:

Params: Default Params

Logs: INFO: stat.c(206): SUMMARY: 38611 fr;  4779 cdsen/fr, 129 cisen/fr, 305365 cdgau/fr, 8256 cigau/fr, 7.78 xCPU 7.79 xClk [Ovhrd 0.21 xCPU  0 xClk];  5586 hmm/fr, 39 wd/fr, 0.60 xCPU 0.60 xClk;  tot: 8.39 xCPU, 8.40 xClk
 * logdir/decode_64.default/012-1-2.log

INFO: stat.c(206): SUMMARY: 35116 fr;  5050 cdsen/fr, 129 cisen/fr, 322651 cdgau/fr, 8256 cigau/fr, 8.30 xCPU 8.31 xClk [Ovhrd 0.21 xCPU  0 xClk];  6236 hmm/fr, 47 wd/fr, 0.70 xCPU 0.70 xClk;  tot: 9.01 xCPU, 9.02 xClk
 * logdir/decode_64.default/012-2-2.log

Result: TOTAL Words: 1970 Correct: 1710 Errors: 552 TOTAL Percent correct = 86.80% Error = 28.02% Accuracy = 71.98% TOTAL Insertions: 292 Deletions: 59 Substitutions: 201
 * result_64.default/012.align
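The figures on a Result line are related by simple arithmetic, which explains why "Percent correct" and "Accuracy" differ: insertions count as errors but do not reduce the number of correct words. The sketch below recomputes the Default-Test2 scores from the raw counts. The function name `score` is hypothetical.

```python
def score(total_words, correct, insertions, deletions, substitutions):
    """Recompute the Result line percentages from the raw counts."""
    errors = insertions + deletions + substitutions
    return {
        "errors": errors,
        "percent_correct": round(100.0 * correct / total_words, 2),
        "error": round(100.0 * errors / total_words, 2),
        "accuracy": round(100.0 * (total_words - errors) / total_words, 2),
    }


# Default-Test2 counts from the Result line above.
print(score(1970, 1710, 292, 59, 201))
```

For this run the counts give 552 errors, 86.80% correct, 28.02% error, and 71.98% accuracy, matching the reported line. Note also that correct + deletions + substitutions equals the total word count (1710 + 59 + 201 = 1970).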

Pruned01
Density 64:

Params:
 * etc/sphinx_decode.cfg Params
 * $DEC_CFG_BEAMWIDTH = "1e-60";
 * $DEC_CFG_PBEAM ="1e-60";


 * scripts_pl/decode/s3decode.pl Params
 * -beam => $ST::DEC_CFG_BEAMWIDTH,
 * -pbeam => $ST::DEC_CFG_PBEAM,

Logs: INFO: stat.c(206): SUMMARY: 38611 fr;  4266 cdsen/fr, 129 cisen/fr, 272605 cdgau/fr, 8256 cigau/fr, 7.00 xCPU 7.00 xClk [Ovhrd 0.21 xCPU  0 xClk];  3890 hmm/fr, 39 wd/fr, 0.51 xCPU 0.51 xClk;  tot: 7.52 xCPU, 7.52 xClk
 * logdir/decode_64.pruned01/012-1-2.log

INFO: stat.c(206): SUMMARY: 35116 fr;  4564 cdsen/fr, 129 cisen/fr, 291656 cdgau/fr, 8256 cigau/fr, 7.52 xCPU 7.52 xClk [Ovhrd 0.21 xCPU  0 xClk];  4438 hmm/fr, 46 wd/fr, 0.61 xCPU 0.61 xClk;  tot: 8.15 xCPU, 8.15 xClk
 * logdir/decode_64.pruned01/012-2-2.log

Result: TOTAL Words: 1970 Correct: 1711 Errors: 551 TOTAL Percent correct = 86.85% Error = 27.97% Accuracy = 72.03% TOTAL Insertions: 292 Deletions: 56 Substitutions: 203
 * result_64.pruned01/012.align

Pruned02
Density 64:

Params:
 * etc/sphinx_decode.cfg Params
 * $DEC_CFG_BEAMWIDTH = "1e-55";
 * $DEC_CFG_PBEAM ="1e-55";


 * scripts_pl/decode/s3decode.pl Params
 * -beam => $ST::DEC_CFG_BEAMWIDTH,
 * -pbeam => $ST::DEC_CFG_PBEAM,

Logs: INFO: stat.c(206): SUMMARY: 38611 fr;  3708 cdsen/fr, 129 cisen/fr, 236989 cdgau/fr, 8256 cigau/fr, 6.11 xCPU 6.11 xClk [Ovhrd 0.20 xCPU  0 xClk];  2874 hmm/fr, 39 wd/fr, 0.40 xCPU 0.40 xClk;  tot: 6.52 xCPU, 6.52 xClk
 * logdir/decode_64.pruned02/012-1-2.log

INFO: stat.c(206): SUMMARY: 35116 fr;  3990 cdsen/fr, 129 cisen/fr, 255005 cdgau/fr, 8256 cigau/fr, 6.62 xCPU 6.63 xClk [Ovhrd 0.21 xCPU  0 xClk];  3266 hmm/fr, 46 wd/fr, 0.48 xCPU 0.48 xClk;  tot: 7.12 xCPU, 7.12 xClk
 * logdir/decode_64.pruned02/012-2-2.log

Result: TOTAL Words: 1970 Correct: 1703 Errors: 558 TOTAL Percent correct = 86.45% Error = 28.32% Accuracy = 71.68% TOTAL Insertions: 291 Deletions: 58 Substitutions: 209
 * result_64.pruned02/012.align

Pruned03
Density 64:

Params:
 * etc/sphinx_decode.cfg Params
 * $DEC_CFG_BEAMWIDTH = "1e-56";
 * $DEC_CFG_PBEAM ="1e-56";


 * scripts_pl/decode/s3decode.pl Params
 * -beam => $ST::DEC_CFG_BEAMWIDTH,
 * -pbeam => $ST::DEC_CFG_PBEAM,

Logs: INFO: stat.c(206): SUMMARY: 38611 fr;  3824 cdsen/fr, 129 cisen/fr, 244447 cdgau/fr, 8256 cigau/fr, 6.31 xCPU 6.31 xClk [Ovhrd 0.21 xCPU  0 xClk];  3065 hmm/fr, 39 wd/fr, 0.42 xCPU 0.43 xClk;  tot: 6.75 xCPU, 6.75 xClk
 * logdir/decode_64.pruned03/012-1-2.log

INFO: stat.c(206): SUMMARY: 35116 fr;  4111 cdsen/fr, 129 cisen/fr, 262762 cdgau/fr, 8256 cigau/fr, 6.82 xCPU 6.84 xClk [Ovhrd 0.21 xCPU  0 xClk];  3488 hmm/fr, 46 wd/fr, 0.52 xCPU 0.52 xClk;  tot: 7.35 xCPU, 7.36 xClk
 * logdir/decode_64.pruned03/012-2-2.log

Result: TOTAL Words: 1970 Correct: 1706 Errors: 555 TOTAL Percent correct = 86.60% Error = 28.17% Accuracy = 71.83% TOTAL Insertions: 291 Deletions: 57 Substitutions: 207
 * result_64.pruned03/012.align

Pruned04
Density 64:

Params:
 * etc/sphinx_decode.cfg Params
 * $DEC_CFG_BEAMWIDTH = "1e-54";
 * $DEC_CFG_PBEAM ="1e-54";


 * scripts_pl/decode/s3decode.pl Params
 * -beam => $ST::DEC_CFG_BEAMWIDTH,
 * -pbeam => $ST::DEC_CFG_PBEAM,

Logs: INFO: stat.c(206): SUMMARY: 38611 fr;  3588 cdsen/fr, 129 cisen/fr, 229377 cdgau/fr, 8256 cigau/fr, 5.94 xCPU 5.95 xClk [Ovhrd 0.21 xCPU  0 xClk];  2687 hmm/fr, 38 wd/fr, 0.38 xCPU 0.38 xClk;  tot: 6.34 xCPU, 6.34 xClk
 * logdir/decode_64.pruned04/012-1-2.log

INFO: stat.c(206): SUMMARY: 35116 fr;  3867 cdsen/fr, 129 cisen/fr, 247148 cdgau/fr, 8256 cigau/fr, 6.46 xCPU 6.47 xClk [Ovhrd 0.21 xCPU  0 xClk];  3052 hmm/fr, 46 wd/fr, 0.47 xCPU 0.47 xClk;  tot: 6.94 xCPU, 6.95 xClk
 * logdir/decode_64.pruned04/012-2-2.log

Result: TOTAL Words: 1970 Correct: 1703 Errors: 560 TOTAL Percent correct = 86.45% Error = 28.43% Accuracy = 71.57% TOTAL Insertions: 293 Deletions: 58 Substitutions: 209
 * result_64.pruned04/012.align

Pruned05
Density 64:

Params:
 * etc/sphinx_decode.cfg Params
 * $DEC_CFG_BEAMWIDTH = "1e-53";
 * $DEC_CFG_PBEAM ="1e-53";


 * scripts_pl/decode/s3decode.pl Params
 * -beam => $ST::DEC_CFG_BEAMWIDTH,
 * -pbeam => $ST::DEC_CFG_PBEAM,

Logs: INFO: stat.c(206): SUMMARY: 38611 fr;  3469 cdsen/fr, 129 cisen/fr, 221757 cdgau/fr, 8256 cigau/fr, 5.76 xCPU 5.76 xClk [Ovhrd 0.21 xCPU  0 xClk];  2510 hmm/fr, 38 wd/fr, 0.36 xCPU 0.37 xClk;  tot: 6.13 xCPU, 6.13 xClk
 * logdir/decode_64.pruned05/012-1-2.log

INFO: stat.c(206): SUMMARY: 35116 fr;  3741 cdsen/fr, 129 cisen/fr, 239115 cdgau/fr, 8256 cigau/fr, 6.25 xCPU 6.26 xClk [Ovhrd 0.21 xCPU  0 xClk];  2847 hmm/fr, 46 wd/fr, 0.45 xCPU 0.45 xClk;  tot: 6.72 xCPU, 6.72 xClk
 * logdir/decode_64.pruned05/012-2-2.log

Result: TOTAL Words: 1970 Correct: 1705 Errors: 557 TOTAL Percent correct = 86.55% Error = 28.27% Accuracy = 71.73% TOTAL Insertions: 292 Deletions: 56 Substitutions: 209
 * result_64.pruned05/012.align

Pruned06
Density 64:

Params:
 * etc/sphinx_decode.cfg Params
 * $DEC_CFG_BEAMWIDTH = "1e-52";
 * $DEC_CFG_PBEAM ="1e-52";


 * scripts_pl/decode/s3decode.pl Params
 * -beam => $ST::DEC_CFG_BEAMWIDTH,
 * -pbeam => $ST::DEC_CFG_PBEAM,

Logs: INFO: stat.c(206): SUMMARY: 38611 fr;  3347 cdsen/fr, 129 cisen/fr, 213937 cdgau/fr, 8256 cigau/fr, 5.60 xCPU 5.60 xClk [Ovhrd 0.21 xCPU  0 xClk];  2336 hmm/fr, 38 wd/fr, 0.35 xCPU 0.35 xClk;  tot: 5.95 xCPU, 5.96 xClk
 * logdir/decode_64.pruned06/012-1-2.log

INFO: stat.c(206): SUMMARY: 35116 fr;  3614 cdsen/fr, 129 cisen/fr, 230995 cdgau/fr, 8256 cigau/fr, 6.07 xCPU 6.08 xClk [Ovhrd 0.21 xCPU  0 xClk];  2648 hmm/fr, 46 wd/fr, 0.42 xCPU 0.43 xClk;  tot: 6.51 xCPU, 6.52 xClk
 * logdir/decode_64.pruned06/012-2-2.log

Result: TOTAL Words: 1970 Correct: 1704 Errors: 559 TOTAL Percent correct = 86.50% Error = 28.38% Accuracy = 71.62% TOTAL Insertions: 293 Deletions: 56 Substitutions: 210
 * result_64.pruned06/012.align

Pruned07
Density 64:

Params:
 * etc/sphinx_decode.cfg Params
 * $DEC_CFG_BEAMWIDTH = "1e-51";
 * $DEC_CFG_PBEAM ="1e-51";


 * scripts_pl/decode/s3decode.pl Params
 * -beam => $ST::DEC_CFG_BEAMWIDTH,
 * -pbeam => $ST::DEC_CFG_PBEAM,

Logs: INFO: stat.c(206): SUMMARY: 38611 fr;  3223 cdsen/fr, 129 cisen/fr, 206042 cdgau/fr, 8256 cigau/fr, 5.42 xCPU 5.42 xClk [Ovhrd 0.21 xCPU  0 xClk];  2170 hmm/fr, 38 wd/fr, 0.33 xCPU 0.33 xClk;  tot: 5.76 xCPU, 5.76 xClk
 * logdir/decode_64.pruned07/012-1-2.log

INFO: stat.c(206): SUMMARY: 35116 fr;  3484 cdsen/fr, 129 cisen/fr, 222714 cdgau/fr, 8256 cigau/fr, 5.86 xCPU 5.86 xClk [Ovhrd 0.21 xCPU  0 xClk];  2456 hmm/fr, 46 wd/fr, 0.40 xCPU 0.40 xClk;  tot: 6.27 xCPU, 6.27 xClk
 * logdir/decode_64.pruned07/012-2-2.log

Result: TOTAL Words: 1970 Correct: 1704 Errors: 556 TOTAL Percent correct = 86.50% Error = 28.22% Accuracy = 71.78% TOTAL Insertions: 290 Deletions: 56 Substitutions: 210
 * result_64.pruned07/012.align

Pruned08
Density 64:

Params:
 * etc/sphinx_decode.cfg Params
 * $DEC_CFG_BEAMWIDTH = "1e-50";
 * $DEC_CFG_PBEAM ="1e-50";


 * scripts_pl/decode/s3decode.pl Params
 * -beam => $ST::DEC_CFG_BEAMWIDTH,
 * -pbeam => $ST::DEC_CFG_PBEAM,

Logs: INFO: stat.c(206): SUMMARY: 38611 fr;  3098 cdsen/fr, 129 cisen/fr, 198065 cdgau/fr, 8256 cigau/fr, 5.23 xCPU 5.23 xClk [Ovhrd 0.21 xCPU  0 xClk];  2011 hmm/fr, 38 wd/fr, 0.31 xCPU 0.31 xClk;  tot: 5.55 xCPU, 5.55 xClk
 * logdir/decode_64.pruned08/012-1-2.log

INFO: stat.c(206): SUMMARY: 35116 fr;  3350 cdsen/fr, 129 cisen/fr, 214171 cdgau/fr, 8256 cigau/fr, 5.64 xCPU 5.64 xClk [Ovhrd 0.21 xCPU  0 xClk];  2270 hmm/fr, 46 wd/fr, 0.38 xCPU 0.38 xClk;  tot: 6.04 xCPU, 6.04 xClk
 * logdir/decode_64.pruned08/012-2-2.log

Result: TOTAL Words: 1970 Correct: 1698 Errors: 559 TOTAL Percent correct = 86.19% Error = 28.38% Accuracy = 71.62% TOTAL Insertions: 287 Deletions: 60 Substitutions: 212
 * result_64.pruned08/012.align

Pruned09
Density 64:

Params:
 * etc/sphinx_decode.cfg Params
 * $DEC_CFG_BEAMWIDTH = "1e-49";
 * $DEC_CFG_PBEAM ="1e-49";


 * scripts_pl/decode/s3decode.pl Params
 * -beam => $ST::DEC_CFG_BEAMWIDTH,
 * -pbeam => $ST::DEC_CFG_PBEAM,

Logs: INFO: stat.c(206): SUMMARY: 38611 fr;  2975 cdsen/fr, 129 cisen/fr, 190182 cdgau/fr, 8256 cigau/fr, 5.03 xCPU 5.04 xClk [Ovhrd 0.21 xCPU  0 xClk];  1862 hmm/fr, 38 wd/fr, 0.30 xCPU 0.30 xClk;  tot: 5.34 xCPU, 5.34 xClk
 * logdir/decode_64.pruned09/012-1-2.log

INFO: stat.c(206): SUMMARY: 35116 fr;  3217 cdsen/fr, 129 cisen/fr, 205694 cdgau/fr, 8256 cigau/fr, 5.42 xCPU 5.43 xClk [Ovhrd 0.21 xCPU  0 xClk];  2093 hmm/fr, 46 wd/fr, 0.36 xCPU 0.36 xClk;  tot: 5.80 xCPU, 5.80 xClk
 * logdir/decode_64.pruned09/012-2-2.log

Result: TOTAL Words: 1970 Correct: 1695 Errors: 564 TOTAL Percent correct = 86.04% Error = 28.63% Accuracy = 71.37% TOTAL Insertions: 289 Deletions: 60 Substitutions: 215
 * result_64.pruned09/012.align

Pruned10
Density 64:

Params:
 * etc/sphinx_decode.cfg Params
 * $DEC_CFG_BEAMWIDTH = "1e-50";
 * $DEC_CFG_PBEAM = "1e-50";
 * $DEC_CFG_WORDBEAM = "1e-39";


 * scripts_pl/decode/s3decode.pl Params
 * -beam => $ST::DEC_CFG_BEAMWIDTH,
 * -pbeam => $ST::DEC_CFG_PBEAM,
 * -wbeam => $ST::DEC_CFG_WORDBEAM,

Logs: INFO: stat.c(206): SUMMARY: 38611 fr;  3095 cdsen/fr, 129 cisen/fr, 197843 cdgau/fr, 8256 cigau/fr, 5.21 xCPU 5.22 xClk [Ovhrd 0.21 xCPU  0 xClk];  2008 hmm/fr, 36 wd/fr, 0.30 xCPU 0.30 xClk;  tot: 5.53 xCPU, 5.53 xClk
 * logdir/decode_64.pruned10/012-1-2.log

INFO: stat.c(206): SUMMARY: 35116 fr;  3348 cdsen/fr, 129 cisen/fr, 214016 cdgau/fr, 8256 cigau/fr, 5.64 xCPU 5.65 xClk [Ovhrd 0.21 xCPU  0 xClk];  2268 hmm/fr, 44 wd/fr, 0.37 xCPU 0.37 xClk;  tot: 6.02 xCPU, 6.02 xClk
 * logdir/decode_64.pruned10/012-2-2.log

Result: TOTAL Words: 1970 Correct: 1699 Errors: 558 TOTAL Percent correct = 86.24% Error = 28.32% Accuracy = 71.68% TOTAL Insertions: 287 Deletions: 59 Substitutions: 212
 * result_64.pruned10/012.align

Pruned11
Density 64:

Params:
 * etc/sphinx_decode.cfg Params
 * $DEC_CFG_BEAMWIDTH = "1e-50";
 * $DEC_CFG_PBEAM = "1e-50";
 * $DEC_CFG_WORDBEAM = "1e-38";


 * scripts_pl/decode/s3decode.pl Params
 * -beam => $ST::DEC_CFG_BEAMWIDTH,
 * -pbeam => $ST::DEC_CFG_PBEAM,
 * -wbeam => $ST::DEC_CFG_WORDBEAM,

Logs: INFO: stat.c(206): SUMMARY: 38611 fr;  3091 cdsen/fr, 129 cisen/fr, 197594 cdgau/fr, 8256 cigau/fr, 5.20 xCPU 5.20 xClk [Ovhrd 0.21 xCPU  0 xClk];  2006 hmm/fr, 34 wd/fr, 0.29 xCPU 0.29 xClk;  tot: 5.49 xCPU, 5.49 xClk
 * logdir/decode_64.pruned11/012-1-2.log

INFO: stat.c(206): SUMMARY: 35116 fr;  3345 cdsen/fr, 129 cisen/fr, 213823 cdgau/fr, 8256 cigau/fr, 5.67 xCPU 5.68 xClk [Ovhrd 0.21 xCPU  0 xClk];  2266 hmm/fr, 41 wd/fr, 0.35 xCPU 0.35 xClk;  tot: 6.04 xCPU, 6.05 xClk
 * logdir/decode_64.pruned11/012-2-2.log

Result: TOTAL Words: 1970 Correct: 1700 Errors: 558 TOTAL Percent correct = 86.29% Error = 28.32% Accuracy = 71.68% TOTAL Insertions: 288 Deletions: 57 Substitutions: 213
 * result_64.pruned11/012.align

Pruned12
Density 64: USING TEST (5+ HOURS)

Params:
 * etc/sphinx_decode.cfg Params
 * $DEC_CFG_LANGUAGEMODEL_DIR = "$DEC_CFG_BASE_DIR/LM";
 * $DEC_CFG_LANGUAGEMODEL = "$DEC_CFG_LANGUAGEMODEL_DIR/tmp.arpa";
 * $DEC_CFG_LANGUAGEWEIGHT = "11";
 * $DEC_CFG_WORDPENALTY = "0.7";
 * $DEC_CFG_BEAMWIDTH = "1e-50";
 * $DEC_CFG_PBEAM ="1e-50";
 * $DEC_CFG_WORDBEAM = "1e-30";
 * $DEC_CFG_MAXHMMPF = "2000";
 * $DEC_CFG_CIPBEAM = "1e-7";
 * $DEC_CFG_MAXCDSENPF = "2750";
 * $DEC_CFG_MAXWPF = "10";


 * scripts_pl/decode/s3decode.pl Params
 * -lw => $ST::DEC_CFG_LANGUAGEWEIGHT ,
 * -beam => $ST::DEC_CFG_BEAMWIDTH,
 * -pbeam => $ST::DEC_CFG_PBEAM,
 * -wbeam => $ST::DEC_CFG_WORDBEAM,
 * -maxhmmpf => $ST::DEC_CFG_MAXHMMPF,
 * -ci_pbeam => $ST::DEC_CFG_CIPBEAM,
 * -maxcdsenpf => $ST::DEC_CFG_MAXCDSENPF,
 * -maxwpf => $ST::DEC_CFG_MAXWPF,
 * -wip => $ST::DEC_CFG_WORDPENALTY

Logs: INFO: stat.c(206): SUMMARY: 983981 fr;  2061 cdsen/fr, 129 cisen/fr, 132017 cdgau/fr, 8256 cigau/fr, 3.89 xCPU 3.89 xClk [Ovhrd 0.26 xCPU  0 xClk];  1732 hmm/fr, 16 wd/fr, 0.28 xCPU 0.28 xClk;  tot: 4.18 xCPU, 4.18 xClk
 * logdir/decode_64.pruned12/012-1-2.log

INFO: stat.c(206): SUMMARY: 941489 fr;  2090 cdsen/fr, 129 cisen/fr, 133903 cdgau/fr, 8256 cigau/fr, 4.00 xCPU 4.00 xClk [Ovhrd 0.25 xCPU  0 xClk];  1774 hmm/fr, 17 wd/fr, 0.29 xCPU 0.29 xClk;  tot: 4.30 xCPU, 4.30 xClk
 * logdir/decode_64.pruned12/012-2-2.log

Result: TOTAL Words: 48386 Correct: 38877 Errors: 19770 TOTAL Percent correct = 80.35% Error = 40.86% Accuracy = 59.14% TOTAL Insertions: 10261 Deletions: 2237 Substitutions: 7272
 * result_64.pruned12/012.align
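The two log parts of a run can be combined into a single figure. The sketch below assumes the Sphinx default frame rate of 100 frames per second to convert frame counts into audio duration, and weights each part's total xClk by its frame count. Both helper names are hypothetical, and the frame rate is an assumption because the page does not state it.

```python
FRAMES_PER_SECOND = 100  # assumption: Sphinx default frame rate (10 ms shift)


def audio_hours(frame_counts, frames_per_second=FRAMES_PER_SECOND):
    """Convert decoded frame counts into hours of audio."""
    return sum(frame_counts) / frames_per_second / 3600.0


def weighted_xrt(parts):
    """Frame-weighted average xRT over (frames, xrt) pairs."""
    parts = list(parts)
    total_frames = sum(frames for frames, _ in parts)
    return sum(frames * xrt for frames, xrt in parts) / total_frames


# Pruned12: frame counts and total xClk from the two log parts above.
pruned12 = [(983981, 4.18), (941489, 4.30)]
print(f"{audio_hours(frames for frames, _ in pruned12):.2f} hours")
print(f"{weighted_xrt(pruned12):.2f} xClk")
```

Under that assumption, the Pruned12 logs cover about 5.35 hours of audio, consistent with the "5+ hours" test set, at a combined rate of about 4.24 xClk.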