Speech:Exps 0181

Description
Author: Colby Johnson

Date: 17Feb2014

Purpose: The goal of this experiment is to compare its results with those from Exp 0180 and determine how accuracy shifts with longer trains.

Details: This experiment uses a better configuration, derived from the results of previous experiments and a few online resources. Given the vocabulary size in our train.trans files, we should apparently be using higher senone values, which should yield better results for longer trains. I will first attempt to train on the first 5_hrs using a higher senone value and a density of 32 (64 yielded better results but took much longer). My research indicates that trains of 5-10 hours should use a density of 16 at most and should yield a lower WER (ideally around 10%; we get about 30%).

Corpus/Switchboard:
 * 10hr/train

Sphinx_train.cfg:
 * Senone value: TBD
 * Density: 32
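
A minimal sketch of how the density setting could be changed in the config text with a script rather than by hand. It assumes the file uses the usual SphinxTrain Perl variables `$CFG_N_TIED_STATES` (senones) and `$CFG_FINAL_NUM_DENSITIES` (density); the helper `set_cfg_value` and the sample values 200 and 8 are hypothetical placeholders, not taken from this experiment. The senone value is still TBD, so only the density is touched here.

```python
import re


def set_cfg_value(cfg_text: str, name: str, value: int) -> str:
    """Replace the value assigned to a Perl scalar such as $CFG_FINAL_NUM_DENSITIES.

    Hypothetical helper: works on the config text, not on a file on disk.
    """
    pattern = re.compile(
        r"^(\s*\$" + re.escape(name) + r"\s*=\s*)[^;]*(;)", re.MULTILINE
    )
    new_text, count = pattern.subn(
        lambda m: f"{m.group(1)}{value}{m.group(2)}", cfg_text
    )
    if count == 0:
        raise KeyError(f"{name} not found in configuration text")
    return new_text


# Placeholder config fragment (values are illustrative, not from this experiment).
sample = "$CFG_N_TIED_STATES = 200;\n$CFG_FINAL_NUM_DENSITIES = 8;\n"

# Set the density to the value listed for this experiment.
updated = set_cfg_value(sample, "CFG_FINAL_NUM_DENSITIES", 32)
print(updated)
```

The senone line is left untouched until a value has been chosen.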

Dictionary: Several dictionaries had to be merged to get a complete list of words
 * Cmudict0.7a
 * /mnt/main/corpus/dist/cmudict.0.7a
 * first_5hr_train_full "Master Dictionary"
 * /mnt/main/corpus/dist/custom/10hr.dic
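
The merge of the dictionaries listed above could look roughly like the sketch below. It assumes each dictionary line has the form `WORD PH1 PH2 ...`, that alternate pronunciations are written as `WORD(2)`, and that comment lines start with `;;;`. The function name `merge_dictionaries` and the sample entries are hypothetical, and the script actually used for this experiment may have worked differently.

```python
def merge_dictionaries(*sources):
    """Merge pronunciation entries from several dictionaries.

    Each source is an iterable of lines. When the same word key appears
    in more than one source, the entry from the earliest source is kept.
    """
    merged = {}
    for lines in sources:
        for line in lines:
            line = line.strip()
            # Skip blank lines and comment lines.
            if not line or line.startswith(";;;"):
                continue
            parts = line.split(None, 1)
            if len(parts) != 2:
                continue  # a word with no phones is not a usable entry
            word, phones = parts
            # Normalise whitespace between phones; keep the first entry seen.
            merged.setdefault(word, " ".join(phones.split()))
    return [f"{word} {phones}" for word, phones in sorted(merged.items())]


# Illustrative entries only, not copied from the real dictionaries.
base = ["HELLO HH AH L OW", "HELLO(2) HH EH L OW", "WORLD W ER L D"]
custom = ["WORLD  W ER L D", "UH-HUH AH HH AH"]

merged = merge_dictionaries(base, custom)
for entry in merged:
    print(entry)
```

In practice the line lists would come from reading the dictionary files, and the merged list would be written back out as the master dictionary.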

GenTrans:
 * genTrans5.pl

Results

Training ran in: 5 Hours 46 Min

Decode ran in: 87504 Sec (24.3 Hours)

SYSTEM SUMMARY PERCENTAGES by SPEAKER

,-----------------------------------------------------------------.
|                            hyp.trans                            |
|-----------------------------------------------------------------|
| SPKR    | # Snt # Wrd | Corr    Sub    Del    Ins    Err  S.Err |
|=================================================================|
| Sum/Avg | 8859  134105| 90.8    6.1    3.1   12.8   22.0   87.6 |
|=================================================================|
|  Mean   | 43.9  663.9 | 90.7    6.3    3.0   13.8   23.1   88.0 |
|  S.D.   | 19.4  286.0 |  3.8    2.9    1.6    7.1    9.5    8.3 |
| Median  | 40.0  595.5 | 91.5    5.8    2.7   13.0   21.6   88.9 |
`-----------------------------------------------------------------'
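
As a quick sanity check on the Sum/Avg row, the short sketch below recomputes the error figures from the reported percentages. In the scoring summary, Err is the sum of substitutions, deletions, and insertions, and Corr is what remains after substitutions and deletions. The variable names are only illustrative.

```python
# Percentages from the Sum/Avg row of the summary.
sub, dele, ins = 6.1, 3.1, 12.8

# Word error rate: substitutions + deletions + insertions.
err = round(sub + dele + ins, 1)

# Percent correct: reference words that were neither substituted nor deleted.
corr = round(100 - sub - dele, 1)

# Decode time converted from seconds to hours.
decode_hours = round(87504 / 3600, 1)

print(f"Err  = {err}%")
print(f"Corr = {corr}%")
print(f"Decode time = {decode_hours} hours")
```

Insertions (12.8%) account for more than half of the 22.0% error, so Corr looks high even though the overall WER is well above the 10% target mentioned in the details.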