Speech:Exps 0180

Description
Author: Colby Johnson

Date: 17Feb2014

Purpose: Compare the results of this experiment with those of Exp 0181 and determine how accuracy shifts with longer trains.

Details: This experiment uses a configuration tuned from the results of previous experiments and a few online resources. Given the vocabulary size in our train.trans files, we should apparently be using higher senone values, which should also yield better results for longer experiments. I will first train on the first_5hr corpus using a higher senone value and a density of 32 (64 yielded better results but took much longer). My research also indicates that trains of 5-10 hours should use a density of at most 16 and should yield a lower WER than we currently see (ideally around 10%; we get about 30%).

Corpus/Switchboard:
 * first_5hr/train

Sphinx_train.cfg:
 * Senone value: 5000
 * Density: 32
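
As a quick sanity check before a long train, the two settings above can be read back out of the config file. The sketch below is only an illustration in Python: it assumes that "Senone value" corresponds to `$CFG_N_TIED_STATES` and "Density" to `$CFG_FINAL_NUM_DENSITIES` in sphinx_train.cfg, and the `SAMPLE_CFG` text and `read_cfg_int` helper are hypothetical, not part of this experiment's files.

```python
import re

# Hypothetical two-line excerpt standing in for a real sphinx_train.cfg.
# Assumption: these variables hold the senone count and the Gaussian density.
SAMPLE_CFG = """\
$CFG_N_TIED_STATES = 5000;
$CFG_FINAL_NUM_DENSITIES = 32;
"""


def read_cfg_int(cfg_text, name):
    """Return the integer assigned to $<name> in Perl-style config text, or None."""
    pattern = r"^\s*\$" + re.escape(name) + r"\s*=\s*(\d+)\s*;"
    match = re.search(pattern, cfg_text, re.MULTILINE)
    return int(match.group(1)) if match else None


senones = read_cfg_int(SAMPLE_CFG, "CFG_N_TIED_STATES")
density = read_cfg_int(SAMPLE_CFG, "CFG_FINAL_NUM_DENSITIES")
print("senones:", senones, "density:", density)
```

To check a real config, replace `SAMPLE_CFG` with the contents of the file read from disk.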

Dictionary:
 * first_5hr_train_full "Master Dictionary"
 * /mnt/main/corpus/dist/custom/first_5hr_train_full.dic

GenTrans:
 * genTrans5.pl

Results
Training ran in: 2 Hours 19 Min

Decode ran in: 29990 Sec (8.33 Hours)

SYSTEM SUMMARY PERCENTAGES by SPEAKER

,-----------------------------------------------------------------.
|                            hyp.trans                            |
|-----------------------------------------------------------------|
| SPKR    | # Snt # Wrd | Corr    Sub    Del    Ins    Err  S.Err |
|=================================================================|
| Sum/Avg | 4659  68616 | 91.8    4.6    3.6   10.7   19.0   78.9 |
|=================================================================|
|  Mean   | 58.2  857.7 | 91.7    4.8    3.5   11.8   20.0   80.6 |
|  S.D.   | 22.1  330.0 |  3.2    2.0    2.2    6.1    7.3   10.2 |
| Median  | 55.5  813.0 | 92.3    4.5    2.8   11.1   19.6   82.7 |
`-----------------------------------------------------------------'
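
The columns of the summary are related by simple arithmetic: the Err column (the WER) is the sum of the substitution, deletion and insertion rates, and Corr is 100 minus substitutions and deletions. A minimal Python sketch of that check on the Sum/Avg row follows; the function names are illustrative only, and the tolerance assumes each reported value was rounded to one decimal place.

```python
# Sum/Avg row of the summary, as percentages of reference words.
sum_avg = {"Corr": 91.8, "Sub": 4.6, "Del": 3.6, "Ins": 10.7, "Err": 19.0}


def word_error_rate(sub, dele, ins):
    """WER (the Err column) is substitutions + deletions + insertions."""
    return sub + dele + ins


def percent_correct(sub, dele):
    """Corr is 100 minus substitutions and deletions; insertions do not count."""
    return 100.0 - sub - dele


wer = word_error_rate(sum_avg["Sub"], sum_avg["Del"], sum_avg["Ins"])
corr = percent_correct(sum_avg["Sub"], sum_avg["Del"])

# Four values rounded to one decimal can disagree by up to 0.2 in total.
TOLERANCE = 0.2 + 1e-9
wer_consistent = abs(wer - sum_avg["Err"]) <= TOLERANCE
corr_consistent = abs(corr - sum_avg["Corr"]) <= TOLERANCE

print(f"WER from components: {wer:.1f}% (reported {sum_avg['Err']}%)")
print(f"Corr from components: {corr:.1f}% (reported {sum_avg['Corr']}%)")
```

Insertions make up more than half of the total error in this row (10.7 of 19.0 points), which is why Corr can be 91.8% while the WER is still 19.0%.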

Successful Completion