Speech:Exps 0288 002

Description
Authors: Jon Shallow

Date: 4/03/16

Purpose: In the previous experiment, we set FINAL NUM DENSITIES to 32. In this experiment we kept the same configuration but set FINAL NUM DENSITIES to 64. The recommended value when training on more than 100 hours of audio is 32. The higher this number, the more finely the acoustic model can discriminate between sounds. By raising the value, we hoped to obtain a better WER.
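
To give a rough sense of what doubling the density count means for model size, the sketch below multiplies the senone count from the train configuration (8000) by the densities per senone. It assumes a continuous model in which every senone carries its own Gaussian mixture, which this page does not state explicitly; the function name `total_gaussians` is made up for illustration.

```python
# Rough model-size comparison, assuming each senone has its own mixture
# of Gaussian densities (an assumption, not confirmed by this page).

N_TIED_STATES = 8000  # $CFG_N_TIED_STATES from the train configuration


def total_gaussians(senones: int, densities_per_senone: int) -> int:
    """Return the total number of Gaussian densities in the model."""
    return senones * densities_per_senone


previous = total_gaussians(N_TIED_STATES, 32)  # previous experiment
current = total_gaussians(N_TIED_STATES, 64)   # this experiment

print("32 densities:", previous)
print("64 densities:", current)
print("ratio:", current / previous)
```

Under that assumption, the model in this experiment has twice as many Gaussians to estimate from the same 300 hours of training data.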

Details:
 * Train configuration
   * Corpus: Switchboard
   * Length: 300hr
   * $CFG_VARNORM = 'no' (variance normalization)
   * $CFG_FINAL_NUM_DENSITIES = 64 (density)
   * $CFG_N_TIED_STATES = 8000 (senones)
   * $CFG_CONVERGENCE_RATIO = 0.004 (convergence ratio)
 * Timeline
   * generateFeats start: 2:11 PM 4/3/16
   * generateFeats end: 3:21 PM 4/3/16
   * Train start: 3:23 PM 4/3/16
   * Train end: 11:31 AM 4/8/16 (200 errors in 002.html)
   * Decode on seen start: 7:13 PM 4/8/16
   * Decode on seen end: 2:02 PM 4/9/16 (total time: 18:49)
   * Decode on unseen start: 2:29 PM 4/13/16
   * Decode on unseen end: 11:14 AM 4/14/16 (total time: 20:45)
 * Decode configuration
   * Seen decode
     * Decoding on: /mnt/main/corpus/switchboard/300hr/test/trans/train.trans (~5 hours)
     * Decoding at: 8000 senones to match the senone count in the train configuration
   * Unseen decode (dev.trans)
     * Decoding on: /mnt/main/corpus/switchboard/300hr/test/trans/dev.trans (~5 hours)
     * Decoding at: 8000 senones to match the senone count in the train configuration
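
The timeline entries can be cross-checked with a short script. This is a minimal sketch, assuming the timestamps are local times in month/day/year order; the helper names (`parse`, `elapsed`, `hhmm`) are invented for illustration. For the seen decode, the end time is derived from the start time plus the reported 18:49 total rather than read from the list.

```python
from datetime import datetime, timedelta

# Timestamp layout used in the timeline, e.g. "2:11 PM 4/3/16"
FMT = "%I:%M %p %m/%d/%y"


def parse(stamp: str) -> datetime:
    """Parse a timeline timestamp into a datetime."""
    return datetime.strptime(stamp, FMT)


def elapsed(start: str, end: str) -> timedelta:
    """Return the wall-clock time between two timeline timestamps."""
    return parse(end) - parse(start)


def hhmm(delta: timedelta) -> str:
    """Format a duration as total hours and minutes, e.g. '20:45'."""
    minutes = int(delta.total_seconds()) // 60
    return f"{minutes // 60}:{minutes % 60:02d}"


feats = elapsed("2:11 PM 4/3/16", "3:21 PM 4/3/16")
train = elapsed("3:23 PM 4/3/16", "11:31 AM 4/8/16")
unseen = elapsed("2:29 PM 4/13/16", "11:14 AM 4/14/16")

# Seen decode: end time derived from the start plus the reported total.
seen_end = parse("7:13 PM 4/8/16") + timedelta(hours=18, minutes=49)

print("generateFeats:", hhmm(feats))
print("train:", hhmm(train))
print("decode on unseen:", hhmm(unseen))
print("decode on seen ends:", seen_end.strftime("%Y-%m-%d %H:%M"))
```

The unseen decode works out to 20:45, matching the reported total, and training took a little over 116 hours. Starting the seen decode at 7:13 PM on 4/8/16 and running for 18:49 puts its end at 2:02 PM on 4/9/16.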

Results:
 * WER (decode on seen): 26.6%
 * WER (decode on unseen (dev.trans)): 42.2%