Speech:Exps 0283 013

Description
Authors: James Schumacher

Date: 3/24/16

Purpose: Test the newly generated utt files in the full corpus after adding --bits and --encoding options in the sox command in the genUttAudio script.
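For reference, the kind of sox call this experiment exercises can be sketched as follows. This is a minimal illustration, not the actual genUttAudio code: the file names, the trim times, and the values 16 and signed-integer are assumptions chosen for the example, since this page does not record the values the script uses. Only the sox options `--bits` and `--encoding` and the `trim` effect are taken as given.

```python
def build_sox_command(src, dst, start_sec, duration_sec,
                      bits=16, encoding="signed-integer"):
    """Build a sox argument list that cuts one utterance out of a recording.

    The --bits and --encoding options are placed directly before the output
    file, so sox applies them to the output format rather than the input.
    Default values here are assumptions, not the script's real settings.
    """
    return [
        "sox", src,
        "--bits", str(bits),
        "--encoding", encoding,
        dst,
        "trim", f"{start_sec:.3f}", f"{duration_sec:.3f}",
    ]


# Hypothetical file names and times, for illustration only.
cmd = build_sox_command("sw02001.sph", "sw2001a-0001.wav", 0.5, 2.25)
print(" ".join(cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)` on a machine with sox installed. Building it as a list rather than a shell string avoids quoting problems with unusual file names.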

Details:
 * Train configuration
   * All default values
   * Started train at 6:38 PM on 3/24/2016
   * Train ended at 6:48 PM on 3/24/2016
 * Decode values
   * 1000 files @ 1000 senones
   * Decode started at 7:11 PM on 3/24/2016
   * Decode ended at 7:30 PM on 3/24/2016

Results:

SYSTEM SUMMARY PERCENTAGES by SPEAKER

,-----------------------------------------------------------------.
|                            hyp.trans                            |
|-----------------------------------------------------------------|
| SPKR    | # Snt # Wrd | Corr    Sub    Del    Ins    Err  S.Err |
|---------+-------------+-----------------------------------------|
| sw2001b |   18    163 | 65.6   17.8   16.6    0.6   35.0   50.0 |
| sw2001a |   14    101 | 70.3   17.8   11.9    0.0   29.7   42.9 |
| sw2005a |   39    701 | 64.1   17.0   19.0    2.0   37.9   79.5 |
| sw2005b |   67    613 | 47.3   25.9   26.8    7.0   59.7   80.6 |
| sw2006b |   29    618 | 50.0   24.8   25.2    2.6   52.6   86.2 |
| sw2006a |   33    455 | 78.9    9.7   11.4    2.0   23.1   60.6 |
| sw2007a |   61    614 | 64.5   16.0   19.5    0.3   35.8   65.6 |
| sw2007b |   60    861 | 67.6   16.4   16.0    0.2   32.6   80.0 |
| sw2008a |   24    260 | 63.5   20.4   16.2   11.9   48.5   91.7 |
| sw2008b |   26    257 | 66.1   17.5   16.3    0.0   33.9   42.3 |
| sw2009b |   23    181 | 65.7   16.6   17.7    0.6   34.8   39.1 |
| sw2009a |   34    473 | 49.9   27.9   22.2    5.3   55.4   97.1 |
| sw2010b |   22    404 | 44.3   23.8   31.9    2.2   57.9   90.9 |
| sw2010a |   27    284 | 57.0   22.2   20.8    3.9   46.8   66.7 |
| sw2012a |   45    838 | 45.9   20.6   33.4    1.6   55.6   73.3 |
| sw2012b |   28    464 | 62.3   19.0   18.8    1.5   39.2   57.1 |
| sw2013a |   35    377 | 54.4   21.5   24.1    2.1   47.7   80.0 |
| sw2013b |   69    942 | 34.8   32.8   32.4    3.4   68.6   84.1 |
| sw2014a |    9     79 | 67.1   17.7   15.2    6.3   39.2   77.8 |
| sw2014b |   13    174 | 55.7   23.6   20.7   10.9   55.2   84.6 |
| sw2015a |   21    375 | 44.0   30.1   25.9    0.8   56.8   95.2 |
| sw2015b |   32    542 | 48.5   23.6   27.9    1.7   53.1   56.3 |
| sw2017b |   38    658 | 66.7   15.5   17.8    2.7   36.0   76.3 |
| sw2017a |   40    332 | 56.9   24.1   19.0   11.4   54.5   87.5 |
| sw2018a |   49    505 | 71.7   16.4   11.9    9.1   37.4   85.7 |
| sw2018b |   42    586 | 56.8   22.5   20.6    0.5   43.7   61.9 |
| sw2019a |   50    594 | 62.0   18.0   20.0    5.7   43.8   90.0 |
| sw2019b |   52    452 | 73.5   14.4   12.2    1.3   27.9   48.1 |
|=================================================================|
| Sum/Avg | 1000  12903 | 57.4   20.9   21.7    3.1   45.8   73.9 |
|=================================================================|
|  Mean   | 35.7  460.8 | 59.1   20.5   20.4    3.5   44.4   72.5 |
|  S.D.   | 16.3  229.0 | 10.5    5.1    6.2    3.6   11.3   17.3 |
| Median  | 33.5  459.5 | 62.1   19.7   19.3    2.1   43.7   78.6 |
`-----------------------------------------------------------------'
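The summary reports two different overall error figures: 45.8 in the Sum/Avg row and 44.4 in the Mean row. The short sketch below shows one way to account for the gap, assuming Sum/Avg pools errors over all words (so speakers with more words count for more) while Mean is a plain average of the per-speaker Err values. The (# Wrd, Err) pairs are copied from the table; the variable names are only illustrative.

```python
# (# Wrd, Err %) for each of the 28 speakers, in table order.
rows = [
    (163, 35.0), (101, 29.7), (701, 37.9), (613, 59.7),
    (618, 52.6), (455, 23.1), (614, 35.8), (861, 32.6),
    (260, 48.5), (257, 33.9), (181, 34.8), (473, 55.4),
    (404, 57.9), (284, 46.8), (838, 55.6), (464, 39.2),
    (377, 47.7), (942, 68.6), (79, 39.2), (174, 55.2),
    (375, 56.8), (542, 53.1), (658, 36.0), (332, 54.5),
    (505, 37.4), (586, 43.7), (594, 43.8), (452, 27.9),
]

total_words = sum(words for words, _ in rows)

# Word-weighted error: each speaker contributes in proportion to word count.
pooled_err = sum(words * err for words, err in rows) / total_words

# Unweighted error: every speaker counts equally, regardless of word count.
mean_err = sum(err for _, err in rows) / len(rows)

print(total_words)           # 12903
print(round(pooled_err, 1))  # 45.8
print(round(mean_err, 1))    # 44.4
```

The pooled figure is higher because several of the speakers with the most words (for example sw2013b and sw2012a) also have some of the highest error rates.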

Not totally discouraging, given an overall word error rate of 45.8%. We used the default training values, which are not suited to the size of the corpus. Training on more hours, with the configuration values tweaked accordingly, could possibly yield much better results.