Speech:Exps 0128

Description
Author: Eric Beikman

Date: 8/06/13

Purpose: The goal of this experiment is to get a baseline decode for an acoustic model created from a 10 hour train.

Details: This experiment is similar to experiment [Exps_0090|0090]: it uses the same corpus (last_5hr/test), dictionaries, transcript, audio files, phone list, and language model as that experiment. Because last_5hr/test comprises 30 minutes of audio within the 10hr/train corpus, it suffices as a test corpus; it also lets us compare the results of this train with our previous experiments.

Results

The experiment decoded the 30-minute last_5hr/test corpus without any issues.

The decode took 15527 seconds (about 259 minutes, or 4.3 hours) on the batch machine 'miraculix'.
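As a quick check of the timing figures, the conversion can be sketched in Python. The function name `summarize_runtime` is hypothetical, and the real-time factor assumes the test corpus is exactly 30 minutes long, which the write-up only states approximately.

```python
# Figures quoted in this write-up.
DECODE_SECONDS = 15527      # wall-clock decode time reported on 'miraculix'
AUDIO_SECONDS = 30 * 60     # assumption: the test corpus is exactly 30 minutes


def summarize_runtime(decode_s, audio_s):
    """Return (minutes, hours, real-time factor) for a decode run."""
    minutes = decode_s / 60
    hours = decode_s / 3600
    # Real-time factor: seconds of compute per second of audio.
    rtf = decode_s / audio_s
    return minutes, hours, rtf


minutes, hours, rtf = summarize_runtime(DECODE_SECONDS, AUDIO_SECONDS)
print(f"{minutes:.0f} min, {hours:.1f} h, real-time factor {rtf:.1f}")
# prints "259 min, 4.3 h, real-time factor 8.6"
```

Under that assumption, the decode ran at roughly 8.6 times real time.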

The following score was created during this experiment:

    SYSTEM SUMMARY PERCENTAGES by SPEAKER

    ,-----------------------------------------------------------------.
    |                            hyp.trans                            |
    |-----------------------------------------------------------------|
    | SPKR    | # Snt # Wrd | Corr    Sub    Del    Ins    Err  S.Err |
    |=================================================================|
    | Sum/Avg |  437   6474 | 77.3   17.4    5.3   14.2   36.9   97.9 |
    |=================================================================|
    |  Mean   | 36.4  539.5 | 77.4   17.4    5.1   15.3   37.9   98.1 |
    |  S.D.   |  8.3  143.2 |  4.9    4.2    1.9    5.7    7.0    2.7 |
    | Median  | 32.5  546.5 | 78.6   17.5    5.0   14.6   38.6  100.0 |
    `-----------------------------------------------------------------'
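To make the columns of the Sum/Avg row easier to read, here is a minimal sketch of how the Err and Corr percentages relate to the Sub, Del, and Ins percentages. All percentages are relative to the number of reference words. The function names are hypothetical and are not part of the scoring tool.

```python
def error_rate(sub, dele, ins):
    """Word error rate (%): substitutions + deletions + insertions."""
    return sub + dele + ins


def percent_correct(sub, dele):
    """Correct words (%): every reference word not substituted or deleted."""
    return 100.0 - sub - dele


# Sum/Avg row from the score table in this experiment.
sum_avg = {"sub": 17.4, "del": 5.3, "ins": 14.2}

err = error_rate(sum_avg["sub"], sum_avg["del"], sum_avg["ins"])
corr = percent_correct(sum_avg["sub"], sum_avg["del"])

print(f"Err  = {err:.1f}%")   # prints "Err  = 36.9%"
print(f"Corr = {corr:.1f}%")  # prints "Corr = 77.3%"
```

Both values match the Sum/Avg row. Insertions count toward Err but not against Corr, which is why Corr and Err do not add up to 100.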

Increasing the amount of data used to create the acoustic model results in a higher word error rate than our previous baselines ([Speech:Exps_0090|Experiment 0090]). It is possible that the previous models were overfitting to their training data, which would give higher accuracy for decodes on that same data. It is also important to note that experiments using the first_5hr corpus tended to give higher word error rates than experiments using the last_5hr corpus, even when both used the same processes.