Speech:Exps 0124


Exp. 124: Decode on 0123 with Gaussian density of 16


Description

Author: Eric Beikman

Date: 7/30/13

Purpose: The goal of this experiment is to test the effects of decoding a 30-minute corpus using an acoustic model created with an increased Gaussian density and senone count.

Details: This experiment is similar to experiment 0090 and uses the dictionaries, transcript, audio files, phone list, and language model from that experiment. Like experiment 0122, it decodes the 0.5-hour corpus from last_5hr/test using a Gaussian density of 16; in this experiment, however, the senone value is raised from its default of 1000 to 4000. The acoustic model used here (from Experiment 0123) was created with these same values.

Results: The experiment decoded the 30-minute last_5hr/test corpus without any issues.

This process took about 15527 seconds (about 259 minutes, or 4.3 hours) on the batch machine 'miraculix'.
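
The time conversion, and the real-time factor it implies, can be checked with a short sketch. It assumes the audio is exactly the nominal 30 minutes (1800 seconds); the variable names are illustrative and are not taken from the experiment scripts.

```python
# Reported decode time for this experiment, in seconds.
decode_seconds = 15527

# Assumption: the corpus is exactly the nominal 30 minutes of audio.
audio_seconds = 30 * 60

decode_minutes = decode_seconds / 60
decode_hours = decode_seconds / 3600

# Real-time factor: seconds of compute spent per second of audio.
real_time_factor = decode_seconds / audio_seconds

print(f"{decode_minutes:.0f} minutes, {decode_hours:.1f} hours")
print(f"real-time factor: {real_time_factor:.1f}x")
```

Under that assumption, decoding runs at roughly 8.6 times real time.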


The following score was created during this experiment:

                     SYSTEM SUMMARY PERCENTAGES by SPEAKER

      ,-----------------------------------------------------------------.
      |                            hyp.trans                            |
      |-----------------------------------------------------------------|
      | SPKR    | # Snt # Wrd | Corr    Sub    Del    Ins    Err  S.Err |
      |=================================================================|
      | Sum/Avg |  437   6474 | 92.7    4.6    2.7   12.1   19.5   87.2 |
      |=================================================================|
      |  Mean   | 36.4  539.5 | 92.6    4.8    2.6   13.0   20.4   87.3 |
      |  S.D.   |  8.3  143.2 |  2.4    1.7    1.2    5.1    5.9    9.1 |
      | Median  | 32.5  546.5 | 92.4    4.8    2.4   13.3   21.4   88.0 |
      `-----------------------------------------------------------------'
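
When reading the table, note that the Err column is the sum of substitutions, deletions, and insertions, each expressed as a percentage of reference words, so Err can be high even when Corr is high. A minimal sketch using the Sum/Avg row follows; the variable names are illustrative, and the small gap to the reported 19.5 is assumed to come from rounding of the individual columns.

```python
# Percentages from the Sum/Avg row of the table.
corr, sub, dele, ins = 92.7, 4.6, 2.7, 12.1
reported_err = 19.5

# Correct, substituted, and deleted words account for every reference word.
reference_total = corr + sub + dele

# The error rate counts insertions as errors as well.
err = sub + dele + ins

# Word accuracy penalizes insertions, unlike Corr.
accuracy = 100.0 - err

print(f"Corr + Sub + Del = {reference_total:.1f}")
print(f"Sub + Del + Ins  = {err:.1f} (reported {reported_err})")
print(f"Word accuracy    = {accuracy:.1f}")
```

The insertion rate (12.1%) is the largest contributor to the overall error rate in this run.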

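By increasing the senone value and the Gaussian density, we have greatly increased recognition accuracy: 92.7% of the words were recognized correctly, although the 12.1% insertion rate brings the overall error rate to 19.5%. At the same time, we have also greatly increased the time needed to decode. What took about 20 minutes now takes over 4 hours!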
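
The size of the slowdown can be estimated from the figures quoted on this page. The 20-minute baseline is the approximate figure from the text rather than a measured value, so the result is only a rough ratio.

```python
# Approximate earlier decode time quoted in the text (not a measured value).
baseline_seconds = 20 * 60

# Decode time reported for this experiment.
current_seconds = 15527

slowdown = current_seconds / baseline_seconds
print(f"about {slowdown:.1f}x slower")
```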