Speech:Exps 0115


Repeat of Experiment 0090 (Decode using last_5hr/test on Exp 0114) using Fedora


Description

Author: Eric Beikman

Date: 7-8-2013

Purpose: Recreate Experiment 0090 on a different Linux distribution.

Details: The OS on caesar and the other batch machines is unsupported due to its age, so we wish to upgrade not only to a more modern version but also to a new Linux distribution that we believe will better support our needs. After a search, we decided on Fedora and installed it on one of our test machines, which we call 'rome' (previously known as Marathon). Before rolling out the upgrade, however, we need to determine whether switching to a new Linux distribution affects the results of our experiments.

This experiment is a recreation of our current "baseline" decode experiment, Experiment 0090. We are decoding the 0.5-hour-long last_5hr/test corpus using the LM from Experiment 0090 and an acoustic model from Experiment 0114.

Results: The decoder ran as expected; no significant errors were encountered.


The following score was created:

                     SYSTEM SUMMARY PERCENTAGES by SPEAKER

      ,-----------------------------------------------------------------.
      |                         hyp.trans                               |
      |-----------------------------------------------------------------|
      | SPKR    | # Snt # Wrd | Corr    Sub    Del    Ins    Err  S.Err |
      |---------+-------------+-----------------------------------------|
      |=================================================================|
      | Sum/Avg |  437   6474 | 62.8   18.9   18.3    3.3   40.5   80.1 |
      |=================================================================|
      |  Mean   | 36.4  539.5 | 63.9   18.3   17.7    3.5   39.5   79.4 |
      |  S.D.   |  8.3  143.2 |  8.9    4.7    4.6    2.0    8.7   15.0 |
      | Median  | 32.5  546.5 | 62.8   20.8   16.9    3.1   41.1   78.7 |
      `-----------------------------------------------------------------'

The score produced above is significantly worse than our baseline (by at least 10 points). More tests are needed to determine why.
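
As a sanity check on the score table, the Sum/Avg row can be verified for internal consistency. The sketch below assumes the table follows the usual scoring convention, in which Err is the sum of Sub, Del and Ins, and Corr, Sub and Del together account for every reference word. The function names are illustrative only and are not part of any scoring tool.

```python
# Sum/Avg row copied from the score table above (all values in percent).
sum_avg = {"corr": 62.8, "sub": 18.9, "del": 18.3, "ins": 3.3, "err": 40.5}


def word_error_rate(sub, dele, ins):
    """Assumed convention: Err = Sub + Del + Ins (percent of reference words)."""
    return round(sub + dele + ins, 1)


def reference_coverage(corr, sub, dele):
    """Every reference word is either correct, substituted, or deleted."""
    return round(corr + sub + dele, 1)


wer = word_error_rate(sum_avg["sub"], sum_avg["del"], sum_avg["ins"])
coverage = reference_coverage(sum_avg["corr"], sum_avg["sub"], sum_avg["del"])

print(f"Err recomputed from Sub/Del/Ins: {wer} (table reports {sum_avg['err']})")
print(f"Corr + Sub + Del: {coverage} (expected 100.0)")
```

Both checks agree with the reported row, which suggests the table itself is self-consistent and that the gap to the baseline comes from the decode rather than from a scoring or transcription mistake in the table.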