Speech:Exps 0024

Description
Author: Eric Beikman
Date: (Started) 3/12/2013

Purpose: To gain familiarity with the Sphinx trainer and the steps necessary to complete a model. This experiment will test whether removing redundant transcript entries before training makes a difference in the resulting model's accuracy.

Details: For this experiment, we used the existing Mini/Train corpus for both training and decoding. In Experiment 0020, we had issues scoring due to redundant transcript entries. We wish to determine whether leaving these redundancies in has a negative or positive effect on the resulting model. The results of this experiment will be compared with the results of the next experiment, 0025. For this experiment, we leave the redundant transcript entries in for the training and decoding processes.

Results

The first few training runs failed due to missing dictionary entries. The following entries were added to the dictionary:

IBM  AY1 B IY1 EH1 M
FEDERALES  F EH1 D ER AH0 L IY1 S
DUCTWORK  D AH1 K T W ER1 K
COGNIZITIVE  K AA1 G N AH0 Z IH0 T IH0 V
CHOWPERD  CH AW1 P ER0 D
ALBRIDGE  AO1 L B R IH1 JH
SOUTHBEND  S AW1 TH B EH1 N D
VOCALIZED  V OW1 K AH0 L AY2 Z D
MOOSEWOOD  M UW1 S W UH2 D
UNDERGRAD  AH1 N D ER0 G R AE1 D
GTE  JH IY1 T IY1 IY1
MARYLANDER  M EH1 R IY0 L AE2 N D ER0
MARYLANDER'S  M EH1 R IY0 L AE2 N D ER0 Z
PLANOITE  P L EY1 N OW0 AY0 T
DADGUM  D AE1 D G AH1 M
EXPERIENCEWISE  IH0 K S P IH1 R IY0 AH0 N S W AY1 Z
CANSEGO  K AE1 N S EY1 G OW1
HOPELY  HH OW1 P L IY0
STORLY  S T AO1 R L IY0
KID'LL  K IH1 D L
REINJURING  R IY2 IH1 N JH ER0 IH0 NG
NFL  EH1 N EH1 F EH1 L
PE  P IY1 IY1
UNDERGRADS  AH1 N D ER0 G R AE1 D Z
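The missing entries above were only discovered when training failed. A check along the following lines could list out-of-dictionary words before a training run is started. This is a minimal sketch, not the procedure used in this experiment: the function names are hypothetical, and it assumes the usual Sphinx conventions of one "WORD PHONE PHONE ..." entry per dictionary line (alternate pronunciations marked as WORD(2)) and transcript lines of the form "<s> WORDS </s> (utterance_id)".

```python
import re

# Tokens that are markup rather than words (assumed set; adjust to the corpus).
NON_WORDS = {"<s>", "</s>", "<sil>"}


def load_dictionary_words(dict_lines):
    """Return the set of words that have at least one pronunciation."""
    words = set()
    for line in dict_lines:
        parts = line.split()
        if not parts:
            continue
        # Strip an alternate-pronunciation marker such as "THE(2)".
        words.add(re.sub(r"\(\d+\)$", "", parts[0]).upper())
    return words


def transcript_words(trans_lines):
    """Yield every word token found in the transcript lines."""
    for line in trans_lines:
        # Drop the trailing "(utterance_id)" if present.
        text = re.sub(r"\([^()]*\)\s*$", "", line)
        for token in text.split():
            if token not in NON_WORDS:
                yield token.upper()


def missing_words(dict_lines, trans_lines):
    """Return the sorted transcript words absent from the dictionary."""
    known = load_dictionary_words(dict_lines)
    return sorted(set(transcript_words(trans_lines)) - known)


if __name__ == "__main__":
    dictionary = ["IBM  AY1 B IY1 EH1 M", "THE  DH AH0", "THE(2)  DH IY0"]
    transcript = ["<s> THE NFL AND IBM </s> (utt_001)"]
    print(missing_words(dictionary, transcript))
```

Each word reported this way still needs a hand-written pronunciation, as was done for the entries listed above.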

After the dictionary was fixed, the training process ran successfully, creating the proper continuous model. The creation of the language model and the decode process proceeded with no issues, taking substantially less time than in Experiment 0020.

Due to the existence of redundant entries within the reference and hypothesis transcripts, both transcripts had to be run through uniq. After this process, SCLite was able to generate the following score:

,-----------------------------------------------------------------.
|                            hyp.trans                            |
|-----------------------------------------------------------------|
|         | # Snt # Wrd | Corr    Sub    Del    Ins    Err  S.Err |
|---------+-------------+-----------------------------------------|
| Sum/Avg |  549  10919 | 84.0    9.2    6.8    6.3   22.3   89.8 |
|=================================================================|
|  Mean   |  2.9   57.8 | 83.3   10.4    6.3   12.2   28.9   92.2 |
|  S.D.   |  1.9   45.0 | 11.8    9.7    5.8   20.4   24.8   18.2 |
| Median  |  3.0   47.0 | 85.0    8.3    5.6    6.1   22.4  100.0 |
`-----------------------------------------------------------------'
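For reference, the uniq step mentioned above collapses runs of identical adjacent lines into a single line. The sketch below reproduces that behaviour in Python; the function name is hypothetical. Like uniq, it only removes duplicates that sit next to each other, so a repeated entry that reappears later in the file is kept.

```python
from itertools import groupby


def uniq_lines(lines):
    """Collapse runs of identical adjacent lines, like the uniq command."""
    # groupby starts a new group whenever the line changes, so taking
    # one key per group keeps exactly one copy of each adjacent run.
    return [line for line, _ in groupby(lines)]


if __name__ == "__main__":
    transcript = [
        "HELLO WORLD (utt_001)",
        "HELLO WORLD (utt_001)",
        "GOOD MORNING (utt_002)",
    ]
    for line in uniq_lines(transcript):
        print(line)
```

If the duplicated entries are not adjacent, the transcript would have to be sorted first, or the duplicates tracked with a set.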