Speech:Exps 0013


Title: Mini Dev Decode


Description

Author: Cedric Woodbury

Date: August 20, 2012

Purpose: Test Decode with 5 minutes of dialog using the Acoustic Model created under Exp 0012

Details: This is the first experiment that does not use an Acoustic Model trained on the transcript being decoded. Instead, it uses the Acoustic Model created under Experiment 0012. The input is 5 minutes of dialog that immediately follows the hour's worth of dialog used for Experiment 0012. The decode process uses the model trained in Exp 0012 to predict the spoken words in this transcript.

The normal experiment setup process was followed, but stopped short of executing RunAll.pl to run the training. The language model was created and the decode process was modified to use the Acoustic Model created under Experiment 0012. The transcript is located at /mnt/main/corpus/switchboard/mini/dev/train/train.trans and the sph files used are located under /mnt/main/corpus/switchboard/mini/dev/wav/

Results

  • First the transcript and wav files needed to be created and placed in the corpus directory noted in the details section.
    • I ran createTranscript.pl against the master transcript located at /mnt/main/corpus/dist/Switchboard/trainscripts/ICSI_Transcriptions/trans/icsi/ms98_icsi_word.text, indicating that it should start at 3600 seconds (1 hour) and extract 5 minutes of dialog.
    • I created a script called copySph.pl that builds a list of the sph files referenced in the transcript and then copies only those files from the /mnt/main/corpus/dist/Switchboard/flat directory.
      • I ran into a small problem with this step. The transcript referenced an sph file, sw2269A-ms98-a-0001.sph, that does not exist; I am not sure why. Only 3 lines of dialog called for this file, so I removed them and the script then processed successfully.
  • With the transcripts and sph files in place I set up the experiment normally (see Exp 0012 for details).
    • The parsed transcript was created.
    • The pruned dictionary was created.
  • I created the language model as I did before in Experiment 0012.
  • Now I just needed to run the decode.
    • However, I first needed to modify the run_decode.pl script so that the user can specify which experiment's acoustic model to use.
    • Information on how to use run_decode.pl can be found here.
    • Decode was executed with the following command
      ./run_decode.pl 0013 0012
    • 0013 specifies the experiment to decode, and 0012 specifies the experiment whose acoustic model is used.
    • Decoding finished without any issues.
  • I then used sclite to generate the results of the decode.
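
The sph-copy step described above (list the sph files the transcript references, copy only those, and notice any that are missing) could be sketched roughly as follows. This is not the copySph.pl script itself, which is not shown here. It is a minimal Python illustration that assumes utterance IDs look like the one quoted above (sw2269A-ms98-a-0001) and that each sph file is named after its utterance ID. The names UTT_ID, referenced_ids and copy_referenced are hypothetical.

```python
import re
import shutil
from pathlib import Path

# Assumed ID pattern, modeled on the single ID quoted in this log
# (sw2269A-ms98-a-0001); the real corpus may use other variants.
UTT_ID = re.compile(r"sw\d+[AB]-ms98-a-\d+")


def referenced_ids(transcript_text):
    """Return the distinct utterance IDs in the transcript, in first-seen order."""
    seen = {}
    for match in UTT_ID.finditer(transcript_text):
        seen.setdefault(match.group(0), None)
    return list(seen)


def copy_referenced(transcript_path, src_dir, dst_dir):
    """Copy only the referenced .sph files; return (copied, missing) ID lists."""
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)
    copied, missing = [], []
    for utt in referenced_ids(Path(transcript_path).read_text()):
        src = src_dir / f"{utt}.sph"  # assumption: file name is "<utterance ID>.sph"
        if src.is_file():
            shutil.copy2(src, dst_dir / src.name)
            copied.append(utt)
        else:
            # Report instead of failing, so the offending lines can be reviewed
            missing.append(utt)
    return copied, missing
```

Returning the missing IDs, rather than stopping at the first absent file, would surface a case like sw2269A-ms98-a-0001.sph in a single pass, so the matching transcript lines can be removed before the experiment is set up.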

Summary

Decoding using the acoustic model from Experiment 0012 completed successfully, with an overall word error rate of 72.5%. sclite produced the following results.

                     SYSTEM SUMMARY PERCENTAGES by SPEAKER

      ,-----------------------------------------------------------------.
      |                            hyp.trans                            |
      |-----------------------------------------------------------------|
      | SPKR    | # Snt # Wrd | Corr    Sub    Del    Ins    Err  S.Err |
      |=================================================================|
      | Sum/Avg |   49    864 | 44.9   46.5    8.6   17.4   72.5   98.0 |
      |=================================================================|
      |  Mean   |  2.6   45.5 | 47.3   45.4    7.3   29.7   82.4   97.4 |
      |  S.D.   |  1.5   35.7 | 18.7   19.0    8.8   55.3   59.3   11.5 |
      | Median  |  2.0   35.0 | 42.9   52.4    6.0   14.3   70.2  100.0 |
      `-----------------------------------------------------------------'
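
As a reading aid for the Sum/Avg row, the short Python check below shows how its percentage columns relate to each other: Err is the sum of Sub, Del and Ins, while Corr is what remains after Sub and Del. The function names are hypothetical and the inputs are copied from the table above. Nothing here calls sclite.

```python
def error_rate(sub, dele, ins):
    """Word error rate in percent: substitutions + deletions + insertions."""
    return round(sub + dele + ins, 1)


def correct_rate(sub, dele):
    """Percent of reference words recognized correctly (insertions not counted)."""
    return round(100.0 - sub - dele, 1)


# Values from the Sum/Avg row of the table above
sub, dele, ins = 46.5, 8.6, 17.4

print(error_rate(sub, dele, ins))  # 72.5, matches the Err column
print(correct_rate(sub, dele))     # 44.9, matches the Corr column
```

Because insertions count against Err but not against Corr, the two columns can add up to more than 100%, as they do here (44.9 + 72.5).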