Speech:Exps 0303 005

Description
Author: Danielle

Date: 1-29-2018

Purpose: First train/decode

Details: I ran a 30-hour train and decoded the trained data, following the instructions on these two wiki pages:

Create an LM: https://foss.unh.edu/projects/index.php/Speech:Create_LM

Run decode trained data: https://foss.unh.edu/projects/index.php/Speech:Run_Decode_Trained_Data

DO NOT COPY AND PASTE COMMANDS INTO COMMAND LINE

Create LM

1) " mkdir LM " in YOUR base experiment folder (mine is 005)

2) " cd LM " into the new directory

3) Copy over the transcript used from the corpus directory: " cp -i /mnt/main/corpus/switchboard/30hr/train/trans/train.trans trans_unedited "

NOTE: I ran a 30 hour train, so my input is "30hr". This input depends on the length of your train.

Prepare/Execute the script that will build the language model

1) Prepare the transcript: " /mnt/main/corpus/switchboard/dist/transcripts/ICSI_Transcriptions/trans/icsi/ParseTranscript.perl trans_unedited trans_parsed "

2) Copy the script that creates the actual language model into the LM directory: " cp -i /mnt/main/scripts/user/lm_create.pl . "

3) Execute the script: " ./lm_create.pl trans_parsed "
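
The note above about train length comes down to one directory name in the corpus path. As a rough sketch only: the helper below is hypothetical (it is not one of the project scripts), and it assumes that other train lengths sit under the same directory layout as my "30hr" one, which I have not checked.

```shell
# Hypothetical helper (not a project script): print the training transcript
# path for a given train-length directory name, e.g. 30hr.
# Assumption: every train length uses the same layout as the 30hr corpus.
train_trans_path() {
  printf '/mnt/main/corpus/switchboard/%s/train/trans/train.trans\n' "$1"
}

# The 30 hour train used in this experiment:
train_trans_path 30hr
```

Printing the path first and checking that the file exists is a cheap way to catch a wrong length before running the copy in step 3 of Create LM.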

Set up the decode directory and run the decode

1) " cd etc " in your base experiment directory

2) After reading the wiki, the approach that worked best for me was to build the decode file ID list by running: " awk '{print $1}' /mnt/main/corpus/switchboard/30hr/test/trans/train.trans >> /mnt/main/Exp/0303/005/etc/005_decode.fileids "

NOTE: Again, " 30hr " depends on the length of your train. " 0303 " is our number for sp18 and " 005 " is my base experiment directory.

3) Next I ran: " nohup run_decode.pl 0303/005 0303/005 1000 & "

See above note

4) I then ran the command: " parseDecode.pl decode.log hyp.trans "

5) Lastly, I executed: " sclite -r _train.trans -h hyp.trans -i swb >> scoring.log "
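
To see what the awk command in step 2 does without touching anything under /mnt/main, here is a small sketch that runs the same awk program on a throwaway file. The two transcript lines and the utt_0001 / utt_0002 IDs are made up for illustration; the real train.trans layout may differ.

```shell
# Work in a temporary directory so no experiment files are modified.
workdir=$(mktemp -d)

# Hypothetical two-line transcript (IDs and words are invented).
printf '%s\n' \
  'utt_0001 hi um yeah' \
  'utt_0002 okay' > "$workdir/sample.trans"

# Same awk program as step 2: keep only the first whitespace-separated
# field of each line and append it to the file ID list.
awk '{print $1}' "$workdir/sample.trans" >> "$workdir/sample_decode.fileids"

cat "$workdir/sample_decode.fileids"
```

Because " >> " appends, running the command a second time adds every ID again. If the .fileids file already exists, remove it first or use " > " so the list is not doubled.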

Results:

,-----------------------------------------------------------------.
|                            hyp.trans                            |
|-----------------------------------------------------------------|
| SPKR    | # Snt # Wrd | Corr    Sub    Del    Ins    Err  S.Err |
|---------+-------------+-----------------------------------------|
| sw2001b |    1     16 | 87.5    6.3    6.3    0.0   12.5  100.0 |
| sw2005b |    2      6 |100.0    0.0    0.0   50.0   50.0  100.0 |
| sw2006b |    1      3 | 66.7   33.3    0.0  100.0  133.3  100.0 |
| sw2007b |    3     44 | 54.5   34.1   11.4   13.6   59.1  100.0 |
| sw2008a |    1     11 | 63.6   36.4    0.0   18.2   54.5  100.0 |
| sw2009a |    1      9 | 33.3   66.7    0.0   11.1   77.8  100.0 |
| sw2010b |    1     48 | 47.9   33.3   18.8    6.3   58.3  100.0 |
| sw2012a |    2     29 | 65.5   31.0    3.4    6.9   41.4  100.0 |
| sw2013b |    1      3 | 66.7   33.3    0.0  100.0  133.3  100.0 |
| sw2013a |    1      6 |100.0    0.0    0.0   16.7   16.7  100.0 |
| sw2014a |    1     14 | 50.0   42.9    7.1   14.3   64.3  100.0 |
| sw2015b |    1     43 | 79.1   16.3    4.7    4.7   25.6  100.0 |
| sw2017b |    1     24 | 45.8   45.8    8.3    0.0   54.2  100.0 |
| sw2018a |    1     14 | 50.0   50.0    0.0   21.4   71.4  100.0 |
| sw2018b |    1     26 | 57.7   23.1   19.2   15.4   57.7  100.0 |
| sw2019b |    2     42 | 88.1    4.8    7.1    9.5   21.4  100.0 |
| sw2020b |    2     28 | 46.4   53.6    0.0   25.0   78.6  100.0 |
| sw2022a |    1      5 | 80.0   20.0    0.0   80.0  100.0  100.0 |
| sw2023a |    2     23 | 65.2   30.4    4.3    8.7   43.5  100.0 |
| sw2023b |    1      6 | 83.3   16.7    0.0   16.7   33.3  100.0 |
| sw2024a |    1      4 | 50.0   25.0   25.0    0.0   50.0  100.0 |
| sw2025a |    2     35 | 20.0   71.4    8.6    0.0   80.0  100.0 |
| sw2027b |    1     14 | 35.7   57.1    7.1    0.0   64.3  100.0 |
| sw2028b |    3     11 | 54.5   36.4    9.1   27.3   72.7  100.0 |
| sw2032b |    2     10 | 90.0   10.0    0.0   40.0   50.0   50.0 |
| sw2035b |    1     33 | 60.6   27.3   12.1    3.0   42.4  100.0 |
| sw2035a |    1      3 |100.0    0.0    0.0   66.7   66.7  100.0 |
| sw2036a |    2     20 | 55.0   30.0   15.0    0.0   45.0  100.0 |
| sw2038a |    1     39 | 59.0   28.2   12.8    0.0   41.0  100.0 |
| sw2039b |    3     22 | 81.8   18.2    0.0   13.6   31.8   66.7 |
| sw2040a |    1     16 | 62.5   31.3    6.3   12.5   50.0  100.0 |
| sw2040b |    1     14 | 78.6    7.1   14.3    0.0   21.4  100.0 |
| sw2041a |    1     30 | 30.0   50.0   20.0    3.3   73.3  100.0 |
| sw2041b |    1      6 | 50.0   50.0    0.0   16.7   66.7  100.0 |
| sw2044a |    2     64 | 68.8   23.4    7.8    1.6   32.8  100.0 |
| sw2045a |    1     10 | 90.0    0.0   10.0    0.0   10.0  100.0 |
| sw2045b |    1     30 | 16.7   13.3   70.0    0.0   83.3  100.0 |
| sw2050b |    2     59 | 52.5   32.2   15.3   10.2   57.6  100.0 |
| sw2051b |    2     29 | 51.7   34.5   13.8    3.4   51.7  100.0 |
| sw2051a |    1      6 | 33.3   66.7    0.0    0.0   66.7  100.0 |
| sw2053a |    1      4 |100.0    0.0    0.0   25.0   25.0  100.0 |
| sw2053b |    1      3 | 66.7   33.3    0.0  133.3  166.7  100.0 |
| sw2054b |    3     24 | 62.5   33.3    4.2    8.3   45.8  100.0 |
| sw2055a |    1     18 | 72.2   27.8    0.0   11.1   38.9  100.0 |
| sw2055b |    1     36 | 72.2   13.9   13.9    2.8   30.6  100.0 |
| sw2056b |    1     19 | 89.5   10.5    0.0    0.0   10.5  100.0 |
| sw2057a |    2     70 | 70.0   27.1    2.9   11.4   41.4  100.0 |
| sw2060b |    1     19 | 57.9   42.1    0.0    5.3   47.4  100.0 |
| sw2061b |    2     38 | 31.6   44.7   23.7    0.0   68.4  100.0 |
| sw2061a |    2     43 | 74.4   16.3    9.3    4.7   30.2  100.0 |
| sw2062a |    1     14 | 78.6   21.4    0.0   14.3   35.7  100.0 |
| sw2062b |    1      3 |100.0    0.0    0.0   66.7   66.7  100.0 |
| sw2064b |    2     20 | 60.0   30.0   10.0    5.0   45.0  100.0 |
| sw2065b |    2     35 | 80.0    8.6   11.4    5.7   25.7  100.0 |
| sw2065a |    1     19 | 84.2   10.5    5.3    5.3   21.1  100.0 |
| sw2071a |    1     16 | 56.3   18.8   25.0    0.0   43.8  100.0 |
| sw2071b |    1     13 | 84.6   15.4    0.0    0.0   15.4  100.0 |
| sw2072b |    1     16 | 25.0   50.0   25.0    0.0   75.0  100.0 |
| sw2073b |    2     19 | 94.7    5.3    0.0    5.3   10.5  100.0 |
| sw2078a |    2     53 | 28.3   50.9   20.8    3.8   75.5  100.0 |
| sw2078b |    2     18 | 55.6   44.4    0.0   22.2   66.7  100.0 |
| sw2079b |    1      7 | 42.9   14.3   42.9    0.0   57.1  100.0 |
| sw2080b |    1     19 | 63.2   26.3   10.5   21.1   57.9  100.0 |
| sw2080a |    1      3 | 66.7    0.0   33.3    0.0   33.3  100.0 |
| sw2082a |    1      3 | 66.7   33.3    0.0  100.0  133.3  100.0 |
| sw2083a |    2     32 | 50.0   21.9   28.1    3.1   53.1  100.0 |
| sw2085b |    3     61 | 52.5   29.5   18.0    1.6   49.2  100.0 |
| sw2086b |    2     19 | 78.9   21.1    0.0   15.8   36.8  100.0 |
| sw2086a |    1     15 | 26.7   40.0   33.3    0.0   73.3  100.0 |
| sw2087a |    1     37 | 81.1   13.5    5.4    0.0   18.9  100.0 |
| sw2089b |    1     37 | 59.5   27.0   13.5    5.4   45.9  100.0 |
| sw2089a |    2     19 | 68.4   31.6    0.0   15.8   47.4  100.0 |
| sw2090a |    1     12 | 75.0   16.7    8.3    0.0   25.0  100.0 |
| sw2090b |    1      8 | 62.5   12.5   25.0    0.0   37.5  100.0 |
| sw2091b |    1     39 | 76.9   20.5    2.6    2.6   25.6  100.0 |
| sw2092b |    2     10 | 50.0   40.0   10.0    0.0   50.0   50.0 |
| sw2093b |    1     48 | 45.8   43.8   10.4    2.1   56.3  100.0 |
| sw2094a |    2     20 | 65.0   10.0   25.0   10.0   45.0  100.0 |
| sw2095a |    1      9 | 77.8   11.1   11.1    0.0   22.2  100.0 |
| sw2096b |    2     22 | 81.8   18.2    0.0   27.3   45.5  100.0 |
| sw2101a |    1      3 | 66.7   33.3    0.0  166.7  200.0  100.0 |
| sw2101b |    1      3 | 66.7   33.3    0.0   66.7  100.0  100.0 |
| sw2102a |    2     53 | 66.0   18.9   15.1    1.9   35.8  100.0 |
| sw2105a |    1     14 | 57.1   42.9    0.0   21.4   64.3  100.0 |
| sw2105b |    1     13 | 61.5   23.1   15.4    0.0   38.5  100.0 |
| sw2107a |    2     11 | 63.6   36.4    0.0   63.6  100.0  100.0 |
|=================================================================|
| Sum/Avg |  123   1872 | 60.7   28.2   11.1    8.5   47.8   97.6 |
|=================================================================|
|  Mean   |  1.4   21.8 | 63.8   26.9    9.3   18.3   54.4   98.4 |
|  S.D.   |  0.6   16.2 | 19.4   16.5   11.7   31.3   32.8    8.3 |
| Median  |  1.0   18.5 | 64.3   27.2    6.7    5.6   49.6  100.0 |
`-----------------------------------------------------------------'
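
For reading the table: the Err column is the word error rate, which is the sum of the Sub, Del and Ins percentages, each measured against the number of reference words. That is why Err can go above 100 when there are many insertions on a short utterance. A minimal sketch of the arithmetic, using a helper function named wer that I made up for this page (it is not part of sclite):

```shell
# Hypothetical helper (not part of sclite): word error rate as the sum of
# the substitution, deletion and insertion percentages.
wer() {
  awk -v s="$1" -v d="$2" -v i="$3" 'BEGIN { printf "%.1f\n", s + d + i }'
}

wer 28.2 11.1 8.5    # Sum/Avg row: prints 47.8
wer 33.3 0.0 166.7   # sw2101a row: prints 200.0
```

Since the table shows rounded percentages, adding up a single row by hand can be off by 0.1 from the printed Err. For example sw2001b shows 6.3 + 6.3 + 0.0 but an Err of 12.5.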