Speech:Exps 0282 005

Description
Author: Peter Ferro

Date: 2-10-2016

Purpose: This is the first sub-experiment I have ever created. It will be a test of how well I can run an experiment.

Details: This experiment will determine how well I can execute a test experiment using standard parameters as indicated in the instructions.

Results: The following directories were created when I ran prepareTrainExperiment.pl:

    bin  bwaccumdir  etc  feat  logdir  model_architecture  model_parameters  python  scripts_pl  wav

I got some warnings once I activated the train script. These were relatively minor, and simply refer to phones that never show up in the transcription:

    WARNING: This phone (+laugh+) occurs in the phonelist (/mnt/main/Exp/0282/005/etc/005.phone), but not in any word in the transcription (/mnt/main/Exp/0282/005/etc/005_train.trans)
    WARNING: This phone (+noise+) occurs in the phonelist (/mnt/main/Exp/0282/005/etc/005.phone), but not in any word in the transcription (/mnt/main/Exp/0282/005/etc/005_train.trans)
    WARNING: This phone (+vocalized+) occurs in the phonelist (/mnt/main/Exp/0282/005/etc/005.phone), but not in any word in the transcription (/mnt/main/Exp/0282/005/etc/005_train.trans)

I noticed that six modules fired off almost immediately and were skipped. The first module to actually run was Module 20: Training Context Independent models. Module 30: Training Context Dependent models fired next; it produced one warning but no errors. Module 40: Build Trees came next, and +laugh+, +noise+, +vocalized+, and SIL were skipped. Module 45: Prune Trees was then initialized, followed by Module 50: Training Context dependent models. The next module, Module 90, was skipped, so Module 99: convert to Sphinx2 format models was run... and:

    Can not create models used by Sphinx-II.
    If you intend to create models to use with Sphinx-II models, please rerun with:
        $ST::CFG_HMM_TYPE = '.semi.' or $ST::CFG_HMM_TYPE = '.cont'
    and $ST::CFG_FEATURE = '1s_12c_12d_3p_12dd'
    and $ST::CFG_STATESPERHMM = '5'

There was no more feedback from the program. When I decided to log out, this line was appended to my normal logout message:

    [1]   Done                          scripts_pl/RunAll.pl
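The "[1]   Done" line at logout is just the shell's job-control report that a background job exited normally, which supports the reading that RunAll.pl completed rather than crashed. A minimal sketch of that pattern, with `sleep 1` standing in for the real `scripts_pl/RunAll.pl` run:

```shell
# Hedged sketch: nohup detaches the job from the terminal's hangup signal,
# '&' backgrounds it, and the shell later reports "[1]  Done ..." once the
# job exits. 'sleep 1' is a stand-in for scripts_pl/RunAll.pl.
nohup sleep 1 > train.out 2>&1 &
wait $!                      # block until the background job exits
echo "background job finished with status $?"
```

`wait $!` returns the background job's exit status, so a nonzero status here would have been the first hint of a real failure.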
After looking at two past logs (specifically Speech:Spring_2015_Stephen_Griffin_Log and Speech:Spring_2015_Zachery_Boynton_Log, both of which I discovered through Google), my instinct is that the program finished normally and that this does not appear to be a major problem. Since the first part ran OK, I'll record what each module reported for its Baum-Welch iterations:

    MODULE: 20 Training Context Independent models
    Phase 1: Cleaning up directories: accumulator...logs...qmanager...models...
    Phase 2: Flat initialize
    Phase 3: Forward-Backward
    Baum-Welch iteration 1   Average log-likelihood -2.28080384519049
    Baum-Welch iteration 2   Average log-likelihood -1.74503012872003
    Baum-Welch iteration 3   Average log-likelihood 0.140713790115246
    Baum-Welch iteration 4   Average log-likelihood 1.33266053786056
    Baum-Welch iteration 5   Average log-likelihood 1.76773418799124
    Baum-Welch iteration 6   Average log-likelihood 1.8855910301487
    Training completed after 7 iterations

    MODULE: 30 Training Context Dependent models
    Phase 1: Cleaning up directories: accumulator...logs...qmanager...
    Phase 2: Initialization
    WARNING: This step had 0 ERROR messages and 1 WARNING messages. Please check the log file for details.
    Phase 3: Forward-Backward
    Baum-Welch iteration 1   Average log-likelihood 1.98283607347617
    Baum-Welch iteration 2   Average log-likelihood 4.97233031677522
    Baum-Welch iteration 3   Average log-likelihood 6.25534157416196
    Training completed after 4 iterations

    MODULE: 50 Training Context dependent models
    Phase 1: Cleaning up directories: accumulator...logs...qmanager...
    Phase 2: Copy CI to CD initialize
    Phase 3: Forward-Backward
    Baum-Welch gaussians 1 iteration 1   Average log-likelihood 1.98283607347617
    Baum-Welch gaussians 1 iteration 2   Average log-likelihood 2.79177721335269
    Baum-Welch gaussians 1 iteration 3   Average log-likelihood 2.96589426925883
    Baum-Welch gaussians 1 iteration 4   Average log-likelihood 3.03908168038514
    Baum-Welch gaussians 2 iteration 1   Average log-likelihood 2.62403004736864
    Baum-Welch gaussians 2 iteration 2   Average log-likelihood 3.23419973594472
    Baum-Welch gaussians 2 iteration 3   Average log-likelihood 3.63817243652861
    Baum-Welch gaussians 2 iteration 4   Average log-likelihood 4.00872218850803
    Baum-Welch gaussians 2 iteration 5   Average log-likelihood 4.20344816350175
    Baum-Welch gaussians 2 iteration 6   Average log-likelihood 4.3138994786139
    Baum-Welch gaussians 4 iteration 1   Average log-likelihood 3.92999816271941
    Baum-Welch gaussians 4 iteration 2   Average log-likelihood 4.51412319038712
    Baum-Welch gaussians 4 iteration 3   Average log-likelihood 4.80637666566846
    Baum-Welch gaussians 4 iteration 4   Average log-likelihood 5.1254095906047
    Baum-Welch gaussians 4 iteration 5   Average log-likelihood 5.31837702169076
    Baum-Welch gaussians 8 iteration 1   Average log-likelihood 4.97596569934187
    Baum-Welch gaussians 8 iteration 2   Average log-likelihood 5.58559996417973
    Baum-Welch gaussians 8 iteration 3   Average log-likelihood 5.86999771068058
    Baum-Welch gaussians 8 iteration 4   Average log-likelihood 6.1621661439267
    Training for 8 Gaussian(s) completed after 5 iterations

The language model was successfully created without issue. Here is the end of what the language model script produced:
    INFO: ngram_model_arpa.c(476): ngrams 1=3310, 2=18342, 3=31169
    INFO: ngram_model_arpa.c(135): Reading unigrams
    INFO: ngram_model_arpa.c(515):    3310 = #unigrams created
    INFO: ngram_model_arpa.c(194): Reading bigrams
    INFO: ngram_model_arpa.c(531):   18342 = #bigrams created
    INFO: ngram_model_arpa.c(532):    1143 = #prob2 entries
    INFO: ngram_model_arpa.c(539):    1654 = #bo_wt2 entries
    INFO: ngram_model_arpa.c(291): Reading trigrams
    INFO: ngram_model_arpa.c(552):   31169 = #trigrams created
    INFO: ngram_model_arpa.c(553):     428 = #prob3 entries
    INFO: ngram_model_dmp.c(492): Building DMP model...
    INFO: ngram_model_dmp.c(522):    3310 = #unigrams created
    INFO: ngram_model_dmp.c(622):   18342 = #bigrams created
    INFO: ngram_model_dmp.c(623):    1143 = #prob2 entries
    INFO: ngram_model_dmp.c(630):    1654 = #bo_wt2 entries
    INFO: ngram_model_dmp.c(634):   31169 = #trigrams created
    INFO: ngram_model_dmp.c(635):     428 = #prob3 entries

With that, it was time to run the decoder. I decided to start with the first 1000 utterances. I plan to work up to all 3506 (a count I got by running ls ../wav | wc -l from the etc directory of my experiment), but since this is a testing session and I don't want to consume too much time, I stuck to the first thousand or so. Everything appeared to run correctly until I attempted to run the actual script and got a "command not found" error. Undeterred, I ran this command instead:

    nohup perl run_decode.pl 005 0282/005 1000

The command provided no feedback, and I strongly suspect an ampersand was supposed to be attached to it, at least so that a longer-than-expected run would not require me to stay logged in. The log file was successfully created, and I produced a scoring file on the first go. The chart is shown below.

SYSTEM SUMMARY PERCENTAGES by SPEAKER

    ,-----------------------------------------------------------------.
    |                            hyp.trans                            |
    |---------+-------------+-----------------------------------------|
    | SPKR    | # Snt # Wrd | Corr    Sub    Del    Ins    Err  S.Err |
    |---------+-------------+-----------------------------------------|
    | sw2001b |   18    163 | 79.8   16.6    3.7   39.9   60.1  100.0 |
    | sw2001a |   14    101 | 82.2   15.8    2.0   50.5   68.3  100.0 |
    | sw2005a |   39    701 | 81.9   13.1    5.0   13.8   32.0   94.9 |
    | sw2005b |   67    613 | 64.9   24.3   10.8   29.2   64.3  100.0 |
    | sw2006b |   29    618 | 72.5   18.4    9.1    7.3   34.8  100.0 |
    | sw2006a |   33    455 | 85.9   11.0    3.1   15.4   29.5  100.0 |
    | sw2007a |   61    614 | 76.2   16.0    7.8   17.3   41.0   91.8 |
    | sw2007b |   60    861 | 81.3   14.6    4.1    9.4   28.1   90.0 |
    | sw2008a |   24    260 | 80.4   16.5    3.1   26.9   46.5  100.0 |
    | sw2008b |   26    257 | 86.4   10.1    3.5   26.5   40.1   92.3 |
    | sw2009b |   23    181 | 74.6   17.1    8.3   29.3   54.7   95.7 |
    | sw2009a |   34    473 | 69.3   23.7    7.0   17.3   48.0   97.1 |
    | sw2010b |   22    404 | 69.6   21.5    8.9   11.1   41.6   95.5 |
    | sw2010a |   27    284 | 78.5   16.2    5.3   20.8   42.3   96.3 |
    | sw2012a |   45    838 | 82.2   11.6    6.2   12.6   30.4   93.3 |
    | sw2012b |   28    464 | 77.6   16.8    5.6   15.3   37.7  100.0 |
    | sw2013a |   35    377 | 68.4   26.3    5.3   21.8   53.3   97.1 |
    | sw2013b |   69    942 | 52.0   33.7   14.3   12.7   60.7   97.1 |
    | sw2014a |    9     79 | 74.7   21.5    3.8   32.9   58.2  100.0 |
    | sw2014b |   13    174 | 82.2   13.8    4.0   16.7   34.5   92.3 |
    | sw2015a |   21    375 | 74.1   17.1    8.8    3.5   29.3   85.7 |
    | sw2015b |   32    542 | 75.5   14.0   10.5    9.2   33.8  100.0 |
    | sw2017b |   38    658 | 81.8   12.9    5.3   14.3   32.5  100.0 |
    | sw2017a |   40    332 | 76.8   16.9    6.3   29.5   52.7  100.0 |
    | sw2018a |   49    505 | 62.2   31.7    6.1   43.0   80.8  100.0 |
    | sw2018b |   42    586 | 73.7   19.8    6.5   20.0   46.2  100.0 |
    | sw2019a |   50    594 | 75.4   18.0    6.6   12.1   36.7   90.0 |
    | sw2019b |   52    452 | 83.2   13.3    3.5   24.3   41.2  100.0 |
    |=================================================================|
    | Sum/Avg | 1000  12903 | 74.7   18.4    6.9   17.6   42.9   96.6 |
    |=================================================================|
    |  Mean   | 35.7  460.8 | 75.8   17.9    6.2   20.8   45.0   96.8 |
    |  S.D.   | 16.3  229.0 |  7.6    5.7    2.8   11.3   13.5    4.1 |
    | Median  | 33.5  459.5 | 76.5   16.7    5.9   17.3   41.4   98.6 |
    `-----------------------------------------------------------------'
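One sanity check on the summary row: the Err column is the sum of the Sub, Del, and Ins columns, which is why Err can exceed 100 minus Corr (the Corr percentage does not account for insertions). Verifying the arithmetic for the Sum/Avg row with a quick awk one-liner:

```shell
# Check that Err = Sub + Del + Ins for the Sum/Avg row above
# (Sub 18.4, Del 6.9, Ins 17.6 should give Err 42.9).
awk 'BEGIN {
  sub_pct = 18.4; del_pct = 6.9; ins_pct = 17.6
  printf "Err = %.1f%%\n", sub_pct + del_pct + ins_pct
}'
```

This matches the reported 42.9, so the 96.6% sentence error rate (S.Err) rather than the word-level Err is the figure that looks alarmingly high here.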