Speech:Exps 0297 003

Description
Author: Andrew George

Date: 3-3-2017

Purpose: Running a 5-hour training run

Details:

`feat/sw4927A-ms98-a-0060.mfc' -> `/mnt/main/corpus/switchboard/full/train/audio/mfc/sw4927A-ms98-a-0060.mfc'
`feat/sw4928A-ms98-a-0034.mfc' -> `/mnt/main/corpus/switchboard/full/train/audio/mfc/sw4928A-ms98-a-0034.mfc'
`feat/sw4936A-ms98-a-0033.mfc' -> `/mnt/main/corpus/switchboard/full/train/audio/mfc/sw4936A-ms98-a-0033.mfc'
`feat/sw4940A-ms98-a-0013.mfc' -> `/mnt/main/corpus/switchboard/full/train/audio/mfc/sw4940A-ms98-a-0013.mfc'
`feat/sw4940B-ms98-a-0057.mfc' -> `/mnt/main/corpus/switchboard/full/train/audio/mfc/sw4940B-ms98-a-0057.mfc'

Complete!

Complete!

Run "nohup scripts_pl/RunAll.pl &" to begin training.

[acg12@miraculix 003]$ nohup scripts_pl/RunAll.pl &

[1] 2478

[acg12@miraculix 003]$
MODULE: 00 verify training files

O.S. is case sensitive ("A" != "a").

Phones will be treated as case sensitive.

Phase 1: DICT - Checking to see if the dict and filler dict agrees with the phonelist file.
Found 4781 words using 43 phones
Phase 2: DICT - Checking to make sure there are not duplicate entries in the dictionary
Phase 3: CTL - Check general format; utterance length (must be positive); files exist
Phase 4: CTL - Checking number of lines in the transcript should match lines in control file
Phase 5: CTL - Determine amount of training data, see if n_tied_states seems reasonable.
Total Hours Training: 5.20207478632504
This is a small amount of data, no comment at this time
Phase 6: TRANSCRIPT - Checking that all the words in the transcript are in the dictionary
Words in dictionary: 4775
Words in filler dictionary: 6
Phase 7: TRANSCRIPT - Checking that all the phones in the transcript are in the phonelist, and all phones in the phonelist appear at least once

MODULE: 01 Vector Quantization
Skipped for continuous models

MODULE: 02 Training Context Independent models for forced alignment and VTLN
Skipped: $ST::CFG_FORCEDALIGN set to 'no' in sphinx_train.cfg
Skipped: $ST::CFG_VTLN set to 'no' in sphinx_train.cfg

MODULE: 03 Force-aligning transcripts
Skipped: $ST::CFG_FORCEDALIGN set to 'no' in sphinx_train.cfg

MODULE: 04 Force-aligning data for VTLN
Skipped: $ST::CFG_VTLN set to 'no' in sphinx_train.cfg

MODULE: 05 Train LDA transformation
Skipped (set $CFG_LDA_MLLT = 'yes' to enable)

MODULE: 06 Train MLLT transformation
Skipped (set $CFG_LDA_MLLT = 'yes' to enable)
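All of the skipped stages above are controlled by flags in sphinx_train.cfg. As a sketch only (the variable names are taken from the skip messages in the log; the exact names and layout can differ between SphinxTrain versions), the relevant lines look like:

```perl
# sphinx_train.cfg (excerpt; names taken from the skip messages above)
$CFG_FORCEDALIGN = 'no';    # 'yes' enables forced alignment (modules 02/03)
$CFG_VTLN        = 'no';    # 'yes' enables VTLN (modules 02/04)
$CFG_LDA_MLLT    = 'no';    # 'yes' enables LDA/MLLT transforms (modules 05/06)
```

With this run's 5.2 hours of data, leaving these off keeps the pipeline minimal.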

MODULE: 20 Training Context Independent models
Phase 1: Cleaning up directories: accumulator...logs...qmanager...models...
Phase 2: Flat initialize
Phase 3: Forward-Backward
Baum-Welch iteration 1 Average log-likelihood -4.29953813929095

Baum-Welch iteration 2 Average log-likelihood -3.22528399896217

Baum-Welch iteration 3 Average log-likelihood -1.17170908308954

Baum-Welch iteration 4 Average log-likelihood 0.14360948860453

Baum-Welch iteration 5 Average log-likelihood 0.639224983835538

Baum-Welch iteration 6 Average log-likelihood 0.801541767523234

Baum-Welch iteration 7 Average log-likelihood 0.871504218048702

Baum-Welch iteration 8 Average log-likelihood 0.919400028709894
Training completed after 9 iterations
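Training stops here because the gain in average log-likelihood between iterations has dropped below SphinxTrain's convergence threshold. A minimal illustration using the last two likelihoods from the log above (the 0.1 cutoff is an assumed example value, not this experiment's actual convergence ratio):

```shell
# Compare the last two average log-likelihoods from the log above;
# stop once the improvement falls below an (assumed) threshold of 0.1.
prev=0.871504218048702   # iteration 7
curr=0.919400028709894   # iteration 8
awk -v p="$prev" -v c="$curr" 'BEGIN {
    if (c - p < 0.1) print "converged"; else print "keep training"
}'
# prints "converged"
```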

MODULE: 30 Training Context Dependent models
Phase 1: Cleaning up directories: accumulator...logs...qmanager...
Phase 2: Initialization

WARNING: This step had 0 ERROR messages and 1 WARNING messages. Please check the log file for details.
Phase 3: Forward-Backward

Baum-Welch iteration 1 Average log-likelihood 0.97404330674416

Baum-Welch iteration 2 Average log-likelihood 4.10839191258828

Baum-Welch iteration 3 Average log-likelihood 4.10839191258828
Training completed after 4 iterations

MODULE: 40 Build Trees
Phase 1: Cleaning up old log files...
Phase 2: Make Questions
Phase 3: Tree building
Processing each phone with each state
Skipping +laugh+
Skipping +noise+
Skipping SIL
Skipping +vocalized+

MODULE: 45 Prune Trees
Phase 1: Tree Pruning
Phase 2: State Tying

MODULE: 50 Training Context dependent models
Phase 1: Cleaning up directories: accumulator...logs...qmanager...
Phase 2: Copy CI to CD initialize
Phase 3: Forward-Backward

Baum-Welch gaussians 1 iteration 1 Average log-likelihood 0.97404330674416

Baum-Welch gaussians 1 iteration 2 Average log-likelihood 1.9876213407351

Baum-Welch gaussians 1 iteration 3 Average log-likelihood 2.19972363145882

Baum-Welch gaussians 1 iteration 4 Average log-likelihood 2.27945002521629

Baum-Welch gaussians 2 iteration 1 Average log-likelihood 2.27945002521629

Baum-Welch gaussians 2 iteration 2 Average log-likelihood 2.46259875572927

Baum-Welch gaussians 2 iteration 3 Average log-likelihood 2.9305915582444

Baum-Welch gaussians 2 iteration 4 Average log-likelihood 3.36600695416455

Baum-Welch gaussians 2 iteration 5 Average log-likelihood 3.5799823598345

Baum-Welch gaussians 2 iteration 6 Average log-likelihood 3.68487848443843

Baum-Welch gaussians 4 iteration 1 Average log-likelihood 3.27729874638923

Baum-Welch gaussians 4 iteration 2 Average log-likelihood 3.85697747956527

Baum-Welch gaussians 4 iteration 3 Average log-likelihood 4.15700916415856

Baum-Welch gaussians 4 iteration 4 Average log-likelihood 4.49727633594374

Baum-Welch gaussians 4 iteration 5 Average log-likelihood 4.71955167098665

Baum-Welch gaussians 4 iteration 6 Average log-likelihood 4.84172944934142

Baum-Welch gaussians 8 iteration 1 Average log-likelihood 4.46556717141779

Baum-Welch gaussians 8 iteration 2 Average log-likelihood 5.05477340087299

Baum-Welch gaussians 8 iteration 3 Average log-likelihood 5.32856496200895

Baum-Welch gaussians 8 iteration 4 Average log-likelihood 5.61711175085846

Training for 8 Gaussian(s) completed after 5 iterations

MODULE: 90 deleted interpolation
Skipped for continuous models

MODULE: 99 Convert to Sphinx2 format models
Can not create models used by Sphinx-II.
If you intend to create models to use with Sphinx-II models, please rerun with:
$ST::CFG_HMM_TYPE = '.semi.' or $ST::CFG_HMM_TYPE = '.cont' and
$ST::CFG_FEATURE = '1s_12c_12d_3p_12dd' and
$ST::CFG_STATESPERHMM = '5'
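Restated as a config fragment (taken directly from the MODULE 99 message above; treat it as a sketch, since the message itself offers both the '.semi.' and '.cont' options and the in-file variable names may omit the $ST:: package prefix):

```perl
# sphinx_train.cfg settings named by MODULE 99 for Sphinx-II output
$CFG_HMM_TYPE     = '.semi.';              # or '.cont', per the message above
$CFG_FEATURE      = '1s_12c_12d_3p_12dd';
$CFG_STATESPERHMM = '5';
```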

[1]   Done                          scripts_pl/RunAll.pl

Results: Training ran successfully and the language model was created, but decoding ran into issues.
 * Decoding succeeded until running the command:
 * sclite -r 003_train.trans -h hyp.trans -i swb >> scoring.log
 * Received the error: sclite: Command not found.
 * Backed out of Miraculix and ran the command from Caesar:
 * [acg12@caesar etc]$ sclite -r 003_train.trans -h hyp.trans -i swb >> scoring.log
 * Segmentation fault (core dumped)
 * Experiment failed.