Speech:Spring 2014 Ramon Whitman Log



Week Ending February 4th, 2014
  This is my first time using ssh to access Caesar, both to better understand how it works and to change my access password. View Josh's log and the other group members' logs to see what they have done and whether any of it could help me: Josh went through SpEAK, so I would also like to set it up myself and better understand how it works. The goal is that by setting up SpEAK, looking through all the wiki pages, and trying it myself, I can prepare for later in the course. Read through the experiment wiki page and existing experiments. See if any fields need to be added. Explore and document how the experiments are saved on disk. Familiarize myself with Perl and attempt to create Perl scripts for creating new experiments.
 * Tasks:


 * Results:

 February 1st, 2014 

Tried to set up the SpEAK DB but was unable to follow the instructions. The file paths and file names listed in them need updating: when looking for the sql files, I could not find the ones the instructions mention. There are files in the old folder, but the folder's name makes it clear that they are older versions of the existing db. Posted questions on our group's Google group and tried to set up SpEAK again. No luck so far; however, I hope for responses and will try again tomorrow.

After working on the SpEAK setup I switched over to working on Caesar. I successfully changed my password with the Unix command passwd. Spent some time familiarizing myself with Unix commands.

caesar sp14/rav57> whoami
rav57
caesar sp14/rav57> man
What manual page do you want?
caesar sp14/rav57> etthool

CORRECT>ethtool (y|n|e|a)? no

etthool: Command not found.

caesar sp14/rav57> uname -r
2.6.34.10-0.6-default

caesar sp14/rav57> df -h
Filesystem  Size  Used Avail Use% Mounted on
/dev/sda2    20G  5.3G   14G  29% /
devtmpfs   1012M  168K 1012M   1% /dev
tmpfs      1012M  192K 1012M   1% /dev/shm
/dev/sda3    46G  337M   43G   1% /home
/dev/sdb1   438G  119G  297G  29% /mnt/main

Found more commands to use and test at http://www.gotothings.com/unix/unix-commands-cheat-sheet.htm

Started to look at the different experiments as well as experiment setup notes to get a better idea on how it is done.

 February 2nd, 2014 

Started more testing of Unix commands. Was able to list all hidden and non-hidden files in /mnt/main. The goal today is to test further and see what can be done on Caesar with my permission levels. I want to see what is on Caesar and note any questions that come up from testing.

caesar /mnt/main> ls -al
total 64
drwxrwxrwx  16 root  root   4096 2014-02-01 17:42 .
drwxr-xr-x   3 root  root   4096 2013-02-14 02:54 ..
drwx--       4 root  root   4096 2012-07-10 06:24 backup
-rw-r--r--   1 jmr95 cis790    0 2014-02-01 17:42 cd
drwxr-xr-x   6 root  cis790 4096 2013-04-25 13:34 corpus
drwxrwxr-x 145 root  cis790 4096 2013-09-21 20:29 Exp
drwxr-xr-x   9 root  cis790 4096 2014-01-30 18:12 home
drwxr-xr-x   4 root  root   4096 2012-06-27 07:39 install
drwxr-xr-x  11 root  cis790 4096 2012-03-20 03:48 local
-rw-r--r--   1 jmr95 cis790    0 2014-02-01 17:42 ls
drwxr-xr-x   2 root  cis790 4096 2012-01-30 10:15 notes
drwxr-xr-x   6 root  root   4096 2012-06-27 07:58 old
drwxr-xr-x   8 root  cis790 4096 2013-02-19 01:32 root
drwxr-xr-x   8 root  cis790 4096 2014-01-30 23:10 scripts
drwxrwxr-x   3 2204  cis790 4096 2013-06-28 09:22 srv
-rw-r--r--   1 jmr95 cis790    0 2014-02-01 17:42 ssh
drwxr-xr-x   6 root  root   4096 2011-02-14 05:01 svn
drwxr-xr-x   3 root  root   4096 2013-06-25 18:53 ttemp
drwxr-xr-x   3 root  root   4096 2013-06-27 21:44 var
caesar /mnt/main>

Also took a look at all the hardware information we had on miraculix. Found some interesting commands in the Unix notes and tried them all out.

hwinfo

Directory: /mnt/main/Exp

caesar main/Exp> ls
0001  0007  0013  0019  0025  0031  0037  0043  0049  0055  0061  0067  0073  0079  0085  0091  0097  0103  0109  0115  0121  0127  0133  0139
0002  0008  0014  0020  0026  0032  0038  0044  0050  0056  0062  0068  0074  0080  0086  0092  0098  0104  0110  0116  0122  0128  0134  0140
0003  0009  0015  0021  0027  0033  0039  0045  0051  0057  0063  0069  0075  0081  0087  0093  0099  0105  0111  0117  0123  0129  0135  0141
0004  0010  0016  0022  0028  0034  0040  0046  0052  0058  0064  0070  0076  0082  0088  0094  0100  0106  0112  0118  0124  0130  0136  0142
0005  0011  0017  0023  0029  0035  0041  0047  0053  0059  0065  0071  0077  0083  0089  0095  0101  0107  0113  0119  0125  0131  0137  0143
0006  0012  0018  0024  0030  0036  0042  0048  0054  0060  0066  0072  0078  0084  0090  0096  0102  0108  0114  0120  0126  0132  0138
caesar main/Exp>

 February 03, 2014 

Read through the logs and was still unable to get SpEAK running on my computer. The plan is to keep familiarizing myself with SpEAK, Unix, Perl, and Caesar through the MediaWiki pages and online sources, and to get the web application running on my computer.

 February 4, 2014 

Read logs.
 * Plan:


 * Concerns:

Week Ending February 11, 2014
 2/10/14 - Logged in and viewed Brian's, Pauline's, and Josh's logs.
 * Review Speech: Spring 2014 Experiment Group http://foss.unh.edu/projects/index.php/Speech:Spring_2014_Experiment_Group
 * Review Speech: Training http://foss.unh.edu/projects/index.php/Speech:Training
 * Review Speech: Spring 2014 Proposal http://foss.unh.edu/projects/index.php/Speech:Spring_2014_Proposal

 Friday, February 7th, 2014 
 * Task:

Based on our group's immediate goals, and to better prepare myself to write the Perl scripts for training, I talked to Pauline. We both agreed that it would help us a lot to set up a train ourselves, so I looked at the notes on setting up a train:

 * Set up the task directory - http://foss.unh.edu/projects/index.php/Speech:Training#Set_up_the_task_directory
 * Set up the Sphinx Train Configuration file - http://foss.unh.edu/projects/index.php/Speech:Training#Set_up_the_Sphinx_Train_Configuration_file

Read logs and looked at some of the experiments.

 February 8th, 2014 

Accidentally created experiment 0151 while trying to create my own training. I do not have permissions to remove it, so I finally went with 0152.

Could not follow these instructions:

Prep the experiment directory. This process creates all sub-folders, copies over some essential scripts (though not all), and imports a generic train configuration file (sphinx_train.cfg).

% /mnt/main/root/tools/SphinxTrain-1.0/scripts_pl/setup_SphinxTrain.pl -task <experiment #>

Do not execute this in the root experiment folder! (/mnt/main/Exp)

It will make a mess.

I did not follow the instructions correctly, so I accidentally ran the setup in the root Exp folder. After twenty minutes of looking up how to delete things, I removed the directories and files I had accidentally created in the root Exp folder.
 * Results:

 February 9th, 2014 

Did everything up to the start of the train. Am stuck on terms missing from the dictionary. I am following the instructions but keep seeing the same words reported missing, even though I have tried it different ways.

Now stuck with:

Configuration (e.g. etc/sphinx_train.cfg) not defined
Compilation failed in require at /mnt/main/Exp/0152/scripts_pl/RunAll.pl line 48.
BEGIN failed--compilation aborted at /mnt/main/Exp/0152/scripts_pl/RunAll.pl line 48.

 February 11th, 2014 

MODULE: 00 verify training files
O.S. is case sensitive ("A" != "a").
Phones will be treated as case sensitive.
Can not open the dictionary (/mnt/main/Exp/0152/etc/0152.dic) at /mnt/main/Exp/0152/scripts_pl/00.verify/verify_all.pl line 87.
Something failed: (/mnt/main/Exp/0152/scripts_pl/00.verify/verify_all.pl)

Will ask Pauline and the group whether they succeeded with a train and how they followed the instructions. I will ask my group for help if I am unable to complete this by next class. I am not too familiar with Perl, so I am a little anxious about writing automated scripts. Plan to talk to the group about Perl and what we plan to do for the following week.
 * Plan:
 * Concerns:

Week Ending February 18, 2014

 * Logged on from Saturday to Tuesday
 * Just checked Logs and site 2/18/14


 * Task:
 * Fully explore and understand the data structure and workings of an experiment.
 * Go through the experiment and run a train instructions again to see what each step does, what files are used and where.
 * Note any current wiki guide documentation that needs to be backed up and updated.


 * Results:

I decided to start at the beginning instead and work my way forward from there.
 * February 15, 2014


 * Model Building


 * 1. Preparation

To run a train and decode, we need three main groups of files: the actual audio files in .SPH format, a transcript of the audio files, and a working dictionary.

After running an experiment I learned that it is key to have a dictionary with the most current words, their phonetic spellings, and the pronunciations for names. If we do not have this, the train setup instructions explain how to update the dictionary manually.

The easiest way is to open the file in Notepad and add the entries manually.

To get the phonetic spelling for a word (instructions copied below): http://www.speech.cs.cmu.edu/cgi-bin/cmudict

1. Search for the word in the CMU Pronouncing Dictionary. Be sure to check the "Show lexical stress" box before searching! The trainer expects these lexical stress indicators: the numbers 0 through 2 attached to the end of certain phones, which slightly modify how the phone is pronounced. If you are trying to find a number, type it out as a word instead of a numeric character (e.g. "seven" instead of "7"). Also, do not include the periods that the dictionary puts at the end of each word! They will cause the trainer to error out.

2. Generate the phonetic spelling based on similar words. This method is especially useful for compound words. For example, to create the phonetic spelling for Sawmill, get the phonetic spellings of Saw (S AO1) and Mill (M IH1 L) from the CMU Pronouncing Dictionary and concatenate them to form S AO1 M IH1 L.

3. Generate the phonetic spelling yourself. This way is a bit harder; I only recommend it if you can't find the word with the previous methods. Get the IPA spelling from a good dictionary. Using the IPA to Arpabet phoneme comparison list, translate each IPA symbol from the dictionary to the matching Arpabet symbol. You will need to add the stress values at the end of each stressed syllabic vowel.
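The compound-word method (Sawmill from Saw and Mill) can be sketched in a few lines of Python. This is only an illustration: the CMU_ENTRIES lookup table and the compound_entry function are hypothetical names, the two pronunciations are the ones quoted in this log, and the single-space "WORD PHONES" output format is an assumption about the dictionary file.

```python
import re

# Hypothetical mini-lookup; real entries would come from the CMU Pronouncing Dictionary.
CMU_ENTRIES = {
    "SAW": "S AO1",
    "MILL": "M IH1 L",
}

# An Arpabet phone: capital letters, optionally followed by a stress digit 0-2.
ARPABET_PHONE = re.compile(r"^[A-Z]+[0-2]?$")


def compound_entry(word, parts, lookup=CMU_ENTRIES):
    """Build a dictionary line for a compound word from the spellings of its parts."""
    phones = []
    for part in parts:
        # Drop the trailing period that the CMU web page prints after each result.
        spelling = lookup[part.upper()].rstrip(". ")
        phones.extend(spelling.split())
    for phone in phones:
        if not ARPABET_PHONE.match(phone):
            raise ValueError(f"unexpected phone: {phone!r}")
    return f"{word.upper()} {' '.join(phones)}"


print(compound_entry("Sawmill", ["saw", "mill"]))
```

The stress check is there because the trainer reportedly errors out on stray characters such as the trailing periods.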


 * /mnt/main/corpus/switchboard is where you can find the audio files used in all the trains
 * /mnt/main/corpus/dist is where the dictionaries for the trains can be found


 * 2. Language Preparation

This deals with developing a language model. I didn't fully understand what one was, so I looked it up.

"Language modeling is used in many natural language processing applications such as speech recognition, machine translation, part-of-speech tagging, parsing and information retrieval.

In speech recognition and in data compression, such a model tries to capture the properties of a language, and to predict the next word in a speech sequence."

Anyway in order to do this, I looked at the instructions:


 * First build a language model by removing all unwanted characters from the raw transcript file.
 * We do this, by using the ParseTranscript.perl script
 * execute the script with "perl ParseTranscript.perl test.txt tmp.text"
 * The result of running this script is a copy of the transcript that contains only what was said in the audio files.
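To make the cleaning step more concrete, here is a rough Python sketch of what "keeping only what was said" could look like. It is not the ParseTranscript.perl script. The input format (utterance IDs in parentheses such as (sw0001-A_0001) and markup tokens in angle brackets) and the function name spoken_words_only are assumptions made for illustration.

```python
import re


def spoken_words_only(line):
    """Return just the spoken words from one transcript line (assumed format)."""
    # Remove utterance IDs in parentheses, e.g. "(sw0001-A_0001)".
    line = re.sub(r"\([^)]*\)", " ", line)
    # Remove markup tokens such as <s>, </s> or <noise>.
    line = re.sub(r"<[^>]*>", " ", line)
    # Keep letters, digits, apostrophes and hyphens; replace anything else with a space.
    line = re.sub(r"[^A-Za-z0-9'\- ]", " ", line)
    # Collapse the leftover runs of whitespace.
    return " ".join(line.split())


print(spoken_words_only("<s> HELLO, THERE </s> (sw0001-A_0001)"))
```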

Creating the Language Model

This step involves creating the actual language model. This is done by running the lm_create.pl script.


 * This perl script calls four different executable commands.


 * 3. Building and Verifying Acoustic Models

This involves two things: building an acoustic model and verifying it.


 * Building an Acoustic Model


 * A mini train and decode was completed several times with different data following these steps.
 * The purpose of this task is to take conversations saved and their transcripts to be able to create a speech recognition tool.
 * The trainer grabs the .wav files, phonemes dictionary, dictionary, and transcript of the conversations.
 * We then match up the audio with the transcript.


 * Verifying an Acoustic Model


 * The decoder is needed to verify whether the model works and how accurate it is. This is done by running one script, run_decode.pl


 * February 17th, 2014


 * Running a train

After a train, the experiment directory contains these folders:


 * bin: (Don't know)
 * bwaccumdir: contains counting files
 * etc: contains the sphinx configuration file (sphinx_train.cfg), the transcript, the experiment dictionary (<experiment #>.dic). The .dic contains a list of words and their pronunciation in Arpabet format, and the experiment file-IDs.
 * feat: Features data is used in training and is derived from the recordings.
 * logdir: Contains log files for
 * model_architecture:
 * model_parameters:
 * python: in this folder, there is a folder called sphinx/ and it contains a bunch of (.py) python files.
 * scripts_pl: It looks like this folder contains all of the scripts that were executed throughout the particular experiment process.
 * wav: Contains all of the sphinx audio files
 * An html file: The trainer will create an HTML logfile in the base experiment directory with the name <experiment #>.html. This document contains everything that was output to the screen by the trainer. Forgot to cite this from Pauline: http://foss.unh.edu/projects/index.php/Speech:Spring_2014_Pauline_Wilk_Log
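A quick way to compare an experiment directory against the folder list above is a small Python check. This is only a sketch: EXPECTED_SUBDIRS is copied from the list in this log (with the scripts folder taken to be scripts_pl, as in the error paths recorded earlier), and missing_subdirs is a hypothetical helper. The demonstration runs on a temporary directory, not on Caesar.

```python
import os
import tempfile

# Folder names taken from the list in this log.
EXPECTED_SUBDIRS = [
    "bin", "bwaccumdir", "etc", "feat", "logdir",
    "model_architecture", "model_parameters", "python", "scripts_pl", "wav",
]


def missing_subdirs(exp_dir, expected=EXPECTED_SUBDIRS):
    """Return the expected sub-folders that do not exist under exp_dir."""
    return [name for name in expected
            if not os.path.isdir(os.path.join(exp_dir, name))]


# Demonstration on a throwaway directory that only has three of the folders.
with tempfile.TemporaryDirectory() as demo:
    for name in ("etc", "wav", "feat"):
        os.mkdir(os.path.join(demo, name))
    print(missing_subdirs(demo))
```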


 * Plan:

Our whole group's goals have changed, so this week I am going in a different direction. The goal is as follows, from our group's site:

We need to collectively learn the structure of the experiment directory. We will all work on this individually throughout the week and log our research in each of our personal logs. Then, before the next meeting, we will combine our efforts into a guide and add it to the information page. When learning the structure of the experiment directory, we can focus on the following:

 * Where things are
 * How they are stored
 * What does sphinx create when it runs a train?

There are currently guides on the wiki that reference items in file paths that no longer exist. Go through these guides and update them with the correct file paths and file names. Decipher and explain Eric Beikman's experiment automation scripts. (Articulate what already exists)

Describe in detail what each of them does and document it on the information MediaWiki page.


 * Concerns:

Week Ending February 25, 2014

 * Logged on 2/22/14
 * Logged on 2/23/14
 * Logged on 2/24/14
 * Logged on 2/25/14


 * Task:


 * Read how to set up experiment and document for self to understand going through the steps
 * Document the process and key terms and file uses


 * Results:


 * February 22, 2014

What we use

I have been a little confused about what the whole process does, so I wanted to understand the tools we use. Our wiki does not explain the Speech system well enough for me, so I looked up an example of each part and cited them here so I can understand what we use.

http://cmusphinx.sourceforge.net/wiki/sphinx4:sphinx4trainer
 * Sphinx Trainer

Example trainer7

The Sphinx-3 HMM trainer, taken as an example, goes through the following stages:

 * Initialization of Context Independent (CI) models. Required data: list of phones, model description. Creation of model definition file for CI phones.
 * Training of CI models. Required data: transcriptions, speech data (cepstra), CI model definition, initial set of CI models. Split training data into blocks and compute Baum-Welch variables.
 * Initialization of Context-Dependent (CD) models. Required data: list of phones, CI model definition, CI models, transcriptions. Creation of model definition file for CD phones, by creation of all possible CD phones in the dictionary, and then pruning based on frequency in the training transcripts. Initialization of models based on CI models.
 * Training of untied CD models. Required data: transcriptions, speech data, CD model definition, initial set of CD models. Split training data into blocks and compute Baum-Welch variables. Normalize. Iterate Baum-Welch and normalization until convergence.
 * Building trees. Make linguistic questions. Build classification and regression trees, so as to classify the untied states based on proximity.
 * Pruning trees. Prune trees to the desired number of senones, that is, a number of tied states.
 * Initialization of tied CD models. Creation of tied CD models definition. Creation of initial set of models from the CI models.
 * Training tied CD models. Required data: transcriptions, speech data (cepstra), tied CD model definition, initial set of tied CD models. Split training data into blocks and compute Baum-Welch variables. Normalize. Iterate Baum-Welch and normalization until convergence.
 * Deleted interpolation.
 * Conversion to other formats, for backwards compatibility.

Types of models used for Training

We need models? http://cmusphinx.sourceforge.net/wiki/tutorialconcepts


 * Models

According to the speech structure, three models are used in speech recognition to do the match:

An acoustic model contains acoustic properties for each senone. There are context-independent models that contain properties (most probable feature vectors for each phone) and context-dependent ones (built from senones with context).

A phonetic dictionary contains a mapping from words to phones. This mapping is not very effective. For example, only two to three pronunciation variants are noted in it, but it's practical enough most of the time. The dictionary is not the only variant of mapper from words to phones. It could be done with some complex function learned with a machine learning algorithm.

A language model is used to restrict word search. It defines which word could follow previously recognized words (remember that matching is a sequential process) and helps to significantly restrict the matching process by stripping words that are not probable. Most common language models used are n-gram language models (these contain statistics of word sequences) and finite state language models (these define speech sequences by finite state automation, sometimes with weights). To reach a good accuracy rate, your language model must be very successful in search space restriction. This means it should be very good at predicting the next word. A language model usually restricts the vocabulary considered to the words it contains. That's an issue for name recognition. To deal with this, a language model can contain smaller chunks like subwords or even phones. Please note that search space restriction in this case is usually worse and corresponding recognition accuracies are lower than with a word-based language model.

Those three entities are combined together in an engine to recognize speech. If you are going to apply your engine for some other language, you need to get such structures in place. For many languages there are acoustic models, phonetic dictionaries and even large vocabulary language models available for download.
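To make the "statistics of word sequences" idea more concrete for myself, here is a toy bigram (2-gram) model in Python. It is only a sketch of the general technique, not what Sphinx or lm_create.pl actually does. The three training sentences and the function names train_bigrams and next_word_probability are made up for illustration.

```python
from collections import Counter, defaultdict


def train_bigrams(sentences):
    """Count, for every word, which words follow it and how often."""
    follows = defaultdict(Counter)
    for sentence in sentences:
        # <s> and </s> mark the start and end of a sentence.
        words = ["<s>"] + sentence.lower().split() + ["</s>"]
        for prev, nxt in zip(words, words[1:]):
            follows[prev][nxt] += 1
    return follows


def next_word_probability(follows, prev, nxt):
    """Estimate P(nxt | prev) from the raw counts (no smoothing)."""
    total = sum(follows[prev].values())
    return follows[prev][nxt] / total if total else 0.0


model = train_bigrams(["i like speech", "i like perl", "i run trains"])
print(next_word_probability(model, "i", "like"))
```

A word pair that never occurs in the training text gets probability zero here, which is the "restricting the search" effect in its crudest form. Real language models smooth these counts so unseen pairs are unlikely but not impossible.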

Experiment setup


 * First, go to the experiment directory:
 * cd /mnt/main/Exp
 * Second, make a new directory to house the experiment you wish to do:
 * mkdir <exp ID>
 * Third, now that the experiment folder exists, move into that directory:
 * cd <exp ID>
 * Fourth, run the setup script. This creates all sub-folders, copies over some essential scripts (though not all), and imports a generic train configuration file (sphinx_train.cfg).
 * /mnt/main/root/tools/SphinxTrain-1.0/scripts_pl/setup_SphinxTrain.pl -task <exp ID>
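Since I ran the setup script in the root Exp folder by mistake earlier, a small guard would have helped. The Python sketch below only builds the four commands as strings and checks the working directory; it does not run anything. The function names and the four-digit ID check are my own assumptions (the existing experiments are all named like 0152).

```python
import re

EXP_ROOT = "/mnt/main/Exp"
SETUP_SCRIPT = "/mnt/main/root/tools/SphinxTrain-1.0/scripts_pl/setup_SphinxTrain.pl"


def setup_commands(exp_id):
    """Return the shell commands for the four setup steps, in order."""
    # Assumption: experiment IDs are four digits, like the existing folders.
    if not re.fullmatch(r"\d{4}", exp_id):
        raise ValueError("experiment ID should look like 0152")
    return [
        f"cd {EXP_ROOT}",
        f"mkdir {exp_id}",
        f"cd {exp_id}",
        f"{SETUP_SCRIPT} -task {exp_id}",
    ]


def safe_to_run_setup(cwd):
    """True only inside an experiment sub-directory, never in the Exp root itself."""
    return cwd.rstrip("/") != EXP_ROOT and cwd.startswith(EXP_ROOT + "/")


for command in setup_commands("0152"):
    print(command)
print(safe_to_run_setup("/mnt/main/Exp"))
```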

At first tried following this: http://foss.unh.edu/projects/index.php/Speech:Exp but had trouble with the steps

Instead tried following: http://foss.unh.edu/projects/index.php/Speech:Training

1. Once the experiment directory and basic directories are set up, the following needs to be done. Move into the etc directory: cd etc. The lines we are interested in changing are lines 6 through 8 and lines 79 & 80. Edit them so they look like the following, substituting your current experiment number for <experiment #> as always.
 * Set up the Sphinx Train Configuration file:
 * Looked at this folder and found that there, like the instructions say, important files such as the Sphinx configuration file, the transcript, the experiment dictionary, the experiment file-IDs, and quite a bit of other important files.
 * We now need to modify the Sphinx trainer configuration file (sphinx_train.cfg) to match the specific experiment environment.
 * vi sphinx_train.cfg

#These are filled in at configuration time
$CFG_DB_NAME = "<experiment #>";
$CFG_BASE_DIR = "/mnt/main/Exp/<experiment #>";
$CFG_SPHINXTRAIN_DIR = "/mnt/main/Exp";

Comment out the line on line 80 by inserting a hash/pound/number symbol (#) in front of it; likewise, uncomment the line on line 79 by removing that symbol. It should look like the following when done:

$CFG_HMM_TYPE = '.cont.'; # Sphinx III
#$CFG_HMM_TYPE = '.semi.'; # Sphinx II
 * quit with :wq
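The same edits can be described by variable name instead of by line number, which is less fragile if the file ever shifts by a line. The Python sketch below works on the text of a config file; the configure function is hypothetical and is not one of the existing scripts. It assumes the variables look exactly as shown in this log.

```python
import re


def configure(cfg_text, exp_id, exp_root="/mnt/main/Exp"):
    """Return cfg_text with the experiment settings filled in (sketch only)."""
    settings = {
        "CFG_DB_NAME": exp_id,
        "CFG_BASE_DIR": f"{exp_root}/{exp_id}",
        "CFG_SPHINXTRAIN_DIR": exp_root,
    }
    out = []
    for line in cfg_text.splitlines():
        # The HMM type lines may or may not start with a comment marker.
        stripped = line.lstrip("# \t")
        for name, value in settings.items():
            if re.match(rf"\${name}\s*=", line):
                line = f'${name} = "{value}";'
        if stripped.startswith("$CFG_HMM_TYPE"):
            # Enable the continuous (Sphinx III) line, comment out the semi-continuous one.
            line = stripped if "'.cont.'" in stripped else "#" + stripped
        out.append(line)
    return "\n".join(out) + "\n"


sample = (
    '$CFG_DB_NAME = "train1";\n'
    "#$CFG_HMM_TYPE = '.cont.'; # Sphinx III\n"
    "$CFG_HMM_TYPE = '.semi.'; # Sphinx II\n"
)
print(configure(sample, "0152"))
```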


 * Generate the transcript and its associated audio-file list.

Trains: There are multiple trains:


 * 10hr
 * 308hr
 * first_5hr
 * full
 * last_5hr
 * mini
 * mini2
 * tiny

What are they?

They contain a train.trans file that contains a written copy of a conversation.

There is also a wav directory that contains a bunch of .sph files that are the audio version of the conversation that we will use in our train.

For the longest time I did not fully understand what a transcript was:

A transcript is a written record of spoken language. This means it is a typed version of the audio files.

In order to do this we need to follow these instructions:

We now need to generate the transcripts to be used. Transcripts consist of two portions:

 * The text transcript file: <experiment #>_train.trans
 * The audio file ID list, which contains the list of audio files that make up the transcript

To generate the necessary files, we first need to determine which train we are going to do. Afterwards: http://foss.unh.edu/projects/index.php/Speech:Training

For example, /mnt/main/corpus/switchboard/mini/ contains "./dev", "./eval", and "./train". "./train" would be used for training and "./eval" would be used for evaluating the resulting model in a subsequent experiment. After you have selected a subset, execute the following script from your base experiment directory (i.e. /mnt/main/Exp/<experiment #>):
 * Run the genTrans6.pl script
 * The main Corpus subsets are found in /mnt/main/corpus/switchboard/
 * Each of those directories contain both audio files and textual transcripts (though neither are in a format that we can use directly).

Example: % /mnt/main/scripts/user/genTrans6.pl <experiment #>

For example, to create a transcript for experiment 0028 with corpus subset mini/train execute:

% /mnt/main/scripts/user/genTrans6.pl /mnt/main/corpus/switchboard/mini/train 0028

genTrans6.pl may take a little bit to process, especially if the transcript is long. Generates a copy of Transcripts and audio files in designated directory.
 * genTrans6.pl


 * Generate Custom Dictionary

The dictionary is used in the train process. We first need to generate an edited copy of the dictionary for this specific use.

To generate a "pruned" dictionary, we go to where it will be generated: cd etc

From there we run a script that prunes the master dictionary. Why? We prune the dictionary so that it has everything we need and only what we need. This is done to save time in the train and the decode.

It has three arguments (in order): the name of the transcript to generate a word list from, a "Master" dictionary to reference from, and the file name of the new dictionary to be created. Normal usage is as follows:

% /mnt/main/scripts/train/scripts_pl/pruneDictionary2.pl <experiment #>_train.trans /mnt/main/corpus/dist/cmudict.0.6d <experiment #>.dic

For now, use the /mnt/main/corpus/dist/cmudict.0.6d master dictionary, though this may change in the future. This dictionary creation process can be very time consuming; how long it takes depends on the size of the master dictionary and the number of unique words in the transcript. It may take a while if you have a lot of words.

Here are some comments from the file that I feel do well at explaining what the script does:


 * This runs text2wfreq which gives a unique list of all the words that appear in the transcript including how many times each word appears. Unfortunately that includes the (swxxx) statements. Those results are sorted and fed to grep which yanks out the sw statement lines and outputs the results to a temp file.
 * For each word in the temp word list this loop strips each word of any numbers after a whitespace (meaning that a word consisting of a numeric character will be allowed), it will also strip out any words which begin with a '<'. Such characters always precede a non-word attribute which is not defined in the dictionary. It then saves the line in a temporary pruned file.
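To check my understanding of the pruning idea, here is a rough Python sketch of the same two stages: collect the unique words from the transcript (skipping the (sw...) IDs and anything starting with '<'), then keep only the master dictionary lines for those words. It is not the pruneDictionary2.pl script. The function names, the sample lines, and the WORD(2) handling for alternate pronunciations are my own assumptions.

```python
import re
from collections import Counter


def transcript_words(lines):
    """Count each spoken word, ignoring utterance IDs and markup tokens."""
    counts = Counter()
    for line in lines:
        for token in line.split():
            # Skip utterance IDs such as (sw0001-A_0001) and markup such as <s>.
            if token.startswith("(sw") or token.startswith("<"):
                continue
            counts[token.upper()] += 1
    return counts


def prune_dictionary(master_lines, wanted):
    """Keep master entries for wanted words; also report words with no entry."""
    kept, found = [], set()
    for line in master_lines:
        parts = line.split()
        if not parts:
            continue
        # Alternate pronunciations are assumed to look like WORD(2); map back to WORD.
        base = re.sub(r"\(\d+\)$", "", parts[0])
        if base in wanted:
            kept.append(line.rstrip("\n"))
            found.add(base)
    missing = sorted(set(wanted) - found)
    return kept, missing


transcript = [
    "<s> HELLO WORLD </s> (sw0001-A_0001)",
    "<s> hello sawmill </s> (sw0001-B_0002)",
]
master = [
    "HELLO  HH AH0 L OW1",
    "HELLO(2)  HH EH0 L OW1",
    "WORLD  W ER1 L D",
    "ZEBRA  Z IY1 B R AH0",
]
kept, missing = prune_dictionary(master, transcript_words(transcript))
print(kept)
print(missing)
```

The list of missing words is the part I kept running into during my first train: any word in the transcript without a dictionary entry has to be added by hand.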

After that, we have to copy the "filler" dictionary into the same etc folder.


 * The filler dictionary is composed of non-speech events, mapping them to user-defined phones.
 * An example usage would be:

cp -i /mnt/main/root/tools/SphinxTrain-1.0/train1/etc/train1.filler <experiment #>.filler


 * Generate phone list http://www.speech.cs.cmu.edu/sphinxman/scriptman1.html#02


 * phonelist, which is a list of all acoustic units that you want to train models for.

How you do it:

Copy the genPhones.csh script to your etc folder:

% cp -i /mnt/main/scripts/user/genPhones.csh .

Execute it with:

% ./genPhones.csh <experiment #>

We need to insert a new phone into the <experiment #>.phone list created in the last step.

Open <experiment #>.phone in vi and insert SIL in the appropriate alphabetically ordered spot. Not doing this will cause the trainer to error out.
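Finding the right alphabetical spot by eye is easy to get wrong, so here is a small Python sketch of the same insertion. The add_phone function and the short sample phone list are made up for illustration; the real list comes from genPhones.csh.

```python
import bisect


def add_phone(phones, new_phone="SIL"):
    """Return the phone list in sorted order with new_phone inserted once."""
    ordered = sorted(set(phones))
    if new_phone not in ordered:
        # insort keeps the list sorted while inserting.
        bisect.insort(ordered, new_phone)
    return ordered


print(add_phone(["AA", "B", "SH", "T", "ZH"]))
```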

Generate Feats data. "Feats data, short for Features, is used in training and is derived from the recordings. The data derived from this step is also used when decoding."

To create Feats for your train, simply execute the following in your base experiment folder:

% /mnt/main/scripts/train/scripts_pl/make_feats.pl -ctl /mnt/main/Exp/<experiment #>/etc/<experiment #>_train.fileids

For example, to create the feats data for experiment 0028, execute:

% /mnt/main/scripts/train/scripts_pl/make_feats.pl -ctl /mnt/main/Exp/0028/etc/0028_train.fileids


 * Start the Train!

Run the following in your base experiment folder:

% /mnt/main/scripts/train/scripts_pl/RunAll.pl

All Phones used in the dictionary are defined in the <experiment #>.phone file.
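That requirement can be checked before starting a train. The Python sketch below compares the phones used in dictionary lines against a phone list and reports any word that uses an undefined phone. The undefined_phones function and the sample data are hypothetical, and it assumes each dictionary line is a word followed by its phones separated by whitespace.

```python
def undefined_phones(dict_lines, phone_list):
    """Map each word to the phones it uses that are not in phone_list."""
    known = set(phone_list)
    unknown = {}
    for line in dict_lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        word, phones = parts[0], parts[1:]
        bad = [phone for phone in phones if phone not in known]
        if bad:
            unknown[word] = bad
    return unknown


phones = ["AH0", "HH", "L", "OW1", "SIL"]
entries = ["HELLO HH AH0 L OW1", "SAW S AO1"]
print(undefined_phones(entries, phones))
```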


 * Language Model

Set up the Language Model folder and copy over the unedited transcript. This transcript will be used to build a language model, which was defined earlier in the log as a model that identifies likely sequences of words and helps restrict the matching process.

From your Base Experiment folder make a folder called LM.

% mkdir LM

Go into this new directory.

% cd LM

Copy over the transcript from the corpus directory, using the same corpus path you used when creating your transcript (with genTrans.pl):

% cp -i /trans/train.trans trans_unedited

FOR EXAMPLE: If we are using the mini/train corpus:

% cp -i /mnt/main/corpus/switchboard/mini/train/trans/train.trans trans_unedited


 * Important scripts

After talking to my group we found that a lot of scripts are undocumented. Pauline has already gone through many of the scripts and documented who made them, where they are located, and what they do. Based on this, I have read her logs, and for my purposes I will just list their names for reference and link to Pauline's page for the full details. http://foss.unh.edu/projects/index.php/Speech:Spring_2014_Pauline_Wilk_Log

List
 * 1) clone_exp.pl - /mnt/main/scripts/user/clone_exp.pl
 * 2) convert.pl - /mnt/main/scripts/user/convert.pl
 * 3) copySph.pl - /mnt/main/scripts/user/copySph.pl
 * 4) createTranscript.pl - /mnt/main/scripts/user/createTranscript.pl
 * 5) createdict.pl - /mnt/main/scripts/user/createDict.pl and /mnt/main/scripts/train/scripts_pl/createdict.pl
 * 6) dictionary.pl - /mnt/main/scripts/user/dictionary.pl
 * 7) dictionary2.pl - /mnt/main/scripts/user/dictionary2.pl
 * 8) dictionary3.pl - /mnt/main/scripts/user/dictionary3.pl
 * 9) find.pl - /mnt/main/scripts/user/find.pl
 * 10) gen_errors.pl - /mnt/main/scripts/user/gen_errors.pl
 * 11) genFileIDs.csh - /mnt/main/scripts/user/genFileIDs.csh
 * 12) genPhones.csh - /mnt/main/scripts/user/genPhones.csh
 * 13) genTrans.pl - /mnt/main/scripts/user/genTrans.pl also in /mnt/main/corpus/scripts/genTRans.pl
 * 14) genTrans2.pl - /mnt/main/scripts/user/genTrans2.pl
 * 15) genTrans3.pl - /mnt/main/scripts/user/genTrans3.pl
 * 16) genTrans4.pl - /mnt/main/scripts/user/genTrans4.pl
 * 17) genTrans5.pl - /mnt/main/scripts/user/genTrans5.pl
 * 18) genTrans6.pl - /mnt/main/scripts/user/genTrans6.pl
 * 19) lm_create.pl - /mnt/main/scripts/user/lm_create.pl
 * 20) make_feats.pl - /mnt/main/scripts/train/scripts_pl/make_feats.pl
 * 21) make_phoneset.pl - /mnt/main/scripts/train/scripts_pl/make_phoneset.pl
 * 22) parseDecode.pl - /mnt/main/scripts/user/parseDecode.pl
 * 23) parseTranscript2.pl - /mnt/main/scripts/user/ParseTranscript2.pl
 * 24) process_missing_words.pl - the title claims that it processes missing words. Not sure on this one yet.
 * 25) pruneDictionary2.pl - /mnt/main/scripts/user/PruneDictionary2.pl
 * 26) pruneDictionary3.pl - /mnt/main/scripts/train/scripts_pl/pruneDictionary3.pl
 * 27) RunAll.pl - /mnt/main/scripts/train/scripts_pl/RunAll.pl
 * 28) run_decode.pl - /mnt/main/scripts/user/run_decode.pl
 * 29) run_decode2.pl - /mnt/main/scripts/user/run_decode2.pl
 * 30) run_decode3.pl - /mnt/main/scripts/user/run_decode3.pl
 * 31) setup_SphinxTrain.pl - /mnt/main/scripts/user/setup_SphinxTrain.pl
 * 32) setup_tutorial.pl - /mnt/main/scripts/train/scripts_pl/setup_tutorial.pl
 * 33) train_01.pl - /mnt/main/scripts/user/train_01.pl
 * 34) train_02.pl - /mnt/main/scripts/user/train_02.pl
 * 35) tune_senones.pl - /mnt/main/scripts/train/scripts_pl/tune_senones.pl
 * 36) updateDict.pl - /mnt/main/scripts/user/updateDict.pl


 * Meeting

Met with David and Colby from the Modeling group. We talked about the training process and I learned that they have new steps to revise the training process.

I plan to meet with David or Colby and document the train and experiment.
 * Plan:


 * Concerns:

Week Ending March 4, 2014

 * Logged in 3/1/2014
 * Logged in 3/2/2014 and tried to log on to caesar: could not.
 * Task:


 * 1) Familiarize myself with Master run train.pl
 * 2) Josh is making changes to the script, so update its script page if any big changes have been made.
 * 3) Create a rough draft of the new experiment creation page based on Master run train.pl


 * Results:


 * March 3,2014

Couldn't find Josh's master run train.pl to test. Instead, went through the script and what we has done so far in order to understand how it works.

So far, here is what it can do based on the comments: This script will do the following: 1.Creates a new Experiment directory (specify with -n flag) in the root experiment directory. 2.Runs the setup_SphinxTrain.pl (customize with -s flag) script with the appropriate arguments. 3.with -a flag) Gives the user's default group (Yours is: '$group') read/write permissions to both the newly created base experiment directory, but also everything within that directory. 4.Edit the configs on lines 6-8 & lines 79-80. 5.Optionally with -r) Replace a config file with a supplied one without editing it. 6.Optionally with -b flag, implied with -r & -c) Backup the existing config file by appending a '.old' to the end of the filename. If a  prospective backup filename is already used, the script will append successively higher numbers until it finds a unique one. 7.(Optionally with -c), copies over a pre-existing config file at the specified location OR experiment and adapt it for use for the given experiment.

Time to break it down. Josh is making minor changes to the script, and when it is finally completed I plan to remake the experiment creation page.

1. To create the experiment directory, the master script calls exp_dir_setup.pl, the experiment directory setup script. It accomplishes the following tasks:

1) Creates a new Experiment directory in the root experiment directory.

2) Runs the setup_SphinxTrain.pl script with the appropriate arguments.

3) Gives that base experiment directory group read/write permissions.

This step requires two arguments:
 * 1) REQUIRED: ExperimentId (non-existing one)
 * 2) REQUIRED: Assigning write privileges on folders for user (defaulted to true)


 * 2.

Runs the setup_SphinxTrain.pl (customize with -s flag) script with the appropriate arguments.

Configuring the Sphinx Configuration CFG File

Scripts being used: exp_sphinx_config.pl

Arguments:
 * 1) REQUIRED: ExperimentId
 * 2) OPTIONAL: Density value (MUST BE MULTIPLE OF 2)
 * 3) OPTIONAL: Senone value

Experiment configuration file setup script. This script prepares a given experiment config file (sphinx_train.cfg) for an experiment.

What it does:

1) Edit the configs on lines 6-8 & lines 79-80.

2) (Optionally with -r) Replace a config file with a supplied one.

3) (Optionally with -b flag, implied with -r) Back up the existing config file by appending a ".old" to the end of the filename. If a prospective backup filename is already used, the script will append successively higher numbers until it finds a unique one.
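To make sure I understand the backup rule from the script comments, here is a minimal Python sketch of it (the real script is Perl and I have not copied its code). The function name `next_backup_name` and the exact suffix format (`.old`, `.old1`, `.old2`, ...) are my assumptions; the comments only say that successively higher numbers are appended until the name is unique.

```python
import os


def next_backup_name(config_path, exists=os.path.exists):
    """Return the first unused backup filename for config_path.

    Assumed naming scheme (not confirmed by the script comments):
    'file.old', then 'file.old1', 'file.old2', and so on.
    The `exists` argument is only there so the rule can be tried
    without touching the real filesystem.
    """
    candidate = config_path + ".old"
    counter = 1
    # Keep trying higher numbers until the name is not taken.
    while exists(candidate):
        candidate = f"{config_path}.old{counter}"
        counter += 1
    return candidate
```

Called as `next_backup_name("sphinx_train.cfg")` in an experiment's etc directory, it would return the name a new backup could safely be written to under these assumptions.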


 * 3/4/2014


 * 3.

Part 3 involves generating the transcripts needed for the experiment. This works by calling the genTrans#.pl script.

Part 3. Generate the Transcripts. First move to the base experiment folder we just created. Scripts being used: genTrans5.pl


Arguments:
 * 1) REQUIRED: Transcript Dictionary name (e.g. first5_hr/train, 10hr/train, tiny/train)

Right now we are using the genTrans5.pl version. There is no real commenting in genTrans5.pl, so I do not fully understand how it works. The final result of running that script, however, provides us with the needed transcript.

I plan to start documenting the master train script on its wiki page. I am waiting on Josh's minor tweaks and will start once he finalizes the steps he has already written.
 * Plan
 * Concerns:

Week Ending March 18, 2014

 * Logged in 3/15/14
 * Logged in to review others' activity 3/16/14
 * Logged in 3/17/14
 * Logged in 3/18/14


 * Task:
 * Check Josh's completed master page
 * Log changes to master script
 * Create new rough draft of completed master page


 * Results:

Completed rough version of new experiment page. Linked below:

http://foss.unh.edu/projects/index.php/Speech:Spring_2014_Ramon_Whitman_Log/Train


 * Plan:

Finalize new experiment page.

Read additional logs and update experiment page when Josh completes it.


 * Concerns:

Week Ending March 25, 2014

 * Logged in 3/22/14
 * Logged in 3/23/14
 * Logged in 3/24/14 worked on train page

Try to run a variety of train and decodes and see the results I get.
 * Task:

Ran the first_5hr.train at 32 density and 4000 senones and was unable to run a decode. When I executed the instructions, the decode reported finishing instantly. Could not get the decode to actually run and therefore was unable to get the results I desired.
 * Results:

"Not enough reference files loaded, Missing:"

This error is caused by duplicate identical transcript entries in the hypothesis transcript, the reference transcript, or both. Usually it is the hypothesis transcript that causes the error, so we will focus on that.

Go to your experiment's etc directory if you aren't already there. Remove all redundant lines. We use a built-in Unix tool called uniq to do this for us. The output of this tool needs to go to a new file.

% uniq hyp.trans >> hyp.trans.uniq

Restart SCLite while using the newly created hyp.trans.uniq file.

sclite -r <experiment #>_train.trans -h hyp.trans.uniq -i swb >> scoring.log

If you get the same error again: Repeat the above process, but for the <experiment #>_train.trans file. Be sure to specify the new <experiment #>_train.trans.uniq file where appropriate in the sclite statement.
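One thing worth noting about the steps above: uniq only drops lines that are identical and next to each other, so duplicates that are far apart in the file survive, and so do two different lines that share the same utterance id. As an alternative, here is a minimal Python sketch that keeps only the first line for each utterance id. It assumes every transcript line ends with the id in parentheses, like `(sw2245a-ms98-a-0166)`, as in the sclite messages in this log; the function name `dedupe_by_id` is mine.

```python
import re

# Assumed line layout: "<words> (<utterance-id>)" with the id at the end.
UTT_ID = re.compile(r"\(([^()\s]+)\)\s*$")


def dedupe_by_id(lines):
    """Keep the first line seen for each utterance id, preserving order.

    Lines without a trailing "(id)" are compared by their full text.
    """
    seen = set()
    kept = []
    for line in lines:
        match = UTT_ID.search(line)
        key = match.group(1) if match else line
        if key in seen:
            continue  # a later duplicate of an id already kept
        seen.add(key)
        kept.append(line)
    return kept
```

To use it on a real file, read hyp.trans into a list of lines, pass it through `dedupe_by_id`, and write the result to a new file (for example hyp.trans.uniq) rather than appending to an existing one.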


 * The above didn't work for me, assuming I followed the instructions correctly.

I had also run the 0221 mini train at 16 density and 1000 senones. Tried: sclite -r <exp#>_train.trans -h hyp.trans -i swb >> scoring.log, and it reported missing files.

caesar 0221/etc> sclite -r 0221_train.trans.uniq -h hyp.trans -i swb >> scoring.log

Error: double reference text for id '(sw2245a-ms98-a-0166)'

Error: Not enough Reference files loaded

Missing: (sw2005a-ms98-a-0052) (sw2020b-ms98-a-0018) (sw2022a-ms98-a-0005) (sw2028a-ms98-a-0049) (sw2234a-ms98-a-0007) (sw2245a-ms98-a-0166)
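To see which ids are actually causing messages like these before running sclite again, the two transcripts can be compared directly. This is a rough Python sketch, not part of sclite. It assumes each line ends with the utterance id in parentheses, and it reflects my reading of the messages: "double reference text" means an id appears more than once in the reference, and "Missing" lists hypothesis ids with no usable reference entry. The function names are hypothetical.

```python
import re
from collections import Counter

# Assumed line layout: "<words> (<utterance-id>)" with the id at the end.
UTT_ID = re.compile(r"\(([^()\s]+)\)\s*$")


def utterance_ids(lines):
    """Return the utterance ids found at the end of each line, in order."""
    ids = []
    for line in lines:
        match = UTT_ID.search(line)
        if match:
            ids.append(match.group(1))
    return ids


def compare_transcripts(ref_lines, hyp_lines):
    """Report duplicate and unmatched ids between reference and hypothesis."""
    ref_counts = Counter(utterance_ids(ref_lines))
    ref_ids = set(ref_counts)
    hyp_ids = set(utterance_ids(hyp_lines))
    return {
        "duplicate_in_ref": sorted(i for i, n in ref_counts.items() if n > 1),
        "missing_from_ref": sorted(hyp_ids - ref_ids),
        "missing_from_hyp": sorted(ref_ids - hyp_ids),
    }
```

Reading 0221_train.trans and hyp.trans into lists of lines and passing them in would give three short lists to check by hand.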

Try to run a successful decode and document the results. Finish the how-to page for setting up a train.
 * Plan:

Tried 10hr Train and got:

ERROR: "ngram_model_arpa.c", line 465: File tmp.arpa not found

ERROR: "ngram_model_dmp.c", line 105: Dump file tmp.arpa not found

Retried and got:

ERROR: "ngram_model_arpa.c", line 76: No \data\ mark in LM file
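Both kinds of error point at the language model file itself: first it was not found, then it was found but had no \data\ line, which an ARPA-format language model needs before its n-gram counts. A quick check like the following Python sketch could be run before starting a decode. The function name `check_arpa_file` and its return strings are my own, and the check only looks for the \data\ line; it does not validate the rest of the file.

```python
import os


def check_arpa_file(path):
    """Classify a language model file as missing, empty, lacking the
    \\data\\ header line, or apparently OK.

    This only mirrors the two errors seen in this log; it is not a
    full ARPA format validator.
    """
    if not os.path.isfile(path):
        return "missing"
    if os.path.getsize(path) == 0:
        return "empty"
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            # The ARPA header line is exactly: \data\
            if line.strip() == "\\data\\":
                return "ok"
    return "no_data_marker"
```

Running `check_arpa_file("tmp.arpa")` in the directory where the language model was built would show whether the file needs to be regenerated.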
 * Concerns:

Week Ending April 1, 2014

 * Logged in 3/27/14 created missing experiment pages
 * Logged in 3/29/14
 * Logged in 3/30/14
 * Logged in 3/31/14


 * Task:

Created child experiment of first_5hr train in 0233.

Added a step to train page since it did not explain how to start the train once everything was all set up.

Update the Experiment page with the experiments that I've run

Update the Experiment Setup page http://foss.unh.edu/projects/index.php/Speech:Exp

Review logs


 * Results:


 * child s201

I ran this child and was unable to run a train:

MODULE: 90 deleted interpolation

Skipped for continuous models

MODULE: 99 Convert to Sphinx2 format models

Can not create models used by Sphinx-II. If you intend to create models to use with Sphinx-II models, please rerun with:

$ST::CFG_HMM_TYPE = '.semi.' or

$ST::CFG_HMM_TYPE = '.cont' and

$ST::CFG_FEATURE = '1s_12c_12d_3p_12dd' and

$ST::CFG_STATESPERHMM = '5'


 * child s203

Created a 16 density, 5000 senone train from the last_5hr train.

Ran the train and created the acoustic model.

Went through the Language Model steps and was unable to create a language model. It froze and lost connection whenever I tried to prepare the transcript with ./lm_create.pl trans_parsed


 * s204

Created a train from the last_5hr train with density 16 and the default senone value of 1000.

Ran up to the decode step. When the script was executed, the decode returned an instantaneous result.


 * Plan:

To create multiple child experiments and have a successful decode.

Help group in creation of experiment page.


 * Concerns:

Week Ending April 8, 2014

 * Logged in 4/5/14
 * Logged in 4/6/14
 * Logged in 4/7/14
 * Logged in 4/8/14

Organized a meeting for Sunday at the school library. Plan to get as much of the group as possible in order to start going over the experiment.
 * Task:

Looked over what an experiment is with Forrest as well as all the parameters we can modify when running a train:

(listed on our google groups)

Looked over with Forrest all the parameters we can modify when running a decode:

(listed on our google groups)

Read logs of all members and kept constant communication with members through the google groups and some via phone or email.

Struggling with decode on rw_002

Did a first_5hr/train on 32 Density and 10500 senone.
 * Inputs


 * Results

Baum-Welch iteration 1 Average log-likelihood

Training failed in iteration 2

Something failed: (/mnt/main/Exp/0252/rw_003/scripts_pl/30.cd_hmm_untied/slave_convg.pl)

[1]   Exit 255                      /mnt/main/scripts/train/scripts_pl/RunAll.pl

line 45: use SphinxTrain::Util;

Found a list of parameters that can be modified for a train or decode. Tried three different experiments, each reaching a different step in the experiment process. Researched how to fix the problems but was unable to figure out what was wrong with the experiments.
 * Results:


 * Plan:

Experiment with different trains and different density and senone values. Look up possible parameters that can be modified in order to decrease error rate.

Concerned that the new decode page is hard to follow on how to properly run a decode. When I reach the decode stage I get an instant result. Would like to figure out what the error is and fix it.
 * Concerns:

Week Ending April 15, 2014

 * Logged in 4/11/14
 * Logged in 4/12/14 read logs
 * Logged in 4/13/14
 * Logged in 4/14/14 read logs and researched

Fully comprehend senones and their role in training
 * Task:

The more senones a model has, the more precisely it discriminates the sounds. On the other hand, if you have too many senones, the model will not be generic enough to recognize unseen speech. That means that the WER will be higher on unseen data. That's why it is important not to overtrain the models. If there are too many unseen senones, warnings will be generated in the norm log at stage 50.
 * Results:

Based on this we would want more senones for a longer experiment. Since we are doing a 100 train, we want a high senone value, but not too high, otherwise we may overtrain.

A senone is modeled by a set of streams and their corresponding stream weights. I.e. the HMM emission probability for a senone and a given frame is the weighted sum of outputs from any number of streams. (If you consider the output of a stream to be a log probability, then a weighted sum of logprobs with multiplicative weights becomes a weighted product of probabilities with exponential weights.) The internal representation of a senone, thus, consists of an array of  stream identifiers and an equally sized array of stream weights, and an equally sized array of class indices. When a score is computed, the class index of a stream is given to the stream's score computer. The return value, then, is what we called the stream's output. When we are using Gaussian mixtures, then a class index would be the index of a distribution. If we were using a neural net, then this index would probably be the index of an output node of the net, or some subnet identifier.
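To check my reading of the parenthetical in the citation above, here is a small Python sketch of the arithmetic: a weighted sum of stream log probabilities gives the same score as a product of the stream probabilities raised to their weights. The function names and the example numbers are made up for illustration and are not JANUS code.

```python
import math


def senone_log_score(stream_log_probs, stream_weights):
    """Weighted sum of per-stream log probabilities for one frame."""
    if len(stream_log_probs) != len(stream_weights):
        raise ValueError("need one weight per stream")
    return sum(w * lp for w, lp in zip(stream_weights, stream_log_probs))


def senone_prob(stream_probs, stream_weights):
    """Weighted product: each stream probability raised to its weight."""
    if len(stream_probs) != len(stream_weights):
        raise ValueError("need one weight per stream")
    return math.prod(p ** w for p, w in zip(stream_probs, stream_weights))


# Hypothetical two-stream senone, scored both ways.
probs = [0.5, 0.25]
weights = [1.0, 0.5]
log_score = senone_log_score([math.log(p) for p in probs], weights)
direct = senone_prob(probs, weights)
```

Here `math.exp(log_score)` and `direct` agree up to floating-point rounding, which is the equivalence the citation describes.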

Senones are used in a train in order to build an acoustic model. Based on the citation above, these senones are modeled on the audio streams used for training, and the weighted stream outputs give the HMM emission probabilities used by the acoustic model. We went over HMMs in class.

In order to better understand senones it is important to understand the speech recognition process.

Recognition process
 * We take a waveform, which represents the audio we are trying to train on, split it into utterances at silences, and then try to recognize what's being said in each utterance.
 * To do that we want to take all possible combinations of words and try to match them with the audio. We choose the best matching combination.

Features:


 * Features are numbers calculated from speech, usually by dividing the speech into frames. For each frame, typically 10 milliseconds long, we extract 39 numbers that represent the speech. This is called a feature vector.
 * Since the number of parameters is large, we try to optimize it.
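To get a feel for how much data the feature description above implies, here is a small Python sketch of the arithmetic. It treats frames as back-to-back 10 millisecond steps, which is a simplification of mine (real front ends typically use overlapping windows), and the function name is hypothetical.

```python
FRAME_MS = 10            # typical frame length mentioned above
FEATURES_PER_FRAME = 39  # numbers extracted per frame


def feature_matrix_shape(duration_ms, frame_ms=FRAME_MS,
                         n_features=FEATURES_PER_FRAME):
    """Return (number of frames, numbers per frame) for one utterance.

    Simplifying assumption: frames do not overlap, so the frame count
    is just the duration divided by the frame length.
    """
    if duration_ms < 0 or frame_ms <= 0:
        raise ValueError("duration must be >= 0 and frame length > 0")
    n_frames = duration_ms // frame_ms
    return n_frames, n_features
```

Under these assumptions a 3 second utterance gives `feature_matrix_shape(3000)`, that is 300 feature vectors of 39 numbers each.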


 * A model describes a mathematical object that gathers common attributes of the spoken word.
 * In practice, the audio model of a senone is a Gaussian mixture of its three states; put simply, it is the most probable feature vector.


 * The model of speech is called a Hidden Markov Model, or HMM.
 * In this model, a process is described as a sequence of states that change from one to another with a certain probability.
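The idea of states changing with a certain probability can be shown with a toy example. This Python sketch computes the probability of one particular state path from start and transition probabilities. The three states and all the numbers are hypothetical and only illustrate the Markov chain part of an HMM; they do not come from any Sphinx model.

```python
def sequence_probability(states, start_prob, trans_prob):
    """Probability of following an exact state path in a Markov chain.

    start_prob maps state -> probability of starting there.
    trans_prob maps state -> {next_state: probability}.
    Transitions that are not listed count as probability 0.
    """
    if not states:
        raise ValueError("need at least one state")
    prob = start_prob.get(states[0], 0.0)
    for prev, cur in zip(states, states[1:]):
        prob *= trans_prob.get(prev, {}).get(cur, 0.0)
    return prob


# Hypothetical left-to-right model with three states, like the three
# states of a phone: stay in a state or move on to the next one.
start = {"s1": 1.0}
trans = {
    "s1": {"s1": 0.5, "s2": 0.5},
    "s2": {"s2": 0.5, "s3": 0.5},
    "s3": {"s3": 1.0},
}
path_prob = sequence_probability(["s1", "s1", "s2", "s3"], start, trans)
```

With these made-up numbers the path s1, s1, s2, s3 multiplies three transitions of 0.5 each.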

An acoustic model contains acoustic properties for each senone. There are context-independent models that contain properties (most probable feature vectors for each phone) and context-dependent ones (built from senones with context).

So far I am only finding senone information relating to JANUS, another speech recognition package, and Sphinx4. There is only a basic definition of what senones are and how they relate to speech.
 * 4/13/14
 * Plan:


 * Concerns:

Week Ending April 22, 2014

 * Logged in 4/19/14
 * Logged in 4/20/14
 * Logged in 4/21/14
 * Logged in 4/22/14


 * Task:


 * 1) Create an experiment
 * 2) do further research for the group.

Ran a mini train, which errored out, so I redid it as a first_5hr train.
 * Results:

Figure out why decodes won't work. Was unable to run a successful decode.
 * Plan:
 * Concerns:

Week Ending April 29, 2014

 * Logged in 4/26/14
 * Logged in 4/27/14

Define the new scripts David wrote.
 * Task:

Document alternate steps to creating an experiment.

Created a new all-in-one experiment page. It explains how to run an experiment from start to finish.
 * Results:

Created new scripts page for prepareexperiment2.pl


 * Plan:


 * Concerns:

Week Ending May 6, 2014
Start working on the end-of-semester report.
 * Logged in 5/3/14
 * Logged in 5/4/14
 * Logged in 5/5/14
 * Logged in 5/6/14
 * Task:

Started defining and listing all scripts created this semester.
 * Results:

Did an overview of results of the experiment group.


 * Plan:


 * Concerns: