Speech:Spring 2016 Peter Ferro Log



Week Ending February 9, 2016
2/3/16: Logged in for the first time (as root, since the corresponding user accounts have not been created yet). Went to /mnt/main/Exp, browsed the experiments, and Matt gave me some info on the purpose of some of the directories within.
2/4/16: I attempted three times to connect remotely to caesar.unh.edu from my Mac with the terminal command ssh pjf2003@caesar.unh.edu. Each attempt timed out. I then attempted three times from my Windows laptop through PuTTY, and those attempts timed out as well. Because I was unable to connect to Caesar, I went back to the wiki documentation to see what was out of date or needed fixing up, continuing my earlier investigation.
2/5/16: I attempted to log on remotely again, this time to cisunix.unh.edu. Both PuTTY and Terminal complained about the host keys having changed, and my credentials did not work for some unusual reason... maybe I need to find out whether my username is currently valid, or whether I need to report a lost password.
2/6/16: I had no choice but to physically go to the college to do any server-side work, as I didn't know why my Wildcats username and password were not working for cisunix.unh.edu. Once there, I was finally able to log on with my Wildcats username for the first time. While there, I also resolved my technical difficulty with CISUnix by using a feature I had forgotten about because I had last used it nine months ago: BlackBoard's Link-Up feature under Login Help.
2/8/16: I logged in again and started looking up the scripts and the experiment-related steps.

2/6/16: Try to figure out how to get my CISUnix credentials restored with my Wildcats password. I must have remote access to Caesar, especially because of how long it takes me to physically drive here.
 * Task:

2/3/16: Discovered the following problems with the wiki:
 * Speech:Install has a missing enlarged image file near the Fileserver Installation (caesar) section.
 * Speech:Models_Data_Prep has an incorrect directory for the .sph files. Instead of /media/data/Switchboard/disk1/swb1, it should be /mnt/main/corpus/switchboard/dist/disk1/swb1.
 * Speech:Run_Train_Setup_Script has a dead wiki link in Step 4: Modify Train Configuration. For whatever reason, Sphinx Train Configuration Guide does not exist.
2/4/16: More wiki investigations. Here is what I found...
 * In Speech:Active_Directory, Active Directory Setup on Fedora 18 is an empty section (perhaps this should be updated to indicate instructions for Redhat Enterprise?).
 * Speech:Network only contains OpenSUSE-related material (does this need to be updated to refer to Redhat?).
 * Speech:ExternalResources is mostly empty.
2/6/16: Made more wiki discoveries. Some of these were discovered after I logged on to Caesar.
 * Speech:Alter_Configs has an empty Decoding section.
 * In Speech:ExpDir, trees and wavTemp have empty descriptions. This page has given me a clue as to what the trees directory is for: it contains sub-directories holding .dtree files, which contain information on which sound waves make up individual phonetic syllables. I actually recognize the style of the filenames in the trees directory as being like how I would type in speech commands for the Software Automatic Mouth (or SAM for short) on the Commodore 64: on a per-syllable basis, although maybe not with the exact lettering indicated.
 * Speech:Models_Data_Prep does not accurately represent the location of any of the internal files for the tools (I already corrected one of them). Some of the paths even use ~, which can't work here because it refers to the user's home directory, which in my case is completely empty. So far, I have not yet determined where these files are located.
2/8/16:
 * Speech:Scripts understandably doesn't contain everything for scripts, and some of the scripts on Caesar are not listed. Perhaps I'll note down some of the scripts mentioned elsewhere that could use documentation... maybe. It depends on whether or not it's worth noting down. I did find the original source code to one of the scripts stored right in the wiki.
 * Speech:Master_run_train.pl has given me exactly what I'm looking for in regards to how to deal with speechtool-related directories for Speech:Models_Data_Prep. Instead of /speechtools/SphinxTrain-1.0/, we get /mnt/main/root/tools/SphinxTrain-1.0/ for all of the directories except for the one I listed earlier.
 * Speech:Run_Train_Setup_Script refers to a prepareExperimentMJ.pl that does not exist in the /mnt/main/scripts/user directory. Perhaps they mean prepareTrainExperiment or something else, or is it located somewhere else?
 * Results:


 * Plan:

2/5/16-2/6/16: Being unable to connect remotely (my credentials are not working properly for cisunix.unh.edu, meaning I am currently unable to perform a double login) is a big problem for me, as it means I would have to go to the college to work with the server.
 * Concerns:

Week Ending February 16, 2016
2/13/16: I received an e-mail from Matthew Heyner in regards to ideas for the proposal. I realized that script documentation was part of the set, and I looked up Speech:Scripts to determine what was going on there...
2/14/16: I decided to see if I could make a local copy of the scripts on Caesar from my own house. To do that, I had to look up what exactly I needed to do to perform such a task from a command line terminal. I was ultimately able to adapt one method I found online via this lovely little page, which seems to be Linux-based. I did have to adapt it to my computer's option set, as I found out the -P option did not do what I expected it to do, so I looked up the manual and adjusted accordingly. As soon as I got that issue resolved and verified that it was working correctly, I was able to transfer data from Caesar through CISUnix to my own computer at home. This allows me to look the scripts up using my own non-command line utensils and see what's going on without accidentally harming the script files on Caesar.
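The two-hop transfer described above (Caesar reached through CISUnix) can be sketched as follows. This is only one way to express such a copy and is not necessarily the method from the page I adapted: it assumes OpenSSH's ProxyCommand option with ssh -W, and the local folder name is made up. The sketch is in Python purely for illustration and only builds the command so that it can be inspected before anything is run.

```python
import shlex


def build_scp_command(user, gateway, target, remote_path, local_path):
    """Build an scp command that reaches `target` by tunnelling through `gateway`."""
    # ProxyCommand with `ssh -W %h:%p` asks the gateway to forward the
    # connection to the final host; ssh expands %h and %p itself.
    proxy = f"ProxyCommand=ssh -W %h:%p {user}@{gateway}"
    return [
        "scp",
        "-r",  # copy the directory recursively
        "-o", proxy,
        f"{user}@{target}:{remote_path}",
        local_path,
    ]


cmd = build_scp_command(
    user="pjf2003",
    gateway="cisunix.unh.edu",
    target="caesar.unh.edu",
    remote_path="/mnt/main/scripts/user",
    local_path="./caesar_scripts",  # hypothetical local folder
)
# Print the command for review instead of executing it.
print(shlex.join(cmd))
```

Printing the command first makes it easy to check the option spelling against the local manual page, since option letters differ between tools and platforms.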

2/11/16: Run a test experiment by following the directions. 2/14/16: Start organizing the scripts.
 * Task:

2/10/16: I talked to Ben Leith from the Modeling group to deal with the wiki issues that I discovered. This is what's going to happen...
 * Speech:Install will be ignored.
 * Speech:Active_Directory will also be ignored. This page is no longer in use.
 * Speech:Network will be taken care of by the Systems group.
 * Speech:Models_Data_Prep is going to be corrected.
 * Speech:Run_Train_Setup_Script will be dealt with by the Modeling group. I have done one correction on the article.
 * Speech:Alter_Configs will be dealt with by the Modeling group.
 * Speech:ExpDir has an empty bin directory in addition to the other two directories. I plan on attaching a brief description to the trees directory.
 * Speech:Scripts... we'll refer to that on a case-by-case basis. It depends on what needs to be documented, depending on what's out of date.
I completed Step 1 of Speech:Run_Train_Setup_Script while in class.
2/11/16: Did a successful run of Speech:Exps_0282_005. I have recorded the rest of my data there. During my session, I discovered that Speech:Run_Decode_Trained_Data had a section that was accidentally duplicated. I corrected this immediately after ensuring that the sections matched.
2/13/16: I have discovered that Speech:Scripts is missing two scripts referenced in the tutorials: prepareTrainExperiment.pl, which appears to exist as prepareExperiment.pl in the Scripts section, and run_decode.pl. Both of these also take parameters.
2/14/16: I have discovered that some of the scripts in the user directory are either modified or duplicate versions of scripts stored in the train/scripts_pl directory. Here are the results...
2/16/16: I corrected a potential typo on Speech:Run_Decode_Unseen_Data as a direct result of being contacted in regards to technical difficulties. This is because run_decode_v2.pl does not exist, even after it's copied.
 * Results:
 * setup_SphinxTrain.pl has two versions in the user directory...
 * setup_SphinxTrain.pl in the user directory is a modified version of the one stored in train/scripts_pl.
 * fs_setup_SphinxTrain.pl is a duplicate of the original file stored in train/scripts_pl.
 * createdict.pl has a duplicate in the user directory.
 * pruneDictionary.pl in the user directory appears to be a modified copy of one of the other pruneDictionary.pl scripts (most likely the first one).
 * dictionary2.pl in the user directory is a modified version of dictionary2.pl in train/scripts_pl. dictionary3.pl is a duplicate of dictionary2.pl in the user directory.

2/11/16:
1. Prepare a train by following the steps in Speech:Run_Train_Setup_Script.
2. Create a language model by following the steps in Speech:Create_LM.
3. Run the decoder by following the steps in Speech:Run_Decode.
2/11/16: Would running nohup perl run_decode.pl 005 0282/005 1000 with an ampersand help? I'm tempted to run a decode session with an ampersand in place so that if I have to, I can close out of my session. It doesn't take as long as training, but it may still be of use when huge decoding sessions are undertaken.
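As a rough illustration of what the nohup-plus-ampersand idea buys, here is a minimal Python sketch that starts a job in its own session with its output sent to a log file, so the job is not tied to the terminal that launched it. The helper name and log file name are made up for this example, and a harmless stand-in command is used instead of the real decode.

```python
import os
import subprocess
import sys
import tempfile


def launch_detached(command, log_path):
    """Start `command` in its own session, with all output appended to a log."""
    log = open(log_path, "ab")
    # start_new_session=True runs setsid() in the child, which detaches it
    # from the controlling terminal (similar in effect to `nohup ... &`).
    proc = subprocess.Popen(
        command,
        stdin=subprocess.DEVNULL,
        stdout=log,
        stderr=subprocess.STDOUT,
        start_new_session=True,
    )
    log.close()  # the child keeps its own copy of the file descriptor
    return proc


# Stand-in job so the sketch runs anywhere. On Caesar the command would be
# something like ["perl", "run_decode.pl", "005", "0282/005", "1000"].
log_path = os.path.join(tempfile.mkdtemp(), "decode_nohup.log")
job = launch_detached([sys.executable, "-c", "print('decode finished')"], log_path)
job.wait()  # only for the demo; a real session could simply log out here
```

The important part is that the output goes to a file rather than the terminal, so nothing is lost when the session closes.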
 * Plan:
 * Concerns:

Week Ending February 23, 2016
2/17/16: Two new scripts are planned: mkDec.pl (originally named prepare_decode.pl) and add_experiment.pl. My tasks are... These are long term tasks, to be concluded within a month or so. The short term tasks are...
 * Task:
 * Co-develop the mkDec.pl script (originally named prepare_decode.pl) with Matthew.
 * Provide development support for add_experiment.pl, which is being developed by Kevin Soucey.
 * Gain knowledge on the decoding process and log some details on what gets executed for future reference. This involves looking up the scripts that get executed in Speech:Run_Decode, analyzing them and determining what's going on in the background.

2/18/16: I discovered that there was only one .pl script that is ever run in the decoding stages, and that is run_decode.pl. run_decode.pl executes exactly one program: /usr/local/bin/sphinx3_decode, and it uses Perl to make argument construction easier and reduce redundancy.
 * Results:

Today I decided to acquire a copy of Sphinx from SourceForge so I could look up the potential documentation locally. I discovered the flags in doc/s3_description.html (in the sphinx3-0.7 folder, extracted from a .tar.gz file acquired from SourceForge, whose link can be found in Speech:Function) under Configuration Arguments Overview.

The following parameters are called by run_decode to execute the decoder...
 * -hmm: Refers to a collection of acoustic model files, which contain model definitions (mdef), Gaussian means (mean) and variances (var), mixture weights (mixw), state transition matrices (tmat) and sometimes a sub-vector quantized model (subvq).
 * -lm: Refers to the language model that was created in Speech:Create_LM.
 * -dict: Refers to a dictionary containing words in the intended language.
 * -fdict: Refers to a dictionary containing non-standard "words", which could include noise, silence and other utterances.
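 * -ctl: Refers to the control file, which lists the utterances to be decoded. (The per-utterance language model and regression matrix that the documentation mentions nearby belong to separate control-file flags, not to -ctl itself.)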
 * -cepdir: Contains a directory prefix for cepstrum files that are specified in a control file. The control file contains lines of code referring to individual utterances.
 * -cepext: Didn't find this in the flags section. However, implicitly, I'm saying that the cepstrum files contain the extension specified, since the control file does not use extensions.
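The way run_decode.pl assembles these flags for /usr/local/bin/sphinx3_decode can be pictured with the following sketch. It is written in Python rather than Perl, every path in it is a made-up placeholder, and it only covers the seven flags listed above, so it is an illustration of the argument construction and not a copy of the real script.

```python
def build_decode_args(hmm, lm, dictionary, filler_dict, ctl, cepdir, cepext):
    """Return the decoder command line as a list, one flag/value pair at a time."""
    return [
        "/usr/local/bin/sphinx3_decode",
        "-hmm", hmm,            # acoustic model directory
        "-lm", lm,              # language model from Speech:Create_LM
        "-dict", dictionary,    # main pronunciation dictionary
        "-fdict", filler_dict,  # filler dictionary (noise, silence, ...)
        "-ctl", ctl,            # control file listing the utterances
        "-cepdir", cepdir,      # directory prefix for the cepstrum files
        "-cepext", cepext,      # extension added to each control-file entry
    ]


# All paths below are hypothetical placeholders, not real locations on Caesar.
args = build_decode_args(
    hmm="model_parameters/example.cd_cont_1000",
    lm="LM/example.lm.DMP",
    dictionary="etc/example.dic",
    filler_dict="etc/example.filler",
    ctl="etc/example.fileids",
    cepdir="feat",
    cepext=".mfc",
)
print(" ".join(args))
```

Keeping the arguments in a list, rather than one long string, is what makes this kind of construction less error-prone than typing the full command by hand.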

The output is sent to ./decode.log (and it explains nicely why I get nothing). I discovered that prepareDecode.pl already exists as a script in the user directory. We're going to have to get this resolved in regards to filenames...
2/21/16: We had a group meeting today with all four members at 7 PM. The meeting was supposed to be on Skype, but it ended up getting transferred to Google Hangouts due to technical difficulties (one member couldn't connect at all, and I couldn't see the chat log), which caused a 30 minute delay before project discussions began. I made some pseudo-code for mkDec.pl, which is the name of the Perl file I will be creating that will execute all of step 1 of the decoding process. It takes three parameters, the same as run_decode.pl already does, and it executes part one of Speech:Run_Decode_Trained_Data (there is actually very little difference between trained and untrained based off of what the wiki has). I do not have plans to have this script do scoring at the moment. My fellow group members from the Experiment group have a copy of my pseudo-code, which includes theoretical Perl code as well (obviously untested). I'm going to consider testing the system calls by having them print out what will be executed first so that I can verify that the commands look good to execute. If they are not good, then all I have to do is check out what went wrong. If they are good, then I can turn the print statement into a system call and have the command run. I'll make a note to create the corresponding Speech:Scripts page as well.
2/23/16: Today I decided to do some record keeping on my end (collecting last modified dates, privileges, some potential user and group info and the filename) and determine which scripts in the user directory do not have execute privileges. The following scripts have this potential problem...
master_run_train_old.pl, last modified 2/26/14, is missing execute privileges in the center (left and right sides have execute on). This is the only script with inconsistent execute privileges: the rest of them are all consistent. A newer version of this script, from 2/25/15, doesn't have this potential problem. Refer to Speech:Spring_2016_Meagan_Wolf_Log for the full list. None of the scripts in question seem to be actively referenced in the tutorials (unless they're referred to by other Perl scripts). In other news, I am currently aware that the pseudo-code that I made for the script is being distributed. The pseudo-code in question has some actual theoretical Perl in it, while other pieces of code are not authentic and use concepts acquired from looking up other scripts. If possible, I plan on creating a testing environment in my home directory so that I can safely test my code as print statements.
 * buildme.sh (last modified 2/28/14)
 * checkTrain.pl (last modified 3/26/14)
 * cleanTrans.sh (last modified 3/3/14)
 * exp_dir_setup.pl (last modified 7/3/15)
 * gen_errors.pl (last modified 2/13/14)
 * monoGen2.pl (last modified 3/27/14)
 * pullFromTrans.pl (last modified 2/26/14)
 * updateCFG.sh (last modified 4/10/14)
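The execute-permission survey above was done by reading directory listings. As a sketch of how the same check could be automated, the Python snippet below reads the three execute bits (user, group, other) of a file and flags the case where they disagree, which is the "missing in the center" situation noted for master_run_train_old.pl. The helper names and the demo file are made up for this example.

```python
import os
import stat
import tempfile


def execute_bits(path):
    """Return which of the user/group/other execute bits are set on `path`."""
    mode = os.stat(path).st_mode
    return {
        "user": bool(mode & stat.S_IXUSR),
        "group": bool(mode & stat.S_IXGRP),
        "other": bool(mode & stat.S_IXOTH),
    }


def has_inconsistent_execute(path):
    """True when some, but not all, of the three execute bits are set."""
    return len(set(execute_bits(path).values())) > 1


# Demo file with mode 705: execute on the left and right, missing in the center.
script = os.path.join(tempfile.mkdtemp(), "example_script.pl")
with open(script, "w") as handle:
    handle.write("#!/usr/bin/perl\n")
os.chmod(script, 0o705)

bits = execute_bits(script)
print(bits, has_inconsistent_execute(script))
```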


 * Plan:


 * Concerns:

Week Ending March 1, 2016
2/24/16: I used vim because I was realizing that I had to make quick patches to the code. I might switch my command line text editor to nano, as I discovered that the editor exists on Caesar, and I am more familiar with using that editor, having a better memory of it. 2/28/16: Google Hangout meeting occurred tonight at 6:30 PM after being delayed by a day from Saturday at 4 PM.

2/24/16: Produce mkDec.pl script. The script itself is a series of system commands. 2/28/16: Create a new sub-experiment, run train and create language model, and test mkDec to ensure that the script will not progress past a certain point if the proper steps have not been taken yet.
 * Task:

2/24/16: I was able to create an almost fully functional prototype consisting of say statements to substitute for system calls. I do have to do some debugging, as there are a few numbers missing from the head command (that's why I do say statements first).
2/25/16: Today I updated my Perl code to correct a bug in one of the say statements (caused by me accidentally using the wrong variable name and therefore getting nothing) and added enhanced argument verification. The verification I added checks whether all the arguments exist and whether they are the correct length (experiment and sub-experiment IDs are four and three digits, respectively, padded with zeroes).
2/29/16: I updated my Perl code to fix a bug with the first cd command with the sub-experiment ID. My local test environment (not caesar) didn't have autodie. Since I knew what the exit values were from testing a successful cd command and a failed cd command, I adjusted accordingly after looking up the official documentation, and instead went with the "execute or die" method, meaning the code must execute and successfully give off a certain value, or else the program dies. I uploaded the code and tried running it from my home directory with 0282 as an experiment ID (the code stopped before any modifications were made). The code failed on the first cd command with a directory not found error despite entering a valid experiment ID. The command worked outside of the script. I eventually looked up http://stackoverflow.com/questions/7009055/how-do-i-cd-into-a-directory-using-perl and realized that I should have used the chdir function. Because of this realization, I immediately dealt with this problem and was able to successfully verify that the initial chdir calls worked.
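Two points from these entries can be shown in a few lines: the ID length check, and why a cd issued through a system call has no lasting effect while chdir does. The sketch below is Python rather than the Perl used in mkDec.pl, and the function name is made up. The underlying behaviour is the same in both languages: a shell command runs in a child process, and a child cannot change its parent's working directory.

```python
import os
import re
import shlex
import subprocess
import tempfile


def valid_ids(exp_id, sub_id):
    """Experiment IDs are four digits and sub-experiment IDs three, zero padded."""
    return bool(re.fullmatch(r"[0-9]{4}", exp_id)) and bool(re.fullmatch(r"[0-9]{3}", sub_id))


start = os.getcwd()
target = tempfile.mkdtemp()  # stands in for an experiment directory

# A `cd` run through a shell only moves that short-lived child process.
subprocess.run(f"cd {shlex.quote(target)}", shell=True, check=True)
after_shell_cd = os.getcwd()  # still the starting directory

# chdir changes the working directory of the running script itself.
os.chdir(target)
after_chdir = os.getcwd()
os.chdir(start)  # put things back for the rest of the demo

print(valid_ids("0282", "009"), after_shell_cd == start)
```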
 * Results:

2/24/16: Debug the script and prepare it for actual execution. 2/28/16:
 * Plan:
 * Gradually turn all the say calls into system calls (except for the last one, which will be an exec call).
 * I will test each step one by one and ensure that the error catcher is working. The idea is that the script will immediately stop if an error occurs, and prevent potential catastrophe.
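The print-first, run-later approach and the error catcher can be sketched together. This is a Python stand-in for the Perl say/system pattern, with a made-up helper name and harmless placeholder commands: in dry-run mode each command is only printed, and in real mode the first non-zero exit status stops everything that follows.

```python
import subprocess
import sys


def run_steps(steps, dry_run=True):
    """Print each command (dry run) or execute it, stopping at the first failure."""
    completed = []
    for step in steps:
        if dry_run:
            print("would run:", " ".join(step))
        else:
            # check=True raises CalledProcessError on a non-zero exit status,
            # so no later step runs after a failure.
            subprocess.run(step, check=True)
        completed.append(step)
    return completed


# Placeholder commands: the second one fails on purpose with exit status 3.
steps = [
    [sys.executable, "-c", "print('step 1 ok')"],
    [sys.executable, "-c", "import sys; sys.exit(3)"],
    [sys.executable, "-c", "print('step 3 should never run')"],
]

planned = run_steps(steps, dry_run=True)
try:
    run_steps(steps, dry_run=False)
    failed_code = None
except subprocess.CalledProcessError as err:
    failed_code = err.returncode
print("stopped with exit status", failed_code)
```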
 * Concerns:

Week Ending March 8, 2016
3/2/16: I created a sub-experiment, Speech:Exps_0282_009, dedicated to testing my mkDec code. 3/8/16: I took a look at the script analysis from Meagan Wolf. Some of the scripts mentioned had apparently vanished in the meantime, based off of a time gap when she was recording the directory info and when she returned to look up the scripts.


 * Task:

3/5/16: I created a train for sub-experiment 009 so that I could test my mkDec.pl code. The first stop-gap has been confirmed as working. I have discovered that the head command works after a train, which means that if I ran mkDec.pl all the way, then it would have failed at the run decode step, which means the second item on my plan needs to be revised so that run_decode.pl doesn't have to deal with a potentially bad language model. The second condition was successfully tested once I made the modifications, but only for when the LM folder does not exist. Once I tested that, I fully opened up the code to run at its fullest potential, and the code was successfully executed. 3/6/16: See Speech:Exps_0282_009 for me analyzing the results, which was performed today.
 * Results:

3/2/16: Test mkDec.pl using experiment 0282 sub-experiment 009 under the following conditions (after removing the breakpoint)...
 * Plan:
 * Nothing executed (only the experiment directory has been set up, and therefore attempting to go to etc should fail)
 * Train executed, language model not created (the head command should fail because of a file not found error)
 * Train executed, language model created (the script should run successfully, ending after the run_decode.pl script is executed)
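The three test conditions above amount to checking, in order, which stage has not been completed yet. A minimal Python sketch of that check follows. It assumes that a finished train leaves an etc directory and that a finished language model step leaves an LM directory inside the sub-experiment folder; the function name is made up, and the real mkDec.pl detects these conditions through its own commands rather than through this helper.

```python
import os
import tempfile


def first_missing_stage(sub_exp_dir):
    """Return the first stage whose output is missing, or None if ready to decode."""
    if not os.path.isdir(os.path.join(sub_exp_dir, "etc")):
        return "train"
    if not os.path.isdir(os.path.join(sub_exp_dir, "LM")):
        return "language model"
    return None


# Walk a temporary sub-experiment directory through the three conditions.
sub_exp = tempfile.mkdtemp()
state_empty = first_missing_stage(sub_exp)    # nothing executed yet

os.mkdir(os.path.join(sub_exp, "etc"))
state_trained = first_missing_stage(sub_exp)  # train done, no language model

os.mkdir(os.path.join(sub_exp, "LM"))
state_ready = first_missing_stage(sub_exp)    # both stages done

print(state_empty, state_trained, state_ready)
```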
 * Concerns:

Week Ending March 22, 2016
3/12/16: The specifications for my new makeTest.pl code were changed, and I have modified my tasks as such. My instincts say the head command I was using may change to a cp command, since there is no longer a partial copy function in the code specifications.

3/9/16:
 * Task:
 * Change arguments to source (optional), destination (Source and destination are filepaths, example 0282/009), corpus name, and flag indicating .trans file to use
 * Remove mkdir functionality (that means don't create the DECODE directory: user must create directory first).
 * Rename mkDec.pl to makeTest.pl (mkTest.pl was the name first considered)
 * Remove stopgaps (Mr. Jonas wanted this to be an expert's tool, which doesn't have as much emphasis on user-friendliness)

3/9/16: I created the Speech:mkDec.pl page today and made mkDec.pl available for all to use. I even performed a chmod on the file so that the script can be executed from command line, as previously the permissions would not have allowed the script to be directly executed from the command line without a program parameter. 3/15/16: I uploaded a beta of the makeTest.pl file today. Argument verification is gone, but I still stop if something goes wrong later on. All of the commands should have the correct parameters passed, and one of them in particular required that I execute a line count command. This is because the resulting line count is a required argument of the run_decode.pl file, and that means I have to recreate that parameter that got cut (for me, I'm implicitly using all of them).
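The line count step mentioned for the makeTest.pl beta can be sketched like this. The example is in Python with a made-up helper name and a temporary file standing in for whichever list file the script actually counts. The argument order simply mirrors the run_decode.pl 005 0282/005 1000 call quoted earlier in this log, and treating the last value as the line count is an assumption.

```python
import os
import tempfile


def count_lines(path):
    """Count the lines in a text file without loading it all into memory."""
    with open(path, "r", encoding="utf-8") as handle:
        return sum(1 for _ in handle)


# Hypothetical list file with one utterance ID per line.
list_path = os.path.join(tempfile.mkdtemp(), "example.fileids")
with open(list_path, "w", encoding="utf-8") as handle:
    handle.write("utt_0001\nutt_0002\nutt_0003\n")

total = count_lines(list_path)
# Assumed argument order: sub-experiment ID, experiment path, line count.
decode_args = ["perl", "run_decode.pl", "009", "0282/009", str(total)]
print(" ".join(decode_args))
```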
 * Results:


 * Plan:
 * Concerns:

Week Ending March 29, 2016
3/23/16: Deal with missing files when the source does not equal the destination in makeTest.pl.
 * Task:

3/23/16: See Speech:Exps_0282_012. I was testing my makeTest.pl file there. I updated the script for another testing session.
3/26/16: See Speech:Exps_0282_012. I updated my makeTest.pl file and ran another test. This one seemed far more successful.
3/28/16: I created Speech:MakeTest.pl, which is the documentation for the updated version of mkDec.pl. I also marked mkDec.pl as obsolete as a result, since makeTest.pl is the new version.
3/29/16: See Speech:Exps_0282_012. I posted some findings as a result of the technique that I used for makeTest.pl and made a comparison to the experiment that I used as the source... basically, comparing my code to what I previously did the manual way.
 * Results:

3/23/16:
 * Plan:
 * Look up what files are being requested in the decode.log file to determine what files should be copied over (because of additional files being created) and what folders should be soft linked. Here are the folders from the source that I should be dealing with...
 * model_parameters ~ Symlink
 * etc ~ Copy it (I create an extra file in there)
 * feats ~ Symlink
 * Create code in makeTest.pl that ensures that these links are only created if the source does not equal the destination.
 * Maybe copy code from run_decode.pl? I would do this if I don't copy all of the folders over because of a numbering conflict. This also means that there will be no more copy of run_decode.pl.
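The link-or-copy plan above can be sketched as follows: model_parameters and feats become symlinks, etc is copied because an extra file gets written into it, and nothing happens when the source and destination are the same experiment. This is a Python illustration with a made-up function name and temporary directories; the real makeTest.pl is Perl and works on the experiment tree.

```python
import os
import shutil
import tempfile


def stage_source(source, destination):
    """Link the read-only folders and copy etc, unless source == destination."""
    if os.path.realpath(source) == os.path.realpath(destination):
        return False  # same experiment: nothing to link or copy
    for name in ("model_parameters", "feats"):
        os.symlink(os.path.join(source, name), os.path.join(destination, name))
    # etc is copied because the destination writes its own extra file there.
    shutil.copytree(os.path.join(source, "etc"), os.path.join(destination, "etc"))
    return True


# Build a throwaway source experiment and an empty destination.
source = tempfile.mkdtemp()
destination = tempfile.mkdtemp()
for name in ("model_parameters", "feats", "etc"):
    os.mkdir(os.path.join(source, name))
with open(os.path.join(source, "etc", "example.dic"), "w") as handle:
    handle.write("HELLO HH AH L OW\n")

staged = stage_source(source, destination)
skipped = stage_source(source, source)
print(staged, skipped, sorted(os.listdir(destination)))
```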


 * Concerns:

Week Ending April 5, 2016
Hello, Iron Man team. You will only find my progress update logs on makeTest.pl here. And maybe a furby or two. ^_^

3/30/16:
 * Task:
 * Revise arguments to -d noaa/full 0283/016 0282/012, with all arguments except for the last one required
 * Add mkdir functionality for DECODE, etc, and LM
 * Copy fileids from corpus file via awk '{print $1}'
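The awk '{print $1}' step in the last task keeps only the first whitespace-separated field of every line. A Python sketch of the same extraction is shown below, with a made-up function name. The sample lines are invented and only imitate an "id start stop text" layout, and unlike awk, this version skips blank lines instead of printing empty ones.

```python
def first_fields(lines):
    """Return the first whitespace-separated field of each non-blank line."""
    ids = []
    for line in lines:
        fields = line.split()
        if fields:  # awk would print an empty line here; this sketch skips it
            ids.append(fields[0])
    return ids


# Invented sample lines in an "id start stop text" style layout.
sample = [
    "utt_0001 0.00 1.52 hello there",
    "utt_0002 1.52 3.10 how are you",
    "",
    "utt_0003 3.10 4.75 fine thanks",
]
fileids = first_fields(sample)
print("\n".join(fileids))
```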

4/3/16: I have modified the argument structure accordingly, as well as partially removing the symlink functionality (some directories will require more than just copying). I've also coded in mkdir functions for directories that do not exist. It's not uploaded yet because I still have some theory to work out on how to do this properly...
4/4/16: Symlinks are gone, and I have my first series of code to update the names of any files copied over that have non-matching sub-experiment IDs.
4/5/16: Code is ready to be tested. I backed away from testing all of the code at once (I did do some testing, but not within the script itself) because I had gotten nervous about testing on Caesar after noticing that a power outage had happened and that the servers had been affected as a result. I am de-prioritizing NOAA support for now due to a difference in transcript formats and the fact that there don't appear to be scripts for NOAA material.
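The renaming of copied files with non-matching sub-experiment IDs can be pictured with the sketch below. It is written in Python with a made-up function name and assumes that the affected files carry the sub-experiment ID as a prefix followed by an underscore (as in a name like 012_train.trans). The exact set of files that makeTest.pl renames is not reproduced here.

```python
import os
import tempfile


def retag_files(directory, old_id, new_id):
    """Rename every file that starts with `old_id_` so it starts with `new_id_`."""
    renamed = []
    for name in sorted(os.listdir(directory)):
        if name.startswith(old_id + "_"):
            new_name = new_id + name[len(old_id):]
            os.rename(os.path.join(directory, name), os.path.join(directory, new_name))
            renamed.append(new_name)
    return renamed


# Throwaway etc directory with two prefixed files and one unrelated file.
etc_dir = tempfile.mkdtemp()
for name in ("012_train.trans", "012_train.fileids", "notes.txt"):
    with open(os.path.join(etc_dir, name), "w") as handle:
        handle.write("placeholder\n")

renamed = retag_files(etc_dir, "012", "024")
print(renamed, sorted(os.listdir(etc_dir)))
```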
 * Results:


 * Plan:

4/3/16:
 * Concerns:
 * The transcript formats are inconsistent between Switchboard and NOAA. Plus, there's no script processing perl file for NOAA. I am de-prioritizing supporting more than one type of transcript file for now, although I have a method to extract fileids from NOAA scripts (although it's not coded).
 * In run_decode.pl, $TRAIN and $TRAIN_INPUT_DIR meant exactly the same thing, which was redundant to me. If I bring the original version back, I'm guessing I will be using it like before: on the entire destination, and nothing else (this means I better do some renaming on copied files so that they can all be referred to by task ID). I think I got confounded in the class meeting, to be honest...

Week Ending April 12, 2016
4/10/16: I had a discussion with Matthew after he proofread my script and updated it to correct a few typos. The discussion was with regards to the feat folder not having the corresponding .mfc files. To generate the feats, the corresponding .sph files must exist in the correct directory.

4/11/16: Ran into none other than Mr. Jonas himself after my Internship class. We ended up discussing the makeTest.pl file... and this has resulted in me changing my specifications as per the following...
 * No auto-detection when dealing with invalid dashed flags (I had originally considered this)
 * For decoupling reasons, generating feats and running the decode is going to be done externally (there are already scripts for generating feats, with some of them brand new from the Modeling group, as I was made aware of from Mr. Jonas... the script I was referred to I believe was Speech:LinkTransAudio.pl.). Instead, this message will be displayed...
Successfully created DECODE dir
AM pointed to (source)
LM generated from (corpus)
Note: Generate feats, then execute run_decode.pl (destination) (destination sub-experiment, extracted from destination) (senone count, extracted from model_parameters)

4/10/16:
 * Task:
 * Have the code extract extra .sph files for any utterance that is not already there.
 * Take the code from generateFeats.pl and have it use the decode fileids rather than the train fileids.

4/12/16: See Speech:Exps_0282_024. I updated my makeTest script and tested it with this experiment.
 * Results:


 * Plan:


 * Concerns:

Week Ending April 19, 2016

 * Task:

4/13/16: Matthew had a version 12 makeTest.pl with some additional files copied over that he told me about in-class today. I decided to incorporate them into version 14, since they helped with dealing with potentially unseen data (especially with regards to feats), and I updated makeTest.pl accordingly. 4/17/16: Updated Speech:MakeTest.pl to the latest version. I also looked at decode.log... and ended up having to redo the decode on Speech:Exps_0282_024. See that page for more details. 4/18/16: I had discovered a slowdown in connections starting Saturday evening. When I took a look at Speech:Spring_2016_Neil_Champagne_Log, I realized that they were aware of the problem. I decided not to complain because only the initial connection was slow (and it didn't take several minutes to connect... but even then, I have some patience to spare): the session itself worked fine. I updated Speech:MakeTest.pl with some instructions on how to integrate with two known feat generation scripts. 4/19/16: Did some research on vibranium and one of the parameters with regards to attempting to mass-produce the shields. That means for this day, I dedicated my time to doing top secret work.
 * Results:


 * Plan:


 * Concerns:

Week Ending April 26, 2016
4/21/16: Re-introduce softlinks into makeTest.pl... for better or for worse. The main reason for doing so is to self-document what the source experiment ID was. 4/24/16: Update makeTest again to support the latest version of run_decode.pl, which has modifications made to it that address a major concern of mine (unfortunately, it's not the default script in /mnt/main/scripts/user... instead, it's currently stored in /mnt/main/scripts/user/History/run_decode/6b). 4/25/16: LM creation is supposed to use the .trans file from the AM. Thus, look for <src-subex#>_train.trans?
 * Task:

4/23/16: After voicing my concerns to Mr. Jonas (indicating that some revisions to other scripts would be required), I created a version 15. This version is only to test the softlinks: it will not be made official until the go-ahead is given. Matthew created an updated genFeats.pl to deal with this new version of makeTest.pl. See Speech:Exps_0282_026 for more details, since this experiment was used to test the script.
4/25/16: I got an urgent message saying that my LM implementation was invalidating experiments, especially with unseen data. So I updated my script to deal with that. I also updated my script to refer to run_decode.pl and to revert to copying individual files from the source etc directory. See Speech:Exps_0282_030 and Speech:Exps_0282_031: I used these two to test my script.
4/26/16: Made a minor correction on an error message. I also updated Speech:MakeTest.pl and updated the default version used.
 * Results:

4/22/16: To reverse copying the entire directory (well... almost) 4/24/16:
 * Plan:
 * All directory renaming will be cut from the script. This is because run_decode only accepts a single task ID, and that means I must make all suffixes equal to the source experiment. At least it means that it documents which sub-experiment ID is the source. (There is a concern about archiving this stuff, though, when they get put into their own folders... that means they must be re-aligned when they get archived)
 * The etc directory cannot be softlinked. It is used for scoring, meaning there is a risk of overwriting a previous experiment's data. That means I must copy over files from the original directory.
 * Re-introduce re-naming of all non-softlinked files copied over with non-matching sub-experiment IDs
 * No DECODE directory
 * In sphinx_train.cfg (if you're copying over the file), update $CFG_DB_NAME and $CFG_SPHINXTRAIN_DIR.
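Updating the two parameters in a copied sphinx_train.cfg can be sketched as a targeted text replacement. The snippet below is Python with a made-up function name. It assumes each setting sits on its own line in the form $NAME = "value"; and every value shown is a placeholder rather than what makeTest.pl really writes.

```python
import re


def set_cfg_value(cfg_text, name, value):
    """Replace the double-quoted value assigned to $name, leaving other lines alone."""
    pattern = re.compile(
        r'^([ \t]*\$' + re.escape(name) + r'[ \t]*=[ \t]*)"[^"\n]*";',
        re.MULTILINE,
    )
    new_text, count = pattern.subn(lambda m: f'{m.group(1)}"{value}";', cfg_text)
    if count == 0:
        raise KeyError(f"${name} not found in configuration")
    return new_text


# Placeholder configuration text: the values are invented for the example.
cfg = (
    '$CFG_DB_NAME = "012";\n'
    '$CFG_SPHINXTRAIN_DIR = "/example/source/012";\n'
    '$CFG_N_TIED_STATES = 1000;\n'
)

updated = set_cfg_value(cfg, "CFG_DB_NAME", "024")
updated = set_cfg_value(updated, "CFG_SPHINXTRAIN_DIR", "/example/destination/024")
print(updated)
```

Raising an error when a setting is missing keeps the script from silently leaving a copied file that still points at the source experiment.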

4/22/16: I have major concerns over the other students getting confused as a result of having to use the source sub-experiment ID rather than the destination sub-experiment ID. Copying over the directories and renaming them averted this problem, but now I have to actively create files that maintain the original sub-experiment ID from the source. In addition, I cannot accept the genFeats.pl script anymore because of a mismatch in sub-experiment IDs.
 * Concerns:

Week Ending May 3, 2016
4/27/16: Apparently there are two different kinds of transcript formats. I was aware of this initially, but now I have a defined name for each data structure, named like this (with general format shown):
 * Corpus transcript format: id start stop ________
 * Experiment transcript format: ____________ <id>

4/27/16: De-emphasize language model creation.
 * Task:

4/30/16: Version 17, which I have done some coding for today, will only bother with the Language Model to copy it from the source if it doesn't already exist. I have also added in a couple more files to the etc directory, including sphinx_train.cfg, which will have two modified parameters. Speech:Exps_0282_033 has been prepared for recording the results.
5/1/16: Speech:Exps_0282_033 now has results filled in.
5/2/16: Updated Speech:MakeTest.pl and made version 17 official.
5/3/16: Updated some wiki pages by adding some missing source code to Speech:ParseDecode.pl, Speech:Lm_create.pl, Speech:Find.pl, and Speech:Gen_errors.pl.
 * Results:

5/1/16: For tomorrow, update documentation and make version 17 official.
 * Plan:
 * Concerns:

Week Ending May 10, 2016
5/10/16: For tomorrow, I plan on updating Speech:Run_Decode_Trained_Data and Speech:Run_Decode_Unseen_Data. In case makeTest.pl fails, then I will list the steps that are performed so that it can be replicated manually.
 * Task:

5/4/16: I have decided that version 17 will be the last official version. Apparently there's a last-minute overhaul involving some new perl module containing decode parameters. There may not be enough development time to do such a thing. I contributed to the final report by giving my input on makeTest.pl, what it does and what was its purpose.
 * Results:


 * Plan:


 * Concerns: