Speech:Spring 2015 Morgan Gaythorpe Log



Week Ending February 3, 2015

 * Task:

Monday February 2
 * Results:

Reviewed the experiment creation process. A few places need improvement. On the 'Set up Terminal' page the Linux section is completely blank. Although the process for Mac and Linux machines is probably almost identical, any differences should be noted, or the two sections should be merged. On the 'Create the Language Model' page the disclaimer section seems mostly irrelevant and should probably be removed. The red and green color scheme should either be removed entirely or replaced with bold highlighting, which is more in line with the rest of the wiki. In general, references to scripts may need to be updated to point to new or revised scripts once we start work on them.

Tuesday February 3

Looked over one third of the scripts. Some did not include code on the wiki page. dictionary.pl is the same as createdict.pl and can be marked off the list. dictionary2.pl has no available info on the wiki. dictionary3.pl does not have its code on the wiki, but from the description given it sounds useful. I will need to look over find.pl when we get access to the server, as not enough information is given to determine whether the script is useful. gen_errors.pl should be kept and used. generateFeats.pl has some useful parts which should be kept. genFileIDs.pl does not seem to be that useful and should be marked off the list. GenTrans0.pl, GenTrans2.pl, GenTrans3.pl, and GenTrans4.pl are replaced by GenTrans5.pl/GenTrans6.pl. GenTrans5.pl and GenTrans6.pl are basically the same file with only formatting differences. I would like to see the lm_create.pl code on the server, but this script sounds useful for creating the Language Model. master_run_train.pl needs to be heavily updated to remove its ability to automatically create experiment directories on the server. I would like to see the code for parseDecode.pl on the server before I make a decision.

Sunday February 1
 * Plan:

Checked my team's logs and put together a list of the scripts I will be going through before Wednesday.


 * Concerns:

Week Ending February 10, 2015

 * Task:

Saturday February 7
 * Results:

Read the group's logs to keep up to date with each member's activities.

Sunday February 8

Checked my group's logs one more time this week.

Monday February 9

Today I reorganized the existing experiments by Semester. I also made each Semester collapsible, so the reader can expand only the Semester(s) he or she wishes to view. I did not change any of the links on the experiments page. I only changed the overall layout, so all the links still connect to their corresponding experiments. The hardest part of this task was learning the API to make the collapsible elements and fighting with the wiki to make the page look presentable. This took much longer than I or my group's members had estimated. In addition, I added the range of experiment numbers to each Semester's header to make a specific experiment easier to find if the user only knows its number.

Sunday February 8
 * Plan:

Tomorrow I plan on organizing the old experiments by semester. For Tuesday I plan to write a draft describing what the new script needs to do.


 * Concerns:

Week Ending February 17, 2015
Monday February 16
 * Task:

Today I began writing the create wiki experiment script. The overall plan for what the script will do is fairly straightforward. First the script will prompt the user for their name and the purpose of the experiment. The script will then send a login request to the foss.unh.edu wiki. If the server requires a token, the script will confirm the token before continuing the login. After logging in, the script will find the next available experiment number. Finally the script will add a new experiment to the page, recording the author's name, the date the page was made, and the purpose of the experiment. The script will give this experiment number to the user. This script will take a few days to complete. I hope that tomorrow I will be able to at least log in to the wiki, prompt the user for their name and purpose, and navigate to the experiments page. The most difficult part I see in writing this script will be formatting the URL commands and sending them in the correct order.
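The request ordering described above can be sketched as follows. This is a minimal Python sketch, not the Perl script itself; it only builds the form fields and makes no network calls. The parameter names (`action`, `lgname`, `lgpassword`, `lgtoken`, `format`) follow the MediaWiki login API of that era as I understand it, and the function name and sample values are hypothetical.

```python
from urllib.parse import urlencode


def build_login_params(username, password, token=None):
    """Return the form fields for one step of a two-step MediaWiki login.

    Step 1 omits the token. If the server answers that a token is needed,
    step 2 repeats the request with the token it handed back.
    """
    params = {
        "action": "login",
        "lgname": username,
        "lgpassword": password,
        "format": "json",
    }
    if token is not None:
        params["lgtoken"] = token
    return params


if __name__ == "__main__":
    # Hypothetical credentials and token, for illustration only.
    first = build_login_params("example_user", "example_pass")
    second = build_login_params("example_user", "example_pass", token="abc123")
    # Each dict would be sent as the body of an HTTP POST to api.php.
    print(urlencode(first))
    print(urlencode(second))
```

Keeping the parameter construction separate from the HTTP call makes it easy to inspect exactly what is sent at each step before involving the server.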

Saturday February 14
 * Results:

Checked my team's logs for the week so far.

Tuesday February 17

After a few hours of work I figured out how to log in to the server on foss.unh.edu. I had to log in to http://foss.unh.edu/resources/api.php through the Perl MediaWiki API. I tried to log in through http://foss.unh.edu/project/api.php but I kept receiving a JSON decode error which prevented me from getting any further. I have not attempted to modify or add any pages yet. I am not sure how the foss.unh.edu server is set up, but its settings and structure appear to differ from those of a standard MediaWiki site, which made requesting a login and confirming the token more difficult than I anticipated. I expected to be further along with the script at this point, but the script I have now is a good starting point.


 * Plan:


 * Concerns:

Week Ending February 24, 2015

 * Task:


 * Results:


 * Plan:


 * Concerns:

Week Ending March 3, 2015

 * Task:

Sunday March 1
 * Results:

I completed a lot of work on the script. The script uses the LWP::UserAgent Perl module to handle HTTP requests and responses. Using these requests and responses I have been able to log myself in to the wiki. I have also been able to get the next experiment number from the list of experiments on Speech:Exps. I programmed the script to make edits to pages on the wiki. Tomorrow I will tie up loose ends on the script and finish it. I have not tested the script on Caesar. The script will need the LWP::UserAgent module and all of its dependencies installed to work. This script took me much longer to complete than I thought it would.
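Finding the next experiment number could look like the sketch below (Python, not the Perl script itself). It assumes the text of Speech:Exps mentions each experiment as a four-digit, zero-padded number after `Exps_`; that pattern and the function name are assumptions for illustration, not taken from the script.

```python
import re


def next_experiment_number(page_text):
    """Return the next free experiment number as a zero-padded string."""
    # Collect every four-digit number that follows "Exps_" in the page text.
    numbers = [int(n) for n in re.findall(r"Exps_(\d{4})", page_text)]
    highest = max(numbers) if numbers else 0
    return f"{highest + 1:04d}"
```

Taking the maximum rather than counting entries keeps the result correct even if some experiment numbers were skipped or listed out of order.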

Monday March 2

Today I cleaned up the script a bit. There are a number of steps involved in making this script work. The script connects to the foss.unh.edu server and attempts to log in with the given username and password. The script gets a login token and passes it back to the server to create a connection with the user logged in. The script then gets an edit token to allow the user to make changes to the wiki. Next, the next free experiment number is retrieved and stored. Afterwards the user enters the author's name and a brief description of the experiment they wish to create. The script then creates the new experiment entry on the Speech:Exps page. Finally the script creates the new experiment's own page with the information provided by the user. The script has been deployed and tested on Caesar. I did not have to install any Perl modules on Caesar before the script worked. The user's password is visible while he or she types it in.
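The two edit steps at the end of that sequence can be sketched as below. This is a hedged Python illustration, not the Perl script: it only builds the form fields for MediaWiki's `action=edit` request (`title`, `text` or `appendtext`, `token`) and sends nothing. How the real script positions the new entry on Speech:Exps is not described here, so appending is a simplification, and the page title `Speech:Exps_0269`, the entry text, and the token value are hypothetical.

```python
def build_edit_params(title, text, edit_token, append=False):
    """Return the form fields for a MediaWiki edit request.

    With append=True the text is added to the end of an existing page;
    otherwise it becomes the whole page content.
    """
    params = {
        "action": "edit",
        "title": title,
        "token": edit_token,
        "format": "json",
    }
    params["appendtext" if append else "text"] = text
    return params


if __name__ == "__main__":
    token = "example_edit_token"  # hypothetical; obtained after logging in
    # Step 1: add an entry for the new experiment to the list page.
    entry = build_edit_params(
        "Speech:Exps", "\n* 0269 Example Author: example purpose", token, append=True
    )
    # Step 2: create the experiment's own page (title is an assumption).
    page = build_edit_params(
        "Speech:Exps_0269", "Author: Example Author\nPurpose: example purpose", token
    )
    print(entry)
    print(page)
```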


 * Plan:

Monday March 2
 * Concerns:

My only concern about the create wiki experiment script is that I did not have to install any Perl modules on Caesar for the script to work. I do not know if this is because my machine has the modules, so that running the script through PuTTY gave it access to the modules that are local to my machine. I really hope not, as this would mean that the script would not run on Caesar for someone who does not have the appropriate modules. I hope that the modules are already on Caesar. This would mean the script would run for anyone who runs it from Caesar (through PuTTY, etc.).
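One way to settle this concern is to test directly on Caesar whether Perl can load the module. A script started in a PuTTY session runs on Caesar itself, so it loads modules from Caesar's Perl installation rather than from the local machine. The Python sketch below wraps the common `perl -MModule -e 1` idiom, which exits with status 0 only when the module loads; the function names are hypothetical, and the check is only meaningful when run on the machine in question.

```python
import shutil
import subprocess


def module_check_command(module):
    """Build a perl command that exits 0 only if the module can be loaded."""
    return ["perl", f"-M{module}", "-e", "1"]


def perl_module_available(module):
    """Return True or False, or None when no perl interpreter is on PATH."""
    if shutil.which("perl") is None:
        return None
    result = subprocess.run(
        module_check_command(module),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


if __name__ == "__main__":
    print(perl_module_available("LWP::UserAgent"))
```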

Week Ending March 10, 2015

 * Task:

Saturday March 7
 * Results:

Today I created a sub-experiment script. This script prompts the user for an experiment number (preceded by a zero, e.g. 268 -> 0268). It takes in very similar information to the main experiment script. The script creates a link to the sub-experiment on the main experiment's wiki page. The page naming scheme is Speech:Exps_experimentNumber_subExperimentNumber, so for example 0268 001 would be located at Speech:Exps_0268_001. This script and the main experiment script are excellent places for later semesters to learn how to create or edit content on the wiki.
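The naming scheme can be expressed in a few lines. This Python sketch assumes four digits for the experiment number and three for the sub-experiment number, as in the 0268 001 example; the function name is hypothetical.

```python
def sub_experiment_title(experiment, sub_experiment):
    """Format a sub-experiment page title from two integers."""
    # Zero-pad to four and three digits, matching the 0268 / 001 example.
    return f"Speech:Exps_{experiment:04d}_{sub_experiment:03d}"
```

For example, `sub_experiment_title(268, 1)` gives `Speech:Exps_0268_001`.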


 * Plan:


 * Concerns:

Week Ending March 24, 2015

 * Task:


 * Results:


 * Plan:


 * Concerns:

Week Ending March 31, 2015

 * Task:

Tuesday March 31
 * Results:

Ran a test 5hr train to get myself familiar with the process. I used the default settings as described in the tutorial on the wiki just to get started.

Results from train:

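,-----------------------------------------------------------------.
|                            hyp.trans                            |
|-----------------------------------------------------------------|
| SPKR    | # Snt # Wrd | Corr    Sub    Del    Ins    Err  S.Err |
|---------+-------------+-----------------------------------------|
| sw2001b |   18    163 | 80.4   16.6    3.1   39.9   59.5  100.0 |
|---------+-------------+-----------------------------------------|
| sw2001a |   14    101 | 78.2   19.8    2.0   49.5   71.3  100.0 |
|---------+-------------+-----------------------------------------|
| sw2005a |    3     10 | 80.0   20.0    0.0  120.0  140.0  100.0 |
|---------+-------------+-----------------------------------------|
| sw2005b |    2     31 | 83.9    9.7    6.5    0.0   16.1  100.0 |
|=================================================================|
| Sum/Avg |   37    305 | 80.0   17.0    3.0   41.6   61.6  100.0 |
|=================================================================|
|  Mean   |  9.3   76.3 | 80.6   16.5    2.9   52.3   71.7  100.0 |
|  S.D.   |  8.0   69.7 |  2.4    4.8    2.7   49.9   51.3    0.0 |
| Median  |  8.5   66.0 | 80.2   18.2    2.5   44.7   65.4  100.0 |
`-----------------------------------------------------------------'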
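For reading the table: the Err column is the sum of the Sub, Del and Ins percentages, each measured against the number of reference words, so heavy insertion can push it past 100 (sw2005a shows 140.0). The Python sketch below parses one row and checks that relationship. The function name is hypothetical, and rounding in the printed table can leave a difference of about 0.1.

```python
def parse_score_row(row):
    """Parse one speaker row of the score table into a dict of numbers."""
    # Split on the column bars, then on whitespace inside each column group.
    cells = [cell.split() for cell in row.strip().strip("|").split("|")]
    speaker = cells[0][0]
    sentences, words = (float(x) for x in cells[1])
    corr, sub, dele, ins, err, s_err = (float(x) for x in cells[2])
    return {
        "speaker": speaker,
        "snt": sentences,
        "wrd": words,
        "corr": corr,
        "sub": sub,
        "del": dele,
        "ins": ins,
        "err": err,
        "s_err": s_err,
    }


if __name__ == "__main__":
    row = parse_score_row(
        "| sw2005a |    3     10 | 80.0   20.0    0.0  120.0  140.0  100.0 |"
    )
    # Err should match Sub + Del + Ins up to rounding in the table.
    print(row["sub"] + row["del"] + row["ins"], row["err"])
```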


 * Plan:

I tried copying a file to Caesar today and had a lot more trouble than I anticipated. To upload the file I used the SSH Secure File Transfer Client to copy the script to cisunix.unh.edu. I then used the scp command (while logged on to cisunix.unh.edu through a terminal) to copy the file from cisunix.unh.edu to caesar.unh.edu. An example of the scp command:
 * Concerns:

scp createWiki_Sub_Experiment.pl msj57@cisunix.unh.edu:/mnt/main/scripts/user

This copies the createWiki_Sub_Experiment.pl script to /mnt/main/scripts/user on cisunix under the username mjs57.

Week Ending April 7, 2015
Saturday April 4
 * Task:

Checked with my team on the status of the competition. Did a little bit of research into why increasing the number of hours per train also increases word error rate (even though it should do the opposite). I haven't discovered the reason yet, but I talked to Sam about settings for the trains. From what I understand we still don't know much about all of the settings. If I had to guess, one of these settings has not been set correctly. I am still waiting to hear back from Sam about the settings, but this is a subject I would like to research further.

Sunday April 5

Checked team's progress

Monday April 6

Changed some settings on a new train I made. I opened sphinx_train.cfg in nano and changed the settings I wanted. I ran into some trouble setting up the corpus, but the issue was only on the wiki (I updated the page to have the correct path to the corpus files). I successfully ran the train afterwards and got a small decrease in word error rate (nothing special though). I hope to build upon what I did today and collaborate with my team members on Wednesday to plan where to go next.

,-----------------------------------------------------------------.
|                            hyp.trans                            |
|-----------------------------------------------------------------|
| SPKR    | # Snt # Wrd | Corr    Sub    Del    Ins    Err  S.Err |
|---------+-------------+-----------------------------------------|
| sw2001b |   18    163 | 76.7   20.2    3.1   48.5   71.8  100.0 |
|---------+-------------+-----------------------------------------|
| sw2001a |   14    101 | 80.2   17.8    2.0   64.4   84.2  100.0 |
|---------+-------------+-----------------------------------------|
| sw2005a |   34    654 | 80.3   15.6    4.1   15.3   35.0  100.0 |
|---------+-------------+-----------------------------------------|
| sw2005b |   59    554 | 57.4   32.7    9.9   31.0   73.6  100.0 |
|=================================================================|
| Sum/Avg |  125   1472 | 71.3   22.7    6.0   28.3   57.0  100.0 |
|=================================================================|
|  Mean   | 31.3  368.0 | 73.6   21.6    4.8   39.8   66.1  100.0 |
|  S.D.   | 20.4  276.7 | 11.0    7.6    3.5   21.3   21.5    0.0 |
| Median  | 26.0  358.5 | 78.4   19.0    3.6   39.8   72.7  100.0 |
`-----------------------------------------------------------------'
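The size of the decrease can be read from the Sum/Avg rows: Err was 61.6 in the March 31 run and 57.0 in this one. The Python sketch below computes the absolute and relative change; the function name is hypothetical. The two tables score different amounts of speech (305 versus 1472 words), so the comparison is only a rough indication.

```python
def wer_change(before, after):
    """Return (absolute drop in points, relative drop in percent)."""
    absolute = before - after
    relative = 100.0 * absolute / before
    return absolute, relative


if __name__ == "__main__":
    absolute, relative = wer_change(61.6, 57.0)
    print(f"{absolute:.1f} points, {relative:.1f}% relative")
```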


 * Results:


 * Plan:


 * Concerns:

Week Ending April 14, 2015
Tuesday April 14

I tried to run another two trains today. I haven't heard from anyone about what we can do to fix the issues a number of people are having. I talked to Nick, Chris and Kayla, and we all seem to be having trouble. Hopefully we can discuss and resolve the issue tomorrow.
 * Task:


 * Results:


 * Plan:

Monday April 13

Today I attempted to run some 5hr trains. I kept running into errors during the decode step. I have included the output from the decode.log file of the sub-experiment I was trying to run:

INFO: kbcore.c(442): Begin Initialization of Core Models:
ERROR: "cmd_ln.c", line 724: Cannot open configuration file /mnt/main/Exp/0270/003/model_parameters/003.cd_cont_2000/feat.params for reading
INFO: kbcore.c(462): Parsed model-specific feature parameters from /mnt/main/Exp/0270/003/model_parameters/003.cd_cont_2000/feat.params
INFO:  Initialization of the log add table
INFO:  Log-Add table size = 29356 x 2 >> 0
INFO:
INFO: feat.c(848): Initializing feature stream to type: '1s_c_d_dd', ceplen=13, CMN='current', VARNORM='no', AGC='none'
INFO: cmn.c(142): mean[0]= 12.00, mean[1..12]= 0.0
INFO: kbcore.c(489): .cont.
INFO:  Initialization of feat_t, report:
INFO:  Feature type         = 1s_c_d_dd
INFO:  Cepstral size        = 13
INFO:  Number of streams    = 1
INFO:  Vector size of stream[0]: 39
INFO:  Number of subvectors = 0
INFO:  Whether CMN is used  = 1
INFO:  Whether AGC is used  = 0
INFO:  Whether variance is normalized = 0
INFO:
INFO:  Reading HMM in Sphinx 3 Model format
INFO:  Model Definition File: (null)
INFO:  Mean File: (null)
INFO:  Variance File: (null)
INFO:  Mixture Weight File: (null)
INFO:  Transition Matrices File: (null)
FATAL_ERROR: "mdef.c", line 680: No mdef-file
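When a decode.log is long, pulling out only the ERROR and FATAL_ERROR lines makes the failure easier to see. In the log above the first ERROR (feat.params could not be opened) comes before the fatal one and may be the more useful lead. The Python sketch below assumes the log file has one message per line; the function name and the shortened sample path are hypothetical.

```python
def find_errors(log_text):
    """Return (line_number, line) for every ERROR or FATAL_ERROR line."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if line.startswith(("ERROR:", "FATAL_ERROR:")):
            hits.append((number, line))
    return hits


if __name__ == "__main__":
    sample = "\n".join([
        "INFO: kbcore.c(442): Begin Initialization of Core Models:",
        'ERROR: "cmd_ln.c", line 724: Cannot open configuration file feat.params for reading',
        "INFO:  Model Definition File: (null)",
        'FATAL_ERROR: "mdef.c", line 680: No mdef-file',
    ])
    for number, line in find_errors(sample):
        print(number, line)
```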
 * Concerns:

Week Ending April 21, 2015

 * Task:


 * Results:


 * Plan:


 * Concerns:

Week Ending April 28, 2015

 * Task:


 * Results:


 * Plan:


 * Concerns:

Week Ending May 5, 2015

 * Task:


 * Results:


 * Plan:


 * Concerns: