Speech:Spring 2015 PatsXLIX



Team Logs

 * Bruins
 * [Patriots]

Team Member Logs

 * Nathaniel Biddle -  Tools Group
 * Melissa Bruno -  Systems Group
 * Garrett Bryant -  Modeling Group
 * Krista Cleary -  Data Group
 * Trevor Downs -  Proposal Group / Modeling Group
 * Dakota Heyman -  Data Group
 * Refik Karic -  Tools Group
 * Taylor Kessel -  Experiment Group
 * Kyle Poirier -  Systems Group
 * Nicholas Tello -  Experiment Group
 * Zebadiah Wood -  Modeling Group

Script Analyses
Analyses of the scripts found in /mnt/main/scripts/train/scripts_pl. Note that these scripts are copied into an experiment directory during the setup steps (/mnt/main/Exp/Exp#/SubExp#/scripts_pl/). 3/29 Update - I suggest that everyone take a look at the files in the etc directory of an experiment, especially sphinx_train.cfg, as this is where several of the parameters you will be seeing come from ($CFG_...).
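A quick way to see which $CFG_ parameters a config file defines is to grep them out. This is a hedged sketch: the path is illustrative, so substitute your own Exp#/SubExp# train directory.

```shell
# List the $CFG_ parameter names defined in an experiment's config.
# The path is illustrative -- point it at your own train's etc directory.
CFG='/mnt/main/Exp/Exp#/SubExp#/etc/sphinx_train.cfg'
grep -o '\$CFG_[A-Za-z0-9_]*' "$CFG" | sort -u
```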


 * setup_SphinxTrain.pl - Krista, Nick
 * setup_Sphinx3.pl - Refik, Trevor
 * genTrans.pl - Refik, Kyle
 * pruneDictionary.pl - Dakota, Krista
 * generateFeats.pl - Nick

RunAll.pl
 * Melissa (1-3)
 * 00.verify/verify_all.pl
 * 01.vector_quantize/slaveVQ.pl
 * 02.falign_ci_hmm/slaveconvg.pl
 * Zeb/Taylor (4-6)
 * 03.force_align/slave_align.pl
 * 04.vtln_align/slave_align.pl
 * 05.lda_train/slave_lda
 * Nathaniel (7-11)
 * 06.mllt_train/slave_mllt.pl
 * 20.ci_hmm/slave_convg.pl
 * 30.cd_hmm_untied/slave_convg.pl
 * 40.buildtrees/slave.treebuilder.pl
 * Garrett (12-15)
 * 45.prunetree/slave.state-tying.pl
 * It calls the two other scripts in the same directory.
 * None of these three scripts seems to do anything that we will be able to modify.
 * I have a text file containing a summary of all three if needed.
 * 50.cd_hmm_tied/slave_convg.pl
 * This is a very large script.
 * It depends a lot on the contents of the model_parameters directory, so I will try to investigate the contents of that.
 * 90.deleted_interpolation/deleted_interpolation.pl
 * References means, variances, and transition_matrices from the model_parameters/001.1000
 * And mixture_weights from model_parameters/001.1000._delinterp
 * 99.make_s2_models/make_s2_models.pl
 * Uses same values as the above script
 * Also uses mdef

Script Progress

 * Refik
 * (3/29) Tried to look at my scripts. Couldn't figure out how to get FileZilla or the cisunix ftp program to connect to Caesar. SSH with PuTTY works fine, but it does not let me open the scripts.
 * I had the same problem, but you can just vi into the file from the command shell. It can be kind of a pain, but you can still copy the text into Notepad++ to make it easier to read. ("vi + FILENAME" jumps to the bottom of the document) -Garrett
 * (4/6) What happened to the scripts I was assigned? I checked both the Exp and scripts directories but did not find any of my scripts. Two weeks ago they were in both of those folders when we initially took a look at what was inside.
 * (4/13) Looks like setup_Sphinx3.pl is the script responsible for copying the files over to the experiment directory you make. It might be worth spending some time cleaning this up so that we can save disk space: every time you run a train, a copy of each script file is made in your experiment directory.
 * Dakota
 * pruneDictionary.pl
 * Purpose:
 * Searches through a transcript and creates a dictionary based on the words found in the transcript.
 * Variables:
 * $trans_file = Transcript provided by the user
 * $dict = Dictionary provided by the user (should be Master Dictionary)
 * $output_file = Resulting file that includes the new dictionary
 * Process:
 * User specifies correct transcript and dictionary to use. The filename for the output file is also defined.
 * A list of every word that appears in the transcript, as well as the number of appearances by each word, is placed in a temp file.
 * The words are then stripped of non-word attributes and the results are placed in the output_file.
 * Nicholas
 * setup_SphinxTrain.pl
 * (03/30) Purpose:
 * Creates the directory for the Sphinx files, along with many of its subdirectories, then copies all executables to the local bin directory and creates the config file for Sphinx.
 * Variables:
 * $SPHINXTRAINDIR = The directory being used
 * $DBNAME = the name of the directory that is being used
 * $TEMPLATE = where a template of the $DBNAME is stored for creation of future trains
 * $help = doesn't seem to be used for anything?
 * $force = stores the 'force' command to be called by larger pieces of code such as $FORCE_MODE
 * $update = stores the 'update' command to be called by larger pieces of code such as $UPDATE_MODE
 * $LEAVE_MODE = used to hold the value 0
 * $UPDATE_MODE = used to hold the value 1
 * $FORCE_MODE = used to hold the value 2
 * $replace_mode = used to switch between the other three modes when changes need to be made mid-script
 * $result = master variable that collects the values of the other variables
 * Process
 * Script creates the needed directories for SPHINX to operate and creates the config file using default values
 * (03/31) Purpose:
 * This script calls scripts_pl as well as makeFeats.pl. It removes the link to the audio file and then creates a new link pointing to the new audio directory.
 * Variables
 * None
 * Process
 * A necessary script that does exactly as its purpose states.
 * Note(s)
 * There isn't really anything that can be changed in this script. It does what it needs to do without allowing for much change that would improve results.
 * Melissa
 * 00.verify/verify_all.pl
 * Purpose: (from comments, cat file | grep #)
 * This script goes through two phases:
 * PHASE 1: Check to see if the phones in the dictionary are listed in the phonelist file
 * PHASE 2: Check to make sure there are not duplicate entries in the dictionary
 * Variables:
 * 01.vector_quantize/slaveVQ.pl
 * Purpose:
 * The agg_seg script aggregates all the training feature vectors into a single dump file and the kmeans script uses the contents of this dump file to compute the vq centroids in the vector space
 * Variables:
 * 02.falign_ci_hmm/slaveconvg.pl
 * Purpose:
 * This script launches all the CI (context-independent) continuous training jobs in the proper order. First it cleans up the directories, then launches the flat initialization, and then the Baum-Welch and norm jobs for the required number of iterations. Within each iteration it launches as many Baum-Welch jobs as the number of parts we wish to split the training into.
 * Variables:
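The transcript word-count pass described in the pruneDictionary.pl notes above can be sketched as a one-line pipeline. This is a hedged approximation of what the Perl script does, not its actual code; the transcript and temp-file names are illustrative.

```shell
# Sketch of pruneDictionary.pl's first pass as described above:
# list every word in the transcript with its number of appearances.
# 'train.trans' and 'word_counts.tmp' are illustrative names.
tr ' ' '\n' < train.trans | sort | uniq -c | sort -rn > word_counts.tmp
head word_counts.tmp   # most frequent words first
```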

Experiment Results

 * Results Key:
 * SPKR   | # Snt # Wrd | Corr    Sub    Del    Ins    Err  S.Err
 * 5_hrTrain
 * 0266/003
 * This was just the base train with nothing modified
 * Sum/Avg | 2933 35843 | 75.2   18.5    6.4   17.4   42.2   94.1
 * 0275/001
 * Values modified:
 * lowerf from 133.3 to 233.3 (Not much of a change. Probably better to just leave it)
 * upperf from 6855.5 to 2855.5 (This number seemed really high, but changing it may skew the results)
 * CFG_CONVERGENCE_RATIO from 0.04 to 0.004 (A comment said it should be 0.004.  I think this is what caused any reduction in errors)
 * Sum/Avg | 3506 42940 | 77.3   16.5    6.1   16.6   39.3   93.7
 * 0275/002
 * Sum/Avg | 5000 73986 | 73.4   20.3    6.3   14.6   41.2   96.1
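As a sanity check on the results key above, the Err column is essentially the sum of the Sub, Del, and Ins columns, up to per-speaker rounding. For example, for the 0266/003 row:

```shell
# Err ~= Sub + Del + Ins for an sclite summary row. The 0266/003 row
# reads 18.5 + 6.4 + 17.4, versus a reported Err of 42.2; the small
# difference is rounding in the per-speaker averages.
awk 'BEGIN { print 18.5 + 6.4 + 17.4 }'   # prints 42.3
```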


 * Created Sub Experiment 004
 * Changes Made
 * Noted on pen and paper, to be shared with the group at Wednesday's meeting until a secure page is made. Results will be handled the same way.

Competition E-mail Stream
5-7-2015

Dakota Heyman

Hello Patriots, I figured we should probably establish a plan for how we are writing the final report. From what I understand, this will be very similar to our proposal, but in the past tense rather than the future tense. With that being said, it makes sense for us to divide the work into our starting groups. I will do modeling myself, because we "unfortunately" lost Zeb and I know Trevor was thrown into that group really late (since he was in charge of putting together the proposal at the beginning). Trevor, I was hoping that you could assume your position as the collaborator and make all of the groups' work flow together. I believe he said that he wanted this report done by this weekend, so we kind of have to get it done quickly. I'm also assuming that there aren't any groups that took Prof. Jonas's advice to start the report early. Anyway, let me know if you have questions or if I got any of this information wrong. I think it would make sense to email the group as everyone makes progress on the report, so we know that everybody is doing their part and Trevor can clean it up. From, Your ex-co-team-coordinator

5-6-2015

Dakota Heyman

Thanks everyone! If you already left your house with the previous version, that's fine. We can just write in Automatix for the drone. If you haven't left yet, then I attached the latest version. Thanks again, Dakota

Garrett Bryant

Sorry for the late reply! I did the decode on Automatix. Everything else looks good for the report. We are almost done!

Krista Cleary

I'll just print it before I leave my house. I'm planning on leaving around noon, so that's when I'll print it.

Dakota Heyman

Thanks Krista, that'd be great! I'm not sure of your timetable, but if possible could you wait like 30 minutes before printing to see if people have feedback they want to add? I also need the drone that Garrett used to run the decode.

If you don't have time and need to print now, that's fine. We'll just write in the drone with a pencil haha.

Krista Cleary

We're supposed to print and staple 3 copies of it to turn in. I think it looks good. If you want I can print and staple it?

Dakota Heyman

Are there any last-minute changes that people want to add to the report? Also, does anyone know if the report is supposed to be emailed to him or printed out and handed to him in person? Thanks, Dakota

5-5-2015

Krista Cleary

That is correct Dakota.

Dakota Heyman

Krista can correct me if I'm wrong, but I believe that she later realized the result was from a 5hr decode rather than a 125hr. -Dakota

Melissa Bruno

Instead of saying that we didn't achieve an effective result for 125-hour trains, could we include Krista's 28.8% WER train results? Melissa

Dakota Heyman

Hi all, I have compiled a rough draft of the final report from sections written by Kyle, Nathaniel, Refik, and Trevor. I added in some more information that we have learned about our best result (like the real-time factor). Please let me know what you guys think could be improved upon or elaborated further. As for the final results, Garrett, could you let me know what drone you decoded on for the real-time factor? I was just guessing methusalix. Also, if any of the other sections on the 3rd page seem wrong, let me know. Thanks, Dakota Heyman

5-2-15

Dakota Heyman

That's great news! The real-time factor seems really high but who knows, maybe the other teams' is even higher. Zach from the other team emailed Jonas to ask if the results "absolutely had to be decoded on drones," so I'm guessing theirs is pretty high too. I'll put Jonas' response below so everyone can see it:

"Professor Jonas, is it an absolute requirement that our final result be decoded on a drone?"

It would be difficult to compare real-time factors. What problem are you having? I would expect teams run many decodes on caesar and take the top ones (3 maybe) and re-run them concurrently on each team's three assigned drones. If time is a huge issue (i.e. say decodes are running at 6xRT, which would be a day and a half on the 6 hour test set) you can take a "sampled" subset of either 125_decode.trans or 256_decode.trans, clearly describe it in your document and use it (maybe 90 minutes is enough... and when I say sampled I mean NOT the first 90 minutes, but say every 4th utterance to cover it evenly). A sampled test set should give a good approximation of the full test set and would be acceptable to submit if properly discussed in your result report. Let me know if that is helpful.

Regards, Dakota

Melissa Bruno

That's awesome! Yesterday I was worried this train would only bring the word error rate down by another .1%, but that's a huge improvement! Also, I know real-time factor is supposed to be one of the deciding factors, but doesn't that depend too much on the state of the system (how many trains are running at the moment, etc.) for it to make any sense? Quote from a paper published by IEEE that supports my point: "D. Noisy Observations. Given a parameter vector, WERs are reproducible. Unfortunately this is not the case for the real-time values. Computer performance depends on the state of the computing infrastructure, e.g., the operating system or network."
http://www.dcs.shef.ac.uk/~th/publications/elhannani_spl09.pdf
The author then goes on to describe the calculus he used to normalize the "noise" factor, because not doing so would be bad science. Am I missing something, or is one of the deciding factors for how well we did not reliable at all? I looked up this paper after noticing how bewildered teams from previous semesters were about how all-over-the-place WERs are. Best, Melissa

Garrett Bryant

So the 256hr train that took 9 days to run got 32.2%... Looks like we made it into the race, guys. I have to run it on a drone for real-time, but I already know it's going to be over 60 hours for that.
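The "sampled" subset Jonas describes (every 4th utterance rather than the first 90 minutes) can be produced with a one-liner. The fileids filename below is an assumption for illustration; use whatever control file your test set actually has.

```shell
# Keep every 4th utterance of the decode control file so the sample
# covers the test set evenly, per Jonas' suggestion above.
# '125_decode.fileids' is an assumed filename -- use your test set's.
awk 'NR % 4 == 1' 125_decode.fileids > sampled_decode.fileids
```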

5-1-2015

Dakota Heyman

Hey Garrett, well those results aren't exactly what we hoped for, but at least we have definitive numbers to place as results. Who knows, maybe the other team has an even worse real-time factor. Either way, great work on getting results! Regards, Dakota

Garrett Bryant

Hey people, all of the good introductions have been taken, so I had to resort to calling you 'people.' I re-decoded the 256hr train that got us 41% on Automatix, so we have a real-time factor. It took about 31 hours to run, but I don't remember how long the 256hr decode data is to compare it. I think it was around 6 hours, which is really bad... On the bright side, the 41% error rate went down to a whopping 40.9% error rate! That was with the language model and dictionary changes that we had discussed. My newest 256hr train is still decoding at the moment. It is currently running on Caesar and it has taken 48 hours thus far, so it isn't looking good for the real-time problem. Anyways, I will keep y'all posted. -Garrett
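For reference, the real-time factor implied by Garrett's numbers (about 31 hours of decoding on roughly 6 hours of audio) works out to about 5.2xRT:

```shell
# Real-time factor = decode wall-clock time / audio duration.
# 31 hours of decoding on ~6 hours of audio.
awk 'BEGIN { printf "%.1f\n", 31 / 6 }'   # prints 5.2
```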

4-29-2015

Dakota Heyman

Hi everyone, Right now the game plan is to continue modifying the language model to decrease the error rate. Our best result so far is a 41% WER on a 256hr train. Garrett is currently decoding that train with a modified LM to see if the results improve. Garrett and I also went over the dictionary that I had created and found it to be ineffective. Garrett created a script to generate this dictionary and is currently testing it against his 256hr train. The other task this week is to create the report for our results. There are only 4 sections, so we have delegated the sections to 4 team members: Kyle - Goal, Trevor - Description, Refik - Results, Nathaniel - Summary. Emails will be sent individually to each team member with more details on what to include in the report. Everyone else will continue to attempt to decrease the WER by changing the sphinx configuration, modifying the LM, and using the new dictionary (needs to be verified by Garrett first). As results (good or bad) are found, email the group so we can analyze them together. If anyone has any questions about anything, feel free to let the group know so we can help you out. Regards, Dakota

Melissa Bruno

Thanks, Dakota! I ran three standard 5 hour trains with a senone value of 1000 with the three different -vocab_type options, just to see if changing them made things better or worse. The change from -vocab_type 1 to -vocab_type 2 didn't make any difference, and the change from -vocab_type 1 to -vocab_type 0 actually brought the error rate up a bit. It might be a different story if we change the senone value, but just going off of this it might actually be better to leave that value alone for now. Best, Melissa

Dakota Heyman

Hey Melissa, After class I'll email everyone the game plan. I think class will be short today due to the commencement fair, so our game plan will probably be the same this week as it was before (new LM / new dictionary). I haven't tried a train with idngram2lm set to -vocab_type 0, but I'll try one after class today. I didn't have much time to work on the project this week, but I believe I will be more free this week to contribute. Regards, Dakota

Melissa Bruno

Hi team, I've had to call out of work the last few days because of a sinus infection, which means I'm going to be missing class at a very inopportune time. Is anyone willing to write up a summary after class of our game plan for the final week and include it in this email chain? Also, has anyone tried running a train with idngram2lm set to -vocab_type 0? I still think that has a lot of promise and I am trying to run a train with that option right now, but I'm curious as to whether anyone else did this successfully or ran into any issues. Thank you, Melissa

4-28-2015

Melissa Bruno

Hey all, Because I was in a hurry last time I did this, here's a detailed investigation I did of the default cutoff of tmp.vocab in the language model:

5-hour train 0275 001 /LM
$ cat trans_parsed | tr ' ' '\n' | sort | uniq | wc   3310
$ cat tmp.vocab | wc   3313

125-hour train 0275 006 /LM
$ cat trans_parsed | tr ' ' '\n' | sort | uniq | wc   12297
$ cat tmp.vocab | wc   12300

256-hour train 0275 002 /LM
$ cat trans_parsed | tr ' ' '\n' | sort | uniq | wc   29791
$ cat tmp.vocab | wc   20004

Unlike the other two trains, the vocab this language model is using does not have the same number of unique words as its transcript, because of the default 20,000-word cutoff. It has almost 10,000 fewer words than it should.

Using the -top 40000 option, 256-hour train 0275 002 /LM:
$ cat trans_parsed | tr ' ' '\n' | sort | uniq | wc   23570
$ cat tmp.vocab | wc   24573

The vocab being used to create this language model is no longer capped at 20,000 because the wfreq2vocab -top 40000 option is being used in lm_create.pl. Basically, just always make sure to use wfreq2vocab -top 40000 in lm_create.pl when decoding longer trains. It may be worthwhile to re-decode older trains after adding that. lm_create.pl can be edited at /mnt/main/Exp/###/###/LM/lm_create.pl. Hope that helps, Melissa

4-27-2015

Dakota Heyman

I completed work on the dictionary but I'm not sure how to verify if it's correct. I'll document my process here since I can't on my log:

I used the createTranscript.pl script to create a transcript out of 125hr_train.fileids, since the /mnt/main/Exp/0272/002/etc directory only has a train.trans that includes a lot more than 125hr. I used the following command. The format is createTranscript.pl <input_transcript> <output_transcript> <length_of_time> <start_time>:

/mnt/main/scripts/user/createTranscript.pl /mnt/main/Exp/0272/002/etc/train.trans 125hr_train.trans 625422 0

I used tail -10 of the 125hr_train.fileids located in Jonas' Exp directory to see that the last audio file was sw3170B-ms98-a-0045. Then I incremented the number (which ended up being 625422) until I got to sw3170B-ms98-a-0045 (I used tail again to see when I was getting close). After that, I simply copied the new transcript from my personal directory to my Exp directory (0275/010). Finally, I used pruneDictionary to create a dictionary out of the new transcript:

/mnt/main/scripts/user/pruneDictionary.pl /mnt/main/corpus/switchboard 125hr_train.trans 010

This was the terminal output:

Processing 236578 words against dictionary...
Added 236241 files to add.txt
Created 010.dic

I'm going to try running a 125hr_train with it and I'll update you guys with the results. Regards, Dakota

Dakota Heyman

The issue I'm having is getting the script arguments right. pruneDictionary needs a transcript, so first I need to use GenTrans, but GenTrans wants a corpus directory. When I put in the path of the 125_train_fileids (/mnt/main/Exp/0272/002/125_train_fileids) it doesn't recognize it as a proper argument. Also, while looking at GenTrans, it generates a train_trans and a train_fileids. We already have the file_ids, so somehow we need to get a 125_train_trans as well. I have class until 9, so I'll look at it more tonight.

Melissa Bruno

Thanks, Dakota. I tried SSHing into cisunix.unh.edu on my work and personal computers after receiving your email, and still got the same error. I'm at work and able to connect just fine, though, so I'll just stay in the office late.