Speech:Spring 2017 Sharayah Corcoran Log


Week Ending February 7, 2017

Task

2/3: Continue to research Torque, LDA, and look into software currently being used.

2/5: Continue research, read over student logs.

2/6: Learn about Torque by locating documentation, and add Tools Group information to 2017 Proposal.

2/7: Jeffery is trying to run an experiment but is having problems. I am trying to see if I can assist him by reading over the instructions for creating and running an experiment. I will also try creating an experiment myself.

Results

2/3: Expanded notes for LDA and Torque proposal. Discovered that utilizing LDA could reduce the WER by approximately 25%, make the decoder faster, and reduce the size of the acoustic model.
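
To make the 25% figure concrete, here is a minimal sketch of the arithmetic, assuming the figure is a relative reduction (the notes above do not say whether it is relative or absolute). The function names and the baseline WER of 40.0 are hypothetical and are not results from this project.

```python
def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Return the WER reduction as a fraction of the baseline WER."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - new_wer) / baseline_wer


def wer_after_reduction(baseline_wer: float, relative_reduction: float) -> float:
    """Return the WER that results from applying a relative reduction."""
    return baseline_wer * (1.0 - relative_reduction)


# Hypothetical baseline, chosen only to illustrate the arithmetic.
baseline = 40.0
improved = wer_after_reduction(baseline, 0.25)
print(improved, relative_wer_reduction(baseline, improved))
```

Under this reading, a system at 40% WER would land at 30% WER, not at 15%.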

2/5: Continued to read about LDA and Torque.

2/6: I was able to locate Torque documentation after some time. It was not readily explained on the CMU Sphinx SourceForge site, so I had to search for Torque information elsewhere. I located the following website: Torque Resource Manager. I am fairly certain this is the Torque resource manager being used in conjunction with CMU Sphinx. I also discovered how to enable Torque, although this was apparently done in the past, so the Tools Group and I will need to determine why it is not working. I also added the tentative Tools Group project proposal to the 2017 Proposal page linked here: Spring 2017 Proposal Page.

Update: While I was out sick, the Tools Group goals changed slightly. As a result the Tools Group proposal was changed by the team members that were in class at the time.

2/7: I successfully created an experiment. It looked like Jeffery and I were both omitting the domain when trying to create an experiment. After including the ad domain before my username, I was able to create an experiment. I let Jeffery and others know in Slack, and Jeffery was then able to create an experiment successfully.

Plan

2/3: Begin finalizing the Tools Group proposal. Organize information on Torque and LDA to convince Professor Jonas that utilizing both features will provide significant benefits to speech processing and performance. Also, Huong discovered that Torque may have been installed already. I plan to investigate this further and to see if I can find documentation from prior semesters. I will be updating my logs three more times before our next capstone meeting.

Update: Thanks to Huong, we discovered that Torque was installed in Summer 2013. Will look into this further.

2/5: Same as before.

2/6: Determine if Torque is in fact installed currently, and if it is, how to make it work correctly.

2/7: Complete the Tools Group experiment. Also look into improving the experiment instructions if other groups think it is necessary.

Concerns

2/3: My only concern is having enough time to get a thorough understanding of the speech project as a whole. Once I have fulfilled the Tools Group's goals for this week, I will begin looking into other group tasks in greater detail.

2/5: Same as before.

2/6: My new concern is getting Torque to work properly. I will try doing this after discussing it with Professor Jonas and the other Capstone groups.

2/7: No new concerns at this time.

Week Ending February 14, 2017

Task

2/8: Out sick today, but currently communicating with team members through Slack. Our group's role has changed somewhat. It looks like we will now be focusing on learning how to install PocketSphinx and what benefits it provides. I will now do research on PocketSphinx.

Update: It looks like another task was added: we must learn about G++ and GCC. I will begin looking into G++ and GCC in more detail.

2/11: Continue reading on GCC, G++, and PocketSphinx.

2/13: Begin developing a thorough understanding of G++, GCC, and PocketSphinx.

2/14: Read through logs of teammates and other class members.

Results

2/8: Began learning about PocketSphinx. It looks like it was designed to be used on hand-held devices, but it can also be used on desktops. It requires SphinxBase in order to be used. When we determine which drone we are installing PocketSphinx on, we will need to make sure that both SphinxBase and PocketSphinx are installed. There are also a number of packages required specifically for Unix-like systems. The instructions for PocketSphinx are located at PocketSphinx Instructions.

2/11: Continued to look over the CMU SourceForge webpage on PocketSphinx.

2/13: Currently, it seems that GCC can compile as C or C++, but G++ always compiles as C++. It seems that some G++ features are backwards compatible with GCC, but this is subject to change as new G++ versions are released. There is a webpage listing some features included in GCC that may be, or have been, removed from new versions of G++. These features can be found here: Deprecated-Features. My research on PocketSphinx went well. The following webpage has a tutorial for installing PocketSphinx and the required software packages: PocketSphinx.

2/14: I've been keeping a close eye on Vitali's logs since he seems to be having the most success with training, as such I'm using his logs as a reference. I plan on trying to run my own train this coming week. Additionally, it looks like the Tools Group has been doing research for G++, GCC, and PocketSphinx.

Plan

2/8: To have a thorough understanding of how to install PocketSphinx and packages that are required for it to run. I would also like to research possible benefits and drawbacks of PocketSphinx.

Update: Will read about G++ and GCC and post findings during next log.

2/11: My plan for the next day is to take notes on the differences between PocketSphinx and Sphinx 3. I will also focus on understanding how to install PocketSphinx.

2/13: Continue writing notes on G++, GCC, and PocketSphinx so the Tools Group can make comparisons and possibly try installation.

Update: I hadn't thought of this until now, but trying to install PocketSphinx on an available drone will actually be hugely beneficial. This will give us the opportunity to test PocketSphinx without utilizing LDA and MLLT. Then we could enable both and measure exactly what the benefits are. Overall this is a great opportunity to test its usefulness compared to Sphinx 3.7. LDA and MLLT are very interesting since they can lead to significantly better WER, so I would definitely like to utilize these features with PocketSphinx.

2/14: Trying to determine if installing G++ is really worth it. I think it is worth trying to determine if we really need a dedicated C++ compiler instead of GCC which has more flexibility. At this point it seems a bit unnecessary. Also, I hope to work with our team to start writing up the benefits and drawbacks of G++, GCC, and PocketSphinx. I am going to look into PocketSphinx performance a bit more and expand my notes a bit. Will update this log with any findings.

Concerns

2/8: My only concern is running into problems installing the various PocketSphinx packages.

2/11: No new concerns. Will check in soon with any new concerns that will arise.

2/13: No new concerns. Actually, I am excited now, if we are able to get PocketSphinx installed on a drone, we can create documentation that can track any and all benefits and drawbacks!

2/14: No new concerns.

Week Ending February 21, 2017

Task

2/17: Today I will attempt to locate all important Tools Group files on Obelix. Professor Jonas requested we do this in case Majestix is not fixed.

Update: Systems group got Majestix up! They have e-mailed Jonas so he can set up our user accounts.

2/19: Today I will be trying to run a train and decode successfully. I will update the log after my attempt.

2/20: I am attempting to run another train and have created an additional sub experiment folder for it. The train is currently running. I will update the log once it is complete.

2/21: Checking in.

Results

2/17: I took a look at Obelix files and folders, I didn't see anything that would be beneficial for the tools group goals this semester. I was mostly looking for GCC, G++, and/or PocketSphinx files, but it appears that none of those are installed on Obelix.

2/19: I created an additional sub experiment for the Tools Group. I then began looking through Vitali's logs, and the instruction Wiki so I could run a train. I began running a train after following the steps on Train Instruction Wiki, but the train was not successful. I got the following error: Cannot create models used by Sphinx 2. This happened about two and a half hours into the train and then the script just hung. I exited the script, I will have to take a look and see what may have caused this error. Oddly enough, it seems that a lot of data was trained. I wouldn't be surprised if others had similar problems, it seems that most people were not able to run a train successfully.

2/20: My second attempt at running a train resulted in the same error message as before. About an hour or two in, I received a message that says: Cannot create models used by Sphinx 2.

2/21: Checking in, finishing up the last little bit for our group's G++ and GCC comparison.

Plan

2/17: I will read through the 2016 tools group logs and determine if they installed anything on Obelix that we can use. It looks like Majestix is the machine that we need to make a more seamless transition to G++. I also plan on running a train this week since Jeffery had some problems and the instructions are not clear. I will be attempting this in the next few days and will make note of my steps in case I do figure it out. As a group we have set a goal to finish our comparison report on GCC and G++, as such I have been working with team members to document our findings. I will continue to do this until our next class meeting.

Update: Since Majestix is now up, I will not focus on Obelix installations, instead I will be looking at Majestix (once our user accounts are added) so I can take note of what is installed and how we can set up G++.

2/19: I plan to determine what I did wrong when trying to run a train so I can correct it. I will also continue to work on the G++/GCC comparison report with my group.

Update: Tomorrow I will also attempt to run a decode on the LM that I was able to create, I want to see if that is successful, if not I suspect it's just user error when I tried to run the train.
Update 2: I tried to ssh into Majestix, it's up, but I cannot log in with my username and password. I'm guessing Jonas hasn't run the script to add student logins to Majestix. I emailed Jonas about this.
Update 3: Systems Group is needing us to install GCC on the Rome machine, working with Mark from Systems to get access to Rome. It's up, I just can't log in. Will update this log as needed.

2/20: Since the second train of mine failed as well, I have reached out to the Tools Group on Slack. I am currently waiting for a reply, I know that Jeff had tried running a train, I want to see what error messages he received if any.

Update: I heard back from Jeff, he got a different error than me. It looks like the 30hr/test doesn't work properly because it says that a file is missing, which is correct. The 30hr/train is the one that creates the necessary file for training. Not sure how others were able to train successfully.

2/21: Checking in.

Update: As of yesterday evening Systems Group and Jonas got Majestix set up, and I was able to ssh in successfully. Tools Group will be taking a look at Majestix and we will try to install G++ this coming week.

Concerns

2/17: I worry that I will not be able to run a train successfully. I believe Vitali was the only person that was able to run one successfully. I will try looking through Vitali's logs to make note of anything that the training/decoding page does not cover.

2/19: No significant concerns, just want to successfully train and decode. Will keep reading through logs and instruction wikis. It seems that past semesters have written over each other too much, or have not been good at keeping thorough notes, this makes figuring out how to run a train and decode a bit more difficult.

2/20: My current concern is successfully running a train.

2/21: No new concerns.

Week Ending February 28, 2017

Task

2/23: Checking in, this is an update on what was done during capstone yesterday.

2/26: Checking in, updating my log with what I did on Friday.

2/27: Currently running a train on Majestix so we can compare the training results to the results we get after installing G++. Will update my log with the results once the training and decoding is complete.

Update: I figured out the decode problem, I will run a decode tomorrow and will update my log with the results.

2/28: Ran another train on Majestix and tried to do a decode and score. Decode and scoring failed on Majestix, trying to determine why.

Results

2/23: Our group was able to run a decode successfully, the results of the decode are posted in the 001 sub-experiment folder. We were also able to get the files we needed copied onto Majestix, and we were able to un-mount and re-mount /mnt/main. We also signed up for the GRC.

2/26: I didn't have time to try to unmount /mnt/main from Majestix last Friday; however, Huong shared with our group how to verify that /mnt/main was unmounted. Jeff was trying to do this so he could run a train on Majestix. Although I did not log into Majestix to get it unmounted, I was able to help Jeff ssh into Majestix properly by letting him know what commands were needed (ssh-keygen, etc.).

2/27: Train ran successfully on Majestix, I was also able to create the language model. However, I am having problems running the decode. I will continue to look into this and update the log when this has been resolved.

2/28: Training is working fine on Majestix, however I was unable to do a decode successfully. I have specified the directory the run_decode.pl script is in, but for some reason it is not running on Majestix. I have asked my team members if they have tried running a decode yet on trained data on Majestix. I am currently waiting to hear back from them.

Plan

2/23: This week I plan to run another train and decode, I will post the results to the logs and the experiment wiki page. I believe Huong is going to try installing G++ on Majestix after running a train, the group will use Slack to communicate in case any problems arise.

2/26: Tomorrow and Tuesday I plan on running a train on Majestix. I know Jeff was having problems trying to unmount since he was being prompted for a root password. Tomorrow I plan on trying this to see if I can figure out a solution. It appears that we will not be able to install G++ on Majestix this week remotely since it is not connected to the Internet. This was discussed with the Systems group, they are working on getting it connected if possible.

2/27: Since I have completed the train and created the language model I plan to complete the decode tomorrow on Majestix. I figured out what I was doing wrong, I was not specifying where the run_decode.pl file was, I need to do this for it to run correctly. This should make the decode go very smoothly for tomorrow. I will add my findings to our experiment wiki page as well as our group's log so we can reference it after completing the G++ installation.

2/28: Figure out why decode and scoring are not working on Majestix. I suspect this is user error, so I am waiting to hear back from my teammates to see if they have tried a decode or score on Majestix yet.

Concerns

2/23: My only concern is having Internet access on Majestix, without this we will not be able to install G++.

2/26: My main concern going forward is getting Majestix connected to the Internet. We will need Internet access to be able to install G++ remotely. If it is not connected, we will need to do the installation in person. Not having Internet access could cause problems going forward for other software installations as well.

2/27: My concerns have not changed, we still do not have Internet access to Majestix which means we won't be able to install G++ remotely. We will need Internet access or will need to install it in person. The Systems Group is working to get Internet access to Majestix, they have been very good at communicating so I am not concerned, I know that eventually we will have Internet access.

2/28: No new concerns.

Week Ending March 7, 2017

Task

3/2: Today I tried taking a look at Majestix to see what files were on it. Jonas wanted us to copy the files to one of our user folders for safe keeping. We were able to run a decode on Majestix in class on 3/1.

3/3: Checking in.

3/6: Run a train on Obelix, double check Majestix to see if the text files from Spring 2016 tools group are on the machine anymore.

3/7: Attempt to run a new train in a new sub experiment directory, as well as attempt a decode on Obelix since the prior one was not successful. (Please refer to the 3/6 update below for more detail).

Results

3/2: I read through the Spring 2016 Tools Group log as well as some individual member logs, it appears that some of the files (specifically the ones Daisuke added) are not on Majestix. Huong and I both ran into the same problem when trying to decode on Majestix. Professor Jonas helped us locate the decode problem (too many files had been copied from Caesar). Towards the end of the class period we realized that since we already had completed trains on Majestix, we would be able to run a decode (temporarily) on Caesar until the Majestix decode issue was resolved (which it was). The outcome of this is posted on the 0299/007 experiment wiki page.

3/3: Checking in.

3/6: I checked Majestix again thoroughly, I believe the text files from the Spring 2016 tools group are no longer on Majestix. I let my team members know. I plan on running a train this evening on Obelix, I will update my log when the train is complete.

Update: The train was successful, however the decode failed. It appears that I am having the same issue I had on Majestix when I tried to decode last week. It looks like running the rsync -r /mnt/main/local obelix:usr/local copied over correctly, however, it appears that /mnt/main/local has an additional local directory. This means there is an additional directory on Obelix and any other machine that ran that command following March 1st (it appears the second local directory was added to /mnt/main on this date). I have emailed Professor Jonas about this problem and I am currently waiting to hear back from him. In the meantime I have created a new directory on Obelix named "local-bad", this directory holds the additional local directory. To save time I tried to recreate the LM for my 2099/008 sub experiment folder, however, the decode failed again. As such, I will create another experiment and will run another train, once the train is complete I will attempt to run a decode. I will update this log when this is attempted and when I hear back from Professor Jonas.
Update 2: I have heard back from Jonas, he has corrected the local directory issues on Obelix and /mnt/main.
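
The nested local directory described above (a local directory sitting inside another local) is the kind of problem a quick scan can catch before a decode is attempted. The following is a minimal sketch, assuming the only symptom to look for is a directory whose name matches its parent's name. The function name and the throwaway demo tree are hypothetical and are not part of the project's scripts.

```python
import os
import tempfile
from typing import List


def find_nested_duplicates(root: str) -> List[str]:
    """Return directories under root that share their parent's name (e.g. local/local)."""
    hits = []
    for dirpath, dirnames, _ in os.walk(root):
        parent = os.path.basename(os.path.normpath(dirpath))
        for name in dirnames:
            if name == parent:
                hits.append(os.path.join(dirpath, name))
    return sorted(hits)


# Demo on a throwaway tree that mimics the reported layout.
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "local", "local", "bin"))
    os.makedirs(os.path.join(tmp, "local", "lib"))
    found = [os.path.relpath(p, tmp) for p in find_nested_duplicates(tmp)]
print(found)
```

On a real machine the scan would be pointed at the copied directory (for example the destination of the rsync command above) rather than at a temporary tree.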

3/7: I was able to run a train on Obelix but building the LM failed, as such I fully expect the decode to fail as well. I spoke with Mark from the Systems Group, he also noticed the problem with /mnt/main having two local directories. Mark believes he has identified the problem and he is going to be trying to decode on the Systems group machine tonight. I will be communicating with him to see if he finds a way to resolve the issue when trying to run decode from the machines.

Update: I am running a decode on the data I trained on Obelix in experiment folder 0299/009. It is running correctly now since Jonas was able to remove the additional local directory on /mnt/main and Obelix.
Update 2: Was able to run a successful train and decode on Obelix.

Plan

3/2: I plan to further investigate what folders and files may be needed from Majestix. Since Professor Jonas is wanting to switch us to Obelix, I will be coordinating with the Systems Group so I can run a train and decode.

3/3: Checking in.

3/6: Run a train on Obelix and take a snapshot of the files and folders on Obelix so we can make comparisons after the G++ install.

3/7: Re-create the bad LM and run a decode in 0299/009 sub experiment folder after Mark attempts his.

Concerns

3/2: I am still concerned that we will not have Internet access to all the drones, that will make installations very difficult if someone is not on-site to plug in the Ethernet cable. This obviously makes it difficult for our team to do our job. To address this I will try to run a successful train and decode run on Obelix as soon as possible so we can use that data as a reference before installing GCC and G++ on Obelix (assuming we are told to do so).

3/3: Checking in.

3/6: No real concerns, I just hope that we are able to install G++ this week so we can move on and get PocketSphinx installed.

3/7: My concern at the moment is hearing back from Professor Jonas so he can remove the extra local directory from /mnt/main.

Week Ending March 21, 2017

Task

3/16: Checking in, I will be looking into how to run a decode on unseen data. I intend to do this on Obelix in the next few days.

Update: I looked at the unseen decode wiki page, the modeling group page, and Greg's wiki log. These three resources helped me understand possible problems (and solutions) to run a decode on unseen data.

3/19: Checking in, I am getting ready to run a train on Obelix.

3/20: Run a decode on seen data.

3/21: Run a train on Obelix. Then run a decode on unseen data. I have started the train that I will use for the unseen data decode, I will update my log when this is complete.

Results

3/16: No results as of yet, I have been looking over the modeling group's unseen train documentation and also the original documentation that contains instructions for unseen data.

3/19: I will be running a train on Obelix so we can compare its WER to the train that was run before the GCC installation.

Update: The train ran successfully.

3/20: I will update this once the decode is complete.

Update: The decode on trained data was successful, the information has been added to the experiment webpage.

3/21: The train completed successfully on Obelix today. I was also able to build the language model and run a decode on unseen data without any problems. The results are posted on the experiment wiki page.
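
For reference, WER is conventionally the word-level edit distance (substitutions, deletions, and insertions) divided by the number of words in the reference transcript. The following is a minimal sketch of that calculation. It is not the scoring script used for these experiments, and the function name and example sentences are made up for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return prev[-1] / len(ref)


# Made-up transcripts: one substitution and one deletion over six words.
print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Because the denominator is the reference length, WER can exceed 1.0 when the hypothesis contains many insertions.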

Plan

3/16: Run a train and decode on Obelix for both seen and unseen data.

3/19: I will complete the train, then will run a decode on seen and unseen data.

3/20: I now plan to run another train on Obelix so I can decode on unseen data. I will update the log once this is complete.

3/21: Since the trains and decodes are complete, I plan to help complete the GCC installation guide, as well as gather information for the PocketSphinx installation proposal (assuming that this is still wanted at some point).

Concerns

3/16: No concerns currently.

3/19: No concern at the moment.

3/20: No concerns.

3/21: No concerns, I was worried about running a decode on unseen data, but the instructions (although cryptic) were able to help me do this successfully on Obelix. I have no concerns going forward, I only hope that the G++ installation will go smoothly.

Week Ending March 28, 2017

Task

3/25: I am checking in, in the next few days I will run a train and decode on Obelix. This will be for comparison purposes since we installed G++ on Obelix this past week.

3/26: Checking in.

3/27: I am running a train and decode on Obelix today so we can compare the results to the train/decode that was run before installation.

3/28: Work on Rebel team work, try to run a train after modifying config files.

Update: This is in progress, I am doing more research on modifying the config files. I don't want to start making changes until I can hypothesize that the changes are worthwhile.

Results

3/25: No results as of yet, I will post train and decode results when I run them.

3/26: Checking in.

3/27: I will update this when the train and decode are complete.

Update: The train and decode were successful. It appears that there were no changes in the train or decode after the G++ install on Obelix. The outcome of the decode is posted on the Tools Group 012 experiment wiki page.

3/28: I have created the sub experiment directory for my experiment. I have been taking a look at the sphinx_train.cfg file, I have not made any changes to my copy as of yet. I am looking over tutorials so I can better understand what variables I should modify. I am looking over the following webpage [1] so I can make improvements to the file.

Plan

3/25: Run a train and decode on Obelix, including a decode on unseen data. Also plan on starting a PocketSphinx installation document. The tools group is thinking that it may not be worthwhile testing PocketSphinx this semester. I will discuss this with our group and see what they suggest.

3/26: Checking in.

3/27: My plan is to do a train and decode on 5hr train for the Rebels group, as well as start a Pocket Sphinx proposal.

Update: I have created my experiment for Rebels group work, I am waiting to run the train since I have not modified the necessary config file as of yet. I am needing to do more research before making changes. I have notes from last class that I will be looking over, as well as the Robust Group Tutorial I linked in this log.

3/28: Continue reading about the config file, and how to make changes that will improve the WER. I will be spending most of today doing research so I can have a deeper understanding of how each variable in the config file affects our models.

Concerns

3/25: No concerns at the moment, the Rebel team, and the Tools Group are both making progress and are having few problems.

3/26: Checking in.

3/27: No concerns.

3/28: My main concern at the moment is being able to make worthwhile changes to my experiment so I can improve our results for the Rebels team.

Week Ending April 4, 2017

Task

4/1: Checking in, met with a Rebels Team member and a Tools Group member about tasks that are required for next week.

4/2: Checking in.

4/3: Determine how many files were affected by the G++ installation on Obelix, and begin the PocketSphinx installation proposal.

4/4: Continue to do reading on changing the configs in order to get a better WER.

Results

4/1: Learned that we need to keep running experiments for the rebels team. For the Tools Group we are supposed to create a document for G++ installation on Caesar.

4/2: Checking in.

4/3: The directories and files that were impacted by the G++ installation on Obelix can be found under the Tools Group's G++ Installation Documentation, found here: [2]. I have been looking into PocketSphinx, its performance, and the possible advantages of installing it. From what I have found thus far, PocketSphinx is the fastest decoder offered by CMU; however, it has slightly lower accuracy rates. That said, if an experiment comparing accuracy and performance was desired, it might be worthwhile installing PocketSphinx after implementing LDA and RNNLM, since these features would greatly increase performance. Additionally, getting Torque to function properly would also significantly increase Sphinx3 performance. It seems to me that the focus of this semester should be implementing these performance and accuracy features, and then comparing the result to PocketSphinx. If Sphinx3 can match or outperform PocketSphinx in accuracy, it might be worthwhile to forego any PocketSphinx installation at all. That said, it is entirely dependent on what purpose the speech recognition system is supposed to serve.

4/4: I have learned a bit more and have a few ideas of what to change. I will be trying to run an experiment in the next few days. Additionally, the G++ installation proposal for Caesar is complete.

Plan

4/1: Will start the G++ installation proposal for Caesar in the next few days.

4/2: Checking in.

4/3: Continue adding to the PocketSphinx proposal documentation, I have reached out to my team members to discuss possible drawbacks of PocketSphinx and to get their input.

4/4: Run an experiment with edited config on Obelix for Rebels Team.

Concerns

4/1: No concerns at this moment.

4/2: Checking in.

4/3: My only concern at the moment is having enough time tomorrow to run an experiment for the Rebels team.

4/4: No concerns currently.

Week Ending April 11, 2017

Task

4/8: Checking in.

4/9: Checking in.

4/10: Meet with the Rebels Team to discuss the outcomes of our experiments. Try to determine whether we need to install miniconda for a 64-bit machine. Also, get in touch with the Systems Group to ensure that a backup of Caesar is made so the Tools Group can install G++ this Wednesday.

4/11: Ensure that the team was able to successfully install miniconda, and work on experiments for the rebels team.

Results

4/8: Checking in.

4/9: Checking in.

4/10: I have reached out to my team members to discuss the miniconda install, and to see if we should try installing 64bit instead. Additionally, I have reached out to the Systems Group to check on the backups. I have also been reading about experiments and how to modify the config files for trains.
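
One way to help settle the 32-bit versus 64-bit question is to ask the machine directly. The following is a minimal sketch using only the Python standard library. The function names are hypothetical, and it assumes a Python interpreter is already available on the machine in question.

```python
import platform
import struct


def is_64bit_interpreter() -> bool:
    """True when the running Python interpreter uses 64-bit pointers."""
    return struct.calcsize("P") * 8 == 64


def describe_machine() -> str:
    """Combine the reported hardware type with the interpreter's word size."""
    bits = "64-bit" if is_64bit_interpreter() else "32-bit"
    return f"{platform.machine()} ({bits} interpreter)"


print(describe_machine())
```

Note that a 32-bit interpreter running on a 64-bit machine will report 32-bit here, so the hardware type returned by platform.machine() should be read alongside it.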

4/11: I spoke with the tools group and it seems that the miniconda installation is complete. Additionally, we have discussed what we will need to do to set up a hotfix directory for our gcc and g++ installation. I have reviewed the wiki logs from Jonathan from the spring 2016 tools group. If the backup of Caesar is complete, the tools group will be able to install g++ on Caesar tomorrow.

Plan

4/8: Checking in.


4/9: Checking in.

4/10: Help in any way I can with the miniconda installation, also continue to expand the Pocket Sphinx proposal document, and perhaps go into more detail about why it may not be necessary to install it.

4/11: Install g++ on Caesar as long as a backup has been made by the systems group. Discuss with the tools group how to create a hotfix directory and how to add symbolic links to the drones so they can access the gcc and/or the g++ directories.
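
The symbolic-link idea can be sketched as follows, assuming a hotfix directory that already exists and a link path on each drone that should point to it. The function name and the directory names in the demo are hypothetical, since the actual layout had not been decided at the time of this entry.

```python
import os
import tempfile


def link_hotfix(hotfix_dir: str, link_path: str) -> str:
    """Create (or refresh) a symbolic link at link_path that points to hotfix_dir."""
    if not os.path.isdir(hotfix_dir):
        raise FileNotFoundError(f"hotfix directory not found: {hotfix_dir}")
    if os.path.islink(link_path):
        os.remove(link_path)  # replace a stale link
    elif os.path.exists(link_path):
        raise FileExistsError(f"refusing to overwrite non-link path: {link_path}")
    os.symlink(hotfix_dir, link_path)
    return os.path.realpath(link_path)


# Demo in a throwaway directory; real paths on the drones would differ.
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "hotfix")
    os.mkdir(target)
    link = os.path.join(tmp, "gcc")
    resolved = link_hotfix(target, link)
    demo_ok = os.path.islink(link) and resolved == os.path.realpath(target)
print(demo_ok)
```

Refusing to overwrite anything that is not already a link is a deliberate choice here, so that an existing compiler directory cannot be replaced by accident.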

Concerns

4/8: Checking in.

4/9: Checking in.

4/10: No concerns at this time.

4/11: No concerns at this time.

Week Ending April 18, 2017

Task

4/15: Checking in.

4/16: Checking in.

4/17: Do more research on Pocket Sphinx in order to make a worthwhile argument for (or against) installation. Additionally, look at Greg's logs and continue reading CMU experiment documentation in order to make effective changes to the configs in order to get a better WER.

4/18: Try to gather more information comparing Pocket Sphinx and Sphinx 3. Once more information is gathered as a group, add more information to the Pocket Sphinx proposal documentation we have on the tools group wiki page.

Results

4/15: Checking in.

4/16: Checking in.

4/17: At this point the Tools Group has gathered a fair amount of information about Pocket Sphinx. Thus far it seems that Pocket Sphinx would not provide any significant benefits, besides performance, to our speech project. It seems that we, as a group, should focus on improving the system we already have (using Sphinx 3) before trying to encourage Pocket Sphinx installation. Additionally, few comparisons have been made between Sphinx3 and Pocket Sphinx. It might be a worthwhile article opportunity to fine tune Sphinx3 and then compare its performance and accuracy to a fine tuned Pocket Sphinx. I suspect that fine tuning Pocket Sphinx would be very time consuming, so at this point it seems like it would only be a distraction from our current goals.

Despite my reading, I still have some areas where I need clarification. Greg from the modeling group has done more experiments than I have, so I have looked through his logs to get a better understanding of the changes he has made. The Rebels team hopes that the changes and experiments we have run thus far have resulted in a better WER than the Empire team.

4/18: We have concluded that Pocket Sphinx, as I mentioned in some earlier wiki entries, might not be worthwhile at this point in time. Arguments for this have been added to the Pocket Sphinx proposal document.
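Note on WER: the word error rate that both teams are trying to lower is conventionally the word-level edit distance (substitutions, deletions, and insertions) between the decoded hypothesis and the reference transcript, divided by the number of reference words. The sketch below is only an illustration of that definition; it is not the scoring script the experiments use, and the function name word_error_rate is made up for this example.

```python
def word_error_rate(reference, hypothesis):
    """Illustrative WER: word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

Under this definition a hypothesis that drops one word from a six-word reference scores 1/6, and a lower value is better when comparing two configurations on the same test set.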

Plan

4/15: Checking in.

4/16: Checking in.

4/17: Run an experiment for the Rebels team, as well as install GCC and G++ on Wednesday, assuming that Caesar is backed up. I know that the Systems Group has been having problems getting the backups working correctly. If Caesar is backed up, we should be able to install GCC and G++.

4/18: At this point we plan to install GCC and G++ tomorrow, given Jonas' permission, now that the Systems Group has been able to back up Caesar successfully.

Concerns

4/15: Checking in.

4/16: Checking in.

4/17: My only concern at the moment is being able to install GCC and G++ on Caesar this coming week.

4/18: No concerns at this moment in time.

Week Ending April 25, 2017

Task

4/21: Email Jonas to request permission to install GCC on Caesar.

4/23: Checking in.

4/24: Checking in. I still have not heard back from Jonas about getting permission to install GCC on Caesar.

4/26: Email Jonas again and create a pre-gcc snapshot directory on Miraculix in case he wants further validation.

Results

4/21: Emailed Jonas to let him know which files were affected by the GCC installation on Obelix.

4/23: I am still waiting to hear back from Jonas. Once we hear back from him, we will install GCC on Caesar.

4/24: Still waiting on an email from Professor Jonas so we can install GCC on Caesar. In the meantime I have reached out to the Capstone group to see if I can install GCC on Miraculix for testing purposes, so that I can ensure that my snapshot (and directory comparison) was correct. I am currently waiting to hear back from the team assigned to Miraculix so I can go ahead with that installation.

4/26: I still haven't heard back from Jonas. That said, I was able to complete a correct pre-gcc snapshot on Miraculix in case Jonas wants us to test the installation of GCC on Miraculix as well.
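Note on the snapshot and directory comparison: the idea is to record the state of the file system before the install, record it again afterwards, and list what was added, removed, or changed. The sketch below shows one possible way to do that in Python. It is an assumption about the general approach, not the exact procedure used on Obelix or Miraculix, and the names take_snapshot, compare_snapshots, and demo are hypothetical.

```python
import os
import tempfile


def take_snapshot(root):
    """Map each file's path (relative to root) to its (size, mtime) pair."""
    snapshot = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            try:
                info = os.lstat(full_path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            rel_path = os.path.relpath(full_path, root)
            snapshot[rel_path] = (info.st_size, info.st_mtime_ns)
    return snapshot


def compare_snapshots(before, after):
    """Return sorted lists of added, removed, and changed relative paths."""
    added = sorted(set(after) - set(before))
    removed = sorted(set(before) - set(after))
    changed = sorted(
        path for path in set(before) & set(after) if before[path] != after[path]
    )
    return added, removed, changed


def demo():
    """Simulate an install inside a throwaway directory and report the diff."""
    with tempfile.TemporaryDirectory() as root:
        with open(os.path.join(root, "existing.conf"), "w") as f:
            f.write("old")
        before = take_snapshot(root)

        # Stand-in for the installer: add one file and modify another.
        os.mkdir(os.path.join(root, "bin"))
        with open(os.path.join(root, "bin", "gcc"), "w") as f:
            f.write("placeholder binary")
        with open(os.path.join(root, "existing.conf"), "w") as f:
            f.write("new contents")

        after = take_snapshot(root)
        return compare_snapshots(before, after)
```

On a real machine the two snapshots would be taken of the directories of interest before and after the installation. Comparing size and modification time is quick but can miss a file rewritten with identical size and timestamp; hashing file contents would be stricter at the cost of speed.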

Plan

4/21: Checking in.

4/23: Install GCC on Caesar.

4/24: Install GCC on Caesar, install GCC on Miraculix (as a precaution) after getting permission from the team assigned to that machine.

4/26: Install GCC on Caesar, after getting permission from Jonas.

Concerns

4/21: Checking in.

4/23: No concerns at this time.

4/24: My only concern is whether I will hear back from Jonas and/or the team assigned to Miraculix. I will update my logs when I do hear from them.

4/26: No concerns at this moment in time.

Week Ending May 2, 2017

Task

4/29: Work on running a 5 hour train and decode on Caesar and Idefix for comparison purposes. Also work on combining the GCC and G++ documentation.

4/30: Checking in.

5/1: Checking in.

5/2: Run a 5 hour train on Idefix when it becomes available for comparison purposes. Keep working on the single GCC/G++ documentation. Work with our teams to try to find ways of improving the WER.

Results

4/29: We successfully completed the 5 hour train and decode on Caesar. We are waiting to run the 5 hour train and decode on Idefix since it is currently running a 300 hour train. Additionally we have begun combining the GCC and G++ documentation, but it is still a work in progress.

4/30: Checking in.

5/1: Checking in.

5/2: I checked with the Rebels team. They ran into some errors while running the 300 hour train, so they needed to restart it. As such, I have been unable to run the 5 hour train on Idefix for comparison purposes. The train that was needed on Caesar is complete and the results have been posted to the Wiki. I am waiting to hear back from Greg, who was going to let me know when the 300 hour train was complete. The Tools Group has continued to add our GCC and G++ documentation to a single wiki entry.

Plan

4/29: Run a train and decode on Idefix when it becomes available. Continue to work on GCC and G++ documentation.

4/30: Checking in.

5/1: Checking in.

5/2: Run a 5 hour train on Idefix when it becomes available. Continue to work on documentation for our teams and the GCC and G++ installation guide.

Concerns

4/29: No concerns at this time.

4/30: Checking in.

5/1: Checking in.

5/2: No concerns at the moment.

Week Ending May 9, 2017

Task

5/6: Checking in.

5/7: Checking in.

5/8: Need to finish running a 5 hour train and decode on Idefix.

5/10: Continue working on Tools Group documentation.

Results

5/6: Checking in.

5/7: Checking in.

5/8: Ran a 5 hour train and decode on Idefix since it is now available. Worked on the Rebels team final report.

5/10: Worked on Tools Group documentation.


Plan

5/6: Checking in.

5/7: Checking in.

5/8: I was able to run a successful train and decode on Idefix for a 5 hour train. The results are in the Tools Group experiment wiki. The trains that Jonas requested for GCC and G++ comparisons are now complete and documented on that page. I will continue to work on the Rebels team final report along with other team members.

5/10: No further plans, only plan to finish capstone today!


Concerns

5/6: Checking in.

5/7: Checking in.

5/8: No concerns at this moment in time.

5/10: No concerns.