Speech:Spring 2015 Nathaniel Biddle Log



Week Ending February 3, 2015
 * Task:

2/3: Read through logs to learn more about the project.


 * Results:

2/3: I gained greater familiarity with the project and narrowed down the skills and topics that I'll need to focus on over the next few weeks.


 * Plan:

2/3: I plan to familiarize myself with Unix shortcut commands and Perl so that I can be a more useful and contributing member of my group. I also plan to learn more about the system and do some hands-on work during class this week.
 * Concerns:

2/3: I am neither a novice nor an expert, but I will continue to learn and work toward greater proficiency in all topics involved with this course and my contribution to the project.

Week Ending February 10, 2015

 * Task:

2/4: Research more logs tonight and research whether or not Sphinx-4 will work on Red Hat. We will be creating a Google Hangouts group so that we can communicate.

2/10: Continue to research logs, prepare for next class by becoming better versed with Unix commands and working through Caesar, and be prepared for group discussion about the software suite during the 2/11 class.


 * Results:

2/4: I have been looking more closely at the two Justins' logs from the Spring 2014 semester, as they were the previous tools group. I searched around Red Hat's hardware and platform compatibilities and will continue to do so over the next few days to see if it can cooperate with the Java-written Sphinx-4.

2/10: It seems that we should be able to make Sphinx-4 work on Red Hat 6, even though it is written entirely in Java. IBM has J9, a package that has made the JVM work with Red Hat Enterprise Linux Server release 6.2. There is also JPackage, which should make it work. It seems that Red Hat, recognizing that Java is an extremely popular programming language, is making it so that Java developers will want to use their OS. This is still just initial research, however; there could be something that I'm missing, and it may turn out that we can't make it work, but it is looking promising.


 * Plan:

2/4: I will research more and begin to become more accustomed to the Linux environment over SSH. I will either create or join the Hangouts group once it is set up. I will make sure I'm becoming knowledgeable about Perl scripts as well.

2/10: My plan is to more concretely divide up tasks and set a course of action tomorrow so that we can have our piece of the proposal done as soon as possible.
 * Concerns:

2/4: No concerns yet.

2/10: Not concerned. I want to know exactly what I'm responsible for, but that will be easily figured out in class tomorrow during group discussion.

Week Ending February 17, 2015

 * Task:

2/11: To understand how to run an experiment so that once my account on Obelix is made I can start to test how the experiments are running on Red Hat before and after updates of the suite are made.

2/16: To make sure I'm understanding how to run a train and decode so that I can effectively test the toolkit.

2/17: Verify access to Obelix. Become better versed in Unix commands and navigation of the Caesar environment. Attempt to run a train and decode to make sure I am getting the same results that were previously obtained for the same experiment.


 * Results:

2/11: I gained a better understanding of trains and decodes and how to run them. I read more logs and gained a better understanding of my role in the tools group of the project.

2/16: Caesar was down today, so I couldn't really accomplish anything.

2/17: I became better versed with Unix, was able to log in to Obelix using my username, and successfully ran a train using the steps outlined in the information section of the mediaWiki. It was a basic 5 hour train, experiment 0265. I don't know if I did it entirely right, so before moving forward with Language Model creation and decoding, I will work with my group to figure out whether or not I did it right and, if so, where I can view the results and what I can compare them to. I will attempt to create the Language Model and run a decode tomorrow, finishing the experiment.


 * Plan:

2/11: Once my account is created on Obelix, I will run an experiment to become confident in gaining the same results on the same system. I am doing this so that my future attempt using updated tools will be more easily analyzed. I want any difference in word error rate to be because of a problem with the tools, rather than a problem with the way I've set up the experiment.
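As a rough illustration of the comparison described in the 2/11 plan, word error rate can be computed as the word-level edit distance between a reference transcript and the decoder's hypothesis, divided by the number of reference words. The sketch below is only an assumption about how such a check could be scripted in Python; it is not the project's scoring tool, and the function name `word_error_rate` is hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Hypothetical helper for illustration; not part of the project's toolkit.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from hypothesis
                curr[j - 1] + 1,     # extra word in hypothesis
                prev[j - 1] + cost,  # substitution, or a match when cost is 0
            ))
        prev = curr
    return prev[-1] / len(ref)
```

Running the same experiment before and after a tool update and scoring both outputs against the same reference would then show whether the update changed the result.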

2/16: I will make sure to give another attempt tomorrow. I will continue to read documentation.

2/17: Now that I have run a train, I can move forward in attempting to update the software suite.
 * Concerns:

2/11: No concerns right now.

2/16: No concerns.

2/17: I am not concerned; I just want to make sure tomorrow that I didn't mess anything up or waste an experiment listing by running my train.

Week Ending February 24, 2015

 * Task:

2/21: Read people's logs. Finish experiment on Obelix.

2/22: Checking in. Reading logs. Finish proposal piece.

2/23: Attempt to finish experiment on Obelix. Prepare for Wednesday's class.

2/24: Checking in. Reading logs. Finish Experiment to best of ability.


 * Results:

2/21: Didn't quite finish, but will pick up tomorrow. I read some logs and did a little research on the most current versions of some of the tools.

2/22: Continued reading logs. Proposal portion finished. Slight research today.

2/23: Corresponded back and forth with team about the final touches that were made to the proposal to make sure that everyone was on the same page. I attempted more work on the experiment, but I didn't quite figure it out yet. I will make sure I do my best to get it done tomorrow night so that I have more to offer on Wednesday during group updates.

2/24: Finished to best of ability. Read logs.


 * Plan:

2/21: Finish tasks and finish my piece of the proposal for tomorrow.

2/22: More research tomorrow. Finish experiment if possible.

2/23: Finish experiment tomorrow. Prepare for Wednesday's class.

2/24: Prepare for Wednesday's class.
 * Concerns:

2/21: No concerns

2/22: No concerns

2/23: No concerns, but curious about how the proposal will be received.

2/24: No concerns.

Week Ending March 3, 2015
 * Task:

2/25: Work on writing up summaries of some of the articles that were found during our literature search. Begin to learn how to do a train and decode without just running the master script.

3/2: Checking in.


 * Results:

2/25: I read some journal articles and am preparing synopses of each of them to put on the mediaWiki. I began to set up a sub-experiment in 0265, our team's experiment, for trying another train and decode.

3/2: Checking in.


 * Plan:

2/25: Continue experiment and finish write-ups of synopses of literature. Continue reading logs and learning about the process of training and decoding.

3/2: Checking in.
 * Concerns:

2/25: No concerns.

3/2: None

Week Ending March 10, 2015
 * Task:

3/8: Checking in.

3/9: Attempt to run a train using modified values.

3/10: Checking in


 * Results:

3/8: Checking in.

3/9: I ran a 5 hour train with different senone values.

3/10: Checking in


 * Plan:

3/8: Checking in.

3/9: Continue experimenting with different values on practice trains.

3/10: Checking in


 * Concerns:

3/8: Checking in.

3/9: No concerns.

3/10: Checking in

Week Ending March 24, 2015

 * Task:


 * Results:


 * Plan:


 * Concerns:

Week Ending March 31, 2015

 * Task:

3/29: Checking in.

3/30: Checking in.

3/31: Finish current trains and adding to tools page.


 * Results:

3/29: Checking in.

3/30: Checking in.

3/31: I finished a 5 hour train and continued some literature browsing.


 * Plan:

3/29: Checking in.

3/30: Checking in.

3/31: Get acquainted with changing values for trains.
 * Concerns:

3/29: Checking in.

3/30: Checking in.

3/31: No concerns as of now.

Week Ending April 7, 2015

 * Task:

4/3: Checking In;

4/4: Checked in;

4/5: Studied scripts, brainstormed potential approaches to problem.

4/7: Run 5hr train and continue to review and dissect scripts and speech recognition programs.


 * Results:

4/3: Checking In;

4/4: Checked in;

4/7: Ran 5 hr train successfully, results in exp section 0265/008, continued brainstorming problem approaches. Read literature about speech recognition word error rate and read the mathifier article again in the related readings section.

4/8: I'm putting this 4/8 post here since it's just late at night before class. I did another train, 0265/009, with different senone values so that I can begin to understand the differences caused by using different values.


 * Plan:

4/3: Checking In;

4/4: Checked in;

4/7: Continue strategizing and brainstorming.
 * Concerns:

4/3: Checking In;

4/4: Checked in;

Week Ending April 14, 2015

 * Task:

4/9: Look at config files, language model, decode.log of Garrett's 256 hour train to see what errors there are and what may have caused them (as well as potential solutions for them).

4/10: Checking in.

4/11: Checking in.

4/12: Send what you found out and be prepared to discuss what you learned in class with the group.


 * Results:

4/9: I compared the default sphinx3.cfg template with the sphinx_decode.cfg for my 5 hour train and for Garrett's 256 hour train to see if I can understand how they work differently and why Garrett's worked so well. Today is just the early stage of looking at it, so I haven't learned too much. I'll likely have more to say tomorrow or in the coming days. I also started looking at decode.log for Garrett's train.
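A comparison like the one in the 4/9 entry could be scripted rather than done by eye. The following is a minimal sketch using Python's standard `difflib` module; the function name `changed_settings` and the sample setting names are hypothetical placeholders, not values taken from the actual config files.

```python
import difflib


def changed_settings(template_text: str, experiment_text: str) -> list[str]:
    """Return only the removed ('-') and added ('+') lines between two configs."""
    diff = list(difflib.unified_diff(
        template_text.splitlines(),
        experiment_text.splitlines(),
        fromfile="template",
        tofile="experiment",
        lineterm="",
        n=0,  # no surrounding context lines, only the changes themselves
    ))
    # The first two lines are the '---' / '+++' file headers; skip them,
    # then keep change lines and drop the '@@' hunk markers.
    return [line for line in diff[2:] if line.startswith(("+", "-"))]


# Made-up example content; real config files would be read from disk instead.
template = "$EXAMPLE_SETTING_A = 1;\n$EXAMPLE_SETTING_B = 2;\n"
experiment = "$EXAMPLE_SETTING_A = 1;\n$EXAMPLE_SETTING_B = 5;\n"
for line in changed_settings(template, experiment):
    print(line)
```

Listing only the lines that differ makes it easier to see which settings separate one train from another.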

4/10: Checking in.

4/11: Checking in.

4/12: I broke down the error log file and found that many of the words were not found. All 1000 errors were words that weren't found, either because the decision trees were too branched or for other reasons.
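The kind of breakdown described in the 4/12 entry can be sketched as a simple tally over log lines. This is only an illustration: the marker strings `"ERROR"` and `"not found"` and the sample lines are assumptions, since the real format of decode.log is not reproduced here, and `tally_errors` is a hypothetical helper name.

```python
from collections import Counter


def tally_errors(log_lines, error_marker="ERROR", not_found_marker="not found"):
    """Count error lines, split into 'not found' errors and everything else.

    Both marker strings are assumptions about the log format.
    """
    counts = Counter()
    for line in log_lines:
        if error_marker not in line:
            continue  # skip informational lines
        if not_found_marker in line.lower():
            counts["not found"] += 1
        else:
            counts["other"] += 1
    return counts


# Made-up sample lines for illustration only.
sample_log = [
    "INFO: decoding utterance 1",
    "ERROR: word 'alpha' not found",
    "ERROR: word 'beta' not found",
    "ERROR: something else failed",
]
print(tally_errors(sample_log))
```

A tally like this gives a quick count of how many errors are missing-word errors before looking into their causes.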


 * Plan:

4/9: Carry out my task to the best of my ability.

4/10: Checking in.

4/11: Checking in.

4/12: Prepare for class discussion about current issues with the result of the 256 train and how those issues can be addressed.
 * Concerns:

4/9: None today

4/10: Checking in.

4/11: Checking in.

4/12: No concerns

Week Ending April 21, 2015

 * Task:

4/19: Checking in

4/20: Checking in

4/21: Continue analysis of scripts, lang model, and sphinx config.


 * Results:

4/19: Checking in

4/20: Checking in

4/21: No hugely meaningful leaps in understanding, but I have been learning how to change more values.


 * Plan:

4/19: Checking in

4/20: Checking in

4/21: Try to fix errors in my 125 hour train attempt.
 * Concerns:

4/19: Checking in

4/20: Checking in

4/21: No concerns

Week Ending April 28, 2015
 * Task:

4/23: Continue attempting to get a better result.

4/24: Checking in

4/25: Checking in

4/26: Send results of literature search if applicable to current stage of project.


 * Results:

4/23: In process of running 125 hr train.

4/24: Checking in

4/25: Checking in

4/26: I didn't find as much as I would like about how to tweak the language models, but did find one article about optimizing HMMs when using Sphinx and a few articles about speech recognition software that were overviews of the factors that affect performance.


 * Plan:

4/23: Continue trying; see if a better result can be found through manipulation of weights, etc.

4/24: Checking in

4/25: Checking in

4/26: Continue lit search as well as attempt to finish 125 hour train with an effective result.
 * Concerns:

4/23: Time is running out. I feel like there is a lot that I still have to learn that would be beneficial in creating a better result. I would prefer a bottom-up approach to this project; however, the time constraint is pushing us into a top-down method, and we are limited to minor tweaks because of it. Admittedly there is a lot that we can do; I just feel slightly overwhelmed by the time crunch.

4/24: Checking in

4/25: Checking in

4/26: No concerns

Week Ending May 5, 2015

 * Task:

5/2: Checking in

5/3: Checking in

5/4: Make sure rough draft is in.

5/5: Prepare for next draft and correspond with group about final report for Pats group.


 * Results:

5/2: Checking in

5/3: Checking in

5/4: Finished summary section of the report, as tasked with.

5/5: Corresponded via email about the next steps with the report.


 * Plan:

5/2: Checking in

5/3: Checking in

5/4: Lay in wait to make final touches on report as we finish the real time factor for the effective 256 hour train.

5/5: I abandoned my attempted 125 hour train because we were running out of time and I kept running into an error when attempting to decode.
 * Concerns:

5/2: Checking in

5/3: Checking in

5/4: No concerns

5/5: No concerns; we now have a competitive 256 result.