Speech:Spring 2018 Hannah Yudkin Log



January 30th, 2018

 * Task:
 * The first task was to change our passwords and get familiar with logging into Caesar.
 * We discussed the various needs of the project and selected teams. Brian Barnes, Steve Thibault, and I will be in the Modeling Group and need to find a time in our combined schedules to work together.


 * Results:
 * I joined the modeling team and found material that will help us better understand the topics. I also spoke to previous students about how they approached the project.


 * Plan:
 * Individually, we all need to research how to train a model on Sphinx and look back at the previous data to see what previous students have accomplished. As a team, we need to establish goals and responsibilities for the group.


 * Concerns:
 * Personally, I am not as well versed in this material or process as other students. This means I need to spend more time becoming familiar with the process, which is why I wanted to be in the modeling group.

January 31st, 2018

 * Task:
 * Read logs.


 * Results:


 * Plan:


 * Concerns:

February 2, 2018

 * Task:
 * We need to become more familiar with the system and start running experiments. I also have to learn more about the projects and what has already been accomplished.


 * Results:
 * I have read the basic concepts behind speech recognition and have started to look into the data we will use in our experiments. I found these two sites to be helpful.
 * https://cmusphinx.github.io/wiki/tutorialam/
 * https://cmusphinx.github.io/wiki/tutorial/


 * Plan:
 * I plan to discuss my findings with my team on Sunday to develop a concrete plan for our next steps. We have all agreed to read the supplied material so we are all on the same page. I think it would be helpful to look at the advancements other models have used and see if we can apply those same concepts to Sphinx.


 * Concerns:
 * I am slightly concerned about how all the teams will be able to coordinate with each other in such a large class. We may need to select one individual from each group to represent the team, so we can all be working towards the same goals.

February 4, 2018

 * Task:
 * Read others' logs.


 * Results:


 * Plan:


 * Concerns:

February 8, 2018

 * Task:


 * Read Logs.


 * Results:


 * Plan:
 * Concerns:

February 10, 2018

 * Task:
 * Read logs.


 * Results:


 * Plan:


 * Concerns:

February 11, 2018

 * Task:
 * Meet with the modeling team to complete our section of the draft proposal. We also need to research ways of improving the model and plan out a course of action for the rest of the semester.


 * Results:
 * We were able to complete our section. Because there was a large emphasis on having a cohesive project proposal, one student offered to combine everything and reformat it so that it read as if it came from one connected team. That allowed our group to focus on planning, on understanding the material we already have, and on how we can build upon it.


 * Plan:
 * Our draft proposal needs to be more specific. With the feedback from the professor, we will be better able to decide on an area of focus and set more in-depth goals. We also need to meet with the other groups and create a strategy that allows each team to work separately but toward the same goal.


 * Concerns:
 * Although we have a somewhat general outline of our plan, we need to delve deeper into each of the proposed ways we can improve our model. There is a lot of material we need to sift through, and in order to make any improvement on the word error rate we need to better understand exactly what has been done before. We also need to understand the various techniques discussed well enough to make an informed decision about which direction to take the project.
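
For readers new to the metric, here is a minimal sketch of how word error rate is commonly computed: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the decoder's hypothesis, divided by the number of reference words. This is only an illustration in Python, not the scoring tool used in our experiments, and the function name and sample sentences are made up for the example.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference length.

    Illustrative helper only; real scoring tools also normalize text and
    report per-utterance alignments, which this sketch does not do.
    """
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference transcript must contain at least one word")

    # Word-level Levenshtein distance, keeping only the previous row.
    previous = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            current.append(min(
                previous[j] + 1,                      # deletion
                current[j - 1] + 1,                   # insertion
                previous[j - 1] + substitution_cost,  # match or substitution
            ))
        previous = current

    return previous[-1] / len(ref_words)


if __name__ == "__main__":
    # Made-up sentences: one substitution and one deletion against six words.
    wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
    print(f"WER: {wer:.3f}")
```

A lower value is better, and because insertions are counted, the rate can exceed 100% when the hypothesis is much longer than the reference.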

February 12, 2018

 * Task:
 * Celebrate birthday! Our group also needed to reformat our goals section to use bullet points so that it would conform to the format of the other groups' proposal sections.


 * Results:
 * Had fun.


 * Plan:
 * Meet with Steve on Monday to successfully train, decode, and score. Many members of the other teams were having issues running a train, and we needed to meet to see if we could all run it on our own computers.


 * Concerns:

February 13, 2018

 * Task:
 * Successfully run a train, decode, and scoring. I needed to create a new experiment because there have been issues with logging in and accessing things on the wiki. The experiment was successful.


 * Results:
 * Ran a train. Also, based on negative feedback from Professor Jonas, I have offered to reformat the proposal with Camden so that it meets the professor's clear instructions. We have decided to base ours on the Spring 2014 proposal. We also feel it would be very helpful for future students to have the terms used in the wiki clearly defined, so they don't have to waste the first few weeks struggling to understand what is being discussed.


 * Plan:
 * Meet with the Modeling Group this Sunday to rewrite our section of the plan and research more on the topics we plan on implementing. Camden and I decided on a new format for the proposal, and we have each group working separately on its section. We will then meet to combine all sections and add to anything that is lacking.


 * Concerns:
 * I am worried that everyone will struggle to complete the necessary work in time. I myself am struggling to determine my timeline because we have not decided which direction we will take the model.

February 17, 2018

 * Task:
 * We need to flesh out our proposal. We also need to determine the correct procedures for a train, decode, and scoring of unseen data.


 * Results:
 * We were able to restructure the overview of our proposal. We created a draft outline, a visual representation of the formatting, and instructions for the other teams.


 * Plan:
 * Individually, we need to work out exactly which member of our team is completing which task.


 * Concerns:

February 18, 2018

 * Task:
 * Reformat the group proposal so that the entire document has only one voice. We also wanted to create a glossary that explained any terms related to the content discussed in the proposal. It was necessary to have more depth, better descriptions of terms, more background information from previous semesters, and more specific timelines.


 * Results:
 * We were able to grow the proposal from 9 pages and 2,132 words to 22 pages and 7,893 words. I feel we were able to delve significantly deeper into all of the topics and create a glossary, which no teams have done in the past.


 * Plan:
 * Camden and I established an outline, which we shared with everyone last Tuesday, that we hoped would lead to clearer expectations and more detailed group and individual sections of the proposal. We asked the teams to have their sections completed by Saturday night so that we would have all day Sunday to read through and edit. We would then allow everyone to look at the final draft before we uploaded it to the wiki.


 * Concerns:
 * Something that I struggled with was trying to get specific teams to update their sections. Although we had created an outline with a visual and textual description, I don't think Camden and I had explained what we needed well enough. Certain teams did not have adequate background from previous semesters, or were missing individual goals or the necessary glossary terms. Camden and I tried to get in contact with the teams, and for the most part, we were able to have them fill in or replace the sections in question.

February 19, 2018

 * Task:
 * Read logs.


 * Results:


 * Plan:


 * Concerns:

February 20, 2018

 * Task:
 * Even though we submitted our proposal, Professor Jonas reviewed it again and gave each group feedback late Monday night. We have until Tuesday at 9pm to make any changes. About the modeling section, he specifically wrote: "Pointed out an error (a typo in 2017 final report) which caused you to draw incorrect conclusions and plans so you need to fix it. The current best result is 41.3% on full data (300hr) and anything moving forward needs to work with that difference between seen and unseen data as the former is just a sanity check and the latter is where we do research. You also seem to have an unfinished paragraph with regard to Torque as it just kind of sits there and you don't give any plans...perhaps it is in progress."


 * Results:
 * Reformatted the proposal for the third time. Professor Jonas emailed instructions to specific members of each group; however, it was very confusing trying to compile all the information. Also, many of the teams did not work on their sections until right before the deadline, and we ended up with a cluster fuck of messages on Discord. We then got ANOTHER extension for some reason, and the plans continued.


 * Plan:
 * We pulled everyone away from the group chat because it was just horrible, and the team leaders talked privately about what we needed to do by tomorrow. I became the pseudo-team leader of the proposal and asked the leaders to share all of their emails with Professor Jonas with each other so that we could be on the same page. We also planned on completing the sections by midday tomorrow.

February 21, 2018

 * Task:
 * We need to finish reformatting the proposal. The section I was directly responsible for editing was the table that lists the teams and group members. My previous version used HTML formatting, but because Professor Jonas likes me to suffer, I needed to reformat it to be wiki-specific. We also needed to link our names to our logs and reread for errors.


 * Results:
 * Because I am amazing and wonderful, the table is perfect and I can pat myself on the back because no one actually reads these. All future semesters have my express permission to copy my formatting and use it for their proposals. All sections of the proposal have fixed timelines, concise plans, and accurate and detailed backgrounds.


 * Plan:
 * Accept whatever grade we have received, because we all put a significant amount of time and energy into all three drafts.


 * Concerns:
 * Dan suggested an After Action Report to go over why we were less than successful when trying to enact our respective changes. I think this is great, because getting home to 200+ messages of people all freaking out was not ideal. We need to have a better-thought-out plan for our results paper.

February 23, 2018

 * Task:
 * Model Group first 300hr Train, Decode and Scoring (TDS): Conclude the experiment run on Majestix that commenced at 10:15 am Friday, 23 February. "nohup...." ran until after 11:00 pm Friday, 23 February.


 * Results:
 * Concluded Experiment 0303 044. The train completed on Majestix, but we will have to continue on another drone server because, per the Systems Group, "sclite" cannot be run on this drone server as it interferes with GCC. Edit: FATAL_ERROR: "mdef.c", line 680: No mdef-file


 * Plan:
 * We need to discover why we keep getting core dumps. It seems inconsistent which servers will produce a core dump in an experiment, because it has happened to Steve on different machines with the same parameters, as well as on the same machine with the same parameters. We don't know what the hell is going on. We need to be able to replicate the previous experiment done by 2017, because as it stands we aren't making any progress in understanding what any of this means.


 * Concerns:

February 25, 2018

 * Task:
 * Read logs. Investigate core dumps as well as our own failures.


 * Results:


 * Plan:
 * Concerns:

February 27, 2018

 * Task:
 * Create a new subfolder for modeling-specific experiments. Also, Professor Jonas explained that the way we were running experiments was flawed and that we did not understand what we were changing in our experiments well enough to draw any legitimate conclusions. We were tasked to copy an experiment recursively to see how 2017 got their best results.


 * Results:
 * We created a Main experiment for Spring 2018 Capstone Modeling Team students to run our train/decode jobs. Added a recursive copy (copy -r) of the best experiment of 2017 to the to-do list.


 * Plan:
 * We plan on successfully recreating that experiment, but as of right now we don't know the script that was run or the parameters used. So that's cool.


 * Concerns:
 * The documentation for the best experiment references an experiment that is NOT IN THE WIKI, which is super fun, so we have to figure out how we're gonna make that happen. We've read the logs of those who ran the experiment, but they are less than helpful, so we can't depend on them for much.
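
As a rough illustration of the recursive copy we were asked to do, here is a small Python sketch. It is only one way to do it, not the procedure from the wiki; the function name and the directory names in the demo are hypothetical and the demo runs in a throwaway temporary folder. I am assuming an experiment is just a directory tree that may contain symbolic links to shared data, which the copy keeps as links instead of duplicating the data.

```python
import shutil
import tempfile
from pathlib import Path


def copy_experiment(source: Path, destination: Path) -> Path:
    """Recursively copy one experiment directory to a new location.

    Hypothetical helper for illustration. It refuses to overwrite an
    existing destination so an earlier experiment is never clobbered.
    """
    if not source.is_dir():
        raise FileNotFoundError(f"no experiment directory at {source}")
    if destination.exists():
        raise FileExistsError(f"{destination} already exists")
    # symlinks=True copies links as links rather than following them
    # (assumption: experiments may link to large shared corpora).
    shutil.copytree(source, destination, symlinks=True)
    return destination


if __name__ == "__main__":
    # Demo with made-up names inside a temporary directory.
    with tempfile.TemporaryDirectory() as tmp:
        original = Path(tmp) / "best_2017_experiment"
        (original / "etc").mkdir(parents=True)
        (original / "etc" / "train.cfg").write_text("placeholder settings\n")

        copied = copy_experiment(original, Path(tmp) / "modeling_copy")
        print(sorted(p.relative_to(copied).as_posix() for p in copied.rglob("*")))
```

Copying the whole tree keeps the configuration next to the results, which matters here because we do not yet know which script or parameters produced the 2017 result.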

March 1, 2018

 * Task:
 * Read logs.


 * Results:


 * Plan:
 * Concerns:

March 3, 2018

 * Task:
 * Read logs. Again?


 * Results:


 * Plan:
 * Concerns:

March 4, 2018

 * Task:
 * Read logs.


 * Results:


 * Plan:


 * Concerns:
