Speech:Spring 2018 Daniel Beitel Log



Week Ending February 5th, 2018
 * Task:

30/JAN/2018- The task for today was to be assigned to a group within the main project. Once the groups were made, the mission was to meet with our group members and introduce ourselves. The group we were placed in would shape the course of our research and our role within the project. One of our three group members was not in class, so they needed to be notified and updated about what group they were in. Lastly, we as a group needed to discuss the game plan moving forward and how we would communicate outside of class. This is crucial because it will impact the project proposal that is due in a few weeks. After class, writing my first log was important in order to capture what happened during class.

01/FEB/2018- There were a few tasks that needed to be completed today. The first was trying to get hold of our missing group member. Contact was made and I will be in touch with them later tonight. I also met with my other group member to sift through the experiments done in previous semesters to try and understand what they did. Looking at the Wiki, I found the page under the Information tab that describes how to set up an experiment. We are also looking into how to move files into our own individual directories.

02/FEB/2018- The task for today was to continue to add people to the Discord server I created. I also wanted to continue reading files from previous experiments, such as the http files and perl files, to try and understand how to set up an experiment. I wanted to read through different experiments to find the files they have in common. I also reread the wiki pages on how to set up an experiment and on the data set-up.

03/FEB/2018- My task today was to figure out how to run an experiment. https://foss.unh.edu/projects/index.php/Speech:Run_Train_Setup_Script is the wiki page that spells out how to set up and run an experiment. I created a sub-directory within my personal experiment directory. The first step was makeTrain.pl switchboard 30hr/train, which added all the files needed to start an experiment. The next command was makeTrain.pl -t switchboard 30hr/test, which helped set up the directory. I followed this up by running genFeats.pl -t in my experiment sub-directory, and it worked. Finally, I ran nohup scripts_pl/RunAll.pl & and this began the train.

 * Results:

30/JAN/2018- The result of the group placement was that I became a member of the Experiment group. This means that going forward I will go through what has been done during previous semesters and see how we as a group can improve on it. This most likely means close interaction with the modeling group and possibly the programming group during certain aspects of the class. I am not quite sure what that may look like, but once I complete more research it will become clearer. Since one member of our team was not present for class, I found and emailed them through Mycourses to make sure they knew what group they belonged to. We also made a decision as a group to use the app Discord for communications. This was set up and tested amongst the members present for class. The missing member was told in the email how to gain access to it. We discussed that we would be in contact with each other throughout the week, and in a few days we will discuss any new ideas that arise during research. Researching the system and getting an idea of its current state will shape our project proposal contributions.

01/FEB/2018- I made contact with the missing member and will follow up with them tonight. The meeting with my other group member was good, and we also worked with members of other groups to make sure we all had an understanding of how to use certain tools. A game plan for the weekend is currently being worked on and experiments will be created shortly.

02/FEB/2018- I have gotten all of my group members onto Discord. The missing member was informed of their role in the project and the objectives laid out so far. I now also have a Discord server created with members from all 5 groups within capstone. This will help keep everyone on the same page. I am still unsure exactly how to set up an experiment. I understand that certain things are required, but the wiki is providing very limited answers. I also installed Filezilla and learned how to use it with Caesar. It is a great tool. Other members are confused about the experiment set-up, so that is something I need to continue to research. I will also continue to add members of the capstone class to Discord as needed.

03/FEB/2018-The test ended very quickly. At phase 7 I got a statement that said Something failed at the verify/verify_all.pl which then ended the test. A file named 001.html was then created within the test sub-directory.

 * Plan:

30/JAN/2018- The plan is to establish communications with all team members, then research and explore more of previous semesters' experiments and gain an understanding of how and why they work or don't work. Having a good communication plan will be very important for overall group success.

01/FEB/2018- Continue to learn how the experiments work and try and get one completed this weekend.

02/FEB/2018- Continue to research and learn the requirements to run an experiment. Continue to work with other members to further the project along.

03/FEB/2018- The plan moving forward is to understand how I got the result I did and then change/improve it to get different results and have a better test.

 * Concerns:

30/JAN/2018- None so far, we have a lot of work to start getting up to speed on.

01/FEB/2018-None so far.

02/FEB/2018- Trying to find out what is needed specifically to run an experiment.

03/FEB/2018- Nothing as of right now

Week Ending February 12, 2018
 * Task:

06/FEB/2018- The task for today was to come to class and compare results of the train that I attempted over the weekend. Once we had compared results, the task became how to replicate a train being done correctly. This involved going through previous semesters' logs and the wiki page on how to run a test. On the wiki page, some statements about preparing to run a train are confusing, leading some of us to run unnecessary commands that resulted in bad trains. As with prior weeks, I continue to try and reach out to our missing team member, and I continue to get the entire class connected with group communications.

08/FEB/2018- The task for today was to continue to add members of the class to Discord until the entire class has the ability to communicate in one central place. I also wanted to reach out to our missing group member to see if they were still participating in the project. Our group's individual proposal, which needs to be completed by Sunday, needed to be worked on. Learning how to set up a language model and use the decoder was also on my list of things to accomplish today.

09/FEB/2018- The task for today is to create the language model and run the decoder. Once that is completed I will go through any errors that come up. If I do receive any errors, I will bring them up in the Discord server to see if they are similar to errors other people have received doing the same task, and troubleshoot with them.

10/FEB/2018- The task for today was to run another train and do another Language Model/Decode session to see if the errors in my first attempt were fixed or if they occurred again. The result will determine what I work on next.

 * Results:

06/FEB/2018- The results were that I had indeed run the train incorrectly. The correct commands to create and run a train are as follows:
1. Create a sub-directory in your test folder and cd into it. (To gain access to a drone server, type "ssh servername" to switch from Caesar to one of the drones, then cd into main/Exp/0303/yoursubdomain.)
2. Run makeTrain.pl switchboard 30hr/train
3. Run genFeats.pl -t
4. Run "top" to make sure no one else is running a train on the server you wish to use.
5. Run nohup scripts_pl/RunAll.pl &
This will result in a train. Numerous people have now gotten the same result without errors ending the train, and that is a good thing. I also got most of the class into a Discord server, subdivided by groups.
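The train sequence above can be written down as data so it can be reviewed before anything is typed on a drone. This is only a minimal sketch: the script names and arguments are the ones recorded in this log, while the helper names `train_commands` and `render` are hypothetical and nothing here actually launches a train.

```python
import shlex


def train_commands(corpus="switchboard", subset="30hr/train"):
    """Return the train steps from the log entry, in order, as argument lists."""
    return [
        ["makeTrain.pl", corpus, subset],   # step 2: populate the experiment directory
        ["genFeats.pl", "-t"],              # step 3: generate features
        ["nohup", "scripts_pl/RunAll.pl"],  # step 5: start the train (run with a trailing & in the shell)
    ]


def render(commands):
    """Join each argument list into one shell-style line for review."""
    return [shlex.join(cmd) for cmd in commands]


if __name__ == "__main__":
    for line in render(train_commands()):
        print(line)
```

Steps 1 and 4 (changing into the sub-directory and checking "top") are left out on purpose, since they depend on which drone and directory are in use.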

08/FEB/2018- The entire class is now on the Discord server I created. Each member is included in a general class thread and also broken up by group. I was also included in a leadership thread, which has a member from each group, so we can ensure the project is a success. As a leadership group we determined that each group proposal is due by Sunday at 6pm, so that another member can view it, make changes, and create a unified group proposal. This helps us with accountability. I have reached out to our missing member yet again and have gotten zero response. Talking with other members of the project, I got the information needed to run a language model and decode. I also have access to their results to compare against once I complete mine. I will be running mine on Friday morning and have offered to go through it with anyone else who wants to do it tomorrow.

09/FEB/2018- This is based on my work and other team members' work, and is the way to create the language model and run the decoder as we know it right now. I have found that certain aspects of the Wiki pages do not work the way they are explained.

The first step in this process is to set up the Language Model. The correct way is to create the LM in your sub-experiment folder. The Wiki has it listed as creating it in the overall class experiment folder, which would not work. So my directory looked like this: /mnt/main/0303/002 (002 being my sub-experiment folder). Then I created the LM folder within that, so if I ls -al within my 002 sub-experiment folder I would see my 001 experiment directory and the LM directory. You then cd into the LM folder. The rest of the commands on the Wiki worked just fine in creating the information needed within the LM folder.

Next I moved on to the Decode portion. cd out of your LM folder, cd into the experiment folder in your sub-experiment folder that you wish to run the decoder on, and then cd into the etc folder located there. The next step on the Wiki requires some modification depending on the hours you used for your specific experiment. On the advice of other classmates, for the first step I used the third example to set mine up. Mine ended up looking like awk '{print $1}' /mnt/main/corpus/switchboard/30hr/test/trans/train.trans >> /mnt/main/Exp/0303/002/001/etc/002_decode.fileids. Like most others in the class, I used the default value of 1000 for the senone count, so my command ended up being nohup run_decode.pl 0303/002/001 0303/002/001 1000 &. Once you execute this, run ls -al and you will see that it makes a decode.log in your etc directory. I used Filezilla to make a copy of this on my local machine.

Once the decode is done, it is time to do the scoring. When I ran parseDecode.pl decode.log hyp.trans I received the common error rm: cannot remove '../etc/hyp.trans': No such file or Directory. This happened to others in the class as well, so I went on to the next step. I ran the command sclite 001_train.trans -h hyp.trans -i swb >> scoring.log, which resulted in a Segmentation fault (core dumped). My hyp.trans file is empty and my scoring file states "Illegal argument: -hyp.trans", so I will try to understand why that happened.
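For anyone unsure what the awk step does, it keeps only the first whitespace-separated field of every transcript line and appends the result to the fileids file. A rough Python equivalent is sketched below, assuming plain-text input; the function names are hypothetical and the sample lines in the demo are made up for illustration, not taken from the corpus.

```python
def first_fields(lines):
    """Mimic awk '{print $1}': the first whitespace-separated field of each line."""
    out = []
    for line in lines:
        fields = line.split()
        # awk prints an empty line when a record has no fields
        out.append(fields[0] if fields else "")
    return out


def append_fileids(trans_path, fileids_path):
    """Append the first field of every transcript line, like the shell's >> redirect."""
    with open(trans_path, encoding="utf-8") as src:
        ids = first_fields(src)
    with open(fileids_path, "a", encoding="utf-8") as dst:
        for file_id in ids:
            dst.write(file_id + "\n")


if __name__ == "__main__":
    # Hypothetical sample lines, only to show the behaviour.
    print(first_fields(["utt_0001 hello there", "utt_0002 okay"]))
```

Because the file is opened in append mode, running it twice duplicates the entries, which is also true of the original >> redirect.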

10/FEB/2018- The result was a successful train and the exact same errors as before. Opening up the decode.log file, there are two things that stand out. The first error in the decode.log is: INFO: kbcore.c(442): Begin Initialization of Core Models: ERROR: "cmd_ln.c", line 724: Cannot open configuration file /mnt/main/Exp/0303/002/model_parameters/002.cd_cont_1000/feat.params for reading. What seems to be happening is that it cannot open the configuration file located in my overall sub-experiment folder. This could be due to bad information on the Wiki or a missing file; either way it is something I will be looking into. The second error in the decode.log is FATAL_ERROR: "mdef.c", line 680: No mdef-file. This could be linked to the error above; I do not know yet. As for my scoring.log, it is completely empty this time. The first one I ran showed that I had used an improper argument, which is what led me to believe I had done something the wrong way.

****As an update, I have successfully run a train/LM/decode. The problem ended up being that the train and all the associated files had to be created in my Exp/0303/002 folder. I was creating a sub-directory inside my folder (Exp/002/001). This left the scripts unable to find the files they needed for a successful run. So the train ends up being run directly inside your sub-directory folder.
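Reading a long decode.log by eye is slow, so a small filter that keeps only the error lines can help. The sketch below assumes that error lines carry an "ERROR:" or "FATAL_ERROR:" tag, as the two messages quoted above do; the function name is hypothetical and other tags may exist in real logs.

```python
import re

# Tags taken from the two messages quoted in this entry.
ERROR_TAG = re.compile(r"\b(?:FATAL_ERROR|ERROR):")


def error_lines(log_text):
    """Return the stripped lines of a log that contain an error tag."""
    return [line.strip() for line in log_text.splitlines() if ERROR_TAG.search(line)]


if __name__ == "__main__":
    sample = "\n".join([
        "INFO: kbcore.c(442): Begin Initialization of Core Models:",
        'ERROR: "cmd_ln.c", line 724: Cannot open configuration file '
        "/mnt/main/Exp/0303/002/model_parameters/002.cd_cont_1000/feat.params for reading",
        'FATAL_ERROR: "mdef.c", line 680: No mdef-file',
    ])
    for line in error_lines(sample):
        print(line)
```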

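,-----------------------------------------------------------------.
|                            hyp.trans                            |
|-----------------------------------------------------------------|
| SPKR    | # Snt # Wrd | Corr    Sub    Del    Ins    Err  S.Err |
|---------+-------------+-----------------------------------------|
| sw2001b |    1      3 |100.0    0.0    0.0   66.7   66.7  100.0 |
|---------+-------------+-----------------------------------------|
| sw2005a |    2     42 | 76.2   11.9   11.9    7.1   31.0  100.0 |
|---------+-------------+-----------------------------------------|
| sw2006b |    1     29 | 69.0   20.7   10.3    3.4   34.5  100.0 |
|---------+-------------+-----------------------------------------|
| sw2007b |    2     39 | 89.7   10.3    0.0    2.6   12.8  100.0 |
|---------+-------------+-----------------------------------------|
| sw2007a |    1      7 | 57.1   28.6   14.3    0.0   42.9  100.0 |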

 * Plan:

06/FEB/2018- Work with others to make sure they understand how to get to this point. Compare train results to previous years and continue to improve group communications. Will try to get the rest of the class onto the Discord server. Look into the decoder. I linked our names to our individual logs on the Experiment Group page on the wiki. We will also update our group tasks.

08/FEB/2018- The plan for tomorrow morning is to run the language model and decode and see what I get as a result. I have created the (empty) LM dir to get ready for the morning. If the result looks usable, I will move on to the objectives talked about on our page and explained to us by Professor Jonas. If the result is an error, I will look through what went wrong and work with other members to understand what was supposed to happen. I will also make sure our group's proposal is on track.

09/FEB/2018- My plan for today is to go through the results of my Language Model set-up and decoding and try to figure out why I got the results that I did. I may also try to run another experiment this weekend once I get some additional answers. I will also email Jonas about our missing team member and the plan for getting a replacement if necessary. I will also make sure our group's project proposal gets finished to turn in at 6pm Sunday.

10/FEB/2018- The plan for tomorrow is to look into the model_parameters and the No mdef-file aspects of my errors. I believe this is why I am not getting a result from the decode scoring. I have passed on my findings to others in the class in the hope that someone stumbles upon the answer to why this is happening. ****As an update, the plan will be to look into how to now move on to running experiments and how we as a group can improve on future experiments.

 * Concerns:

06/FEB/2018- None so far.

08/FEB/2018- My concern at this point is that we are already the smallest group in the class and we have one member who has not shown up or messaged us for guidance. If they do decide to participate, I will explain what to do, but they will be in charge of getting their own work done.

09/FEB/2018- Getting a hold of our MIA group member. If that person drops the class getting a third member is critical.

10/FEB/2018- None as of tonight.

Week Ending February 19, 2018

 * Task:

13/FEB/2018- The task for today was to make sure that the entire Experiments group had run a successful Train/Language Model/Decode and scoring and understood the correct process to do this. I met with other members of the class before class started to help them do the same. Being a member of the Experiment Team, I felt obligated to help people through this process, as some of the Wiki documentation is counter-productive. I also documented the entire process step by step in the Experiments/0303/002 section of the Wiki as a reference for the rest of this class and for future capstone classes. We also had to start working on our most important task, fixing the addExp.pl file to the specifications outlined by Professor Jonas. We also welcomed our newest member to our group and got them up to speed on what we need to accomplish.

14/FEB/2018- The task for today was pretty simple. I created a Discord channel just for group proposal work, and I reached out to the programming group with a few questions about working on some of our experiment work and what to look for while testing it. I am also preparing to start our group's rework of the proposal.

15/FEB/2018- The task for today was to run an entire experiment, starting with the addExp.pl -s step and going through the entire process of creating a 30 hr train, language model, and decode. I also wanted to pay attention to the issues that Professor Jonas laid out for us.

18/FEB/2018- The task for today was to learn how to run a train/LM/decode on unseen data. The directions on the Wiki and from prior semesters are not entirely clear, so documenting my steps as I go is very important for this. I will be working with the modeling group on this, as both our groups have a stake in getting this up and working. I was also on standby to help edit the class group proposal. I also started work on addExp.pl, seeing where I could add the auto-increment feature. Earlier in the day I documented my entire process of running addExp.pl when creating an experiment.


 * Results:

13/FEB/2018- The entire Experiment group has run a successful Train/LM/Decode and the process has been written down and documented. I also helped fellow classmates through this process before class and got most of them through it (minus one who was having computer issues). I finished putting the steps down in my Experiments/0303/002 Wiki page and have pointed classmates who have not finished their train/LM/decode to my instructions. We copied the addExp.pl file to each of our computers and have started to go through it in order to debug it. $temp = <>; at line 81 is the line that Professor Jonas stated was causing issues with functionality, so we will be exploring that. Building or fixing an auto-increment feature in this file is also an important task.

14/FEB/2018- I created the Discord channel and got it set up. I am waiting to hear back from the programming group, most likely tomorrow. I will be working on the proposal later tonight. I will compare the proposals from past semesters, most specifically Sp14 and Sp15, with the layout proposed by classmates in order to make the best, most precise proposal possible.

15/FEB/2018- I had a successful 30 hr train and documented it in my Experiments/0303/022 folder. I am unsure if I even noticed the waiting that Professor Jonas stated is caused by line 81 in addExp.pl -s as the whole thing went smoothly. I do know where the auto-increment should be and this will be a focus as we move forward. It is annoying to actually have the Wiki open and be looking at what # dir you have to make.
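The auto-increment idea can be illustrated outside of Perl. The sketch below is not how addExp.pl works; it only shows one way to pick the next number from the directory names that already exist, assuming experiment directories are zero-padded numbers such as 0303 (or 001 for sub-experiments). The function name and the `width` parameter are hypothetical.

```python
import re


def next_experiment_id(existing, width=4):
    """Return the next zero-padded ID after the highest purely numeric name."""
    numbers = [int(name) for name in existing if re.fullmatch(r"\d+", name)]
    return str(max(numbers, default=0) + 1).zfill(width)


if __name__ == "__main__":
    # In practice the names could come from os.listdir() on the experiment root.
    print(next_experiment_id(["0301", "0302", "0303", "notes"]))
    print(next_experiment_id(["001", "002"], width=3))
```

Non-numeric names are skipped so stray files do not break the count. Two people creating an experiment at the same moment could still get the same number, so a real version would need some form of locking or a check against the wiki.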

18/FEB/2018- As of the time of writing this, I keep receiving a core dump after running through the entire process of testing unseen data. I restarted the entire process in the hope that this time it will work. I documented the functionality of addExp.pl and found the error that Professor Jonas was referring to. The error I keep seeing in my decode.log is SYSTEM_ERROR: "lm_3g_dmp.c", line 1272: fopen(/mnt/main/Exp/0303/030/LM/tmp.arpa,rb) failed ; No such file or directory, so I am remaking that entire directory and starting over. I also helped work on the group proposal in order to have the best document to submit to Professor Jonas. I posted my work in progress on addExp.pl to the sandbox so my teammates could take a look at it. As an update, I successfully ran a train/LM/decode on unseen data, which was put into the 0303/030 directory on the Wiki.
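Since both decode failures in this log came down to a file that could not be opened, a quick check before starting the decoder might save a run. This is a hedged sketch: the two paths follow the pattern seen in the error messages quoted in this log (LM/tmp.arpa and model_parameters/<id>.cd_cont_<senones>/feat.params), and it is an assumption that every experiment uses the same layout. The function name is hypothetical.

```python
import tempfile
from pathlib import Path


def missing_inputs(exp_dir, exp_id, senones=1000):
    """List the expected decoder inputs that are not present under exp_dir."""
    exp_dir = Path(exp_dir)
    required = [
        # Path patterns assumed from the errors quoted in this log.
        exp_dir / "LM" / "tmp.arpa",
        exp_dir / "model_parameters" / f"{exp_id}.cd_cont_{senones}" / "feat.params",
    ]
    return [str(path) for path in required if not path.is_file()]


if __name__ == "__main__":
    # Demo on an empty temporary directory: both files are reported missing.
    with tempfile.TemporaryDirectory() as tmp:
        for path in missing_inputs(tmp, "030"):
            print("missing:", path)
```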

| Sum/Avg |  15    275 | 76.7   18.5    4.7    4.4   27.6   86.7 |
|=================================================================|
|  Mean   |  1.3   22.9 | 72.6   23.6    3.8   10.0   37.4   83.3 |
|  S.D.   |  0.5   25.9 | 23.0   22.8    5.5   19.1   27.5   38.9 |
| Median  |  1.0   12.0 | 72.6   16.3    0.0    2.8   36.0  100.0 |
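As a sanity check on the summary row, the Err column is the sum of the Sub, Del and Ins percentages (the word error rate), which is why it can exceed 100 minus Corr when there are insertions. The sketch below parses a Sum/Avg row and confirms that relationship for the numbers above; the parsing is a simple assumption about the row layout and the function name is hypothetical.

```python
def parse_sum_avg(row):
    """Pull the eight numeric fields out of an sclite-style Sum/Avg row."""
    tokens = row.replace("|", " ").split()
    numbers = [float(tok) for tok in tokens if tok.replace(".", "", 1).isdigit()]
    keys = ["snt", "wrd", "corr", "sub", "del", "ins", "err", "s_err"]
    return dict(zip(keys, numbers))


if __name__ == "__main__":
    row = "| Sum/Avg |  15    275 | 76.7   18.5    4.7    4.4   27.6   86.7 |"
    result = parse_sum_avg(row)
    wer = result["sub"] + result["del"] + result["ins"]
    # Sub + Del + Ins should match the reported Err, up to rounding.
    print(round(wer, 1), result["err"])
```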


 * Plan:

13/FEB/2018- The plan is to increase our knowledge of the scripting language Perl as we go through the addExp.pl file and make improvements. Auto-increment is another desired feature that we will explore. We also have to work on our draft of the class group proposal and make changes outlined for us in class. That is due Saturday night and will be worked on during the week.

14/FEB/2018- After hearing back from the programming group, we can start to set up how to test the changes we plan to make to the addExp.pl file once we come up with changes to try. Continuing to learn the Perl language is also on the list, as is working on the proposal.

15/FEB/2018- The plan is to continue working on the group proposal. We also have to continue to look at how addExp.pl is working and fix the auto-increment aspect and make it work better. Now that I know what to look for and how it impacts the quality of running an experiment, we can make it better.

18/FEB/2018- The plan is to continue to work on getting this test on unseen data to work and to document it. Then it's on to the continued work on the addExp.pl script with my teammates.


 * Concerns:

13/FEB/2018- There are no concerns at this time.

14/FEB/2018- None at this time

15/FEB/2018- None at this time

18/FEB/2018- None unless I run into another core dump later today.

Week Ending February 26, 2018

 * Task:

20FEB2018- The task for today was to meet in class, give our status update, fix the class proposal, and continue on with our work. Today seemed to be more of an administration day which is ok and will always be a part of large projects. Another key aspect for today was to continue to work on digging into two scripts we found from last semester to see if they are a viable tool to use.

23FEB2018- The task for today was to continue my Udemy course on Perl in order to better understand the language. Once completed, I will have a much better understanding of what our scripts are doing, as they are written in Perl, and I will be able to make better, more knowledgeable suggestions for fixing the scripts in the project. Since one of our issues is making addExp.pl a better, bug-free script, I felt that it was necessary to take a short course in order to further my understanding of the language.

24FEB2018- The original task for today was continued work on learning Perl and then working on addExp.pl and debugging it. Starting last night, Steve from the modeling group began to experience core dumps when trying to run an LDA on unseen data. This happened on multiple attempts, leading to my involvement in troubleshooting. My task then became reviewing the errors and scripts from Steve's attempts to see if it was an issue with the documentation on executing a test on unseen data or an issue involving the LDA process and its documentation. After going through the decode.logs, finding the errors, and seeing a pattern, I ran my own 5hr train/decoding/scoring on unseen data to rule that out.

25FEB2018- The original task for today involved working on addExp.pl and continuing to learn Perl. Instead, the modeling group found an error while trying to run a train on the Majestix server: "sclite command not found". The question was presented to me, and my task changed for today. Since sclite is the scoring mechanism for this project, this had to get resolved asap. A more long-term task is helping the modeling team run a train with an LDA, which still hasn't been successful.


 * Results:

20FEB2018- The result was a class spent learning about the direction that the different groups were going in. We gave our status update: working on fixing addExp.pl and trying out two other scripts from last semester. Once we figure out if they are usable, I will add them to my blog and also other places in the Wiki. They seem to be tools to auto-create an experiment instead of running through all the steps that we all have been so far. I just don't want people to use something that we have not gone over yet in case it is incomplete. If they are complete, we will use one of them to auto-create a train on unseen data if time allows.

23FEB2018- While not done with my Perl course, I have learned a lot about the language and scripting with it. Once complete, I hope to make addExp.pl and further scripts more efficient and better written because it seems to me there are some bugs in some of the scripts. After completing this I will also know what to look for when debugging because I will have a working knowledge of how the language works. Once I have completed this course and we have made some changes to the scripts, we will get with the systems group and designate a time to test these scripts out on Caesar to test if we corrected the issues properly.

24FEB2018- The results were that I found these specific errors in the 032/046/047 decode.logs from Steve's attempts: - decode.log(032)-ERROR: "cmd_ln.c", line 724: Cannot open configuration file /mnt/main/Exp/0303/031/model_parameters/031.mllt_cd_cont_1000/feat.params for reading -scoring.log empty(expected to be)

- decode.log(046)- ERROR: "cmd_ln.c", line 724: Cannot open configuration file /mnt/main/Exp/0303/031/model_parameters/031.mllt_cd_cont_1000/feat.params for reading -scoring.log empty(expected to be)

- decode.log(047)- /usr/local/bin/sphinx3_decode: error while loading shared libraries: libs3decoder.so.0: cannot open shared object file: No such file or directory - scoring.log empty(expected to be)

I looked into the directories that are indicated in the errors and found that files with those exact names do not exist in Steve's directories. There are names that are similar, but not the same. So I decided to run my own 5hr train/decode/scoring on unseen data to rule out an error in that process. The result was a successful test on unseen data, as logged in experiment /mnt/main/Exp/0303/049:
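One detail visible in the messages above is that the decode.logs for 032 and 046 both name a path under 0303/031. Whether that mismatch is the actual cause is not established here, but it is easy to check mechanically. The sketch below pulls the path out of a "Cannot open configuration file" message and reports which experiment directory it points at; the function name and the `root` default are assumptions based on the paths quoted in this log.

```python
import re

CONFIG_ERROR = re.compile(r"Cannot open configuration file (\S+) for reading")


def referenced_experiment(error_line, root="/mnt/main/Exp/0303/"):
    """Return the experiment directory named in a config-file error, or None."""
    match = CONFIG_ERROR.search(error_line)
    if not match or not match.group(1).startswith(root):
        return None
    # First path component after the root is the experiment number.
    return match.group(1)[len(root):].split("/")[0]


if __name__ == "__main__":
    line = ('ERROR: "cmd_ln.c", line 724: Cannot open configuration file '
            "/mnt/main/Exp/0303/031/model_parameters/031.mllt_cd_cont_1000/feat.params for reading")
    decoding_in = "032"
    found = referenced_experiment(line)
    print(found, "matches" if found == decoding_in else "differs from", decoding_in)
```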

,-----------------------------------------------------------------.
|                            hyp.trans                            |
|-----------------------------------------------------------------|
| SPKR    | # Snt # Wrd | Corr    Sub    Del    Ins    Err  S.Err |
| Sum/Avg |  65   1054 | 69.4   23.0    7.6    6.6   37.2   89.2 |
|=================================================================|
|  Mean   |  1.3   21.1 | 72.8   21.1    6.1   13.1   40.3   87.0 |
|  S.D.   |  0.5   19.3 | 21.7   19.0    9.7   21.5   26.2   33.2 |
| Median  |  1.0   15.5 | 72.7   20.0    1.1    3.7   37.8  100.0 |
`-----------------------------------------------------------------'

This has led me to believe that an error in the documentation for creating an LDA experiment is the cause of the errors and core dumps that Steve experienced. I will be following up and then looking into that whole process in order to make a better, clearer standard for running an LDA. I will work with the Modeling group to clean this up.

25FEB2018- The result was that the pointer for sclite on Majestix back to Caesar was wrong. I had reached out to the Systems group because, after going through some of the directories, it seemed possible that the program was not installed. The Systems group resolved the issue and issued this statement about what the issue was: there was no symbolic link back to /mnt/main/local. To be clear. The drive /mnt/main/local was mounted, however; /usr/local should be symbolic linked back to /mnt/main/local as well. It was not. I created the link - and it now works. Basically, the installation of SCLite is on Caesar at /mnt/main/local/usr/local/bin/sclite. Because /usr/local/ wasn't connected, it couldn't find it.

Before I was done with the Systems group, I also went through and made sure all the drone servers had their pointers working by running the command sclite -v, which shows what version of sclite is available.

I found that all the other drones were ok, and Systems resolved the issue on Majestix. I pushed off work on the LDA because researching and resolving the issue above was my main focus.
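The same check could be scripted so it does not depend on someone remembering to try sclite on every drone. The sketch below uses only standard-library calls to test whether a path is a symbolic link that resolves to an expected target and whether sclite is found on the PATH. The function names are hypothetical, and the /usr/local and /mnt/main/local pair in the comment comes from the Systems group's statement.

```python
import os
import shutil
import tempfile


def link_resolves_to(link_path, expected_target):
    """True if link_path is a symlink whose resolved path equals expected_target."""
    return (os.path.islink(link_path)
            and os.path.realpath(link_path) == os.path.realpath(expected_target))


def sclite_on_path():
    """Return the full path of sclite if the shell would find it, else None."""
    return shutil.which("sclite")


if __name__ == "__main__":
    # On a drone the call would be link_resolves_to("/usr/local", "/mnt/main/local").
    # The demo below uses a temporary directory so it can run anywhere symlinks work.
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "target")
        link = os.path.join(tmp, "link")
        os.mkdir(target)
        os.symlink(target, link)
        print(link_resolves_to(link, target))
    print(sclite_on_path())
```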


 * Plan:

20FEB2018- The plan is for continued work on addExp.pl. I believe that another semester created some sort of variable call which isn't working, and that this is creating the error Professor Jonas described. Professor Jonas seems to think it's Red Hat not working with Perl. We are looking into that. We will also test the two scripts found from last semester to see if they work and make experiments easier to run; if so, they will be pushed out to the class and documented.

23FEB2018- The plan this weekend is to complete this course, continue work on the scripts, and also work with modeling to see how to set up an LDA experiment. They seem to keep getting core dumps which could mean a few different things so that is going to be my plan for the weekend.

24FEB2018- The plan is to work with the modeling group in order to get them through an LDA successfully and clean up the documentation involved, because as of right now there seem to be some unclear directions that are leading to unsuccessful LDAs. I will need to gain a better understanding of what an LDA is and what it is doing when it is being tested.

25FEB2018- I will continue researching and working on addExp.pl. I will also continue to learn how to run a successful LDA with the modeling group and update the documentation so that more people can be successful. The fact that we were unable to run sclite on Majestix up until now means that this issue had probably not been reported, which is a problem in itself, and that we have not been working at full capacity.
 * Concerns:

20FEB2018- None as of right now

23FEB2018- None at this time.

24FEB2018- Getting the Modeling group through a successful LDA.

25FEB2018- Same as above.

Week Ending March 5, 2018

 * Task:

27FEB2018- The task for today was to go to class, give our status update, and work together as a group to further our work along. I also wanted to pay attention to what was discussed when it came to running LDAs. We also came into class to find that we needed to prepare and register for the URC, and being labeled as a leader, I had the job of registering my group. I was also tasked to help write the proposal for the CCSCNE presentation that was discussed in class. Lastly, we needed to create our own experiment for our experiment group.

02MAR2018- The goal today was to continue learning Perl, look through prior semesters' work for any mention of whether auto-increment was taken out of addExp and why, and continue looking at ways to improve addExp.pl. One of the reasons for learning Perl from outside sources such as Udemy is that some of the scripts seem not to have been written well, leading to some of the weird issues (that is my opinion). Since a lot of the Perl code is written in a not very intuitive way, it is our task to make it better.

03MAR2018- The task today was to go through the createExp.pl script and try to get it to run. According to Professor Jonas, this script can be run and it will prompt the user with certain questions and create an entire experiment without the user having to go through it step by step. This will be very useful, as it will cut down on human error when creating experiments.

04MAR2018- The task for today was to go through the createExp.pl script and continue to debug and rebuild it, in order to create a go-to script that lets users create trains/experiments without having to input a dozen or more commands. Running the script would require minimal human input, thus cutting down on human error.


 * Results:

27FEB2018- The result of the day was a successful class and status update. Professor Jonas was happy to find that we had fixed the broken aspect of addExp.pl. We have many other things to work on, but this was a successful start. I paid close attention to the modeling group's status update and to the discussion of LDA experimentation. Once they have looked into that further, I can offer my help in running LDAs if they need it. There is a lot to look into as far as which drones are set up to do what. Documentation from prior semesters seems to be very poor, and finding answers is not easy. Once this is taken care of we can run these experiments better. I got our group registered for the URC, and that is all set. I also stayed and helped get the CCSCNE abstract created, proofread, and completed. Our group tried to make our own experiment directory. We ran addExp.pl -r to create experiment #0307. When this completed, we found that another group had already made the actual directory on Caesar without creating the wiki entry first, doing the steps out of order. This is another example of people not following the SOP for procedures already in place.

02MAR2018- The result was that I learned that this script and some of the others we have been working on were not documented well. Going through this and fixing it remains a top priority.

03MAR2018- The results were that we found some information from previous semesters that we will be following up on tomorrow. What we have found is that there is no record of whether auto-increment was taken out of addExp or why. This is due to very poor documentation and people not taking their logs seriously. It is also why this project is in various different states: each semester works on the project but does not leave proper documentation for a clean transition to the next semester. It is my goal to have better, more complete logs than semesters past.

I created a sub-experiment on our new directory by running: addExp.pl -s experiment 0308 sub-experiment 005

This created my section on the Wiki. I then created the sub-directory 005 inside 0308, changed into it with cd, and ran: createExp.pl -t

I got a number of errors doing this. A few weeks ago Professor Jonas stated that he had changed some directory names or locations, and I believe this script is affected by that. I will spend tomorrow looking for the locations behind the errors to see if correcting them fixes the script. If that is the only issue, I can then look into doing the same thing and creating a script to do this for unseen data.

04MAR2018- We have worked through two thirds of the script so far. At the time of this log we have fixed two of its subroutines: sub makeTrain and sub makeLM.

The last aspect of this is the decode. Once this is completed, the user will be able to run createExp.pl and have it make either parts of an experiment or the entire thing. Either way, this will cut down on the time it takes to conduct experiments and will move the project along.

/mnt/main/Exp/0308/009 is the location of this experiment -Conducted on Idefix


 * Plan:

27FEB2018- To get our own experiment dir squared away, and to run createExp.pl and copyExp.pl to see their functionality. We are also working on adding auto-increment to the addExp.pl file, which is proving to be complicated. It seems that a lot of the code was not written properly, which makes changes difficult. We shall see.
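As an illustration of what auto-increment could look like, here is a minimal Python sketch (not the addExp.pl code, which is Perl and also updates the wiki). It assumes experiments are directories with four-digit names such as 0307 and 0308 under a root like /mnt/main/Exp; the function name and the `width` parameter are hypothetical.

```python
import os
import re
import tempfile


def next_experiment_id(exp_root, width=4):
    """Return the next zero-padded experiment number under exp_root.

    Assumption: experiments are directories whose names are exactly
    `width` digits (e.g. 0307, 0308). Anything else is ignored.
    """
    pattern = re.compile(r"\d{%d}" % width)
    used = [
        int(name)
        for name in os.listdir(exp_root)
        if pattern.fullmatch(name)
        and os.path.isdir(os.path.join(exp_root, name))
    ]
    # With no existing experiments, numbering starts at 1.
    return str(max(used, default=0) + 1).zfill(width)


if __name__ == "__main__":
    # Demonstrate on a throwaway directory instead of the real Exp root.
    with tempfile.TemporaryDirectory() as root:
        for name in ("0306", "0307", "notes"):
            os.mkdir(os.path.join(root, name))
        print(next_experiment_id(root))
```

The same idea would apply to three-digit sub-experiment folders by passing `width=3`. A real version would also need to guard against two users creating the same number at the same time.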

02MAR2018- The plan for this weekend is to run through some of these scripts to see whether they are working and how to improve them. If they work as described, they will make experiments easier and less time-consuming procedurally. I will document all my work in my logs and the experiment file.

03MAR2018- The plan for tomorrow is to look at the errors that I received today and try and find the file locations to fix createExp.pl. This could take some time but I am hoping that it will go pretty fast.

04MAR2018- The plan for tomorrow is to continue to work on createExp.pl, specifically the decode section. Working in conjunction with Arias, we will hopefully complete this tomorrow and then, after some final testing, be able to pass it along to the class to be used by the various groups.
 * Concerns:

27FEB2018- Getting people to follow SOP.

02MAR2018- None as of right now

03MAR2018- None at this Time

04MAR2018- None at this time

Week Ending March 12, 2018
 * Task:

06MAR2018- The task for today was to have a successful class, give an informative status update, and plan out the next week's worth of work. Based on the conversations from the status updates, we will offer our help to groups that need us to take a look at some of the scripts.

07MAR2018- The task for today was to start a train as per the instructions on the Wiki, without scripts. The purpose of this is twofold. The first is to create a baseline experiment to test against the ones that we create with the createExp.pl script. The second is to test the decode process in general, because apparently it is not being done correctly. Since the only person in the class with experience in this has only just reviewed our experiments, we have wasted a large amount of time.

10MAR2018- The task for the last couple of days and today was to post the results from running a decode the proper way and let the class know. I ran a 5hr train and did the decode based off of those parameters. I also started testing copyExp.pl to see its functionality and if it was a working script. I also helped translate some lines of code with the modeling group as I have learned some Perl and they had a question.

11MAR2018- The task for today was to look through previous semesters' work for any discussion of why their experiment groups made certain changes to the copyExp.pl and makeTest.pl scripts. Current status:

copyExp.pl - Currently broken.

makeTrain.pl - Currently creates the wav dir but does NOT copy over the contents from the source dir.


 * Results:

06MAR2018- The results were a successful class, a successful status update, and a clear plan forward. After a group discussion, we came up with a clear plan to continue editing the scripts on our agenda:

createExp.pl -> Almost complete, currently in debugging/test mode

copyExp.pl -> Pending

Once these are good to go, the group will be updated. I will also help the modeling group with learning Perl if necessary, based on their needs.

07MAR2018- The result is that I am in the process of running a simple 5hr train on Obelix. I will update the class via Discord once I have run through the entire process, and we will see whether something is wrong with our process or with the script.

10MAR2018- The decode on Obelix was successful. The experiment is located at /mnt/main/Exp/0308/010:

 ,-----------------------------------------------------------------.
 |                            hyp.trans                            |
 |-----------------------------------------------------------------|
 | SPKR    | # Snt # Wrd | Corr    Sub    Del    Ins    Err  S.Err |
 |=================================================================|
 | Sum/Avg | 4172  60215 | 72.5   19.6    7.9    7.4   34.8   87.6 |
 |=================================================================|
 |  Mean   |  1.3   19.1 | 75.6   18.6    5.8   15.3   39.7   88.0 |
 |  S.D.   |  0.5   16.5 | 18.2   15.4    7.7   28.7   32.6   29.9 |
 | Median  |  1.0   15.0 | 75.0   16.9    2.4    4.2   33.3  100.0 |
 `-----------------------------------------------------------------'

Today I started testing the script copyExp.pl and the various flags that are included with the script. It did not work; the following two errors repeated continuously until I interrupted the run:

 /mnt/main/scripts/user/copyExp.pl: line 1: =begin: command not found
 /mnt/main/scripts/user/copyExp.pl: line 2: Copy: command not found

I tried each of the different flags that are in the script and received the same error for all three. It also did not print the print statements that it is supposed to. I gave the input in the form that the comments in the code describe. Based on the error, it did not like the copy command. I had to hit CTRL-C in order for this to stop. I also ran ls -al in the directory I was running this in, /mnt/main/Exp/0308/014, to see if anything had been created, which it had not.
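A note on reading this error: messages of the form "script: line 1: =begin: command not found" are what a shell such as bash prints when it tries to run each line of a file as a shell command. One common cause is a script with no `#!` interpreter line at the top, so the file is handed to the shell instead of to Perl. Whether that is what is wrong with copyExp.pl is an assumption that would need to be checked. The Python sketch below (the helper name `read_shebang` is hypothetical) shows one way to check a script's first line.

```python
import os
import tempfile


def read_shebang(path):
    """Return the interpreter named on the first line, or None.

    A script that starts with '#!/usr/bin/perl' returns '/usr/bin/perl'.
    A script whose first line is something else (for example a POD
    '=begin' block) returns None.
    """
    with open(path, "rb") as handle:
        first_line = handle.readline()
    if not first_line.startswith(b"#!"):
        return None
    return first_line[2:].decode("utf-8", errors="replace").strip()


if __name__ == "__main__":
    # Two throwaway files standing in for real scripts.
    with tempfile.TemporaryDirectory() as tmp:
        no_shebang = os.path.join(tmp, "no_shebang.pl")
        with open(no_shebang, "w") as handle:
            handle.write("=begin\nCopy an experiment\n=cut\nprint 1;\n")
        with_shebang = os.path.join(tmp, "with_shebang.pl")
        with open(with_shebang, "w") as handle:
            handle.write("#!/usr/bin/perl\nprint 1;\n")
        print(read_shebang(no_shebang))
        print(read_shebang(with_shebang))
```

If the check returns None for a Perl script, invoking it explicitly as `perl script.pl` sidesteps the problem, and adding the interpreter line fixes it for direct execution.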

11MAR2018- The results were that I figured out that copyExp.pl was only creating the wav dir and that there were no scripts to copy the content from the source directory. The modeling group is the group interested in using this script, and I will find out whether they want me to update it. I ran copyExp.pl in my local IDE, which got it to print the print statements at the top, but the same script on the system gives the same error as listed above. I could not CTRL-C out of the error and had to kill the command line to stop it. I brought this up to my group.


 * Plan:

06MAR2018- The plan is to continue working on the scripts mentioned above. These are very important for creating experiments with minimal human error, which will aid experiment accuracy and effectiveness. I will also help the modeling group find out why we are not completing experiments correctly, because from the sound of it, it may be a scripts issue. Until then we do not know what the issue is.

07MAR2018- The plan is to continue monitoring my 5hr train and update the class via Discord once it is complete. From there we will work out the issue.

10MAR2018- The plan for tomorrow is to continue to debug and work on copyExp.pl. This will most likely turn into a group project but until then I will work to problem solve it.

11MAR2018- The plan is to wait and hear if my services are needed for makeTest.pl and to continue work on copyExp.pl.


 * Concerns:

06MAR2018- My concern is that we are almost at spring break and apparently are not even running things correctly. This is due to the lack of proper documentation which seems to be a systemic issue for this entire project. This is a massive failure because in order to properly replicate the work of previous semesters we need their exact steps. This is something I hope our group is fixing with documenting everything because this lack of proper documentation has seriously wasted a lot of time on this project.

07MAR2018- My concern is that there is no quality control, and if there is any, it comes long after actions and routines have become the norm. I completely understand making people learn what it is they are doing, but without supervision it becomes a mess.

10MAR2018- My concerns are that this is the second script in a row from last year that the Professor has stated was working and when we go to run it, it does not work at all. I am starting to question if some of these scripts even worked last time an experiment group worked on them because there are a lot of errors in the code. I find it frustrating that we struggle to learn a new computing language using bad coding which adds to the confusion. These scripts should be written in a way that can deal with changes in code location and I am seeing that may not be the case.

11MAR2018- I hope that documentation is much better this semester, because finding answers for this project is difficult. This impacts the project in a negative way.

Week Ending March 26, 2018
 * Task:

20MAR2018- The task for today was to meet with my new team, Guardians of the Galaxy, to have a successful class, and to cement our objectives moving forward. This will set us up for the remainder of the semester and the tasks that need to be completed.

23MAR2018- The task for today was to do individual work for my team so we can all better understand what Automatic Speech Recognition systems do and the various parts that make them work better. We want to do this not only to understand what we are working with, but also to help create better modeling results. I also volunteered to test a few different virtual meeting programs, e.g. Skype Business (provided by the school), before our team's virtual meeting tomorrow night. After testing, a recommendation and standard will be pushed out to the group to ensure everyone is on the same page for tomorrow.

24MAR2018- The task for the day was to continue looking into speech systems and look for ways to improve the system that we have. The goal of splitting into two different teams is to drive each team to develop a better modeling scheme than the other team and to improve upon last semester. We have a team meeting tonight which will bring us all together in order to start doing this.

25MAR2018- The task for today was to research and look into the Wiki to learn about LDA (Linear Discriminant Analysis) and why it is important and how/if we can change it to produce a better WER. This includes looking into the Wiki to find how past groups have used it and also doing research on what it does as a tool and why this matters to us as a group. I am also documenting this information so we can go over it as a group and see what we may try to change.


 * Results:

20MAR2018- The team meeting was successful, and we now have a set time and day for a weekly team meeting. This will help us meet outside of school to work on the project and continue forward. Class was successful, and the status update was not big because we are coming off of break. I will be working on copyExp.pl for the remainder and then helping my team move forward.

23MAR2018- The results were a successful training day today. Our team had some required videos and readings that we wanted everyone to watch and read so we are all working with the same base knowledge. Tomorrow night we will go through that and work on how we want to move forward. I tested the downloaded version of Skype Business and the Web Based version of Skype Business with Camden to see which version would work better for not only communication, but for file sharing and group work. The installed version is much better and much easier to share documents in a clean, useful manner. The web based version lacked many of the nice features that the installed version offered. After that, I pushed up to the team which version we had chosen and Camden sent instructions on how to install and get it up and running.

24MAR2018- The research is going very well and there is a lot of content that we as a team want to go through. We are trying to all have a base understanding that is strong enough to help develop the model while also working on the individual group tasks as well. Once we get this started I can then start focusing on fixing some of the scripts I have described. This video is a very good aid in understanding the overall information about speech. As we get further into this, if I come across more links I will start posting them here.

25MAR2018- The results for my research so far are that LDA is very important as Sphinx3 was created to use tools such as LDA to get more accurate WER. Even though we are on two different teams I am going to post some links that I found that helped me look into LDA and some aspects of the sphinx_train.cfg file and the options in the file.

Older but informative PDF on LDA

Explains how Sphinx3 was built to use things like LDA

[https://www.researchgate.net/profile/Rong_Hu23/publication/224750854_Fast_convergence_speech_source_separation_in_reverberant_acoustic_environment/links/56ba146208ae23328103d49d.pdf?origin=publication_detail] Info on convergence

I am still researching LDA so these are just some of the things I have looked into. There are many different aspects of why we need to get this to work. I have also identified certain parts of the sphinx_train.cfg file that we can change in order to manipulate results. I will bring these up to my team at the next meeting.


 * Plan:

20MAR2018- Work on copyExp.pl, team objectives, and also possible work on makeTrain.pl

23MAR2018- The plan for tomorrow is to continue looking into how to improve our project. I have made myself available a half hour before our team meeting in case people want to test their Skype before the meeting. I will also be working on copyExp.pl over the weekend as well.

24MAR2018- The plan is to have our team meeting tonight. I will be available a half hour earlier so that people who feel the need to test their Skype before the meeting can do so. I will also start working on debugging the scripts tomorrow and Monday and see how far I can get with that.

25MAR2018- The plan is to continue my research and looking at scripts that are directly related to LDA to look for things that can be changed and worked with. I will also be working on copyExp.pl tomorrow for the classwork and see how far I get with that.


 * Concerns:

20MAR2018- None

23MAR2018- The state of the student created code is not good and makes me wonder if it even worked at all last time capstone ran.

24MAR2018- The $1.3 Trillion Omnibus Budget Deal.

25MAR2018- None

Week Ending April 2, 2018
 * Task:

27MAR2018- The task for today was to have a successful class and group status update with the news that we had completed fixing copyExp.pl. During the time in class to work with our groups, I met with Camden to go over my findings so far concerning LDA and to see if what I was finding was on track with what the team needed. I also met with Chris, as we are both looking into LDA and RNN for team Guardians, to compare where we are in our research. The rest of today and tonight will be reserved for more research.

30MAR2018- The task for today was to continue my research into LDA and RNN. This is to look for aspects of these two theories to change in the experiment process in order to be successful. Changing different parameters in these two processes will change the WER. I also had a secondary task come up today and that was to run a simple train on miraculix to test for errors. There are some bugs that we are testing for.

31MAR2018- The task for today was originally to look into running a train on unseen data and continue research into LDA and RNN. My team just acquired the idefix drone, so I was then tasked with running a baseline train to see what the result was and to ensure we had no bugs. We also have a team meeting tonight that I will be participating in. I also documented my successful train in experiment /mnt/main/Exp/0308/032 on miraculix.

01APR2018- The task for today was to have a successful Easter Sunday and check in with my team throughout the day to see what needed to be worked on.


 * Results:

27MAR2018- The result was a great class update. We have completed fixing copyExp.pl, and it functions as it is supposed to. I updated Camden and Chris about my findings on LDA. We plan to look into my findings, which line up with information already found by Camden, meaning they are viable options. I will also look into RNN going forward.

30MAR2018- I have done a lot of reading into the two different processes in order to be familiar with them for tomorrow's team meeting. As soon as I am done writing this log I will run a train/lm/decode on miraculix and see if I get the same errors as a few other team members have. I will post the results in the log for tomorrow.

31MAR2018- I had two successful trains on seen data for baseline experiments on two different drones. Their WER results are the same and are posted below.

SYSTEM SUMMARY PERCENTAGES by SPEAKER- Miraculix

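 ,-----------------------------------------------------------------.
 |                            hyp.trans                            |
 |-----------------------------------------------------------------|
 | SPKR    | # Snt # Wrd | Corr    Sub    Del    Ins    Err  S.Err |
 |---------+-------------+-----------------------------------------|
 |=================================================================|
 | Sum/Avg | 4172  60215 | 73.1   19.1    7.8    7.4   34.4   87.5 |
 |=================================================================|
 |  Mean   |  1.3   19.1 | 76.0   18.3    5.8   15.4   39.4   87.9 |
 |  S.D.   |  0.5   16.5 | 18.1   15.3    7.7   29.1   33.0   30.1 |
 | Median  |  1.0   15.0 | 76.2   16.7    2.4    4.2   33.3  100.0 |
 `-----------------------------------------------------------------'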

Successful Completion

SYSTEM SUMMARY PERCENTAGES by SPEAKER -Idefix

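 ,-----------------------------------------------------------------.
 |                            hyp.trans                            |
 |-----------------------------------------------------------------|
 | SPKR    | # Snt # Wrd | Corr    Sub    Del    Ins    Err  S.Err |
 |---------+-------------+-----------------------------------------|
 |=================================================================|
 | Sum/Avg | 4172  60215 | 73.1   19.1    7.8    7.4   34.4   87.5 |
 |=================================================================|
 |  Mean   |  1.3   19.1 | 76.0   18.3    5.8   15.4   39.4   87.9 |
 |  S.D.   |  0.5   16.5 | 18.1   15.3    7.7   29.1   33.0   30.1 |
 | Median  |  1.0   15.0 | 76.2   16.7    2.4    4.2   33.3  100.0 |
 `-----------------------------------------------------------------'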

Successful Completion

Both have the same WER so that means they are a success and they were replicated on two different machines. Now I can start looking into working with LDA.
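Comparing these tables by eye works for two runs, but a small script makes the check repeatable. The Python sketch below pulls the Sum/Avg row out of an sclite summary and compares the Err column of two runs. The function names and the tolerance value are hypothetical, and it assumes the row layout shown above (two counts followed by six percentages).

```python
import re

COLUMNS = ["snt", "wrd", "corr", "sub", "del", "ins", "err", "s_err"]


def parse_sum_avg(report_text):
    """Return the Sum/Avg row of an sclite summary as a dict of floats."""
    for line in report_text.splitlines():
        if "Sum/Avg" in line:
            numbers = re.findall(r"\d+(?:\.\d+)?", line)
            if len(numbers) == len(COLUMNS):
                return dict(zip(COLUMNS, map(float, numbers)))
    raise ValueError("no Sum/Avg row found")


def same_wer(report_a, report_b, tolerance=0.05):
    """True if the Err (WER) values of two reports agree within tolerance."""
    err_a = parse_sum_avg(report_a)["err"]
    err_b = parse_sum_avg(report_b)["err"]
    return abs(err_a - err_b) <= tolerance


# Sum/Avg rows copied from the tables in this log.
MIRACULIX = "| Sum/Avg | 4172  60215 | 73.1   19.1    7.8    7.4   34.4   87.5 |"
IDEFIX = "| Sum/Avg | 4172  60215 | 73.1   19.1    7.8    7.4   34.4   87.5 |"

if __name__ == "__main__":
    print(parse_sum_avg(MIRACULIX)["err"])
    print(same_wer(MIRACULIX, IDEFIX))
```

The function accepts the full text of a summary file as well as a single row, since it scans line by line for the Sum/Avg marker.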

01APR2018- Easter was very enjoyable, and I plan to continue working on LDA.


 * Plan:

27MAR2018- The plan for me is to continue looking into LDA and RNN. We will also continue work on createExp.pl. These are the most pressing issues at the moment.

30MAR2018- Continue research on LDA and RNN and run a train/lm/decode on miraculix.

31MAR2018- Have this meeting tonight, continue work on LDA and RNN and continue working on scripts.

01APR2018- Work on LDA and make sure we can do it successfully.


 * Concerns:

27MAR2018- None

30MAR2018- Fake news.

31MAR2018- The Red Sox need more runs.

01APR2018- None

Week Ending April 9, 2018

 * Task:

03APR2018- The task for today was to have a successful status update, learn what we need to do concerning LDA, and continue working to get LDA working properly. I have continued looking through the logs and other documents looking for hints on getting it to work. This is a very difficult task as the documentation is very poor.

04APR2018- The task for today was to try the decode on my LDA 5hr train from the other day with some changes we made. This was to rule out one of our hypotheses. I am also continuing to conduct research on running a successful LDA.

07APR2018- The task for today was to read through prior logs to see if there was anything else relating to LDA and how to get it running. I also worked with the other class leaders to ensure that the URC posters were completed along with the CCSCNE poster.

08APR2018- The task for today was to plan out some tests for trying to get LDA to work. Camden had posted some articles for me to read through, in combination with research that I had come up with as well. I have some things to run on our machines in order to test their LDA capabilities.


 * Results:

03APR2018- The results were a successful class and a plan to move forward concerning Guardians work. I will be reaching out to team members after I finish testing a few theories on an LDA experiment that so far has resulted in a core dump.

04APR2018- The result was another core dump, which is what I was anticipating. We were testing whether the error about a missing mdef file occurred because the file was not being initialized properly and recognized as an executable C file. Since the Wiki has almost no easily accessible information on LDA, I am continuing my research to try to find something that will help this project along.

07APR2018- I have still found very little concerning LDA, which makes me wonder if it was ever done in the first place. Considering that I have attempted every documented instruction for getting it to work, it is obvious that this is not going to work by the book. I am working with my team to try to correct that.

08APR2018- The results were that I have put the plan together but have to wait until tomorrow to run my tests, because other tests/experiments are being run on the machine I need to use. This allows me to work on this all day tomorrow.


 * Plan:

03APR2018- GET LDA WORKING

04APR2018- GET LDA WORKING

07APR2018- GET LDA WORKING

08APR2018- GET LDA WORKING


 * Concerns:

03APR2018- Nothing

04APR2018- The lack of information on LDA in the Wiki is disturbing.

07APR2018- The lack of documentation on this project is infuriating. For a group research project, there has been such a lack of consistency on this project that makes it almost useless.

08APR2018- None

Week Ending April 16, 2018

 * Task:

10APR2018- The task for today was to have a successful class, inform team of some information I had discovered during research, and to learn the tasks moving forward. I also ran a train according to what our team is waiting for.

11APR2018- The task for me today was to continue work on finding out the status of LDA and working on UDD on BL and ORE. I will be completing that tonight.

12APR2018- The task for today was to score the decodes I had running from last night and also LDA stuff. Working with my team to continue pressing forward in order to come up with the best model.

15APR2018- The task for today was to continue my LDA work from yesterday: running the trains and decodes, comparing them, and updating classmates on the results. It seems very few people are actually taking the time to make sure that LDA works. The results below are from a 5hr LDA on seen data (0309/043) and a 30hr LDA on seen data (0309/044).


 * Results:

10APR2018- The result was a successful class. We had some quality team time in which we got a lot done. I ran my train successfully and will wait to complete the rest. I worked with members of my team on some questions I had come across, and they were partly answered. I have my plan for the week for my team work.

11APR2018- I am running the two different processes tonight as we speak. This is in conjunction with Team policy. I have also come to a conclusion on the current status of LDA for our Team as well.

12APR2018- I now have the answers for scoring unseen decodes and have it working. 0309/034, 0309/035, and 0309/038 were all successful. I also had a conversation with classmates and the professor about LDA and discovered that an incorrect version of Python has been installed on the machines since well before our semester, which means none of the machines can run LDA. That calls LDA results from previous semesters into question.

15APR2018- 0309/043 05hr LDA on Seen Data

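 ,-----------------------------------------------------------------.
 |                            hyp.trans                            |
 |-----------------------------------------------------------------|
 | SPKR    | # Snt # Wrd | Corr    Sub    Del    Ins    Err  S.Err |
 |---------+-------------+-----------------------------------------|
 |=================================================================|
 | Sum/Avg | 4172  60215 | 75.2   12.6   12.2    2.7   27.6   80.2 |
 |=================================================================|
 |  Mean   |  1.3   19.1 | 77.7   12.8    9.5    6.8   29.1   80.7 |
 |  S.D.   |  0.5   16.5 | 18.1   12.9   10.9   19.1   24.9   36.4 |
 | Median  |  1.0   15.0 | 79.4   10.0    6.7    0.0   25.9  100.0 |
 `-----------------------------------------------------------------'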

0309/044 30hr LDA on Seen Data

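 ,-----------------------------------------------------------------.
 |                            hyp.trans                            |
 |-----------------------------------------------------------------|
 | SPKR    | # Snt # Wrd | Corr    Sub    Del    Ins    Err  S.Err |
 |---------+-------------+-----------------------------------------|
 |=================================================================|
 | Sum/Avg | 3992  57805 | 63.3   21.2   15.6    3.0   39.7   86.6 |
 |=================================================================|
 |  Mean   |  1.3   18.7 | 67.5   20.2   12.2    6.9   39.3   86.6 |
 |  S.D.   |  0.5   15.9 | 21.7   15.7   13.3   18.7   26.4   31.8 |
 | Median  |  1.0   15.0 | 66.7   19.7    9.1    0.0   35.3  100.0 |
 `-----------------------------------------------------------------'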

The issue that I am having is that apparently there are two different flags, -lda and -ldadim, that need to go into a manual decode instead of using the run_decode.pl script. This is a work in progress and could possibly be the key to getting this to work. There is another way to run the train, nohup scripts_pl/RunAll_CDMLLT.pl &, which I need to run on 30hr to compare; according to the professor it should bring about a more accurate result.


 * Plan:

10APR2018- GET LDA WORKING

11APR2018- UDD

12APR2018- Continue working on LDA and other team tasks.

15APR2018- I just started a 30hr LDA using nohup scripts_pl/RunAll_CDMLLT.pl &, and the decode will not be done until late tonight. Once that is completed we will make some decisions.


 * Concerns:

10APR2018- Facebook seemed dishonest at its Senate meeting, but that's just me. I mean, why bother having civil liberties when we give them all away to big brother for free?

11APR2018- Let's not start WWIII over Syria.

12APR2018- Fake News.

15APR2018- Working with the "flags" for LDA

Week Ending April 23, 2018

 * Task:

17APR2018- The task for today was to have a successful class and to have a successful meeting with the Guardians team. I had some information on possibly advancing the research/experimentation with LDA that needed to be brought up to the team. I also worked with Steve to continue testing possible answers to the LDA and LDADIM flags.

20APR2018- The task for today was to edit some of my wiki experiment entries to ensure correctness and to help fellow teammates to run trains/LM/decodes. I am helping out because I have run all three of the various kinds that we have even though they may not be correct (LDA). I am also helping try and pin down the LDA flags in order to see if LDA is something that can be done in future capstone classes.

21APR2018- The task for today was to come up with a plan for testing certain LDA aspects tomorrow. So far I have come up with testing on two additional machines, which I think do not support LDA, and also some parameters to change in the mllt.py file that could make an impact. I have been working with a few other classmates to keep researching the LDA flags and their use.

22APR2018- The task for today was to work with some of the theories on manipulating the mllt.py file to try to get LDA working. This is continued work in trying to solve the puzzle of LDA. I have also been reading through the Python files that are related to LDA to see how the process works. There is a very complex relationship between all these files.


 * Results:

17APR2018- The results are that we had a successful class and I had a successful meeting with my team. The issue is that the lda flag option is not described well, which makes it hard to find exactly what it is calling. I have tracked both flags down to python/sphinx/mllt.py and the other LDA-titled Python files required for LDA. Now it is a question of figuring out exactly what they do. While this may not impact this semester, it is important for future iterations of capstone in order to strive for more accurate results.

20APR2018- The results were that I have corrected a few different entries so that, while most of us already know what needs to happen, someone new would have a better understanding of how to replicate what I did. I have also helped a few teammates with processes and questions for running trains to ensure that we are running them the correct way. We are still working on the LDA flags, and this is something that spans both teams.

21APR2018- The results were that I have come up with a plan for testing tomorrow. I will also be including Camden in conversations about LDA to see if we can track down exactly how things are working. At this point it is looking like LDA, and our work to get it accomplished, will most likely benefit the next iteration of capstone. This is ok, as LDA is truly the best and most accurate way to decode and score.

22APR2018- The results were that I currently have a train running with changes to the LDADIm field. The experiment is 0309/068. It is not complete yet. I have also found a piece of code that may help us understand LDA and the various flags associated with it. The file is mllt.py.

    if ldafn != None:
        # Compose this with the LDA transform if given
        lda = s3lda.open(ldafn).getall()[0]
        ldadim = mllt.shape[1]
        ldamllt = dot(mllt, lda[0:ldadim])
        s3lda.open(mlltfn, 'w').writeall(ldamllt[newaxis,:])
    else:
        s3lda.open(mlltfn, 'w').writeall(mllt[newaxis,:])

This is just a place to start and I have passed it on to a few other classmates so we as a class can see if this impacts what we are trying to get working.
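To make the excerpt easier to reason about, here is a minimal sketch of what the composition step appears to do: it keeps the first ldadim rows of the LDA matrix and multiplies them by the MLLT matrix, so one combined matrix does both steps. The excerpt uses NumPy's dot; this sketch uses plain Python lists so it runs with no extra packages. The helper names (matmul, compose_lda_mllt) and all matrix values are made up for illustration and do not come from any experiment.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def compose_lda_mllt(mllt, lda):
    """Return mllt . lda[0:ldadim], mirroring the ldafn branch of the excerpt."""
    ldadim = len(mllt[0])  # plays the role of mllt.shape[1]
    return matmul(mllt, lda[0:ldadim])


# Toy values only (hypothetical): a 4x4 LDA matrix and a 2x2 MLLT matrix.
lda = [[1, 1, 0, 0],
       [0, 1, 1, 0],
       [0, 0, 1, 1],
       [1, 0, 0, 1]]
mllt = [[2, 1],
        [0, 3]]

combined = compose_lda_mllt(mllt, lda)  # 2x4 matrix
feature = [[5], [7], [11], [13]]        # one 4-dim feature as a column

# Applying the combined matrix once...
one_step = matmul(combined, feature)
# ...matches applying the truncated LDA first and MLLT second.
two_step = matmul(mllt, matmul(lda[0:2], feature))

print(one_step == two_step)
```

If this reading is right, the number of columns in the MLLT matrix decides how many LDA rows are kept, which may be worth keeping in mind when looking at the LDA dimension settings.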


 * Plan:

17APR2018- The plan is to continue work trying to get the LDA flags to work. I also have team work to continue and assist other members in their work especially when we move into unseen data decodes. I have some things to look into and continue my work.

20APR2018- Update information and continue work on LDA flags.

21APR2018- The plan is to work tomorrow on running some very specific experiments to test a few theories of mine about the flags. I will also be here to support my teammates on anything our team needs done as well.

22APR2018- I will continue working with others to keep digging on LDA and continue running the train and post the results in the correct area.


 * Concerns:

17APR2018- Go Sox

20APR2018- "Taxation is Theft"

21APR2018- I Want to Believe.

22APR2018- Go Sox!

Week Ending April 30, 2018

 * Task:

24APR2018- The task for today was to have a successful class and to have a class update. We also met with our teams to continue our work. We talked about the progress made with LDA and where to continue the research. We figured out a few different things to look into based off of what was discussed in class.

27APR2018- The task for today was to complete an unseen decode on an experiment that we needed to run a baseline on. This is in conjunction with work for my team. I have also made myself available to help people run things when I am unable to be at a computer. I have also been working with Danielle on LDA work when errors have popped up.

28APR2018- The task for today was to score the 300hr unseen data decode and post it to the appropriate places. I also had a discussion with all the people trying to get LDA working, specifically concerning the flags. A possible solution was presented, so I restarted an LDA decode I had saved to see if we could get it to work. I have also been in communication with team members about tasks that need to be accomplished in order for our team to finish strong.

29APR2018- The task for today/last night was to score my decode for the LDA flag experiment and also to be on call for people working on last minute experiments. Seeing as most people on my team are doing trains/decodes with my instructions I have made sure to be available to others if they have any questions.


 * Results:

24APR2018- The results are that we have been given some new things to look into for LDA. This is most likely going to be something that is handed to the next iteration, but at least they will have something to work on for the future.

27APR2018- The results are that I currently have a 300hr unseen decode running on idefix. I have also provided help to a few different members of my team so that they and our team can be successful. Danielle has made a breakthrough with the randomness and LDA work, which has been an ongoing project.

28APR2018- The results for the baseline 300hr unseen decode are posted below:

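    ,-----------------------------------------------------------------.
    |                            hyp.trans                            |
    |-----------------------------------------------------------------|
    | SPKR    | # Snt # Wrd | Corr    Sub    Del    Ins    Err  S.Err |
    |---------+-------------+-----------------------------------------|
    |=================================================================|
    | Sum/Avg | 4173  61273 | 53.7   36.4    9.9    9.1   55.4   91.2 |
    |=================================================================|
    |  Mean   |  1.3   19.3 | 61.1   31.4    7.6   16.3   55.3   91.3 |
    |  S.D.   |  0.5   16.5 | 21.7   18.7    9.4   30.6   33.2   26.3 |
    | Median  |  1.0   15.0 | 59.5   33.3    4.5    5.6   53.8  100.0 |
    `-----------------------------------------------------------------'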

This is consistent with what we have had; this run was done to have something to compare against. I also have a decode running with the LDA flag pointed at a specific file, and it has not failed yet, so we have yet to see what the results will be. It is only a 5hr, so if we like the results we will have to try it at other intervals. Most likely, LDA will be something that the next class has to deal with, but it would be nice to hand them something that actually makes a difference.
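As a quick sanity check when copying scores onto the wiki, the Sum/Avg row of the baseline table can be tested against the usual relationships in an sclite summary: Err is Sub + Del + Ins, and Corr is 100 minus Sub and Del. The sketch below is only an illustration; the function name check_sclite_row and the 0.2 tolerance (to allow for rounding to one decimal place) are my own assumptions.

```python
def check_sclite_row(corr, sub, dele, ins, err, tol=0.2):
    """Check the percentage identities on one sclite summary row.

    tol allows for each column having been rounded to one decimal place.
    """
    err_ok = abs((sub + dele + ins) - err) <= tol
    corr_ok = abs((100.0 - sub - dele) - corr) <= tol
    return err_ok and corr_ok


# Sum/Avg row of the 300hr unseen baseline decode posted above.
baseline = dict(corr=53.7, sub=36.4, dele=9.9, ins=9.1, err=55.4)

print(check_sclite_row(**baseline))
```

A row that fails this check was most likely mistyped or copied from the wrong line.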

29APR2018- The results were an actual successful decode using what we determined was the LDA flag. The results are posted below:

0309/052 LDA decode with LDA flag

SYSTEM SUMMARY PERCENTAGES by SPEAKER

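    ,-----------------------------------------------------------------.
    |                            hyp.trans                            |
    |-----------------------------------------------------------------|
    | SPKR    | # Snt # Wrd | Corr    Sub    Del    Ins    Err  S.Err |
    |---------+-------------+-----------------------------------------|
    |=================================================================|
    | Sum/Avg | 4172  60215 | 76.7   11.6   11.7    2.7   26.0   78.8 |
    |=================================================================|
    |  Mean   |  1.3   19.1 | 79.2   11.7    9.1    6.5   27.2   79.4 |
    |  S.D.   |  0.5   16.5 | 17.6   12.5   10.6   18.1   23.9   37.4 |
    | Median  |  1.0   15.0 | 81.3    8.7    6.3    0.0   25.0  100.0 |
    `-----------------------------------------------------------------'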

Successful Completion

So the result was really good for a 5hr train/decode. We will now try it on a longer experiment to see how successful it will be.
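Since we will be scoring more of these, a small script could pull the Sum/Avg numbers out of the scoring output instead of copying them by hand. This is a minimal sketch, assuming the summary row looks like the one posted above; the function name parse_sum_avg and the dictionary keys are hypothetical and not part of any existing script on the system.

```python
import re

# Matches a row such as:
# | Sum/Avg | 4172  60215 | 76.7   11.6   11.7    2.7   26.0   78.8 |
SUM_AVG = re.compile(r"\|\s*Sum/Avg\s*\|\s*(\d+)\s+(\d+)\s*\|([^|]+)\|")


def parse_sum_avg(text):
    """Return the Sum/Avg row as a dict, or None if no such row is found."""
    match = SUM_AVG.search(text)
    if match is None:
        return None
    values = [float(v) for v in match.group(3).split()]
    if len(values) != 6:
        return None
    row = {"snt": int(match.group(1)), "wrd": int(match.group(2))}
    row.update(zip(["corr", "sub", "del", "ins", "err", "s_err"], values))
    return row


# Row copied from the 0309/052 result above.
line = "| Sum/Avg | 4172  60215 | 76.7   11.6   11.7    2.7   26.0   78.8 |"
row = parse_sum_avg(line)
print(row["err"])
```

The same function could be pointed at the text of a whole scoring file, since it only looks for the first Sum/Avg row.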


 * Plan:

24APR2018- Continue to work on LDA and work on team work.

27APR2018- Continue working on LDA, unseen data.

28APR2018- Continue working on LDA flags and score my LDA decode when it is done.

29APR2018- Continue team work.


 * Concerns:

24APR2018- Fake News

27APR2018- I'm liking America's Team's NFL draft pick so far. Go Cowboys.

28APR2018- rrrrrreeeeeee

29APR2018- Michelle Wolf isn't funny.

Week Ending May 7, 2018

 * Task:

01MAY2018- The task for today was a successful class.

02MAY2018- The task today was to run, and help others run, a series of 5hr LDA experiments using the LDA flag with the randomness taken out of the mllt.py file. We wanted to see if these changes would result in successful experiments or errors.


 * Results:

01MAY2018- A successful class. Learning what needs to be done to close out the project.

02MAY2018- The result was a successful LDA experiment with the results posted below:

SYSTEM SUMMARY PERCENTAGES by SPEAKER

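    ,-----------------------------------------------------------------.
    |                            hyp.trans                            |
    |-----------------------------------------------------------------|
    | SPKR    | # Snt # Wrd | Corr    Sub    Del    Ins    Err  S.Err |
    |---------+-------------+-----------------------------------------|
    |=================================================================|
    | Sum/Avg | 4172  60569 | 76.5   11.4   12.1    2.2   25.7   79.4 |
    |=================================================================|
    |  Mean   |  1.3   19.2 | 78.9   11.7    9.5    5.8   27.0   79.8 |
    |  S.D.   |  0.5   16.5 | 17.8   12.6   10.8   17.3   23.5   37.1 |
    | Median  |  1.0   15.0 | 80.7    8.8    6.9    0.0   24.1  100.0 |
    `-----------------------------------------------------------------'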

Successful Completion

According to the professor, taking the randomness factor out should lead to accurate results because the result would be consistent from run to run. I am waiting to see what the others score; if they all come out the same using different decode methods, then this will be a pretty good result. The issue still remains about the LDADIM flags, which have default values of 0, 0.
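The idea behind removing randomness can be shown with a small sketch. This log does not record exactly what was changed in mllt.py, so the code below is not that change; it only illustrates the general point that a fixed seed (or a fixed starting point) makes repeated runs produce identical numbers. The function random_init and its arguments are hypothetical.

```python
import random


def random_init(dim, seed=None):
    """Return a dim x dim matrix of pseudo-random starting values.

    With seed=None the values differ from run to run; with a fixed seed
    the same matrix comes back every time.
    """
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(dim)]


# Two "runs" with the same fixed seed start from identical values.
seeded_a = random_init(3, seed=0)
seeded_b = random_init(3, seed=0)

print(seeded_a == seeded_b)
```

With the starting point fixed, any difference in score between two experiments should come from the settings we changed rather than from chance.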


 * Plan:

01MAY2018- To move forward. One more successful class- J'Aden.

02MAY2018- The plan is to continue the work with LDA and team work for the remainder of the semester.


 * Concerns:

01MAY2018- None

02MAY2018- None