Speech:Spring 2017 Jonathan Cleary Log


Week Ending February 7, 2017

Task

2/1 - Add everyone in class to the COMP790 Slack group that Alex created. Speak to other models group about improvement plans.

2/2 - Write a log entry.

2/4 - Read old logs to better understand what past semesters have tried and accomplished, and read about Sphinx in general. Familiarize myself with the file structure of Caesar. I also answered a variety of questions on and off Slack.

2/5 - Win the Super Bowl.

2/6 - Read through the documentation for creating experiments as well as the General Project Information on this wiki.

2/7 - Finish reading the General Project Information wiki page and help anyone on Slack that needs help.

Results

2/1 - Added the entire class to the Slack group. After I announced the new communication method to the class, a good portion of the class started using it to communicate.

2/2 - Wrote a log entry.

2/4 - I have a better understanding after reading the past students' logs. In particular, James's log (from Spring 2016) was well detailed and outlined what the previous year had written about. It seems that the previous year had only just started using the language model to achieve better results. Previously in class, our group spoke about improving the language model, and I now understand that this course of action seems likely to produce the best results. I used the tree command to look at the structure of folders/files on Caesar. I also used the nano editor to look at the code in the addExp.pl script.

2/5 - Won the Super Bowl.

2/6 - I gained a better understanding of how to get all the folders set up and run a train. Looking through the experiments that were already run, it looks like only Vitali has run a successful experiment thus far. On Slack, one of the group members asked for assistance in getting everything set up. Here are the instructions I gave them (I'm not sure that they are correct until I run them myself):

I think this is what you need to do:

You need to cd into /mnt/main/scripts/user

$ cd /mnt/main/scripts/user

The following step is the 1st and 2nd step of that "Run Train Setup Script" wiki page. It runs a script that asks you questions and then creates the directories on the server and updates the Experiment's wiki page.

The addExp.pl doesn't actually make the directories like I thought, though I don't see a reason why it can't/shouldn't.

run the script in there called "addExp.pl"

$ ./addExp.pl

After this go to the Run Train Setup Script wiki page (Speech:Run_Train_Setup_Script):

Now start on step 3 (I'm going to try to translate what those directions say).

Now that the previous script created your file structure... cd into your sub-experiment directory.

$ cd /mnt/main/Exp/0299/001

Now look on the Run Train Setup Script wiki page and at step 3. I think the following is how you would execute that

$ ../../../scripts/user/makeTrain.pl -t switchboard 30hr/test

Skip step 4 on the wiki.

Step 5: cd to subdirectory if you aren't already there. I think you are (check with the command "pwd")

if not:

$ cd /mnt/main/Exp/0299/001

and run the script "genFeats.pl -t"

$ ../../../scripts/user/genFeats.pl -t

Step 6: now we can actually run the train, the previous was just setup. You can ignore the part to check the "top" command to check for already running trains, for now.

execute this command to run the train

$ nohup scripts_pl/RunAll.pl &

you should be able to see the train running with the "top" command. I don't know what it would look like, but you should be able to see something with "top"

$ top

to quit "top", press the q key (this is a keystroke inside top, not a shell command)

at this point you should be able to disconnect from caesar.

$ exit

2/7 - Vitali and I helped some people who needed assistance in running an experiment. I also read through more of the documentation on the General Project Information page and fixed a bunch of spelling mistakes. I didn't fix every mistake I saw, such as missing commas or misused colons/semicolons. I did notice that on the Corpus page, the author says to use the MD command to create a directory. However, that is an MS-DOS command and not a bash command (the bash equivalent is mkdir), so it wouldn't work on Caesar. I didn't fix that yet, as I don't want to overstep my bounds.

Plan

2/1 - Continue to steer the Slack conversation towards improving the project.

2/2 - To write a log entry.

2/4 - Read up more on language and acoustic models and how they can be improved in Sphinx.

2/5 - Do more reading of past years' logs.

2/6 - In class, make sure everyone knows how to set up and run an experiment.

2/7 - Figure out how the rest of the class is doing tomorrow and go into depth with Vitali on how running his experiment went.

Concerns

2/1 - Derailment of conversation on Slack.

2/2 - Not writing a log entry.

2/4 - There is likely a reason that past semesters did not change much with the language model. My concern is that it will prove to be difficult to improve the language model.

2/5 - No concerns, as of now.

2/6 - I'm concerned I gave out the wrong instructions.

2/7 - Communication.

Week Ending February 14, 2017

Task

2/11 - Help other students with issues they were having on Caesar.

2/12 - Checking in!

2/13 - Watch this YouTube video: tutorial on RNNLM toolkit. Read this document [1].

2/14 - It was Valentine's Day, a day in which I knew what I had to do. I had to write a log, a log that would take up space, but ultimately, not mean anything. I also needed to finish watching that dry, boring, drab, exhaustive, and ultimately fruitless video. It was the very same video that I, in a past log, tried to finish. This time I knew I needed to succeed in finishing that video and earning it an additional view.

Results

2/11 - On Saturday, MJ was having trouble running an Experiment on Caesar. She used Slack to communicate the issues that she was having. The first issue was running the addExp.pl script. Vitali and I communicated to her how to execute the script. The next issue that she had was creating the sub-experiment folder inside the 0298 folder. She was getting a permission denied error. I ssh'ed into Caesar and tried creating a folder in my group's folder (0295); I, too, received the same permission denied error. Vitali tried to create a folder inside of the 0295 folder; he did not receive the permission denied error. I quickly realized that the issue was that the creator of the original folder was the only person who had permission to make changes inside the folder, including creating folders. I then went in as the superuser and changed the permissions of MJ's folder using "chmod -R 777", which gives all users the ability to read/write/execute within that folder. MJ then tried to create a folder in her 0298 folder, which was successful. Knowing that this issue would affect other groups, I changed the permissions of the other folders for our semester, so that other users could modify their respective folders.
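
The permission problem above comes down to which mode bits are set on the experiment folder. The following is a minimal Python sketch of what those bits mean; it is an illustration only, not the command that was run on Caesar (that was "chmod -R 777"). The helper names describe_mode and chmod_tree are made up for this example, and the 0775 alternative assumes that everyone working in a folder shares a Unix group, which this log does not confirm.

```python
import os
import stat
import tempfile


def describe_mode(mode):
    """Return the ls-style string for a directory with these permission bits."""
    return stat.filemode(stat.S_IFDIR | mode)


def chmod_tree(root, mode):
    """Apply `mode` to `root` and everything beneath it (like `chmod -R`)."""
    os.chmod(root, mode)
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            os.chmod(os.path.join(dirpath, name), mode)


if __name__ == "__main__":
    # Common default: only the owner (the folder's creator) may write.
    print(describe_mode(0o755))
    # What `chmod -R 777` grants: read/write/execute for every user.
    print(describe_mode(0o777))
    # Narrower alternative (assumption: classmates share a Unix group).
    print(describe_mode(0o775))

    # Try it on a throwaway directory rather than a real experiment folder.
    root = tempfile.mkdtemp()
    os.mkdir(os.path.join(root, "001"))
    chmod_tree(root, 0o777)
    print(describe_mode(stat.S_IMODE(os.stat(root).st_mode)))
```

Mode 777 is the quickest fix but lets any account on the machine modify the folder; a group-based mode would be tighter if the accounts are set up for it.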

2/12 - Checked in!

2/13 - As other teammates have mentioned, this video is very dry and looooong. I got about 40 minutes in before I realized that I had lost focus. I will talk with other teammates and see whether it is beneficial to finish the video.

2/14 - I finished the video, much to the surprise of myself and, I'm sure, the uploader. The speed functionality in YouTube proved to be meaningless, as I had a difficult enough time understanding the speaker's accent as it was. As the video came to a close, the autoplay feature counted down to play another video. I could only imagine what YouTube would suggest next, as I clicked 'cancel' in order to stop the playback of the next video. I clicked the 'like' button with pity, adding another 'like' to the video that so few others had watched, let alone 'liked'. I hoped that this would bring a smile to the lips of those involved in the production (if you could call it a production) of the video. If there was a smile, I know that it grew on the faces of those with deep-seated masochistic tendencies, as what other kind of person would produce and upload a video as mind-numbingly dull as the video in question? After succeeding in watching the video (a task as difficult as bringing the One Ring back from whence it came, and with no Samwise Gamgee urging me on), I sat for a few minutes to digest the contents of the video. Had I learned anything? I had. Could I apply what I had learned? I am not sure. I will need to confer with other members of my group in order to ascertain whether the over-hour-long video proved to be of any use.

Plan

2/11 - Continue helping students with problems they have on Caesar, lessening the time on mundane tasks, and increasing the time spent improving results.

2/12 - Check in.

2/13 - Check with teammates to see whether it is beneficial to finish watching the video. Find easier to understand documents.

2/14 - Never produce a video like the one I watched.

Concerns

2/11 - Not responding to student problems quickly enough.

2/12 - Checked in.

2/13 - My concern is that I didn't retain anything from the PDF that I read or the video that I watched.

2/14 - I am concerned that I will have nightmares or flashbacks of the video I watched.

Week Ending February 21, 2017

Task

2/15 - I watched this [2] as well as part 2 to get a better understanding of how neural networks work. Since we now had access to our drones, we wanted to log in to an environment of our own. Initially we had trouble logging in, as our normal username/password didn't work on this server. I looked into how to set up an RSA public/private key pair and gave instructions to the other people in our class, but I still thought that our accounts weren't set up. The way that I was trying to set up the keys is how you would do it in a normal client/server architecture, but that is not really the architecture used with the drone machines. If I had thought about how our home folder is linked from Caesar to the drone, I think I could have figured out how to set up the keys. By the end of the day, I was still not able to log into the drone machine.

2/16 - After I got instructions from Andrew (given by Professor Jonas), I was able to set up the public/private key to log into the drone Idefix. I helped troubleshoot some problems that Greg was having while running his experiment. Ultimately we got stuck on the decode portion of the experiment, most likely because some dependencies weren't copied over to the drone machine.

2/18 - checking in

2/19 - Since the drone machine wasn't being used, Alex decided to jump on the chance to run his first experiment and see if he could get around the problem that Greg was having (sometimes it's easier to start from the beginning, rather than troubleshoot something that is in progress and something someone else had started).

2/20 - The task that I (stupidly) volunteered for today was to proofread the Final Proposal in order to give it a consistent tone, fix spelling/grammar mistakes, and fix the formatting so that all groups use the same formatting.

2/21 - My task today was to make another pass through the proposal and update it, as needed, to account for the changes that the groups made based on suggestions from yesterday's discussions. After talking with Professor Jonas, I needed to make sure that the proposal is in a singular voice, that of the class as a whole. All references to "we"/"our" should be from the class's perspective and not that of an individual group.

Results

2/15 - I learned more about neural networks and refreshed my knowledge of how public/private keys work in a client/server architecture and came to understand how the symbolic links between caesar and the drone machines actually work.

2/16 - Greg wasn't able to successfully complete the experiment even with the rest of the group doing some remote troubleshooting, though as stated before, this was most likely a result of missing dependencies that haven't yet been copied over to the drone machine.

2/18 - checking in

2/19 - Not sure yet.

2/20 - I fixed a lot of problems with spelling/grammar, but making the phrasing that people used uniform proved to be a lot of work. The biggest problem that I didn't fix, but thought should be fixed, was the use of words like "us", "our", "we", etc. I think the proposal would have been better if we had agreed beforehand not to use those terms. For instance, a sentence like "We will determine if it is worthwhile to install G++, and if it is backwards compatible with GCC." would be changed to "A determination will need to be made as to whether it is worthwhile to install G++, and whether it is backwards compatible with GCC." I think these changes would make the individual groups' sections less individualistic. Another idea, which I got from my Dad (who has read many project proposals from vendors), was to use terms like "we", "our", and "us" in the Overview of the project to signify that this is the class's or the company's voice, and then in the individual groups' sections use "they" or "this group" to continue in the class's voice. I talked it over with the group via Slack and ultimately, most likely for ease of re-writing, we decided to keep all the "we", "our", and "us" in the proposal. Some of the groups' sections still need re-writes to add who is going to accomplish each task in the implementation timeline and to merge tasks with the implementation timeline; I have let each group know the changes that need to occur. I am going to go over it again tomorrow night and make changes to the sections that the other groups have hopefully updated by then. There are a few sections that I may or may not re-write again.

Because more groups used bullets in the goals section, we decided to just change everyone's goals to a bullet format. My concern here is that the goals will too closely resemble the implementation timeline without the dates. I hope that the groups re-write the goals as a high-level description of what they want to do.

Alex, who set up the Slack channel for the class, sent me this screenshot :)

2/21 - I am not sure how many changes I made, but it seemed like a lot, although this might just be because I read the proposal many, many times. I re-wrote the main overview section to try to touch on all the groups and their respective roles in bettering the project as a whole. Also, some of the bullets in the Plan section needed to be fixed to better match the purpose of bullets: short, succinct statements.

The main change that I made throughout the proposal was to try to use a singular voice to express the desires, needs, and expectations of the class, and not those of the individual groups. This was tough to accomplish in certain sections. There are still certain things that need to be updated, and I've let the groups know about these changes, but I fear they will not be updated in time. The main one is that three of the five groups don't have individuals assigned to tasks in the Plan section of their proposal.

Plan

2/15 - The plan was to log into Idefix and poke around to see the differences between Caesar and the drone machines, which were now disconnected from Caesar's local folder. I also wanted to learn more about neural networks, as that is where the best results are found.

2/16 - The plan was to have Greg run a successful experiment so that we could make sure that we could run successful experiments on the detached drone machine Idefix, and assist as needed.

2/18 - checking in

2/19 - See the results that Alex gets with his experiment.

2/20 - My plan is to fix the spelling/grammar mistakes and alert the groups through Slack to change what I can't easily change (changing paragraphs to bullets or vice versa). I also plan to ask the groups, through Slack, for their thoughts on certain questions that I have.

2/21 - Read through the updated sections (as of yesterday) and make changes as needed. Also, as I learned from the professor, all references to "we"/"our"/"us" should be from the perspective of the class and not the individual group. I need to make these changes and make sure that the whole proposal reads as if it is from the same person.

Concerns

2/15 - No concerns really.

2/16 - No concerns really.

2/18 - checking in

2/19 - No concerns really.

2/20 - A big concern that I have is having to re-write large portions of other groups' work. Another concern is that people will not have updated their pages by today, which I announced in class as the date by which I expected the re-writes to be finished. My final concern, however, is that I am deferring to others' written proposals out of fear of being blamed for messing something up with the way I decide to edit it.

2/21 - My concern is that there are still changes to be made and that they will not be finished by the time this is due (2pm on 2/22? midnight on 2/22?). As with yesterday's concerns, I still worry that I made the proposal worse by trying to proofread it. Another concern is a certain overview whose tone I don't think is fruitful; I forgot to mention this yesterday, but I hope they have enough time to change it.

Week Ending February 28, 2017

Task

2/25 - My task for this week is to install Numpy and Scipy along with Vitali. Vitali has already put the uncompiled source on Idefix. The problem is that we don't have the right tools on Idefix to compile either library from source. Scipy, I believe, requires a GCC compiler, a Fortran compiler, and perhaps others. The best way to install it would be to connect Idefix to the internet and use pip to install the packages. I also found an already compiled wheel file, but we don't have wheels on the server to install it.

2/26 - Still working on installing Numpy and Scipy, although now the work has shifted to installing the dependencies that Numpy and Scipy require. As far as I can tell, Numpy and Scipy need either a 64-bit CPU architecture or Python 2.7 (https://pypi.python.org/pypi/scipy/0.18.1). Upgrading Python seems the lesser of two hassles.

2/27 - checking in

2/28 - Vitali installed the packages that I had been working on once we got internet access. While we had internet access, I wanted to make sure we had everything else we needed, i.e., installing g++ just in case.

Results

2/25 - So I just uploaded the file needed to install wheels, but then I quickly found out that we don't have pip installed...you need pip in order to install wheels, you need wheels in order to install python packages like numpy and scipy. It's dependencies all the way down.

2/26 - In order to install Numpy and Scipy, I needed to install pip and wheels. I was able to install pip, and I believe that I don't need to install wheels, but instead can use a command like this to install .whl packages:

python pip-9.0.1-py2.py3-none-any.whl/pip install --no-index some_wheels_file.whl
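
A wheel's filename already says which interpreter and platform it targets, which is what the architecture trouble described in this entry comes down to. The sketch below splits a filename using the standard wheel naming scheme (name, version, optional build tag, Python tag, ABI tag, platform tag). The function names are made up for this example, examplepkg is a hypothetical package, the tag sets passed in are assumptions about a 32-bit Python 2.6 machine, and the compatibility check is far rougher than what pip really does.

```python
def parse_wheel_filename(filename):
    """Split a wheel filename into its naming-scheme components."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file: %s" % filename)
    parts = filename[:-len(".whl")].split("-")
    if len(parts) not in (5, 6):
        raise ValueError("unexpected wheel filename: %s" % filename)
    # An optional build tag may sit between the version and the Python tag,
    # so the last three fields are always taken from the end.
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "python_tags": python_tag.split("."),
        "abi_tags": abi_tag.split("."),
        "platform_tags": platform_tag.split("."),
    }


def roughly_compatible(filename, python_tags, platform_tags):
    """Rough check only: does the wheel share a Python tag and a platform tag
    with the ones this machine supports? (pip's real rules are stricter.)"""
    info = parse_wheel_filename(filename)
    python_ok = bool(set(info["python_tags"]) & set(python_tags))
    platform_ok = ("any" in info["platform_tags"]
                   or bool(set(info["platform_tags"]) & set(platform_tags)))
    return python_ok and platform_ok


if __name__ == "__main__":
    # Assumed tags for a 32-bit Linux machine running Python 2.6.
    my_python = {"cp26", "py26", "py2"}
    my_platform = {"linux_i686", "manylinux1_i686"}
    for wheel in ("pip-9.0.1-py2.py3-none-any.whl",
                  "examplepkg-1.0-cp27-cp27mu-manylinux1_x86_64.whl"):
        print(wheel, roughly_compatible(wheel, my_python, my_platform))
```

The pip wheel is tagged for any Python 2 or 3 on any platform, which is why it could be used directly, while a wheel built for CPython 2.7 on 64-bit Linux would be rejected on a 32-bit Python 2.6 system.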

Since the Numpy and Scipy packages can be obtained in a wheels package, I thought I was ready to install them, however, it turned out to not be that simple. When I tried to install one of the wheels files (either Numpy or Scipy), I forgot that I needed to be sure of the correct architecture. It turns out, as far as I can tell, that we either need python 2.7 or have a 64-bit architecture on the machines. So I set my sights on installing python 2.7. I found some good instructions on how to install python 2.7 while offline here: http://www.linuxfromscratch.org/blfs/view/svn/general/python2.html. When I tried to install from those instructions, it stated the following:

configure: error: no acceptable C compiler found in $PATH

So now I set my sights on installing GCC... Here (https://koji.fedoraproject.org/koji/buildinfo?buildID=862021) I found a GCC rpm install file. Of course, this wouldn't be that easy, as when I tried to install it I got a number of dependencies that it requires:

   error: Failed dependencies:
       binutils >= 2.24 is needed by gcc-7.0.1-0.10.fc26.i686
       cpp = 7.0.1-0.10.fc26 is needed by gcc-7.0.1-0.10.fc26.i686
       libasan.so.4 is needed by gcc-7.0.1-0.10.fc26.i686
       libatomic.so.1 is needed by gcc-7.0.1-0.10.fc26.i686
       libcilkrts.so.5 is needed by gcc-7.0.1-0.10.fc26.i686
       libgcc >= 7.0.1-0.10.fc26 is needed by gcc-7.0.1-0.10.fc26.i686
       libgmp.so.10 is needed by gcc-7.0.1-0.10.fc26.i686
       libgomp = 7.0.1-0.10.fc26 is needed by gcc-7.0.1-0.10.fc26.i686
       libisl.so.15 is needed by gcc-7.0.1-0.10.fc26.i686
       libmpc.so.3 is needed by gcc-7.0.1-0.10.fc26.i686
       libmpfr.so.4 is needed by gcc-7.0.1-0.10.fc26.i686
       libmpx.so.2 is needed by gcc-7.0.1-0.10.fc26.i686
       libmpxwrappers.so.2 is needed by gcc-7.0.1-0.10.fc26.i686
       libubsan.so.0 is needed by gcc-7.0.1-0.10.fc26.i686

The previous page that I linked to install GCC lists most of those dependencies. I uploaded the ones listed to the server to install later, but I've done all I can today.

TL;DR: Dependencies are indented underneath their respective parent dependencies.

   Numpy & Scipy
       pip (to install wheels packages)
       python 2.7
           GCC
               binutils >= 2.24 is needed by gcc-7.0.1-0.10.fc26.i686
               cpp = 7.0.1-0.10.fc26 is needed by gcc-7.0.1-0.10.fc26.i686
               libasan.so.4 is needed by gcc-7.0.1-0.10.fc26.i686
               libatomic.so.1 is needed by gcc-7.0.1-0.10.fc26.i686
               libcilkrts.so.5 is needed by gcc-7.0.1-0.10.fc26.i686
               libgcc >= 7.0.1-0.10.fc26 is needed by gcc-7.0.1-0.10.fc26.i686
               libgmp.so.10 is needed by gcc-7.0.1-0.10.fc26.i686
               libgomp = 7.0.1-0.10.fc26 is needed by gcc-7.0.1-0.10.fc26.i686
               libisl.so.15 is needed by gcc-7.0.1-0.10.fc26.i686
               libmpc.so.3 is needed by gcc-7.0.1-0.10.fc26.i686
               libmpfr.so.4 is needed by gcc-7.0.1-0.10.fc26.i686
               libmpx.so.2 is needed by gcc-7.0.1-0.10.fc26.i686
               libmpxwrappers.so.2 is needed by gcc-7.0.1-0.10.fc26.i686
               libubsan.so.0 is needed by gcc-7.0.1-0.10.fc26.i686
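
The tree above also implies an install order: everything a package needs has to be in place before the package itself. A minimal sketch of that ordering follows, assuming the tree is written out as a Python dictionary. Only four of GCC's fourteen reported requirements are included to keep it short, and the function name install_order is made up for this example. It does not detect circular dependencies.

```python
# Dependency tree from this log entry: each key needs the names in its list.
# Only a few of GCC's fourteen reported requirements are listed, for brevity.
DEPENDS_ON = {
    "numpy & scipy": ["pip", "python 2.7"],
    "pip": [],
    "python 2.7": ["gcc"],
    "gcc": ["binutils", "cpp", "libgcc", "libgomp"],
    "binutils": [],
    "cpp": [],
    "libgcc": [],
    "libgomp": [],
}


def install_order(target, depends_on):
    """Return packages ordered so every dependency comes before its parent."""
    order = []
    seen = set()

    def visit(package):
        if package in seen:
            return
        seen.add(package)
        for dependency in depends_on.get(package, []):
            visit(dependency)
        # Appended only after all of its dependencies have been appended.
        order.append(package)

    visit(target)
    return order


if __name__ == "__main__":
    for step, package in enumerate(install_order("numpy & scipy", DEPENDS_ON), 1):
        print(step, package)
```

Walking the tree depth-first and recording a package only after its children gives the "leaves first" order that installing by hand without a package manager requires.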

I also read through the tools group's logs about their progress with GCC, and they pointed to the 2016 logs about that semester's progress with installing GCC. That information can be found here: (Week Ending March 29, 2016) https://foss.unh.edu/projects/index.php/Speech:Spring_2016_Jonathan_Trimble_Log. From that log, it looks like they were having trouble installing GCC, to the point where they wanted to connect the drone to the internet to use something like yum install gcc, which would take care of everything including the dependencies. Of course, if we went the route of giving the drone internet access, we wouldn't need to install GCC, but would rather first try to install numpy and scipy with pip install numpy and pip install scipy. If that didn't work, as I don't think it would, we would install Python 2.7 first (through the system package manager or from source, since pip installs Python packages rather than the interpreter itself) and then install numpy and scipy.

I think we should try the route of installing the GCC dependencies and then, if that doesn't work, upgrade Python and install numpy and scipy the normal way (normal meaning having internet access).

2/27 - checking in

2/28 - Installed g++. Really really easy with internet access.

Plan

2/25 - Chat with Vitali tomorrow and see if he has a better way forward than figuring out how to install pip. I was wondering whether these machines had pip installed, and then I remembered that Idefix doesn't have internet access anyway, so I can't use curl to download it.

2/26 - Install GCC (required to install Python 2.7). GCC requires dependencies that need to be installed as well. The initial dependencies that GCC said it needed are listed in the 2/26 Results entry above.

2/27 - checking in

2/28 - Install g++.

Concerns

2/25 - My concern is that we might have to go to the servers themselves and give Idefix temporary internet access, although that would prove really disruptive for the other groups.

2/26 - My concern is that after installing all these dependencies, what I initially needed to install wouldn't work.

2/27 - checking in

2/28 - that this internet connection is fleeting.

Week Ending March 7, 2017

Task

3/4 - The task this week was to look into finally getting LDA working with Python 2.7.

3/5 - checking in.

3/6 - checking in.

3/7 - Yesterday Vitali tried running an experiment using Python 2.7; however, it failed. Today I need to figure out what problem Vitali ran into while using Python 2.7.

Results

3/4 - I began researching (googling) how other people got LDA working on Sphinx3. Unfortunately, I couldn't find much information other than short forum posts from people having difficulty setting up LDA to work with Sphinx. It was also difficult to get information specific to Sphinx3. Many times in the forums that I read, the author didn't specify which version of Sphinx they were using. Perhaps when the forum thread was created, the version of Sphinx being used would have been obvious.

3/5 - checking in.

3/6 - checking in.

3/7 - I read through the decode.log and came to the same conclusion as Vitali. For some reason, a file that is usually generated when using Python 2.6 was not generated when using Python 2.7.

Plan

3/4 - My plan for today is to run an experiment if I can find the right information after researching LDA and Sphinx3.

3/5 - checking in.

3/6 - checking in.

3/7 - Read through the decode.log for hints at what the underlying problem with using python 2.7 actually is.

Concerns

3/4 - Not being able to figure out what the problem and difference is between using Python2.7 and Python2.6.

3/5 - checking in.

3/6 - checking in.

3/7 - My concern is that the version of Sphinx that we are using doesn't really implement LDA in any easy way.

Week Ending March 21, 2017

Task
[SPRING BREAK]
Results
[SPRING BREAK]
Plan
[SPRING BREAK]

Concerns
[SPRING BREAK]

Week Ending March 28, 2017

Task

3/26 - checking in.

3/27 - checking in.

3/28 - From conversations with the rest of the group, both the LDA and the RNN implementations are proving difficult. I need to figure out what we can do about this.

Results

3/26 - checking in.

3/27 - checking in.

3/28 - Talking to the rest of the group, along with a bit of previous research, seemed to point to differing problems for LDA and RNN. It seemed like the problem with the LDA implementation is using Python 2.7. Python 2.7 was needed because the numpy and scipy packages could only be installed with Python 2.7 on 32-bit systems. Numpy and scipy can be installed and used with Python 2.6, but only if the system is 64-bit. Fortunately, the server hardware has a 64-bit processor; however, the version of Red Hat that was installed is a 32-bit version. Looking into the RNN problem, it seemed like it also needed a 64-bit environment. The group talked over upgrading the server, but we ultimately decided that we need to give it another week of testing and learning to see whether it is absolutely necessary to upgrade the operating system.
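
The distinction in this entry (64-bit hardware running a 32-bit operating system) can be checked from Python itself. This is a minimal sketch, assuming an x86 Linux machine where /proc/cpuinfo is readable; the function names are made up for this example. On x86, the "lm" (long mode) CPU flag means the processor can run 64-bit code, regardless of which OS is installed.

```python
import platform
import struct


def interpreter_bits():
    """Pointer size of the running Python interpreter, in bits (32 or 64)."""
    return struct.calcsize("P") * 8


def cpu_supports_64bit(cpuinfo_text):
    """True if an x86 /proc/cpuinfo dump lists the 'lm' (long mode) flag."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags") and ":" in line:
            return "lm" in line.split(":", 1)[1].split()
    return False


if __name__ == "__main__":
    # What the installed kernel reports (e.g. i686 for a 32-bit install).
    print("kernel reports:", platform.machine())
    print("interpreter is %d-bit" % interpreter_bits())
    try:
        with open("/proc/cpuinfo") as handle:
            print("CPU is 64-bit capable:", cpu_supports_64bit(handle.read()))
    except IOError:
        print("no /proc/cpuinfo on this system")
```

A machine in the situation described here would be expected to show a 32-bit interpreter alongside a 64-bit capable CPU, which is the case where reinstalling the OS (rather than replacing hardware) would be enough.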

Plan

3/26 - checking in.

3/27 - checking in.

3/28 - I need to figure out what the actual problem is with LDA and RNN.

Concerns

3/26 - checking in.

3/27 - checking in.

3/28 - My concern is that we will need to upgrade the OS on Idefix in order to implement LDA and RNN, which seemingly require a 64-bit environment. Another concern is that Caesar is a 64-bit environment, while the drones (at least Idefix) are 32-bit environments; this may cause problems and inconsistencies when we start doing LDA and RNN experiments on Caesar.

Week Ending April 4, 2017

Task

4/1 - Begin running experiments with small changes in the config file for the Rebels team.

4/2 - checking in.

4/3 - checking in.

4/4 - Start working on the URC poster.

Results

4/1 - The first experiment in which I changed values didn't work. I had problems during the decode. I got the error "some error". I looked in the logs and I am not sure what the issue is, so I'll need to confer with my group/team to pin down the exact issue my experiment is having.

4/2 - checking in.

4/3 - checking in.

4/4 - I completed a rough draft of the poster that basically just outlines how the poster will look and the data that will go in each section.

Plan

4/1 - Run experiments with only one change per experiment in order to isolate which changes have the greatest effect and which changes work together.

4/2 - checking in.

4/3 - checking in.

4/4 - Begin the URC poster.

Concerns

4/1 - I am not sure what a lot of the values in the config files actually do and how they are supposed to affect the experiments. Also, in order to be scientific about this process, I would need to change one value and see if that improves the WER, and then make a different change and see if that improves the WER. I would then have to try an experiment with both changes and see if that still improves the WER. After this I would have to change a third variable and see the effect. I can't simply add this to the other two changes; instead, I would have to try this third change with only the first change, then with only the second change, and then with all three changes. I would then need to see which of these changes improves the WER. This makes testing take much longer.
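
The testing burden described above can be counted directly: checking every non-empty combination of n independent changes takes 2^n - 1 experiments, so three changes already mean seven runs. A minimal sketch follows; the labels "change 1" through "change 3" are placeholders, since this log does not name the config values, and the function name experiment_plan is made up for this example.

```python
from itertools import combinations


def experiment_plan(changes):
    """List every non-empty combination of config changes, smallest first."""
    plan = []
    for size in range(1, len(changes) + 1):
        plan.extend(combinations(changes, size))
    return plan


if __name__ == "__main__":
    # Placeholder labels; the log does not name the actual config values.
    plan = experiment_plan(["change 1", "change 2", "change 3"])
    for number, combo in enumerate(plan, 1):
        print(number, " + ".join(combo))
    print("total experiments:", len(plan))
```

Because the count doubles with each added change, and a single long train can take days, this is why exhaustively testing more than a handful of settings quickly becomes impractical.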

4/2 - checking in.

4/3 - checking in.

4/4 - My concern with working on the URC poster is that we won't have enough space on the poster to cover both LDA and RNN and some general explanation of speech recognition.

Week Ending April 11, 2017

Task

4/8 - Begin work on the URC poster to both explain how modeling is used in speech recognition and our achievements of implementing RNN and LDA.

4/9 - Finish work on the URC poster with group members, and submit poster.

4/10 - checking in.

Results

4/8 - We are using a 3 column layout with some card elements for RNN & LDA. Right now there are just placeholder icons.

4/9 - I continued work on the URC poster. I wrote the sections pertaining to the acoustic model, language model, and scoring. The last 3 sections were finished by my group members. I had some problems exporting the PDF, as there was white text that showed up when exporting but not when editing. I ended up exporting it as a PowerPoint file and then exporting that as a PDF, which seemed to do the trick. I zipped the PDF and the pptx file and sent them to Professor Jonas.

4/10 - checking in.

Plan

4/8 - Create the poster using Google slides and invite the other group members.

4/9 - Finish the poster and submit it to Professor Jonas.

4/10 - checking in.

Concerns

4/8 - My concern is that there will not be enough space on the poster to fit in all the elements of modeling and speech recognition.

4/9 - I do not have any concerns for today. I think we can finish the poster and submit it a day early no problem.

4/10 - checking in.

Week Ending April 18, 2017

Task

4/12 - Attend a hangout with the Rebel group to go over the parameters that are available for tweaking when running experiments.

4/13 - Run an experiment with config options that were explained in the hangouts meeting.

4/16 - Checking in. Greg has been running the Rebel group's 300hr baseline experiment.

4/18 - Checking in. Read logs.

Results

4/12 - Spent 30-45 minutes on hangouts talking through the tweak-able parameters and other problems/struggles with running experiments and their results.

4/13 - I was running into a problem with my decode, but I looked in the logs and was able to see that I had forgotten an important portion of the decode process.

4/16 - Checking in. From "top" I can see that the 300hr is still running.

4/18 - Checking in. Read logs.

Plan

4/12 - Join the hangout call and take notes on what is discussed.

4/13 - Do some more research outside of the information/websites that Greg has already provided and run an experiment with different parameters changed.

4/16 - Check with Greg to see if he needs any help with the 300hr, although there is not much to do, but let it run.

4/18 - Checking in. Read logs.

Concerns

4/12 - The only concern I have about the hangout call is with my internet going down, or people talking over each other in the hangout.

4/13 - I don't have any concerns, other than if I make the parameter change go outside the bounds of what is possible and therefore error out the experiment.

4/16 - The 300hr takes such a long time that I am afraid we won't have enough time to run another before our group's write-up is due.

4/18 - Checking in. Read logs.

Week Ending April 25, 2017

Task

4/22 - Checking in. Read Logs

4/23 - Checking in. Read Logs

4/24 - Research the sentence tag problem and whether it is actually a problem.

4/25 - Vitali looked into the "s" tag problem and found that they do indeed make a difference, not for the better. Although the glimmer of hope is that the "s" tags have been throwing off the RNN because it has been treating the tags as words.

Results

4/22 - Checking in. Read Logs

4/23 - Checking in. Read Logs

4/24 - There is just not enough information out there for sphinx. Some of the most useful information about anything comes from the people that use it and struggle with it, and not necessarily those who created it in the first place. I am having trouble finding information on removing tags and whether tags get automatically removed in another process. The only reference materials that I could find are: http://www.speech.cs.cmu.edu/SLM/toolkit_documentation.html and http://cmusphinx.sourceforge.net/wiki/tutoriallm. The first site mentions the necessity of a .ccs file that would contain the tags used, but of course the first site is about the differences between sphinx 1 and sphinx 2, so I am not even sure if the .ccs file is used in Sphinx 3.

4/25 - Vitali looked into the "s" tag problem and found that they do indeed make a difference, not for the better. Although the glimmer of hope is that the "s" tags have been throwing off the RNN because it has been treating the tags as words.
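
If the tags do need to be stripped from the text before it reaches the RNN, the preprocessing step could look something like the sketch below. This assumes the "s" tags are the sentence-boundary markers written as <s> and </s>; the log does not spell them out, and it does not record whether or how the tags were actually removed. The function name is made up for this example.

```python
import re

# Assumption: the "s" tags are the <s> and </s> sentence-boundary markers.
SENTENCE_TAG = re.compile(r"</?s>", re.IGNORECASE)


def strip_sentence_tags(line):
    """Remove <s> and </s> markers and collapse the leftover whitespace."""
    without_tags = SENTENCE_TAG.sub(" ", line)
    return " ".join(without_tags.split())


if __name__ == "__main__":
    print(strip_sentence_tags("<s> hello world </s>"))
```

The pattern only matches the exact tags, so other angle-bracket tokens that may appear in a transcript are left alone.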

Plan

4/22 - Checking in. Read Logs

4/23 - Checking in. Read Logs

4/24 - Try to find a website or forum with information from real users of Sphinx 3, instead of only the official documentation.

Concerns

4/22 - Checking in. Read Logs

4/23 - Checking in. Read Logs

4/24 - My concern is that the most up to date information and the only information that is being revised and tuned is related to either PocketSphinx or Sphinx4.

4/25 - Vitali looked into the "s" tag problem and found that they do indeed make a difference, not for the better. Although the glimmer of hope is that the "s" tags have been throwing off the RNN because it has been treating the tags as words.

Week Ending May 2, 2017

Task

4/27 - Begin the outline for the Rebel's Team Report thing that is due on the last day of class.

Results

4/27 - Began skeleton of the report, as we still have time to improve our best WER experiment. Currently it just has filler text and I created the table and we began filling it out as we believe it will be.

Plan

4/27 - Write as much as we can about what we know so far, such as the usage of LDA and certain cfg settings.

Concerns

4/27 - No concerns, as I originally thought this was due next week, but now we have more time to improve the WER of our best experiment.

Week Ending May 9, 2017

Task


Results


Plan


Concerns