Speech:Spring 2012 Chad Connors Log



Week Ending February 6th, 2012

 * Task:

Item 1: Read over the wiki pages from previous classes and my classmates' logs. Familiarized myself with the FOSS Wiki, which I am using for the first time.

Item 2: Looked around Caesar to get a feel for what is loaded on it and to become more familiar with navigating the Unix environment. Downloaded Sphinx and tried to run it in Ubuntu. It looks like I might need to go with Suse; I will report back after further evaluation.


 * Results:

After talking it over with Aaron, it seems the school servers are all on Suse, so to avoid conflicts down the road I will try it there instead of on Ubuntu. It also seems the version of Sphinx I have is 4.0, so I need to find 3.0.


 * Plan:

Download Suse and look for version 3.0 of Sphinx.


 * Concerns:

I worry that I don't know enough about the topic at hand to put it all together in the time allocated for the semester.

Week Ending February 13th, 2012

 * Task:

 * Reformatted page based on Professor Jonas's recommendation.
 * Downloading Suse and looking for Sphinx 3.0.
 * Read over other students' logs.
 * Installed Open Suse.
 * Looked for Sphinx 3.0 on Google and on the CMU site without luck. Will proceed with 4.0 in the next few days.
 * Read more about Sphinx and how it works, both in relation to speech and as a program itself. Below are some good links.

http://cmusphinx.sourceforge.net/wiki/tutorial

http://cmusphinx.sourceforge.net/wiki/tutorialconcepts

http://cmusphinx.sourceforge.net/wiki/tutorialsphinx4


 * Results:

Suse was installed successfully and is working. I am learning its ins and outs. It looks like Windows, but its folder management seems different from Windows, Ubuntu and Mac OS X. I learned more about Sphinx from looking over the website and the links above, and I feel like I am starting to grasp the general concepts better than beforehand. Using Sphinx isn't so straightforward, though. As I use it and Suse more, I am sure I will become more familiar with the system and how programs are run.


 * Plan:

I will work more with Sphinx next week to start learning the programs properly, research speech further based on last week's lessons, and thoroughly read through the CMU website again.


 * Concerns:

My concerns are the same as they were in the beginning. I just worry about meeting the overall scope of the project in such a short time frame.

Week Ending February 20th, 2012

 * Task:

 * Redesign Wiki page for the rest of the semester
 * Install Sphinx 3
 * Skype meeting with team and Professor Jonas Friday at 3pm
 * Begin to get a grasp of Sphinx and how to perform my tasks


 * Results:

Tuesday Results
 * Went through my Wiki page and set up the rest of the semester, so it should now be more straightforward to edit and stay on top of wiki entries.
 * Thanks to classmates' work, found Sphinx 3 and brought it into Suse. Noticed that Suse handles desktop dropping differently than Ubuntu, OS X, and Windows, so it took a few minutes to find the file again. I expanded the package, which was a .tar file, and it seemed to expand fine. I ran an auto-load file, which was a text file but seemed to start the application install process. This took a while because there were hundreds of files in the expanded folder. I kept running into install problems around 66%, where it would say it was missing files, so I began to search for them manually, to no avail. After talking to Brice and Aaron, I learned they had figured out how to install it and put up a guide. I will follow their directions in the next few days to complete the install.

Friday Results
 * Aaron, Brice and I met with Professor Jonas over Skype to discuss a method of attack. We discussed some more detailed Unix commands that we were not familiar with, such as grep, and how they could work. He also walked us through where to find important files and a general method for accomplishing our goals, either through detailed Unix commands or by beginning to work on Perl scripts. I don't believe any of us really know Perl at this point, so that will be a future challenge, I am sure. We also discussed what our goal should be for the proposal, and Professor Jonas had some good recommendations for that as well.
 * I also browsed through other students' logs to see what kind of progress we were making on all fronts.

Sunday Results
 * Read over classmates' logs.
 * Read over past class logs again to pick up some hints that they have left for us. As I learn more on the topic, it starts to make a bit more sense.

Monday Results
 * Still installing Sphinx and running into problems with it. I am following the guidelines that Aaron and Brice posted, but I am having problems getting the C compiler to install correctly. I will work on it through the night and hopefully have it up by class tomorrow.
 * Plan:

Next week we will be setting up the proposal and finalizing the details of what each group member needs to accomplish, with dates, so it will be easier to stay on task.


 * Concerns:

After talking to Professor Jonas about what we need to accomplish, I am just worried about having only a month to pull off everything we need to do. It should be challenging, but I look forward to giving it a whirl.

Week Ending February 27th, 2012

 * Task:

 * Work on and finish proposal
 * Get Sphinx up and running
 * Begin working on tasks of proposal


 * Results:

Friday
 * I have not been able to get Sphinx to install on Suse. I spent many hours trying to figure out why I get an error on the last part, when it comes to the C compiler. I decided to reinstall Suse, thinking it might have been missing repositories. I still get the same error when trying to install Sphinx, saying it's missing the necessary disk. I tried mounting the ISO of Suse and then tried mounting the VMware Tools disk, and nothing seems to work as of now. I am going to set this aside for now and start working on the proposal.

Saturday
 * Read logs.

Sunday
 * Read over classmates' logs and the proposal. It seems to be mostly done by the whole class, with a definite timeline and direction.
 * Working on my part of the proposal. I have some ideas on how it should go but will be emailing Professor Jonas to make sure I am in the right ballpark. After I hear back, I will be putting up my part of the proposal for Monday.

Monday
 * Did my part of the proposal. I had been working out ideas the last few days, but after hearing back from Professor Jonas I was able to research my topic some more and complete my part.
 * Read over logs and the current state of the proposal. I noticed there are some empty spots, so I am going to look it over and see if there is anything I could contribute in the other sections.


 * Plan:

I will now begin to work on the tasks I have outlined in the proposal, which for next week will include looking into wav files and transcripts.


 * Concerns:

None this week

Week Ending March 5, 2012
 * Task:
 * Work on verifying wav files and transcripts, which we will need for train and decode
 * Research into the dictionary files currently on Caesar and what I will need to do to get them up to pace


 * Results:

Tuesday
 * Reviewing the proposal from the group for the final submission tonight.
 * Looking through Caesar for the dictionaries currently included, to see what I will be doing with them. I found the first one under /speechtools/SphinxTrain-1.0/train1/etc - the my.dict file. The second example I found was under /speechtools/SphinxTrain-1.0/etc and is the time.dict file. I performed a cat command on both and then copied the results into a spreadsheet to compare them. They are similar, except time.dict contains 500 entries and train1.dic contains 520. The third dictionary file I found was my.dict, which is found under /speechtools/SphinxTrain-1.0/train1/etc. This file contains 508 entries. Its entries also do not contain the phonetic spelling next to the word, whereas the other files do contain it. An example is ABOUT - AH B AW T

Thursday
 * Read group and classmates' logs.

Sunday
 * Read group and classmates' logs.

Monday
 * Attempting to look through the current dictionaries on Caesar for additional files beyond the 3 that I found. I feel silly, but I can't seem to find the /speechtools directory right now. I will look into it more to see if it's been moved to another spot. FIX - I later found it. For some reason my home directory on a cd was going back to the main area; it needed the ~ to get back to the correct directory.
 * I will be emailing Professor Jonas to clarify some questions about the dictionary, such as whether we will need the filler dictionary.
 * Looking through the data preparation group's work to see where they are on the transcripts and files, and will email them if necessary.
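The spreadsheet comparison above could also be scripted. The following is a minimal Python sketch, not what was actually run on Caesar; it assumes each dictionary line is a word followed by optional phones separated by whitespace, and the function names and sample entries are made up for illustration.

```python
def load_entries(lines):
    """Map each word to its phone string (empty when the line has no phones)."""
    entries = {}
    for line in lines:
        parts = line.split()
        if parts:
            entries[parts[0]] = " ".join(parts[1:])
    return entries


def compare(first, second):
    """Return the words found only in the first and only in the second dictionary."""
    only_first = sorted(first.keys() - second.keys())
    only_second = sorted(second.keys() - first.keys())
    return only_first, only_second


# Hypothetical sample lines standing in for the real dictionary files.
time_dict = load_entries(["ABOUT AH B AW T", "AFTER AE F T ER"])
train_dict = load_entries(["ABOUT AH B AW T", "AGAIN AH G EH N"])
word_list = load_entries(["ABOUT", "AFTER"])  # a file without phonetic spellings

only_time, only_train = compare(time_dict, train_dict)
no_phones = [word for word, phones in word_list.items() if not phones]

print(len(time_dict), len(train_dict), only_time, only_train, no_phones)
```

With the real files, the lines would come from `open(path)` instead of the in-line lists.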


 * Plan:

After getting some clarification, my goal for next week will be setting up the dictionary and seeing what I need to do specifically to get it ready for the train. It seems we need to shorten the dictionary with a Perl script.


 * Concerns:

The usual concern I have had all semester: learning what I need to and getting it done in a short time period.

Week Ending March 19, 2012
 * Task:
 * Develop and further work on dictionary
 * Check on status of other prep work and files needed


 * Results:

Friday
 * Read classmates' logs.

Sunday
 * Read classmates' logs.

Monday
 * Trying to learn the Perl scripting language, as I will need it to work on the dictionary. I found this site, http://www.scribd.com/doc/7058102/How-PERL-Works, and have been messing around with it on my Mac, following its basic examples to gain familiarity.
 * I was looking through last year's class for examples and found a script in Nick's May 3rd log. I will attempt to run it in Perl after I have a better understanding of how it works. It doesn't seem to format correctly here; I will redo it later tonight so it shows up the right way.


 * Plan:

Create a more recent dictionary than the one from last semester


 * Concerns:

Not sure I will have my part done in time

Week Ending March 26, 2012
 * Task:
 * Get Dictionary Working


 * Results:

Friday
 * Read other students' logs.

Saturday
 * Read group and classmates' logs.

Monday
 * Needed a master dictionary, that is, one with thousands and thousands of words, to check the transcripts against. Each transcript word that exists in the master dictionary can then be placed into the new dictionary, which features a smaller selection to help with our train and decode. I found a large file on the CMU site under https://cmusphinx.svn.sourceforge.net/svnroot/cmusphinx/trunk/cmudict/cmudict.0.6d  I placed this file on Caesar under the directory caesar:/mnt/main/corpus/dist per Professor Jonas's recommendation (believe it should be /dict).
 * I have created a Perl script on my local hard drive based on Nick's script from last year. I have also loaded the cmudict.06 and copied transcripts I found on Caesar to a text file. I then attempted to run the Perl script. It seems to run, as it prints out what is defined in the script, but it does not give me the mutual words, which it should do per its design. I have been reading up on Perl a lot lately, but it's hard to learn a language on the fly, so I am doing my best with it given the circumstances. I will continue to work through the night to try to get it working for tomorrow's class.
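The idea behind the dictionary script, keeping only the master-dictionary entries for words that occur in the transcripts, can be sketched as follows. This is a Python illustration and not Nick's Perl script; it assumes the master file uses ";;;" comment lines and "WORD(2)" style alternate pronunciations, and the function names, punctuation handling and sample lines are hypothetical.

```python
import re


def parse_master(lines):
    """Map each base word to its list of pronunciations."""
    master = {}
    for line in lines:
        if not line.strip() or line.startswith(";;;"):
            continue  # skip blank lines and comment lines
        word, _, phones = line.strip().partition(" ")
        base = re.sub(r"\(\d+\)$", "", word)  # THE(2) -> THE
        master.setdefault(base, []).append(phones.strip())
    return master


def transcript_words(lines):
    """Collect the unique upper-cased words from transcript lines."""
    words = set()
    for line in lines:
        for token in line.upper().split():
            token = token.strip('.,?!"')
            if token:
                words.add(token)
    return words


def build_dictionary(words, master):
    """Split the transcript words into those found in the master and those missing."""
    found = {word: master[word] for word in sorted(words) if word in master}
    missing = sorted(word for word in words if word not in master)
    return found, missing


# Hypothetical sample data standing in for the master dictionary and transcripts.
master = parse_master([";;; comment", "ABOUT  AH0 B AW1 T", "THE  DH AH0", "THE(2)  DH IY0"])
words = transcript_words(["about the sphinx"])
found, missing = build_dictionary(words, master)

for word, pronunciations in found.items():
    for phones in pronunciations:
        print(word, phones)
print("not in master dictionary:", missing)
```

Reporting the missing words separately makes it easy to see which transcript tokens still need a pronunciation or a cleanup step.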
 * Plan:

Work on the dictionary script until it works properly


 * Concerns:

Finishing

Week Ending April 2, 2012
 * Task:
 * Create Dictionary
 * Create directions for data preparation


 * Results:

 * I have been working all week on getting the script that Nick created last year up and running. I have been very unsuccessful, so I began to look more into Perl scripting, as it is completely new to me. I found some useful sites with nice overviews, and looking at the script, it looks like it should work as intended, BUT it has not for me. I have run the script to create the new dictionary and it just prints the first line of the script, which describes what it does. I have tried removing this section and seeing how the script handles it. It does essentially nothing: it seems to run but does not output the text into the output file as it is supposed to. I will continue to work on this all night and morning if I must to get it right, but I have been extremely flustered and overwhelmed all week trying to get it working.
 * I continued to work all week and all through the night until 5am today and I'm still not having any luck. It always seems to be the same problem of not finding the dictionary or the transcript file. I have been reading a lot of ebooks and online tutorials over the last few weeks to get familiar with Perl, and I created a bunch of unrelated Perl scripts to help me understand why I was getting an error on Nick's script. After running into the same problem, I began looking into different Perl and Unix help forums online and got some feedback from users there, along with another script idea.
 * Even using this script I am running into the same problem of it not finding the other files, per the die command: Insufficient arguments: Need word file, and Dict file names at create.pl line 5. Line 5 contains the @ARGV array and the die command, which keeps coming up. The files are named dictfile and wordfile, just as the script says. I am now heading to class and hoping to have some better luck today.


 * Plan:

To finish


 * Concerns:

Finishing

Week Ending April 9, 2012
 * Task:
 * Finish working on Dictionary
 * Begin working on Train and Decode

 * Results:

Thursday
 * After spending countless hours last week, I was baffled as to why my script was not working and tried a bunch of things to get it to work, to no avail. Luckily Ted took a look at it and noticed I wasn't designating the other files on the command line in the terminal. Once I did that, I finally got output. The downside is that it's not outputting to another file, but once it does I should be in good shape. I have also begun working on Train and Decode, setting up train6 on Caesar. I have been following the guidelines outlined in the Summer 2011 page under Bryce's recommendation. So far I can get through most of the train but am getting some error messages toward the end. We are meeting as a group tomorrow over Skype, so hopefully I will sort it out by then.

Friday
 * Had a Skype meeting with the team and walked through the train and decode. Professor Jonas helped us pick up a few pointers on things we had been doing wrong, such as not copying enough .sph files to the item. After an hour and a half I had to leave and was unable to finish the meeting. Sadly, since I had already gotten past this point on my own the day before, I was unable to find out about the error I was getting involving line 39 when running the scripts. I will look into it more tomorrow to figure out what that was about.

Sunday
 * Working on fixing my dictionary script so that it writes to a correct output file.

Monday
 * Working on updating my part of the wiki on Dictionary setup.
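The "Insufficient arguments" message came from a guard that stops the script when the two file names are not passed on the command line. A rough Python analogue of that kind of check is sketched below; it is not the Perl script itself, and the function name and message wording are illustrative only.

```python
def parse_args(argv):
    """Return (word_file, dict_file), or stop with a usage message if either is missing."""
    if len(argv) < 2:
        raise SystemExit("Insufficient arguments: need word file and dict file names")
    return argv[0], argv[1]


# In a real script argv would be sys.argv[1:], i.e. everything typed after the script name.
word_file, dict_file = parse_args(["wordfile", "dictfile"])
print(word_file, dict_file)
```

Having files named wordfile and dictfile in the same directory is not enough for a check like this; the names have to be typed after the script name when it is run.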


 * Plan:

Move forward and begin test and decode


 * Concerns:

Time

Week Ending April 16, 2012
 * Task:
 * Finalize dictionary
 * Begin testing and training in my new group


 * Results:

Thursday
 * After a recommendation from Professor Jonas in class on Tuesday, I was finally able to get my dictionary script to write its contents to an output file by piping it through the Unix tee command (| tee). It has worked excellently. I am now just working on fixing the sed command that he recommended, so that I can remove the items that come up as not in the dictionary, which are mostly the stray "s" tokens in the transcripts. After weeks and a lot of hours, I am happy that this is finally coming together.

Friday
 * Had a Skype meeting with Damir and Michael to discuss our method of attack on the train and decode. We are all running through a train and decode in the next few days and will then meet up on Monday to discuss our progress.
 * Damir asked me to draft a write-up of how I did the train, as I have gotten further than they were able to in the last few weeks. I will base it on the Summer 2011 write-up, with the changes that I made to it.
 * I emailed Professor Jonas about the sed command and received a message back, which should help finalize my dictionary creation script. I will work on it tomorrow and hopefully finish it up completely.

Saturday
 * I drafted up what I did during the train when I ran it last week, so that Michael and Damir would have a better idea of how to do it. I basically copied the page that is on the wiki and added notes about specific things I did differently, how things should be renamed, etc.
 * After hearing back from Professor Jonas, I worked the sed command into the dictionary script. It works great at getting rid of the stray "s" in the script's output. I am going to see if there is a way to remove the file names from the dictionary so each .sph file doesn't show up as not in the dictionary. I will work on it more tomorrow.

Monday
 * I volunteered to do the solution section of the poster, so I drafted it up and put the section on the wiki page.
 * I will check in with my group to see how we are doing on the train and decode.
 * I am going through the dictionary to check for missing words and clean up file names that are included.
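The two clean-up steps described above, dropping stray "s" tokens and file names before the lookup and writing the result to both the screen and a file, can be sketched in Python. This is only an illustration, not the sed command or the tee pipeline actually used; it assumes the unwanted tokens are a lone "s" and anything ending in ".sph", and the function names and sample tokens are hypothetical.

```python
import io
import sys


def clean_tokens(tokens):
    """Drop stray "s" tokens and .sph file names (assumed patterns)."""
    cleaned = []
    for token in tokens:
        lowered = token.lower()
        if lowered == "s" or lowered.endswith(".sph"):
            continue
        cleaned.append(token)
    return cleaned


def tee(lines, *streams):
    """Write every line to each stream, similar in spirit to the Unix tee command."""
    for line in lines:
        for stream in streams:
            stream.write(line + "\n")


# Hypothetical tokens as they might come out of a transcript line.
tokens = clean_tokens(["HELLO", "s", "sample01.sph", "WORLD"])
saved = io.StringIO()  # stands in for the output file
tee(tokens, sys.stdout, saved)
```

In practice the second stream would be a file opened for writing, so the same lines appear in the terminal and in the new dictionary file.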
 * Plan:

Work on getting train and decode up by the end of the semester


 * Concerns:

Just whether we will finish in time. The usual.

Week Ending April 23, 2012

 * Task:
 * Train and Decode

 * Results:

Tuesday
 * Met with my group and performed some train exercises on Traubadix. I had gotten further in the past than they had, so I ran it on my machine as they observed. I typed up the actual commands that we entered and will be putting them on the group wiki page now.

Thursday
 * Read logs and worked on fine-tuning my dictionary work.

Friday
 * Read logs and worked on fine-tuning my dictionary work for the second day.

Monday
 * Updated group Wiki page and worked on running some more tests.


 * Plan:

Keep working on testing


 * Concerns:

Finishing

Week Ending April 30, 2012

 * Task:
 * Train and Decode


 * Results:

Tuesday, Thursday, Saturday, Monday
 * We were split up into new groups. I am part of the new train and decode group. We are currently continuing the research of Group 1 and trying to figure out why we have encountered errors.
 * Read through student logs and the group page to see where we are.
 * Read through logs and checked on group status to see where we are.
 * Checked on the group's progress. We seem to be stuck at this point. I am not great at Perl scripting, but I am taking a look to see if I can help at all to further our tests.
 * Read emails from Aaron about the other teams' progress; seem to be behind, so looking into that to see if I can help out with anything.


 * Plan:

Keep working


 * Concerns:

Time

Week Ending May 7, 2012

 * Task:
 * Training and Decoding


 * Results:

Tuesday, Wednesday, Sunday, Monday
 * We worked on training and figured out some ways to get rid of the dreaded line 48 error message by changing the RunAll.pl script's variable inputs.
 * Read logs and looked over group progress.
 * Read student logs.
 * It looks like the group was able to do a successful train. I wish I had been more help to them in the last week. I emailed John to see if there is anything I can help with on the wiki page or any loose ends.
 * Review for the final reviews.


 * Plan:

Give our findings to future groups. Good luck guys!


 * Concerns:

None