Difference between revisions of "Speech:Spring 2012 Chad Connors Log"

Revision as of 16:01, 5 April 2012


Week Ending February 6th, 2012

Task

Item 1: Reading over the wiki pages from previous classes and reading classmates' logs. Familiarizing myself with using the FOSS Wiki for the first time.

Item 2: Looking around Caesar to develop a feel for what is loaded on it and to become more familiar with navigating the Unix environment. Downloading Sphinx and trying to run it in Ubuntu. It looks like I might need to go with Suse; I will report back after further evaluation.

Results

After talking it over with Aaron, it seems the school servers are all on Suse, so to avoid conflicts down the road I will try it on Suse instead of Ubuntu. It also seems the version of Sphinx I have is 4.0, so I need to find 3.0.

Plan

Download Suse and look for version 3.0 of Sphinx.

Concerns

I worry that I don't know enough about the topic at hand to put it all together in the allotted time of the semester.


Week Ending February 13th, 2012

Task
  • Reformatted page based on Professor Jonas's recommendation
  • Downloading Suse and looking for Sphinx 3.0
  • Read over other students' logs
  • Installed Open Suse
  • Looked for Sphinx 3.0 on Google and on the CMU site but had no luck. Will proceed with 4.0 in the next few days. Read more about Sphinx and how it works, both in relation to speech and as a program itself. Below are some good links.

http://cmusphinx.sourceforge.net/wiki/tutorial

http://cmusphinx.sourceforge.net/wiki/tutorialconcepts

http://cmusphinx.sourceforge.net/wiki/tutorialsphinx4

Results

Suse was installed successfully and is working. I am learning its ins and outs. It looks like Windows, but its folder management system seems different from Windows, Ubuntu, and Mac OS X. I learned more about Sphinx from looking over the website and the links above, and I feel like I am starting to grasp the general concepts better than beforehand. Using Sphinx isn't so straightforward, though. As I use it and Suse more, I am sure I will become more familiar with the system and how programs are run.

Plan

I will work more with Sphinx next week to start fully learning the programs, research more into speech from last week's lessons, and thoroughly read through the CMU website again.

Concerns

My concerns are the same as they were in the beginning. I just worry about the overall scope of the project being met in such a short time frame.


Week Ending February 20th, 2012

Task
  • Redesign Wiki page for the rest of the semester
  • Install Sphinx 3
  • Skype meeting with team and Professor Jonas Friday at 3pm
  • Begin to get a grasp of Sphinx and how to perform my tasks


Results

Tuesday Results

  • Went through my wiki page and set up the rest of the semester, so it should now be more straightforward to edit and stay on top of wiki entries
  • Thanks to classmates' work, found Sphinx 3 and brought it into Suse. Noticed that Suse handles dropping files on the desktop differently than Ubuntu, OS X, and Windows, so it took a few minutes to find the file again. I expanded the package, which was a .tar file, and it seemed to expand fine. I ran an auto-load file, which was a text file but seemed to start the application install process; this took a while because there were hundreds of files in the expanded folder. I kept running into install problems around 66%, where it would say it was missing files, so I began to search for them manually, to no avail. After talking to Brice and Aaron, I learned they had figured out how to install it and put up a guide. I will follow their directions in the next few days to complete the install.

Friday Results

  • Aaron, Brice, and I met with Professor Jonas over Skype to discuss a method of attack. We discussed some more detailed Unix commands that we were not familiar with, such as grep, and how they could work. He also walked us through where to find important files and a general method for accomplishing our goals, either through detailed Unix commands or by beginning to work on Perl scripts. I don't believe any of us really knows Perl at this point, so that will be a future challenge, I am sure. We also discussed what our goal should be for the proposal, and Professor Jonas had some good recommendations for that as well
  • I also browsed through other students' logs to see what kind of progress we were making on all fronts

Sunday Results

  • Read over classmates' logs
  • Read over past class logs again to pick up some hints that they have left for us. As I learn more on the topic it starts to make sense a bit more.

Monday Results

  • Installing Sphinx but still running into problems with it. I am going by the guidelines that Aaron and Brice posted, but I am having problems getting the C compiler to install correctly. I will work on it through the night and hopefully have it up by class tomorrow.
Plan

Next week we will be setting up the proposal and finalizing the details of what each group member needs to accomplish, with dates, so it will be easier to stay on task.

Concerns

After talking to Professor Jonas about what we need to accomplish, I am just worried about having only a month to pull off everything we need to do. It should be challenging, but I look forward to giving it a whirl.

Week Ending February 27th, 2012

Task
  • Work on and finish proposal
  • Get sphinx up and running
  • Begin working on tasks of proposal
Results

Friday

  • I have not been able to get Sphinx to install on Suse. I spent many hours trying to figure out why I get an error on the last part, when it comes to the C compiler. I decided to reinstall Suse, thinking it might have been missing repositories. I still get the same error when trying to install Sphinx, saying it's missing the necessary disk. I tried mounting the Suse ISO and then the VMware Tools disk, and nothing seems to work as of now. I am going to set this aside for now and start working on the proposal.

Saturday

  • Read logs

Sunday

  • Read over classmates' logs and the proposal. It seems to be mostly done by the whole class, with a definite timeline and direction.
  • Working on my part of the proposal. I have some ideas on how it should go but will be emailing Professor Jonas to make sure I am in the right ballpark. After I hear back, I will put up my part of the proposal for Monday.

Monday

  • Did my part of the proposal. I had been working out ideas over the last few days, but after hearing back from Professor Jonas I was able to research my topic some more and complete my part.
  • Read over the logs and the current state of the proposal. I noticed there are some empty spots, so I am going to look it over and see if there is anything I could contribute to the other sections.
Plan

I will now begin to work on the tasks I have outlined in the proposal, which for next week will include looking into wav files and transcripts.

Concerns

None this week

Week Ending March 5, 2012

Task
  • Work on verifying wav files and transcripts, which we will need for train and decode
  • Research the dictionary files currently on Caesar and what I will need to do to get them up to speed
Results

Tuesday

  • Reviewing the group's proposal for the final submission tonight
  • Looking through Caesar for the dictionaries currently included to see what I will be doing with them. I found the first one under /speechtools/SphinxTrain-1.0/train1/etc: the my.dict file. The second example I found was under /speechtools/SphinxTrain-1.0/etc and is the time.dict file. I performed a cat command on both and then copied the results into a spreadsheet to compare them. They are similar, except time.dict contains 500 entries and train1.dic contains 520. The third dictionary file I found was my.dict, which is found under /speechtools/SphinxTrain-1.0/train1/etc. This file contains 508 entries. Its entries also do not contain the phonetic spelling next to the word, whereas the other files do contain it. An example is ABOUT - AH B AW T
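
The spreadsheet comparison described above could also be scripted. What follows is only a minimal sketch, not what was actually run on Caesar. It assumes every dictionary line starts with the word followed by whitespace, and the script name `compare_dicts.pl` and the sub names `read_words` and `only_in_first` are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return a hash ref whose keys are the first whitespace-separated
# field (assumed to be the word) of every non-blank line in the file.
sub read_words {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    my %words;
    while ( my $line = <$fh> ) {
        my ($word) = split ' ', $line;
        $words{$word} = 1 if defined $word;    # blank lines yield no word
    }
    close $fh;
    return \%words;
}

# Return the sorted list of words present in the first set but not the second.
sub only_in_first {
    my ( $first, $second ) = @_;
    return sort grep { !exists $second->{$_} } keys %$first;
}

if ( @ARGV != 2 ) {
    warn "Usage: perl compare_dicts.pl first.dict second.dict\n";
}
else {
    my ( $file_a, $file_b ) = @ARGV;
    my $words_a = read_words($file_a);
    my $words_b = read_words($file_b);
    printf "%s: %d unique words\n", $file_a, scalar keys %$words_a;
    printf "%s: %d unique words\n", $file_b, scalar keys %$words_b;
    print "Only in $file_a: $_\n" for only_in_first( $words_a, $words_b );
    print "Only in $file_b: $_\n" for only_in_first( $words_b, $words_a );
}
```

Run with the two dictionary files as arguments, this would report how many distinct words each file holds and which words appear in only one of them, which is the same information the spreadsheet was used for.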

Thursday

  • Read group and classmates' logs

Sunday

  • Read group and classmates' logs

Monday

  • Attempting to look up the current dictionaries on Caesar to find additional files beyond the 3 that I found. I feel silly, but I can't seem to find the /speechtools directory right now. I will look into it more to see if it's been moved to another spot. FIX: I later found it. For some reason a cd to my home directory was going back to the main area; it needed the ~ to get back to the correct directory.
  • I will be emailing Professor Jonas to clarify some questions about the dictionary, such as whether we will need the filler dictionary.
  • Looking through the data preparation group's work to see where they are on the transcripts and files, and will email them if necessary
Plan

After getting some clarification, my goal for next week will be setting up the dictionary and seeing what I need to do specifically to get it ready for the train. It seems we need to shorten the dictionary with a Perl script.

Concerns

The usual concern I have had all semester: learning what I need to and getting it done in a short time period.

Week Ending March 19, 2012

Task
  • Develop and further work on dictionary
  • Check on status of other prep work and files needed
Results

Friday

  • Read classmates' logs

Sunday

  • Read classmates' logs

Monday

  • Trying to learn the Perl scripting language, as I will need it to work on the dictionary. I found this site http://www.scribd.com/doc/7058102/How-PERL-Works and have been messing around with it on my Mac, following the basic examples it sets up so that I gain familiarity.
  • I was looking around last year's class for examples and found this script in Nick's log
    #!/usr/bin/perl
     
    if( $#ARGV != 2 )
    {
    	print "Compares the list of words in a file to the words in a dictionary and outputs the words available with pronunciations\n";	
    	print "perl GenerateDictionary WordFile DictionaryFile OutputFile\n";
    	exit;
    }
     
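    open( WORD_FILE, "<", $ARGV[0] ) or die "Cannot open $ARGV[0]: $!";
    open( DICT_FILE, "<", $ARGV[1] ) or die "Cannot open $ARGV[1]: $!";
    open( OUTP_FILE, ">", $ARGV[2] ) or die "Cannot open $ARGV[2]: $!";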
     
    @theDICT = <DICT_FILE>;
    close( DICT_FILE );
     
    while( <WORD_FILE> )
    {
    	my($line) = $_;
     
    	chomp($line);
     
    	foreach( @theDICT )
    	{
    		$tmpLine = $_;
     
    		@items = split( / /, $tmpLine );
     
    		# Compare the dictionary word (first field) with the word from the word file
    		if( $items[0] eq $line )
    		{
    			print $line."\t".$tmpLine;
    			print OUTP_FILE $tmpLine;
    		}
     
    	}
     
     
    }
     
    close( WORD_FILE );
    close( OUTP_FILE );
    exit;


    I will attempt to run it in Perl after I have a better understanding of how it works. This was found in Nick's May 3rd log. It doesn't seem to format correctly; I will redo it later tonight so it shows up the right way.

Plan

Create a more recent dictionary than the one from last semester

Concerns

Not sure I will have my part done in time

Week Ending March 26, 2012

Task
  • Get Dictionary Working
Results

Friday

  • Read other students' logs

Saturday

  • Read group and classmates' logs

Monday

  • Needed a master dictionary, that is, one with thousands and thousands of words, to check that each word in the transcripts exists in it, so that the word can be placed into a new dictionary featuring a smaller selection to help with our train and decode. I found a large file on the CMU site at https://cmusphinx.svn.sourceforge.net/svnroot/cmusphinx/trunk/cmudict/cmudict.0.6d and placed it on Caesar under the directory caesar:/mnt/main/corpus/dist per Professor Jonas's recommendation (I believe it should be /dict)
  • I have created a Perl script on my local hard drive based on Nick's script from last year. I have also loaded cmudict.0.6d and copied transcripts I found on Caesar to a text file. I then attempted to run the Perl script. It seems to run, as it gives me the printout of what is defined in the script, but it does not give me the mutual words, which it should do per its design. I have been reading up on Perl a lot lately, but it's hard to learn a language on the fly, so I am doing my best with it given the circumstances. I will continue to work through the night to try to get it done for tomorrow's class
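
Nick's script compares each whole line of the word file against the first field of a dictionary entry, so the word file needs one word per line rather than raw transcript text. A minimal sketch of producing such a list follows. It rests on assumptions that are not confirmed in this log: that transcript lines are whitespace-separated words, optionally wrapped in `<s>` and `</s>` and ending with an utterance ID in parentheses, and that dictionary words are upper case. The script name `transcript_words.pl` and the sub name `transcript_words` are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Collect the unique words from transcript lines.
# Assumed line format: "<s> some words </s> (utterance_id)"; the markers
# and the trailing ID are optional and are dropped if present.
sub transcript_words {
    my (@lines) = @_;
    my %seen;
    for my $line (@lines) {
        $line =~ s/\([^)]*\)\s*$//;    # drop a trailing "(utterance_id)"
        for my $token ( split ' ', $line ) {
            next if $token eq '<s>' or $token eq '</s>';
            $seen{ uc $token } = 1;    # assumption: dictionary words are upper case
        }
    }
    return sort keys %seen;
}

if ( @ARGV != 1 ) {
    warn "Usage: perl transcript_words.pl transcript.txt\n";
}
else {
    open my $fh, '<', $ARGV[0] or die "Cannot open $ARGV[0]: $!";
    my @lines = <$fh>;
    close $fh;
    print "$_\n" for transcript_words(@lines);    # one word per line
}
```

Redirecting the output to a file (for example with `> wordfile` in the shell) would give a word list in the shape the dictionary script expects.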
Plan

Work on the dictionary script until it works properly

Concerns

Finishing

Week Ending April 2, 2012

Task
  • Create Dictionary
  • Create directions for data preparation
Results
  • I have been working all week on getting the script that Nick created last year up and running. I have been very unsuccessful, so I began to look more into Perl scripting, as it is completely new to me. I found some useful sites with nice overviews, and looking at the script, it looks like it should work as intended, BUT it has not for me. I have run the script to create the new dictionary, and it just prints the first line of the script, which describes what it does. I have tried removing this section and seeing how the script handles it. It essentially does nothing: it seems to run but does not output the text into the output file as it is supposed to. I will continue to work on this all night and morning if I must to get it right, but I have been extremely flustered and overwhelmed all week trying to get it working.
  • I continued to work all week and all through the night until 5am today, and I'm still not having any luck. It always seems to be the same problem of not finding the dictionary or the transcript file. I have been reading a lot of ebooks and online tutorials over the last few weeks to get familiar with Perl, and I created a bunch of unrelated Perl scripts to help me understand why I was getting an error on Nick's script. After running into the same problem, I began looking into some Perl and Unix help forums online and got some feedback from users there, along with another script idea.
    #!/usr/bin/perl
    use strict;
    use warnings;
     
    @ARGV >= 2 or die "Insufficient arguments: Need word file, and Dict file names";
    my ($wordfile, $dictfile) = @ARGV;
     
    open my $d, "<", $dictfile or die "Cannot open $dictfile: $!";
    open my $w, "<", $wordfile or die "Cannot open $wordfile: $!";
     
    my %dict = map split( ' ', $_, 2 ), <$d>;
    close $d;
     
    while ( <$w> ) {
      for my $word ( split ) {
        if ( exists $dict{ $word } ) {
           print "$word: $dict{ $word }";
           next;
        }
        print "$word is not in the dictionary\n";
      }
    }
    close $w;


  • Even using this script I am running into the same problem of it not finding the other files, per the die command: Insufficient arguments: Need word file, and Dict file names at create.pl line 5. Line 5 contains the @ARGV array and the die command, which keeps coming up. The files are named dictfile and wordfile, just as the script says. I am now heading to class and hoping to have better luck today
Plan

To finish

Concerns

Finishing

Week Ending April 9, 2012

Task
  • Finish working on Dictionary
  • Begin working on Train and Decode
Results
    Thursday
  • After spending countless hours last week, I was baffled as to why my script was not working and tried a bunch of things to get it to work, to no avail. Luckily Ted took a look at it and noticed I wasn't designating the other files in the command field in the terminal. Once I did that, I finally got output. The downside is it's not outputting to another file, but once it does I should be in good shape. I have also begun working on train and decode, setting up train6 on Caesar. I have been following the guidelines outlined in Summer 2011, per Bryce's recommendation. So far I can get through most of the train but am getting some error messages toward the end. We are meeting as a group tomorrow over Skype, so hopefully I will sort it out by then.
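
The fix Ted spotted (naming the input files on the command line) and the remaining gap (writing the matches to a file) can be sketched together. This is only an illustration under assumptions, not the script actually in use: the script name `create.pl` comes from the error message quoted in the April 2 entry, while the optional third argument, the file names in the usage line, and the sub name `lookup_words` are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Look up every word from the word lines in the dictionary lines and
# return the matching entries as "WORD PRONUNCIATION" strings.
sub lookup_words {
    my ( $word_lines, $dict_lines ) = @_;
    my %dict;
    for my $entry (@$dict_lines) {
        my ( $word, $pron ) = split ' ', $entry, 2;
        $dict{$word} = $pron if defined $pron;    # skip blank or one-field lines
    }
    my @found;
    for my $line (@$word_lines) {
        for my $word ( split ' ', $line ) {
            push @found, "$word $dict{$word}" if exists $dict{$word};
        }
    }
    return @found;
}

# Perl fills @ARGV from whatever follows the script name in the terminal,
# so with no file names typed there, @ARGV is empty.
if ( @ARGV < 2 ) {
    warn "Usage: perl create.pl wordfile dictfile [outfile]\n";
}
else {
    my ( $wordfile, $dictfile, $outfile ) = @ARGV;

    open my $w, '<', $wordfile or die "Cannot open $wordfile: $!";
    open my $d, '<', $dictfile or die "Cannot open $dictfile: $!";
    my @matches = lookup_words( [<$w>], [<$d>] );
    close $w;
    close $d;

    # Write to the output file when one is named, otherwise to the terminal.
    my $out;
    if ( defined $outfile ) {
        open $out, '>', $outfile or die "Cannot open $outfile: $!";
    }
    else {
        $out = \*STDOUT;
    }
    print {$out} @matches;
}
```

Typing only `perl create.pl` leaves `@ARGV` empty and triggers the usage message, which matches the symptom described in the earlier entries. Typing `perl create.pl wordfile dictfile newdict` would, under the assumptions above, write the matching entries to `newdict`.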
Plan

TBD

Concerns

TBD

Week Ending April 16, 2012

Task

TBD

Results

TBD

Plan

TBD

Concerns

TBD

Week Ending April 23, 2012

Task

TBD

Results

TBD

Plan

TBD

Concerns

TBD

Week Ending April 30, 2012

Task

TBD

Results

TBD

Plan

TBD

Concerns

TBD

Week Ending May 7, 2012

Task

TBD

Results

TBD

Plan

TBD

Concerns

TBD