Speech:Spring 2011 Nick Log



Week Ending March 8th, 2011
 * Task:

Installing speech software on the Automatix, Methusalix, and Verleihnix speech servers.

 * Results:

The speech software, including SphinxTrain, the CMU Toolkit, and Sphinx, was installed successfully.

 * Plan:

Become familiar with the speech tools in order to proceed with the project.

 * Concerns:

None at this time.

Week Ending March 22nd, 2011
 * Task:

Learn about training and decoding using the Sphinx toolkit. Add information to the Capstone proposal.

 * Results:

I added a couple of paragraphs to the 'Decode on Development' section, which is in Section 3 and critical to the tasks we are working on now. I also learned a bit about editing items on a wiki page using DokuWiki, which should help with updating the status of the project.

 * Plan:

I will be working on scripts to streamline some of the redundant processes involved in using the Sphinx toolkit, making things more specific to our project. I also plan to set up a Sphinx sandbox in a virtual environment so that I can do localized testing without risking the Caesar system.

 * Concerns:

I was unable to access the Caesar server, and there was no notification as to why it was offline. Repeated attempts, even while on the UNH campus network, showed the system as offline even though it was pingable.

Week Ending March 29th, 2011
 * Task:

Update the wiki with my past progress reports. Use the CMU Toolkit to set up a language model.

 * Results:

I reviewed the CMU Toolkit website about building a language model. A language model can be built either from a vocabulary data set and an ID 3-gram file, or from a text file, which can be used to produce both the vocabulary data and the ID 3-grams. I tested the tools on the change_log.txt file because it was conveniently located.

First, from the directory containing the text file, the text file is converted to a word frequency file (.wfreq) with the following command:

../bin/text2wfreq < change_log.txt > change_log.wfreq

Looking at the file with nano, it contains a list of every word from the file with a count indicating how many occurrences were found.

Next, a vocabulary file is created from the word frequency file with the following command:

../bin/wfreq2vocab < change_log.wfreq > change_log.vocab

This file contains a list of every unique word from the word frequency file.

To get an ID 3-gram file, the vocabulary file or the text file is necessary; looking back, I believe the text file would have been better. I used the vocabulary file that was created to make an ID 3-gram file:

../bin/text2idngram -vocab change_log.vocab > change_log.idngram

This seemed to hang on me for more than 10 minutes, so I canceled it; the screen simply displayed 'Allocating memory for the n-gram buffer'. It would be better to run this as a background process to see if it finishes. Once that file is complete, the language model can be built with the following command:

../bin/idngram2lm -idngram change_log.idngram -vocab change_log.vocab -binary change_log.binlm

As you can see, it requires the .idngram file and the .vocab file. I also updated the wiki with my previous progress reports.
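As a toy illustration of the same pipeline, the three preprocessing stages can be sketched in a few lines of Python. This only mimics what the toolkit programs compute on a small whitespace-tokenized string; it is not the toolkit itself:

```python
from collections import Counter

def text2wfreq(text):
    """Count occurrences of each word, like the toolkit's text2wfreq."""
    return Counter(text.split())

def wfreq2vocab(wfreq):
    """List every unique word, like wfreq2vocab."""
    return sorted(wfreq)

def text2idngram(text, vocab, n=3):
    """Replace each word with its vocabulary ID and count the ID n-grams,
    like text2idngram -vocab ... -n 3."""
    ids = [vocab.index(w) for w in text.split()]
    return Counter(zip(*(ids[i:] for i in range(n))))

wfreq = text2wfreq("the cat sat on the mat")   # word -> occurrence count
vocab = wfreq2vocab(wfreq)                     # every unique word, sorted
idngrams = text2idngram("the cat sat on the mat", vocab)
```

A language model would then be estimated from those ID n-gram counts, which is the job of idngram2lm.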

 * Plan:

I want to discuss what would be a valid source of text for creating the language models. I also need an updated vocabulary file.

 * Concerns:

None at this time.

Week Ending April 5th, 2011
 * Task:

Thursday: Updated the wiki with last week's log. Reviewed the contents of the transcript file for parsing.

Friday: COMMENT: C.Reekie, Admin 16:40, 1 April 2011 (UTC) Fixed your image link (must include .jpg or .png after image files)

Thank you. Veonix84

The server glitched a bit Friday evening. Read through some of my classmates' logs.

Saturday: Copied the transcript file to verleihnix, so I have a local copy to work with when writing a Perl script to format it.

Sunday: Not much to report today. Discussed with Brian ways to parse the transcript file and do decoding tests.

Monday: Wrote a Perl script to remove all the non-word characters from the transcript. The file ms98_icsi_word.text contains transcripts from Switchboard. The server was responding slowly, so I used SFTP via FileZilla to copy the transcript file to my local Linux machine.

 * Results:

Using examples from the Internet, I was able to clean up the transcripts so they can be fed into the CMU Toolkit to create a language model based on the Switchboard transcripts. The script is currently located where the transcript file is, and on my local machine. Below is the code:
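The script itself is not reproduced on this page. As a rough sketch of the kind of cleanup it describes, assuming Switchboard ms98-style lines of the form "utterance-id start-time end-time words..." (the exact field layout and noise-tag set are assumptions here):

```python
import re

def clean_transcript_line(line):
    """Drop the utterance id and timing fields, strip bracketed noise
    tags and remaining non-word characters, and lowercase the words.
    Assumes ms98-style lines: id, start time, end time, then words."""
    parts = line.split()
    words = parts[3:] if len(parts) > 3 else parts  # skip id + times
    text = " ".join(words)
    text = re.sub(r"\[[^\]]*\]", " ", text)   # [noise], [laughter], ...
    text = re.sub(r"[^a-zA-Z' ]", " ", text)  # keep letters and apostrophes
    return " ".join(text.lower().split())

print(clean_transcript_line("sw2005A-ms98-a-0002 5.75 7.60 Yeah [noise] I know."))
# -> yeah i know
```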

COMMENT: C.Reekie, Admin 03:34, 5 April 2011 (UTC) Added syntax highlighting

There are a few spaces remaining at the front of each line, but I do not think that will impact the creation of the language model.

 * Plan:

Create an ID 3-gram file from the script's output. Use the ID 3-gram file along with a vocab file to create the language model.

 * Concerns:

None

Week Ending April 12th, 2011
 * Task:

Thursday: Reviewed methods of mounting an SFTP folder locally in Linux.

Friday: Trouble logging in. N/S

Saturday: Tried using curl to set up an SFTP connection to Caesar, but it failed both when connecting to 192.168.10.1 and when using the name caesar. Uploaded sshfs to Caesar and to my host machine.

COMMENT: Mike-jonas 20:25, 10 April 2011 (UTC) I just fixed Caesar's name... note that we misspelled it in the network configuration by having it be caesEr when it is supposed to be caesAr... oops... this may not affect what you were trying to do. Question: what were you trying to do? Mount a file system using some sort of SFTP daemon? Is that like NFS? We definitely will need some sort of network file system on caesar so we can get at all the directories from the other 9 machines.

I could not configure it. If someone is more familiar with this process, please give it a try. It is on the 192.168.10.10 machine, under /media.

Sunday: Copied the parsing script to verleihnix for testing.

These toolkit commands were run from the /media folder to convert the words from the transcriptions into vocabulary files.

Create a word frequency file:

~/speechtools/CMU-Cam_Toolkit_v2/bin/text2wfreq < trans.text > trans.wfreq

Create a vocab file:

~/speechtools/CMU-Cam_Toolkit_v2/bin/wfreq2vocab < trans.wfreq > trans.vocab

Create an ID 3-gram file:

~/speechtools/CMU-Cam_Toolkit_v2/bin/text2idngram -vocab trans.vocab -n 3 < trans.text > trans.idngram

This seemed to hang on me before, even with a small file, so I let the process run for a while to see if it would finish in a reasonable amount of time.

A quick Google search for text2idngram showed that I had been using the syntax wrong. I also decided to specify an n-gram size of 3. With the corrected syntax it ran very quickly; here is the output.

Format this if you can, oh wiki overlord.

COMMENT: C.Reekie, Admin 03:56, 12 April 2011 (UTC) You're telling me; you're welcome!

Finally, the language model. I decided to create the language model in both ARPA and binary formats. We can evaluate the difference later.

~/speechtools/CMU-Cam_Toolkit_v2/bin/idngram2lm -idngram trans.idngram -vocab trans.vocab -arpa trans.arpa

~/speechtools/CMU-Cam_Toolkit_v2/bin/idngram2lm -idngram trans.idngram -vocab trans.vocab -binary trans.binlm
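For reference, the ARPA format that the -arpa option writes is plain text: a \data\ header with n-gram counts, then one line per n-gram giving its log10 probability, the n-gram itself, and an optional back-off weight. A toy Python sketch that emits just a maximum-likelihood unigram section (illustrative only; idngram2lm additionally applies discounting and back-off):

```python
import math
from collections import Counter

def unigram_arpa(text):
    """Emit a minimal ARPA-style model: unigram counts turned into
    maximum-likelihood log10 probabilities (no smoothing, no back-off)."""
    counts = Counter(text.split())
    total = sum(counts.values())
    lines = ["\\data\\", f"ngram 1={len(counts)}", "", "\\1-grams:"]
    for word in sorted(counts):
        lines.append(f"{math.log10(counts[word] / total):.4f} {word}")
    lines.append("\\end\\")
    return "\n".join(lines)

print(unigram_arpa("the cat sat on the mat"))
```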

Create a trans folder to contain all the items: mkdir trans

Make a tarball of the files: tar -cpf trans.tar trans.*

Send it to Caesar under the media folder: sftp 192.168.10.1, then put trans.tar /media

The files are on Caesar under /media/data/trans.

The language model files are trans.arpa and trans.binlm.

Monday:


 * Results:

Finished creating the language model for training.


 * Plan:

Intend to write a script to do all steps shown above.


 * Concerns:

None

Week Ending April 19th, 2011
 * Task:

Create a single script that will generate a language model.

 * Results:

Friday: Read through my teammates' logs. Very interested in Brian's find of Torque.

Saturday: Nothing to log at this time.

Sunday: Finished creating the language model script. The script is currently located on Caesar under /media/data/trans and is called CreateLanguageModelFromText.perl.

To execute: perl CreateLanguageModelFromText.perl inFile outFile

Requirements: the script I wrote, ParseTranscript.perl, has to be in the same directory as this script. There is a variable called $folder which specifies where the CMU Toolkit bin directory is; this way we can change its location without changing much code. This script will need to change, or be branched, to receive a vocab file as the project progresses.

Monday: Learned that I need to link the word instances from the transcripts to the Sphinx dictionary containing the phonemes.

I successfully wrote a script to generate a language model from a text file containing transcripts. This relies on the CMU Toolkit and a script that creates a vocab dictionary file.
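A hypothetical Python rendering of the same driver logic (the real script is CreateLanguageModelFromText.perl; the toolkit path, intermediate file extensions, and flags below are illustrative assumptions):

```python
import os

# Location of the CMU Toolkit binaries, playing the role of the $folder
# variable in CreateLanguageModelFromText.perl (this path is an assumption).
TOOLKIT_BIN = "~/speechtools/CMU-Cam_Toolkit_v2/bin"

def language_model_commands(in_file, out_file):
    """Return, in order, the shell commands such a driver would run:
    parse the transcript, then wfreq -> vocab -> idngram -> binary LM."""
    base = os.path.splitext(in_file)[0]
    return [
        f"perl ParseTranscript.perl {in_file} {base}.clean",
        f"{TOOLKIT_BIN}/text2wfreq < {base}.clean > {base}.wfreq",
        f"{TOOLKIT_BIN}/wfreq2vocab < {base}.wfreq > {base}.vocab",
        f"{TOOLKIT_BIN}/text2idngram -vocab {base}.vocab -n 3 "
        f"< {base}.clean > {base}.idngram",
        f"{TOOLKIT_BIN}/idngram2lm -idngram {base}.idngram "
        f"-vocab {base}.vocab -binary {out_file}",
    ]

for cmd in language_model_commands("trans.text", "trans.binlm"):
    print(cmd)
```

Keeping the toolkit location in one variable, as the Perl script does with $folder, means only one line changes if the bin directory moves.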

 * Plan:

Work with the rest of the team on generating models or improving this process.

Update: Post-Tuesday meeting. Modify the process to include the pronunciations of the words from a dictionary containing the phonemes.

 * Concerns:

None

Week Ending April 26th, 2011

 * Task:

Friday: Created a very basic outline and wrote a general overview for the final report.

Sunday: Not much to report today. Will hopefully work on the scripts tonight. Read through some of the logs.


 * Results:


 * Plan:


 * Concerns:

Week Ending May 3rd, 2011
Saturday:
 * Task:

Repair the transcript processing script; it currently leaves some blank spaces at the beginning of lines. Create a script to produce a dictionary with word pronunciations. Create a language model with the word pronunciations.

Materials: the dictionary from Sphinx's website, the transcripts, and the script to process the transcripts.

I keep forgetting that a Perl script is executed with perl *script name*. Way to be.

Successfully created a dictionary file. It took a long time to execute on Caesar, but it looks good. The next step is to create the language model.

I could not find where the dictionary with the pronunciations is used, even after looking at each function: text2wfreq, wfreq2vocab, text2idngram, and idngram2lm.

Looking at Sphinx's website, though, they mention the dictionary being used in training.
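The combining step can be sketched as follows. This is a hypothetical Python version (the real script is Perl), and the cmudict-style "WORD  PHONES" layout with WORD(2) alternate markers is an assumption about the Sphinx dictionary:

```python
def prune_dictionary(words, dictionary_lines):
    """Keep only pronunciation entries whose head word appears in the
    words list, like the unique-words dictionary script described above.
    Dictionary lines are assumed cmudict-style: WORD  PH ON EMES."""
    wanted = {w.upper() for w in words}
    kept = []
    for line in dictionary_lines:
        head = line.split()[0]
        # cmudict marks alternate pronunciations as WORD(2), WORD(3), ...
        base = head.split("(")[0]
        if base in wanted:
            kept.append(line)
    return kept

cmudict = [
    "CAT  K AE T",
    "MAT  M AE T",
    "THE  DH AH",
    "THE(2)  DH IY",
    "DOG  D AO G",
]
print(prune_dictionary(["the", "cat"], cmudict))
# -> ['CAT  K AE T', 'THE  DH AH', 'THE(2)  DH IY']
```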


 * Results:

Wrote a script to combine a words file with a dictionary file to generate a unique words dictionary.

Created a unique words dictionary file.

Possibly generate a new language model using a dictionary file.
 * Plan:

The dictionary file does not appear to be used to generate a language model.
 * Concerns: