Speech:Spring 2011 KC Log



Week Ending March 8th, 2011
My task for the week of March 1st, 2011 through March 8th, 2011 was the same as the rest of the team's: each team member was assigned a machine to become familiar with and run tests on. In addition, my group, the Building Models group, needed to revise our portion of the proposal, which had previously been submitted. More detail needed to be added, along with the length of time each item should take. My group consists of James, Nick, and myself. Scott, who is in the documentation group, was depending on our revision in order to complete the final proposal for the Capstone project.
 * Task:

For the first task of the week, I searched the web to learn about using models from SphinxTrain. The page I visited was: http://cmusphinx.sourceforge.net/sphinx4/doc/UsingSphinxTrainModels.html This page offered information on using new models. It stated that using new models usually requires three steps:
 * Defining a dictionary and a language model
 * Defining a model and a model loader
 * Configuring a frontend (optional)

Because a dictionary and model have not been decided upon, I knew that they would not be on my machine, Obelix. So I decided to navigate around the speechtools directory and read some README files to learn more about the program. I came across the SphinxTrain README.txt file, which referred the reader to other websites. I followed the link http://cmusphinx.org and looked through the website.

This week’s second task, revising the proposal, began with James making the first revision to the document. He e-mailed me a copy, which I reviewed, and I found a couple of discrepancies. I revised the copy and sent it to the group members. The changes were agreed upon, and the Building Models group submitted the revision to Scott for inclusion in the formal document.
 * Results:

At this point in time I do not have a set plan for the coming week. I anticipate learning more about the Sphinx software and communicating with my Building Models group as to what our next step will be. I am still feeling lost as to the setup of the servers. I am not a strong Unix or networking person, so I am intimidated by the process. I like learning but am a little overwhelmed.
 * Plan:
 * Concerns:

Week Ending March 22, 2011
The task for the previous week was to present a final draft of sections 3.1 Preparing Switchboard Data and 3.2 Configuring Speech Tools, which are part of section 3.0 Building Models of the Capstone Proposal. These two sections needed to be completed before section 3.3 Training Models and section 3.4 Testing could be completed. There were slight communication issues during the week. However, James, the leader of the Data Set group, is quite knowledgeable about the process needed to create the training set, and he was able to provide the documentation necessary to complete the sections of the draft. Unfortunately, the draft was submitted late because of another communication error. A meeting was to have taken place on Monday, March 14, 2011 between James and Mike, and I believe James thought the draft was due at that meeting. However, I could be incorrect on that point. The draft was completed thanks to James’ work. James also assigned tasks to the remainder of the group, to be completed in the coming weeks. The tasks are broken down as follows:
 * James will be creating the training and test sets
 * Nick will be creating the script that will parse transcriptions from Switchboard to Sphinx
 * I will be creating the script that will call on an application to downsample audio files
 * Scott will be creating the script that will generate new experiment directories according to the experiment directory structure
 * Task:
 * Results:

As mentioned above, my assigned task is to create a Perl script that will call on an application to downsample audio files. This task is to be completed by next Tuesday, March 29, 2011. I will start the task on Friday, March 25, 2011. My estimated completion date is Monday, March 28, 2011. To complete my task I will be doing research on the Internet. I will need to gain more knowledge of the Perl language, as well as research the sampling of audio files. I have no current concerns. As always, I expect to hit roadblocks but cannot foresee them. I will address any issues as best I can as they arise. As always, I welcome programming suggestions from any group member who has an idea as to the best approach to take.
 * Plan:
 * Concerns:

Week Ending March 29th, 2011
My task for the week was to determine the file format and sampling rate of the switchboard language models and the file format and sampling rate required by the Sphinx software. Once these were determined, I was supposed to find a command line tool that would convert the given file into the needed file format. Next, I was to write a Perl script to use the tool. In addition to the above task, I was to attend a class taught by Mihaela Sabin. Her class is providing a software product to a client. The Capstone Project is to be the client, so both Mike and I presented the needs of the project to the class. The class is to design a user interface that will allow the user to view the experimental database information in a user-friendly way. The class is currently in the requirements elicitation stage.
 * Task:

The CMU LM switchboard uses .wav files that are sampled at 8 kHz. The Sphinx software requires a .wav file that is sampled at 16 kHz. I perused a number of websites to find a tool that would convert the .wav file from 8 to 16 kHz. While looking for a tool I found that the website http://cmusphinx.sourceforge.net/wiki/tutorialam?s[]=frequency&s[]=wav&s[]=file has directions on how to “Configure Sound Feature Parameters”. A command line tool is not needed; all that is needed is adjustments to the “etc/feat.params” file settings. In addition, parameters need to be configured in the “etc/sphinx_decode.cfg” file. Once these changes have been made, the next step is to copy the sphinx_decode.cfg file from the an4 directory to the Capstone directory, then edit the file and change any file names from an4 to the project name. This could be done using a Perl script.

I feel the presentation that Mike and I gave to Mihaela’s class went well. I am now the liaison between the class and the Capstone project. The students have my e-mail address and will be e-mailing me with any requirements questions.
 * Results:

The parameter settings still need to be changed in the Sphinx software. Also, I still need to write the Perl script that will change the file names from an4 to Capstone in the Capstone directory. I will have these two items accomplished by next Tuesday, April 5, 2011. I have no concerns at this time.
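
The renaming step described here could be sketched as follows. This is a minimal Python sketch rather than the Perl script the log plans to write, and it rests on assumptions: the function names `rename_experiment` and `copy_config`, the new name `capstone`, and the sample configuration line in the usage note are all hypothetical, not taken from the actual Sphinx files.

```python
import re
from pathlib import Path


def rename_experiment(cfg_text: str, old: str = "an4", new: str = "capstone") -> str:
    """Replace the old experiment name wherever it appears as its own token.

    The lookarounds keep names such as "plan4" untouched while still
    matching "an4.dic" or "an4_sphere". Both names are assumptions.
    """
    pattern = re.compile(rf"(?<![A-Za-z0-9]){re.escape(old)}(?![A-Za-z0-9])")
    return pattern.sub(new, cfg_text)


def copy_config(src: Path, dst: Path, old: str = "an4", new: str = "capstone") -> None:
    """Copy a config file to a new location, renaming the experiment on the way."""
    dst.parent.mkdir(parents=True, exist_ok=True)
    dst.write_text(rename_experiment(src.read_text(), old, new))
```

For example, `rename_experiment('$DEC_CFG_DB_NAME = "an4";')` would turn the (made-up) line into one that names `capstone` instead. The result should be checked by hand, since some occurrences of an4 in a real file may refer to data that should keep its name.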
 * Plan:
 * Concerns:

Bmq29: You might be able to try this, KC, but you would have to pull all of the sph files over to Windows. http://library.rice.edu/services/dmc/guides/linguistic/converting-sph-audio-files-to-wav

Week Ending April 5th, 2011
My task, as I understood it, was to write a Perl script that would convert the CMU Switchboard audio files into the Sphinx audio file format. I believed that this required converting an 8 kHz wave file into a 16 kHz wave file.
 * Task:

Wednesday: N/A

Thursday: Entered previous weeks' log into wiki.

Friday: Read all team members' logs from the previous week to become more knowledgeable about the overall tasks to be completed this week. Found that Scott would like some help with the user interface to the database, so I will e-mail him and help him out.

COMMENT: Mike-jonas 14:32, 2 April 2011 (UTC) KC, read my comment on Scott's log...he is not working on a database, I think he's got a better understanding now.

Saturday: N/A

Sunday: SSHed into Caesar and tried to find the transcript file, but failed. I believe I found the tool to convert .sph audio files into .wav files. Will continue work on both of these items on Monday.

C.Reekie, Admin 00:29, 5 April 2011 (UTC) Are you referring to the transcriber.jar? If so, it's at SpeechTools/sphinx4-1.0beta5/bin

Bmq29: KC, look at my work this week. It should help you out. Also, I am having trouble getting into Caesar, but it's in something like /media/data/Switchboard/transcripts/transcripts_words.text.

Monday: Found the transcript file on Caesar and read the README file; I understand it now. Started working on the Perl script to parse the file, then found that Brian had already created the Perl script. Great job, Brian.

Later in the week it was brought to my attention that the CMU Switchboard transcription files are in .sph format and that I was supposed to write a Perl script to convert the .sph file format into a .wav file format. Mike e-mailed me the location of the transcription files on Caesar. I tried to access the files but ran into difficulty. Originally, I thought that all the members of Capstone had gotten the same e-mail, so I sent out a group request to see if anyone else had difficulty and how they got to the file. Sorry for any confusion that may have caused other team members. Mike e-mailed me back and explained that I needed to put the path name in quotes. I do not understand why, but it worked. I accessed the file and read through the README file associated with it. That file helped a lot. However, I never would have known that the file was in .sph format if Mike had not told me.

After reviewing the README file I was about to start writing the Perl script. Before doing so, I went to the wiki pages and read through the wiki logs again. In Brian’s log I saw that he had been working on the same problem and had solved it. GOD bless Brian. I read through his code, and knowing that he is a very good programmer I am quite sure it will work; if not, it is at least a starting point. After the code has been tested we can decide whether it needs to be tweaked. After reviewing the code I did some research into Perl. I think there may be a quicker way to do what Brian’s code does using the Parse::RecDescent module, but that would take further investigation; I cannot say for sure that using Parse::RecDescent would be better.
 * Results:

I would like to work with Brian and James and do a test run using Brian’s code and also hopefully run the mini train set if possible. I know that I am still lacking in my knowledge of the UNIX operating system. I have written down commands that I think are useful, but there is still a lot that I do not know. Not being able to find the transcription file pointed that out to me.
 * Plan:
 * Concerns:

Week Ending April 12th, 2011
My task was to work with Mihaela Sabin’s class and respond to any inquiries they may have. I tasked myself with gaining more experience with Linux. I wanted to run a speech test on Obelix and hopefully run a test on my own laptop.

Wednesday:
 * Task:

Thursday:

Friday:

Saturday: On Friday, I was in contact with the leader of the SpEAK database group (that is the name of the project that is creating a user interface for Capstone in Professor Sabin's class). We have set up a meeting time on Tuesday to meet, he will be asking me questions about Capstone's database. I will need to get specifics from the database group in order to relay accurate information, such as the database tables and their fields.

Sunday: Read through other members' weekly reports.

Monday:

Tuesday: Yesterday, I attempted to copy files from Caesar to Obelix. I was unsuccessful, but will try again tonight.

I was in contact with both Mihaela and Michael Tierney. Mihaela asked if I had Skype; I told her no, but that I could download it. I was unsure, however, whether I could use it for a conference call, because I have a very poor satellite Internet connection and Skype requires a high-speed Internet connection for a conference call. I offered to download it and test it out with her, should she have a conference call scheduled. I’m not sure if she understood my request. I think she thought I was going to test it with Michael Tierney. I do not see the need to spend time downloading a program and using disk space on my computer if she does not need a conference call, so I will have to clarify this with her.

I was also in communication with Michael Tierney, who is in charge of the database group for the SpEAK project. He would like to meet me tomorrow, Wednesday April 13, 2011, to discuss the database group's needs. I have worked on the database format, but still need some input from other members of the group.

To familiarize myself more with the Linux system, I explored Caesar a little more. I asked Brian at the last Capstone meeting how to access different directories, and he helped me out. So I explored the actual .sph files and their headers.
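
For anyone else exploring the .sph headers, here is a minimal Python sketch of reading one. It assumes the usual NIST SPHERE layout: a first line `NIST_1A`, a second line giving the header size in bytes, then `name -type value` lines up to `end_head`. The function names are hypothetical, and the field values in the usage note are illustrative rather than copied from a file on Caesar.

```python
def parse_sphere_header(data: bytes) -> dict:
    """Parse the text header of a NIST SPHERE (.sph) file into a dict."""
    lines = data.decode("ascii", errors="replace").split("\n")
    if len(lines) < 2 or lines[0].strip() != "NIST_1A":
        raise ValueError("not a NIST SPHERE header")
    fields = {"header_size": int(lines[1].strip())}
    for line in lines[2:]:
        line = line.strip()
        if line == "end_head":
            break
        parts = line.split(None, 2)
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        name, type_tag, value = parts
        if type_tag == "-i":      # integer field
            fields[name] = int(value)
        elif type_tag == "-r":    # real-valued field
            fields[name] = float(value)
        else:                     # string field, e.g. -s4
            fields[name] = value
    return fields


def read_sphere_header(path: str) -> dict:
    """Read only the header bytes of a .sph file and parse them."""
    with open(path, "rb") as f:
        size = int(f.read(16).split(b"\n")[1].strip())  # second line holds the size
        f.seek(0)
        return parse_sphere_header(f.read(size))
```

A call such as `read_sphere_header("some_file.sph")` would be expected to return entries like `sample_rate` and `channel_count`, which is a quick way to confirm the sampling rate of the audio before deciding on any conversion.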
 * Results:

I would like to work closer with a more experienced Linux user. Hopefully Nick will have time to help me finish loading the Linux virtual machine on my computer so that I can run speech recognition tests locally from my laptop.
 * Plan:

Although I have put some thought into my discussion with Michael Tierney regarding the SpEAK database group, I should have started earlier, because I may have questions for other team members. Because I have waited, I hope I will not be putting a burden on a fellow team member with any questions that I might have.
 * Concerns:

Week Ending April 19th, 2011

 * Task:

Wednesday: N/A

Thursday:

Friday: Met with members of Mihaela's class on Wednesday and discussed the needs of the SpEAK project with them. Wrote up suggestions for a web page for their project today.

Saturday:

Sunday:

Monday: N/A

Tuesday:

 * Results:
 * Plan:
 * Concerns:

Week Ending April 26th, 2011

 * Task:

Wednesday:

Thursday:

Friday: n/a

Saturday: n/a

Sunday:

Monday: Tried to transfer files from Idefix and Caesar to Obelix, but was unsuccessful. Worked with the dictionary on Idefix and the Create uniq words file. On Obelix, I explored files in the "speechtools/sphinxbase-0.6.1/test/regression" directory. I had difficulty with one .sh file: I tried to open the "tutorial-check.sh" file and received the error:
 * tar (child): an4_sphere.tar.gz: Cannot open: No such file or directory
 * tar (child): Error is not recoverable: exiting now
 * /bin/tar: Child returned status 2
 * /bin/tar: Error is not recoverable: exiting now
 * /bin/rm: cannot remove `an4_sphere.tar.gz': No such file or directory
 * find: `/tmp/temp1539/an4': No such file or directory

I was able to open all the other .sh files, and all of them ran properly.

 * Results:
 * Plan:
 * Concerns:

Week Ending May 3rd, 2011

 * Task:

Wednesday:

Thursday:

Friday:

Saturday: Read through team member logs.

Sunday:

Monday: Working on Perl script for parsing the transcript files.

 * Results:
 * Plan:
 * Concerns:

Week Ending May 10th, 2011

 * Task:

Wednesday:

Thursday:

Friday:

Saturday: Worked on Perl script for parsing transcript files.

Sunday: Worked on the Perl script. I think I've got it; however, I'm not sure whether I included everything that the decoder needs, as I could not find the spec. James, or anyone, if you know that I am missing something, please let me know; I can easily add it to the output file. If this solution is OK with James, then I will try to put the file on Caesar. Chris showed me how to sftp, and I will try to do it myself, but no one should hold their breath for that; I may need help.

Here is the code:

Code ends here. Sorry, formatting is not very good.
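
As a separate illustration (a Python sketch, not the Perl script referred to above), this kind of transcript conversion might look like the following. It assumes each Switchboard transcript line has the form `utterance-id start-time end-time words...`, that bracketed markers such as `[silence]` should be dropped, and that the target is the Sphinx transcription form `<s> words </s> (utterance-id)`. The function name `to_sphinx_line` is hypothetical, and whether the decoder needs anything more is still the open question raised in this entry.

```python
from typing import Optional


def to_sphinx_line(swb_line: str) -> Optional[str]:
    """Convert one Switchboard-style transcript line to a Sphinx transcription line.

    Returns None when the line has no spoken words left after
    bracketed markers (e.g. [silence], [noise]) are removed.
    """
    parts = swb_line.split()
    if len(parts) < 4:
        return None  # need an id, two timestamps, and at least one token
    utt_id, tokens = parts[0], parts[3:]
    words = [t for t in tokens if not (t.startswith("[") and t.endswith("]"))]
    if not words:
        return None
    return f"<s> {' '.join(words)} </s> ({utt_id})"
```

Lines that come back as `None` can simply be skipped when writing the output file. Case normalization is left out on purpose, because the right choice depends on the dictionary in use.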

Monday:
 * Results:
 * Plan:
 * Concerns: