Speech:Spring 2012 Matt Vartanian Log



Week Ending Feb 6th, 2012

 * Task:


 * Saturday (2/04/2012) - Installed VMware and Ubuntu Linux on my personal laptop to familiarize myself with the structure of the file system.
 * Tuesday (2/07/2012) - Read other log files


 * Results:

I still need a lot of practice using command line Linux. Navigating the file system is slow and confusing, and I am still learning where the Sphinx files are in the system.


 * Plan:

Spoke with Jon on Monday. We're going to try to catalog the Sphinx files in the system.


 * Concerns:

Not knowing how to use command line Linux is slowing me down considerably.

Week Ending Feb 13th, 2012

 * Task:


 * Friday (2/10/2012) - Read through other logs and proposal.


 * Results:

Tbd


 * Plan:

Tbd


 * Concerns:

Tbd

Week Ending Feb 20th, 2012

 * Task:

Tbd


 * Results:
 * Saturday (2/18/2012) - Read through other logs and proposal.


 * Plan:

Tbd


 * Concerns:

Tbd

Week Ending Feb 27th, 2012

 * Tasks:
 * Saturday (2/25/2012) - On Saturday I plan on reading other logs and starting on the proposal for the software team.
 * Sunday (2/26/2012) - On Sunday I am going to work on writing another paragraph for the software team's portion of the proposal and edit the existing draft.
 * Monday (2/27/2012) - On Monday I am going to try to mimic running the script we reviewed in class again, and get a better understanding of how the training system works. I am also going to read about soft links in my Unix book and see how we can apply the concept to our directory structure in order to create easier access to training files.

I'm going to focus on writing the proposal this week and on doing a better job with the format of my log files. I plan to work Saturday, Sunday, and Monday on the project because those are the only days I have available this week. The introduction to the proposal needs work, and I hope to contribute to that on Monday, as it's an important part of the process and would help me catch up on my participation in the project overall.
 * Plan:

I made very minor edits to the introduction and plan to work on the introduction and software sections of the proposal more extensively tomorrow. I am also going to speak with Jon tomorrow about the content of the software section of the proposal because there is a paragraph there that I don't entirely understand, and I can't derive its meaning from my class notes.
 * Results:
 * Saturday (2/25/2012) - Read through other logs and proposal. Spoke with Jon on the phone about the status of the proposal, edits, and other work for the week that can be done involving the train process.
 * Sunday (2/26/2012) - I did a lot of editing of the wiki tonight. The rough draft of the proposal was hard to follow, so I worked on its readability and completeness. I wrote a fair amount of new material there as well.
 * Monday (2/27/2012) - I didn't get a chance to run the script we reviewed in class, but I did some more editing of the proposal and added another paragraph that gives some more information about the nature of the timeline. I read through other parts of the proposal as well and see some holes, including the introduction. I'm going to continue reading/editing these where I can so I have a better idea of what each group is intending to work on in the following weeks.

The introduction to the proposal should be a broad description of the project in general, which requires knowledge of all the various parts. This is not an easy task because of the focus we've had on our groups thus far. I hope to combat this by finding time this week to thoroughly read through other project members' logs and actively explore the parts of the system they've been working with in an effort to gain a better understanding of the project as a whole.
 * Concerns:

Week Ending March 5th, 2012
This week I plan to work on Speech Tools and locating them on Majestix. I also plan to edit the entire proposal. I'm going to try to follow the schedule Jon and I have outlined for the next portion of the project up until March 19th. The first part of that timeline involves creating directories on Majestix and locating the existing speech tools files.
 * Tasks:
 * Edit the proposal and fill out incomplete sections. Some of the proposal needs to be rewritten to match 3rd person perspective throughout. Other parts of it need to be edited for clarity. I planned to work on this after class on Tuesday. The entire document should flow nicely and professionally, and should be written in the future tense.
 * Locate Speech Tools and prepare their corresponding directories in their new home on Majestix. Actually, this won't be a new home, but rather a new install location that we plan to link to using soft links. I will also need to research how soft links work some more and continue to practice with Linux.
 * Plan:
 * Results:
 * Tuesday (2/28/2012) - I spent about 7 hours working on the proposal today after class. I collaborated with Jon and created a plan for editing the proposal as a whole for uniformity. Jon's role in this was to create uniform formatting for each section of the proposal. This included having a link to each contributor's section of the wiki at the start of their section of the proposal, and having a bullet list timeline that looked the same in each section. My goal was to improve the overall readability of the entire proposal. I edited for grammar, spelling, and content. I read through each section and made changes throughout for clarity. Additionally, I did my best to place the entire document in the future tense (as it is a proposal of what we are GOING to do), and converted it to third person by replacing "we" and "our" with "the team" or individuals' names where appropriate. I also co-authored the mini-train and full-train sections with Jon and wrote the introduction.


 * Sunday (3/04/2012) - Today I read logs. I reviewed our plans and schedule laid out in the proposal.
 * Monday (3/05/2012) - I wanted to spend some time today looking through Majestix to see where the speech tools currently are, but I was asked for a password when sshing from Caesar. I'll see how Jon made out with this tomorrow. Instead I researched soft links and how the command works. I read through the man pages to learn the syntax of the command and the options it accepts. I also did some research online at http://www.cyberciti.biz/faq/creating-soft-link-or-symbolic-link/ and read through some of "The Complete Reference" Unix textbook on Safari Books, which I accessed through the school's library database. It's a fairly useful reference and I'll keep it on hand for the remainder of the project.
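
For reference, the soft link idea researched above can be sketched in Python. This is only an illustration under assumptions: the directory names are hypothetical stand-ins created in a temporary folder, not the real paths on Majestix or /mnt/main, and `os.symlink` is used here as the standard-library counterpart of `ln -s`.

```python
import os
import tempfile


def link_install_dir(install_dir, link_path):
    """Create a symbolic link at link_path that points to install_dir.

    Roughly what `ln -s install_dir link_path` does on the command line.
    """
    # Replace a stale link if one exists; a real file or directory at
    # link_path is left alone and os.symlink will raise FileExistsError.
    if os.path.islink(link_path):
        os.remove(link_path)
    os.symlink(install_dir, link_path)
    # readlink returns the path stored inside the link itself.
    return os.readlink(link_path)


# Demonstration in a throwaway directory. All names below are hypothetical.
base = tempfile.mkdtemp()
install_dir = os.path.join(base, "install", "speechtools")
os.makedirs(install_dir)
link_path = os.path.join(base, "speechtools")

target = link_install_dir(install_dir, link_path)
print(link_path, "->", target)
```

The link is only a pointer: anything created inside the real install directory is visible through the link, and removing the link does not touch the installed files.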


 * Concerns: Unable to connect to Majestix. I need to be able to log in there and see what the structure is currently. Some of the directories I was expecting to see on Caesar aren't there. Namely: LOGS, SCRIPTS, and DOCS under the root directory.

Week Ending March 19th, 2012
Though I'm behind in recording my logs, my plan this week has been to create links to files on /mnt/main as they relate to installed files on Majestix which should allow for easier access for all students. I had lots of work going on this week but plan to catch up on Monday.
 * Tasks:
 * Install speech tools and locate files on Majestix.
 * Create links to files on /mnt/main
 * Review logs
 * Plan:
 * Results:
 * Saturday (3/17/2012) - Read other logs.
 * Sunday (3/18/2012) - Performed installation of speech tools on my virtual OpenSUSE.
 * Monday (3/19/2012) -


 * Concerns:

Week Ending March 26th, 2012

 * Tasks:
 * Plan:
 * Results:
 * Thursday (3/22/2012) - Read logs.
 * Saturday (3/24/2012) - I read 2 chapters of the Linux Administration Handbook, including an introductory chapter, and learned about cron jobs and the crontab file. The Linux command line is a good skill to have, and I have a hard time just putting in commands I've seen without knowing how they work or what they're doing. Guess I'm curious like that. The man pages are nice but I'm still figuring out how to interpret the definitions they provide.

Week Ending April 9th 2012

 * Tasks:
 * Read through guide in previous semester's wiki about running training and decoding. I will implement these steps on my own this week. Since I work from 9am-5pm this week, I will not be able to attend the Skype meeting for our group. I will ask Brice what the meeting covered and if there's anything I should know prior to running a train and decode session.
 * I plan on writing more in the installation guide in the information section of the wiki. The purpose of this section is to explain to future classes how to get sphinx in a usable location given our current network setup. The guide will be specific to our system and contain information about the install files, where they extract to, and how the make files work.
 * Review logs


 * Plan:

As of Saturday, most of my concerns are related to Linux. I don't know where the script I ran created my task called "mattTrain". I don't see it anywhere in the directory the script created, and the instructions from the summer semester don't say where to find it, as far as I could tell. If I were more proficient in Linux I would simply perform a search, but I don't know how to do that. I will ask Brice about this in an email. I'm also unsure of the next step of the summer instructions (step 4 of setting up the task directory), which provides a copy command that doesn't work because there is no destination specified.
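
The kind of search mentioned above can be sketched as follows. On the command line the usual tool is `find <directory> -name mattTrain`; the Python version below does the same walk with the standard library. The directory layout in the demonstration is made up in a temporary folder and is not the real structure on Caesar.

```python
import os
import tempfile


def find_by_name(root, name):
    """Return every path under root whose last component equals name."""
    matches = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Check both subdirectories and files at this level.
        for entry in dirnames + filenames:
            if entry == name:
                matches.append(os.path.join(dirpath, entry))
    return sorted(matches)


# Demonstration on a hypothetical layout (not the real server).
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "train8", "etc"))
os.makedirs(os.path.join(root, "other", "mattTrain"))

found = find_by_name(root, "mattTrain")
print(found)
```

An empty result means nothing with that name exists under the starting directory, which would suggest the task name was never used to create a directory there.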
 * Results:
 * Friday (4/6/2012) - Read through the summer notes and steps about training and decoding. They seem pretty easy to follow. I will be running through these on Saturday this week and seeing what kind of results I can get. I am anxious to finally get a chance to use the tools we have installed.
 * Saturday (4/7/2012) - I started running the scripts for training and decoding today. I completed all of the following tasks as root on Caesar.
 * First, I created a directory which I named train8 inside of /root/speechtools/SphinxTrain-1.0/. In the train8 directory I ran a script called setup_SphinxTrain.pl with the "-task" option, and named my task mattTrain. Upon execution of the script, I saw many commands which appeared to replace files in the python//sphinx/ directory with files from the python/build/lib directory. With some quick research I found the .pl file to be a Perl script. This interested me since I'm quite familiar with PHP and have heard that the languages are similar.
 * I took a look at the setup_SphinxTrain.pl file to see if I could get a better understanding of what it's actually doing. The first thing I saw were some use statements, which I assume refer to some library functions or files to be included by the script. One of these was called "strict", which I believe in Perl means that you have to at least declare variables before using them. I don't think they have to be initialized, however, so they're still loosely typed. At a brief glance, I noted that the script first creates an associative array of key/value pairs which refer to the options passed to it when the script is run. The user can choose to put the script into either "update" or "force" mode, and the script uses these optional parameters to determine whether the user wants to update previously existing files or completely overwrite them, respectively. It uses a simple 'if elsif else' statement to decide which message to display depending on the update or force mode set.
 * The script checks to see where the executable file is located. If it doesn't exist, it assumes that SphinxTrain hasn't been compiled and throws an error.
 * Overall, what I could determine about this script's function is that it is mostly just creating directories and copying files from the sphinx_pl (python scripts) directory to the current location, and to where the sphinx executable sits.
 * Next I had to familiarize myself with the vi editor. I've used it in the past but didn't remember how to switch between command and insert modes. I needed to use vi to change a line in sphinx_train.cfg to reference the task name I passed as the mandatory option of the setup_SphinxTrain.pl execution statement. Basically sphinx_train.cfg needed to know where my task was located.
 * Sunday (4/8/2012) - I emailed Brice to see if he knows where the task directory gets created. I couldn't seem to find it anywhere in the "train8" directory which I created under /root/speechtools/SphinxTrain-1.0/ . I need to know where the directory gets created in order to copy the sphinx_train.cfg file there. I spent some more time looking around in the script itself. It seems like the -task option does in fact get stored in a variable which is part of the options array. I couldn't see in the script where that directory gets placed, though.
 * Monday (4/9/2012) - I got a response from Brice by email. He explained how the task option was supposed to work, but I am still unable to find where it's created in the filesystem. Hopefully in class tomorrow we'll have time to talk about this as a group. I read through the rest of the summer notes about training and decoding just to get an idea of what I'll be working on next. I am still planning on writing the installation guide notes, and will have to find some time next week to proceed with that as well.
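
To make Saturday's description of the setup script's option handling more concrete, here is a rough Python paraphrase of it. This is not the Perl code from setup_SphinxTrain.pl: the function name, the messages, and the behavior of the final else branch are assumptions for illustration. Only the "-task" option and the "force" and "update" modes come from the notes above.

```python
import argparse


def describe_mode(argv):
    """Parse options into a dictionary and pick a message, if/elif/else style."""
    parser = argparse.ArgumentParser(
        description="Hypothetical stand-in for the setup script's option handling"
    )
    parser.add_argument("-task", required=True, help="name of the training task")
    parser.add_argument("-force", action="store_true", help="overwrite existing files")
    parser.add_argument("-update", action="store_true", help="update existing files")

    # vars() turns the parsed options into a dictionary of key/value pairs,
    # similar in spirit to the associative array described in the log.
    opts = vars(parser.parse_args(argv))

    if opts["force"]:
        return "Task %s: overwriting existing files" % opts["task"]
    elif opts["update"]:
        return "Task %s: updating existing files" % opts["task"]
    else:
        # Assumed default; the log does not say what the script does here.
        return "Task %s: leaving existing files untouched" % opts["task"]


print(describe_mode(["-task", "mattTrain", "-update"]))
```

Because the task name ends up as one entry in the options dictionary, printing that dictionary would be one quick way to confirm what value the script actually received.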
 * Concerns:

Week Ending April 23rd 2012

 * Tasks:
 * This week I will continue work on the train
 * Review logs
 * Communicate with team
 * Record new findings in team log
 * Add content to the installation page


 * Results:
 * Tuesday (4/17/2012) - During and after class I got pretty far with the train. Several of us worked together and communicated how to get past some parts. I'll list the particular details on our group log.

I didn't document the work we did as a group on Tuesday very well. I may have missed some important details about the steps I had to take to make the summer notes' steps work on our current system configuration.
 * Monday (4/23/2012) - I tried to continue the work we started last Tuesday. Details are in the group log. I also added to the installation guide and tried to make it more accessible to those who've never used the Caesar network before. It should target an audience unfamiliar with the project, and I did my best to present it as such.
 * Concerns:

Week Ending April 30th 2012

 * Tasks:
 * This week I will continue to edit the information sections and write my section of the report
 * Communicate with team
 * Add content to the report section


 * Results:
 * Tuesday (4/24/2012) - I spent a few hours editing the information section of the wiki. There were lots of grammatical errors and the content was very hard to follow. It now flows much more easily and targets an audience which isn't well versed in speech recognition software.

I didn't get a chance to start my section of the report. I have to do a bit of research to write it. As a result, I spent some time doing research on the history of speech.
 * Wednesday (4/25/2012) - Did some research on early speech use, up until the present. I specifically looked for applications which use speech and their implementations. I want to write the speech section in an introductory style which is more general and open to an audience with no experience with speech recognition software.
 * Concerns:

Week Ending May 7th 2012

 * Tasks:
 * This week I will continue to edit the information sections and write my section of the report
 * Complete Overview of Speech section of the report
 * Edit all portions of the report for a more finalized draft


 * Results:
 * Tuesday (5/1/2012) - I spent a few hours editing various parts of the information section of the wiki. While this isn't our main goal at the moment, the information section should be accurate, legible, and useful to future semesters. I think the readability is very important and I touched up a few parts and collaborated with Bethany on how to fix up another section.


 * Wednesday (5/2/2012) - I took some time to revise my opening of the Overview of Speech and plan to continue working on it. My goal is to have a final draft for the overview by Monday night.


 * Sunday (5/6/2012) - I wrote some more of the Overview of Speech and edited other sections of the report. I will continue this work on Monday. I am going to edit the entire report to make sure the whole document consistently uses past tense and third person perspective.
 * Concerns: