Speech:Spring 2015 Russel Sweet Log



Week Ending February 3, 2015

 * Task:
 * Meet with the team on Hangouts to create and edit a rough draft of the proposal.
 * Create a personal weekly schedule for the activities I will be involved in, and add it to the proposal by the 3rd.
 * Work on learning more about the system, the actual technology, and commands.
 * Results:

2/1 - Read some previous logs, and worked on finding resources for Linux commands.

2/2 - We met on Hangouts, and Stephen and Krista drew up a draft of the proposal. We also decided that by tomorrow night (2/3) we should each have a personal schedule to add to the proposal and deliver to the proposal group.

I also spent some time reading other people's logs from previous years, including the data group's logs, to get a better sense of what our main objective is. This year, it seems that the data group is in charge of organizing the data into a cleaner structure.

I also spent some time researching how to use SSH, how to execute Linux commands, and how to navigate the bash environment. As of right now I am still a little uncomfortable with my grasp of the bash commands, but I think they will get easier once I start actually using them.

2/3 - Today I worked on my personal timeline and asked Dakota for help with connecting to the server. Our group has decided that we will not connect until tomorrow during class, because we don't want to risk messing something up in the system while we are root. We will also be meeting via Hangouts to finish our draft of the proposal, figure out what specific tasks each person will have, and finish the personal timelines. Once I am finished with mine, I will post it here. As far as personal research, I have made a cheat sheet of Linux commands. I used the Unix commands that are on this wiki, along with a few others that may or may not be helpful. If I find any new helpful commands, I will post my findings. I will also post everything I needed to do to connect to the server, because I found it confusing and hard to find the information needed to establish the initial connection to Caesar with PuTTY.

My only concern right now is that I don't know many bash commands, and I fear that I will have to devote a lot of practice to learning them before I can start actual work on my other duties as a data group member.
 * Concerns:

EDIT: Another concern that has popped up is that I feel like I need to tread carefully on this project, but I don't want to tread so carefully that I don't get anything done. To remedy this, I think I will try to set up a dummy system where I can practice commands freely, then use that knowledge to work in the Caesar environment without hesitation.

Week Ending February 10, 2015

 * Task:
 * Access Caesar and do initial account setup
 * Get familiar with the current file structure.
 * Update any documentation about the file structure if needed.
 * Work on a draft of a new file structure.

2/5 - Yesterday and today I set up the software needed to SSH from both of the machines I use. I was also able to create my account, change my password, and start exploring Caesar and the slave servers. I have to admit, finally getting into the server, playing with some commands, and seeing what actually needs to get done helped calm my anxiety and gave me a better understanding of the task my group and I have. Hopefully our group will meet up this weekend and discuss specific jobs for each member, as well as the larger scope of our group.
 * Results:

For now I am going to play around with commands and get used to navigating the server from the command line. I have a few resources on how to create soft links and manage files with commands, since a lot of our group work involves these actions.

2/6 - Read some logs, and poked around on websites for helpful commands to know. I also finished installing SSH Secure Shell on my tower at home.

2/9 - Read some other logs and learned some of the file structure.

2/10 - Met with my teammates, and we discussed a bunch of different things. First we talked about assigning tasks to the group members, and we came up with a few different tasks. We still need to hash them out, and we plan on doing that in class tomorrow. We are also still very confused about our main objective. As of right now, it seems like we are in charge of soft linking and managing the corpus. In class tomorrow we plan on meeting with the other groups and hopefully getting a better idea of what we need to do. As for personal log data, I was finally able to get the SSH connection to work with my old tower at home. I had installed the client, but for some reason my connection to the server was horrible. It turns out that I was running other server software that I had forgotten about, which also uses SSH and was taking up the port. I switched some settings around, and everything works now. I am hoping that tomorrow will shed some more light on everyone's objectives, so I can start putting in some real work. I would be a liar if I said that I wasn't nervous, or that I was 100% confident in my skills to navigate the system. But I'm hoping that some experience with the server and a clear goal will remedy this.
 * Plan:


 * Concerns:

Week Ending February 17, 2015

 * Task:
 * Learn more about soft linking.
 * Start work on soft linking all of the wav files from the experiments so they can still function.

2/12 I researched the commands needed to soft link with the Unix CLI.
 * Results:

2/14 Sadly, I was swamped with homework from other classes and actual work, so I wasn't able to accomplish some of my goals for the project. I was able to do a little bit of research on data structuring and organization, and I set up a nifty SSH client on my Android phone. I use JuiceSSH, which can be downloaded from the Google Play store for free, and it's actually a really good SSH client.

2/17 Today our group will be meeting online to discuss a few important details. The main one is assigning the tasks that we will perform for the rest of the month, or until we form into larger groups. Another task I want to complete is creating an experiment, and hopefully running it successfully. Stephen has gone ahead and created a directory for the data team to play with experiments. I personally think I will recreate a successful experiment instead of creating one from scratch. Dakota has also gone ahead and started working on soft linking. I also have a couple of questions for Jonas tomorrow regarding the layout of the wiki, and whether there are any rules we should be following in terms of layout. Some people are breaking their logs up to show the important details, where others like myself just create long text blocks of the day's progress. I know that wiki standardization was on the project's to-do list, but I don't know how high on the list it was.

EDIT: I just met with my group, and it looks like I have been tasked with continuing Jared's work on analyzing the Switchboard data to determine how much data is there. Hopefully I can come up with a solid answer.


 * Plan:


 * Concerns:

Week Ending February 24, 2015

 * Task:


 * Results:

2/18 Today I worked on redoing the proposal with Krista. We changed the timelines to a more task-based version, and we changed the passive wording to a more affirmative style. I also did some poking around the logs of previous semesters for any information regarding my task, which is to analyze the Switchboard corpus and find out how long it is.

2/19 Just checking in today. Also, editing the wiki on my phone seems to bring weird errors. I wonder why. I'll wait till next class and ask around if other people are having issues like I am.

2/23 Today I am using this post as a check-in. I do want to note that my group met online today to discuss some general upkeep topics and our plan going forward. Tomorrow I will attempt a train based on Stephen's log.

2/24 I had planned on doing a test train this evening, but I fell asleep at my computer. I will try to run a train before class on Wednesday.
 * Plan:


 * Concerns:

Week Ending March 3, 2015

 * Task:


 * Results:

2/25 I helped Dakota and Krista work on soft linking the Switchboard corpus so the links to the .sph files would work correctly. To get the soft linking done, I had to go into the /switchboard/dist/flat directory, which has a ton of files with broken links. Basically, we corrected these links one at a time. The command I used was " ln -fs /mnt/main/corpus/switchboard/dist/disk23/swb1/sw04746.sph /mnt/main/corpus/switchboard/dist/flat/sw04746.sph ". This specific example corrected the link for the file sw04746.sph. To reuse the command, I simply had to adjust the file name in both paths and make sure the /disk##/ directory was correct. This is a slow process, but it works.

2/26 Today Krista found a way to take the file names from the directory we were working on and put them in Excel to help automate the process. She finished the /mnt/main/corpus/switchboard/dist/flat directory with amazing speed. Dakota and Krista seem like they will be able to finish the soft linking soon. As for my tasks, I will start to focus on finding the length of the full Switchboard corpus.
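The one-link-at-a-time relinking from the 2/25 entry could also be scripted. The following is only a sketch, not the method the group used: the function name relink_flat is made up for illustration, and it assumes that every real .sph file sits somewhere below a disk* directory next to flat.

```shell
#!/bin/sh
# Hypothetical helper: rebuild the soft links in DIST_DIR/flat so that each
# one points at the real .sph file found under the DIST_DIR/disk* directories.
# Pass an absolute DIST_DIR; a relative path would be stored in the links
# as-is and would not resolve from inside flat.
relink_flat() {
    dist=$1
    flat=$dist/flat
    mkdir -p "$flat"
    # Find every real .sph file under any disk* directory.
    find "$dist"/disk* -type f -name '*.sph' | while IFS= read -r src; do
        # -f replaces an existing (possibly broken) link, -s makes it symbolic.
        ln -fs "$src" "$flat/$(basename "$src")"
    done
}
```

On Caesar the call would presumably be relink_flat /mnt/main/corpus/switchboard/dist, but trying it on a scratch copy first seems wise, since -f overwrites whatever links are already in flat.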

2/27 Checking in

3/3 Today I attempted a train, but I ran into a few difficulties. I didn't have time to figure them out because I had a meeting for a class and I was on my phone, so I figured I would try again in class tomorrow. Luckily Stephen has figured out how to fully train and decode, and we will be walking the rest of the data group through the train process.


 * Plan:


 * Concerns:

Week Ending March 10, 2015

 * Task:


 * Results:

3/4 Today I worked on the Data Group logs, and I helped Krista and Dakota with the clean/train directories. The problem was that each subdirectory of switchboard had both a train directory and a clean directory. The professor didn't want the clean directory because it was unnecessary. Our fix was to transfer the necessary components from the old train directory into clean, delete the original train directory, and then rename clean to train.

Within the first_5hr directory in switchboard, there are two directories called mono and monoRaw. We investigated both, and they seem to be filled only with soft links pointing back to the train directory. As a group, we decided to delete the mono and monoRaw directories.

Another activity I want to log is that I had to reset my password. I guess the systems group deleted all of the users' passwords, so I simply had to reset mine with the "passwd" command.

3/6 I did some research on how I could get a good answer on how long the Switchboard corpus actually is. I haven't found much yet, but I will keep digging.
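One possible way to measure the corpus length, sketched here as an assumption rather than something that was run on Caesar: NIST SPHERE (.sph) files normally start with a plain-text header, usually 1024 bytes, that includes sample_count and sample_rate fields, so a file's duration can be taken as sample_count divided by sample_rate. The function names sph_seconds and corpus_hours are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the duration in seconds of one NIST SPHERE file,
# read from the sample_count and sample_rate fields of its text header.
# Assumes the usual 1024-byte header; not verified for every corpus file.
sph_seconds() {
    dd if="$1" bs=1024 count=1 2>/dev/null |
        awk '$1 == "sample_count" { n = $3 }
             $1 == "sample_rate"  { r = $3 }
             END { if (r > 0) printf "%.2f\n", n / r; else print "0.00" }'
}

# Hypothetical helper: add up the durations of every real .sph file below a
# directory and print the total in hours.
corpus_hours() {
    find "$1" -type f -name '*.sph' | while IFS= read -r f; do
        sph_seconds "$f"
    done | awk '{ total += $1 } END { printf "%.2f\n", total / 3600 }'
}
```

Because find is limited to -type f, soft links such as the ones in the flat directory are skipped, which should keep the same recording from being counted twice.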

3/7 Today I did a little more digging, but I am really using this as a check-in. I am very swamped with midterms and midterm projects, as I'm sure everyone else is. Graduation will taste all the sweeter after all of this.


 * Plan:


 * Concerns:

Week Ending March 24, 2015

 * Task:


 * Results:

3/11 I finally have access to a working computer again. My tablet needed to be factory reset, and my desktop PC has been in disrepair for a while. I don't have much to report so far, except that I got a friendly email from Professor Jonas. He offered me some helpful information regarding my logs, and from this post forward I hope to follow his advice. After our meeting in class, the data group was split in half, and Stephen and I are now in the Bruins group. Even though we have split off, the data group must still meet and perform important tasks that were given to us by Professor Jonas. It seems that because of this extra workload, we don't get to have a spring break this year. Hopefully the tasks can be accomplished swiftly, so that I can focus on my Bruins group.

3/13 As of right now, the data group and I plan on meeting online sometime Monday evening. That is when we will divide up the work and find the best way to accomplish our tasks. As for work towards the Bruins group, I plan on teaching myself how to train while I am at work. I know the tutorial is wrong, so I have no idea how I will teach myself to run the train until the tutorial is fixed, but I'm sure that with some careful exploration I can get some results.

3.14 (PI DAY!!!!) Today I have some free time, so I am poking around the old experiments designated by Professor Jonas (0001-0142) and seeing what needs to be deleted. Jonas sent an email to the data group saying that the wav and feat directories needed to be deleted. I think I will create a delete directory in each experiment and move the wav and feat directories into it, because the last time we deleted some directories he specifically told us to delete, he changed his mind. This way, if he changes his mind, no harm is done.
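The "move instead of delete" idea could be applied to every experiment in one pass. This is a minimal sketch with a made-up function name, stage_for_delete, and it assumes the wav and feat directories sit directly inside each experiment directory.

```shell
#!/bin/sh
# Hypothetical helper: inside every experiment directory under EXP_ROOT, move
# the wav and feat directories into a new "delete" directory instead of
# removing them, so the change can be undone later.
# Usage: stage_for_delete EXP_ROOT
stage_for_delete() {
    for exp in "$1"/*/; do
        # Skip the unexpanded pattern when EXP_ROOT has no subdirectories.
        [ -d "$exp" ] || continue
        for name in wav feat; do
            if [ -d "$exp$name" ]; then
                mkdir -p "${exp}delete"
                mv "$exp$name" "${exp}delete/"
            fi
        done
    done
}
```

Undoing the change for one experiment is then a matter of moving delete/wav and delete/feat back up a level.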

I also went ahead and looked at the Switchboard corpus. For no specific reason, I cd'ed into the first_5hr directory and saw that all of the files are pointing to a directory that just doesn't exist anymore. (/mnt/main/corpus/dist/: No such file or directory.) I don't think I will worry about this today, but I will bring it up with my group during our meeting Monday night and see what they say.
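To get a full list of dead links like these to bring to the group, a small scan could help. This is a sketch only; list_broken_links is a made-up name, and it relies on the readlink command being available on the server.

```shell
#!/bin/sh
# Hypothetical helper: print every symbolic link below DIR whose target no
# longer exists, together with the target path stored in the link.
list_broken_links() {
    find "$1" -type l | while IFS= read -r link; do
        # -e follows the link, so it is false when the target is missing.
        if [ ! -e "$link" ]; then
            printf '%s -> %s\n' "$link" "$(readlink "$link")"
        fi
    done
}
```

Piping the output through wc -l would give a quick count of how many links need repair.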

I just finished making some progress on the cleanup of the Exp directory. I managed to finish the directories from 0001-0080. I will continue tomorrow, but right now I have to get ready for a party.

3/16 My group met on Hangouts, and I assigned Stephen 0081-0120, and Dakota got the rest. I taught them what I was doing and how to do it.


 * Plan:


 * Concerns:

Week Ending March 31, 2015

 * Task:

3/27 Today I am going to start working on my part of the Bruins attack strategy. First, to test my knowledge of running trains, I am going to use the tutorials on the wiki and the notes I took from our group meeting to run a 5 hr train. This will solidify my skills with trains and decodes, and I can then start doing longer and more extensive trains. -EDIT- After I set the directory up, I ran the nohup scripts_pl/RunAll.pl. & script, but it displays the error "Command not found". It seems that other scripts, such as the decode scripts, are hitting this exact same error. Our group is looking into it, and hopefully a solution will be found soon. Until then, I cannot run a train or decode.
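A few quick checks can narrow down an error like "Command not found" before digging deeper. The sketch below is an assumption about where to look, not a confirmed diagnosis of the RunAll.pl problem, and check_script is a made-up name. It tests whether the script exists, whether it has execute permission, and whether the interpreter named on its "#!" line is present.

```shell
#!/bin/sh
# Hypothetical helper: report common reasons a script cannot be started
# directly: missing file, no execute permission, or a "#!" line that names
# an interpreter that does not exist at that path.
check_script() {
    script=$1
    if [ ! -f "$script" ]; then
        echo "missing"
        return 1
    fi
    if [ ! -x "$script" ]; then
        echo "not executable"
        return 1
    fi
    # Take the first line and keep only the interpreter path after "#!".
    interp=$(head -n 1 "$script" | sed -n 's/^#![[:space:]]*\([^[:space:]]*\).*/\1/p')
    if [ -n "$interp" ] && [ ! -x "$interp" ]; then
        echo "bad interpreter: $interp"
        return 1
    fi
    echo "ok"
}
```

Running it from the experiment directory as check_script scripts_pl/RunAll.pl, once under a normal account and once as root, would show whether the two accounts see the script differently.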
 * Results:

3/28 Yesterday, I tried running a 5hr train simply to learn how to run the train. Sadly, I was encountering errors with the scripts, and I could not get past the first step, which was the prepare experiment directory step. I worked with Sam and tried a ton of different fixes to get it to work. This morning I tried once more, but failed. I decided to log in as root, and to my surprise, everything appears to work perfectly. My running theory is that my account just did not have the correct permissions. The train is running right now, and I will edit this log once the train and decode are finished.

---EDIT--- Well, after plenty of learning and experimentation, I got a full train and decode with first_5hr. Here are my results:

 | Sum/Avg | 3506 42940 | 74.9   18.5    6.6   16.8   41.9   94.0 |
 |=================================================================|
 |  Mean   | 43.8  536.8 | 75.4   18.5    6.1   19.2   43.8   94.7 |
 |  S.D.   | 20.2  247.5 |  7.4    5.9    2.7    9.3   12.2    6.5 |
 | Median  | 40.0  486.0 | 76.7   17.6    5.4   17.7   42.3   96.9 |
 `-----------------------------------------------------------------'

I am hoping to start communicating with my partners and see what I can change when I start working towards finding a better baseline.
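The results rows above have no column headers. Assuming they follow the usual sclite summary layout (sentence count, word count, then Corr, Sub, Del, Ins, Err, S.Err), the Err column should be the sum of the substitution, deletion, and insertion percentages. The sketch below checks that against the Sum/Avg row; the function name wer is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: word error rate as the sum of the substitution,
# deletion, and insertion percentages (assumed sclite column meanings).
wer() {
    awk -v s="$1" -v d="$2" -v i="$3" 'BEGIN { printf "%.1f\n", s + d + i }'
}

# Sum/Avg row from the first_5hr decode: Sub 18.5, Del 6.6, Ins 16.8.
wer 18.5 6.6 16.8   # prints 41.9
```

The result matches the 41.9 in the fifth number of the Sum/Avg row, which supports the assumed column order. Under that reading, the last column (94.0) would be the share of sentences containing at least one error.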

3/30 Just checking in. Apparently there are problems with scripts, logging in, and many other issues that are causing problems for everyone on my team. I'll try to start a new train and decode, but I don't know how long it will take or if it will even be successful. Hopefully the professor will help on Wednesday.

3/31 This is another check-in. For some reason I can get in with my phone's SSH client, but I can't get access with my desktop client. I also see that the passwords have been changed again. I never received an email about that, but I'll change my password tonight and hopefully it won't get reset again. I'm going to attempt to start my train tomorrow during my break, but there have been wild issues across the board with scripts not working or trains failing, so hopefully I won't be plagued with those issues.


 * Plan:


 * Concerns:

Week Ending April 7, 2015

 * Task:

4/2 This morning I started the decode for my first 256hr train, as you can see here from the top command:

 958 rrn34    20   0 93036  80m 1100 R 99.9  0.3  12:39.06 sphinx3_decode

I will be checking my decode process periodically today to make sure it is working fine, and I will hopefully be contacting my partner to discuss the next step. I know a 256 hr train will take forever, so I want to get one or two going at a time to increase productivity.

--EDIT-- So far so good. My decode process has been running for a good portion of the day. I figured I would share.

 PID USER     PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 958 rrn34    20   0  218m 208m 1100 R 100.0  0.9 283:48.31 sphinx3_decode
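The TIME+ column in the top output is CPU time, shown here as minutes:seconds.hundredths. A small conversion makes it easier to compare runs in hours. This is a sketch with a made-up function name, cpu_hours, and it only handles the minutes:seconds form shown above.

```shell
#!/bin/sh
# Hypothetical helper: convert a top "TIME+" value written as
# minutes:seconds.hundredths into hours of CPU time.
cpu_hours() {
    echo "$1" | awk -F: '{ printf "%.2f\n", ($1 * 60 + $2) / 3600 }'
}

cpu_hours 283:48.31   # prints 4.73
```

So the second snapshot corresponds to roughly 4.7 hours of CPU time for the decode. Since the process was at about 100% CPU in both snapshots, that should be close to the wall-clock time it had been running.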
 * Results:

4/3 Today my 256 hour run finished, and it looks like the baseline for a 256 is 61.5%. That's not good, but it's also good to know that we have a baseline. My group has been emailing back and forth, and we now have a steady attack strategy for lowering the baseline. I will post my results in the 0276/001 experiment directory. I plan on continuing with our attack strategy after I get out of work.

4/6 Today I started a 5 hr train to test some new config changes to improve on the baseline. I will finish it tomorrow morning while I'm at work. I have also gotten some group emails about a new way to run small trains, so after I score this latest train I will try the new method.


 * Plan:


 * Concerns:

Week Ending April 14, 2015

 * Task:

4/10 Today I am going to run another train with some new variables and see if I can get some more improvements. I am also testing the waters and using miraculix today. So far I am just setting up the directory for the train, but it is taking forever. Gentrans has taken about 10 minutes and is only about halfway complete. I am the only one on miraculix right now, and my processes are the only ones running on it. Hopefully this is just a hiccup, but if it stays this slow, I don't think I will keep using miraculix or any of the other drone machines for much longer.
 * Results:

4/11 Today I am finishing a train I started yesterday, and I altered some values in the sphinx_train.cfg file. After I altered the file to my liking, I started the generate feats command and got this error:

 ERROR: "wave2feat.c", line 655: Cannot read /mnt/main/Exp/0276/004/wav/sw2199A-ms98-a-0078.sph
 FATAL_ERROR: "wave2feat.c", line 90: error converting files...exiting

My guess is that there are some soft link issues with the /wav/sw2199A-ms98-a-0078.sph file. I am going to go ahead with my train anyway for now, but I'll do some poking around and investigating while it is running.

4/13 Checking in today

4/14 Today I am going to try to finish my train and decode. I know there have been a lot of problems and a lot of relearning. I have spent the past few days looking at different configurations of the training and decode process and seeing what helps the baseline. I have also spent some extra time trying to learn the new decode process. I won't lie, I didn't figure it out immediately. I can't just pick something up like that. But I have made progress, and I think I will be able to run a successful adjusted decode after my train finishes today.


 * Plan:


 * Concerns:

Week Ending April 21, 2015

 * Task:

4/15 Today I am going to talk to Sam and the group about a possible new attack strategy for training and decoding. Also, as of right now, my train is still running. I will decode it as soon as I know that it has finished. After we talked, we met in the ECL and built bridges out of kinects as a "team bonding exercise".
 * Results:

4/17 I am about to check Caesar to see if my train has completed. If it has, I am going to start a decode and see if I can lower our baseline.

4/20 Checking in

4/21 My trains and decodes keep failing on the experiment I am currently working with. For the past few days I have been playing with the thought that my trains or decodes were just using bad variables, but now I think my train was done wrong or got corrupted. I'm going to retrain tonight and see if that helps my progress.


 * Plan:


 * Concerns:

Week Ending April 28, 2015

 * Task:


 * Results:

4/24 I am going to try to run a train/decode on a drone machine to see if I can improve the real time factor. Also, Caesar is being overloaded with processes right now (I think there are currently about 12 running), so being on a drone machine will help.

4/26 checking in

4/27 Today I checked my train and decode, and I guess my train was killed or halted in the middle of gentrans, so it's worthless. It is a little too late now to start a new train, so I think I'm going to recycle some older experiments.

4/28 Checking in. My log has started to get more and more barren as time goes on.
 * Plan:


 * Concerns:

Week Ending May 5, 2015

 * Task:


 * Results:

4/30 The group met for a short while to talk tactics on our strategy, then we went to the BBQ and the commencement fair and enjoyed a little break. It was much needed.

5/1 I started a train last night, hoping to get a lower WER before Wednesday. I am also using miraculix, because I do not want my processes being killed accidentally by someone on Caesar. Last week I was trying to run a train, and my process simply halted in the middle. I think either the "plug was pulled", or it was killed to make room for other processes on the server. Either way, if I stay on miraculix, no one should touch my trains.


 * Plan:


 * Concerns: