Speech:Spring 2016 Brian Anker Log



Week Ending February 9, 2016
 * Task:

2/3: Determine team requirements. Look through the previous semesters' log files and determine key information pertaining to my team (DATA), such as broken links, missing data, and any other information that will be important to us during the project.

2/6: Read the two remaining logs for the data group in Spring 2015. Putty into Caesar and start exploring and documenting the file structure.

2/7: Read through the current semester logs to see how others are doing, what they have accomplished, and if they have any concerns.

2/8: Document the caesar file structure and learn unix commands.

 * Results:

2/3: I read through the two student logs in Fall 2014, skimmed through a few student logs in Summer 2015, and then read through Krista Cleary's (DATA 2015) and Stephen Griffin's (DATA 2015) logs in detail. There is still much to read, but I cannot continue, as I need to head to bed for work. Krista's log was pretty informative, detailing that she and her team worked on updating soft links, removing/renaming .wav files, fixing Switchboard software so all directories are the same, and cleaning up the existing directories. Stephen's log was VERY detailed and informative, giving a lot of information about what he did during the semester. He stated that the data team objectives for 2015 were to deal with word alignment, transcript, and audio files (this was stated early in the semester). Through most of the semester Stephen worked mainly on creating trains and decoding information. I still have a lot to go through, but reading this information was very beneficial. Note: I did not learn any "key" information, such as immediate tasks our team should work on.

2/6: Read the last two logs in the Data group from Spring 2015. Successfully used the VPN to PuTTY into Caesar.

2/7: Looks like there were a few people who already dove pretty deep into this project, which is great. Some have run trains and experiments, while others have found problems in the file structure. I did notice that a lot of people have not written anything to their logs, though, and it's already Sunday.

2/8: I documented almost the entire Caesar structure, or at least directories that I felt were important. This was very beneficial to me because now I am much more comfortable with working in unix. While I was doing this, I also followed a unix tutorial to learn new and useful commands.

 * Plan:

2/3: Within the next few days I will be navigating through Caesar's file system (/mnt/main/corpus) and documenting file locations and descriptions for my own benefit. This will help me get a better grasp on the file structure. I will also continue reading the logs from previous semesters.

2/6: Next I plan to keep playing around with Caesar and use online resources to better understand the Unix OS and commands that will make my life easier. Sometime within the next week I would like to review soft links and find all of them on Caesar.

2/7: Same as 2/6. I didn't have much time today because I wasn't feeling very well, so I still have a lot more work to get done tomorrow.

2/8: Learn more about unix, keep playing around with caesar, learn about soft links, further educate myself on sphinx.

 * Concerns:

2/3: None at the moment.

2/6: Time.

2/7: Time to figure all this out. I feel like I'm spending too much time reading logs and researching to be as knowledgeable as possible before attempting to make any changes or run scripts. I learn best by experimenting myself, so I really need to start digging deep into this sooner rather than later.

2/8: Nothing at the moment.

Week Ending February 16, 2016
 * Task:

2/10: Play around with more Unix commands, try converting a .sph file to a .wav file, look into soft links, try to understand the training process, and work on the proposal.

2/13: My goal for today is to start working on the group proposal (created in Office 365 by Thomas Rubion), read the logs of my peers to see what they are up to, and play around with Unix.

2/14: Today I will be looking into soft links to better understand how they are created, found, and removed.

2/15: Read the logs of my classmates to see what they have accomplished.
 * Results:

2/10: Successfully converted a .sph to a .wav using sox! Instructions: download sox, download the .sph file to your machine from Caesar using FTP (FileZilla works), create an environment variable so sox can be run anywhere in cmd, then use the command "sox infile.sph outfile.wav" to convert the file (infile.sph is the file you want to convert, and outfile.wav is the file you are creating; name it anything).

2/13: Started working on the group proposal. It still needs more work, but it's a good start. Read peer logs to see what everyone is up to. Looks like a select few have gone ahead and started training.

2/14: I did some searching on the internet and easily found how to create and remove soft links. This is done with the command "ln -s [/path to file] [link name]". I was playing around with creating soft links in my home directory. I was able to successfully create a soft link and then open the file through the soft link. I also looked into how to view broken soft links in Unix and saw that many websites referenced the same command "find -L . -type l". I ran this command on the main directory to see if I would find any broken soft links and got the following output:

[bja2020@caesar main]$ find -L . -type l
find: `./backup': Permission denied
find: `./root/test/sphinxbase': Permission denied
./root/tools/SphinxTrain-1.0/train_dirs
./Exp/0251/009/wav
./Exp/0251/009/data
./Exp/0251/010/wav
./Exp/0251/010/data
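The conversion step described above can be scripted so it isn't repeated by hand for every file. A minimal sketch, assuming sox is installed and the .sph files sit in the current directory (the function name sph2wav is my own, not from the project scripts):

```shell
# sph2wav: convert one .sph file to a .wav with the same base name.
# Equivalent to the "sox infile.sph outfile.wav" command above.
sph2wav() {
    in=$1
    out=${in%.sph}.wav    # strip .sph, append .wav
    sox "$in" "$out"
}

# Batch-convert every .sph in the current directory (skips cleanly if none).
for f in ./*.sph; do
    [ -e "$f" ] && sph2wav "$f"
done
```

The same loop works on Windows under Cygwin or Git Bash once sox is on the PATH, which avoids setting the environment variable per cmd session.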

I will revisit this information with my team.

2/15: I read the logs of my classmates and was surprised that about half of the people don't seem to have done much of anything. Maybe they have and just didn't write in their logs, but seeing how this is a team effort I was quite surprised.
 * Plan:

2/10: Work more with Unix commands, look into soft links, try understanding the training process, and work on the proposal.

2/13: Talk to my team about revising and updating the proposal I made, look into training, look into soft links, and keep messing around with Unix.

2/14: I would really like to understand the training process and run a train sooner rather than later. I read that Sam Sweets' logs were very helpful with this, so I hope to read through this process next time.

2/15: Same as 2/14. I would like to go through the training process next time I log in. Today I was very busy and didn't have time.
 * Concerns:

2/10: Two of my team members were out this week.

2/13: Same as 2/10.

2/14: I feel that there isn't much communication going on between groups. I will try to start openly communicating with others to help jump-start the process.

2/15: Effort from all 20 students in the class.
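The soft-link commands from this week can be tried safely in a scratch directory before touching anything on Caesar. A small sketch (directory and file names here are made up for the demo):

```shell
# Create a file, link to it, then break the link and detect it.
mkdir -p linkdemo && cd linkdemo
echo "hello" > target.txt
ln -s target.txt mylink     # ln -s [/path to file] [link name]
cat mylink                  # reads target.txt through the link
rm target.txt               # target gone -- mylink is now a broken soft link
find -L . -type l           # lists only broken links: ./mylink
cd ..
```

With -L, find follows links, so a link that still resolves no longer tests as type l; only dangling links are reported, which is exactly why the command works for finding broken links.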

Week Ending February 23, 2016
 * Task:

2/17: Out of the 256hr audio utterances, my team was assigned to convert and compare 120 utterances to their correlating word files. We did this by running the command "head -n 30K 001_train.trans | awk '!(NR%250)'" on Caesar. This command takes every 250th utterance out of the first 30,000 utterances, yielding 122 utterance files. We then split them up into 30-31 each and exported this information to a text document so we could share it with each other.

2/18: Today I planned to go through all 30 utterances and compare them to their word files.

2/20: Check in with others and read peer logs to see how they are doing.
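The sampling one-liner can be sanity-checked on synthetic input. One detail worth noting: in GNU head the K suffix means x1024, so "head -n 30K" actually takes 30,720 lines, and floor(30720 / 250) = 122, which matches the 122 files the team got. A sketch using seq as a stand-in for the transcript:

```shell
# awk '!(NR%250)' prints only lines whose number is a multiple of 250.
# head -n 30K = first 30,720 lines (K = 1024), so 122 lines survive.
seq 100000 | head -n 30K | awk '!(NR%250)' | wc -l      # prints 122
seq 100000 | head -n 30K | awk '!(NR%250)' | tail -n 1  # prints 30500
```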

 * Results:

2/17: I copied the first 30 utterance files to a folder in my home drive by using the command "cp -i sw....sph $home/SPH_Files_For_Transfer" for each utterance. Then I used FileZilla to copy them to my desktop, and after that I used sox to convert all of them to .wav.

2/18: I went through and compared all 30 files and have some interesting results. I sampled only 30 of about 250,000 utterance files and found one very minor pronunciation issue and three utterance or audio files that were not consistent with their word files: either an utterance had an added word or the word file had an added word. That's 3/30 bad files, and 1/30 with pronunciation problems.

Pronunciation Issues:

Utterance: AND AND IT'S THEIR MISSION AS THEY DO AS THEY GO DOOR-TO-DOOR AND THEY GO OUT INTO THE PUBLIC AND THEY ACTUALLY HAVE THE UH TEENAGERS SERVING TWO YEARS LIKE YOU WOULD SAY LIKE IN AN ARMY AND TWO YEARS IN GOING AROUND AND DOING MISSIONARY TYPE WORK (sw2015B-ms98-a-0040)
Note: The "and and" sounds like "an and". Again, this is minor, but if there are a lot of these small issues it could be a problem.

Added/Missing Words:

Utterance: NO I DON'T THINK IT'S A MONETARY THING (sw2064A-ms98-a-0080)
Note: "Thing" was not said in the utterance.

Utterance: UM-HUM (sw2082A-ms98-a-0011)
Note: Sounds like "uh mum, mmmm". "AND THEN" was said at the end of this clip, but is not recorded in the text.

Utterance: YEAH (sw2092B-ms98-a-0009)
Note: Has another person's voice saying what sounds like "charts" before I hear "Yeah".

2/20: Read logs from peers.

2/21: Update the proposal format on Wiki.

 * Plan:

2/17: Next time I log in I plan to go through all the utterance .wav files and their correlating word files.

2/18: Next I plan to speak to my teammates about what I have found and then read the logs of others.

2/20: Keep playing around with Unix and look into running a train.

2/21: Updated the proposal format on Wiki.
 * Concerns:

2/17: None.

2/18: If I found errors on the first shot, how many more will there be? The Data Team's portion of this project is overwhelming.

2/20: None.

2/21: None.

Week Ending March 1, 2016
 * Task:

2/24: In class our team found out how to grab the next set of data by using the command "head -n 60K | tail -n 30K | awk '!(NR%125)'". We also decided we would sample 245 files per week instead of 122. This week I will be transferring over my 61 files, converting them to .wav files, and then listening, comparing, and documenting the set of data.

2/26: Listen to the 61 audio files (utterances) to check for errors.

2/27: Document my findings.

2/29: Review peer logs.
 * Results:

2/24: I copied all 61 files into a directory in my home drive called "Week2_SPHFiles" using the command "cp /filepath/{file1,file2,etc} $home/Week2_SPHFiles". This worked flawlessly. I was able to build the comma-separated file list using Excel and its CONCATENATE function. After copying the files to my computer I used Justin's script to convert all 61 files to .wav.

2/26: The first 15 .wav files were good, with the exception of one that had incorrect words associated with the audio file. Files 15-61 were completely wrong: the audio was heard as "you know you can't even buy a loaf of bread in this country" in all of my remaining files.

2/27: sw2325B-ms98-a-0049: Text "UH-HUH" should be "Uh I mean she loves".

sw2334B-ms98-a-0081.sph, sw2335B-ms98-a-0025.sph, sw2335B-ms98-a-0098.sph, sw2336B-ms98-a-0071.sph, sw2337A-ms98-a-0088.sph, sw2338B-ms98-a-0065.sph, sw2339A-ms98-a-0032.sph, sw2339B-ms98-a-0125.sph, sw2340B-ms98-a-0101.sph, sw2341A-ms98-a-0040.sph, sw2342A-ms98-a-0051.sph, sw2343A-ms98-a-0042.sph, sw2343A-ms98-a-0144.sph, sw2344A-ms98-a-0063.sph, sw2345A-ms98-a-0043.sph, sw2346B-ms98-a-0035.sph, sw2347B-ms98-a-0043.sph, sw2348B-ms98-a-0030.sph, sw2349A-ms98-a-0009.sph, sw2350A-ms98-a-0028.sph, sw2350B-ms98-a-0113.sph, sw2352B-ms98-a-0070.sph, sw2353A-ms98-a-0020.sph, sw2354A-ms98-a-0008.sph, sw2355B-ms98-a-0005.sph, sw2355B-ms98-a-0091.sph, sw2356A-ms98-a-0054.sph, sw2358A-ms98-a-0046.sph, sw2359A-ms98-a-0051.sph, sw2360B-ms98-a-0044.sph, sw2361A-ms98-a-0011.sph, sw2361B-ms98-a-0086.sph, sw2362A-ms98-a-0095.sph, sw2363B-ms98-a-0032.sph, sw2363B-ms98-a-0173.sph, sw2365B-ms98-a-0093.sph, sw2367A-ms98-a-0003.sph, sw2368B-ms98-a-0012.sph, sw2368B-ms98-a-0101.sph, sw2369B-ms98-a-0069.sph, sw2370A-ms98-a-0046.sph, sw2371B-ms98-a-0024.sph, sw2371B-ms98-a-0131.sph, sw2372A-ms98-a-0088.sph, sw2373B-ms98-a-0075.sph, sw2374A-ms98-a-0031.sph

All the above recordings play the same audio clip: "you know you can't even buy a loaf of bread in this country". I thought there could have been an error while converting them using sox, but I copied over a few more to double-check and they were the same file.

2/29: Reviewed peer logs.
 * Plan:

2/24: Go through all the .wav files and document any issues.

2/26: Review peer logs; learn about running a train.

2/27: Same as 2/26.

2/29: Learn about running a train.
 * Concerns:

2/24: None at the moment.

2/26: Documenting our progress.

2/27: Same as 2/26.

2/29: Nothing big; documenting our results.
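The 2/24 block-grabbing command came through garbled in the log; a plausible spelled-out form, assuming the same head/awk conventions as the first week's command (with K = 1024, so each "30K" block is 30,720 lines), can be checked on synthetic input:

```shell
# head -n 60K keeps the first 61,440 lines; tail -n 30K keeps the last
# 30,720 of those (lines 30,721-61,440); awk keeps every 125th line.
# floor(30720 / 125) = 245, matching the 245 files per week decided above.
seq 100000 | head -n 60K | tail -n 30K | awk '!(NR%125)' | wc -l   # prints 245
```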

Week Ending March 8, 2016
 * Task:

3/2: Download and convert 61 audio files out of the 60K-90K range of the 250K. *NOTE* This past week our team discovered that 11,158 files were corrupt and played the same audio, from utterance sw233A-ms98-a-0166 through sw2416B-ms98-a-0143. When sampling the audio files we used the /mnt/main/corpus/switchboard/256hr location.

3/5: Listen to all the audio files and compare them to their word files.

3/6: Check if the utterances that were corrupt in the previous week are correct in the "full" audio folder instead of the "256hr" audio folder (located in /mnt/main/corpus/switchboard/).

3/7: Read peer logs / document findings.

 * Results:

3/2: All 61 audio files from the 60-90K set were downloaded and converted on my workstation.

3/5: I reviewed all 61 audio files. There were a few missed words, but nothing substantial like last week's finding of over 11,000 incorrect audio files.

3/6: I copied over 13 utterances from the "full" folder instead of the "256hr" folder. In the 256hr folder these utterances were incorrect and just played the same audio string; in "full" they are in fact the correct audio recordings. It might make sense to make a new corpus by copying over all files in 256hr and then replacing the corrupt files with the recordings from the "full" folder. Verified utterances: sw2334B-ms98-a-0081.sph, sw2336B-ms98-a-0071.sph, sw2339B-ms98-a-0125.sph, sw2343A-ms98-a-0042.sph, sw2346B-ms98-a-0035.sph, sw2349A-ms98-a-0009.sph, sw2352B-ms98-a-0070.sph, sw2355B-ms98-a-0005.sph, sw2358A-ms98-a-0046.sph, sw2361A-ms98-a-0011.sph, sw2368B-ms98-a-0101.sph, sw2371B-ms98-a-0024.sph, sw2373B-ms98-a-0075.sph.

3/7: Read team logs. Seems everyone's doing well and no more corrupt files have been found as of 3/7.
 * Plan:

3/2: Tomorrow I plan to listen to all the audio files for errors.

3/5: Compare the utterance files between the "full" and "256hr" folders.

3/6: Check up on peer logs.

3/7: Prepare for class on Wednesday. Present findings to the team.
 * Concerns:

3/2: None.

3/5: It seems the modeling group is finding more issues related to data. We simply cannot handle the amount of work that is required to fix all these issues.

3/6: Same as 3/5.

3/7: Nothing major.
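One way to confirm that many utterance files contain the same audio clip, without listening to each one, is to compare checksums: byte-identical files produce identical hashes and group together. A sketch in a scratch directory (the file names here are fabricated for the demo, not real corpus files):

```shell
# Two byte-identical "corrupt" files and one good one.
mkdir -p cksumdemo && cd cksumdemo
printf 'same audio' > sw2334B.sph
printf 'same audio' > sw2335B.sph
printf 'different'  > sw2001A.sph
# Sort by hash, then count runs of identical hashes (first 32 chars).
md5sum *.sph | sort | uniq -c -w32   # the duplicate pair shows a count of 2
cd ..
```

Run against the 256hr utterance directory, any hash with a large count would flag the corrupt group directly.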

Week Ending March 22, 2016
 * Task:

3/14: Run a train by following the tutorial on Wiki.

3/15: Build a language model and decode for the train I ran last night.

3/19: Review all the emails that were sent back and forth between the modeling group and Professor Jonas. There was A LOT of key information within those messages pertaining to the Data group.

3/20: Look into the newly made scripts that James and Jon wrote to re-create the corpus.

 * Results:

3/14: Was able to run a 5hr train from Asterix. The script paths from /mnt/main/scripts/user/ are not mapped to Asterix at the moment.

3/15: Successfully ran a train on the first_5hr corpus, built a language model from the first 3,000 files, and decoded on the trained data. Below are my results from the train.

SYSTEM SUMMARY PERCENTAGES by SPEAKER

        |                            hyp.trans                            |
        | SPKR    | # Snt # Wrd | Corr    Sub    Del    Ins    Err  S.Err |
        | sw2001b |   18    163 | 79.8   16.0    4.3   39.9   60.1  100.0 |
        | sw2001a |   14    101 | 74.3   23.8    2.0   52.5   78.2  100.0 |
        | sw2005a |   13    276 | 80.4   14.9    4.7   13.4   33.0  100.0 |
        | sw2005b |   27    307 | 61.6   22.5   16.0   16.0   54.4  100.0 |
        |=================================================================|
        | Sum/Avg |   72    847 | 72.7   18.9    8.4   24.1   51.4  100.0 |
        |=================================================================|
        | Mean    | 18.0  211.8 | 74.0   19.3    6.7   30.4   56.4  100.0 |
        | S.D.    |  6.4   96.3 |  8.7    4.5    6.3   18.9   18.6    0.0 |
        | Median  | 16.0  219.5 | 77.0   19.2    4.5   27.9   57.3  100.0 |

3/19: After reviewing all the emails carefully, I have constructed a list of projects that should be completed ASAP or in the near future. Some of these items might be completed by the Modeling group. All these tasks come from the emails between Jonas, the modeling group, and the data group:

- Verify that the .sph header size is in fact 1 KB (1024 bytes). As it sits, the characters in the header only add up to 154 bytes, but it would only make sense for the header to be 1 KB.
- Check to see if sox uses some sort of "filler" when creating the utterance files, i.e. are the files rounded up (e.g. 4.02349953245 = 4.0235) when using sox? Could this small amount cause the audio file size to be bigger than the transcript size?
- Clean up /mnt/main/corpus/switchboard by moving 125hr_3170/, 3170/, 256hr/, first_5hr/, and fixed30K/ into .../switchboard/old/, and then create 5hr/, 150hr/, and 300hr/ (and of course keep full/). Rename first_4hr/ to first_1K_test/ and use it as the test corpus, but only for that purpose, not to actually attain real results.
- Why is our first_5hr corpus 3.8 hours?
- Create a diagram for the new corpus layout and put it in the wiki.
- Look into the total # of hours in the conversation files. It is supposed to be "over 240 hours", but when James and Ben added the total number of hours up it equaled 258.8563?? Something is very off here.

3/20: Took a look at the scripts that James and Jon created. They look good. We will have to meet with them on Wednesday to go over the results.
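The .sph header-size check in the list above can be done directly: a NIST SPHERE file begins with the line "NIST_1A" and carries the declared header size, in bytes, on its second line. A sketch (the sample file here is fabricated for illustration, not a real corpus file):

```shell
# Make a toy file with a SPHERE-style header, then read the declared size.
printf 'NIST_1A\n   1024\n' > sample.sph
hdr=$(sed -n '2p' sample.sph | tr -d ' ')   # second line = header size
echo "$hdr"                                  # prints 1024
# On a real .sph, the audio data begins at byte offset $hdr, regardless of
# how few characters the text fields actually use -- the rest is padding,
# which would explain the 154 bytes of visible text inside a 1024-byte header.
```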

 * Plan:

3/14: Tomorrow I hope to finish the process by building a language model and decoding.

3/15: 1. Spend more time researching model building, training, and decoding. 2. Look into the total # of hours in the conversation files. It is supposed to be "over 240 hours", but when James and Ben added the total number of hours up it equaled 258.8563?? Something is very off here. 3. Go through all the emails that have been going back and forth between Jonas, the Modeling Team, and the Data team. There was a lot of key information in these, but it was over my head at the time.

3/19: 1. Spend more time looking into training, building a language model, and decoding. 2. Become familiar with PTM (Phonetic Tied Mixtures) and SCTM (State Clustered Tied Mixtures).

3/20: 1. Spend more time looking into training, building a language model, and decoding. 2. Become familiar with PTM (Phonetic Tied Mixtures) and SCTM (State Clustered Tied Mixtures). Speak with the Data Team and the Modeling Team on Wednesday about our future tasks.
 * Concerns:

3/14: None at the moment.

3/15: None at the moment.

3/19: The amount of work that needs to be done. I just dropped my hours at work so I would have more time during the week for this project.

3/20: N/A.

Week Ending March 29, 2016
 * Task:

3/24: (1) Clean up /mnt/main/corpus/switchboard by moving 125hr_3170/, 3170/, 256hr/, first_5hr/, and fixed30K/ into .../switchboard/old/, and then create 5hr/, 150hr/, and 300hr/ (and of course keep full/). Rename first_4hr/ to first_1K_test/ and use it as the test corpus, but only for that purpose, not to actually attain real results. (2) Create the new corpus from the scripts created by Jon Shallow. (3) Create the URC poster.

3/26: Create a 5hr test corpus in my home drive with the scripts created by Jon Shallow.

3/28: WER is around 45% on the newly built 150hr corpus. The data team will be grabbing every 1,000th utterance of the full trans, which is about 250 files across the whole corpus. The four of us will each listen to about 62 files and inspect them for consistency. Justin: 1-62, Brian A.: 63-125, Brian D.: 126-188, Brenden: 189-250.

3/29: Inspect all the audio files.

 * Results:

3/24: In class on Wednesday 3/23 the data team moved the 125hr_3170/, 256hr/, fixed30K/, and the 3170 soft link into the old/ directory under /mnt/main/corpus/switchboard/. Students were working with first_5hr so we didn't want to touch that. Jon just completed the scripts to make the new corpus, so we will be creating the new 5hr, 150hr, and 300hr corpora before next class.

3/26: Making the 5hr corpus (walk-through process):

NOT using sudo (used my account).

Creating directory structure, copying full trans into it:
  cd /mnt/main/home/sp16/bja2020
  perl /mnt/main/scripts/user/makeCorpus.pl CorpusTest
  cd CorpusTest/info/misc
  cp /mnt/main/corpus/switchboard/full/train/trans/train.trans .

Sampling train.trans to get a ~5hr trans (starting with the full 311hr transcript, working in CorpusTest/info/misc):
  awk '{total += $3 - $2} END {print total / 3600}' train.trans
  311.761 (full 311 hour trans confirmed)
  perl /mnt/main/scripts/user/sampleTrans.pl -r 51 train.trans
  awk '{total += $3 - $2} END {print total / 3600}' train.trans-sampled
  6.15305
  mv train.trans train.trans-full (archive full transcript as train.trans-full)
  rm train.trans-remaining (throwaway extra sampling info, not needed)
  mv train.trans-sampled train.trans (sampled trans to work with)
  awk '{total += $3 - $2} END {print total / 3600}' train.trans
  6.15305 (confirms train.trans is the correct sampled trans)

Sampling our ~6hr trans to get eval (~30m, removed), dev (~30m, removed), test/trans/train.trans (~30m, not removed), and train/trans/train.trans (remainder, approx ~5hrs):
  perl /mnt/main/scripts/user/sampleTrans.pl -r 11 train.trans
  awk '{total += $3 - $2} END {print total / 3600}' train.trans
  6.15305 (will be train.trans-old1)
  awk '{total += $3 - $2} END {print total / 3600}' train.trans-sampled
  .57976 (will be dev.trans)
  awk '{total += $3 - $2} END {print total / 3600}' train.trans-remaining
  5.57329 (will be new train.trans)
  mv train.trans train.trans-old1 (archiving the 6.15305 hr train.trans)
  mv train.trans-sampled ../../test/trans/dev.trans (.57976 file now dev.trans in test/trans)
  mv train.trans-remaining train.trans (prepare to sample leftover 5.57329 hr)
  awk '{total += $3 - $2} END {print total / 3600}' train.trans
  5.57329 (confirms train.trans is the leftover trans, ready to sample and remove eval)
  perl /mnt/main/scripts/user/sampleTrans.pl -r 10 train.trans
  awk '{total += $3 - $2} END {print total / 3600}' train.trans-sampled
  .545226 (will be eval.trans)
  awk '{total += $3 - $2} END {print total / 3600}' train.trans-remaining
  5.02807 (will be new train.trans with the eval.trans samples removed)
  mv train.trans-sampled ../../test/trans/eval.trans (sample now test/trans/eval.trans)
  mv train.trans train.trans-old2 (archiving again)
  mv train.trans-remaining train.trans (now working with remaining 5.02807)
  perl /mnt/main/scripts/user/sampleTrans.pl 10 train.trans (sampling WITHOUT removing)
  awk '{total += $3 - $2} END {print total / 3600}' train.trans-sampled
  .511358 (will be test/trans/train.trans)
  mv train.trans-sampled ../../test/trans/train.trans (sample now test/trans/train.trans)
  awk '{total += $3 - $2} END {print total / 3600}' train.trans
  5.02807 (confirming what will be our train/trans/train.trans)
  cp train.trans ../../train/trans/train.trans (copying train.trans into train/trans)
  ls
  train.trans, train.trans-full, train.trans-old1, train.trans-old2 (files left in info/misc)

Verifying test files:
  cd ../../test/trans/
  ls
  dev.trans (awked = .57976), eval.trans (awked = .545226), train.trans (awked = .511358)

Creating utterance links:
  cd ../audio/utt (going into test/audio/utt)
  perl /mnt/main/scripts/user/linkTransAudio.pl ../../trans/dev.trans /mnt/main/corpus/switchboard/full/train/audio/utt/ (creating links for dev.trans to full corpus utts)
  ls -l | wc -l
  448
  perl /mnt/main/scripts/user/linkTransAudio.pl ../../trans/eval.trans /mnt/main/corpus/switchboard/full/train/audio/utt/ (creating links for eval.trans to full corpus utts)
  ls -l | wc -l
  895
  perl /mnt/main/scripts/user/linkTransAudio.pl ../../trans/train.trans /mnt/main/corpus/switchboard/full/train/audio/utt/ (creating links for test/trans/train.trans to full corpus utts)
  ls -l | wc -l
  1296 (final number of utts in test/audio/utt)

Creating utterance links for train/audio/utt:
  cd ../../../train/trans (now in CorpusTest/train/trans)
  ls
  train.trans (awked = 5.02807)
  cd ../audio/utt (now in train/audio/utt)
  ls
  (empty)
  perl /mnt/main/scripts/user/linkTransAudio.pl ../../trans/train.trans /mnt/main/corpus/switchboard/full/train/audio/utt/ (creating links for train/trans/train.trans to full corpus utts)
  ls -l | wc -l
  4016
  exit

3/28: Transferred my 62 utterances over to my computer. I will be listening to them tomorrow.

3/29: All of the utterances were fine.
 * Plan:

3/24: Create a test corpus in my home folder with the scripts that Jon created.

3/26: Run a train on my newly created corpus.

3/28: Listen to the 62 utterances and finish running my train if I have time.

3/29: Make a plan with my new team. If I have time I would like to run a train on the corpus that I built.
 * Concerns:

3/26: None at the moment.
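The awk one-liner used throughout the walk-through sums (end - start) over every transcript line and converts seconds to hours. It can be sanity-checked on a tiny fabricated trans file; the field layout (start time in column 2, end time in column 3) is taken from how the command is used above:

```shell
# Two fabricated utterances of 1800 s each: 3600 s total = 1 hour.
printf 'sw2001A-ms98-a-0001 0 1800\nsw2001B-ms98-a-0002 100 1900\n' > toy.trans
awk '{total += $3 - $2} END {print total / 3600}' toy.trans   # prints 1
```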

Week Ending April 5, 2016
 * Task:

4/1: (1) Created a plan with my team (Captain America). (2) Will be creating a 10hr corpus in the switchboard directory (Brenden and Justin will be creating a 30hr and a 75hr corpus). I will be going to Georgia this weekend for a wedding and won't have much free time.

4/2: Just checking in. At a wedding for the weekend.

4/4: Just checking in. At a wedding for the weekend. I hope to build that ten-hour corpus tomorrow.

4/5: This has been a very, very busy week for me and I was not able to complete the 10-hour corpus. I will try to do this tomorrow morning before class.
 * Results:

4/1: Made a plan to create a 10hr corpus.

4/2: Checking in.
 * Plan:

4/1: Will be creating a 10hr corpus.

4/5: I did not have time this week to create the corpus. I will try to create it tomorrow before class.

 * Concerns:

4/1: None at the moment.

Week Ending April 12, 2016
 * Task:

4/6: This week I have a few tasks that I need to finish. The first is to do an introduction and objectives write-up for our group's URC poster. Second, I need to do some research for my team. Third, I need to create a ten-hour corpus.

4/7: Do some research for my team.

4/8: Create write-ups for our URC poster. I will be doing the information on our introduction and our objectives.

4/12: Collaborate with my teammates in Captain America.

 * Results:

4/6: I went and checked out the other posters downstairs on the first floor to get an idea of what I should be writing. Brenden will be putting together the PowerPoint slide once Justin and I get him our write-ups.

4/7: The information I researched is secret and only available to the Captain America team.

4/8: Wrote up the URC introduction and objectives and sent the information off to Brenden, who will be creating the PowerPoint slide.

4/12: Collaborated with my teammates in Captain America about our plan to win the lowest word error rate competition. We came up with a plan for team members and training that should get us ahead.

 * Plan:

4/6: Still need to do some research for my team and get a better understanding of the parameters used in the train config file. I also need to complete the write-ups for Brenden and possibly get that ten-hour corpus built.

4/7: I still need to do more research, but I will not be providing information on what research I am doing. I also need to do a write-up for the URC poster.

4/8: I plan to do more research for my team, Captain America.

4/12: Will be planning with the Data team on our URC presentation and also collaborating with Captain America on our past week's results.

 * Concerns:

4/6: Nothing critical. Just trying to balance tasks between the data team and our new groups.

4/7: At this point it's about the same as April 6, nothing critical. Just trying to balance the new tasks I am assigned with the ongoing projects of the data group.

4/8: I'm finding it hard to figure out what to write in my log because I am trying to keep information secret and within Captain America. I would like to get credit for my logs.

4/12: All the work that is going on at the end of the semester. I am also studying abroad this summer and the course work is overlapping with the final weeks of this semester. =(

Week Ending April 19, 2016
 * Task:

4/14: I volunteered to be one of Captain America's research log editors, so this is what I will be focusing on this week: editing and reconstructing our research log. The Data Team was also tasked with decoding on a 300hr corpus so the decode only takes about a quarter of the time. I was speaking with Matt about this on Wednesday and I feel we are pretty confused about what actually needs to be done.

4/15: Work on the research log for Captain America. I plan to get the majority of it together for our class on Wednesday. The data team will not be doing the 300hr corpus decode that Jonas requested; it doesn't make sense to do at this time.

4/17: Work on the log for Captain America. Speak with the Data team about the URC conference.

4/18: Keep plugging away at the log for Captain America.

 * Results:

4/14: Editing the research log. This will take some time to put together. Will need to speak with data team members about the decode on the 300hr corpus.

4/15: Awesome results with the log. I came a long way with it.

4/17: Did a little editing in our log. The URC conference is this week and I still need to check in with the data team about it.

4/18: Updated the log with new information from this weekend. I will need to keep doing this each week to stay on top of it all.
 * Plan:

4/14: Keep picking away at the research log and collaborate with the data team on what we are going to do about the decode on the 300hr corpus.

4/15: Try to breathe more, so I don't pass out. Work with Captain America and the Data team on upcoming tasks. Also plan to keep working on the log.

4/17: Check in with the data team about the URC conference. Keep plugging away at the research and logs for my other team, Captain America.

4/18: Work with the data team to make a plan for the URC conference.
 * Concerns:

4/14: Balancing all these tasks with my other classes. I have so much going on right now. =(

4/15: Stress from the end of the semester. I think my heart might give out.

4/17: End-of-semester stress.

4/18: Not too much at the moment.

Week Ending April 26, 2016
 * Task:

4/21: Just checking in. I will be planning to keep Captain America's log up to date with everything that has been going on. Also, the data team will be performing a full decode on the 300-hour train between one server from Team Spark and one from Captain America.

4/23: Update the log for Captain America with the newly acquired data. Keep the communication going between our teams and hope to make a lot of progress.

4/25: Keep up with what's going on between the Data group, Captain America, and Team Spark.

4/26: Just checking in today. I have been very busy this past week with other work and was not able to contribute as much as in previous weeks.

4/21: Have not done anything today. Just checking in and planning my tasks for tomorrow. 4/23: I updated the log with current information. I will keep double checking when more is available. There are some spots that I need to fill in for it to be complete. 4/25: Read through informational emails going back and forth in Captain America. Spoke with Brenden about the 150 hour decode he is running. 4/26: Just checking in today. I heard of talk of a truce between the two competing teams. Will find that out tomorrow. 4/21: I will be planning to keep the captain america log up to date and also be a part of the 300 hour decode between the two servers. Right now Brenden and Justin have been doing that. 4/23: Keep doing what I'm doing. It seems the data teams tasks have slowed down since the break off into Captain America and Team Stark. I will be checking in with Brenden and Justin to see how the decode went on on the two servers from each team. 4/25: Keep puttin' away with the research log. 4/26: Class tomorrow. Get info on what we will be doing on the last few weeks for the capstone class. 4/21: Getting everything together before the end of the semester. 4/23: Same as always...right. 4/25: n/a.. 4/26: No concerns at the moment.
 * Results:
 * Plan:
 * Concerns:

Week Ending May 3, 2016
 * Task:

4/28: Just checking in today. I need to edit the research log with a hypothesis for every experiment we ran. This will involve cross-referencing the research on each parameter we looked up when we started Team Captain America. A detailed and clear hypothesis needs to be written explaining what effect we expected each parameter change to have.

4/30: Just checking in today. I have been very busy these past few days with other classes.

5/1: Edit the research log with a hypothesis for every experiment we ran. As mentioned above, this will involve cross-referencing the research on each parameter we looked up when we started Team Captain America.

5/2: Edit the Captain America research log.

 * Results:

4/28: Since I am just checking in right now, I do not have any results. There are a few parameters that are not well researched and might require me to dig in a little more.

4/30: Just checking in today. I do not have any results to share.

5/1: Just checking in today. I was busy this weekend and wasn't able to get to the log.

5/2: Filled in the information needed in the hypothesis section of the research log. There is possibly more editing that needs to be done, but I will be asking Severna to complete that part of the log, as we were working on it as a team.

 * Plan:

4/28: I plan to fill in the hypothesis for each experiment that we ran and also clean up the look of the log.

4/30: I plan to fill in the hypothesis for each experiment that we ran and also clean up the look of the log.

5/1: Research log.

5/2: Meet with the class on Wednesday to talk about the final changes we would like to make.

 * Concerns:

4/28: n/a.

4/30: None at the moment.

5/2: No concerns at the moment.

Week Ending May 10, 2016
 * Task:

5/4: Today I was sick and was not able to make it to class. I got a few emails from the class about our final report, and I will be reading through them to see how we will be writing it.

5/7: Start writing the data section of the final report. We were planning on splitting up the tasks among the members of the Data group. Also, share the Captain America research log with the rest of the class.

5/8: Just checking in today. I spoke with Brenden and Justin the other day, and we are just waiting for the decodes to finish. We also will be using information from our URC poster for our final report.

 * Results:

5/4: It looks like Team Stark forwarded over their log containing their training information. It also looks like Team Capstone is planning on continuing to run more trains throughout this week with different configurations.

5/7: I spoke with Brenden about how we would like to go about writing our section of the final report. I am waiting on his response for part of it and will get to it ASAP. We hope to have this done by class on Wednesday, May 10.

5/8: Since I'm just checking in, I do not have many results to give today. We just need to get our report together.

 * Plan:

5/4: I plan to speak to the Data team and Team Capstone about our final weeks' work. I did complete Captain America's research log, with the exception of some editing.

5/7: Finish the report and see where Brenden and Justin are with the decode. Last I heard, it had still not completed.

5/8: Get our report together and prepare for the final class. We don't have too much left to do.

 * Concerns:

5/4: I don't have any concerns at the moment. Everything seems to be coming together.

5/7: No concerns as of right now.

5/8: I do not have any concerns as of right now.