Speech:Spring 2017 Team Empire


Team Member Logs

 * Julian Consoli
 * Maryjean Emerson
 * Jeffrey Gancarz
 * Andrew George
 * Huong Ha
 * Dylan Lindstrom
 * Cody Roberge
 * Jacob Sprague
 * Vitali Taranto
 * Alexander Turner

Machines

 * Asterix
 * Miraculix

Report

 * [Final Empire Results]

Competition Slack Stream
3-9-2017

MJ Hi team, just wanted to check in and get an update because I couldn't be in class yesterday due to a sick kiddo. What tasks do we need to do for the next couple of weeks?

Vitali Which machines do we get? Do we have to do a team writeup? What is the plan?

3-20-2017

Vitali We don't know what machines we get. We don't have to do a writeup. There is no plan yet. Ok.

Andrew He didn't say we had to do anything that I know of.

3-22-2017

Cody QUICK TYPE THEM HERE

Vitali Majestix, osterix.

Vitali Wrong. Asterix, Miraculix.

Alex Our root experiment number: *0300*. I've created two sub-experiment directories for the first experiment, one for each machine. First experiment on Asterix: *0300/001*. First experiment on Miraculix: *0300/002*.

3-23-2017

Vitali I went to score the test experiment on asterix, and the scoring log has no error, but also no score. Have any of you seen that before?

sclite: 2.3 TK Version 1.3
Begin alignment of Ref File: '001_train.trans' and Hyp File: 'hyp.trans'

Alex Which corpus did you use?

Vitali 5 hour. I'll go ahead and give it a shot using 30hr to see if it makes a difference.

3-24-2017

MJ It shouldn't make a difference. I ran a 5hr corpus while I was in class on Wed and got a full score report.

Vitali Maybe I screwed up on my first run then. In any case the 30hr run did give a full score.

3-27-2017

Dylan Initial experiment on Miraculix using the 5hr corpus seems to be running into issues with the decode, will try it with the 30hr corpus

Andrew I think Miraculix needs to have /usr/local/ copied over

4-2-2017

MJ Have we done the copy yet? If not, is it something anyone can do, or do you need to do it, @roo-t? I will give it a go if I can.

Andrew I haven't had a chance to look into it. Anyone should be able to do it but as to exactly how to, I am not sure.

MJ Ok cool I will see what I can find out. Thanks!

Vitali Hello. I seem to be having some difficulty accessing asterix. It just hangs forever when I try. Can any of you guys get on?

Andrew Hmm, I can ping it, but can't SSH into it. I won't be able to check until after 4:30 tomorrow though

Huong That's weird. You don't think it has anything to do with the miniconda install, right? I was able to ssh in fine at the end of class Wednesday, but haven't tried since. The same problem is also happening with Miraculix. This is the error I get trying to ssh into Miraculix: ssh: connect to host miraculix port 22: No route to host

Andrew Yeah it's not responding to ping. I'll take a look tomorrow.

Huong Cool. Thanks Andrew!

4-3-2017

Andrew Asterix and Miraculix are both back up now

Vitali Thanks. Great work!

MJ Awesome thanks Andrew!

MJ Looks like /usr/local is on Miraculix. I just ssh'd in, went to the directory, and got this:

[root@miraculix local]# ls
bin  etc  games  include  lib  lib64  libexec  sbin  share  src

Those are not the exact same folders that are on caesar, but I don't know if they need to be exactly the same. Here are the directories in local on caesar:

bin  games  include  lib  libexec  man  sbin  share  src  var

Any thoughts? Dylan, have you been able to do the experiment on miraculix yet?

Alex That's probably an outdated version of /usr/local and should still be copied over from Caesar.

MJ Is there a way to ssh into the other machines with our own IDs, or do we have to go through root?

Alex As far as I know we need to use root because our profiles haven't been added

MJ Ok, thanks. I am going to attempt to get local copied over.

Alex Okay sweet, you'll need to use the "rsync" command in order to maintain the sym-links

MJ So instead of cp I am going to do rsync -r?

Alex Yes

MJ Ok, looks like it is all set. I first copied it into the local directory instead of replacing the local directory, but I removed my mistake and did it again, and all the directories look like they are there now.

Andrew Wooooo! Thanks MJ!

Alex Nice

4-4-2017

Vitali Okay, we need numpy and scipy for lda. That means asterix needs internet access.

Andrew Okay

Andrew Alright, Asterix now has internet. I am going to reboot the switch, so if you need access to Asterix right now, use 132.177.188.64 to SSH into it.

Vitali Cool. I will put the python packages on it as soon as I get home.

Vitali Alright. I installed what I needed.

Andrew okay sounds good
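
A quick way to confirm that the packages named above are importable on a machine is sketched below. It is only an illustration: the helper name `missing_packages` is hypothetical, and it checks whichever Python interpreter runs it, which may not be the one the LDA scripts use.

```python
import importlib.util


def missing_packages(names):
    """Return the names that cannot be found by this interpreter's import system."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # numpy and scipy are the two packages the thread says LDA needs.
    missing = missing_packages(["numpy", "scipy"])
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("numpy and scipy are both available.")
```

Running it with the same interpreter that runs the training scripts gives the most useful answer.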

4-5-2017

Jeffrey Directions for keygen:

[jhc1@caesar ~]$ ssh-keygen

Press Enter at the follow-up prompts to accept the default values:

Enter file in which to save the key (/mnt/main/home/sp17/jhc1/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:

Take note of the locations where the keys are stored:

Your identification has been saved in /mnt/main/home/sp17/jhc1/.ssh/id_rsa.
Your public key has been saved in /mnt/main/home/sp17/jhc1/.ssh/id_rsa.pub.

At this point we need to create the symbolic link:

[jhc1@caesar ~]$ cd .ssh
[jhc1@caesar ~/.ssh]$ ln -s id_rsa.pub authorized_keys

And that's it, you're done. I'll probably make a more detailed instruction manual when creating the documentation, but that should be enough to getcha started.

4-12-2017

Vitali Pick "No Trade" 1.) Greg...  2.) Matt... Protect: vitali Taranto

Alex Is someone working on miraculix right now? It appears that /mnt/main/ is disconnected.

Andrew Hmm, not that I know of. But you can access miraculix though?

Alex Sorry for the late response, but yes I can access it. I had to delete my list of known hosts before I could though because it was throwing a man-in-the-middle RSA exception. Also, I remounted /mnt/main and started a 300 hr experiment.

Andrew Oh that's weird, I wonder what happened to it. Everything was working fine the other day. I'll take a look tomorrow.

Alex It was working earlier today too, it just appears that someone unmounted it but never remounted it

4-13-2017

Andrew Okay, yeah I have recently added our accounts to Miraculix so I'm assuming that's why you got the RSA error because I was able to SSH into Miraculix from Caesar with the SSH keygen. As for the /mnt/main issue, I'm guessing someone was on it doing installs and unmounted it because of Jonas' OCD.

4-26-2017

Jeffrey Just made changes to the settings so anyone with the link can see the document: https://foss.unh.edu/projects/images/6/6a/Speech_2017_Empire_Results.pdf

4-30-2017

MJ Do we have to have our results for this week? Or next?

Vitali This week.

MJ Where are we with experiments and results so far? Just want to know where we all are so I can help out. I know Jeff set up the doc. I am willing to do the write up or help with the write up. My skills suck at the programming, scripts side. I have more strengths with documenting if anyone wants or needs help with it

Vitali Before tomorrow, I am going to edit the write up to include the score. I am going to use Alex's experiment with LDA, with the s tags removed.

MJ Ok, is there anything I can do to help you with that, Vitali? Or anywhere else we need to be working on something?

Vitali Well, you have insight into what the data group did, so you could put that into the write up.

Alex Did we actually run any experiments against the data changes?

MJ Yeah, no problem. Let me know if there is anything else though, from any of you guys. I don't believe the new script was made available yet to the rest of the groups, Alex, because of the errors that we kept getting.

Alex Gotcha, so that may be something for next year?

MJ But with your experiment results I can pull data and talk about how the data is manipulated and changed from the original transcripts to the experiment results. Yeah, definitely. It is going to take a lot of deconstructing the original code, so it will take at least one semester, if not a couple, to break it down and rebuild it.

Vitali Hi Jeff. Might be a good idea to give us all edit permissions as well.

Vitali I rescored our 300 hr experiment with LDA without the s tags. Our final result was 45% as compared to the previous year's 50%. That is the score we are going to use.
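
For readers unfamiliar with the scoring, the percentages quoted in this stream appear to be word error rates (WER): substitutions, deletions and insertions divided by the number of reference words. The sketch below shows that calculation under stated assumptions. It is not sclite itself, the function names are hypothetical, and it assumes "s tags" means the `<s>` and `</s>` sentence markers.

```python
# Assumption: "s tags" refers to the <s> and </s> sentence markers.
SENTENCE_TAGS = {"<s>", "</s>"}


def tokens(text):
    """Split a transcript line into words, dropping sentence markers."""
    return [word for word in text.split() if word not in SENTENCE_TAGS]


def word_errors(ref_words, hyp_words):
    """Fewest substitutions, deletions and insertions turning ref into hyp."""
    # Row for an empty reference: every hypothesis word is an insertion.
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        cur = [i]  # empty hypothesis: delete i reference words
        for j, hyp_word in enumerate(hyp_words, start=1):
            cur.append(min(
                prev[j] + 1,                           # deletion
                cur[j - 1] + 1,                        # insertion
                prev[j - 1] + (ref_word != hyp_word),  # match or substitution
            ))
        prev = cur
    return prev[-1]


def wer(pairs):
    """Word error rate over (reference, hypothesis) string pairs."""
    errors = 0
    ref_count = 0
    for ref, hyp in pairs:
        ref_words, hyp_words = tokens(ref), tokens(hyp)
        errors += word_errors(ref_words, hyp_words)
        ref_count += len(ref_words)
    return errors / ref_count


if __name__ == "__main__":
    sample = [("<s> the cat sat on the mat </s>", "the cat sit on mat")]
    print(f"WER: {wer(sample):.1%}")
```

Dropping the sentence markers before alignment matters because markers present in only one of the two files would otherwise count as word errors.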

5-1-2017

Jeff You guys should be good to edit now. Let me know if you cannot edit the document.

MJ Has anyone been taking a look at the experiments being run by the rebels and their scoring logs to see how we are doing compared to them? What information could we get from it?

Vitali We lost. Their WER is going to be around 40%.

MJ Ouch. Any ideas how they did it?

Vitali They focused on parameter changes. I tried to look into advanced methods, but didn't get anywhere with them.

MJ Anything else we can do? We still have some time. Would adding to the dictionary or anything like that help?

Vitali No. We can't run another 300 hour train at this time.

MJ True. Is it all about the score, though? According to the document, the winner is "who bests articulates their tabulated results". Is there anything that we could do to "debunk" their results? I'm just looking through past teams and how they did it. Looks like in Spring 2015 the team that got the lowest score didn't win because their results weren't "real world".

Vitali Greg and I are both using 300 hr test on unseen data, with the s tags removed for scoring. Those are the most real world results that we can show. We don't lose points for losing.

MJ Well, I know our score isn't as good, but I think, Vitali, if you can show a good foundation for why, and research into what you did and why you did it, and it looks well thought out and planned, then we still have a good shot. Oh, I know that. We are just in charge of the final report. Just thoughts on my end.

Vitali Thank you for your thoughts. It does help with the report, because I am reminded to talk about those things.

Vitali So apparently the report is due next week and not Wednesday. Which means I have time to go run more trains. I will still edit the report by Wednesday in case my information is bad.

MJ Well that's awesome! Anything I can do to help? Would modifying the dictionary help at all? Dylan and I can see what we can do.

Vitali So are the new datasets ready? Because if they are and the other team doesn't use them that would be a huge boon.

5-2-2017

MJ The regex isn't working, but Dylan had been working on the dictionary, so we might be able to add to it there.

Alex We might as well try running experiments with the improved dictionary. When will that be ready for use?

MJ @dlindstr0m when do you think we could get a modified dictionary up? @vitali_taranto and @anturner, how much time do you need to run a full experiment?

Vitali 4 days for a full 300 hour. Less for testing.

MJ Between the two of us, Dylan, do you think we could get some things up for them to run?

Dylan The improved dictionary has been created but I've found that when running experiments with anything other than the master dictionary, something fails (looks like it's adding a bunch of duplicate phones to the phone list). This happens even when using the original dictionary released from CMU. I can't find any documentation on it, but the master dictionary on the server that everyone's been using only has ~40k words in it whereas the most recent dictionary release from CMU has ~130k words, so I'm not really sure what's going on.

Dylan But for anyone else who might want to look into it, you can run an experiment using 'makeTrain.dic.pl' which points to the new dictionary named 'test.dic', located in /mnt/main/corpus/switchboard/dist/dict/custom. The error I kept running into happened during the train.

MJ Maybe if we can get some more eyes in it we can figure it out?

Alex I'd like to take a look, I'm curious what kind of exceptions it's throwing

MJ Go for it Alex!

Dylan Sounds good, feel free to make any modifications you need to makeTrain.dic.pl and test.dic as I have both backed up on my PC.

Alex Alrighty sounds good

5-3-2017

Vitali I edited the team report into a state I am happy with for now, but you guys should probably go over it before next week. I have never been much of a writer.

Cody Document shared? 'Cause if it is, we should pin it.

Jeffrey just pinned the document

Alex occurs in the phonelist (/mnt/main/Exp/0300/021/etc/021.phone), but not in the dictionary (/mnt/main/Exp/0300/021/etc/021.dic)

WARNING: This word: UM-HUM was in the transcript file, but is not in the dictionary ( UM-HUM ). Do cases match?

These are the two big issues I've found with the updated dictionary
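
Both problems can be checked before committing to a multi-day train. The sketch below is a rough illustration, not the trainer's own verification step. The function names are hypothetical, and it assumes a dictionary with one `WORD PH1 PH2 ...` entry per line, a phone list with one phone per line, and transcript lines of the form `<s> words </s> (utterance_id)`.

```python
import re


def load_dictionary(lines):
    """Return (words, phones) from lines of the form 'WORD PH1 PH2 ...'."""
    words, phones = set(), set()
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # blank or malformed line
        # CMU-style alternate pronunciations look like WORD(2); fold into WORD.
        words.add(re.sub(r"\(\d+\)$", "", parts[0]))
        phones.update(parts[1:])
    return words, phones


def transcript_words(lines):
    """Collect distinct transcript words, ignoring markers and utterance IDs."""
    found = set()
    for line in lines:
        line = re.sub(r"\([^()]*\)\s*$", "", line)  # assumed trailing (utterance_id)
        found.update(w for w in line.split() if w not in ("<s>", "</s>"))
    return found


def check(dict_lines, phone_lines, transcript_lines):
    """Report mismatches between dictionary, phone list and transcript."""
    words, dict_phones = load_dictionary(dict_lines)
    listed = [p.strip() for p in phone_lines if p.strip()]
    by_lower = {w.lower(): w for w in words}
    missing = sorted(transcript_words(transcript_lines) - words)
    return {
        # Transcript words with no dictionary entry (exact, case-sensitive match).
        "missing_words": missing,
        # Missing words that do exist in the dictionary under different casing.
        "case_mismatches": {
            w: by_lower[w.lower()] for w in missing if w.lower() in by_lower
        },
        # Phones in the phone list that no dictionary entry uses.
        "phones_not_in_dictionary": sorted(set(listed) - dict_phones),
        # Phones that appear more than once in the phone list.
        "duplicate_phones": sorted({p for p in listed if listed.count(p) > 1}),
    }
```

To try it on a real experiment, pass open file handles for the `.dic` and `.phone` files quoted above along with the experiment's transcript file. Filler phones that live outside the main dictionary may show up under `phones_not_in_dictionary`, so that list needs a manual look.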

5-9-2017

Vitali Hi. Is everybody satisfied with the team report?

Dylan Looks good to me

5-10-2017

Vitali Hi. I forgot to bring 3 copies of the team report. Could any of you guys print and bring them?

Andrew Will do!

Vitali Thanks man, you're a lifesaver.