Speech:Spring 2012 Johnny Mom Log
Week Ending February 6, 2012
- Completed the reading on FOSS assigned from Professor Jonas's Twitter.
- Installed VMware Workstation 8.0
- Created a new virtual machine and installed openSUSE 12.1
- Trying to figure out how to install Sphinx 3 by doing some research and reading some articles.
- Currently trying to install Sphinx 4 instead, since Sphinx 3 has barely any documentation and the previous group who did the speech project installed Sphinx 4.
- I did some research, and Sphinx 4 on openSUSE needs (1) the JDK (Java Development Kit), (2) Ant 1.6.0 or better, and (3) Subversion for interacting with the svn tree.
- Downloaded Sphinx 4 Source (sphinx4-1.0beta6-src.zip) and Binary (sphinx4-1.0beta6-bin.zip).
- Currently trying to get used to this terminal environment and get Sphinx compiled and configured so I can play around with some of the test demos on the site.
- Still having issues installing Sphinx 4. There doesn't seem to be any good tutorial on how to install it, let alone anything describing how it functions.
- I installed the JDK, Ant, and SVN using YaST. I went to the terminal and typed sudo yast, and a blue screen with the YaST menu popped up (almost like in an XP installation). From there I was able to browse and search the software and install the JDK, SVN, and Ant.
- Still playing around with openSUSE to figure this Sphinx installation out.
- Read every peer's log to keep up with what they've done; they seem to be struggling with installing Sphinx 4 as well.
- Unzipped sphinx4-1.0beta6-src.zip and sphinx4-1.0beta6-bin.zip, and tried to set up the Java environment (JSAPI.jar) but could not find it. I think I'm missing something; I must do some more reading to understand how this Sphinx application works.
- Still trying to figure out the Sphinx installation. It's frustrating, to say the least.
- Talk to my DATA group (Brandon Mclaughlin, Mike Henenberg)
- Getting to a good understanding of this project, and what my purpose is in the DATA team.
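A minimal sketch of checking the Sphinx 4 prerequisites listed above (JDK, Ant, Subversion) before attempting a build. It is written in Python purely for illustration. The executable names javac, ant, and svn are assumptions about how the packages show up on PATH, and missing_tools is a hypothetical helper, not part of Sphinx or YaST.

```python
import shutil

# Assumed executable names: javac for the JDK, ant for Ant, svn for Subversion.
REQUIRED_TOOLS = ["javac", "ant", "svn"]


def missing_tools(tools, which=shutil.which):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]
```

Calling missing_tools(REQUIRED_TOOLS) returns an empty list when all three tools are installed; anything it returns would still need to be installed (for example through YaST) before building Sphinx 4.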
Week Ending February 13, 2012
- Change the wiki page to a more organized layout to meet Professor Jonas's requirements.
- Get in contact with Mike Henenberg to get him in the loop on what is going on for our group since he was absent for the class meeting.
- I will look into other speech tools necessary to capture and analyze information.
- Hopefully I will be able to start working on the settings for Sphinx.
- Changed the layout in accordance with Professor Jonas's template.
- Contacted Mike Henenberg and got his Skype username, and also Brandon Mclaughlin's, as we will have a VOIP meeting this weekend. In the meeting we will discuss the group's purpose and goals at this point for the proposal. (Myself, Brandon Mclaughlin, and Mike Henenberg)
- Installed UNHM VPN to access the UNH Manchester network to play with the sphinx environments.
- Looked through the Traubadix server to understand file directories for sphinx 3 switchboards.
- For some reason I cannot log into Traubadix with my username (jpr62) anymore. It was working when we first set it up. I will have to go back and try to fix this. So at this point I will be using the root account to look around in the Traubadix server.
- Looking into some Perl Tutorials in preparation for creating Perl Scripts.
- Went over everyone's logs to see if there is anything I could add to help them, and also take the information and relay it to my group to help what we are doing.
- Met up with Mike H. and Brandon M. on Skype for a meeting on where we need to be, and it seems that we don't have a good idea of what needs to be done. We have to coordinate with the class mentor, James, who was in the same group when he took the class, to help guide us.
- I found a great link that can show how sphinx functions: http://cmusphinx.sourceforge.net/sphinx4/#language_models (Look under "Sphinx-4 Architecture and Main Components") It helped me understand more on how sphinx functions.
- Read peers' logs
- Noticed that the server traubadix shows up as troubadix: when I ssh'd into it with "ssh traubadix", the name comes up as "troubadix", which we will change to make it less confusing when switching between servers.
- Meet up with James to have him demonstrate Sphinx
- Create better documentation so that it will be a lot easier for students to understand Sphinx, since I believe the information on the internet is lacking for openSUSE.
- My concern is that I still don't fully understand what we are supposed to do. There also isn't much documentation on the internet to tell us where to turn. It's something we need to change this semester.
Week Ending February 20, 2012
- Change my wiki page to be more consistent in layout between all logs.
- Work on the proposal
- Work on a Perl script to parse through transcripts for Sphinx 3 requirements
- Work on stripping audio from .sph files to .wav files.
- Skype meeting with Brandon M. and Mike H. to bring Mike H. up to speed, since he did not attend the UNIX micro lab, which helped a lot in understanding what we needed to do. We discussed roles to split up the workload for this data project.
- Looking at some reference Perl scripts to modify and use for testing on the first 100 lines of the transcript file.
- Skype meeting with Brandon M. and Mike H. to work on the proposal.
- Formatted the proposal section of the wiki to add the data group sections for the proposal.
- We have completed a rough draft of the overview and the proposal but still need to finish the Phases of the proposal.
- Fixed wiki layout to look consistent between all my previous logs.
- Joined the Google Group for CIS 790, which James created. The invitation wasn't sent via email but through the MOTD on Caesar, which I found was a smart way to see who actually logged in and saw it. The link is here.
- Checked to see if the SoX command was working, and it was already installed in openSUSE. Now working on converting one of the SPHERE files to WAVE. I will report tomorrow whether the SoX command is successful or not.
- Finished the rough draft proposal for the Data group which will be finalized on Sunday.
- Finished the rough draft of the proposal on our part.
- Created the Mini Train and Full Train sections but still need to figure out who is going to work on them.
- Converted an .sph file to .wav file successfully using the SoX command. It creates a wave file in the same directory as the sph file. (sox SPHEREFILE.sph CONVERTEDSPHEREFILE.wav) I just need to figure out how to automate this process to do multiple files.
- Automate the process to convert .sph files.
- Work on improving the proposal.
- My concern is getting the proposal finished on time, especially with the addition of the mini and full train sections.
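One possible way to automate the single-file SoX conversion noted above (sox SPHEREFILE.sph CONVERTEDSPHEREFILE.wav), sketched in Python for illustration only. The function names are hypothetical, and the sketch assumes the sox executable is on PATH and that every .sph file in one directory should be converted in place.

```python
import subprocess
from pathlib import Path


def sox_command(sph_path):
    """Build the sox argument list that converts one .sph file to a .wav
    file with the same base name in the same directory."""
    sph_path = Path(sph_path)
    return ["sox", str(sph_path), str(sph_path.with_suffix(".wav"))]


def convert_directory(directory, run=subprocess.run):
    """Run sox on every .sph file in `directory`; return the .wav paths."""
    converted = []
    for sph_path in sorted(Path(directory).glob("*.sph")):
        command = sox_command(sph_path)
        run(command, check=True)  # check=True raises if sox exits with an error
        converted.append(command[-1])
    return converted
```

The run parameter exists only so the loop can be tried out without SoX installed; in normal use convert_directory("some_directory") would call sox once per file.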
Week Ending February 27, 2012
- Completing my section (3-4 Paragraphs) of the proposal.
- Skype meetings during the week with Brandon to gain another perspective.
- Figure out the purpose of the files that need to be documented on FOSS.
- Start creating a new version of the "run_decode.pl" to output files to the correct folders.
- Skype meeting with Brandon to gain perspective on what needs to be done.
- Start looking at the run_decode.pl script functions.
- Looking at and documenting which files need to be where.
- Start writing up the proposal.
- Finished writing up the rough proposal and timeline. I will look into revising it tomorrow.
- Looking at the files in /root/speechtools, testing the decode script of a new task to see exactly what files are created and documenting each file created in preparation to figure out what files need to be in what folder.
- Finalize proposal.
- Work on understanding output files created by decode experiment and organize which folders they need to be in.
- Start working on creating a new script to test.
- I'm concerned about whether I can create an efficient new script in Perl, since I'm not a programmer by trade (I work as a Systems Administrator).
Week Ending March 5, 2012
- Understand how the script works and what needs to be done before the script can be successfully executed.
- Understand the files created by the decode experiment.
- Start to document each file and folder via the output by the decoder.
- Followed and read these instructions: http://foss.unh.edu/wiki/index.php/Speech:Summer_2011_Training to understand more on the training and decoding process for sphinx.
- Figuring out a way to streamline the process so there aren't so many steps to do beforehand.
- Completed parts of Train and Decode of "train3"
- Now looking to streamline for the decode part of the process.
- Skyped with Brandon today to have him help me look at the Perl script I'm working on.
- Figured out that the models need to be trained before decoding can be done. Slowly learning more about the Sphinx system.
- The decode script takes the files already created in the training process of the task.
- Documented the difference between train1 and train3 (my exp task). It seems like there are some extra files in train1 that aren't supposed to be there.
- Created a newer decode script that creates folders and dumps files from the task's directory into a specific experiment ID folder.
- Create wiki documentation on files of train and decode.
- I don't fully understand whether I need to move the files to the exp directory after or during the decoding process... (/mnt/Main/Exp/)
Week Ending March 19, 2012
- Create a new individual script that will move files accordingly from the "/root/speechtools/SphinxTrain-1.0/*taskName*" and place them in the "/mnt/main/Exp". The script will then be integrated into the decode script and will help streamline it so there is one less script to run.
- Document the experiment files on foss.unh.edu in Spring 2012 under Notes
- Created wiki section Notes and the Experiment Files Documentation page.
- Started the intro, listed the folders on the documentation page, and started writing descriptions and explanations of the folders in the experiment folders.
- Read logs to keep up to date with everyone else.
- Created a perl script to move files but need to figure out if I can create folders automatically without having to manually create them.
- Read logs
- Trying to figure out a way for the script to move up in increments based on the current experiment ids in the EXP folder.
- Still trying to figure out a way for my script to automate it.
- The script may have to take the command " ./move_script.pl 1003" to where 1003 would be the folder name...and then the files can be moved over.
- Add on to the script to create folders based on experiment number.
- Update more information on files on sphinx.
- Creating a script good enough to take the experiments already there and add the next increment: if 1001, 1002, and 1003 already exist, the script will create a folder named 1004. It should be feasible, but since I'm not a strong programmer/scripter I am a tad concerned about whether I can accomplish it.
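The increment idea described above (1001, 1002, 1003 already present, so the next folder is 1004) can be sketched as follows. This is a Python illustration, not the Perl script from the log. The function name is hypothetical, and starting at 1001 when the directory is empty is an assumption.

```python
import os


def next_experiment_id(exp_root):
    """Return the next free experiment ID under exp_root as a 4-digit string."""
    existing = [
        int(name)
        for name in os.listdir(exp_root)
        # Only count subdirectories whose names are purely numeric.
        if name.isdigit() and os.path.isdir(os.path.join(exp_root, name))
    ]
    # Assumption: numbering starts at 1001 when no experiments exist yet.
    next_id = max(existing) + 1 if existing else 1001
    return f"{next_id:04d}"
```

Taking the maximum rather than counting folders keeps the numbering correct even if an earlier experiment folder has been deleted.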
Week Ending March 26, 2012
- Create a more intuitive script that will not just move specific files to an experiment folder but move them to a directory recursively based on 1001, 1002, 1003, and so forth.
- Work on documenting the experiment files on foss.unh.edu in Spring 2012 under Notes
- Put in more info on "Sphinx Experiment Notes"
- Still trying to add on to the current script of creating directories recursively rather than having to specify an already created directory.
- Still trying to get the recursive script to create those folders rather than moving files.
- Added more information on Sphinx Experiment Notes.
- I have a script I'm looking at that will create recursive directories; I just need to get it working with the current filesystem we have in /EXP, which seems complicated. I'm currently playing around with it, changing some parts of the code to see what they do. (http://www.perlmonks.org/?node_id=183899)
- Finish up this script.
- Complete Documentation of experiment folders.
- I've got the moving-files part working; I'm just afraid I won't have it done the way I want in the end.
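For the directory-creation half of the problem, a minimal Python sketch is shown here as an illustration rather than the Perl approach from the linked PerlMonks thread. The subfolder names are placeholders borrowed from directories mentioned elsewhere in this log, not the layout the class actually settled on, and create_experiment_dirs is a hypothetical name.

```python
import os

# Illustrative subfolder names only; the real experiment layout may differ.
DEFAULT_SUBFOLDERS = ("etc", "feat", "model_parameters")


def create_experiment_dirs(exp_root, exp_id, subfolders=DEFAULT_SUBFOLDERS):
    """Create exp_root/exp_id plus its subfolders, including missing parents."""
    exp_dir = os.path.join(exp_root, exp_id)
    os.makedirs(exp_dir, exist_ok=True)
    for sub in subfolders:
        # makedirs builds every missing intermediate directory;
        # exist_ok=True makes rerunning the script harmless.
        os.makedirs(os.path.join(exp_dir, sub), exist_ok=True)
    return exp_dir
```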
Week Ending April 2, 2012
- Finalize the two scripts.
- Work on documenting the experiment file structure steps.
- Now since I will not complete the script to create recursive directories on time, I will finalize two separate scripts.
- The first script will create the folder with the subfolders and the experiment folder will be able to be created with an argument such as: "./create_expfolders.pl 1003"
- The second script will copy the train directories and files into the specified folders under "/mnt/main/Exp/1003".
- The script now accepts arguments; I just need a way for it to put an argument such as "1002" into the path. It seems like the script is coming together.
- After a long weekend, I have completed the scripts!
- The two scripts will be located in "/root/SCRIPTS/expdir_scripts"
- The "create_expdir.pl" will create the experiment directories.
- The "move_to_expdir" script will move the specified train files over to the exp directories created with "create_expdir".
- I feel very accomplished about this, after many hours of testing it works with no issues. It isn't the most intuitive Perl script out there but it will do the job.
- I will finish up the Experiment setup here to give a better overview of how it works, etc: http://foss.unh.edu/wiki/index.php/Speech:Info
- Completed the Experiment Setup Explanation, and the Experiment Directory Explanation.
- The documents can be looked at here under experiment setup: http://foss.unh.edu/wiki/index.php/Speech:Exp
- If we have time, maybe rework the code to be more intuitive (fewer lines of code). To be honest, the script works, but it's a bit long and repetitive.
- The only thing I may need to do is tweak the script a bit for the class's needs and for future classes.
- Add on to the Experiment Explanation if needed.
- No concerns as of this moment. It seems I have done what I needed, although I may have to coordinate with Aaron on the official location of the train directories and such.
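The argument-driven second step described above can be sketched like this: take an experiment ID such as 1003 from the command line, then place the train files in that experiment folder. This is a Python illustration under assumptions and not the contents of the Perl scripts. Both function names are hypothetical, the files are copied rather than moved, and dirs_exist_ok requires Python 3.8 or later.

```python
import shutil
from pathlib import Path


def experiment_id_from_args(argv):
    """Return the experiment ID passed as the only argument, e.g. '1003'."""
    if len(argv) != 2 or not argv[1].isdigit():
        raise SystemExit("usage: move_to_expdir.py EXPERIMENT_ID")
    return argv[1]


def copy_task_to_experiment(task_dir, exp_root, exp_id):
    """Copy everything under task_dir into the existing exp_root/exp_id."""
    target = Path(exp_root) / exp_id
    if not target.is_dir():
        # The experiment folder is expected to have been created beforehand.
        raise FileNotFoundError(f"{target} does not exist; create it first")
    # dirs_exist_ok=True merges into the already created directory tree.
    shutil.copytree(task_dir, target, dirs_exist_ok=True)
    return target
```

A script built on this would call experiment_id_from_args(sys.argv) and pass the result to copy_task_to_experiment along with the task directory and the experiment root.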
Week Ending April 9, 2012
- Finalize the two scripts.
- Finalize Documentation for experiments in preparation for mini train.
- Help work on Poster
- Wrote up some stuff for the experiment structure documentation.
- Looking to tweak the scripts so they work directly out of the Exp directory rather than /root/speechtools/SphinxTrain-1.0
- Added more to experiment structure explanation.
- Started work on creating a draft of what the poster will look like in Photoshop.
- Helped Brandon with his script on converting sph to wav for mini train.
- Working up to completing documentation by Tuesday.
- Worked with Brandon to help him out on getting 80hrs for the mini train.
- Got the genTrans.pl script to get rid of all "laughter", and "noise" in the trans file.
- Completed experiment structure explanation.
- Work on completing poster.
- Work on assigned mini-train, Exp 1002.
- At times I'm confused about where I am on the project, but it seems that I've completed my section of the project the best I can. I may have some things to work out, such as the directories created by the script.
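A rough Python illustration of the kind of cleanup mentioned above for genTrans.pl, removing "laughter" and "noise" from the trans file. The actual genTrans.pl logic is not shown in this log; the sketch assumes the markers appear as bracketed tokens such as [laughter] and [noise], which may not match the real file, and strip_markers is a hypothetical name.

```python
import re

# Assumption: markers look like "[laughter]" or "[noise]" in the trans file.
MARKER = re.compile(r"\[(?:laughter|noise)\]", re.IGNORECASE)


def strip_markers(line):
    """Remove laughter/noise markers and collapse the leftover whitespace."""
    without_markers = MARKER.sub(" ", line)
    return " ".join(without_markers.split())
```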
Week Ending April 16, 2012
- Do all the tasks on my new group page with my new group.
- Finalize Documentation for experiments in preparation for group mini train via new exp directory. Add some screen shots to show hierarchy. Make changes to documentation to reflect what my group is doing.
- Help work on Poster
- Wrote up some last minute stuff to submit on documentation. (a better explanation)
- Read new group's logs
- Could not make it to this afternoon's Skype meeting. (Problems with servers at work)
- Our group needs to run scripts off of /mnt/main/scripts/ rather than in the root directory. If that is the case, then I will have to change the experiment documentation to reflect that rather than the root directory.
- At this point I will try to look at the current scripts and have them work off of /mnt/main/scripts/.
- Submitted changes to the experiment directory documentation so it starts to reflect trains working from /mnt/main/Exp/
- Moved my expdir_scripts folder over to /mnt/main/scripts
- Deciding whether I should do a screenshot of the Exp directories or just list them out in a nicer format to show the hierarchy.
- Added screenshot of exp directory
- With my group, we followed the steps to copy files over to /mnt/main/, but the last step on the install page isn't working.
- Complete experiment with group
- Continue finalizing exp documentation if asked by professor.
- Completing the group experiment is a concern, since we are having an issue with the last step.
Week Ending April 23, 2012
- Do all the tasks on my April 24th group to complete a proper experiment
- Tried running train1 created by Aaron. The sph files need to be copied over from the /flat directory into wavTemp to be converted with the script.
- Will try genTrans.pl to see if it executes.
- I tried doing my own train and ran into the issue group 1 was facing, which was getting the setup_SphinxTrain.pl script to run. I changed the SPHINXTRAINDIR=$0 to "SPHINXTRAINDIR = /mnt/main/root/speechtools" but it didn't work. Trying to figure out what the script is looking for.
- Finished exp documentation, will need to change to reflect new locations in /mnt/main though.
- Copied all sph files in wavTemp, from "/mnt/main/corpus/dist/Switchboard/disk1/*". Will continue to follow steps in Speech:Summer 2011 Training.
- Copied scripts to etc. (genPhones.csh, genTrans.pl )
- Copied dictionary to task etc directory. (train1)
- Copied raw training script to task etc directory and renamed it to trans_unedited.txt
- Ran genTrans.pl trans_unedited.txt train1 and it worked.
- I started the experiment from "/mnt/main/Exp/0006" (which is our experiment directory). I should have done it in the folder 0002 but forgot about that.
- I've run everything from setting up the tasks directory all the way to "Copy filler file to etc directory.", so we are on the "Run make_feats.pl" part of Speech:Summer 2011 Training.
- I'm currently having issues running the "make_feats.pl" script so I need to figure out what it needs to have it run from "/mnt/main/Exp"
- The "make_feats.pl" script seems more difficult to understand than other scripts. I will have to work with my group to decipher this. (Specifically Aaron since he was in the train group so he will know specifically what it does, etc.)
- If given one more week hopefully we can run this experiment if we get past the make_feats.pl issue.
- We need to get make_feats.pl to work and it seems that it's a complicated script, so I'm not sure if we can fix it on time.
Week Ending April 30, 2012
- Get a train running with the modeling group.
- Keep working on Experiment Documentation.
- We got make_feats working with Brice R. and John S.
- Reading group logs to keep on track with modeling group.
- Proofreading the rough draft report to see if I can add to it or suggest improvements by contacting Aaron Green.
- We figured out the make_feats.pl issue that we were confused about. It seems the make_feats.pl script was looking for mfc files in the feat directory. We had John S. copy the feat directory from /media/data/Speech/train/train10 to /mnt/main/Exp/0001/, creating a feat directory with mfc files in it. We have now run the make_feats.pl script successfully.
- We now need to run the RunAll.pl script to finally create the models in the "model_parameters" directory.
- Read over the Report group's current progress on "http://foss.unh.edu/wiki/index.php/Speech:Spring_2012_Report". Everything looks good; Aaron Green is doing a great job of keeping everyone in the loop.
- Currently still having the issue with the RunAll.pl script; still trying to figure it out.
- Looking over the Exp documentation, adding more stuff.
- Currently still working to get the RunAll.pl script running.
- Help Report group finish the report.
- Complete our train with the modeling group.
- I'm concerned that we will not finish the train, since we are running into a lot of problems running it out of /mnt/main/Exp/
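Based only on the observation above that make_feats.pl appeared to be looking for mfc files in the feat directory, a small pre-flight check could look like the following Python sketch. The function name is hypothetical, and searching subdirectories of feat is an assumption about how the files are laid out.

```python
from pathlib import Path


def find_mfc_files(exp_dir):
    """Return the sorted .mfc files found under exp_dir/feat, or [] if none."""
    feat_dir = Path(exp_dir) / "feat"
    if not feat_dir.is_dir():
        return []
    # rglob also searches subdirectories, in case the files are nested.
    return sorted(feat_dir.rglob("*.mfc"))
```

An empty result for a directory such as /mnt/main/Exp/0001 would suggest the feat directory still needs to be put in place before the later steps are run.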
Week Ending May 7, 2012
- Have the train work.
- Look over Report for Aaron G.
- Look over experiment documentation to see if there is anything that needs to be fixed.
Tuesday (5/1): (Don't know why the wiki didn't save, but I made a log Tuesday afternoon in class.)
- Fixed part of phase 7 of the RunAll.pl script with Ted, Brice, and John S. Ted found an article online stating that we need to run it in the root of the experiment directory which is /mnt/main/Exp/0001.
- We (John, Ted, Brice, and I) talked about how many phases there are. We weren't sure based on the code in RunAll.pl, but we'll try to work through them while fixing the issues to get the script to fully work.
- We are stumped at the point where the RunAll.pl script executes the verifyall.pl script, at line 52: "/mnt/main/Exp/0001/scripts_pl/01.vector_quantize/slave.VQ.pl."
- Currently we have no leads on where to go from here.
- Working on writing a part of my results in doing the experiment documentation in the report for the reporting group.
- Completed Train. Now working on decode.
- Complete the Decode! We may be able to get it done before the semester is over.
- Not completing this experiment.