Speech:Spring 2012 April 10th Group A

From Openitware


Sub Groups


Group Members


Group Log

Skype Meeting: Friday April 6, 2012 2:30pm


  • train10 is the base used for this week, created and modified by Brice Rader
  • A few changes were made to scripts and dictionaries. Here is the change log (thanks to Prof. Jonas):
    • Replaced genPhones.sh with genPhones.csh, which can now take a parameter: ./genPhones train9
      • Use ./ in front of the command. If that doesn't work for some reason, type: csh ./genPhones train
      • Located in /root/SCRIPTS
    • Fixed genTrans.pl so it now names the files train10_train.fileids and train10_train.trans...
      • i.e. it now adds the _train suffix properly
    • Words were added to train10.dic so that it has every word that was transcribed in the transcript used for train10
    • The train10.phone dictionary was also updated so that it works properly
  • Took out text that wasn't associated with a .sph file in the transcript file: trans_unedited.txt
  • Prof. Jonas added to (insert filename here) so that certain characters, such as "-", in the transcript file are ignored when training
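
The dictionary fix in the change log (every transcribed word must be in train10.dic) can be checked with a short script. This is only a rough sketch, not the script Prof. Jonas used. It assumes SphinxTrain-style files: dictionary lines of the form "WORD PH1 PH2 ..." with alternate pronunciations marked as WORD(2), and transcript lines of the form "<s> words </s> (utterance_id)". The function names and the case-insensitive comparison are our own assumptions.

```python
import re


def load_dictionary_words(dic_lines):
    """Collect the headwords from pronunciation dictionary lines."""
    words = set()
    for line in dic_lines:
        line = line.strip()
        if not line:
            continue
        head = line.split()[0]
        # Assumption: alternate pronunciations look like WORD(2).
        head = re.sub(r"\(\d+\)$", "", head)
        words.add(head.upper())
    return words


def transcript_words(trans_lines, ignore_chars="-"):
    """Collect the words used in transcript lines.

    Tokens made up only of characters in ignore_chars (e.g. a lone "-")
    are skipped, as are the <s>, </s> and <sil> markers.
    """
    words = set()
    for line in trans_lines:
        # Assumption: the utterance id is a trailing "(...)" group.
        line = re.sub(r"\([^()]*\)\s*$", "", line)
        for token in line.split():
            if token in ("<s>", "</s>", "<sil>"):
                continue
            if not token.strip(ignore_chars):
                continue
            words.add(token.upper())
    return words


def missing_words(trans_lines, dic_lines, ignore_chars="-"):
    """Return the sorted transcript words that have no dictionary entry."""
    found = transcript_words(trans_lines, ignore_chars)
    return sorted(found - load_dictionary_words(dic_lines))
```

To run it against real files, read the lines of train10_train.trans and train10.dic (for example with open(path).readlines()) and pass them to missing_words. An empty list means the dictionary covers the transcript.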


Notes
  • After you get the .sph files and move them to your wavTemp file, make sure that you grab the correct transcripts
    • e.g. cp -a /media/data/Switchboard/disk23/swbi/sw049*.sph .
  • We needed to grab the sphinx_train.cfg file from train1 because, for whatever reason, it wasn't created when we initially started building the train
  • train#.phone is the phones dictionary - located in the etc directory of the task directory
  • train#.dic is the dictionary - located in the etc directory of the task directory
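
Since sphinx_train.cfg went missing without anyone noticing at first, a quick check of the etc directory before training may save some time. The sketch below is only an illustration: the file names come from the notes above, but it assumes that the .fileids, .trans and sphinx_train.cfg files all sit in the same etc directory as the dictionaries, and the function names are made up for this example.

```python
from pathlib import Path


def expected_etc_files(task_name):
    """File names the notes above mention for a task such as train10."""
    return [
        f"{task_name}.dic",            # dictionary
        f"{task_name}.phone",          # phones dictionary
        f"{task_name}_train.fileids",  # written by genTrans.pl
        f"{task_name}_train.trans",    # written by genTrans.pl
        "sphinx_train.cfg",            # assumption: also kept in etc
    ]


def missing_etc_files(task_dir, task_name):
    """Return the expected files that are not present in task_dir/etc."""
    etc_dir = Path(task_dir) / "etc"
    return [
        name
        for name in expected_etc_files(task_name)
        if not (etc_dir / name).is_file()
    ]
```

For example, missing_etc_files(task_dir, "train10") returns an empty list when everything is in place, where task_dir is the path to your own task directory.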