- Semesters - Project Work by Semester
- Experiments - List of speech experiments
- Unix Notes
- Speech Corpus Setup - Switchboard, NOAA
- Speech Recognition Related Readings
- Experiment Setup
- Scripts Page
- Model Building - more info on data prep, language models, & building models
Academic paper describing functional and performance differences between Sphinx 3.6 and Sphinx 4.0.
Academic paper discussing baseline scores for recognition of speech transmitted over telephone networks, and the sources of degradation.
Academic paper describing changes made within Sphinx 3.X for improved efficiency.
Academic paper discussing speech recognition experiments using Sphinx 3.X.
Academic paper that discusses converting speech to digital math equations.
Academic paper that proposes a new framework by refactoring Sphinx 4 in a service-oriented computing style.
Academic paper that discusses Learning-Based Auditory Encoding for Robust Speech Recognition.
Academic paper that discusses Recent Advances in Speech Recognition.
Academic paper describing a speech recognition architecture. Second page has a technical Sphinx decoder description.
Academic paper showing how using an MWF (filter) during voice recording improves speech recognition performance. Uses Sphinx 4 to test the difference.
CMU article on Sphinx 3 vs Sphinx 4 performance comparison.
Catalog page with more information on the Switchboard audio data: https://catalog.ldc.upenn.edu/LDC97S62 (current as of 3/26/2018)
IBM reports a WER of 5.5%.
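For context, WER (word error rate) is conventionally the number of word substitutions, deletions, and insertions needed to turn the recognizer's output into the reference transcript, divided by the number of words in the reference. The sketch below illustrates that calculation with a word-level edit distance. The function name `word_error_rate` and the plain whitespace tokenization are assumptions made for illustration; this is not code from Sphinx or from the IBM article, and real scoring tools typically normalize case and punctuation first.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return the word-level edit distance divided by the reference length.

    Illustrative helper (hypothetical name): splits on whitespace only.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
    print(f"WER: {wer:.1%}")
```

Under this definition, a WER of 5.5% corresponds to roughly 5.5 word errors per 100 reference words.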
The IBM article noted that most of the speakers in its training data set were also in its testing data set. Two papers had differing opinions as to whether this could be regarded as cheating (and contain other useful information on speech recognition in general):
Overview of ASR (Automatic Speech Recognition): https://www.youtube.com/watch?v=q67z7PTGRi8&feature=youtu.be
The main guide for the CMU Sphinx3 toolkit. You will refer to this site throughout the semester: https://cmusphinx.github.io/wiki/tutorial/
The closest thing to an online textbook for Sphinx3: http://www.cs.cmu.edu/~archan/documentation/sphinxDocDraft3.pdf
A large and valuable collection of links about Sphinx3: http://www.speech.cs.cmu.edu/sphinxman/fr4.html
The second half of this document is a useful Sphinx3 tutorial: http://www.cs.cmu.edu/~archan/documentation/sphinxDocDraft3.pdf
The CMU Pronouncing Dictionary tool, which gives you the phonemes and lexical stresses for a word: http://www.speech.cs.cmu.edu/cgi-bin/cmudict
Udemy course on Perl: https://www.udemy.com/learn-perl-in-just-7-days/
An online editor for testing Perl scripts: http://rextester.com/l/perl_online_compiler