Speech:Exp

From Openitware


Project Notes


Experiment Setup

Build Your Own Experiment: Experiment Directory, Guide to Adding Experiment
Quick Links: Capstone Terms, Set up Terminal, Filezilla

What is an Experiment

An experiment is a multi-step process that takes sets of data, trains on them, and then decodes them. The goal of an experiment is to achieve a low error rate from the train and decode.
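To make "error rate" concrete, here is a minimal sketch of one common way to score a decode, assuming the metric of interest is word error rate (edit distance between the reference transcript and the decoder output, divided by the number of reference words). The function name word_error_rate is hypothetical and is not part of the experiment scripts; the scoring used by the actual experiment tools may differ.

```python
def word_error_rate(reference, hypothesis):
    """Return word-level edit distance divided by the reference length.

    Assumption: both arguments are plain strings of space-separated words.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from the output
                curr[j - 1] + 1,     # extra word in the output
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

A lower value means the decoded text is closer to the reference transcript; a value of 0 means the two match word for word.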

What is a Train

We run a train in order to build an acoustic model for use in a decode. An acoustic model is created by taking audio recordings of speech and their text transcriptions, and using software to create statistical representations of the sounds that make up each word. A speech recognition engine uses the model to recognize speech.

The first thing the trainer does is verify that it has everything it needs to build a model. It checks that:

   The transcript list is valid: the trainer can find every audio file the transcript references, and vice versa.
   The experiment dictionary contains all the words used in the transcript.
   All phones used in the dictionary are defined in the <experiment #>.phone file.
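The three checks above can be sketched as set comparisons. This is only an illustration, not the trainer's actual code: the function name verify_training_inputs and the in-memory data shapes are assumptions, and a real check would first have to parse the transcript, dictionary, and phone files.

```python
def verify_training_inputs(transcripts, audio_ids, dictionary, phones):
    """Return a list of problems found; an empty list means every check passed.

    Assumed (hypothetical) input shapes:
      transcripts: dict mapping utterance id -> list of words
      audio_ids:   ids of the utterances that have an audio file
      dictionary:  dict mapping word -> list of phones
      phones:      phones defined in the phone file
    """
    audio_ids = set(audio_ids)
    phones = set(phones)
    problems = []

    # Check 1: transcripts and audio files must reference each other.
    for utt in sorted(set(transcripts) - audio_ids):
        problems.append(f"transcript {utt} has no audio file")
    for utt in sorted(audio_ids - set(transcripts)):
        problems.append(f"audio file {utt} has no transcript")

    # Check 2: every word used in a transcript must be in the dictionary.
    used_words = {word for words in transcripts.values() for word in words}
    for word in sorted(used_words - set(dictionary)):
        problems.append(f"word {word} is missing from the dictionary")

    # Check 3: every phone used in the dictionary must be defined.
    used_phones = {p for pron in dictionary.values() for p in pron}
    for phone in sorted(used_phones - phones):
        problems.append(f"phone {phone} is not defined in the phone file")

    return problems


if __name__ == "__main__":
    issues = verify_training_inputs(
        transcripts={"utt1": ["HELLO", "WORLD"]},
        audio_ids={"utt1"},
        dictionary={"HELLO": ["HH", "AH", "L", "OW"]},
        phones={"HH", "AH", "L", "OW"},
    )
    for issue in issues:
        print(issue)
```

Collecting every problem before reporting, rather than stopping at the first one, makes it possible to fix the data in one pass before starting the train again.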

Some quick links regarding Training and Acoustic Models:

What is a Language Model

A language model is used to restrict word search. It defines which word could follow previously recognized words (remember that matching is a sequential process) and helps to significantly restrict the matching process by stripping words that are not probable. To reach a good accuracy rate, your language model must be very successful in search space restriction. This means it should be very good at predicting the next word. A language model usually restricts the vocabulary considered to the words it contains.
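As a toy illustration of search space restriction, the sketch below builds a bigram model: it counts which words follow which in some training sentences, then offers only the most probable followers of the previously recognized word. The function names and sample sentences are made up for this example. The language models used in an experiment are built by dedicated tools and typically use longer histories and smoothing, so treat this only as a picture of the idea.

```python
from collections import Counter, defaultdict


def train_bigram_model(sentences):
    """Count, for each word, how often each other word follows it."""
    followers = defaultdict(Counter)
    for sentence in sentences:
        # "<s>" marks the start of a sentence so the first word has a history.
        words = ["<s>"] + sentence.split()
        for prev_word, next_word in zip(words, words[1:]):
            followers[prev_word][next_word] += 1
    return followers


def candidate_next_words(followers, prev_word, top_n=3):
    """Return up to top_n (word, probability) pairs likely to follow prev_word."""
    counts = followers.get(prev_word)
    if not counts:
        return []  # the model has never seen this word, so it offers nothing
    total = sum(counts.values())
    return [(word, count / total) for word, count in counts.most_common(top_n)]


if __name__ == "__main__":
    model = train_bigram_model([
        "turn the light on",
        "turn the radio on",
        "turn the light off",
    ])
    print(candidate_next_words(model, "the"))
```

Words that never followed the previous word in the training text are not offered at all, which is also why such a model limits the vocabulary to the words it contains.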

Some quick links regarding Language Models:

What is a Decode

Decoding is essentially the process of interpreting audio as text. We want to make our acoustic model as strong as possible in order to get the best decode results. A decode is done by:

As suggested in the instructions, you can use an acoustic model different from the one within the current experiment directory. This is mainly for simple decode experiments, where the objective is to evaluate an existing model and thus doesn't require training. It still needs a dictionary, feats, a transcript, and a language model created within the experiment.

After the script finishes, you should have a file called decode.log that holds the results of the decode. The decoder itself should print little to no output; all data about the decoding process is in that log file.

The decoder may take a while to run. As with the trainer, its run time does not scale consistently with the length of the audio/transcript data sent to it. It is not uncommon for a decode to take more than a few hours, and with 300hr corpus data it can even take a few days.

Note: the command to do a decode puts the activity in the background and gives you back your prompt. To see when the decode finishes, either:

   use the command 'top' in a new terminal window, or
   use the command 'tail -f decode.log' in a new terminal window to monitor the last few lines as they are being written to the decode.log file.
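If you would rather check on the log from a script than keep a terminal open, the following minimal sketch reads the last few lines of the log file. It assumes decode.log is in the current directory; the helper name last_lines is hypothetical. Unlike 'tail -f', it takes a single snapshot and does not keep following the file.

```python
from collections import deque


def last_lines(path, count=5):
    """Return the last `count` lines of a text file, without trailing newlines."""
    with open(path, "r", errors="replace") as handle:
        # A bounded deque keeps only the most recent `count` lines in memory.
        return [line.rstrip("\n") for line in deque(handle, maxlen=count)]


if __name__ == "__main__":
    try:
        for line in last_lines("decode.log"):
            print(line)
    except FileNotFoundError:
        print("decode.log not found; has the decode been started here?")
```

Running it again later shows whether new lines have been written since the last check.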

Archived Experiment Setup Pages

Archive 1: This page was archived and replaced during spring 2014.