Speech:Summer 2014 Marcel



Week Ending June 18th, 2014

 * Task:
 * Wednesday:
 * First meeting with Professor Jonas about my independent study called "Signal Processing with S/W Radio"
 * Thursday:
 * Experimented using "Bootable DVD with GNU Radio pre-installed"
 * Experimented with HDSDR application. I was able to listen to a few FM radio stations. http://www.hdsdr.de
 * Friday:
 * Read about frequency modulation and demodulation, FFT, low-pass filters, etc.
 * Created an Ubuntu VM in order to start a clean GNU Radio installation.
 * Saturday:
 * Encountered a few problems with the installation of GNU Radio
 * Fixed some problems related to the kernel driver
 * Sunday: N/A
 * Monday:
 * Used GNU Radio Tutorials in order to create the frequency modulation application

My task this week was to find a hardware/software solution to transmit/receive radio signals to/from a laptop/PC without buying expensive hardware like USRP. The initial idea was to use existing wireless cards.

Another task was to create an application using the GNU Radio framework in order to perform a simple voice transmission.

I researched different open source wireless card drivers and also I spoke with a few RF engineers about the task. Based on their feedback and my research I came to the conclusion that this route would not be feasible for these reasons:
 * Results:


 * Our experimental research would require full access to all protocol layers, down to the physical layer. This would be very difficult, if not impossible, without full information about the hardware chipset (out of the scope of this project). http://en.wikipedia.org/wiki/Comparison_of_open-source_wireless_drivers
 * There are a few wireless chipsets, like those made by Atheros, which allow the frequency band to be changed using the Linux-based OpenWrt firmware, but it would be hard to find laptops with this kind of chipset built in. https://openwrt.org/

The best solution at this time is to use rtl-sdr TV tuners. These are USB dongles based on the Realtek RTL2832 which are designed for DAB/DVB/FM. They can be used as SDR receivers over a frequency range of 52 - 2200 MHz. These USB dongles are extremely cheap and can be installed on virtually any computer.

For the second task, using the GNU Radio framework I was able to create an application that receives the signal from a walkie-talkie handset.

Hardware used:
 * NooElec R820T SDR & DVB-T
 * Motorola T5720

Using the Motorola T5720 I was able to transmit a voice message that was received and processed by my first GNU Radio application. The purpose of this experiment was to prove the GNU Radio framework's ability to create a software-controlled signal path.

The next step would be to tune the parameter settings of each block in order to improve the quality of the signal.



Week Ending June 25th, 2014

 * Task:
 * Wednesday:
 * Meeting with Professor Jonas
 * Thursday: N/A
 * Friday:
 * Continued to work on my first application to improve the quality of the sound.
 * Saturday:
 * Tried a few FM transmitters and got one that works very well.
 * Used a GNU Radio tutorial (http://www.ettus.com/kb/detail/sdr-for-beginners-building-an-fm-receiver-with-the-usrp-and-gnu-radio) in order to build an FM receiver with the R820T dongle and GNU Radio. The tutorial was using a USRP SDR, so I had to adapt the application to my hardware.
 * Sunday:
 * After reviewing a lot of FM receiver GNU Radio applications I was able to obtain a clear sound.
 * Monday:
 * Encountered a problem related to a choppy I/Q (in-phase/quadrature) buffer being influenced by the computer activity.

My task this week was to demo a radio transmission between 2 computers (SDR dongle as FM receiver and an FM transmitter).

Hardware used:
 * FM digital transmitter PortAuthority
 * NooElec R820T SDR & DVB-T


 * Problems:

The first problem I encountered was related to the quality of the sound received. Using the GNU Radio tutorial mentioned above I was not able to receive clear sound. After researching the subject and examining different FM receiver applications, I found out that inserting an additional Rational Resampler next to the SDR source block improved the sound considerably.
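A Rational Resampler changes the sample rate by a ratio of two integers (interpolation over decimation). As a minimal sketch in plain Python, the two factors can be derived from the input and output rates. The rates used here are assumptions chosen only for illustration; the log does not record the rates used in the flow graph.

```python
from fractions import Fraction


def resampler_factors(in_rate, out_rate):
    """Return (interpolation, decimation) so that in_rate * interp / decim == out_rate."""
    ratio = Fraction(out_rate, in_rate)  # Fraction reduces the ratio to lowest terms
    return ratio.numerator, ratio.denominator


# Hypothetical rates (not taken from the log): 250 kHz in, 48 kHz audio out.
interp, decim = resampler_factors(250_000, 48_000)
print(interp, decim)  # 24 125
```

Keeping the factors in lowest terms keeps the resampler's internal filter as small as the ratio allows.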

Another problem that I noticed sometimes was that when CPU-intensive events occurred the console output displayed a lot of "Ua" characters. After searching for a solution I found out that this could be related to the Ubuntu version. There is an open thread on the GNU Radio mailing list but there is no answer yet. https://lists.gnu.org/archive/html/discuss-gnuradio/2014-04/msg00051.html I noticed that if I unplugged and re-plugged the SDR receiver the problem disappeared for a while.



Week Ending July 2nd, 2014

 * Task:
 * Wednesday:
 * Meeting with Professor Jonas
 * Thursday: N/A
 * Friday: N/A
 * Saturday:
 * Continued to work on my first application to improve the quality of the sound.
 * I installed Ubuntu and GNU Radio on a desktop computer and got clear sound transmission.
 * This time I found an automatic script to install GNU Radio. I will post step-by-step instructions in the project tasks area.
 * Sunday:
 * With the new installation I got rid of the problem related to the choppy I/Q (in-phase/quadrature) buffer being influenced by the computer activity. I also noticed an improvement in the sampling rate that I can set.
 * The R820T outputs 8-bit I/Q samples, and the highest theoretically possible sample rate is 3.2 Msps. The highest sample rate that I was able to test so far with my setup was 2.0 Msps.
 * Monday:
 * Started to experiment writing flow graphs in Python
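The sample rates noted above translate directly into a data rate that the computer has to sustain. As a rough sketch, assuming each I/Q sample arrives as two bytes (one 8-bit I value and one 8-bit Q value) and ignoring any USB overhead:

```python
BYTES_PER_IQ_SAMPLE = 2  # assumption: one 8-bit I value plus one 8-bit Q value


def iq_bytes_per_second(sample_rate_sps):
    """Raw I/Q payload rate in bytes per second for a given sample rate."""
    return sample_rate_sps * BYTES_PER_IQ_SAMPLE


for rate in (2_000_000, 3_200_000):
    print(rate, "sps ->", iq_bytes_per_second(rate) / 1e6, "MB/s")
```

At 2.0 Msps this is 4 MB/s of raw samples, and 6.4 MB/s at the theoretical 3.2 Msps.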

A problem that I noticed in my experiments with generated signal sources is that the flow graphs will consume as many of the computer's resources as possible and cause the GNU Radio software to lock up. To fix this problem a Throttle block can be used to enforce the desired sample rate. The Throttle block is not needed if there is an audio sink block present, because the audio hardware already enforces the sample rate.
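The pacing idea behind a Throttle block can be illustrated without GNU Radio. The following is a minimal plain-Python sketch, not GNU Radio's implementation; the class name, its parameters, and the rates in the example are made up for this illustration.

```python
import time


class Throttle:
    """Limit the average rate at which items pass through, in items per second."""

    def __init__(self, sample_rate, clock=time.monotonic, sleep=time.sleep):
        self.sample_rate = float(sample_rate)
        self.clock = clock
        self.sleep = sleep
        self.start = clock()
        self.total = 0

    def pace(self, n_items):
        """Call after producing n_items; sleeps if the producer is ahead of schedule."""
        self.total += n_items
        due = self.start + self.total / self.sample_rate  # when these items are due
        delay = due - self.clock()
        if delay > 0:
            self.sleep(delay)
        return max(delay, 0.0)


# Example: a generated source that would otherwise run as fast as the CPU allows.
throttle = Throttle(sample_rate=32_000)
for _ in range(4):
    chunk = [0.0] * 800  # stand-in for 800 generated samples
    throttle.pace(len(chunk))  # the whole loop takes about 0.1 s in total
```

Without the call to `pace`, the loop would spin at full speed, which matches the lock-up described above.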

Week Ending July 9th, 2014

 * Task:
 * Wednesday:
 * Meeting with Professor Jonas
 * Thursday: N/A
 * Friday: N/A
 * Saturday: N/A
 * Sunday:
 * Read about I/Q sampling and experimented with raw data captured from the SDR dongle
 * Recorded an I/Q sample stream using the R820T device. Samples were written to a file to be analyzed off-line.
 * Monday:
 * Continued to experiment with flow graphs in Python.
 * Analyzed data conversion between different blocks by inserting probes (file sinks)
 * Tuesday:
 * Experimented with reading and writing wav files.
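For the off-line analysis of a recorded I/Q stream, the raw bytes first have to be turned into complex samples. The sketch below assumes the raw format written by the rtl_sdr command-line tool (interleaved unsigned 8-bit I and Q values); a file written by a GNU Radio File Sink would typically hold 32-bit float pairs instead and would need a different reader. The function names are hypothetical.

```python
def iq_bytes_to_complex(raw):
    """Convert interleaved unsigned 8-bit I/Q bytes to complex samples in [-1, 1]."""
    if len(raw) % 2:
        raw = raw[:-1]  # drop a trailing half sample
    samples = []
    for i in range(0, len(raw), 2):
        i_val = (raw[i] - 127.5) / 127.5      # remove the unsigned offset, then scale
        q_val = (raw[i + 1] - 127.5) / 127.5
        samples.append(complex(i_val, q_val))
    return samples


def read_iq_file(path, max_samples=1024):
    """Read up to max_samples complex samples from the start of a capture file."""
    with open(path, "rb") as f:
        return iq_bytes_to_complex(f.read(2 * max_samples))


demo = iq_bytes_to_complex(bytes([255, 0, 128, 127]))
print(demo[0])  # (1-1j)
```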

Week Ending July 16th, 2014

 * Task:
 * Wednesday:
 * Meeting with Professor Jonas
 * Thursday:
 * Experimented with Wav File Sink and I started to notice a problem reading the generated .wav file
 * Friday:
 * Created a GNU Radio flow graph to record from a "real" signal source (RTL2832)
 * Saturday:
 * After doing some research, I found out that the wav file header does not contain the correct size; it is set to 0. I also noticed that if I do not use the 'kill the flow graph' button the wav file header is updated correctly. Apparently this is a known bug (Bug #544) in GNU Radio and the solution is to use the close button in the GUI window (something that I was not doing).
 * Sunday:
 * Read about trunked radio systems and did some experiments. The problem I have is that my scanner is limited to receiving one channel at a time and I need to be able to receive the entire trunked network in real time and record it to disk.

My task this week was to build a GNU Radio application that can be used to receive and record emergency radio communications.


 * Problems:

The first problem that I encountered was getting the recorded .wav file to work. I was able to generate the .wav file but when I was trying to open it with a .wav player I was getting an error about the format of the file.

Assuming that the problem was with the flow graph, I started to modify the block parameters and tried different values. After getting no results I started to investigate the .wav file using GHex and I noticed that it did contain a lot of sound data. I also found out that the "Subchunk2Size" field was 0. (see .wav format https://ccrma.stanford.edu/courses/422/projects/WaveFormat/)
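A file left in this state can be repaired after the fact by recomputing the two size fields from the actual file length. The following is a minimal sketch, assuming the "data" chunk is the last chunk in the file (the layout shown on the WaveFormat page linked above); the function name and the file names in the usage comment are hypothetical.

```python
import struct


def repair_wav_sizes(data):
    """Return a copy of WAV bytes with the RIFF and data chunk sizes recomputed."""
    if data[:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    buf = bytearray(data)
    pos = 12  # the first sub-chunk starts after 'RIFF', ChunkSize, 'WAVE'
    while pos + 8 <= len(buf):
        chunk_id = bytes(buf[pos:pos + 4])
        (size,) = struct.unpack_from("<I", buf, pos + 4)
        if chunk_id == b"data":
            # Subchunk2Size: everything after the data chunk header
            struct.pack_into("<I", buf, pos + 4, len(buf) - (pos + 8))
            # ChunkSize: file length minus the 8-byte RIFF header
            struct.pack_into("<I", buf, 4, len(buf) - 8)
            return bytes(buf)
        pos += 8 + size + (size & 1)  # skip this chunk (chunks are word-aligned)
    raise ValueError("no data chunk found")


# Usage (hypothetical file names):
# with open("recording.wav", "rb") as f:
#     fixed = repair_wav_sizes(f.read())
# with open("recording_fixed.wav", "wb") as f:
#     f.write(fixed)
```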



Another problem that I encountered was related to decoding the trunked radio system (http://en.wikipedia.org/wiki/Trunked_radio_system) used by local government and emergency services. I was able to listen to sporadic conversations but I could not fully monitor multiple frequencies yet.

Week Ending July 23rd, 2014

 * Task:
 * Wednesday:
 * Meeting with Professor Jonas
 * Sunday:
 * Continued to read about trunked radio systems and did some experiments. The problem I have is that I cannot yet start and stop the recording based on whether there is a valid signal on the channel.
 * Monday:
 * I continued to work on my GNU Radio application that can be used to receive and record emergency radio communications.
 * I discovered an archive with a lot of recordings that we could use to start our decoding experiments.


 * Problems:

I'm still having some problems detecting the signal in the white noise background.
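One common way to approach this kind of detection is a power squelch: measure the level of the received samples in short frames, before demodulation, and treat a frame as active when it rises above a threshold set over the noise floor. The sketch below is a plain-Python illustration of that idea under those assumptions, not the method used in my flow graph; the function names, frame length, and threshold are made up for the example.

```python
import math


def frame_rms(samples):
    """Root-mean-square level of one frame (works for real or complex samples)."""
    return math.sqrt(sum(abs(s) ** 2 for s in samples) / len(samples))


def active_frames(samples, frame_len, threshold):
    """Return one boolean per full frame: True when the frame's RMS exceeds threshold."""
    flags = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        flags.append(frame_rms(samples[start:start + frame_len]) > threshold)
    return flags


# Synthetic example: a weak background, then a stronger signal, then background again.
quiet = [0.01 * math.sin(0.3 * n) for n in range(400)]
tone = [0.5 * math.sin(0.3 * n) for n in range(400)]
print(active_frames(quiet + tone + quiet, frame_len=200, threshold=0.1))
# [False, False, True, True, False, False]
```

The boolean flags could then be used to decide when to start and stop writing samples to disk.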

Week Ending July 30th, 2014

 * Task:
 * Wednesday:
 * Meeting with Professor Jonas
 * Monday:
 * Being on the road, I could not experiment too much with my SDR hardware
 * I did some reading about filters, noise reduction, amplifying without distortion

My next tasks would be:
 * Start to process some wave files in order to improve the quality of the sound - apply some filters
 * Fix my recording application so I will be able to make the distinction between white noise and real signal
 * Start to record and classify radio communications captured with my SDR application

Week Ending August 20th, 2014

 * Task:
 * Sunday:
 * After a short vacation I returned to my experiments
 * Experimented with recording emergency radio communications
 * Monday:
 * Continued to work on improving the quality of recordings
 * Read the Project Notes (Speech Recognition) to get a deeper understanding of this project
 * Tuesday:
 * Continued to work on my problem related to the distinction between white noise and real signal

To improve the quality of the reception I used a discone antenna. For a frequency reference I used a PRO-2044 Programmable 80-Channel Scanner. I experimented with recording National Weather Service alerts (162.400 MHz) because the time between transmissions was shorter than for other emergency alert systems.

My next assigned tasks:
 * Set up the software radio and document clearly how to do so in the Wiki Summer Task page
 * Capture real live over-the-air audio data using the software radio and organize it into a small corpus. This data (say between 15 and 30 minutes of it) should also be hand transcribed (i.e. a corpus has both audio and text) and should be stored properly in /mnt/main/corpus (see Erol).
 * Run a decoding job on this data using the best 125 hour training models that Erol created and post the results. Note that this decoding job should be properly entered into the Wiki Experiment Log (see Erol).

Week Ending September 3rd, 2014

 * Task:
 * Friday:
 * After a week of monitoring the radio frequencies I narrowed it down to 3 channels on which I have good reception quality:
 * Souhegan Valley Ambulance Service, Inc 153.890 MHz
 * Exeter Fire Dispatch 154.430 MHz
 * NOAA Weather Radio 162.450 MHz
 * Monday:
 * Gathered all the recorded data and compiled it into one file (after a discussion with Erol about the speech corpus). At the next meeting we are going to discuss which data to use.
 * Tuesday:
 * Made another GNU Radio VM installation from scratch in order to write the documentation. This time I used the build-gnuradio script provided by Marcus Leech. I tried to install GNU Radio on the latest version of Ubuntu (14.10) and I had some problems with video drivers.
 * Started to create a detailed description of the FM Receiver application to be added to the Tasks section.

Week Ending September 17th, 2014

 * Task:
 * Saturday:
 * Continued to work on my documentation.
 * Processed all the wav files and selected the best ones. At the last meeting we decided to start with the NOAA recordings.
 * Sunday:
 * Created a catalog of the processed recording files and started to work on hand transcribing them (not a fun task).

File name                     Duration (s)

162.450_01.wav                32.898569
162.450_02.wav                63.639856
162.450_03.wav                119.908992
162.450_04.wav                58.606195
162.450_05.wav                42.786117
162.450_06.wav                56.089364
162.450_07.wav                115.234879
162.450_08.wav                27.864908
162.450_09.wav                31.999702
162.450_10.wav                94.021594
162.450_11.wav                57.707326
162.450_12.wav                49.078194
162.450_13.wav                84.134046
162.450_14.wav                62.920761
162.450_15.wav                120.088766
162.450_16.wav                58.965742
162.450_17.wav                119.010125
162.450_18.wav                108.583256
162.450_19.wav                27.325588
162.450_20.wav                184.088169
162.450_21.wav                205.12168
162.450_22.wav                120.807861
162.450_23.wav                58.246647
162.450_24.wav                22.291927
162.450_25.wav                362.243809
162.450_26.wav                58.066874
162.450_27.wav                49.797288
162.450_28.wav                92.403631
162.450_29.wav                184.267943
162.450_30.wav                58.785968
162.450_31.wav                19.775096
162.450_32.wav                208.896927
162.450_33.wav                26.96604
162.450_34.wav                125.302201
162.450_35.wav                200.447567
162.450_36.wav                243.05391
162.450_37.wav                22.831248
162.450_38.wav                100.852991
162.450_39.wav                108.583256
162.450_40.wav                26.96604
162.450_41.wav                183.908396

Total: 3994.569449 s
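A catalog like this can be generated from the files themselves. The sketch below is one possible way to do it with Python's standard wave module; it is not necessarily how the durations above were produced, and it relies on the WAV headers containing the correct sizes. The function names and the directory in the usage comment are hypothetical.

```python
import wave


def wav_duration_seconds(path_or_file):
    """Duration of a PCM WAV file: number of frames divided by the frame rate."""
    with wave.open(path_or_file, "rb") as w:
        return w.getnframes() / w.getframerate()


def catalog(paths):
    """Return ([(path, duration), ...], total_duration) for the given WAV files."""
    rows = [(p, wav_duration_seconds(p)) for p in paths]
    total = sum(duration for _, duration in rows)
    return rows, total


# Usage (hypothetical directory):
# import glob
# rows, total = catalog(sorted(glob.glob("corpus/162.450_*.wav")))
# for name, duration in rows:
#     print(f"{name:30s}{duration:.6f}")
# print(f"Total: {total:.6f} s")
```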

Week Ending September 24th, 2014

 * Task:
 * Friday:
 * Started to go over the Project Notes for the Speech Recognition software
 * I noticed that I do not have sufficient permissions to use Caesar for my experiments.
 * Saturday:
 * I created a VM image with openSUSE in order to install the CMUSphinx software.
 * I tried to follow the instructions from the Robust group's tutorial http://www.speech.cs.cmu.edu/sphinx/tutorial.html. A few links were broken and I was not able to find the same version of the software used in the tutorial.
 * Sunday:
 * I found out that the wav files from the NOAA corpus were not 16 kHz, 16-bit mono files in MS WAV format, so after a little research I found a small utility program, ffmpeg (http://www.ffmpeg.org/), that I used to convert the NOAA corpus files.

ffmpeg -i input.mp3 -acodec pcm_s16le -ac 1 -ar 16000 output.wav

Here is the batch loop that I used (the output directory "new" must already exist):

for f in *.wav; do ffmpeg -i "$f" -acodec pcm_s16le -ac 1 -ar 16000 "new/$f"; done
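Whether a file already has the target format mentioned above (16 kHz, 16-bit, mono) can be checked before and after conversion. This is a small sketch using Python's standard wave module; the function name and the directory in the usage comment are hypothetical.

```python
import wave


def is_16k_16bit_mono(path_or_file):
    """True if the PCM WAV file is 16000 Hz, 2 bytes per sample, 1 channel."""
    with wave.open(path_or_file, "rb") as w:
        return (w.getframerate(), w.getsampwidth(), w.getnchannels()) == (16000, 2, 1)


# Usage (hypothetical directory): list the files that still need converting.
# import glob
# for name in sorted(glob.glob("new/*.wav")):
#     if not is_16k_16bit_mono(name):
#         print("needs conversion:", name)
```

Note that wave.open raises wave.Error for WAV files that are not plain PCM, which is itself a sign that the file needs converting.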

Week Ending October 1st, 2014

 * Task:
 * Wednesday:
 * Meeting with Professor Jonas and the rest of the crew. Most of us lost a point because of an incomplete log.
 * I received the root password and found out about Justin's successful experiment with the Robust group's tutorial.
 * Friday:
 * Went over experiment 0256 and tried to replicate it
 * I was able to complete a training and a decoding run on AN4 (the Alphanumeric database) from the CMU Audio Databases
 * Saturday & Sunday:
 * Started to have problems logging in to Caesar
 * Continued to experiment on my VM image.

Week Ending October 8th, 2014

 * Built a language model from the NOAA corpus transcript, located at /mnt/main/Exp/0257/LM


 * Reading Erol's log I found out that he had good results on experiment 0253/012, so I tried to copy the experiment into my folder using the following commands:

cd /mnt/main/Exp/0253/012
perl scripts_pl/copy_setup.pl -task /mnt/main/Exp/0257/001

and got the following error:

Can't open perl script "/root/speechtools/SphinxTrain-1.0/scripts_pl/setup_SphinxTrain.pl": No such file or directory
Current directory not empty. Will leave existing files as they are, and copy non-existing files.
Making basic directory structure.
Copying executables from /mnt/main/root/sphinx3/src/programs
Copying scripts from the scripts directory
Generating sphinx3 specific scripts and config file
Set up for decoding /mnt/main/Exp/0257/001 using Sphinx-3 complete
Set up for /mnt/main/Exp/0257/001 complete

I corrected the path to the setup_SphinxTrain.pl script but the script was still not working.


 * Continued to experiment on my VM image.


 * I spent a lot of time analyzing the training and experiment documents. It appears that some changes were made to the configuration of the server files without the wiki being updated. I tried to run some simple scripts, copy_setup.pl for example, and I could not make them run successfully even after fixing the path errors. To prove that I was not doing something wrong I tested the same script in my own VM without any problems.


 * I created another experiment, 0258, to see if my problems were related to not using an empty experiment folder. In this one I tried to copy experiment 0197 using the copy_setup.pl script, unsuccessfully. Finally I copied the whole experiment folder 0197 into 0258 and started to modify the scripts to adapt them to the new experiment folder.

Week Ending October 15th, 2014
Generate the set of acoustic model feature files from these NOAA WAV audio recordings.

sphinx_fe -argfile /mnt/main/Exp/0257/001/etc/feat.params -samprate 16000 -c /mnt/main/Exp/0257/CorpusNOAA16/etc/CorpusNOAA16.fileids -di . -do . -ei wav -eo mfc -mswav yes

Run the decode using the first_5hr corpus train data from experiment 0256/001.

/usr/local/bin/sphinx3_decode \
  -hmm /mnt/main/Exp/0257/001/model_parameters/001.cd_cont_1000 \
  -dict /mnt/main/Exp/0257/001/etc/001.dic \
  -fdict /mnt/main/Exp/0257/001/etc/001.filler \
  -lm /mnt/main/Exp/0257/LM/tmp.lm.DMP \
  -ctl /mnt/main/Exp/0257/CorpusNOAA16/etc/CorpusNOAA16.fileids \
  -cepdir /mnt/main/Exp/0257/CorpusNOAA16/wav \
  -cepext .mfc \
  -hyp /mnt/main/Exp/0257/result/firstNOAA.txt

I got the following results:

SYSTEM SUMMARY PERCENTAGES by SPEAKER

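,-----------------------------------------------------------------.
|                          firstNOAA.txt                          |
|-----------------------------------------------------------------|
| SPKR   | # Snt  # Wrd | Corr    Sub    Del    Ins    Err  S.Err |
|--------+--------------+-----------------------------------------|
| noa    |  185    6320 | 36.6   33.3   30.1    2.0   65.4  100.0 |
|=================================================================|
| Sum/Avg|  185    6320 | 36.6   33.3   30.1    2.0   65.4  100.0 |
|=================================================================|
|  Mean  |185.0  6320.0 | 36.6   33.3   30.1    2.0   65.4  100.0 |
|  S.D.  |  0.0    0.0  |  0.0    0.0    0.0    0.0    0.0    0.0 |
| Median |185.0  6320.0 | 36.6   33.3   30.1    2.0   65.4  100.0 |
`-----------------------------------------------------------------'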
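In this summary the Err column is the sum of the Sub, Del, and Ins percentages (33.3 + 30.1 + 2.0 = 65.4), all relative to the number of reference words. As a minimal sketch of how such counts come out of a word-level alignment, the following plain-Python function computes them with a standard edit-distance table. It is an illustration, not the scoring tool that produced the table, and the example sentence is invented rather than taken from the corpus.

```python
def error_counts(ref, hyp):
    """Return (substitutions, deletions, insertions) for a minimum-error alignment."""
    rows, cols = len(ref) + 1, len(hyp) + 1
    # cost[i][j] = (total_errors, subs, dels, ins) for ref[:i] against hyp[:j]
    cost = [[None] * cols for _ in range(rows)]
    cost[0][0] = (0, 0, 0, 0)
    for i in range(1, rows):
        cost[i][0] = (i, 0, i, 0)  # all reference words deleted
    for j in range(1, cols):
        cost[0][j] = (j, 0, 0, j)  # all hypothesis words inserted
    for i in range(1, rows):
        for j in range(1, cols):
            e, s, d, n = cost[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                best = (e, s, d, n)          # match, no new error
            else:
                best = (e + 1, s + 1, d, n)  # substitution
            e, s, d, n = cost[i - 1][j]
            if e + 1 < best[0]:
                best = (e + 1, s, d + 1, n)  # deletion
            e, s, d, n = cost[i][j - 1]
            if e + 1 < best[0]:
                best = (e + 1, s, d, n + 1)  # insertion
            cost[i][j] = best
    return cost[-1][-1][1:]


def word_error_rate(ref, hyp):
    """Percentage of errors relative to the number of reference words."""
    subs, dels, ins = error_counts(ref, hyp)
    return 100.0 * (subs + dels + ins) / len(ref)


# Invented example: one word of a five-word reference is missing in the hypothesis.
ref = "small craft advisory in effect".split()
hyp = "small craft advisory effect".split()
print(error_counts(ref, hyp), word_error_rate(ref, hyp))  # (0, 1, 0) 20.0
```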

Week Ending October 22nd, 2014
I'm trying to decode the NOAA corpus with some of Erol's training data to see which one gives me a better result.

/usr/local/bin/sphinx3_decode \
  -hmm /mnt/main/Exp/0253/B12/model_parameters/012.cd_cont_10000 \
  -dict /mnt/main/Exp/0253/B12/etc/B12.dic \
  -fdict /mnt/main/Exp/0253/B12/etc/B12.filler \
  -lm /mnt/main/Exp/0257/LM/tmp.lm.DMP \
  -ctl /mnt/main/Exp/0257/CorpusNOAA16/etc/CorpusNOAA16.fileids \
  -cepdir /mnt/main/Exp/0257/CorpusNOAA16/wav \
  -cepext .mfc \
  -hyp /mnt/main/Exp/0257/result/firstNOAA_01.txt >result/dec_01.log

SYSTEM SUMMARY PERCENTAGES by SPEAKER

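,-----------------------------------------------------------------.
|                        firstNOAA_01.txt                         |
|-----------------------------------------------------------------|
| SPKR   | # Snt  # Wrd | Corr    Sub    Del    Ins    Err  S.Err |
|--------+--------------+-----------------------------------------|
| noa    |  185    6320 | 39.3   13.2   47.5    0.6   61.3  100.0 |
|=================================================================|
| Sum/Avg|  185    6320 | 39.3   13.2   47.5    0.6   61.3  100.0 |
|=================================================================|
|  Mean  |185.0  6320.0 | 39.3   13.2   47.5    0.6   61.3  100.0 |
|  S.D.  |  0.0    0.0  |  0.0    0.0    0.0    0.0    0.0    0.0 |
| Median |185.0  6320.0 | 39.3   13.2   47.5    0.6   61.3  100.0 |
`-----------------------------------------------------------------'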

Week Ending October 29th, 2014

 * Logged all the experiments and created the related folder structure.
 * Planning on running a new experiment using 0253/A12 training data.

Week Ending November 5th, 2014

 * I ran a few decode experiments with a small dictionary that I built from the NOAA transcription.
 * I encountered a problem with the first 70 wav files. It seems that the decoding error rate is very high for these files.

Week Ending November 12th, 2014

 * This week I intend to investigate the decoding problem with the first 70 files.
 * Because I was supposed to get better results with a smaller dictionary, I recreated the dictionary and checked it manually, trying to eliminate any misspellings and the start and end sentence markers: <s> and </s>. The results got a little better but not enough.
 * Next I recreated the statistical language model using CMUCLMTK.
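The manual dictionary check described above can be partly automated by listing every transcript word that has no dictionary entry. The sketch below assumes the usual Sphinx conventions: dictionary lines of the form "WORD PH1 PH2 ...", alternate pronunciations written as WORD(2), and transcript lines of the form "<s> words </s> (utterance_id)". The function names and the sample lines are invented for this illustration.

```python
def load_dictionary_words(lines):
    """Collect the words from CMU-style dictionary lines ("WORD PH1 PH2 ...")."""
    words = set()
    for line in lines:
        parts = line.split()
        if parts:
            # Alternate pronunciations are written WORD(2), WORD(3), ...
            words.add(parts[0].split("(")[0].upper())
    return words


def out_of_vocabulary(transcript_lines, dictionary_words, markers=("<s>", "</s>")):
    """Return the sorted transcript words that are missing from the dictionary."""
    missing = set()
    for line in transcript_lines:
        for token in line.split():
            if token in markers or (token.startswith("(") and token.endswith(")")):
                continue  # skip sentence markers and the trailing "(utterance_id)"
            if token.upper() not in dictionary_words:
                missing.add(token.upper())
    return sorted(missing)


# Invented sample data with one misspelled word.
dictionary = load_dictionary_words(["WIND W IH N D", "NORTH N AO R TH", "NORTH(2) N AO TH"])
transcript = ["<s> WIND NORHT </s> (sample_0001)"]
print(out_of_vocabulary(transcript, dictionary))  # ['NORHT']
```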

Week Ending November 19th, 2014

 * I continued to investigate my decoding problem.
 * I recorded a few new wav files and they seem OK.

Week Ending December 3rd, 2014

 * I used Audacity to apply some equalization functions to my bad wav files and I noticed some improvements in sound quality. The problem I have now is that for some reason I cannot save the filtered wav file yet.