Speech:CreateExp.pl

Summary
Title: createExp.pl
Author: Jake Sprague and Nick Bielinski
Location: /mnt/main/scripts/user/
Usage: createExp.pl [-t|-l|-d]
Examples: createExp.pl -t (sets up your sub-experiment directory for the training process)

Description
This script makes it easier to set up an experiment. It does NOT completely automate the process of running experiments.

Create the sub-experiment directory inside your assigned group or team directory. Features we think should be added later are a process for running an experiment on unseen data and the ability to change configuration files while the train is being created. For now, the script can only create trains using the default configurations.

The steps for running this script to simplify the entire SEEN data experiment process are as follows:


 * 1) SSH into a drone
 * 2) Run createExp.pl -t from the root experiment directory in which you want the sub-experiment to live. This sets up the train.
 * 3) After genFeats completes, cd into the sub-experiment directory and run "nohup scripts_pl/RunAll.pl &"
 * 4) Once that completes, run createExp.pl -l (if you come back to the script at a later time, you must be in your sub-experiment directory)
 * 5) Run createExp.pl -d from the sub-experiment directory. Once that finishes, pick up in the experiment directions at "nohup run_decode.pl"
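
Steps 2, 4 and 5 each pass exactly one flag to the script. As a minimal sketch of that one-flag pattern (not the script's actual code), the example below checks the argument count and looks the flag up in a table. The `dispatch` helper and the handler bodies are hypothetical placeholders that only return a label.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical handlers standing in for the real train/LM/decode setup code.
my %handlers = (
    '-t' => sub { return "train" },
    '-l' => sub { return "lm" },
    '-d' => sub { return "decode" },
);

# Return the selected handler's result, or undef when the arguments
# are not exactly one recognised flag.
sub dispatch {
    my @args = @_;
    # An array in numeric context gives the argument count;
    # $#args would be the last index, i.e. count - 1.
    return unless @args == 1;
    my $handler = $handlers{ $args[0] };
    return unless defined $handler;
    return $handler->();
}

my $result  = dispatch(@ARGV);
my $message = defined $result
    ? "Selected step: $result\n"
    : "Usage: createExp.pl [-t|-l|-d]\n";
print $message;
```

A lookup table keeps the accepted flags in one place, so an unknown flag falls through to the usage message instead of being silently ignored.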

#!/usr/bin/perl

=begin comment

Create Experiment (createExp.pl v2)

Original Author: Jake Sprague & Nick Bielinski
Semester: Spring 2017
Start Date: 4/18/2017
End Date: 5/3/2017
Last Modified: 5/3/2018

This script combines the various steps required to set up a full working experiment. It does not currently create wiki pages.

=cut

use Cwd;

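# Exactly one flag is expected. @ARGV in numeric context is the argument
# count, whereas $#ARGV is the last index (0 when one argument is given).
if (@ARGV != 1) {
    print "Usage: createExp.pl [-t|-l|-d]\n";
    print "Example: createExp.pl -t\n";
    print "Information: See wiki page for full instructions\n";
    exit 1;
}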

$flag = $ARGV[0]; # set flag
$corpusSize;
$subExp;
# global directory variable so that we don't have to keep asking the user if they are doing just an individual piece
# $dir;
# copy what corpus they used in case they ran a train; if they didn't, ask them for the corpus size

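# The subs are defined further down, so they must be called with parentheses;
# a bare name here would be parsed as a string and nothing would run.
if ($flag eq '-t') {
    makeTrain();
} elsif ($flag eq '-l') {
    makeLM();
} elsif ($flag eq '-d') {
    makeDecode();
}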

sub getTrainDir {
    print "Please enter the directory you want your sub-experiment to be in: \n";
    my $dir = <STDIN>;
    chomp $dir;
    $subExp = $dir;

    # change to the directory entered above, creating it if it doesn't exist
    unless (chdir($dir)) {
        print("Directory ".$dir." not found.\nCreating -> " .$dir."\n");
        mkdir($dir);
        $subExp = $dir;
        chdir($dir) or die "cannot change: $!\n";
    }
}

sub makeTrain {
    getTrainDir();

    # the switchboard data size you want to use for your train
    print("\n\nPlease enter the size of the switchboard you would like to train (5hr, 30hr, 145hr, 300hr):");
    $size = <STDIN>;
    chomp $size;
    $corpusSize = $size;

    # run the makeTrain script using the size previously entered
    $cmd = "makeTrain.pl switchboard $size/train";
    system($cmd);

    # use genFeats in the directory structure to generate the feats needed for training
    $cmd = "genFeats.pl -t";
    system($cmd);

    print("\n\nTrain complete! Now cd into sub-experiment directory and re-run script with -l flag\n");
}

sub makeLM {
    mkdir("LM");
    chdir("LM") or die "cannot change to language model directory: $!\n";

    print("\nPlease enter the size of the switchboard you would like to train (5hr, 30hr, 145hr, 300hr):");
    $corpusSize = <STDIN>;
    chomp $corpusSize;

    # copy the training transcript for the chosen corpus size
    $cmd = "cp -i /mnt/main/corpus/switchboard/$corpusSize/train/trans/train.trans trans_unedited ";
    system($cmd);
    print($cmd);

    # prepare the transcript
    $cmd = "parseLMTrans.pl trans_unedited trans_parsed";
    system($cmd);

    # copy the script that creates the language model
    # $cmd = "cp -i /mnt/main/scripts/user/lm_create.pl .";
    # system($cmd);

    # execute the script
    $cmd = "lm_create.pl trans_parsed";
    system($cmd);

    print("\n\nLM complete! Now re-run script with -d flag from the current directory\n");
}

sub makeDecode {
    print("\nPlease enter the size of the switchboard you would like to train (5hr, 30hr, 145hr, 300hr):");
    $corpusSize = <STDIN>;
    chomp $corpusSize;

    my $dir = getcwd;
    # this should get us the experiment folder and the sub-experiment folder
    $dirFullNumber = ( split '/', $dir )[ -2 ];

    # change to the etc directory; $subExp is only set by getTrainDir, so
    # use the current (sub-experiment) directory captured above instead
    chdir("$dir/etc") or die "cannot change to the etc directory: $!\n";

    # \$1 is escaped so that awk, not Perl, expands it to the first field
    $cmd = "awk '{print \$1}' /mnt/main/corpus/switchboard/$corpusSize/test/trans/train.trans >> $dir/$dirFullNumber\_decode.fileids";
    system($cmd);

    print("\n\nContinue decode steps by running \nnohup run_decode.pl  &\n\n in etc folder\n");
}
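
One detail of the decode step is worth illustrating. Inside a double-quoted Perl string, `$1` is interpolated by Perl before the shell or awk ever sees it, so an awk field reference has to be escaped. The sketch below shows the difference and, as an alternative, pulls the first field in pure Perl. The path, the sample lines and the `first_fields` helper are hypothetical and exist only for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical path, used only to show how the command string is built.
my $trans = "/tmp/example.trans";

# Unescaped: Perl substitutes its own $1, which is empty when no regex
# has matched, so awk would be told to print the whole line.
my $unescaped = do {
    no warnings 'uninitialized';
    "awk '{print $1}' $trans";
};

# Escaped: the backslash keeps a literal $1 for awk to expand.
my $escaped = "awk '{print \$1}' $trans";

# Pure-Perl alternative: first whitespace-separated field of each line.
sub first_fields {
    my @lines = @_;
    my @ids;
    for my $line (@lines) {
        my ($first) = split ' ', $line;   # ' ' also skips leading whitespace
        push @ids, $first if defined $first;
    }
    return @ids;
}

print "$unescaped\n";
print "$escaped\n";
print join(",", first_fields("utt1 hello there", "utt2 yes")), "\n";
```

Building the command as a single-quoted string, or skipping the shell entirely as `first_fields` does, avoids having to reason about two layers of `$` expansion.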