Speech:CopyExp.pl

Summary
Title: copyExp.pl
Author: Jake Sprague and Nick Bielinski
Location: mnt/main/scripts/user/
Usage:
copyExp.pl -t <src> <dest>
copyExp.pl -d <src> <dest>
copyExp.pl -a <src> <dest>
Examples: perl copyExp.pl -t /mnt/main/Exp/0296/001 /mnt/main/Exp/0296/006

Description
This script automates copying a previous experiment's train or decode output from a source directory to a new destination directory. The script is driven by flags: -t copies the train parts of the experiment, -d copies only the decode parts, and -a copies the entire directory. After the copy, file and folder names are updated to match the destination. For example, if you copy experiment 001 to 006, the script finds every file and folder with 001 in its name and replaces it with 006. It also edits the train and decode settings files, replacing the old directory paths with the new ones so that nothing in the old directory gets overwritten.
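The settings-file update described above can be tried in isolation. This is a minimal sketch, assuming a made-up one-line sphinx_train.cfg in a temporary directory; the paths and the file contents are hypothetical and not taken from a real experiment. The script itself runs sed with -i to edit the file in place, while the sketch only captures the result so that nothing is modified.

```shell
#!/usr/bin/env bash
# Hypothetical source and destination experiment paths (illustration only).
src="/mnt/main/Exp/0296/001"
dest="/mnt/main/Exp/0296/006"

# Throwaway config file so no real experiment file is touched.
tmpdir=$(mktemp -d)
cfg="$tmpdir/sphinx_train.cfg"
printf '$CFG_BASE_DIR = "%s";\n' "$src" > "$cfg"

# ':' is the sed delimiter here because the paths themselves contain '/'.
updated=$(sed "s:$src:$dest:" "$cfg")
echo "$updated"
```

Because there is no g flag on the substitution, sed replaces only the first occurrence of the old path on each line.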

You will notice that most of this script is not actually Perl code but terminal commands available on Linux systems. Most of the Perl code deals with reading the paths, checking the flag, and calling the matching subroutine below once the flag has been determined.

Every Unix command is first built as a string in $cmd and then passed to Perl's system function. This is basically the same as opening a terminal in Ubuntu and typing pwd to see your working directory. We did it this way so that we could first print out exactly what $cmd would send to the terminal, rather than simply running the command and hoping it does what we expect.
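The build-then-print-then-run pattern can be sketched directly in the shell. The helper name run_cmd and the DRY_RUN variable are hypothetical and do not appear in copyExp.pl; they only illustrate how printing the command first makes it easy to check before anything is executed.

```shell
#!/usr/bin/env bash
# run_cmd is a hypothetical helper: it prints the command string, then
# runs it unless DRY_RUN is set to 1.
run_cmd() {
    echo "+ $1"
    if [ "${DRY_RUN:-0}" != "1" ]; then
        bash -c "$1"
    fi
}

# Build the command as a string first, as the script does with $cmd.
cmd="pwd"
run_cmd "$cmd"
```

Running the same line as `DRY_RUN=1 run_cmd "$cmd"` prints the command without executing it.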

You will notice that many of the $cmd strings contain \". The whole command has to be a single Perl string, so to tell Perl that this quotation mark is not the end of the string we put a \ in front of the ". The backslash acts as an escape character, and the string continues through the rest of the command.
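The same escaping rule holds for double-quoted strings in bash, so it can be tried without Perl. This is a small sketch; the printf command is a made-up stand-in for the script's real commands.

```shell
#!/usr/bin/env bash
# Inside a double-quoted string, \" is a literal quote character rather
# than the end of the string.
cmd="printf '%s\n' \"two words\""

# The stored command still contains the inner quotes.
echo "$cmd"

# When the command runs, those quotes keep "two words" together as one argument.
result=$(bash -c "$cmd")
echo "$result"
```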

Probably the most difficult piece to write was this command:

$cmd = "find $dest/ -name \'*$srcExpNumber*\' -type f -exec bash -c \'mv \"\$1\" \"\${1/$srcExpNumber/$destExpNumber}\"\' -- {} \\;";

This command runs after the copy and finds every file in the destination whose name contains the source experiment number. It does this using wildcards: *001* matches any name with 001 in it, regardless of what text comes before or after it. The -type f option restricts the search to files. Each match is then renamed with bash's ${1/pattern/replacement} substitution, which replaces the first occurrence of, for example, 001 in the path with 006. A second command below does exactly the same thing with -type d, so it renames directories instead of files.
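The rename step can be tried safely on a scratch directory. This is a minimal sketch under stated assumptions: the directory name exp and the file names are hypothetical, and the experiment numbers are passed to bash -c as extra arguments ($2 and $3) instead of being pasted into the command string as the script does, which avoids most of the escaping.

```shell
#!/usr/bin/env bash
# Hypothetical experiment numbers (illustration only).
srcExpNumber="001"
destExpNumber="006"

# Work inside a throwaway directory; "exp" stands in for the destination.
cd "$(mktemp -d)" || exit 1
dest="exp"
mkdir -p "$dest/etc"
touch "$dest/etc/001_train.fileids" "$dest/etc/001_decode.fileids" "$dest/etc/feat.params"

# Find files whose name contains the old number, then let bash rewrite
# each path with ${1/pattern/replacement} and move the file.
# "--" fills $0, {} becomes $1, and the two numbers become $2 and $3.
find "$dest/" -name "*$srcExpNumber*" -type f \
    -exec bash -c 'mv "$1" "${1/$2/$3}"' -- {} "$srcExpNumber" "$destExpNumber" \;

renamed=$(ls "$dest/etc")
echo "$renamed"
```

Only names containing 001 are touched, so feat.params is left alone. Since the substitution works on the whole path and replaces only the first match, a parent directory that also contains the old number would be rewritten before the file name is.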

'''PLEASE NOTE: AFTER RUNNING THE SCRIPT, IF YOU GET A LOT OF "DIRECTORIES NOT FOUND" DO NOT WORRY. THIS HAPPENS EVEN ON SUCCESSFUL RUNS'''

#!/usr/bin/perl

=begin comment

Copy Experiment (copyExp.pl v1)

Original Author: Jaden Henry, Arias Talari, Daniel Beitel
Semester: Spring 2018
Start Date: 2/28/17
Last Modified: 3/26/18

Currently working:
- Generic copy using src and dest (ignores flag for content copied)
- Copy AND rename
- Overwrite links in sphinx_train/decode.cfg
- Updating symbolic links after copy (ex: sphinx_decode/train.cfg, possibly more)
- Copying only train, only decode, or both

Needing implementation:
- ? Improve overwrite protection ?
- ? Allow for simplified paths ?
- ? Make sure src and dest are both sub experiment directories ?

Recent changes:
- update symbolic links
- allow to copy train, decode, or both as flagged

This script copies experiment files from <src> to <dest> while maintaining symbolic links between experiment files.

=cut

if ($#ARGV != 2) {
    print "3 arguments required\n";
    print "Usage: copyExp.pl [-a|-t|-d] <src> <dest>\n";
    print "Example: copyExp.pl -t /mnt/main/Exp/0296/001 /mnt/main/Exp/0296/006\n";
    print "Example: copyExp.pl -a /mnt/main/Exp/0296/copySource /mnt/main/Exp/0296/copyDest\n";
    print "Information: You must use full paths for src and dest.\n";
    print "Flag -a: Copies entire experiment content\n";
    print "Flag -t: Copies content created by training process\n";
    print "Flag -d: Copies content created by decode process\n";
    exit -1;
}

$flag = $ARGV[0];    # set flag
$src = $ARGV[1];     # set source path
$dest = $ARGV[2];    # set dest path
$srcExpNumber = ( split '/', $src )[ -1 ];     # last path component of src, e.g. 001
$destExpNumber = ( split '/', $dest )[ -1 ];   # last path component of dest, e.g. 006

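# The subroutines are defined further down, so they are called with
# parentheses; a bare name used before its definition is not treated as a call.
if ($flag eq '-t') {
    copyTrain();
} elsif ($flag eq '-d') {
    copyDecode();
} elsif ($flag eq '-a') {
    copyAll();
}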

sub copyTrain {
    print "Copying all train files...\n";
    $cmd = "cp -i -r $src/*.html $dest/";
    system($cmd);
    $cmd = "cp -i -r $src/bin $dest/";
    system($cmd);
    $cmd = "cp -i -r $src/bwaccumdir $dest/";
    system($cmd);
    $cmd = "cp -i -r $src/etc $dest/";
    system($cmd);
    $cmd = "cp -i -r $src/feat $dest/";
    system($cmd);
    $cmd = "cp -i -r $src/logdir $dest/";
    system($cmd);
    $cmd = "cp -i -r $src/model_architecture $dest/";
    system($cmd);
    $cmd = "cp -i -r $src/model_parameters $dest/";
    system($cmd);
    $cmd = "cp -i -r $src/python $dest/";
    system($cmd);
    $cmd = "cp -i -r $src/qmanager $dest/";
    system($cmd);
    $cmd = "cp -i -r $src/scripts_pl $dest/";
    system($cmd);
    $cmd = "cp -i -r $src/trees $dest/";
    system($cmd);
    $cmd = "cp -i -r $src/wav $dest/";
    system($cmd);
    editTrainCfg();
    editDecodeCfg();
    print "Done!\n";
}

sub copyDecode {
    print "Copying all decode files...\n";
    # ${srcExpNumber} is interpolated inside the string; a Perl "." concatenation
    # written inside the quotes would be passed to cp literally.
    $cmd = "cp -i -r $src/etc/${srcExpNumber}_decode.fileids $dest/etc";
    system($cmd);
    $cmd = "cp -i -r $src/etc/hyp.trans $dest/etc";
    system($cmd);
    $cmd = "cp -i -r $src/etc/scoring.log $dest/etc";
    system($cmd);
    print "Done!\n";
}

sub copyAll {
    print "Copying all files...\n";
    $cmd = "cp -i -r $src/* $dest";
    system($cmd);
    print "Done!\n";
    editTrainCfg();
    editDecodeCfg();
}

system("basename $src");


## $cmd = "cp $src/testsrc.txt $dest/test_$destExpNumber.txt";  #move and replace
## system($cmd);

print "Updating links...\n";

sub editTrainCfg {
    print "Begin sphinx_train.cfg\n";
    # Replace the old experiment path with the new one; ':' is the sed delimiter.
    $cmd = "sed -i s:$src:$dest: $dest/etc/sphinx_train.cfg";
    system($cmd);
    ##sphinx_train

    $regex = "'s:CFG_DB_NAME = \"$srcExpNumber\":CFG_DB_NAME = \"$destExpNumber\":'";  #regex to find and replace old db name
    $cmd = "sed -i $regex $dest/etc/sphinx_train.cfg";
    # print "$cmd \n";
    system($cmd);
}

sub editDecodeCfg {
    print "Begin sphinx_decode.cfg\n";
    # Replace the old experiment path with the new one; ':' is the sed delimiter.
    $cmd = "sed -i s:$src:$dest: $dest/etc/sphinx_decode.cfg";
    system($cmd);
    ##sphinx_decode

    $regex = "\"s:DEC_CFG_DB_NAME = \'$srcExpNumber\':DEC_CFG_DB_NAME = \'$destExpNumber\':\"";  #regex to find and replace old db name
    $cmd = "sed -i $regex $dest/etc/sphinx_decode.cfg";
    #print "$cmd \n";
    system($cmd);
}


#--CHANGES EVERY INSTANCE OF 001 TO 006 IN ALL FILES--
#$cmd = "find $dest/ -type f -name \"*\" -exec sed -i 's/001/006/g' {} \\;";
#$cmd = "find $dest/ -depth -execdir rename -n 's/001/006/g' {} + ";

#$cmd = "find . -name '*001*' -type d -exec bash -c 'mv "$1" "${1/001/006}"' -- {} \;";

# Find and replace source exp number with dest exp number in file names
$cmd = "find $dest/ -name \'*$srcExpNumber*\' -type f -exec bash -c \'mv \"\$1\" \"\${1/$srcExpNumber/$destExpNumber}\"\' -- {} \\;";
system($cmd);
sleep (5);
# Find and replace source exp number with dest exp number in directory names
$cmd = "find $dest/ -name \'*$srcExpNumber*\' -type d -exec bash -c \'mv \"\$1\" \"\${1/$srcExpNumber/$destExpNumber}\"\' -- {} \\;";
system($cmd);

print "Done!\n";