Speech:Summer 2014 Jared Rhordanz



Week Ending June 4th, 2014

 * 6/2:

Today I plan on reading up on the discoveries that the Systems group made regarding installing Fedora. I will also create an install disc or two so we're ready to go tomorrow. I may also create the experiment and run the baseline test that each new Fedora install will be compared to.

Systems Group tested and worked with Fedora version 20, which is still the most current.

I suspected that the machines we are working with are probably 32-bit, but wanted to be certain which version of Fedora to download. I looked up methods to find this, and a simple one-liner, "uname -a", confirmed 32-bit.
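A minimal sketch of that check, for future reference. The helper name arch_bits is made up for illustration, and the mapping only covers the common x86 machine strings. Note that "uname -m" reports what the running kernel was built for, so a 64-bit CPU running a 32-bit install would still show up as 32-bit here.

```shell
#!/bin/sh
# Map a machine string (as printed by `uname -m`) to a word size.
# arch_bits is a hypothetical helper; only common x86 names are covered.
arch_bits() {
    case "$1" in
        i386|i486|i586|i686) echo 32 ;;
        x86_64|amd64)        echo 64 ;;
        *)                   echo unknown ;;
    esac
}

# Report the word size of the running kernel.
machine=$(uname -m)
echo "kernel machine type: $machine ($(arch_bits "$machine")-bit)"
```

On Linux, the "lm" flag in /proc/cpuinfo is another way to tell whether the CPU itself is 64-bit capable.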

I created the experiment log on the wiki, Exp 0254 on the first_5hr corpus. I wanted to use a smaller corpus but tiny and mini don't seem to be configured correctly for the newer scripts.

I used a cheat sheet that Forrest made last semester to help me remember. For future reference:

'''THIS METHOD NO LONGER WORKS'''

Running a train

 1. % mkdir
 2. cd
 3. /mnt/main/root/tools/SphinxTrain-1.0/scripts_pl/setup_SphinxTrain.pl -task
 4. cd etc
 5. vim sphinx_train.cfg
    a. Line 6 - Update with Exp number. Example: $CFG_DB_NAME = ""
    b. Line 7 - Update with Database path. Example: $CFG_BASE_DIR = "/mnt/main/Exp/0252/"
    c. Line 79 - Uncomment by removing the hashtag (#)
    d. Line 80 - Comment out using a hashtag (#)
    e. Line 107 - Set density value. Example: 2,4,8,16,13…..
    f. Line 120 - Change senone value. Example: (10000)
    g. Save and quit with Esc ":wq"
 6. % cd .. (back to the experiment directory, e.g. 001)
 7. % /mnt/main/root/sphinx3/scripts/setup_sphinx3.pl -task
 8. vim sphinx_decode.cfg
    a. Line 51 - Change  to tmp
 9. % /mnt/main/scripts/user/buildData2.pl  Example: % /mnt/main/scripts/user/buildData2.pl mini/mono
 10. scripts_pl/make_feats.pl -ctl data/train_train.fileids
 11. nohup scripts_pl/RunAll.pl &

DONE!

The first_5hr train is taking a long time... I think that it would ultimately save time if I could fix mini or even tiny for testing.

In glancing at the setup of the two, tiny appears to be in its original state while mini is almost set up. The info folder is empty, however. Steps for setting it up should hopefully be recorded in someone's log and fairly straightforward.


 * 6/3:

The plan today is to install Fedora on Obelix and to make a smaller experiment work with the new symbolic methods.

I'm having trouble getting Obelix to recognize the DVD to boot from. I looked up the commands to find info on the CD drive and it revealed that the drives do not support DVDs.

I also looked into updating the BIOS to see if a newer version could handle USB drives, but had no luck. Dell no longer supports Linux, and BIOS updates are now done through flashing (which I am not comfortable with) and installing Windows, updating, and reinstalling Linux. I think our best option is a CD (not DVD).

I'm kind of stuck now as I don't have any media with me here and I can't afford to go pick some up. For now I will try to get the mini corpus working in order to make the tests a more reasonable length.

Erol managed to find a CD-R that Obelix can read, but we had to burn the network install version. The full DVD is about 4 GB, much too large for a CD, and the network install is only 700 MB but needs an internet connection during install. This brings up new issues, as I know the networking situation with the machines is complex, with Caesar acting as a switch and router.

ifconfig on Caesar:

 eth0     Link encap:Ethernet  HWaddr 00:0F:1F:6D:25:3E
          inet addr:132.177.189.63  Bcast:132.177.191.255  Mask:255.255.252.0
          inet6 addr: fe80::20f:1fff:fe6d:253e/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:3110989 errors:0 dropped:0 overruns:0 frame:0
          TX packets:233554 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:274864159 (262.1 Mb)  TX bytes:35397284 (33.7 Mb)
          Interrupt:28

 eth1     Link encap:Ethernet  HWaddr 00:0F:1F:6D:25:3F
          inet addr:192.168.10.1  Bcast:192.168.10.255  Mask:255.255.255.0
          inet6 addr: fe80::20f:1fff:fe6d:253f/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:976809 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1157611 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:287237549 (273.9 Mb)  TX bytes:912468711 (870.1 Mb)
          Interrupt:29

 lo       Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:131 errors:0 dropped:0 overruns:0 frame:0
          TX packets:131 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:12820 (12.5 Kb)  TX bytes:12820 (12.5 Kb)


 * 6/4:

Met with Professor Jonas and Erol to discuss the installation issues. This resulted in the DVD drive from Caesar being switched into Veralinix. The Fedora installations will be done by swapping the drives into the machine with the DVD drive.

Week Ending June 11th, 2014

 * 6/5:

Continued working on a baseline experiment... the runs kept crashing early in training. I think this was due to scripts being updated without notice. I am trying again with all the latest scripts in the scripts/user dir.

 * setup_SphinxTrain.pl
 * prepareExperiment3.pl
 * generateFeats2.pl

I am installing Fedora 20 on Obelix. Hopefully I can work out the networking issues today so I can work on the test experiment remotely.

Week Ending June 18th, 2014
6/12


 * copy exp
 * generatefeats2
 * run train

Today I'm working with Erol to establish a baseline experiment and get the network up on Obelix.

copy exp 0115 entirely

edit config for new location

add the wav files from the dist- they were removed last semester from all old exps

6/17

I'm working on getting the networking set up on Obelix again. Did some reading on NetworkManager and network.service. The manager is the GUI and the service is what actually runs. I'm experimenting with different configuration files for the NICs. It's confusing because NetworkManager doesn't recognize eth0 and eth1 by default but rather creates its own em1 and em2. I don't think this really matters; I just need to choose one set to have ONBOOT=yes and set the other to no.
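A rough sketch of what one such interface file might look like, under the assumption that em1 is the port on the internal network. Every value here is a placeholder chosen for illustration, not Obelix's actual configuration; the gateway is Caesar's eth1 address from the ifconfig output earlier in this log.

```shell
# /etc/sysconfig/network-scripts/ifcfg-em1
# Hypothetical sketch only; device name and addresses are assumptions.
DEVICE=em1
TYPE=Ethernet
# Bring this interface up at boot.
ONBOOT=yes
# Leave the interface to network.service rather than NetworkManager.
NM_CONTROLLED=no
# Static addressing instead of DHCP.
BOOTPROTO=none
# Placeholder address on the internal 192.168.10.0/24 network.
IPADDR=192.168.10.3
NETMASK=255.255.255.0
# Caesar's internal (eth1) address.
GATEWAY=192.168.10.1
```

Any second file describing the same port would then carry ONBOOT=no, so that only one file claims the interface at boot.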

I was comparing the setup to Rome's and noticed that Rome only has one ethernet port so it's a different setup.

With all my tinkering, I knocked out eth1. I tested with a different cable to confirm it was me. At this point I'm calling it a night. I'm going to start the reinstall of Fedora before I go just to have a fresh start next time.

Week Ending June 25th, 2014
6/24

I am going to attempt again to get Obelix online. My approach today will be to emulate the settings on Rome. I'll start with the GUI and then try altering the config files if that doesn't do it.

Rome only has one ethernet port and NetworkManager shows that it is set to manual on the ipv4 tab.

netmask 255.255.255.0 gateway is caesar

just entering these fields did not work.

lo is identical on the machines

This is the NIC card from ROME:

 PEERROUTES="yes"
 IPV6INIT="yes"
 UUID="48f065c8-27de-41ce-8c7b-57f761e0dac8"
 IPV6_PEERDNS="yes"
 DEFROUTE="yes"
 PEERDNS="yes"
 IPV4_FAILURE_FATAL="no"
 HWADDR="00:11:11:2A:20:16"
 BOOTPROTO="static"
 IPV6_DEFROUTE="yes"
 IPV6_AUTOCONF="yes"
 IPV6_FAILURE_FATAL="no"
 IPV6_PEERROUTES="yes"
 TYPE="Ethernet"
 ONBOOT="yes"
 NAME="enp2s0"
 DHCPCLASS=
 IPADDR="192.168.10.11"
 NETMASK="255.255.255.0"

This is the card from Obelix after my changes to the GUI:

Actually... in looking for it I noticed that there are four NIC files: ifcfg-em1, ifcfg-em2, ifcfg-Wired_connection_1 and ifcfg-Wired_connection_2. The Wired_connection ones are configured to start on boot while the em ones are not. That's strange because the GUI deals with the em ones.

I made a copy of the four files on the desktop before I start messing with them.

Disabled NetworkManager

manually added the DNS servers to resolv.conf

 nameserver 132.177.189.40
 nameserver 132.177.189.41
 nameserver 132.177.205.44

I am using ifup em# to make sure they are on

For some reason the MAC addresses are wrong for em1 and em2 in their files. I am correcting this.
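A small sketch of how that mismatch could be checked instead of eyeballed, assuming the usual Fedora file layout. The function names cfg_hwaddr and hwaddr_matches are made up for illustration; the kernel's idea of the address is read from /sys/class/net/IFACE/address.

```shell
#!/bin/sh
# Print the HWADDR value from an ifcfg-style file, unquoted and lower-cased.
cfg_hwaddr() {
    sed -n 's/^HWADDR=//p' "$1" | tr -d '"' | tr 'A-F' 'a-f'
}

# Succeed if the file's HWADDR equals the given MAC (case-insensitive).
hwaddr_matches() {
    want=$(printf '%s' "$2" | tr 'A-F' 'a-f')
    [ "$(cfg_hwaddr "$1")" = "$want" ]
}

# Example on a live system (paths are assumptions):
#   hwaddr_matches /etc/sysconfig/network-scripts/ifcfg-em1 \
#       "$(cat /sys/class/net/em1/address)" || echo "em1: HWADDR mismatch"
```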

I have network connectivity now. I tried to ssh into Obelix from my machine but it didn't work. I'm going to look into Sinisa's log because I think this key issue is what he dealt with last semester.

error message:

 @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
 @   WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED! @
 @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
 IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
 Someone could be eavesdropping on you right now (man-in-the-middle attack)!
 It is also possible that the RSA host key has just been changed.
 The fingerprint for the RSA key sent by the remote host is
 5b:bd:f3:1c:be:61:36:2c:be:c9:0e:58:ad:25:9e:38.
 Please contact your system administrator.
 Add correct host key in /mnt/main/home/sp14/rohrdanz/.ssh/known_hosts to get rid of this message.
 Offending key in /mnt/main/home/sp14/rohrdanz/.ssh/known_hosts:7
 RSA host key for obelix has changed and you have requested strict checking.
 Host key verification failed.

I saved the files again just in case in running_network.

Changed the hostname and reboot... hopefully this might fix it.

ssh-keygen -R obelix

This removes the stale RSA host key for obelix from known_hosts... pretty simple. But once I'm ssh'ing in, it wants a local password on Obelix. I'm going to set myself up an account that matches my wildcats so I can get in.

network doesn't work on reboot...

you can SSH in but cannot access the internet through Firefox. Hmmm.

Week Ending July 2nd, 2014
7/1

Today I am trying again to resolve the issue with Obelix allowing for ssh access but no internet connection. I need the internet in order to install NFS.

I quickly tried copying the config files I saved when it was all running Thursday... no dice.

I noticed that the Wired_connection files have reset their ONBOOT parameter to yes. Setting them to no and trying a reboot.

The network rebooted without an error message.

In Googling ways to check for connectivity I think I may have a DNS or name problem.

I found that my hostname reverted to localhost.

I notice that the other drones have "search caesar" preceding the name servers in their resolv.conf file. I'm going to try adding it to Obelix's conf.

I also stopped and disabled NetworkManager which apparently causes issues when manually setting up the network.

The gateway seems to be missing when I give the "route" command.

I enabled masquerading and was able to ping both Rome and Google's server.

 iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE
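For the missing-gateway symptom, a sketch of a check that could be scripted. It assumes the net-tools "route -n" column layout (Destination, Gateway, Genmask, Flags, ...); the function name default_gateway is made up for illustration.

```shell
#!/bin/sh
# Read `route -n` output on stdin and print the default gateway,
# or print nothing when no default route is present.
default_gateway() {
    awk '$1 == "0.0.0.0" && $4 ~ /G/ { print $2; exit }'
}

# Usage on a live system:
#   gw=$(route -n | default_gateway)
#   [ -n "$gw" ] || echo "no default gateway configured"
#
# A missing route could be added by hand (as root), for example:
#   route add default gw 192.168.10.1
# where 192.168.10.1 is Caesar's eth1 address from the earlier ifconfig output.
```

The MASQUERADE rule only has an effect on a machine that forwards packets, so "sysctl net.ipv4.ip_forward" is worth checking on whichever host the rule is applied to.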

I am now running updates on the system to make sure everything is up to date.

7/2

Installed nfs-utils via yum

used this guide http://www.server-world.info/en/note?os=Fedora_19&p=nfs&f=2

NFS seems to be working but I'm unsure that it is as 'complete' as the other drones. I will ask about this at the meeting today.

I also contacted Erol in regards to getting his new methods for training and decoding so I can get rocking on that soon. Once Obelix is done the rest should go much faster as I will have the method down. I will put together a document for my own reference and also for the wiki.

Week Ending July 9th, 2014
7/7

mounted /usr/local to Caesar's local. I can see the files on caesar but something still seems off.

took a look at root directory on one of the other drones. bin lib and sbin are all links too. there is no mention of this on the wiki.

decided to instead look at rome, since it is a successful fedora install.

I can run the files in the /usr/local/bin on rome but not obelix so something in my setup is off.

used df on rome to show network status:

caesar:/mnt/main       458311936 290061056 144969984  67% /mnt/main

This is what I will try to mount on Obelix.

mount -t nfs caesar:/mnt/main /mnt/main

restarting the network...

I can run the commands in the bin now! But I still don't think it is using Caesar's local directory for some reason.

Changed my password on caesar and attempted to log in to Obelix with the new password.

now that it's mounted right I created the soft link and the files in /bin won't run now.

checked back on rome and there is no soft link there.

7/8

I'm restarting Obelix... in retrospect I should have tried this yesterday.

Upon reboot usr/local is a broken link again and Caesar isn't mounted.

remount and /bin programs still don't work.

Week Ending July 16th, 2014
7/14

Did some more reading on NFS. Learned about /etc/fstab which sets up the mount at boot time. I would eventually need to do this anyway and I'm wondering if the /usr/local issue is related to this.
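A sketch of how the boot-time mount could be recorded, assuming the same export and mount point used in the manual mount command (caesar:/mnt/main on /mnt/main). The helper names and the "defaults" option are illustrative assumptions; the other drones may use different options.

```shell
#!/bin/sh
# Print an /etc/fstab line for an NFS export.
# usage: fstab_nfs_line SERVER:/export /mountpoint
fstab_nfs_line() {
    printf '%s\t%s\tnfs\tdefaults\t0 0\n' "$1" "$2"
}

# Succeed if the given fstab file already has a non-comment entry
# for the mount point.
# usage: fstab_has_mount FILE /mountpoint
fstab_has_mount() {
    awk -v mp="$2" '$1 !~ /^#/ && $2 == mp { found = 1 } END { exit !found }' "$1"
}

# Usage (as root), appending the entry only when it is missing:
#   fstab_has_mount /etc/fstab /mnt/main ||
#       fstab_nfs_line caesar:/mnt/main /mnt/main >> /etc/fstab
#   mount -a
```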

I can't seem to get in to my rohrdanz account on obelix (it's local to it) anymore. I even logged in as root, changed rohrdanz's password and couldn't.

I got an error message on the reboot: write failed: broken pipe

I think it's just timing out. It does reboot NFS mounted.

I went into /usr on one of the other drones (Prof Jonas mentioned that something is off on Rome) to check the permissions. /bin, /lib, and /sbin did not have user write permission so I added it. chmod u+w /lib

Nothing changes.

7/14

on obelix: id rohrdanz uid=2311(rohrdanz) gid=1002(cis790) groups=1002(cis790)

on rome: id rohrdanz uid=2311(rohrdanz) gid=1001(cis790) groups=1001(cis790)

on majestix: id rohrdanz uid=2311(rohrdanz) gid=1001(cis790) groups=1001(cis790),33(video)

I notice the GID is off on obelix so I've changed it in /etc/group to match the others.

Nothing changed... doing a little reading it's not that easy.

groupmod -g 1001 cis790

This worked; running id on myself shows it.
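A sketch of comparing the numeric GIDs across machines, using the id lines recorded in this log as sample input. The function name primary_gid is made up for illustration. The numbers matter because file ownership on the shared mount is stored numerically, and groupmod does not rewrite files that already carry the old GID.

```shell
#!/bin/sh
# Extract the numeric primary GID from one line of `id USER` output on stdin.
primary_gid() {
    sed -n 's/.* gid=\([0-9][0-9]*\).*/\1/p'
}

# Sample lines as recorded in this log (before the groupmod fix).
obelix='uid=2311(rohrdanz) gid=1002(cis790) groups=1002(cis790)'
rome='uid=2311(rohrdanz) gid=1001(cis790) groups=1001(cis790)'

if [ "$(echo "$obelix" | primary_gid)" != "$(echo "$rome" | primary_gid)" ]; then
    echo "GID mismatch between obelix and rome"
fi

# Files still owned by the old numeric GID could be listed with GNU find
# (as root), for example:
#   find / -xdev -gid 1002
```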

Curiously, when I try ssh'ing in to Obelix as rohrdanz again, I no longer get the weird RSA message, but the password still doesn't work. I wonder if this has to do with my having an account on Obelix already.

/etc/ssh/sshd_config is different on rome. I'm going to try copying it to obelix.

no dice.

Week Ending July 23rd 2014
7/20

Sent a note to Prof. Jonas looking for suggestions on what to look in to.

Sent a note to Erol about the scripts he's using to train and decode.

Caesar is not responding. If this is still the case tomorrow night I may have to check it out after work.

7/21

Caesar had crashed and just needed to be restarted.

I can't even get into Obelix as root anymore. I'm going to try copying Rome's ssh_config and sshd_config files to Obelix. I've designated the old files with a 2.

I've received Erol's training and decoding method so I'm going to start on that.

 ssh -v obelix
 OpenSSH_5.4p1, OpenSSL 1.0.0 29 Mar 2010
 debug1: Reading configuration data /etc/ssh/ssh_config
 debug1: Applying options for *
 debug1: Connecting to obelix [192.168.10.3] port 22.
 debug1: Connection established.
 debug1: identity file /mnt/main/home/sp14/rohrdanz/.ssh/id_rsa type 1
 debug1: identity file /mnt/main/home/sp14/rohrdanz/.ssh/id_rsa-cert type -1
 debug1: identity file /mnt/main/home/sp14/rohrdanz/.ssh/id_dsa type -1
 debug1: identity file /mnt/main/home/sp14/rohrdanz/.ssh/id_dsa-cert type -1
 debug1: Remote protocol version 2.0, remote software version OpenSSH_6.3
 debug1: match: OpenSSH_6.3 pat OpenSSH*
 debug1: Enabling compatibility mode for protocol 2.0
 debug1: Local version string SSH-2.0-OpenSSH_5.4
 debug1: SSH2_MSG_KEXINIT sent
 debug1: SSH2_MSG_KEXINIT received
 debug1: kex: server->client aes128-ctr hmac-md5 none
 debug1: kex: client->server aes128-ctr hmac-md5 none
 debug1: SSH2_MSG_KEX_DH_GEX_REQUEST(1024<1024<8192) sent
 debug1: expecting SSH2_MSG_KEX_DH_GEX_GROUP
 debug1: SSH2_MSG_KEX_DH_GEX_INIT sent
 debug1: expecting SSH2_MSG_KEX_DH_GEX_REPLY
 debug1: Host 'obelix' is known and matches the RSA host key.
 debug1: Found key in /mnt/main/home/sp14/rohrdanz/.ssh/known_hosts:20
 debug1: ssh_rsa_verify: signature correct
 debug1: SSH2_MSG_NEWKEYS sent
 debug1: expecting SSH2_MSG_NEWKEYS
 debug1: SSH2_MSG_NEWKEYS received
 debug1: Roaming not allowed by server
 debug1: SSH2_MSG_SERVICE_REQUEST sent
 debug1: SSH2_MSG_SERVICE_ACCEPT received
 debug1: Authentications that can continue: publickey,password,hostbased
 debug1: Next authentication method: publickey
 debug1: Offering public key: /mnt/main/home/sp14/rohrdanz/.ssh/id_rsa
 debug1: Authentications that can continue: publickey,password,hostbased
 debug1: Trying private key: /mnt/main/home/sp14/rohrdanz/.ssh/id_dsa
 debug1: Next authentication method: password

 ssh -v idefix
 OpenSSH_5.4p1, OpenSSL 1.0.0 29 Mar 2010
 debug1: Reading configuration data /etc/ssh/ssh_config
 debug1: Applying options for *
 debug1: Connecting to idefix [192.168.10.7] port 22.
 debug1: Connection established.
 debug1: identity file /mnt/main/home/sp14/rohrdanz/.ssh/id_rsa type 1
 debug1: identity file /mnt/main/home/sp14/rohrdanz/.ssh/id_rsa-cert type -1
 debug1: identity file /mnt/main/home/sp14/rohrdanz/.ssh/id_dsa type -1
 debug1: identity file /mnt/main/home/sp14/rohrdanz/.ssh/id_dsa-cert type -1
 debug1: Remote protocol version 2.0, remote software version OpenSSH_5.4
 debug1: match: OpenSSH_5.4 pat OpenSSH*
 debug1: Enabling compatibility mode for protocol 2.0
 debug1: Local version string SSH-2.0-OpenSSH_5.4
 debug1: SSH2_MSG_KEXINIT sent
 debug1: SSH2_MSG_KEXINIT received
 debug1: kex: server->client aes128-ctr hmac-md5 none
 debug1: kex: client->server aes128-ctr hmac-md5 none
 debug1: SSH2_MSG_KEX_DH_GEX_REQUEST(1024<1024<8192) sent
 debug1: expecting SSH2_MSG_KEX_DH_GEX_GROUP
 debug1: SSH2_MSG_KEX_DH_GEX_INIT sent
 debug1: expecting SSH2_MSG_KEX_DH_GEX_REPLY
 debug1: Host 'idefix' is known and matches the RSA host key.
 debug1: Found key in /mnt/main/home/sp14/rohrdanz/.ssh/known_hosts:1
 debug1: ssh_rsa_verify: signature correct
 debug1: SSH2_MSG_NEWKEYS sent
 debug1: expecting SSH2_MSG_NEWKEYS
 debug1: SSH2_MSG_NEWKEYS received
 debug1: Roaming not allowed by server
 debug1: SSH2_MSG_SERVICE_REQUEST sent
 debug1: SSH2_MSG_SERVICE_ACCEPT received
 debug1: Authentications that can continue: publickey,keyboard-interactive
 debug1: Next authentication method: publickey
 debug1: Offering public key: /mnt/main/home/sp14/rohrdanz/.ssh/id_rsa
 debug1: Server accepts key: pkalg ssh-rsa blen 279
 debug1: read PEM private key done: type RSA
 debug1: Authentication succeeded (publickey).
 debug1: channel 0: new [client-session]
 debug1: Requesting no-more-sessions@openssh.com
 debug1: Entering interactive session.
 debug1: Sending environment.

7/23

Prof Jonas mentioned at the meeting that Obelix was working correctly at one time. I reverted the sshd_config file back to the old one and once again I can log in as root but not rohrdanz. At least I don't have to be local to work any more.

He also mentioned that the other drones have cron jobs that run "mount -a" periodically and I'll have to do this for Obelix.
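A sketch of how such a cron entry might be added without duplicating it. The 15-minute interval is an assumption (the log does not say how often the other drones run it), and ensure_cron_entry is a made-up helper name.

```shell
#!/bin/sh
# Read an existing crontab on stdin and print it back, appending the
# given entry only if an identical line is not already present.
# usage: ensure_cron_entry 'ENTRY'
ensure_cron_entry() {
    awk -v e="$1" '{ print; if ($0 == e) seen = 1 } END { if (!seen) print e }'
}

# Usage for root's crontab (interval is a placeholder):
#   crontab -l 2>/dev/null |
#       ensure_cron_entry '*/15 * * * * /bin/mount -a' |
#       crontab -
```

Running the helper twice leaves a single copy of the entry, so it should be safe to include in a setup script for each drone.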

Week Ending July 30th, 2014
7/28

I've been looking in to why ssh isn't working properly.

It could possibly be the permissions on .ssh and authorized_keys.

in looking in Obelix /root/.ssh there is no authorized_keys

I made one and gave it the proper permissions (600)
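A sketch of the permission setup, assuming the conventional layout (a .ssh directory at mode 700 holding authorized_keys at mode 600). The function name fix_ssh_perms is made up for illustration; sshd's StrictModes check is the reason the modes matter.

```shell
#!/bin/sh
# Create ~/.ssh and authorized_keys under the given home directory if
# they are missing, and set the conventional restrictive modes.
# usage: fix_ssh_perms /path/to/home
fix_ssh_perms() {
    mkdir -p "$1/.ssh"
    touch "$1/.ssh/authorized_keys"
    chmod 700 "$1/.ssh"
    chmod 600 "$1/.ssh/authorized_keys"
}

# Usage (as root on Obelix):
#   fix_ssh_perms /root
#   ls -ld /root/.ssh /root/.ssh/authorized_keys
```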

I changed the ssh_config file to match Idefix's

I kept experimenting with settings in the sshd_config file but NOTHING works. I am always prompted for a password even though I have it set not to. Makes me wonder if this file is even being used or if restarting the service is working.

7/29

I tried again copying the sshd config file from a drone to obelix... it didn't work.

I have no idea what is causing this issue. The UsePAM yes/no setting is what allows root to get in. All the permissions seem to be in order.

8/5

I think I was mistakenly testing with Caesar's sshd_config instead of Obelix's last week. Whoops. Fortunately I didn't lock myself (or the others) out. I was getting the same issue Obelix has with the other drones though (only root can log in with a password).

Doing a little more debugging (ssh -vvv HOST) shows that Obelix isn't accepting the public key, but it is checking the right directory.
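A sketch for pulling the relevant lines out of the verbose output so two hosts can be compared quickly. The function names are made up for illustration; the patterns match the debug1 lines that appear in the ssh -v captures earlier in this log.

```shell
#!/bin/sh
# Read ssh debug output on stdin and print the distinct lists of
# authentication methods the server advertised.
auth_methods() {
    sed -n 's/^debug1: Authentications that can continue: //p' | sort -u
}

# Succeed if the debug output shows the server accepting a public key.
key_accepted() {
    grep -q '^debug1: Server accepts key'
}

# Usage on a live host (ssh writes its debug output to stderr):
#   ssh -v obelix true 2>&1 | auth_methods
#   ssh -v obelix true 2>&1 | key_accepted || echo "public key not accepted"
```

In the earlier captures, idefix shows a "Server accepts key" line after the key is offered, while Obelix goes straight back to listing the methods that can continue.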

I copied both ssh config files from Rome to Obelix.

Obelix seems to be missing some keys.

8/10

I generated ssh_host_key and ssh_host_key.pub on Obelix since the other machines had this. Didn't seem to affect the issue.

I tried disabling different authentication methods in the sshd_config file and nothing worked. only root can ssh into obelix.

8/11

I received an email from Prof. Jonas about the ssh issue. He told me of a command that works much better than the -vvv arguments on ssh: service sshd status

The first suggestion was that SELinux is preventing /usr/sbin/sshd from getattr access on the file /usr/bin/tcsh.

ran this cmd:

/sbin/restorecon -v /usr/bin/tcsh

getting this message now: SELinux is preventing /usr/sbin/sshd from read access on the lnk_file authorized_keys.

now it suggests that SElinux is blocking the nfs home directory and suggests

setsebool -P use_nfs_home_dirs 1

It is finally working!
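For the next drone, a sketch of checking the boolean before changing it. It assumes the usual "NAME --> on/off" output format of getsebool; sebool_is_on is a made-up helper name, and the script does nothing on a machine without the SELinux tools.

```shell
#!/bin/sh
# Read `getsebool NAME` output on stdin; succeed if it reports "on".
sebool_is_on() {
    awk '$2 == "-->" && $3 == "on" { ok = 1 } END { exit !ok }'
}

# Only query SELinux when its tools are installed on this machine.
if command -v getsebool >/dev/null 2>&1; then
    if getsebool use_nfs_home_dirs 2>/dev/null | sebool_is_on; then
        echo "use_nfs_home_dirs is already on"
    else
        echo "use_nfs_home_dirs not reported as on; as root: setsebool -P use_nfs_home_dirs 1"
    fi
fi
```

The -P flag makes the setting persist across reboots, which matters here since the home directories are NFS-mounted on every boot.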

8/12

made a new EXP because the old one was getting cluttered with failures... training on last_5hr on idefix for the baseline now.

I'm using the method that Erol sent me since he's had success:

setup for train

 * create the experiment directory and get into it
 * /mnt/main/scripts/user/prepareExperiment3.pl 3170/train
 * edit and configure sphinx_decode and sphinx_train
 * /mnt/main/scripts/user/generateFeats2.pl
 * nohup scripts_pl/RunAll.pl &

decode

 * perl scripts_pl/make_feats.pl -ctl etc/012_test.fileids
 * perl scripts_pl/decode/slave.pl

Trained on idefix and decoding now.

Training on obelix.