2019 (CURRENT) Backup Solution
- Lutetia is the backup server -- renamed from 'capstonebackup'
Helpful link: http://rsnapshot.org/rsnapshot/docs/docbook/rest.html
See the following general topology before reading: https://foss.unh.edu/projects/images/f/f2/Overview.pdf
- Let's Begin
- Rome serves as the go-between for backing up Caesar to Lutetia. Caesar and Lutetia are both mounted on Rome, at /mnt/main and /mnt/backup_solution respectively.
- Step 1
- For the first backup, make a clean copy of each directory with cp.
- Use this script to do the backup, and add to it or modify it as needed.
#!/bin/bash
# Backup Script Begin Here
# Created a copy of the initial directories that you want backed up
# Uses cp to do the initial copy
# Written By: Don Combs
# Date: 2-21-2019 -
# Run from Rome /mnt directory after mounting both Caesar(/mnt/main) and Lutetia(/home/backup_dir)
# cd /mnt/main
mkdir /mnt/backup_solution/rsync/scripts
nohup cp -i -v -ar /mnt/main/scripts/. /mnt/backup_solution/rsync/scripts/ &
mkdir /mnt/backup_solution/rsync/local
nohup cp -i -v -ar /mnt/main/local/. /mnt/backup_solution/rsync/local/ &
mkdir /mnt/backup_solution/rsync/corpus
nohup cp -i -v -ar /mnt/main/corpus/. /mnt/backup_solution/rsync/corpus/ &
mkdir /mnt/backup_solution/rsync/Exp
mkdir /mnt/backup_solution/rsync/Exp/sp18
nohup cp -i -v -ar /mnt/main/Exp/sp18/. /mnt/backup_solution/rsync/Exp/sp18/ &
mkdir /mnt/backup_solution/rsync/Exp/su18
nohup cp -i -v -ar /mnt/main/Exp/su18/. /mnt/backup_solution/rsync/Exp/su18/ &
mkdir /mnt/backup_solution/rsync/Exp/sp17
nohup cp -i -v -ar /mnt/main/Exp/sp17/. /mnt/backup_solution/rsync/Exp/sp17/ &
mkdir /mnt/backup_solution/rsync/Exp/su17
nohup cp -i -v -ar /mnt/main/Exp/su17/. /mnt/backup_solution/rsync/Exp/su17/ &
mkdir /mnt/backup_solution/rsync/Exp/0313
nohup cp -i -v -ar /mnt/main/Exp/0313/. /mnt/backup_solution/rsync/Exp/0313/ &
mkdir /mnt/backup_solution/rsync/Exp/0314
nohup cp -i -v -ar /mnt/main/Exp/0314/. /mnt/backup_solution/rsync/Exp/0314/ &
mkdir /mnt/backup_solution/rsync/Exp/0315
nohup cp -i -v -ar /mnt/main/Exp/0315/. /mnt/backup_solution/rsync/Exp/0315/ &
mkdir /mnt/backup_solution/rsync/Exp/0316
nohup cp -i -v -ar /mnt/main/Exp/0316/. /mnt/backup_solution/rsync/Exp/0316/ &
mkdir /mnt/backup_solution/rsync/Exp/0317
nohup cp -i -v -ar /mnt/main/Exp/0317/. /mnt/backup_solution/rsync/Exp/0317/ &
mkdir /mnt/backup_solution/rsync/install
nohup cp -i -v -ar /mnt/main/install/. /mnt/backup_solution/rsync/install/ &
mkdir /mnt/backup_solution/rsync/home
nohup cp -i -v -ar /mnt/main/home/. /mnt/backup_solution/rsync/home/ &
- As you can see in the above script, we create the directory structure and complete the initial copy for the rsync process.
You will have to edit the above to fit your environment: at the end of each year the 03xx directories are moved into a directory for that semester, i.e. sp19 will be filled with the 03xx directories listed above.
- For a one-line example, see below.
nohup cp -i -v -ar /mnt/main/Exp/0309/ /mnt/backup_solution/rsyncTest/
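The initial-copy script repeats the same mkdir/cp pair for every directory, so adding or removing a directory means editing two lines each time. A loop-based variant might look like the sketch below. The helper name initial_copy and the SRC_ROOT/DST_ROOT variables are assumptions made for illustration; their defaults are the mount points used above. Unlike the original, this sketch copies the directories one after another rather than starting every cp in the background.

```shell
#!/bin/sh
# Sketch: initial copy of several directories with one loop.
# SRC_ROOT, DST_ROOT and initial_copy are hypothetical names; the defaults
# match the mount points described in this page.
SRC_ROOT="${SRC_ROOT:-/mnt/main}"
DST_ROOT="${DST_ROOT:-/mnt/backup_solution/rsync}"

initial_copy() {
    # Arguments: directories relative to SRC_ROOT, e.g. scripts Exp/sp18
    for dir in "$@"; do
        # -p also creates missing parents such as Exp/
        mkdir -p "$DST_ROOT/$dir" || return 1
        # -a keeps permissions, times and symlinks; "/." copies the contents
        cp -a "$SRC_ROOT/$dir/." "$DST_ROOT/$dir/" || return 1
    done
}

# Example call (left commented out so loading this file copies nothing):
# initial_copy scripts local corpus Exp/sp18 Exp/0313 install home
```

Only the list of directories in the final call needs to change when the 03xx directories move into a semester directory.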
- Step 2
- Do an archival rsync, which transfers only the differences between the initial copy and the current state of the directory. Running it under nohup and also copying backupconfirmation.txt is advisable, as follows:
nohup sh -c 'rsync -av /mnt/main/Exp/0309/ /mnt/backup_solution/rsyncTest/ && rsync -av /mnt/main/backup_solution/backupconfirmation.txt /mnt/backup_solution/rsyncTest/' &
- The above is fine if you want to run the command by hand every day, but we want to automate the process.
- We do that by adding crontab entries that run the above command automatically at a set time. To do that, we make a special file that we use to fill the crontab with the correct entries.
- Here is the file
#M H DOM MOY DOW COMMAND
#0 3 * * 1 rsnapshot weekly
#0 3 * * 1 rsync -av /mnt/main/Exp/0309/ /mnt/backup_solution/rsyncTest/
#0 3 * * 1 rsync -av /mnt/main/backup_solution/backupconfirmation.txt /mnt/backup_solution/rsyncTest/
1 0 * * 1 rsync -av /mnt/main/local /mnt/backup_solution/rsync/ && rsync -av /mnt/main/backup_solution/backupconfirmation.txt /mnt/backup_solution/rsync/
0 1 * * 1 rsync -av /mnt/main/scripts /mnt/backup_solution/rsync/ && rsync -av /mnt/main/backup_solution/backupconfirmation.txt /mnt/backup_solution/rsync/
#0 2 * * 1 rsync -av /mnt/main/install /mnt/backup_solution/rsync/ && rsync -av /mnt/main/backup_solution/backupconfirmation.txt /mnt/backup_solution/rsync/
0 3 * * 1 rsync -av /mnt/main/home /mnt/backup_solution/rsync/ && rsync -av /mnt/main/backup_solution/backupconfirmation.txt /mnt/backup_solution/rsync/
1 0 * * 7 rsync -av /mnt/main/corpus /mnt/backup_solution/rsync/ && rsync -av /mnt/main/backup_solution/backupconfirmation.txt /mnt/backup_solution/rsync/
#0 5 * * 1 rsync -av /mnt/main/Exp/sp17 /mnt/backup_solution/rsync/Exp/ && rsync -av /mnt/main/backup_solution/backupconfirmation.txt /mnt/backup_solution/rsync/
#0 6 * * 1 rsync -av /mnt/main/Exp/su17 /mnt/backup_solution/rsync/Exp/ && rsync -av /mnt/main/backup_solution/backupconfirmation.txt /mnt/backup_solution/rsync/
#0 7 * * 1 rsync -av /mnt/main/Exp/sp18 /mnt/backup_solution/rsync/Exp/ && rsync -av /mnt/main/backup_solution/backupconfirmation.txt /mnt/backup_solution/rsync/
#0 8 * * 1 rsync -av /mnt/main/Exp/su18 /mnt/backup_solution/rsync/Exp/ && rsync -av /mnt/main/backup_solution/backupconfirmation.txt /mnt/backup_solution/rsync/
#0 9 * * 1 rsync -av /mnt/main/Exp/0313 /mnt/backup_solution/rsync/Exp/ && rsync -av /mnt/main/backup_solution/backupconfirmation.txt /mnt/backup_solution/rsync/
0 10 * * 1 rsync -av /mnt/main/Exp/0314 /mnt/backup_solution/rsync/Exp/ && rsync -av /mnt/main/backup_solution/backupconfirmation.txt /mnt/backup_solution/rsync/
0 11 * * 1 rsync -av /mnt/main/Exp/0315 /mnt/backup_solution/rsync/Exp/ && rsync -av /mnt/main/backup_solution/backupconfirmation.txt /mnt/backup_solution/rsync/
0 12 * * 1 rsync -av /mnt/main/Exp/0316 /mnt/backup_solution/rsync/Exp/ && rsync -av /mnt/main/backup_solution/backupconfirmation.txt /mnt/backup_solution/rsync/
0 13 * * 1 rsync -av /mnt/main/Exp/0317 /mnt/backup_solution/rsync/Exp/ && rsync -av /mnt/main/backup_solution/backupconfirmation.txt /mnt/backup_solution/rsync/
- Do not forget to add new 03xx directories to the backup as time goes by. You will need to check for them periodically yourself, as the other groups will not tell you about the new experiment directories they make.
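Checking for new 03xx directories by hand is easy to forget, so a small helper could compare the Exp directory against the crontab file. The sketch below is one possible approach; the function name missing_exp_dirs is hypothetical, and it assumes the crontab entries reference experiments as /Exp/03xx, as the entries above do. Commented-out entries are treated as not backed up.

```shell
# Sketch: list 03xx experiment directories that no active crontab line mentions.
# missing_exp_dirs is a hypothetical helper.
#   $1 = Exp directory (e.g. /mnt/main/Exp)
#   $2 = file holding the crontab entries
missing_exp_dirs() {
    exp_dir=$1
    cron_file=$2
    for path in "$exp_dir"/03[0-9][0-9]; do
        [ -d "$path" ] || continue      # skip when the glob matches nothing
        name=${path##*/}
        # Drop commented-out lines, then look for /Exp/<name> followed by "/" or a space
        if ! grep -v '^[[:space:]]*#' "$cron_file" | grep -q "/Exp/$name[/ ]"; then
            echo "$name"
        fi
    done
}
```

A call such as missing_exp_dirs /mnt/main/Exp backup_crontab.txt (the file name here is only an example) prints one directory name per line, and prints nothing when every 03xx directory has an active entry.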
- To check that the backup is working correctly
- SSH into ROME
- Go to this directory:
- And look for these files
-rw-------. 1 root root 65930 Apr 24 20:30 cron
-rw-------. 1 root root 123063 Mar 31 03:38 cron-20190331
-rw-------. 1 root root 123964 Apr 7 03:10 cron-20190407
-rw-------. 1 root root 127118 Apr 14 04:18 cron-20190414
-rw-------. 1 root root 124757 Apr 21 03:39 cron-20190421
- As you can see, the log is rotated into a new dated file periodically (the dated files above are one week apart).
- To make your life easier, run this command against each file:
cat cron-20190421 | grep rsync
- The above will show that the jobs ran, but what about success or failure? For that we go to this directory and read these files:
cd /var/mail
cat root | grep rsync
- This will scroll entries right off the screen, so you may want to refine it a little:
tail -f -n 100 root.old
(I renamed the file to root.old because it was getting far too large for what we need to do; a new root file will appear there soon enough.)
- but that will only give you info on the last items that were added to the root mail file
- So what do we do? First, let's get some idea of what we are looking for. Load the file into nano.
- Then search for a date on which a backup ran.
- If you know how to read the crontab file, you can see that the Monday backups start at 00:01 (Monday is day 1 of the week).
- So what I searched for was "22 Apr 2019", without the quotes.
- That brings you to the beginning of that day's cron job output, where you can scroll down to read what happened.
- It is up to you to make the success and failure reporting better.
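One possible starting point for better reporting is a wrapper that records a single OK or FAIL line for each run, so success can be checked with one grep instead of paging through the root mail file. This is a sketch only: the helper name run_logged and the log line format are assumptions and not part of the existing setup.

```shell
# Sketch: run a command and append one OK/FAIL line per run to a status log.
# run_logged and the log format are hypothetical.
#   $1 = status log file, $2 = short label, remaining arguments = command to run
run_logged() {
    log=$1
    label=$2
    shift 2
    "$@"                # run the command; its own output is left untouched
    status=$?
    if [ "$status" -eq 0 ]; then
        result=OK
    else
        result=FAIL
    fi
    printf '%s %s %s exit=%s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$result" "$label" "$status" >> "$log"
    return "$status"
}
```

With a call such as run_logged /var/log/backup_status.log home rsync -av /mnt/main/home /mnt/backup_solution/rsync/ (the log path is an example), failed runs can later be listed with grep FAIL on that log.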
- You should see a backupconfirmation.txt. A cron job on Caesar appends the date to this text file once an hour, and the file is overwritten once a week.
[root@rome backup_solution]# pwd
/mnt/main/backup_solution
[root@rome backup_solution]# cat backupconfirmation.txt
Sun Apr 8 15:00:25 EDT 2018
Sun Apr 8 15:00:29 EDT 2018
Sun Apr 8 16:00:01 EDT 2018
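Because rsync -a preserves modification times, the copy of backupconfirmation.txt on the backup side carries the time Caesar last wrote it before the backup ran. A quick staleness check could therefore look at the age of that copy. The sketch below assumes GNU or BSD find (the -mmin test is an extension), and is_fresh is a hypothetical helper name.

```shell
# Sketch: succeed when a file was modified within the last N minutes.
# is_fresh is a hypothetical helper; -mmin is a GNU/BSD find extension.
#   $1 = file to check, $2 = maximum age in minutes
is_fresh() {
    # find prints the path only when the file exists and is newer than the limit
    [ -n "$(find "$1" -mmin "-$2" 2>/dev/null)" ]
}
```

For a weekly backup, a limit a little over seven days would be a reasonable assumption, for example: is_fresh /mnt/backup_solution/rsync/backupconfirmation.txt 11520 || echo "backup copy looks stale".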
- Symbolic Links
- Symbolic links slow down the backup considerably. These links can be skipped, but then it would not be a true backup. The recommendation for completing an rsync backup while keeping symbolic links is to create a Perl script and run the rsyncs in parallel.
- The copy and rsync processes are very slow. This could potentially be sped up by running multiple rsyncs at once, such as
- Rsync directory A
- Rsync directory B
- Rsync directory C
- Changing the CPU affinity and renice priority had no effect on speed.
- Possible Solution
- Create a parallelization script in Perl to stack rsyncs atop each other.
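As an alternative to writing the parallelization in Perl, the same idea can be tried from the shell with xargs -P, which runs several commands at once. The sketch below swaps Perl for xargs and is not the script proposed above; parallel_sync and the SYNC_CMD override are hypothetical names, and -P is a GNU/BSD extension to xargs. Whether running rsyncs side by side actually helps depends on where the bottleneck is, so it is worth timing on a small directory first.

```shell
# Sketch: run several copies at once with xargs -P instead of a Perl script.
# SYNC_CMD is a hypothetical override so the loop can be tried with another tool.
SYNC_CMD="${SYNC_CMD:-rsync -a}"

# parallel_sync is a hypothetical helper.
#   $1 = source root, $2 = destination root (must exist), $3 = parallel jobs,
#   remaining arguments = directory names under the source root
parallel_sync() {
    src=$1
    dst=$2
    jobs=$3
    shift 3
    # One directory name per line; xargs starts up to $jobs commands at a time.
    # SYNC_CMD is unquoted on purpose so "rsync -a" splits into command and flag.
    printf '%s\n' "$@" | xargs -P "$jobs" -I{} $SYNC_CMD "$src/{}" "$dst/"
}
```

For example, parallel_sync /mnt/main/Exp /mnt/backup_solution/rsync/Exp 3 0314 0315 0316 would run three rsyncs side by side.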
- Space \ Storage
- If you run tail -20 nohup.txt in /mnt/backup_solution/rsyncTest/ on Rome, you will see the following errors:
rsync: recv_generator: mkdir "/mnt/backup_solution/rsyncTest/40/qmanager" failed: No space left on device (28)
*** Skipping any contents from this failed directory ***
- If you run df -h on Rome you will see that /mnt/backup_solution says it is taking up 84% of the partition. Given that Lutetia has a 4TB hard drive, the resolution for this should be to simply extend the size of the partition to utilize the full 4TBs.
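The df check can also be scripted so that a backup job warns before the partition fills up. The sketch below reads the usage figure from df -P, which prints a fixed column layout; usage_percent is a hypothetical helper name and the 90 percent threshold in the usage note is only an example value.

```shell
# Sketch: print how full the filesystem holding a path is, as a bare number.
# usage_percent is a hypothetical helper.
#   $1 = any path on the filesystem to check
usage_percent() {
    # df -P prints a header line and one data line; column 5 is e.g. "84%"
    df -P "$1" | awk 'NR==2 { sub(/%/, "", $5); print $5 }'
}
```

A guard in front of a backup could then read: [ "$(usage_percent /mnt/backup_solution)" -lt 90 ] || echo "backup partition is nearly full".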
- BACKUPS ARE NOT RUNNING AT THIS TIME (Semester end, 2018)
The current backup system is built on rsync. rsnapshot is a Perl-based utility script that invokes rsync and uses native Linux hard links to save space (more about that below).
rsnapshot has no dependencies other than default Linux tools like Perl, rsync, and cron.
rsnapshot is essentially a prepackaged script that performs backups and rotates them according to your desired retention settings and available space.
Rsnapshot info http://rsnapshot.org
File System Explanation
In order to understand how this system works it is important to understand how files are handled on linux systems.
Each file stored on a Linux system is described by an inode, which records where the file's data physically lives on the disk. A file name is a link (a directory entry) pointing to that inode. Suppose we have file1 and we "delete" it with the rm command (rm file1). On a Linux system this does not actually erase the data from the disk; it simply removes the link telling the system where the file is. Once the system has no links back to the file, it is considered "deleted".
This accomplishes two things.
1.) The system now knows that that particular space is available to be used by something else (free space)
2.) Makes "deleting" files really quick.
file1 still actually resides on the disk until something else is written over it.
The trick comes in that files can have more than one "link".
Hard links vs. Soft links
Soft links are simply pointers to files, similar to shortcuts in Windows. If the original file is removed the soft link or "shortcut" will no longer work.
Hard links are called this because they create another file system name pointing directly at the same inode and data. So if we have file1 and a hard link file1-hardlink, the system has two "hard" pointers or paths to the file. If you delete file1 you can still access the data through file1-hardlink.
Basically, using rm file1 would change the link count from 2 back to 1.
rsnapshot and rsync utilize this system to take multiple backups without consuming a ton of space.
With hard links, file1 might be 100MB and file1-hardlink will take almost no space, because it is just another directory entry for the same inode.
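The link-count behaviour can be seen directly on any Linux machine. The sketch below works in a throwaway temporary directory and uses GNU stat (the -c option) to print the link count and compare inode numbers; demo_hardlink is a hypothetical name used only for this illustration.

```shell
# Sketch: show that a hard link shares the inode of the original file.
# Uses GNU stat (-c); demo_hardlink and the file names are illustrative only.
demo_hardlink() {
    dir=$(mktemp -d) || return 1
    echo "some data" > "$dir/file1"
    ln "$dir/file1" "$dir/file1-hardlink"       # second name for the same inode
    echo "links after ln: $(stat -c %h "$dir/file1")"
    if [ "$(stat -c %i "$dir/file1")" = "$(stat -c %i "$dir/file1-hardlink")" ]; then
        echo "same inode: yes"
    fi
    rm "$dir/file1"                             # removes one name only
    echo "links after rm: $(stat -c %h "$dir/file1-hardlink")"
    echo "content: $(cat "$dir/file1-hardlink")"
    rm -rf "$dir"
}
```

Running demo_hardlink shows the count going from 2 back to 1 while the data stays readable through the remaining name.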
A more detailed explanation is here: http://www.mikerubel.org/computers/rsync_snapshots
Currently our system is set up on CapBack, which is located in IT room 124. It is a Hyper-V hosted virtual machine.
Current VM details
Name: CapBack
IP: 192.168.1.5
OS: Ubuntu Server (no GUI) 16.04
2GB RAM
1 vCPU
13 GB root partition /dev/sda1
3.6TB (usable) /dev/sdb1 ext4 partition mounted at /mnt/rsnapshot
The only package installed in addition to a base Ubuntu Server install was SSHD. This can be selected at the time of install.
rsnapshot was installed on CapBack with the deb package located at /mnt/main/install/deb/rsnapshot_1.3.1-4_all.deb
Installation command: dpkg -i rsnapshot_1.3.1-4_all.deb
Both Rome and CapBack were set up with SSH keys to allow automatic logins for backup purposes.
Here is a quick guide to generating keys: http://www.rsync.net/resources/howto/ssh_keys.html
SSH keys were generated on Rome and named "romersa" and "romersa.pub":
[root@rome .ssh]# ls -l
total 24
-rw-r--r--. 1 root root 394 Apr 30 20:18 authorized_keys
-rw-------. 1 root root 1675 Apr 27 20:52 identity
-rw-------. 1 root root 1675 Apr 27 20:59 id_rsa
-rw-r--r--. 1 root root 1993 May 4 15:58 known_hosts
-rw-------. 1 root root 1675 Apr 27 20:48 romersa
-rw-r--r--. 1 root root 399 Apr 27 20:48 romersa.pub
and on CapBack, named "capback" and "capback.pub":
root@capback:~/.ssh# ls -l
total 16
-rw------- 1 root root 1197 Apr 27 20:57 authorized_keys
-rw------- 1 root root 1679 Apr 30 20:17 capback
-rw-r--r-- 1 root root 394 Apr 30 20:17 capback.pub
-rw-r--r-- 1 root root 442 Apr 27 20:54 known_hosts
The .pub keys are copied to the opposite system and then copied into the authorized_keys file. This can be done with the command
"cat capback.pub >> authorized_keys"
Once the files are in place, the ssh_config file has to be modified to instruct each system to use its private key file automatically.
This file is located at /etc/ssh/ssh_config on both systems.
- Enter the following line on each respective system.
/etc/rsnapshot.conf is the configuration file used to set parameters such as backup destination, backup source, rotation cycle, etc.
Partial listing of config file
# backup destination parameter
###########################
# SNAPSHOT ROOT DIRECTORY #
###########################
# All snapshots will be stored under this root directory.
snapshot_root	/mnt/rsnapshot/
#########################################
# BACKUP INTERVALS #
# Must be unique and in ascending order #
# i.e. hourly, daily, weekly, etc. #
#########################################
#retain	hourly	6
retain	daily	365
#retain	weekly	4
#retain	monthly	3
#logfile
logfile	/var/log/rsnapshot.log
This system is currently set to backup at midnight daily.
Currently it is configured to provide 365 daily backups before removing previous backups. It can be configured to roll weekly and even monthly if you prefer. If space becomes an issue this might be the best option because then you can return to a point 5 weeks ago without needing everything in between.
As for space, this should be watched and the retention settings changed accordingly at some point after the full backup and a few weeks of running backups. The system does not currently have the ability to monitor free space and adjust this on its own, although that functionality could be added in the future.
Because we have defined 365 daily backups, you will find daily.0 through daily.364 (or the highest number reached so far) in the /mnt/rsnapshot/rome folder. When a new backup runs after the limit has been reached, the oldest snapshot (daily.364) is removed and the newest backup is written to daily.0.
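The rotation of numbered snapshot directories can be illustrated with a few shell commands in the style of the hard-link snapshot technique linked above. This is a simplified sketch and not rsnapshot's actual code; rotate_snapshots is a hypothetical helper, and cp -al (archive copy made of hard links) is a GNU cp feature.

```shell
# Sketch of hard-link snapshot rotation (simplified; rsnapshot differs in detail).
# rotate_snapshots is a hypothetical helper.
#   $1 = snapshot root, $2 = interval name (e.g. daily), $3 = snapshots to keep
rotate_snapshots() {
    root=$1
    name=$2
    keep=$3
    [ "$keep" -ge 2 ] || return 1               # need room for name.0 and name.1
    i=$((keep - 1))
    rm -rf "$root/$name.$i"                     # drop the oldest snapshot
    while [ "$i" -gt 1 ]; do                    # shift name.(i-1) to name.i
        prev=$((i - 1))
        [ -d "$root/$name.$prev" ] && mv "$root/$name.$prev" "$root/$name.$i"
        i=$prev
    done
    # Hard-link copy of the newest snapshot; a later rsync run would update name.0
    [ -d "$root/$name.0" ] && cp -al "$root/$name.0" "$root/$name.1"
    return 0
}
```

After rotate_snapshots /mnt/rsnapshot/rome daily 365, unchanged files in daily.0 and daily.1 share the same inodes, which is why each extra snapshot costs little space.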
Current Backup Dirs
As of right now the CapBack system is connected to Rome (192.168.1.4) via a dedicated network in the 192.168.1.x range. This is done through a small switch in the server room, and utilizes the eth1 interface on Rome.
Rome
eth1 Link encap:Ethernet HWaddr 00:22:19:25:8F:CA
inet addr:192.168.1.4 Bcast:192.168.1.255 Mask:255.255.255.0
Rome has the NFS share /mnt/main mounted so we can access and read it for backups.
#Active backups - add new lines here
backup	email@example.com:/mnt/main/scripts	rome/
To add more directories add extra lines under active backups. Be sure to utilize the exclude options if setting large directories such as /mnt/main or /
If it is desired to back up Caesar or another server, the easiest way would be to add an interface and give it an IP in the 192.168.1.x range, although you could also set up keys and tunnel through Rome.
The system is currently automated using cron. (https://help.ubuntu.com/community/CronHowto)
Basically this is the Linux counterpart of the Windows Task Scheduler. You can view the current crontab settings by using the command "crontab -l".
# m h dom mon dow command
0 0 * * * rsnapshot daily
This indicates the system will run the command "rsnapshot daily" at midnight daily.
The m stands for minute, h for hour, dom for day of month, mon for month, and dow for day of week.
So in our case minute 0 of hour 0 means 12:00am, every day of the month, every month, every day of the week.
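To make the field order concrete, a crontab line can be split into its five time fields and the command with the shell's read builtin. The helper name describe_cron is hypothetical and the sketch only labels the fields; it does not validate them.

```shell
# Sketch: label the fields of one crontab line.
# describe_cron is a hypothetical helper; $1 = a single crontab entry
describe_cron() {
    # read splits on whitespace; the last variable receives the rest of the line
    read -r m h dom mon dow cmd <<EOF
$1
EOF
    printf 'minute=%s hour=%s day-of-month=%s month=%s day-of-week=%s command=%s\n' \
        "$m" "$h" "$dom" "$mon" "$dow" "$cmd"
}
```

For example, describe_cron '0 0 * * * rsnapshot daily' prints minute=0 hour=0 day-of-month=* month=* day-of-week=* command=rsnapshot daily.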
VERY OLD Backup System
JUST IGNORE THIS
In Spring 2013, the systems group explored the option of Clonezilla. Clonezilla is open-source software that saves information by taking a full image of one disk drive at a time. Backups need to be done manually in this case, and system downtime will be experienced with this option.
In Spring 2012, the systems group explored the option of using Google Code and uploading files from the system that way.