Speech:Backups



System Description

2019 (CURRENT) Backup Solution

Lutetia is the backup server -- renamed from 'capstonebackup'

Helpful link: http://rsnapshot.org/rsnapshot/docs/docbook/rest.html

See the following general topology before reading: https://foss.unh.edu/projects/images/f/f2/Overview.pdf

Let's Begin
Rome is serving as the go-between to back up Caesar to Lutetia. This is done by having Caesar and Lutetia mounted on Rome at /mnt/main and /mnt/backup_solution respectively.
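For reference, the mounts on Rome might be established something like this (a sketch only, assuming NFS and working name resolution; the actual export paths and mount options were not recorded here, though the script comments below suggest Lutetia exports /home/backup_dir):
mount -t nfs caesar:/mnt/main /mnt/main                      # Caesar's main share
mount -t nfs lutetia:/home/backup_dir /mnt/backup_solution   # Lutetia's backup disk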
Step 1
For the first backup, do a clean copy (cp) of each directory to back it up.
Use this script to do the backup, adding to it or modifying it as needed:
#!/bin/bash
# Backup Script Begins Here
# Creates a copy of the initial directories that you want backed up
# Uses cp to do the initial copy
# Written By: Don Combs
# Date: 2-21-2019
# Run from Rome's /mnt directory after mounting both Caesar (/mnt/main)
# and Lutetia (/home/backup_dir on Lutetia, mounted at /mnt/backup_solution)
#
cd /mnt/main
# One entry per directory to back up; edit this list to fit your environment
for dir in scripts local corpus Exp/sp17 Exp/su17 Exp/sp18 Exp/su18 \
           Exp/0313 Exp/0314 Exp/0315 Exp/0316 Exp/0317 install home; do
    # Recreate the directory structure on the backup side
    mkdir -p /mnt/backup_solution/rsync/$dir
    # Archive copy in the background; verbose output collects in nohup.out
    nohup cp -i -v -ar /mnt/main/$dir/. /mnt/backup_solution/rsync/$dir/ &
done
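Because every copy runs in the background under nohup, their combined verbose output collects in nohup.out in the directory the script was started from. Progress can be checked with:
tail -f nohup.out   # watch the cp output as files are copied
jobs                # list background copies still running (in the launching shell only)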
As you can see, the script above creates the directory structure and completes the initial copy for the rsync process.

You will have to edit the directory list to fit your environment, as at the end of each year the 03xx directories are placed into a directory for that semester; e.g., sp19 will be filled with the 03xx directories listed above.

For a one-line example, see below:
nohup cp -i -v -ar /mnt/main/Exp/0309/ /mnt/backup_solution/rsyncTest/
Step 2
Do an archival rsync, which will determine the difference between the initial copy and the current state of the directory. Running this under nohup and including the backupconfirmation.txt is advisable, as follows:
nohup rsync -av /mnt/main/Exp/0309/ /mnt/backup_solution/rsyncTest/ && rsync -av /mnt/main/backup_solution/backupconfirmation.txt /mnt/backup_solution/rsyncTest/ &
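Before committing to a long transfer, rsync's -n (--dry-run) flag will list what would be copied without actually writing anything:
rsync -avn /mnt/main/Exp/0309/ /mnt/backup_solution/rsyncTest/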
The nohup rsync command above is fine if you want to run it from the command line every day, but we want to automate the process. We do that by adding entries to the crontab, which runs the commands at a set time automatically. To do that, we make a special file that we will use to fill the crontab with the correct entries.
Here is the file:
#M H DOM MON DOW  COMMAND
#0 3 * * 1 rsnapshot weekly
#0 3 * * 1 rsync -av /mnt/main/Exp/0309/ /mnt/backup_solution/rsyncTest/
#0 3 * * 1 rsync -av /mnt/main/backup_solution/backupconfirmation.txt /mnt/backup_solution/rsyncTest/
1 0 * * 1 rsync -av /mnt/main/local /mnt/backup_solution/rsync/ && rsync -av /mnt/main/backup_solution/backupconfirmation.txt /mnt/backup_solution/rsync/
0 1 * * 1 rsync -av /mnt/main/scripts /mnt/backup_solution/rsync/ && rsync -av /mnt/main/backup_solution/backupconfirmation.txt /mnt/backup_solution/rsync/
#0 2 * * 1 rsync -av /mnt/main/install /mnt/backup_solution/rsync/ && rsync -av /mnt/main/backup_solution/backupconfirmation.txt /mnt/backup_solution/rsync/
0 3 * * 1 rsync -av /mnt/main/home /mnt/backup_solution/rsync/ && rsync -av /mnt/main/backup_solution/backupconfirmation.txt /mnt/backup_solution/rsync/
1 0 * * 7 rsync -av /mnt/main/corpus /mnt/backup_solution/rsync/ && rsync -av /mnt/main/backup_solution/backupconfirmation.txt /mnt/backup_solution/rsync/
#0 5 * * 1 rsync -av /mnt/main/Exp/sp17 /mnt/backup_solution/rsync/Exp/ && rsync -av /mnt/main/backup_solution/backupconfirmation.txt /mnt/backup_solution/rsync/
#0 6 * * 1 rsync -av /mnt/main/Exp/su17 /mnt/backup_solution/rsync/Exp/ && rsync -av /mnt/main/backup_solution/backupconfirmation.txt /mnt/backup_solution/rsync/
#0 7 * * 1 rsync -av /mnt/main/Exp/sp18 /mnt/backup_solution/rsync/Exp/ && rsync -av /mnt/main/backup_solution/backupconfirmation.txt /mnt/backup_solution/rsync/
#0 8 * * 1 rsync -av /mnt/main/Exp/su18 /mnt/backup_solution/rsync/Exp/ && rsync -av /mnt/main/backup_solution/backupconfirmation.txt /mnt/backup_solution/rsync/
#0 9 * * 1 rsync -av /mnt/main/Exp/0313 /mnt/backup_solution/rsync/Exp/ && rsync -av /mnt/main/backup_solution/backupconfirmation.txt /mnt/backup_solution/rsync/
0 10 * * 1 rsync -av /mnt/main/Exp/0314 /mnt/backup_solution/rsync/Exp/ && rsync -av /mnt/main/backup_solution/backupconfirmation.txt /mnt/backup_solution/rsync/
0 11 * * 1 rsync -av /mnt/main/Exp/0315 /mnt/backup_solution/rsync/Exp/ && rsync -av /mnt/main/backup_solution/backupconfirmation.txt /mnt/backup_solution/rsync/
0 12 * * 1 rsync -av /mnt/main/Exp/0316 /mnt/backup_solution/rsync/Exp/ && rsync -av /mnt/main/backup_solution/backupconfirmation.txt /mnt/backup_solution/rsync/
0 13 * * 1 rsync -av /mnt/main/Exp/0317 /mnt/backup_solution/rsync/Exp/ && rsync -av /mnt/main/backup_solution/backupconfirmation.txt /mnt/backup_solution/rsync/
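Once the entries are ready, load the file into root's crontab on Rome. A sketch, where the filename backup_cron.txt is illustrative (note that crontab <file> replaces the entire existing crontab, so keep all entries in the one file):
crontab backup_cron.txt   # install the entries
crontab -l                # verify what is now scheduled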
As time goes by, do not forget to add new 03xx directories for backup. You will need to check for these periodically, and it will be up to you, as the others will not tell you about the new experiment directories they make.
To check that the backup is working correctly:
SSH into Rome
Go to this directory:
/var/log
And look for these files:
-rw-------. 1 root root  65930 Apr 24 20:30 cron
-rw-------. 1 root root 123063 Mar 31 03:38 cron-20190331
-rw-------. 1 root root 123964 Apr  7 03:10 cron-20190407
-rw-------. 1 root root 127118 Apr 14 04:18 cron-20190414
-rw-------. 1 root root 124757 Apr 21 03:39 cron-20190421
As you can see, it creates a new one after it gets to a certain size.
To make your life easier, run this command against each:
cat cron-20190421 | grep rsync
The above will show that the jobs ran, but what about success or failure? To see that, we go to this directory and read these files:
cd /var/mail
cat root | grep rsync
This will shoot entries right off the screen, so you may want to refine it a little bit more:
tail -f -n 100 root.old <- I changed the file name, as this file was getting way too large for what we need to do; there will be a new root file there soon enough
But that will only give you info on the last items that were added to the root mail file.
So what do we do? First, let's get some idea of what we are looking for. Load the file into nano:
nano root.old
Then search for a date on which you did a backup.
If you know how to read the crontab file, you can see that most of the backups happen at 00:01 AM on Mondays (day 1 of the week).
So what I looked for was "22 Apr 2019", no quotes.
When you do that, you will be brought to the beginning of that day's cron jobs, where you can scroll down to read what happened.
It is up to you to make the success and failure reporting better.
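One possible starting point is a small wrapper that records each rsync's exit status to a single log, so success or failure can be checked with one grep instead of reading the mail spool. This is only a sketch; the script name and log paths are illustrative:
#!/bin/bash
# backup_with_report.sh (hypothetical): run one rsync and log the result
SRC="$1"
DST="$2"
rsync -av "$SRC" "$DST" >> /var/log/backup_rsync.log 2>&1
rc=$?
if [ $rc -eq 0 ]; then
    echo "$(date): OK     rsync $SRC -> $DST" >> /var/log/backup_status.log
else
    echo "$(date): FAILED rsync $SRC -> $DST (exit $rc)" >> /var/log/backup_status.log
fi
The crontab entries would then call the wrapper instead of rsync directly, and grep FAILED /var/log/backup_status.log shows only the problems.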


Note
You should see a backupconfirmation.txt; this is maintained by a cron job on Caesar that appends the date to the text file once an hour, then overwrites the file once a week.
[root@rome backup_solution]# pwd
/mnt/main/backup_solution
[root@rome backup_solution]# cat backupconfirmation.txt
Sun Apr  8 15:00:25 EDT 2018
Sun Apr  8 15:00:29 EDT 2018
Sun Apr  8 16:00:01 EDT 2018
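The Caesar-side crontab entries were not recorded here, but they presumably look something like this (paths as seen through the mounted share):
0 * * * * date >> /mnt/main/backup_solution/backupconfirmation.txt   # append the date hourly
0 0 * * 1 date >  /mnt/main/backup_solution/backupconfirmation.txt   # start the file over each Monday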

Known Issue(s)

Symbolic Links
Symbolic links slow down the backup considerably. These links can be skipped; however, that would not be a true backup. The recommendation for completing an rsync backup while keeping symbolic links is to create a Perl script and run the rsyncs in parallel.
Slow
The copy and rsync processes are very slow. This could potentially be sped up by running multiple rsyncs at once, such as:
  • Rsync directory A
  • Rsync directory B
  • Rsync directory C
Changing the CPU affinity and renice priority had no effect on speed.
Possible Solution
  • Create a parallelization script in Perl to stack rsyncs atop each other (a shell sketch of the idea follows).
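A minimal shell sketch of that idea, since the Perl script was never written (the directory list is illustrative):
#!/bin/bash
# Launch several rsyncs concurrently, then wait for all of them to finish
for dir in scripts local corpus; do
    rsync -av /mnt/main/$dir /mnt/backup_solution/rsync/ &
done
wait   # block until every background rsync has exited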
Space / Storage
If you run tail -20 on Rome:/mnt/backup_solution/rsyncTest/nohup.txt, you will see the following errors:
rsync: recv_generator: mkdir "/mnt/backup_solution/rsyncTest/40/qmanager" failed: No space left on device (28)
*** Skipping any contents from this failed directory ***
If you run df -h on Rome, you will see that /mnt/backup_solution is at 84% of its partition. Given that Lutetia has a 4 TB hard drive, the resolution for this should be to simply extend the size of the partition to utilize the full 4 TB.
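A hypothetical sketch of that resize, assuming the backup filesystem is ext4 on Lutetia's /dev/sdb1; verify the actual device first, unmount the share on Rome, and be careful, as these commands modify the partition table:
parted /dev/sdb resizepart 1 100%   # grow partition 1 to the end of the disk
resize2fs /dev/sdb1                 # grow the ext4 filesystem to fill the partition
df -h                               # confirm the new size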


BACKUPS ARE NOT RUNNING AT THIS TIME (Semester end, 2018)



OLD BACKUP

Archived

The current backup system utilizes a system built on rsync. The system, called rsnapshot, is a Perl-based utility script that invokes rsync and uses Linux's native hard linking to save space (more about that below).

Rsnapshot requires no dependencies other than default Linux tools like Perl, rsync, and cron.


The Basics

rsnapshot is pretty much a prepackaged script that performs backups and rotates them according to your desired retention settings and available space.

Rsnapshot info http://rsnapshot.org


File System Explanation

In order to understand how this system works, it is important to understand how files are handled on Linux systems.

Each file stored on a Linux system is assigned an inode. This is basically a pointer to where the file physically lives on a given hard disk. Suppose we have file1 and we "delete" the file with the rm command (rm file1). On a Linux system this does not actually delete the file from the disk; it simply removes the link/inode telling the system where the file is. Since the system has no paths back to the file, it is considered "deleted".

This accomplishes two things.

1.) The system now knows that that particular space is available to be used by something else (free space).

2.) It makes "deleting" files really quick.

file1 still actually resides on the disk until something else is written over it.

The trick comes in that files can have more than one "link".


Hard links vs. Soft links

Soft links are simply pointers to files, similar to shortcuts in Windows. If the original file is removed the soft link or "shortcut" will no longer work.

Hard links are called this because they actually create another file system pointer (a second directory entry) to the same inode and data. So if we have file1 and a hard link file1-hardlink, the system has two "hard" pointers or paths to the file. If you delete the first file1, you can still access the file with file1-hardlink.

Basically, using rm file1 would change the link count from 2 back to 1.

rsnapshot and rsync utilize this system to create and take multiple backups without consuming a ton of space.

With hard links, file1 might be 100 MB while file1-hardlink takes almost no space, because it simply adds another link to the same inode.
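You can see this for yourself on any Linux system:
echo "hello" > file1
ln file1 file1-hardlink        # create a second hard link to the same inode
ls -li file1 file1-hardlink    # same inode number, link count shows 2
rm file1                       # link count drops back to 1
cat file1-hardlink             # the data is still there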

More detailed explanation here: http://www.mikerubel.org/computers/rsync_snapshots


Config

Currently our system is set up on CapBack, located in IT room 124. This system is a Hyper-V hosted virtual machine.

Current VM details

    Name: CapBack
    IP: 192.168.1.5
    OS: Ubuntu Server 16.04 (no GUI)
    2 GB RAM
    1 vCPU
    13 GB root partition /dev/sda1
    3.6 TB (usable) /dev/sdb1 ext4 partition mounted at /mnt/rsnapshot

Extra Packages

The only package installed in addition to a base Ubuntu Server install was SSHD. This can be selected at the time of install.


Installation

rsnapshot was installed on CapBack with the deb package located at /mnt/main/install/deb/rsnapshot_1.3.1-4_all.deb

Installation command: dpkg -i rsnapshot_1.3.1-4_all.deb


SSH Keys

Both Rome and CapBack were setup with ssh keys to facilitate automatic logins for backup purposes.

Here is a quick guide to generating keys. http://www.rsync.net/resources/howto/ssh_keys.html

SSH keys were generated on Rome and named "romersa & romersa.pub":

[root@rome .ssh]# ls -l
total 24
-rw-r--r--. 1 root root  394 Apr 30 20:18 authorized_keys
-rw-------. 1 root root 1675 Apr 27 20:52 identity
-rw-------. 1 root root 1675 Apr 27 20:59 id_rsa
-rw-r--r--. 1 root root 1993 May  4 15:58 known_hosts
-rw-------. 1 root root 1675 Apr 27 20:48 romersa
-rw-r--r--. 1 root root  399 Apr 27 20:48 romersa.pub


and on CapBack, named "capback and capback.pub":

root@capback:~/.ssh# ls -l
total 16
-rw------- 1 root root 1197 Apr 27 20:57 authorized_keys
-rw------- 1 root root 1679 Apr 30 20:17 capback
-rw-r--r-- 1 root root  394 Apr 30 20:17 capback.pub
-rw-r--r-- 1 root root  442 Apr 27 20:54 known_hosts

The .pub keys are copied to the opposite system and then appended to that system's authorized_keys file. This can be done with the command

"cat capback.pub >> authorized_keys"

SSH Setup

Once the files were in place, the ssh_config file had to be modified to instruct each system to use its private key file automatically.

This file is located at /etc/ssh/ssh_config on both systems.

  1. Enter the following line on each respective system.


Rome

IdentityFile ~/.ssh/romersa

CapBack

IdentityFile ~/.ssh/capback


Config file

/etc/rsnapshot.conf is the configuration file used to set parameters like backup destination, backup source, rotation cycle, etc.


Partial listing of config file

# backup destination parameter
###########################
# SNAPSHOT ROOT DIRECTORY #
###########################
 
# All snapshots will be stored under this root directory.
snapshot_root   /mnt/rsnapshot/
 
#########################################
#           BACKUP INTERVALS            #
# Must be unique and in ascending order #
# i.e. hourly, daily, weekly, etc.      #
#########################################
 
#retain         hourly  6
retain          daily   365
#retain         weekly  4
#retain monthly 3
 
#logfile
logfile /var/log/rsnapshot.log
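After editing, the configuration can be checked with rsnapshot's built-in syntax test (keep in mind that rsnapshot.conf requires tabs, not spaces, between elements):
rsnapshot configtest   # validate the config file
rsnapshot -t daily     # print the commands a daily run would execute, without running them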


Usage

This system is currently set to backup at midnight daily.

Currently it is configured to keep 365 daily backups before removing the oldest. It can be configured to roll weekly and even monthly if you prefer. If space becomes an issue, this might be the best option, because then you can return to a point 5 weeks ago without needing everything in between.

As for space, usage should be watched and the retention settings updated accordingly at some point after the full backup and a few weeks of running ones. The system does not currently have the ability to monitor free space and adjust this on its own, although that functionality could be added in the future.


Because we have defined 365 daily backups, you will find daily.0 through daily.364 (or the latest number reached so far) under /mnt/rsnapshot, each containing the rome backup point. When you run a new backup, the oldest snapshot is removed, each remaining snapshot is shifted up one number, and the newest backup is written to daily.0.
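Restoring is simply a copy out of a snapshot directory. A hypothetical example (the exact path depends on the backup-point layout defined below, and somescript.sh is illustrative):
cp /mnt/rsnapshot/daily.3/rome/mnt/main/scripts/somescript.sh /tmp/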


Current Backup Dirs
As of right now, the CapBack system is connected to Rome (192.168.1.4) via a dedicated network in the 192.168.1.x range. This is done through a small switch in the server room and utilizes the eth1 interface on Rome:

Rome
eth1      Link encap:Ethernet  HWaddr 00:22:19:25:8F:CA
          inet addr:192.168.1.4  Bcast:192.168.1.255  Mask:255.255.255.0


Rome has the NFS share /mnt/main mounted so we can access and read it for backups.

#Active backups - add new lines here
backup  root@192.168.1.4:/mnt/main/scripts      rome/

To add more directories, add extra lines under active backups. Be sure to utilize the exclude options if adding large directories such as /mnt/main or / (an example follows).
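A hypothetical backup line with per-point exclude options (this syntax comes from rsnapshot's documented backup-point options; the excluded paths are only examples, and fields must be tab-separated):
backup  root@192.168.1.4:/mnt/main/     rome/   exclude=corpus,exclude=Exp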

If it is desired to back up Caesar or another server, the easiest way would be to add an interface and give it an IP in the 192.168.1 range, although you could set up keys and tunnel through Rome.


Automation

The system is currently automated using cron. (https://help.ubuntu.com/community/CronHowto)

Basically, this is the Linux equivalent of a task scheduler on Windows. You can view the current crontab settings by using the command "crontab -l".

Current Settings

    # m h  dom mon dow   command
      0 0   *   *   *     rsnapshot daily

This indicates the system will run the command "rsnapshot daily" at midnight daily.

The m stands for minute, h for hour, dom for day of month, mon for month, and dow for day of week.

So in our case, minute 0 of hour 0 is 12:00 AM, and the command runs at 12:00 AM every day of the month, every month, every day of the week.
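To change the schedule, edit root's crontab in place:
crontab -e   # opens the crontab in an editor; changes take effect on save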

VERY OLD Backup System

JUST IGNORE THIS

In Spring 2013, the systems group explored the option of Clonezilla. Clonezilla, which is open source, is software that takes a full image of a disk drive at a time to save information. Backups need to be done manually in this case, and system downtime will also be experienced with this option.

In Spring 2012, the systems group explored the option of Google Code, uploading files from the system that way.