Speech:Backups



2018 (CURRENT) Backup Solution

 * Lutetia is the backup server -- renamed from 'capstonebackup'

Helpful link: http://rsnapshot.org/rsnapshot/docs/docbook/rest.html

See the following general topology before reading: https://foss.unh.edu/projects/images/f/f2/Overview.pdf


 * Let's Begin
 * Rome serves as the go-between for backing up Caesar to Lutetia. Caesar and Lutetia are mounted on Rome at /mnt/main and /mnt/backup_solution, respectively.


 * Step 1
 * For the first backup, make a clean copy of the directory with cp.

nohup cp -i -v -ar /mnt/main/Exp/0309/ /mnt/backup_solution/rsyncTest/


 * Step 2
 * Run an archival rsync, which transfers only the differences between the initial copy and the current state of the directory. Running it under nohup and also copying backupconfirmation.txt is advisable. The sh -c wrapper keeps both rsync commands under nohup, not just the first one:

nohup sh -c 'rsync -av /mnt/main/Exp/0309/ /mnt/backup_solution/rsyncTest/ && rsync -av /mnt/main/backup_solution/backupconfirmation.txt /mnt/backup_solution/rsyncTest/' &


 * Note
 * You should see a backupconfirmation.txt file. A cron job on Caesar appends the date to this file once an hour and overwrites the file once a week.

[root@rome backup_solution]# pwd
/mnt/main/backup_solution
[root@rome backup_solution]# cat backupconfirmation.txt
Sun Apr 8 15:00:25 EDT 2018
Sun Apr 8 15:00:29 EDT 2018
Sun Apr 8 16:00:01 EDT 2018
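The hourly-append, weekly-overwrite behavior described in the note can be sketched with the shell's >> and > redirections. The actual crontab on Caesar is not shown on this page, so the file location and both function names below are assumptions for illustration only.

```shell
#!/usr/bin/env bash
# Sketch only: CONFIRM_FILE and both function names are hypothetical.
CONFIRM_FILE="${CONFIRM_FILE:-$(mktemp)}"

# Hourly job: append the current date as a new line (>> keeps old lines).
append_stamp() { date >> "$CONFIRM_FILE"; }

# Weekly job: start the file over with a single date (> truncates first).
reset_stamp() { date > "$CONFIRM_FILE"; }

append_stamp
append_stamp
wc -l < "$CONFIRM_FILE"   # line count grows with each hourly run
reset_stamp
wc -l < "$CONFIRM_FILE"   # back to a single line after the weekly reset
```

In a crontab the two jobs would be scheduled separately, for example with `0 * * * *` for the hourly append and `0 0 * * 0` for the weekly reset. Those schedules are assumptions, not the values used on Caesar.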

Known Issue(s)

 * Symbolic Links
 * Symbolic links slow down the backup considerably. These links can be skipped, but the result would not be a true backup. The recommendation for completing an rsync backup while keeping symbolic links is to create a Perl script and run the rsyncs in parallel.


 * Slow
 * The cp and rsync processes are very slow. This could potentially be sped up by running multiple rsyncs at once, such as:


 * Rsync directory A
 * Rsync directory B
 * Rsync directory C
 * Changing the CPU affinity and renice priority had no effect on speed.


 * Possible Solution
 * Create a parallelization script in Perl to stack rsyncs atop each other.
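As a rough alternative to the Perl script proposed above, the same idea can be sketched in shell with xargs -P, which runs several commands at once. This is not the group's implementation: the function name, the default job count, and the choice to split the work by top-level subdirectory are all assumptions, and whether it actually speeds up a copy across the mounts is untested.

```shell
#!/usr/bin/env bash
# Sketch only: parallel_rsync and its arguments are hypothetical names.
# Runs one "rsync -a" per top-level subdirectory of SRC, JOBS at a time.
parallel_rsync() {
    local src="$1" dst="$2" jobs="${3:-4}"
    mkdir -p "$dst"
    # -print0 / -0 keep directory names containing spaces intact.
    find "$src" -mindepth 1 -maxdepth 1 -type d -print0 |
        xargs -0 -P "$jobs" -I{} sh -c '
            name=$(basename "$1")
            rsync -a "$1/" "$2/$name/"
        ' _ {} "$dst"
    # Files sitting directly in SRC are not covered by the per-directory
    # jobs, so pick them up in one final pass that skips all directories.
    rsync -a --exclude="*/" "$src/" "$dst/"
}

# Example (paths taken from the steps above):
# parallel_rsync /mnt/main/Exp/0309 /mnt/backup_solution/rsyncTest 4
```

Because rsync -a preserves symbolic links as links, this keeps the links discussed above while spreading the directory walk over several processes.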


 * Space / Storage
 * If you run tail -20 nohup.txt in /mnt/backup_solution/rsyncTest/ on Rome, you will see the following errors:

rsync: recv_generator: mkdir "/mnt/backup_solution/rsyncTest/40/qmanager" failed: No space left on device (28)
*** Skipping any contents from this failed directory ***
 * If you run df -h on Rome, you will see that /mnt/backup_solution reports 84% of the partition in use. Given that Lutetia has a 4TB hard drive, the fix should be to extend the partition so that it uses the full 4TB.
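The df check above can also be scripted so that a full partition is noticed before rsync starts failing. The following is a minimal sketch; the function names and the 90% threshold are assumptions, not part of the existing setup.

```shell
#!/usr/bin/env bash
# Sketch only: usage_percent, check_space and the 90% default are hypothetical.

# Print the "Use%" value for the filesystem holding the given path,
# without the trailing % sign.
usage_percent() {
    # df -P forces the one-line POSIX format, so column 5 is the capacity.
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Warn and return non-zero when usage is at or above the limit.
check_space() {
    local path="$1" limit="${2:-90}" used
    used=$(usage_percent "$path")
    if [ "$used" -ge "$limit" ]; then
        echo "WARNING: $path is ${used}% full (limit ${limit}%)" >&2
        return 1
    fi
    echo "$path is ${used}% full"
}

# Example: check_space /mnt/backup_solution 90
```

A check like this could run before each backup, for instance as `check_space /mnt/backup_solution 90 && rsync ...`, so that a nearly full partition stops the run early.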


 * BACKUPS ARE NOT RUNNING AT THIS TIME (Semester end, 2018)

=OLD BACKUP=

Archived
This backup system is built on rsync, using rsnapshot, a Perl-based utility script that invokes rsync and uses native Linux hard links to save space (more about that below).

rsnapshot has no dependencies other than standard Linux tools such as Perl, rsync, and cron.

The Basics

rsnapshot is essentially a prepackaged script that performs backups and rotates them according to your retention settings and available space.

rsnapshot info: http://rsnapshot.org

File System Explanation

To understand how this system works, it is important to understand how files are handled on Linux systems.

Each file stored on a Linux system is assigned an inode, which records where the file's data physically lives on a given disk. A file name is a link that points to that inode. Suppose we have file1 and we "delete" the file with the rm command (rm file1). On a Linux system this does not actually erase the file's data from the disk; it simply removes the link telling the system where the file is. Once the system has no paths back to the file, it is considered "deleted".

This accomplishes two things.

1.) The system now knows that that particular space is available to be used by something else (free space)

2.) Makes "deleting" files really quick.

file1 still actually resides on the disk until something else is written over it.

The trick comes in that files can have more than one "link".

Hard links vs. Soft links

Soft links are simply pointers to files, similar to shortcuts in Windows. If the original file is removed the soft link or "shortcut" will no longer work.

Hard links are so called because they create another file system entry (a second name) that points directly to the same inode, and therefore to the same data. So if we have file1 and a hard link file1-hardlink, the system has two "hard" pointers or paths to the file. If you delete file1, you can still access the file through file1-hardlink.

Basically, using rm file1 would change the link count from 2 back to 1.

rsnapshot and rsync utilize this system to create and take multiple backups without consuming a ton of space.

With hard links, file1 might be 100 MB, while file1-hardlink takes almost no additional space, because it is just another directory entry pointing to the same inode.
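The link behavior described above can be tried out safely in a scratch directory. This is a small sketch using the file names from the text; it assumes GNU coreutils, since the stat -c format option is GNU-specific.

```shell
#!/usr/bin/env bash
# Sketch only: runs in a throwaway directory; file names follow the text above.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

echo "speech data" > file1
ln file1 file1-hardlink        # second name for the same inode
ln -s file1 file1-softlink     # soft link: only stores the path "file1"

# GNU stat: %i = inode number, %h = number of hard links.
stat -c '%n inode=%i links=%h' file1 file1-hardlink

rm file1                       # removes one name; the data stays on disk
cat file1-hardlink             # still readable through the other name
stat -c '%n links=%h' file1-hardlink
cat file1-softlink 2>/dev/null || echo "soft link is now broken"
```

Both names report the same inode number before the rm, and the remaining name still returns the data afterwards, while the soft link no longer resolves.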

More detailed explanation: http://www.mikerubel.org/computers/rsync_snapshots

Config

Currently our system is set up on CapBack, which is located in IT room 124. This system is a Hyper-V hosted virtual machine.

Current VM details

 * Name: CapBack
 * IP: 192.168.1.5
 * OS: Ubuntu Server 16.04 (no GUI)
 * 2GB RAM
 * 1 vCPU
 * 13 GB root partition /dev/sda1
 * 3.6TB (usable) /dev/sdb1 ext4 partition mounted at /mnt/rsnapshot

Extra Packages

The only package installed in addition to a base Ubuntu Server install was SSHD. This can be selected at the time of install.

Installation

rsnapshot was installed on CapBack with the deb package located at /mnt/main/install/deb/rsnapshot_1.3.1-4_all.deb

Installation command: dpkg -i rsnapshot_1.3.1-4_all.deb

SSH Keys

Both Rome and CapBack were set up with SSH keys to facilitate automatic logins for backup purposes.

Here is a quick guide to generating keys. http://www.rsync.net/resources/howto/ssh_keys.html

SSH keys were generated on both systems: on Rome, named "romersa" and "romersa.pub", and on CapBack, named "capback" and "capback.pub".
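Key generation, together with the authorized_keys step this section goes on to describe, might look like the following sketch. The helper names, the 4096-bit key size, and the empty passphrase are assumptions; the page does not record the exact ssh-keygen options that were used.

```shell
#!/usr/bin/env bash
# Sketch only: make_keypair and authorize_key are hypothetical helper names.

# Create an RSA key pair with an empty passphrase so that unattended jobs
# can log in. Produces NAME (private key) and NAME.pub (public key).
make_keypair() {
    ssh-keygen -q -t rsa -b 4096 -N "" -f "$1"
}

# Append a public key to an authorized_keys file, skipping exact duplicates.
authorize_key() {
    local pubkey="$1" auth_file="$2"
    mkdir -p "$(dirname "$auth_file")"
    touch "$auth_file"
    chmod 600 "$auth_file"
    grep -qxF "$(cat "$pubkey")" "$auth_file" || cat "$pubkey" >> "$auth_file"
}

# Example, on Rome:    make_keypair romersa
# Example, on Rome after copying capback.pub over:
#                      authorize_key capback.pub ~/.ssh/authorized_keys
```

The duplicate check matters only when the step is repeated; a plain append, as in the command on this page, gives the same result on the first run.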

The .pub keys are copied to the opposite system and then appended to the authorized_keys file. This can be done with the command

"cat capback.pub >> authorized_keys"

SSH Setup

Once the files are in place, the ssh_config file has to be modified to instruct each system to use its private key file automatically.

This file is located at /etc/ssh/ssh_config on both systems.


 * 1) Enter the following line on each respective system.

Rome

CapBack

Config file

/etc/rsnapshot.conf is the configuration file used to set parameters such as the backup destination, backup source, and rotation cycle.

Partial listing of config file

Usage

This system is currently set to back up at midnight daily.

It is currently configured to keep 365 daily backups before removing older ones. It can also be configured to rotate weekly and even monthly if you prefer. If space becomes an issue, this might be the best option, because you can then return to a point 5 weeks ago without keeping every daily backup in between.

Space usage should be watched, and the retention settings adjusted accordingly, at some point after the full backup and a few weeks of incremental ones. The system cannot currently monitor free space and adjust retention on its own, although that functionality could be added in the future.

Because we have defined 365 daily backups, you will find daily.0 through daily.364 (or the latest number reached so far) in the /mnt/rsnapshot/rome folder. When a new backup runs, the oldest snapshot is removed once the rotation is full, the remaining snapshots are shifted up by one, and the newest backup is written to daily.0.
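The rotation described above can be illustrated with a short shell sketch in the spirit of the rsync snapshot technique linked earlier. It is not rsnapshot's actual code: the function name is hypothetical, and it assumes GNU cp for the -l (hard-link) option.

```shell
#!/usr/bin/env bash
# Sketch only: rotate_snapshots is a hypothetical name, not rsnapshot code.
# Keeps at most KEEP directories named daily.0 .. daily.(KEEP-1) under ROOT.
rotate_snapshots() {
    local root="$1" keep="$2" i
    # Drop the oldest snapshot once the rotation is full.
    rm -rf "$root/daily.$((keep - 1))"
    # Shift daily.N to daily.N+1, oldest first.
    i=$((keep - 2))
    while [ "$i" -ge 1 ]; do
        if [ -d "$root/daily.$i" ]; then
            mv "$root/daily.$i" "$root/daily.$((i + 1))"
        fi
        i=$((i - 1))
    done
    # Hard-link copy of the newest snapshot: new names, no new data blocks.
    if [ -d "$root/daily.0" ]; then
        cp -al "$root/daily.0" "$root/daily.1"
    fi
    mkdir -p "$root/daily.0"
}

# Example: rotate_snapshots /mnt/rsnapshot/rome 365
# A following "rsync -a --delete SOURCE/ /mnt/rsnapshot/rome/daily.0/" would
# then replace only the files that changed, leaving daily.1 untouched.
```

Unchanged files end up shared between daily.0 and daily.1 through hard links, which is why many snapshots cost little more space than one full copy plus the changes.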

Current Backup Dirs

As of right now, the CapBack system is connected to Rome (192.168.1.4) via a dedicated network in the 192.168.1.x range. This is done through a small switch in the server room and uses the eth1 interface on Rome.

Rome has the NFS share /mnt/main mounted so we can access and read it for backups.

To add more directories, add extra lines under active backups. Be sure to use the exclude options when adding large directories such as /mnt/main or /.

If you want to back up Caesar or another server, the easiest way would be to add an interface and give it an IP in the 192.168.1 range, although you could also set up keys and tunnel through Rome.

Automation

The system is currently automated using cron. (https://help.ubuntu.com/community/CronHowto)

Basically, cron is a task scheduler, comparable to Task Scheduler on Windows. You can view the current crontab settings by using the command "crontab -l"

Current Settings

# m h dom mon dow   command
0 0  *   *   *     rsnapshot daily

This indicates the system will run the command "rsnapshot daily" at midnight daily.

The m stands for minute, h for hour, dom for day of month, mon for month, and dow for day of week.

So in our case, minute 0 and hour 0 mean 12:00am, and the asterisks mean every day of the month, every month, and every day of the week.
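To make the field order concrete, the entry above can be split into its named fields with the shell's read builtin. This is only an illustration of how the line is laid out, not something cron itself runs.

```shell
#!/usr/bin/env bash
# Sketch only: splits the crontab entry above into its named fields.
entry='0 0 * * * rsnapshot daily'

# read assigns one word per variable; the last variable takes the rest.
# The expansions below are quoted so the asterisks are not treated as globs.
read -r m h dom mon dow cmd <<EOF
$entry
EOF

echo "minute=$m hour=$h day-of-month=$dom month=$mon day-of-week=$dow"
echo "command=$cmd"
```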

=VERY OLD Backup System=

JUST IGNORE THIS
In Spring 2013, the systems group explored Clonezilla, an open-source tool that saves information by taking a full image of a disk drive at a time. With this option, backups need to be done manually, and the system experiences downtime while they run.
 * Clonezilla

In Spring 2012, the systems group explored using Google Code and uploading files from the system that way.


 * Google Code