automatic backups: HOWTO







Things needed

  • Your machine (Referred to as "guest-machine")
  • A *nix machine to back up to (Referred to as "backup-machine")
  • An account on backup-machine (Your account name is assumed to be "youraccount")
  • SSH (secure shell) access to backup-machine. Most *nix machines run an SSH server. To test whether backup-machine accepts SSH logins, log in to backup-machine and type "ssh 127.0.0.1" at the command line (without the quotes, of course).
We will do the following steps.

1. Set up SSH login without passwords

We will create a public-private key pair without a passphrase on guest-machine. We then copy the public key to backup-machine. This will allow us to ssh into backup-machine from guest-machine without a password.

Quick steps without explanations

  • guest-machine: ssh-keygen -t dsa
  • guest-machine: cd .ssh/
  • guest-machine: chmod 600 id_dsa
  • guest-machine: scp id_dsa.pub youraccount@backup-machine:
  • guest-machine: ssh backup-machine -l youraccount
  • backup-machine: cat id_dsa.pub >> ./.ssh/authorized_keys2
  • backup-machine: logout
  • guest-machine: ssh backup-machine -l youraccount

Steps (with rants)

Log in to guest-machine. Open a terminal and type

ssh-keygen -t dsa

This is followed by your machine spitting out

"Generating public/private dsa key pair."

or some such thing. Wait. This is followed by

"Enter file in which to save the key: .ssh/id_dsa".

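Press Enter to accept the default location, then press Enter twice more to leave the passphrase empty. (Recent OpenSSH releases no longer accept DSA keys. If yours refuses, use "ssh-keygen -t ed25519" or "ssh-keygen -t rsa" instead and substitute id_ed25519 or id_rsa for id_dsa in everything that follows.) Change to the .ssh directory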

cd .ssh/

You should see two files, id_dsa and id_dsa.pub. Make sure that only you can read the private key by typing

chmod 600 id_dsa

If someone steals your id_dsa, you are screwed: it has no passphrase, so whoever holds it can log in to backup-machine as you. Now type

scp id_dsa.pub youraccount@backup-machine:

Type your password. Log in to backup-machine

ssh backup-machine -l youraccount

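There should be an id_dsa.pub in your home directory. Append it to the end of the file "authorized_keys2" in your .ssh/ directory. Don't worry if you don't have an "authorized_keys2" already; the command below creates it. (If the .ssh/ directory itself is missing, create it first with "mkdir .ssh" and "chmod 700 .ssh".)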

cat id_dsa.pub >> ./.ssh/authorized_keys2

NOTE: Sometimes you will have a ".ssh2" directory instead of a ".ssh" directory. Current OpenSSH installations normally use "authorized_keys" rather than "authorized_keys2". Make appropriate changes. Log out of backup-machine.

logout

Now from guest-machine, test it

ssh backup-machine -l youraccount

You should be logged in to backup-machine without typing a password. How does this work? Basically, guest-machine uses id_dsa (your private key) to sign a piece of data tied to the current session, and backup-machine checks that signature against id_dsa.pub (your public key) to confirm that guest-machine is authorized to log in. It's just like backup-machine demanding a secret password from guest-machine, except that the secret itself never leaves guest-machine. That's why it's a bad idea for somebody else to get access to the id_dsa file.
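If you end up doing the append step on more than one machine, it can be wrapped in a small function so that running it twice does not add the key twice. This is only a sketch: "add_key" is a made-up name, and it assumes the key file is called "authorized_keys" (change it if your server wants "authorized_keys2").

```shell
# Hypothetical helper, meant to be run on backup-machine.
# Usage: add_key PUBKEY_FILE [SSH_DIR]     e.g. add_key "$HOME/id_dsa.pub"
add_key() {
    pubkey_file=$1
    ssh_dir=${2:-$HOME/.ssh}
    # Assumption: the server reads "authorized_keys"; adjust if needed.
    auth_file=$ssh_dir/authorized_keys

    # sshd commonly ignores key files that other users could modify,
    # so keep the directory and the file private.
    mkdir -p "$ssh_dir" || return 1
    chmod 700 "$ssh_dir" || return 1
    touch "$auth_file" || return 1
    chmod 600 "$auth_file" || return 1

    # Append only if this exact key line is not already in the file.
    if ! grep -qxF -f "$pubkey_file" "$auth_file"; then
        cat "$pubkey_file" >> "$auth_file"
    fi
}
```

The second argument exists mainly so you can try the function on a scratch directory before pointing it at your real .ssh/.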

2. Create scripts to do actual backup

We will now create two scripts, "backup" and "cleanbackup", for backing up stuff on guest-machine. We will also create a file ".excludes.txt" that lists the directories we DO NOT want to back up.

Quick steps without explanations

  • backup-machine: mkdir guest-backup
  • backup-machine: logout
  • guest-machine: cat > .excludes.txt
    my_pirated_music/
    my_world_domination_plans/

  • guest-machine: mkdir tmp
  • guest-machine: mkdir scripts
  • guest-machine: cd scripts
  • guest-machine: cat > backup
    #!/bin/sh
    rsync -e ssh -avz --log-format="%o %t %f" --exclude-from=/home/youraccount/.excludes.txt /home/youraccount/ youraccount@backup-machine:guest-backup/ > /home/youraccount/tmp/rsynclog.txt
    scp /home/youraccount/tmp/rsynclog.txt youraccount@backup-machine:guest-backup/

  • guest-machine: cat > cleanbackup
    #!/bin/sh
    rsync -e ssh -avz --delete --log-format="%o %t %f" --exclude-from=/home/youraccount/.excludes.txt /home/youraccount/ youraccount@backup-machine:guest-backup/ > /home/youraccount/tmp/rsynclog.txt
    scp /home/youraccount/tmp/rsynclog.txt youraccount@backup-machine:guest-backup/

  • guest-machine: chmod 755 backup
  • guest-machine: chmod 755 cleanbackup

Steps (with rants)

On backup-machine, create a directory guest-backup

mkdir guest-backup

This will be the directory in which we back up files from guest-machine. On guest-machine, create a file .excludes.txt in your home directory and list the directories you *DO NOT* want to back up. For example, your .excludes.txt can contain lines like

my_pirated_music/
my_world_domination_plans/

In your home directory, create a directory "tmp" (if it doesn't already exist) and a directory "scripts"

mkdir tmp
mkdir scripts
cd scripts/

Fire up your favorite editor, create a file called "backup", and in it type

#!/bin/sh
rsync -e ssh -avz --log-format="%o %t %f" --exclude-from=/home/youraccount/.excludes.txt /home/youraccount/ youraccount@backup-machine:guest-backup/ > /home/youraccount/tmp/rsynclog.txt

scp /home/youraccount/tmp/rsynclog.txt youraccount@backup-machine:guest-backup/

This asks rsync to back up your home directory (assuming it is /home/youraccount) on guest-machine to the "guest-backup/" directory on backup-machine. The output is saved in a log file called "rsynclog.txt", which is then copied over to the guest-backup/ directory on backup-machine, so you know which files were backed up and when. Note that this DOES NOT back up the directories listed in the .excludes.txt file. Now create a file "cleanbackup"

#!/bin/sh
rsync -e ssh -avz --delete --log-format="%o %t %f" --exclude-from=/home/youraccount/.excludes.txt /home/youraccount/ youraccount@backup-machine:guest-backup/ > /home/youraccount/tmp/rsynclog.txt

scp /home/youraccount/tmp/rsynclog.txt youraccount@backup-machine:guest-backup/

Make sure "backup" and "cleanbackup" have execute permissions

chmod 755 backup
chmod 755 cleanbackup

Note that "backup" and "cleanbackup" are identical except for the magic words

--delete

in "cleanbackup". What "backup" does is copy all your files from guest-machine to backup-machine. Assume you had a file "foo.txt" on guest-machine and then ran the "backup" script. Now assume you deleted "foo.txt" and ran "backup" again. "foo.txt" is not removed from backup-machine. This is a good thing: you can recover files that you accidentally deleted. However, once in a while we would like to remove junk from our backups, and that's where "cleanbackup" comes in. It removes "foo.txt" from backup-machine.
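If you want to watch the foo.txt story happen before trusting it with real files, the sketch below replays it on two throwaway local directories, so no ssh or backup-machine is involved. The function name "demo_delete" and the file "bar.txt" are made up for the demonstration; only rsync's -a and --delete options are the real thing.

```shell
# Hypothetical local replay of "backup" versus "cleanbackup".
# src/ stands in for /home/youraccount/, dst/ for guest-backup/.
demo_delete() {
    demo=$(mktemp -d) || return 1
    mkdir "$demo/src" "$demo/dst"
    echo "hello" > "$demo/src/foo.txt"
    echo "world" > "$demo/src/bar.txt"

    # First "backup": both files are copied.
    rsync -a "$demo/src/" "$demo/dst/"

    # Accidentally delete foo.txt on the source side.
    rm "$demo/src/foo.txt"

    # "backup" again: no --delete, so foo.txt stays in the copy.
    rsync -a "$demo/src/" "$demo/dst/"
    echo "after backup:      $(ls "$demo/dst" | tr '\n' ' ')"

    # "cleanbackup": --delete removes what no longer exists in src/.
    rsync -a --delete "$demo/src/" "$demo/dst/"
    echo "after cleanbackup: $(ls "$demo/dst" | tr '\n' ' ')"

    rm -rf "$demo"
}
```

Calling demo_delete prints the contents of the copy after each step. foo.txt should still be listed after the plain backup and gone after the one with --delete.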

3. Automating the backups

We will now use "cron" to run our backup script daily.

Quick steps without explanations

  • guest-machine: cd scripts
  • guest-machine: cat > cronjob.txt
    # rsync at 2:30am every morning
    30 2 * * * /home/youraccount/scripts/backup
    # rsync at 7 am every first of the month (with delete)
    0 7 1 * * /home/youraccount/scripts/cleanbackup

  • guest-machine: crontab cronjob.txt

Steps (with rants)

We will use "cron" to automate our backups. On guest-machine, create a file "cronjob.txt" and in it type

# rsync at 2:30am every morning
30 2 * * * /home/youraccount/scripts/backup

# rsync at 7 am every first of the month (with delete)
0 7 1 * * /home/youraccount/scripts/cleanbackup

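Now set your crontab. Careful: this replaces any crontab you already have. (The "-u youraccount" option is left out because most cron implementations allow it only for root; without it, crontab works on your own account.)

crontab cronjob.txt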

You can see your cron jobs by typing

crontab -l

Your machine should spit back whatever was in cronjob.txt. What we have done is instruct "cron" to run our "backup" script at 2:30 every morning and our "cleanbackup" script at 7:00 on the first of every month. Of course you can choose not to run "cleanbackup" at all, or run your scripts at some other time.
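If the five numbers and stars in a cron line look like line noise, the sketch below splits one line into its named fields. "explain_cron" is a made-up helper, not part of cron; it only handles plain entries like the two in cronjob.txt.

```shell
# Hypothetical helper: label the fields of one crontab entry.
explain_cron() {
    # read splits on whitespace; the last variable gets the rest of the line.
    read -r minute hour dom month dow cmd <<EOF
$1
EOF
    echo "minute=$minute hour=$hour day-of-month=$dom month=$month day-of-week=$dow"
    echo "command=$cmd"
}

explain_cron '30 2 * * * /home/youraccount/scripts/backup'
explain_cron '0 7 1 * * /home/youraccount/scripts/cleanbackup'
```

A star means "every", so the first entry fires at minute 30 of hour 2 on every day, and the second at 7:00 on day 1 of every month.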

After your first backup, be sure to examine the "rsynclog.txt" file. It has a list of all files backed up and the time of the backup.
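You do not have to wait until 2:30 to see what the log looks like. The sketch below wraps the same rsync options in a function whose source, destination, excludes file and log file are arguments, so it can be pointed at scratch directories first. "run_backup" and its argument order are assumptions made for this example. Unlike the scripts above, it also records rsync's exit status in the log when something goes wrong.

```shell
# Hypothetical wrapper around the rsync line from the "backup" script.
# Usage: run_backup SRC DEST EXCLUDES_FILE LOG_FILE
run_backup() {
    src=$1
    dest=$2
    excludes=$3
    log=$4

    # %o = operation, %t = date and time, %f = file name
    rsync -e ssh -avz --log-format="%o %t %f" \
        --exclude-from="$excludes" "$src" "$dest" > "$log"
    status=$?

    # Leave a trace in the log if rsync reported a problem.
    if [ "$status" -ne 0 ]; then
        echo "rsync failed with exit status $status" >> "$log"
    fi
    return "$status"
}

# Example with the names used in this document (commented out because it
# needs the real backup-machine):
# run_backup /home/youraccount/ youraccount@backup-machine:guest-backup/ \
#     /home/youraccount/.excludes.txt /home/youraccount/tmp/rsynclog.txt
```

When DEST is a local directory, rsync does not use ssh at all, which makes a dry run on two scratch directories a cheap way to check your .excludes.txt.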

Have fun! I will put up a Windows version of this document. It involves setting up Cygwin, creating a batch file to do the rsync, and using the Windows scheduler or cron to schedule jobs.