Linux backups like Time Machine with rsync hard links

Time Machine is a very powerful yet unobtrusive backup utility on the Mac. It would be nice to have something comparable on Linux. With just a few lines of shell script and rsync you get some of its features.

There is already a good tutorial: Time Machine for every Unix out there

The following script adds some features:

  • Create hourly backups
  • Keep daily and weekly backups and delete old ones
  • Prevent the script from running more than once at a time (lockfile)

Feel free to use and adapt it to your needs!

Update 08.03.2012: Mondane wrote an enhanced version of my script. Since WordPress broke the code and made the comments hard to read, please find his code attached: backup.sh.zip

Update 03.03.2013: Mondane sent me a new version with fancy Ubuntu Unity notifications (see his comment). You can download it here: timebackup_3-mar-2013.tar. Thanks!



#!/bin/sh

# settings
backup="/home"
target="/mnt/backup/"

# date for this backup
date=`date "+%Y-%m-%dT%H_%M_%S"`

# check and create lockfile
if [ -f ${target}lockfile ]
then
    echo "Lockfile exists, backup stopped."
    exit 2
else
    touch ${target}lockfile
fi

# create folders if necessary
if [ ! -e ${target}current ]
then
    mkdir ${target}current
fi
if [ ! -d ${target}weekly ]
then
    mkdir ${target}weekly
fi
if [ ! -d ${target}daily ]
then
    mkdir ${target}daily
fi
if [ ! -d ${target}hourly ]
then
    mkdir ${target}hourly
fi

# rsync the source into a new, timestamped snapshot; unchanged files
# become hard links into the previous snapshot (current)
rsync \
    --archive \
    --xattrs \
    --human-readable \
    --delete \
    --link-dest=${target}current \
    $backup \
    $target$date-incomplete

# backup complete: move it into hourly/ and point "current" at it
mv $target$date-incomplete ${target}hourly/$date
rm -r ${target}current
ln -s ${target}hourly/$date ${target}current
touch ${target}hourly/$date

# keep a daily backup: if daily/ holds no backup newer than 2 days and
# hourly/ holds more than one backup, move the oldest hourly to daily/
if [ `find ${target}daily -maxdepth 1 -type d -mtime -2 -name "20*" | wc -l` -eq 0 ] && [ `find ${target}hourly -maxdepth 1 -name "20*" | wc -l` -gt 1 ]
then
    oldest=`ls -1 -tr ${target}hourly/ | head -1`
    mv ${target}hourly/$oldest ${target}daily/
fi

# keep a weekly backup: same pattern, from daily/ to weekly/
if [ `find ${target}weekly -maxdepth 1 -type d -mtime -14 -name "20*" | wc -l` -eq 0 ] && [ `find ${target}daily -maxdepth 1 -name "20*" | wc -l` -gt 1 ]
then
    oldest=`ls -1 -tr ${target}daily/ | head -1`
    mv ${target}daily/$oldest ${target}weekly/
fi

# delete hourly backups older than a day and daily backups older than a week
find ${target}hourly -maxdepth 1 -type d -mtime +0 | xargs rm -rf
find ${target}daily -maxdepth 1 -type d -mtime +7 | xargs rm -rf

# remove lockfile
rm ${target}lockfile
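To get hourly backups, the script can simply be run from cron. A crontab entry like the following would do it (the script path and log location are assumptions):

```shell
# m h dom mon dow  command -- run the backup at the top of every hour
0 * * * * /usr/local/bin/backup.sh >> /var/log/backup.log 2>&1
```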

27 comments

  1. Mondane

    Hi,

    How hard is it to change this script to include monthly and yearly rotation? Are you willing to do it?

    What I want:

    1) rsync hourly, this can be setup as a cron, to a ssh remote location
    2) save last x hours, probably up to the last 24 (PC isn’t running the whole day)
    3) save last x days, probably the last 6 days
    4) save last x weeks, probably just the last month, ie 4 /5 weeks
    5) save last x months, probably the last 11 months
    6) save last x years, probably the last 2 years (cumulative it would be possible to go back 3 years, the first year per month and the other 2 years with a year per step
    7) remove obsolete backups

    PS I don’t quite understand the rotation and removal in this script, is this correct:

    1) execute rsync to a folder with the current date+time
    2) after backup is complete, remove current folder and link it to the last backup
    3) create a file in the current folder with the filename -date+time- of the backup
    4) save a daily backup
    — what are the criteria, I don’t understand the find for daily and hourly, is 20* hardcoded for this century?
    5) save a weekly backup
    —- what are the criteria, I don’t understand the find for weekly and daily, is 20* hardcoded for this century?
    6) delete old backups
    —- which files are deleted, what are the criteria?

  2. jan

    How hard is it to change this script to include monthly and yearly rotation?

    Very easy.

    Are you willing to do it?

    I’m sure you can manage that ;) It will surely help if you read the article linked in my post.

    1) rsync hourly, this can be setup as a cron, to a ssh remote location

    Just setup a cron entry for the script and change rsync destination to your remote server

    2) save last x hours… 7) remove obsolete backups

    That’s exactly what this script does. You just have to add monthly and yearly rotation.

    — what are the criteria, I don’t understand the find for daily and hourly, is 20* hardcoded for this century?

    man find helps with that. 20* is hardcoded to ignore other files that may exist.
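The rotation test can be sketched with a throwaway layout; all paths and folder names below are assumptions for the demo. A backup is rotated from hourly/ to daily/ only when daily/ holds no "20*" folder newer than 2 days and hourly/ still holds more than one backup:

```shell
#!/bin/sh
# sketch of the hourly-to-daily rotation test, using /tmp paths
target=/tmp/rotation-demo/
rm -rf $target
mkdir -p ${target}hourly ${target}daily
mkdir ${target}hourly/2012-01-01T00_00_00 ${target}hourly/2012-01-01T01_00_00
# backdate one hourly folder so it is unambiguously the oldest
touch -t 201201010000 ${target}hourly/2012-01-01T00_00_00

# count daily backups newer than 2 days and all hourly backups
recent_daily=`find ${target}daily -maxdepth 1 -type d -mtime -2 -name "20*" | wc -l`
hourly_count=`find ${target}hourly -maxdepth 1 -name "20*" | wc -l`

if [ $recent_daily -eq 0 ] && [ $hourly_count -gt 1 ]
then
    # ls -tr sorts by modification time, oldest first
    oldest=`ls -1 -tr ${target}hourly/ | head -1`
    echo "would move $oldest to daily/"
fi
```

With the layout above, the test passes and the backdated folder is the one that would be rotated.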

  3. Mondane

    Lol. I’m working on it. One question, why do you ‘touch’ the new folder rsync created in this line?

    touch ${target}hourly/$date

  4. Mondane

    Just another question: most of the time you use ${target}, but sometimes not. Why is that? And why don’t the other variables have {}, e.g. $oldest?

  5. jan

    why do you ‘touch’ the new folder rsync created in this line?

    To set the modified time of the folder to the time when the backup finished. That time is used by the find commands to move or delete old backups.

    most of the time you use ${target} , but sometimes not, why is that? And why don’t the other variables have {} ie $oldest?

    The {} are not strictly necessary, but they tell bash where a variable name ends within a longer string.

    I really recommend reading this: http://tldp.org/LDP/Bash-Beginners-Guide/html/
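The point about braces can be shown in two lines (the variable names here are just for illustration):

```shell
#!/bin/sh
# braces delimit the variable name when more text follows directly
target=/mnt/backup/
echo "${target}hourly"    # expands $target, then appends "hourly"
echo "$targethourly"      # expands the unset variable 'targethourly': empty
```

Without the braces, the shell reads the whole run of characters as one variable name, which is why ${target}hourly works and $targethourly does not.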

  6. Mondane

    I have made some adjustments, but I believe the rotation and deletion part doesn’t work properly. I touched a folder in hourly to be 2 days old, and it was moved to daily properly the next time the script ran. But after that, a new daily was created and the old one was removed instead of being moved to weekly.

    Would you have a look at it?

    Removed Code and added to post. / Jan-Kaspar

  7. jan

    I’ve read your script carefully and found no error. BTW nice comments, I’m always too lazy ;)

    The step

    # Replace old current symlink with newly created symlink, it’s done with mv to
    # avoid having a moment where there is no symlink ‘current’ (-f avoids prompt
    # when a target already exists).

    is not necessary in my opinion. The lockfile prevents the script from running more than once, and nobody will notice if the “latest” symlink does not exist for a second.

    I tested the following (maybe more natural than your test):
    - run script twice
    - the oldest will be moved to daily
    - then touched the folder in daily to be 3 days old
    - at next backup a new daily will be created and the oldest in daily moved to weekly

    Congratulations :)

  8. Mondane

    Thanks for the code review and testing. I’m going to implement the script and keep you posted if anything should be changed.

    PS You’re right about the removal of symlinks, with the lockfile, this isn’t needed.

  9. Mondane

    I made some changes to my version of the script:
    - added excludes
    - made it possible to change the ‘hostname’ identifier
    - lockfile is placed locally and is used to check the PID of the backup
    - should a lockfile exist but no process is running with its PID, the backup starts again and remaining incomplete folders are removed
    - should work with folders with spaces (every source and target is enclosed in quotes)
    - start, finish and errors are written to a log file, every line includes the PID

    Removed Code and added to post. / Jan-Kaspar

  10. Roy

    Hello Jan-Kaspar and Mondane,

    I really like the backup script, but I have one question. I’m only using the backup locally on my machine. What is the best way to do this? Change the script, for instance, or use ssh locally?

    Thanks

    • Mondane

      I tested the script locally first, indeed with SSH locally. So, I would do it with SSH. It has the advantage that if you would ever move the backups to another location, the only thing you have to change is the server location.

      • intangybles

        Hi Mondane & Jan,
        Can I ask a novice question: how do you run it locally with SSH? It keeps asking me for a password.
        Thank you.

        • Mondane

          You need to generate an SSH key and place the public part in your local ‘authorized_keys’ file. That should work.

          Good luck.
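The key setup Mondane describes can be sketched as follows; a throwaway directory is used here for illustration, whereas normally the files live in ~/.ssh:

```shell
#!/bin/sh
# sketch: create a key pair without a passphrase and authorize it
# for passwordless logins (demo directory instead of ~/.ssh)
dir=/tmp/ssh-key-demo
rm -rf $dir
mkdir -p $dir
chmod 700 $dir

# generate the key pair; -N "" means no passphrase, -q suppresses output
ssh-keygen -q -t rsa -N "" -f $dir/id_rsa

# authorizing the public half allows logins with the private key
cat $dir/id_rsa.pub >> $dir/authorized_keys
chmod 600 $dir/authorized_keys
```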

  11. Taryck

    Had to adapt the script because I’m not using the default SSH port…
    ssh ${ssh_connect} becomes ssh -p ${ssh_port:-22} ${ssh_connect}
    and for rsync:
    --rsh="ssh -p ${ssh_port:-22}" \
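The ${ssh_port:-22} expansion Taryck uses falls back to a default when the variable is unset or empty, so one line covers both cases:

```shell
#!/bin/sh
# ${ssh_port:-22} expands to 22 when ssh_port is unset or empty
unset ssh_port
echo "ssh -p ${ssh_port:-22}"    # -> ssh -p 22
ssh_port=2222
echo "ssh -p ${ssh_port:-22}"    # -> ssh -p 2222
```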

  12. Francois Scheurer

    thanks for this work!
    you can find also this script to backup the whole disk with rsync here: http://blog.pointsoftware.ch/index.php/howto-local-and-remote-snapshot-backup-using-rsync-with-hard-links/
    It uses the same idea of file deduplication via hard links, and also uses MD5 integrity signatures, ‘chattr’ protection, filter rules, disk quota, and a retention policy with exponential distribution (backup rotation that keeps more recent backups than older ones, so with 10 backups you can keep one year of daily backups).
    It was already used in Disaster Recovery Plans for banking companies, in order to replicate datacenters, using only little network bandwidth and transport encryption tunnel.
    it can be used locally on the servers or per network on a central remote backup server.

    it is free of course ^^
    Francois Scheurer

  13. sanjay

    Shouldn’t the target have a trailing slash?
    This will ensure that the source directory (/home) will be created on the target.

  14. Mondane

    Just updated my version of the script to include more logging and support for an SSH port (thanks Taryck). I use the script to rsync to a Synology DiskStation NAS; version 4.1 doesn’t support extended attributes. Because of this, I removed the --xattrs option from rsync.

    The backup.sh script will display a notification on Ubuntu Unity. When using Ubuntu Unity, the script backup-indicator.py will display an icon indicating a running backup. For our Windows users using Cygwin, install AutoHotKey and run backup-indicator.ahk.

    NB I sent a compressed package to Jan for him to post.

  15. Sean

    Is this code posted somewhere (GitHub maybe)? I’d like to write a Chef cookbook for installing/configuring this easily on multiple systems and keeping the script version updated.

  16. Mondane

    Turned out there were some errors:

    - the AutoHotKey script checked for running Windows processes, and Cygwin processes don’t show up there. Changed it to check using Cygwin’s ps.
    - added paths to executables in backup.sh to avoid problems. In Cygwin on Windows 7, hostname from Windows was used, which returns ^M in the name. Obviously this breaks things.

    A new version has been sent to Jan, and I’m setting up a GitHub repo as of now.

  17. Mike

    Great stuff. Modified it a bit to fit my needs, such as passing the backup and target values as arguments, changing some options on rsync, etc. Love the simplicity and elegance.
