Time Machine is a powerful yet unobtrusive backup utility on the Mac. It would be nice to have something comparable on Linux. With just a few lines of shell script and rsync you can get some of its features.
There is already a good tutorial: Time Machine for every Unix out there
The following script adds some features:
- Create hourly backups
- Keep daily and weekly backups and delete old ones
- Prevent the script from running more than once at a time (lockfile)
Feel free to use it and adapt it to your needs!
#!/bin/sh
# settings
backup="/home"
target="/mnt/backup/"
# date for this backup
date=`date "+%Y-%m-%dT%H_%M_%S"`
# check and create lockfile
if [ -f ${target}lockfile ]
then
echo "Lockfile exists, backup stopped."
exit 2
else
touch ${target}lockfile
fi
# create folders if necessary
if [ ! -e ${target}current ]
then
mkdir ${target}current
fi
if [ ! -d ${target}weekly ]
then
mkdir ${target}weekly
fi
if [ ! -d ${target}daily ]
then
mkdir ${target}daily
fi
if [ ! -d ${target}hourly ]
then
mkdir ${target}hourly
fi
# rsync
rsync \
--archive \
--xattrs \
--human-readable \
--delete \
--link-dest=${target}current \
$backup \
$target$date-incomplete
# backup complete
mv $target$date-incomplete ${target}hourly/$date
rm -r ${target}current
ln -s ${target}hourly/$date ${target}current
touch ${target}hourly/$date
# keep daily backup
if [ `find ${target}daily -maxdepth 1 -type d -mtime -2 -name "20*" | wc -l` -eq 0 ] && [ `find ${target}hourly -maxdepth 1 -name "20*" | wc -l` -gt 1 ]
then
oldest=`ls -1 -tr ${target}hourly/ | head -1`
mv ${target}hourly/$oldest ${target}daily/
fi
# keep weekly backup
if [ `find ${target}weekly -maxdepth 1 -type d -mtime -14 -name "20*" | wc -l` -eq 0 ] && [ `find ${target}daily -maxdepth 1 -name "20*" | wc -l` -gt 1 ]
then
oldest=`ls -1 -tr ${target}daily/ | head -1`
mv ${target}daily/$oldest ${target}weekly/
fi
# delete old backups (-mindepth 1 keeps find from matching the hourly/daily folder itself)
find ${target}hourly -mindepth 1 -maxdepth 1 -type d -mtime +0 | xargs rm -rf
find ${target}daily -mindepth 1 -maxdepth 1 -type d -mtime +7 | xargs rm -rf
# remove lockfile
rm ${target}lockfile
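A note on the lockfile: the script tests for the file and then creates it in two separate steps, so two runs that start at the same moment could both pass the test. One possible alternative is sketched below. It uses a lock directory instead of a lock file, because mkdir either creates the directory or fails, in a single step. The path and the function names acquire_lock and release_lock are made up for this illustration and are not part of the script above; mktemp is assumed to be available.

```shell
#!/bin/sh
# Hypothetical lock location for this demo; the script above would use
# something like ${target}lock instead.
lockparent=$(mktemp -d)
lockdir="$lockparent/backup.lock"

# mkdir succeeds only if the directory did not exist yet,
# so checking and creating the lock happen in one step.
acquire_lock() { mkdir "$1" 2>/dev/null; }
release_lock() { rmdir "$1"; }

# First attempt: nothing holds the lock yet.
if acquire_lock "$lockdir"; then first="acquired"; else first="refused"; fi
# Second attempt: simulates a second run while the first is still active.
if acquire_lock "$lockdir"; then second="acquired"; else second="refused"; fi

echo "first attempt: $first, second attempt: $second"

# Clean up the demo lock and its scratch directory.
release_lock "$lockdir"
rmdir "$lockparent"
```

In a real backup script it would probably make sense to release the lock from a trap on EXIT, so that it is also removed when the script stops with an error.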
Hi,
How hard is it to change this script to include monthly and yearly rotation? Are you willing to do it?
What I want:
1) rsync hourly, this can be setup as a cron, to a ssh remote location
2) save last x hours, probably up to the last 24 (PC isn’t running the whole day)
3) save last x days, probably the last 6 days
4) save last x weeks, probably just the last month, ie 4 /5 weeks
5) save last x months, probably the last 11 months
6) save last x years, probably the last 2 years (cumulatively it would be possible to go back 3 years: the first year in monthly steps and the other 2 years in yearly steps)
7) remove obsolete backups
PS I don’t quite understand the rotation and removal in this script, is this correct:
1) execute rsync to folder with current date+time
2) after backup is complete, remove current folder and link it to the last backup
3) create a file in the current folder with filename -date+time- of backup
4) save a daily backup
-- what are the criteria, I don’t understand the find for daily and hourly, is 20* hardcoded for this century?
5) save a weekly backup
-- what are the criteria, I don’t understand the find for weekly and daily, is 20* hardcoded for this century?
6) delete old backups
-- which files are deleted, what are the criteria?
How hard is it to change this script to include monthly and yearly rotation?
Very easy.
Are you willing to do it?
I’m sure you can manage that ;) It will surely help if you read the article linked in my post.
1) rsync hourly, this can be setup as a cron, to a ssh remote location
Just set up a cron entry for the script and change the rsync destination to your remote server.
2) save last x hours… 7) remove obsolete backups
That’s exactly what this script does. You just have to add monthly and yearly rotation.
-- what are the criteria, I don’t understand the find for daily and hourly, is 20* hardcoded for this century?
man find helps with that. 20* is hardcoded to ignore other files that may exist.
Lol. I’m working on it. One question, why do you ‘touch’ the new folder rsync created in this line?
touch ${target}hourly/$date
Just another question: most of the time you use ${target}, but sometimes not. Why is that? And why don’t the other variables have {}, e.g. $oldest?
why do you ‘touch’ the new folder rsync created in this line?
To set the modification time of the folder to the time when the backup finished. That time is used by the find commands to move or delete old backups.
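To make the connection between touch and find a bit more concrete, here is a small sketch that can be run in a scratch directory. The folder names are invented for the demo, and touch -d with a relative date is a GNU extension, so this assumes GNU coreutils and findutils.

```shell
#!/bin/sh
# Scratch directory with three fake backup folders (hypothetical names).
demo=$(mktemp -d)
mkdir "$demo/2024-01-03_fresh" "$demo/2024-01-02_old" "$demo/2024-01-01_older"

# Pretend two of the backups finished a while ago (GNU touch syntax).
touch -d '3 days ago' "$demo/2024-01-02_old"
touch -d '10 days ago' "$demo/2024-01-01_older"

# -mtime +0 matches folders modified at least one full day ago,
# which is the test the hourly cleanup relies on.
older_than_a_day=$(find "$demo" -mindepth 1 -maxdepth 1 -type d -mtime +0 | wc -l)

# -mtime -2 matches folders modified less than two days ago,
# which is the test used to decide whether a recent daily backup exists.
newer_than_two_days=$(find "$demo" -mindepth 1 -maxdepth 1 -type d -mtime -2 | wc -l)

echo "older than a day: $older_than_a_day, newer than two days: $newer_than_two_days"

rm -rf "$demo"
```

With GNU tools this should report two folders older than a day and one folder newer than two days. Without the touch after each backup, the folder would keep whatever modification time rsync left on it, and the age tests would no longer reflect when the backup finished.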
most of the time you use ${target}, but sometimes not. Why is that? And why don’t the other variables have {}, e.g. $oldest?
The {} are not always necessary, but they tell the shell where the variable name ends.
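A short sketch of the difference, using the target setting from the script. The value of oldest is an invented folder name, just for the demo.

```shell
#!/bin/sh
target="/mnt/backup/"

# Without braces the shell looks for a variable named "targetcurrent".
# It is not set, so the result is an empty string.
without_braces="$targetcurrent"

# With braces the variable name ends at the closing brace,
# and "current" is appended as literal text.
with_braces="${target}current"

# Braces can be left out when the next character cannot be part of a
# variable name, for example a slash, a space or the end of the string.
oldest="2024-01-01T00_00_00"   # hypothetical folder name
mixed="${target}hourly/$oldest"

echo "without braces: '$without_braces'"
echo "with braces:    '$with_braces'"
echo "mixed:          '$mixed'"
```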
I really recommend reading this: http://tldp.org/LDP/Bash-Beginners-Guide/html/
I have made some adjustments, but I believe the rotation and deletion part doesn’t work properly. I touched a file in hourly to be 2 days old, and it is moved to daily properly the next time the script is run. But after that, a new daily is created and the old one is removed instead of being moved to weekly.
Would you have a look at it?
#!/bin/sh
# exit codes are taken from /usr/include/sysexits.h
# settings, target must end in /
backup='/home'
target='/backup/'
ssh_user='username'
ssh_server='localhost'
# Create the connection string.
ssh_connect="${ssh_user}@${ssh_server}"
# Check if the ssh connection can be made; an ssh key pair without a passphrase must exist.
# Note: '&>' is a bashism, under /bin/sh it may run the command in the background instead.
ssh -q -o 'BatchMode=yes' -o 'ConnectTimeout=3' ${ssh_connect} exit > /dev/null 2>&1
if [ $? != 0 ]
then
echo "SSH connection ${ssh_connect} failed."
exit 69 # service unavailable
fi
# check if target exists
if ssh ${ssh_connect} "[ ! -d '${target}' ]"
then
echo "Target '${target}' does not exist, backup stopped."
exit 66 # cannot open input
fi
# Get the current host and append it to target, create a folder for the host if it doesn't exist.
host=`hostname -s`
target="${target}${host}/"
if ssh ${ssh_connect} "[ ! -d '${target}' ]"
then
ssh ${ssh_connect} "mkdir '${target}'"
fi
# Create rotation folders if necessary.
# Note: this is not a real array since /bin/bash can't be used.
folders0='hourly'
folders1='daily'
folders2='weekly'
folders3='monthly'
folders4='yearly'
index=0
max_index=5
while [ ${index} -lt ${max_index} ]
do
eval folder="\${target}\${folders${index}}"
if ssh ${ssh_connect} "[ ! -d '${folder}' ]"
then
ssh ${ssh_connect} "mkdir '${folder}'"
fi
index=`expr ${index} + 1`
done
# Date for this backup.
date=`date '+%Y-%m-%d_%Hh%Mm%Ss'`
# Check and create lockfile.
lockfile=${target}lockfile
if ssh ${ssh_connect} "[ -f '${lockfile}' ]"
then
echo "Lockfile '${lockfile}' already exists, backup stopped."
exit 73 # can't create (user) output file
else
ssh ${ssh_connect} "touch '${lockfile}'"
fi
# -- make backup
# Make the actual backup, note: the first time this is run, the latest folder
# can't be found. rsync will display this but will proceed.
rsync \
--archive \
--xattrs \
--compress \
--human-readable \
--delete \
--link-dest=${target}latest \
${backup} \
${ssh_connect}:${target}${date}-incomplete
# Backup complete, it will be moved to the hourly folder.
ssh ${ssh_connect} "mv ${target}${date}-incomplete ${target}hourly/${date}"
# Create a symlink to the new backup.
ssh ${ssh_connect} "ln -s ${target}hourly/${date} ${target}latest-${date}"
# Replace the old 'latest' symlink with the newly created symlink. It's done with
# mv to avoid having a moment where there is no symlink 'latest' (-f avoids a
# prompt when the target already exists, -T is a GNU mv option).
ssh ${ssh_connect} "mv -fT ${target}latest-${date} ${target}latest"
# Set the modification moment to now for the new backup, this way, when rotating,
# the time when a backup was finished is used.
ssh ${ssh_connect} "touch ${target}hourly/${date}"
# -- rotate backups
# To determine when to rotate a backup from e.g. hourly to daily, the latter must
# be checked to see whether it contains a backup newer than the given number of
# days. If it doesn't, and the former folder has more than 1 backup, the oldest
# is moved to the latter folder.
rotate1='2' # Rotate the oldest hourly if there is no daily in the last 2 days
rotate2='14' # Rotate the oldest daily if there is no weekly in the last 14 days
rotate3='60' # Rotate the oldest weekly if there is no monthly in the last 60 days (approx. 2 months)
rotate4='730' # Rotate the oldest monthly if there is no yearly in the last 730 days (approx. 2 years)
index=0 # Start with 0, this way the first 'from' folder can be determined.
max_index=4
while [ ${index} -lt ${max_index} ]
do
eval from="\${target}\${folders${index}}"
# Increase index now so the amount of days and the to folder can be determined.
index=`expr ${index} + 1`
eval days="\${rotate${index}}"
eval to="\${target}\${folders${index}}"
# The -name '20*' is there to limit the files which can be found to everything
# starting with 20*. This means the script only works for the years 2000-2099 but
# this should be enough :).
if [ `ssh ${ssh_connect} "find ${from} -maxdepth 1 -name '20*' | wc -l"` -gt 1 ] && [ `ssh ${ssh_connect} "find ${to} -maxdepth 1 -type d -mtime -${days} -name '20*' | wc -l"` -eq 0 ]
then
oldest=`ssh ${ssh_connect} "ls -1 -tr ${from} | head -1"`
ssh ${ssh_connect} "mv ${from}/$oldest ${to}"
fi
done
# -- delete old backups
# To determine when to delete a backup from e.g. hourly, it must be older than
# the given number of days. Note: because of this deletion, the rotation is
# done before it.
delete0='0' # Hourly backups older than 1 day are removed.
delete1='7' # Daily backups older than 7 days are removed.
delete2='30' # Weekly backups older than 30 days (approx. 1 month) are removed.
delete3='365' # Monthly backups older than 365 days (approx. 1 year) are removed.
delete4='1095' # Yearly backups older than 1095 days (approx. 3 years) are removed.
index=0
max_index=5
while [ ${index} -lt ${max_index} ]
do
eval from="\${target}\${folders${index}}"
eval days="\${delete${index}}"
# -mindepth 1 keeps find from matching the rotation folder itself.
ssh ${ssh_connect} "find ${from} -mindepth 1 -maxdepth 1 -type d -mtime +${days} | xargs rm -rf"
index=`expr ${index} + 1`
done
# remove lockfile
ssh ${ssh_connect} "rm ${lockfile}"
I’ve read your script carefully and found no error. BTW nice comments, I’m always too lazy ;)
# Replace old current symlink with newly created symlink, it’s done with mv to
# avoid having a moment where there is no symlink ‘current’ (-f avoids prompt
# when a target already exists).
This is not necessary in my opinion. The lockfile prevents the script from running more than once, and no one will notice if the “latest” symlink does not exist for a second.
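For comparison, here are the two ways of updating the symlink side by side, as a sketch in a scratch directory. The folder names are invented, and -T is a GNU mv option, so this assumes GNU coreutils (readlink is assumed to be available as well).

```shell
#!/bin/sh
# Scratch directory with two fake backups (hypothetical names).
demo=$(mktemp -d)
mkdir "$demo/backup-1" "$demo/backup-2"
ln -s "$demo/backup-1" "$demo/latest"

# Variant 1 (as in the original script): remove the link, then recreate it.
# Between the two commands 'latest' does not exist.
rm "$demo/latest"
ln -s "$demo/backup-2" "$demo/latest"
after_rm_ln=$(readlink "$demo/latest")

# Variant 2 (as in the script above): create the new link under a temporary
# name and rename it over the old one, so 'latest' exists at all times.
ln -s "$demo/backup-1" "$demo/latest-tmp"
mv -fT "$demo/latest-tmp" "$demo/latest"
after_mv=$(readlink "$demo/latest")

echo "after rm + ln: $after_rm_ln"
echo "after mv -fT:  $after_mv"

rm -rf "$demo"
```

Both variants end with the link pointing at the intended backup. The only difference is the short gap in variant 1, which the lockfile makes harmless for the script itself.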
I tested the following (maybe more natural than your test):
- ran the script twice
- the oldest backup was moved to daily
- then touched the folder in daily to be 3 days old
- at the next backup a new daily was created and the oldest in daily was moved to weekly
Congratulations :)
Thanks for the code review and testing. I’m going to implement the script and keep you posted if anything should be changed.
PS You’re right about the removal of symlinks, with the lockfile, this isn’t needed.