> Florian, you mention you use rdiff-backup (for max 20G) and rsync for
> larger. Is it not recommended to use rdiff-backup on large backups?
> Mine are several Tb's with possible gb's daily increments. Would it be
> recommend to stick to rsync?
There are several reasons we do not use rdiff-backup for large
volumes of data that also change frequently:
1. rdiff-backup is slow if files get heavily modified or deleted.
Storing the increments needs a lot of I/O, and we do not want to spend
that much money on 10k rpm disks just for backups. rdiff-backup also
has serious performance issues after an incomplete backup, when
it has to revert to the old state.
2. rdiff-backup stores increments, so it needs constant verification.
This process is extremely time-consuming and, again, needs a lot of I/O
and a big tmpfs in RAM, unless you want to make it even more
time-consuming. We'd rather put our RAM in production machines.
3. Listing and restoring backups requires command-line knowledge of
the rdiff-backup command and read access to the FULL repository. We
need something that provides easy access, just like a directory
structure where you can simply ls or cp, and only if you had the
appropriate permissions on the original file.
4. rdiff-backup can only delete the oldest increment(s), not ones in
between.
5. Restoring from increments (not the current mirror) is somewhat slow
and takes a lot of CPU and I/O.
6. When the repository gets corrupted in some way, it is really hard
to fix. There is a high risk of losing a lot of files then.
There was a little more, but those were the primary reasons we decided
to use rsync with --link-dest, as this provides everything we need. It
takes somewhat more storage, but 7200 rpm disks are cheap and with
sequential writes they offer sufficient performance. And it is dead
simple to fix a corrupt repository since it's just plain files and
hardlinks.
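
To make the --link-dest idea a bit more concrete, here is a minimal
sketch of how one snapshot run could be assembled. This is not our
actual script. The function name, the directory layout (one dated
directory per snapshot under a common root) and the example paths are
assumptions for illustration only. The rsync options used (-a and
--link-dest) are standard.

```python
from pathlib import PurePosixPath
from typing import List, Optional


def build_rsync_command(source: str, snapshot_root: str, name: str,
                        previous: Optional[str] = None) -> List[str]:
    """Return an rsync argv that writes a new snapshot directory.

    Hypothetical helper: `snapshot_root` is assumed to be an absolute
    path holding one directory per snapshot. Files unchanged since
    `previous` are hardlinked into the new snapshot instead of copied.
    """
    root = PurePosixPath(snapshot_root)
    cmd = ["rsync", "-a"]
    if previous is not None:
        # A relative --link-dest is interpreted relative to the
        # destination directory, so an absolute path is passed here.
        cmd.append(f"--link-dest={root / previous}")
    # Trailing slash on the source: copy its contents, not the directory.
    cmd += [source.rstrip("/") + "/", str(root / name) + "/"]
    return cmd


if __name__ == "__main__":
    # Example paths are made up; print the command instead of running it.
    print(" ".join(build_rsync_command(
        "/srv/data", "/backup/host1", "2024-01-02", previous="2024-01-01")))
```

The resulting list can be handed to subprocess.run(cmd, check=True).
Every snapshot is then a complete directory tree, so it can be browsed
with ls and cp, and any snapshot (not just the oldest) can be removed
with rm -rf without affecting the others.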
That said, one big advantage of using rdiff-backup for OS backups is
that you can run it as an unprivileged user on the storage side. It
automatically takes care of saving metadata such as ownership and
permissions and so on even as non-root. This is something rsync cannot
> How about backuppc? What are your
> thoughts/comments/suggestions regarding that?
Some of my clients use backuppc and it seems to run fine. It's fast, but
not as fast as plain rsync, and if something in the repository goes
wrong, bad things can happen. It does use rsync but spices it up with
some kind of snapshotting and deduplication. It has the same
performance issues with incomplete backup tasks as rdiff-backup.
All in all, too complicated for me. Backups should be as simple as