Completed Backups

This chapter describes the dependencies (or the absence of dependencies) between backups once all backups are complete. Complete means that there are no ``lateLinks'' left to set and that all replication jobs have finished successfully. To finish backups that use ``lateLinks'' or replication, you have to run storeBackupUpdateBackup.pl. Interrupted backups are simply broken and, in this sense, also count as ``finished''.

A simple backup (no replication, no lateLinks) is just a directory with several files in it. It is located at backupDir/series/timestamp. The files in the backup might be hard linked to other files in the same backup or to files somewhere else (normally in other backups). This hard linking has nothing to do with storeBackup's functionality. (Naturally, storeBackup tries to generate hard links to save space (deduplication), but that's it.) Let's make the following thought experiment: You have two files hard linked in one backup. You delete the second of these hard links and then copy the first file to exactly the path and name of the deleted one. The result is two files with the same content instead of one file (inode) with two hard links. This (useless) change in your backup means nothing to storeBackup. It is not even able to detect your change.
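The thought experiment above can be replayed on a scratch directory. The following is only a sketch: the helper names `same_inode`, `replace_link_with_copy` and `demo` are made up for this illustration and are not part of storeBackup, and the temporary directory merely stands in for a backup. It assumes a file system that supports hard links.

```python
import os
import shutil
import tempfile


def same_inode(path_a, path_b):
    """Return True if both paths refer to the same inode on the same device."""
    a, b = os.stat(path_a), os.stat(path_b)
    return (a.st_dev, a.st_ino) == (b.st_dev, b.st_ino)


def replace_link_with_copy(first, second):
    """Delete the hard link `second` and recreate it as an independent copy of `first`."""
    os.remove(second)
    shutil.copy2(first, second)


def demo():
    # The temporary directory is a stand-in for backupDir/series/timestamp.
    with tempfile.TemporaryDirectory() as backup:
        first = os.path.join(backup, "pwd")
        os.mkdir(os.path.join(backup, "sub1"))
        second = os.path.join(backup, "sub1", "pwd")
        with open(first, "wb") as fh:
            fh.write(b"example content\n")
        os.link(first, second)  # one inode, two names
        linked_before = same_inode(first, second)

        replace_link_with_copy(first, second)  # two inodes, same content
        linked_after = same_inode(first, second)
        with open(first, "rb") as f1, open(second, "rb") as f2:
            same_content = f1.read() == f2.read()
        return linked_before, linked_after, same_content
```

Calling `demo()` reports whether the two names shared an inode before the change, whether they still do afterwards, and whether the content is still identical. Only the file system sees the difference between the two states.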

So it is very simple: storeBackup doesn't know and doesn't care in any way whether (identical) files are hard linked or not. That's why you can delete (completed) backups (without lateLinks references to them) as you like, which also changes the number of hard links to files shown in other backups. It is also why you can use linkToDirs.pl to copy your backups to another file system that supports fewer or more hard links per file. Hard links in the backup are managed and controlled by the file system only, not by storeBackup.

But storeBackup keeps other information about the backup. Within each backup, there is a file called .md5CheckSums.bz2 (it may also be stored uncompressed). This file has the following structure:

# contents/md5 compr dev-inode inodeBackup ctime mtime atime size uid gid mode filename
dir 0 18-139259 0 1376114212 1376114212 1376114500 0 1049 1049 509 sub1
c5f89e40c144b6fb8b61f2ef72e4b556 c 18-141517 148481 1376114209 1376114209 1376114500 31400 1049 1049 493 pwd
c5f89e40c144b6fb8b61f2ef72e4b556 c 18-141521 148481 1376114212 1376114212 1376114500 31400 1049 1049 493 sub1/pwd
b5607b4dc7d896c0fab5c4a308239161 c 18-141513 147606 1376114202 1376114202 1376114500 110088 1049 1049 493 ls
b5607b4dc7d896c0fab5c4a308239161 c 18-141516 147606 1376114205 1376114205 1376114500 110088 1049 1049 493 sub1/ls
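As a sketch of how such lines could be read by a program, the following Python splits one line into the twelve columns named in the header and groups file names by md5 sum. The names `Entry`, `parse_line` and `group_by_md5` are hypothetical and not part of storeBackup. The sketch assumes blank-separated fields with the file name last; how storeBackup encodes unusual characters in file names is not covered here.

```python
from collections import defaultdict
from typing import NamedTuple


class Entry(NamedTuple):
    md5: str           # md5 sum, or a keyword such as "dir"
    compr: str
    dev_inode: str
    inode_backup: int
    ctime: int
    mtime: int
    atime: int
    size: int
    uid: int
    gid: int
    mode: int          # decimal in the example: 509 is 0o775, 493 is 0o755
    filename: str


def parse_line(line):
    """Split one data line into its twelve columns."""
    # At most 11 splits, so a file name containing blanks stays in one piece.
    f = line.rstrip("\n").split(" ", 11)
    return Entry(f[0], f[1], f[2], int(f[3]), *(int(x) for x in f[4:11]), f[11])


def is_md5(text):
    """True if `text` looks like a 32 digit lowercase hex md5 sum."""
    return len(text) == 32 and all(c in "0123456789abcdef" for c in text)


def group_by_md5(lines):
    """Map each md5 sum to the list of file names carrying it."""
    groups = defaultdict(list)
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and the header comment
        entry = parse_line(line)
        if is_md5(entry.md5):  # skips keyword lines such as "dir"
            groups[entry.md5].append(entry.filename)
    return dict(groups)
```

Fed with the lines of the example, `group_by_md5` puts pwd and sub1/pwd into one group and ls and sub1/ls into another.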

The .md5CheckSums listing above is the printout of a small example I made. In the first column, the keyword `dir' means that the item described in that line is a directory. Below it, you can see the md5 sums of the files pwd and ls in different subdirectories. Some of them have the same md5 sum, so these were copies in the sourceDir. There's another file, .md5CheckSums.info:

version=1.3
date=2013.08.10 08.04.29
sourceDir='/tmp/a/s'
followLinks=0
compress='bzip2'
uncompress='bzip2' '-d'
postfix='.bz2'
comprRule='$size > 1024 and' 'not $file =~ /\.zip\Z|\.bz2\Z|\.gz\Z|\.tgz\Z|\.jpg\Z|\.gif\Z|\.tiff?\Z|\.mpe?g\Z|\.mp[34]\Z|\.mpe?[34]\Z|\.ogg\Z|\.gpg\Z|\.png\Z|\.lzma\Z|\.xz\Z|\.mov\Z/i'
exceptDirs=
includeDirs=
exceptRule=
includeRule=
writeExcludeLog=no
exceptTypes=
archiveTypes=
specialTypeArchiver=cpio
checkBlocksRule0=
checkBlocksBS0=
checkBlocksCompr0=
checkBlocksRead0=
checkBlocksRule1=
checkBlocksBS1=
checkBlocksCompr1=
checkBlocksRead1=
checkBlocksRule2=
checkBlocksBS2=
checkBlocksCompr2=
checkBlocksRead2=
checkBlocksRule3=
checkBlocksBS3=
checkBlocksCompr3=
checkBlocksRead3=
checkBlocksRule4=
checkBlocksBS4=
checkBlocksCompr4=
checkBlocksRead4=
preservePerms=yes
lateLinks=no
lateCompress=no
cpIsGnu=no
logInBackupDir=no
compressLogInBackupDir=no
logInBackupDirFileName='.storeBackup.log'
linkToRecent=

It gives the storeBackup scripts the information they need to perform additional tasks with the backup, e.g. how to restore: In this example, the compressed files in the backup have the postfix .bz2 (keyword postfix) and are uncompressed with the external command bzip2 -d (keyword uncompress).
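To illustrate how these keywords can be used, here is a minimal sketch that reads the key=value lines and extracts the uncompress command and the postfix. The function names `parse_info`, `as_list` and `restored_name` are hypothetical. The sketch assumes that quoted values follow shell-like single quoting, as the example suggests; this is an assumption, not a statement about storeBackup's own parser.

```python
import shlex


def parse_info(text):
    """Return a dict mapping each keyword to its raw (unparsed) value."""
    info = {}
    for line in text.splitlines():
        if "=" not in line:
            continue  # skip blank or malformed lines
        # Split at the first '=' only; values may contain '=' themselves.
        key, _, value = line.partition("=")
        info[key] = value
    return info


def as_list(value):
    """Split a value such as "'bzip2' '-d'" into ['bzip2', '-d'].

    Assumption: shell-like quoting. Inside single quotes shlex keeps
    backslashes literally, so a rule like comprRule survives unchanged.
    """
    return shlex.split(value)


def restored_name(info, stored_name):
    """Strip the compression postfix from the name of a stored file."""
    postfix = as_list(info["postfix"])
    if postfix and stored_name.endswith(postfix[0]):
        return stored_name[: -len(postfix[0])]
    return stored_name
```

With the example file, `as_list(info["uncompress"])` yields the command and its option as separate list items, and `restored_name(info, "sub1/pwd.bz2")` gives back the original name. Values without quotes, such as the date, are best used in their raw form.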

There's also a directory called .storeBackupLinks, which is empty in a completed backup. It holds information if the backup was created with lateLinks and has not yet been finished via storeBackupUpdateBackup.pl. But that's explained later.
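A rough way to look at this directory from a script is sketched below. The function name `has_pending_late_links` is hypothetical, and the sketch rests on the assumption that any entry inside .storeBackupLinks means the backup still waits for storeBackupUpdateBackup.pl. It only inspects the directory and does not replace the checks storeBackup itself performs.

```python
import os


def has_pending_late_links(backup_dir):
    """Return True if .storeBackupLinks exists in backup_dir and is not empty.

    Assumption: a non-empty directory signals unfinished lateLinks.
    """
    links_dir = os.path.join(backup_dir, ".storeBackupLinks")
    return os.path.isdir(links_dir) and bool(os.listdir(links_dir))
```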

The entry for ``compression'' (c or u) can also be set to the value b. This means that the particular file is a ``blocked file''. At the location the file name points to, storeBackup creates a directory in the backup that has the name of the ``blocked file'' in the sourceDir. Besides the split parts of the saved file, this directory contains a file called .md5BlockCheckSums.bz2. It holds the information about the split parts of the ``blocked file'', e.g.:

66fa1a8f82c35ca87a08aed9701c5d20 c add_VirtualBox/HardDisks/debian.vdi/0000000001.bz2
c522c1db31cc1f90b5d21992fd30e2ab c add_VirtualBox/HardDisks/debian.vdi/0000000002.bz2
c522c1db31cc1f90b5d21992fd30e2ab c add_VirtualBox/HardDisks/debian.vdi/0000000003.bz2
c522c1db31cc1f90b5d21992fd30e2ab c add_VirtualBox/HardDisks/debian.vdi/0000000004.bz2
.....

The first column contains the md5 sum, the second c (compressed) or u (uncompressed), and the third the relative path and name of that block inside the backup.
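The three columns can be read with a few lines of Python. This is a sketch under the assumption that the fields are separated by single blanks and that the path comes last; `parse_block_line` and `block_summary` are names invented for this illustration. Counting distinct md5 sums shows how many blocks of a file have identical content, as blocks 2 to 4 do in the example.

```python
from collections import Counter


def parse_block_line(line):
    """Split a line into (md5 sum, compression flag, relative path)."""
    # At most 2 splits, so a path containing blanks stays in one piece.
    md5, compr, path = line.rstrip("\n").split(" ", 2)
    return md5, compr, path


def block_summary(lines):
    """Return (number of blocks, number of distinct md5 sums)."""
    blocks = [parse_block_line(l) for l in lines if l.strip() and not l.startswith("#")]
    counts = Counter(md5 for md5, _, _ in blocks)
    return len(blocks), len(counts)
```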

Conclusion

With the information shown above, it is easy to understand that storeBackup is quite insensitive to manipulations of your backups, provided that you do not touch the integrity of the individual backups (backup directories). You can copy backups, and you can move series or even single backups around (on the same file system). If you do so, you may have to adjust the paths and names (series) in your configuration files according to the changes you made.

Heinz-Josef Claes 2014-04-20