Using Wildcards for Replication

If you want to use wildcards to configure the replication, be sure you understand the principles the replication is based on before reading this chapter.

Configuring the hard link Option otherBackupSeries using Wildcards

To make the result of wildcard usage simple and transparent, storeBackup.pl also prints the result of wildcard expansion into the log files.

Imagine you want to hard link all series that you want to replicate in a second step. To make this easy to configure, every series except those with a name ending in .norepl has to be replicated. Assume that each series name is the name of the server (server1 and server2). The names of the series to replicate are therefore simply server1 and server2, and the names of the series not to replicate are server1.norepl and server2.norepl. You want to hard link each series with each other, but naturally you do not want hard links from a replicated series to a non-replicated one, because this would result in error messages in the replication. You can use the following settings in the configuration files:

for server1 -> otherBackupSeries = 0:* -:*.norepl
for server2 -> otherBackupSeries = 0:* -:*.norepl

for server1.norepl -> otherBackupSeries = 0:*
for server2.norepl -> otherBackupSeries = 0:*

Alternatively, you can also use the following syntax:

for server1 -> otherBackupSeries = +0:* -:*.norepl
for server2 -> otherBackupSeries = +0:* -:*.norepl

for server1.norepl -> otherBackupSeries = +0:*
for server2.norepl -> otherBackupSeries = +0:*

The "+" sign is optional. You can also use the plus and minus signs without wildcards, but the minus sign in particular makes no sense in that case.

Starting storeBackup.pl (command line options are used in this example) prints the following log:

$ storeBackup.pl -s s -b b -S server1 '0:*' -- '-:*.norepl'
....
INFO      2014.02.22 10:02:20  6822 consider series <*>:
INFO      2014.02.22 10:02:20  6822     consider series <server1>
INFO      2014.02.22 10:02:20  6822     consider series <server1.norepl>
INFO      2014.02.22 10:02:20  6822     consider series <server2>
INFO      2014.02.22 10:02:20  6822     consider series <server2.norepl>
INFO      2014.02.22 10:02:20  6822 avoid series <*.norepl>:
INFO      2014.02.22 10:02:20  6822     avoid series <server1.norepl>
INFO      2014.02.22 10:02:20  6822     avoid series <server2.norepl>
INFO      2014.02.22 10:02:20  6822 resulting series to hard link
INFO      2014.02.22 10:02:20  6822     series <server1>
INFO      2014.02.22 10:02:20  6822     series <server2>
....
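
The selection shown in the log can be imitated with a few lines of shell. This is only a sketch to make the rule concrete, not storeBackup's own code: the function name select_series is hypothetical, and it assumes that a series is kept when it matches at least one "consider" pattern and no "avoid" pattern, as the log output suggests.

```shell
#!/bin/sh
# Hypothetical helper, not part of storeBackup.
# Usage: select_series "RULES" name...
#   RULES is a space-separated list such as "0:* -:*.norepl".
#   A rule starting with "-" excludes, any other rule includes
#   (the leading "+" is optional). The wildcard follows the first ":".
# The body runs in a subshell, so "set -f" does not leak to the caller.
select_series() (
    rules=$1
    shift
    set -f    # do not expand "*" in RULES against files in the current directory
    for name in "$@"; do
        included=no
        excluded=no
        for rule in $rules; do
            pattern=${rule#*:}    # text after the first ":" is the wildcard
            case $rule in
                -*) case $name in $pattern) excluded=yes ;; esac ;;
                *)  case $name in $pattern) included=yes ;; esac ;;
            esac
        done
        if [ "$included" = yes ] && [ "$excluded" = no ]; then
            printf '%s\n' "$name"
        fi
    done
)

# The four series names from the log above.
select_series '0:* -:*.norepl' server1 server1.norepl server2 server2.norepl
```

With these four names the call prints server1 and server2, which corresponds to the "resulting series to hard link" lines in the log.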

The only difference between the two cases (compare, for example, server1 and server1.norepl) when configuring otherBackupSeries is "-:*.norepl". If you generate the configuration files, you have to differentiate between the series to replicate and the ones not to replicate. But when generating a configuration file, you already have to know whether that series is to be replicated, so this should not be a problem. The advantage of using wildcards is that you can group series without having to know (and update) the list of series names every time a new series is added.
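
If the configuration files are generated by a script, the distinction between the two cases can be derived from the series name alone. The following sketch assumes the naming convention of this example (names ending in .norepl are not replicated). The function name other_backup_series_line is hypothetical, and it prints only the otherBackupSeries line, not a complete configuration file.

```shell
#!/bin/sh
# Hypothetical generator, not part of storeBackup: print the
# otherBackupSeries line for one series, based on its name.
other_backup_series_line() {
    case $1 in
        # Non-replicated series may hard link to every series.
        *.norepl) printf '%s\n' 'otherBackupSeries = 0:*' ;;
        # Replicated series must avoid the non-replicated ones.
        *)        printf '%s\n' 'otherBackupSeries = 0:* -:*.norepl' ;;
    esac
}

for series in server1 server1.norepl server2 server2.norepl; do
    printf 'for %s -> ' "$series"
    other_backup_series_line "$series"
done
```

A newly added series then needs no change to the generator, as long as its name follows the same convention.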

Naturally, you can also use more than two different kinds of series to hard link by choosing useful names. This is just a simple example to show the principles.

Configuring Replication Options seriesToDistribute and backupCopy* using Wildcards

If you want to dynamically replicate backup series (and maybe avoid replication for others, like in this example), you can use wildcards when configuring replication. In the example the following directories in /tmp are used to explain the usage of wildcards:

/tmp/a/b: master backup directory
/tmp/a/d: delta cache directory
/tmp/a/c: replication ("copy") directory

Now you have to create some backup series directories and generate the configuration files for the replication:

$ cd /tmp/a
$ mkdir server1 server1.norepl server2 server2.norepl
$ storeBackupReplicationWizard.pl -m b -c c -d d -S server1

In the next step, edit the configuration files so that they contain the following:

$ grep -vP '\A\s*\Z|\A[#;]' b/storeBackupBaseTree.conf 
backupTreeName='Master Backup'
backupType=master
seriesToDistribute= +* -*.norepl
deltaCache=/tmp/a/d

$ grep -vP '\A\s*\Z|\A[#;]' d/deltaCache.conf 
backupCopy0='Backup Copy' +*

$ grep -vP '\A\s*\Z|\A[#;]' c/storeBackupBaseTree.conf 
backupTreeName='Backup Copy'
backupType=copy
seriesToDistribute= +*
deltaCache=/tmp/a/d

Important: Remember that if you run your backups with storeBackup.pl, you have to use the option lateLinks when using replication!

As you can see, the syntax is very similar to the one used with otherBackupSeries. Instead of "+*" you can also write "*".

The configuration above may show the following error message when starting the replication program on the replication copy directory:

$ storeBackupUpdateBackup.pl -b c
....
ERROR     2014.02.22 11:59:48 10007 c/storeBackupBaseTree.conf series <server1> missing in <Backup Copy>, defined in /tmp/a/d/deltaCache.conf
ERROR     2014.02.22 11:59:48 10007 use option --createNewSeries if you want missing series to be created automatically
....

The only way for storeBackup to compare the series to distribute from the delta cache with the replication copy when using wildcards is to expand the wildcards. But when the replication runs for a new series for the very first time, that series naturally does not yet exist in the replication copy directory. You can create it manually with mkdir (which is probably not what you want) or do what storeBackupUpdateBackup.pl tells you to do and create the new series automatically:

$ storeBackupUpdateBackup.pl -b c --createNewSeries

Heinz-Josef Claes 2014-04-20