Is it the right tool?

Ben Escoto
Thu, 29 Aug 2002 11:17:40 -0700


>>>>> "PC" == pchandler  <>
>>>>> wrote the following on Thu, 29 Aug 2002 11:42:53 +0100

  PC> I'm involved in moving a small organisation from windows 98 p2p
  PC> to running a Linux server.  Everything's fine except they are
  PC> now hooked on a (discontinued) Powerquest product called
  PC> Datakeeper.  That monitored folders on the Win98 server and kept
  PC> versions of changed files.  It didn't work on timed runs but, I
  PC> guess, on system calls.  You set how many versions you want to
  PC> keep and it will keep that many, independent of how old the
  PC> versions are.

Hmm, maybe a versions + time approach would be better.  With a pure
version count, it seems that if you had the habit of saving your
documents every 5 minutes you would quickly exhaust the number of
versions.  (And a file that changed once would keep old versions
around for years.)
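
Such a hybrid policy is easy to express.  A minimal sketch (the
function name and defaults below are made up for illustration, not
taken from any existing tool):

```python
from datetime import datetime, timedelta

def versions_to_keep(timestamps, max_versions=10, max_age=timedelta(days=90),
                     now=None):
    """Hybrid retention: keep at most max_versions of the newest
    version timestamps, and drop anything older than max_age even if
    that leaves fewer than max_versions."""
    if now is None:
        now = datetime.now()
    # Newest first; discard anything past the age cutoff.
    recent = [t for t in sorted(timestamps, reverse=True)
              if now - t <= max_age]
    return recent[:max_versions]
```

So a file saved every 5 minutes keeps only its 10 newest versions,
while a file untouched for a year keeps nothing older than 90 days.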

  PC> I've been trying to find a similar Linux solution and have been
  PC> trying rdiff-backup.  The way I've been testing it is running it
  PC> every 5 minutes to try and catch recent changes but I of course
  PC> end up with huge (in terms of no. of files) rdiff-backup-data
  PC> directories.  The other problem is that restoring is cumbersome
  PC> when all you want to do is restore the 'last version'.

Yeah, rdiff-backup wasn't really designed with a 5-minute cycle in
mind.  What do you mean about restoring the 'last version', though?
Can't you just use cp?
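
For what it's worth, rdiff-backup keeps the most recent backup as a
plain mirror in the destination directory, so the 'last version' of
any file really is just an ordinary copy away.  A quick sketch (the
paths here are made up for illustration):

```shell
# Simulate a backup destination: the mirror is just ordinary files.
mkdir -p /tmp/mirror/docs /tmp/restore
echo "latest contents" > /tmp/mirror/docs/report.txt

# Restoring the last version is an ordinary copy out of the mirror.
cp /tmp/mirror/docs/report.txt /tmp/restore/report.txt
```

Only older versions need rdiff-backup itself, e.g. something like
`rdiff-backup -r 10m /tmp/mirror/docs/report.txt report.txt` to get
the state as of 10 minutes ago (check the man page for the exact
time-spec syntax).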

  PC> I appreciate the problem is the way I'm trying to use
  PC> rdiff-backup, but I'm wondering whether it is possible to
  PC> achieve what they want (i.e. a set number of prior versions
  PC> irrespective of age) or whether anyone knows of an alternative
  PC> approach/product.  I'd really like to get close as their IT guy
  PC> is going out on a bit of a limb by going Linux.

You can't really set a different number of backups on a file-by-file
basis with rdiff-backup.  I suppose it wouldn't be hard to write a
script that just culled the rdiff-backup-data directory based on the
number of increments for each file instead of by a specific time, but
the larger problem seems to be that it is inefficient to search
through all of a system's data every 5 minutes.  Intercepting system
calls seems to be a must.
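
Such a cull script might look something like the sketch below.  It
assumes rdiff-backup's increment naming (NAME.TIMESTAMP.diff.gz and
similar, under rdiff-backup-data/increments); treat it as an untested
sketch, not a finished tool:

```python
import os
import re
from collections import defaultdict

# Increment files look like NAME.TIMESTAMP.SUFFIX, e.g.
# report.txt.2002-08-29T11:17:40-07:00.diff.gz  (naming assumed from
# rdiff-backup's rdiff-backup-data/increments layout).
INC_RE = re.compile(r'^(?P<base>.+)\.(?P<stamp>\d{4}-\d{2}-\d{2}T[\d:+-]+)\.'
                    r'(?P<suffix>diff\.gz|snapshot\.gz|missing|dir)$')

def cull_increments(increments_dir, keep=5):
    """Delete all but the newest `keep` increments of each file."""
    removed = []
    for dirpath, _, filenames in os.walk(increments_dir):
        by_base = defaultdict(list)
        for name in filenames:
            m = INC_RE.match(name)
            if m:
                by_base[m.group('base')].append((m.group('stamp'), name))
        for base, versions in by_base.items():
            # ISO-style stamps sort chronologically as strings, so a
            # reverse sort puts the newest increment first.
            versions.sort(reverse=True)
            for _, name in versions[keep:]:
                path = os.path.join(dirpath, name)
                os.remove(path)
                removed.append(path)
    return removed
```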

    I don't have any experience with this (perhaps someone else on
this list will pipe up), but there was a recent discussion on the
rsync list about this.  Maybe this message will be helpful to you:  (edited by

Subject: Re: directory replication between two servers
From: jw schultz <>
Date: Wed, 3 Jul 2002 17:40:04 -0700

On Wed, Jul 03, 2002 at 11:10:13AM -0700, Eric Ziegast wrote:
> If you need read-write access on one server and need to replicate data
> to a read-only server and need synchronous operation (i.e.: the
> write must be completed on the remote server before returning to the
> local server), then you need operating-system-level or storage-level
> replication products.
>     Veritas:
> 	It's not available on Linux yet, but Volume Replicator performs
> 	block-level incremental copies to keep two OS-level filesystems
> 	in sync.  $$
> 	File Replicator is based (interestingly enough) on rsync, and
> 	runs under a virtual filesystem layer.  It is only as reliable
> 	as a network-wide NFS mount, though.  (I haven't seen it used
> 	much on a WAN.)  $$
>     Andrew File System (AFS)
> 	This advanced filesystem has methods for replication
> 	built in, but they have a high learning curve for making them
> 	work well.  I don't see support for Linux, though. $
>     Distributed File System (DFS)
> 	Works a lot like AFS, built for DCE clusters, commercially
> 	supported (for Linux too)  $$$
>     NetApp, Procom
> 	Several network-attached-storage providers have replication
> 	methods built into their products.  The remote side is kept
> 	up to date, but integrity of the remote data depends on the
> 	application's use of snapshots.  $$$
>     EMC, Compaq, Hitachi
> 	Storage companies have replication methods and best practices
> 	built into their block-level storage products.   $$$$
> If others know of other replication methods or distributed filesystem
> work, feel free to chime in.

		A filesystem-level sharing over the network.
		Don't pooh-pooh NFS because it is old.  I
		don't recommend it on an unsecured network
		but it is surprisingly fast.  Given a fast
		network, NetLedger found Oracle ran faster on
		NFS-mounted volumes than on small local disks.
		The Linux NFS server does need some
		performance improvement.  Not suitable for

		A distributed filesystem based on research
		from AFS.  Single tree structure that lives as
		an alien in the unix tree.  Primary focus is
		disconnected operation.  Lacks locking, so even
		when all nodes are online it can have update
		conflicts.  Available on Linux, is FREE.

		A distributed filesystem based on research
		from Coda.  Seems less alien than Coda,
		with better support for multiple
		mountpoints.  Provides locking mechanisms for
		connected operations but still allows
		resynchronization on reconnect.  Developed on
		Linux, is FREE.

		A cluster filesystem that can be used with
		multiport disks, SAN devices, and xNDB.
		The filesystem is online and writable for all
		nodes.  The storage device is responsible for
		HA.  Still in development.

If you look at cluster websites you will probably find a few
more solutions.


	J.W. Schultz            Pegasystems Technologies
	email address:

		Remember Cernan and Schmitt
