0.7.6 slower than 0.6.1

Ben Escoto bescoto@stanford.edu
Sun, 16 Jun 2002 01:42:48 -0700

>>>>> "DG" == dean gaudet <dean-list-rdiff-backup@arctic.org>
>>>>> wrote the following on Sat, 15 Jun 2002 19:21:28 -0700 (PDT)

  DG> at times there's a cpu limitation, but i'm guessing the problem
  DG> is that rdiff-backup is mostly serialized.  fixing that is a
  DG> chore though :)

  DG> i suspect that there'd be some benefit to spawning a couple of
  DG> rdiffs in parallel.  basically so that while one rdiff is
  DG> blocked on reading data, another is calculating.

Yep, but at most a factor of 2 benefit, if I understand your
suggestion right.  Suppose the entire process involves x seconds of
calculation and other CPU tasks, and y seconds of reading data.  If
they could be done in parallel, the whole thing could take (at
minimum) max(x,y) seconds.  Currently the session would take x+y
seconds.  (x+y)/max(x,y) is greatest when x=y and then it equals 2.

    But that's assuming that CPU time and reading time happen to be
exactly the same, and that they can all be done simultaneously.
Probably the actual speed to be gained is much less than a factor of 2.
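
To put rough numbers on that bound, here is a minimal sketch in Python.
The function name overlap_speedup and the sample timings are made up
for illustration and are not measurements from rdiff-backup. It assumes
the idealized case where the CPU work and the reading overlap
perfectly.

```python
def overlap_speedup(x, y):
    """Best-case speedup from overlapping x seconds of CPU work
    with y seconds of reading, versus doing them back to back."""
    serial = x + y         # one after the other
    overlapped = max(x, y) # idealized: perfect overlap, no overhead
    return serial / overlapped


# Hypothetical (cpu_seconds, read_seconds) pairs, not real measurements.
for x, y in [(10, 10), (10, 5), (10, 1)]:
    print(x, y, round(overlap_speedup(x, y), 2))
```

The ratio only reaches 2 when x and y are equal. As soon as one of them
dominates, the possible gain shrinks toward 1.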

  DG> with that change i know that my bottleneck would be bandwidth --
  DG> there's only a 128kbit uplink from my mirror to my primary
  DG> (1.5mbit the other way).  i can watch the uplink saturate when
  DG> rdiff hits a large file and read-ahead can feed it data as fast
  DG> as the cpu can do the checksums.  when it's in amongst small
  DG> files the uplink isn't saturated at all.

Is CPU usage close to 100% when the line usage drops off?  If so it
could just be a CPU problem, not the serialization.

  DG> in terms of scaling the mirror host to handle many primaries it
  DG> might be nice to have rdiff-backup cache the signature files on
  DG> the mirror.  perhaps a file per directory with the names and
  DG> signatures of the directory contents.  not bulletproof -- if
  DG> someone goes about mucking in the mirror they could damage
  DG> things.  but this would reduce the cpu and i/o requirements on
  DG> the mirror host.

Yep, that would make sense, but doesn't sound exactly trivial to
implement, and it would probably be a headache to figure out
everything that could go wrong at every point.  I guess we'll see,
after I optimize the current version a bit, how many real-world
situations could require something like that.

Ben Escoto
