Loading whole filelist into memory

Ben Escoto bescoto@stanford.edu
Fri, 29 Mar 2002 12:04:01 -0800

>>>>> "DB" == Donovan Baarda <abo@minkirri.apana.org.au>
>>>>> wrote the following on Fri, 29 Mar 2002 21:39:30 +1100

  >> Also, --exclude-from-filelist wouldn't make much sense if the
  >> entire filelist couldn't be read first.

  DB> not entirely true... a smart scanner can exclude files and skip
  DB> whole directories as it scans.

Well, suppose there is an exclude filelist.  rdiff-backup wants to
start backing up, so it begins with some file, say /bin/ls.  Should it
back it up, or is it somewhere in the exclude list?  Unless we require
the exclude list to be sorted or something like that, we can't process
a single file until the whole list has been read.
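
To make the point concrete, here is a rough sketch.  The function
names and the sample list are made up for illustration and are not
rdiff-backup code; it is written for a current Python.  The lookup for
/bin/ls can only be answered after every line of the unsorted list has
been read, because the matching entry may be the last one:

```python
import io

def load_exclude_list(fileobj):
    """Read every line of the exclude list into a set before any lookup."""
    return {line.rstrip("\n") for line in fileobj if line.strip()}

def should_back_up(path, excluded):
    # Only answerable once the whole list has been read: the entry for
    # this path could be on the very last line of an unsorted list.
    return path not in excluded

# Hypothetical unsorted exclude list; /bin/ls happens to come last.
sample = io.StringIO("/var/tmp/scratch\n/home/ben/.cache\n/bin/ls\n")
excluded = load_exclude_list(sample)
```

With a sorted list one could instead walk the list and the filesystem
in step, but that is a different design from the one sketched here.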

  DB> Currently my "dirscan.py" module builds and returns a big python
  DB> list of all matching files. This was so you could do things
  DB> like;

  DB>     for file in scan(startdir,selectlist): do something...

  DB> I'm thinking of changing/extending this so that it can be used
  DB> to process files as they are scanned. The simplest approach
  DB> would be to introduce an os.walk() style command that applies a
  DB> function to each matching file as it finds them. A probably
  DB> better way would be for me to delve into how things like xrange
  DB> work to see if I could implement something like it.

They are called generators and are a great new feature of Python 2.2.
So you can keep the exact same loop:

for file in scan(startdir,selectlist):
    do something...

but have scan(..) yield objects as they are requested by the for loop.
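
A minimal sketch of what such a scan() might look like follows.  I
don't know how dirscan.py represents its selectlist, so here `select`
is a hypothetical predicate that takes a path and returns true or
false.  The code is written for a current Python; in 2.2 itself,
generators have to be enabled with `from __future__ import generators`.

```python
import os

def scan(startdir, select):
    """Yield matching file paths one at a time instead of building a list.

    `select` is a hypothetical predicate taking a path and returning
    True/False; the real selectlist in dirscan.py may look different.
    """
    for name in sorted(os.listdir(startdir)):
        path = os.path.join(startdir, name)
        if os.path.isdir(path) and not os.path.islink(path):
            if select(path):
                # Descend only into selected directories, so an excluded
                # directory is skipped whole without being read.
                for sub in scan(path, select):
                    yield sub
        elif select(path):
            yield path
```

Nothing is scanned until the for loop asks for the first item, and
only one directory listing per level of depth is held at a time.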

Ben Escoto
