The ifile FAQ



What is ifile?

ifile is a general mail filtering system which uses a modern-day text learning algorithm to intelligently filter mail according to the way the user tends to organize mail.

ifile is different from other mail filtering programs in three major ways:

  1. ifile does not require the user to generate a set of rules in order to successfully filter mail
  2. ifile uses the entire content of messages for filtering purposes
  3. ifile learns as the user moves incorrectly filtered messages to new mailboxes

ifile is not dependent upon any specific mail system and should be adaptable to any system which allows an outside program to perform mail filtering.

If ifile doesn't work properly, where can I get information on the errors that have occurred?

Run ifile with the "-g" option to generate a log file in ~/.ifile_log. This will give you detailed information on what ifile is doing at each stage and may help to determine what is causing your problem.

If you get a core file, you might be able to use the GNU debugger to determine what went wrong. You'll need to have solid knowledge of C to make use of it, though. Run the following command in the directory where you find the core file: "gdb ifile core".

Can I use ifile to filter only a portion of my mail?

Yes, but only if another filter does the preprocessing and calls ifile for mail that matches none of its other patterns. If you are using the MH system, the easiest way to do this is with slocal. Create a .maildelivery file (with 0600 permissions) in your home directory containing your filtering rules, one of which passes all remaining mail to ifile. An example .maildelivery file might look like this:

from ifile-discuss ^ ? "/usr/lib/mh/rcvstore +ifile"
from joe ^ ? "/usr/lib/mh/rcvstore +friends"
default - ^ ? "ifile"

Anything from ifile-discuss or joe would always get filtered to a specific folder and all other mail would be filtered according to ifile's filtering method.

Is it possible to have ifile NOT filter mail to a certain mailbox?

Yes, though the implementation is highly client-dependent. With the mh-ifile package and EXMH or MH, create a file named ".skip_me" in the directory corresponding to the mailbox you want ifile to ignore. Once this file exists, ifile will no longer filter mail to that mailbox.

All my mail is being filtered to my inbox. What's wrong?

One possible problem is that one of your ifile executables is not in the PATH of the environment under which your mail client runs. One telltale sign of this problem is that the mail messages which are filtered to your inbox will have the header "X-filter: => inbox". Normally, when ifile filters your mail, it adds a header similar to "X-filter: ifile 0.6.2 => friends" to each message it processes. The "ifile 0.6.2" string comes from the ifilter.<mail_client> program calling "ifile --version". If ifile is not executable, then ifilter.<mail_client> will read an empty string.
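
The header check described above can be automated. The following is a minimal sketch in Python, assuming the header always has the form "X-filter: <version string> => <folder>" as in the two examples quoted above, and that folder names contain no spaces. The function names are hypothetical and are not part of ifile.

```python
import re

# Assumed header shape: "X-filter: <version string> => <folder>".
# The version string is empty when "ifile --version" could not be run.
X_FILTER = re.compile(
    r"^X-filter:\s*(?P<version>.*?)\s*=>\s*(?P<folder>\S+)\s*$",
    re.IGNORECASE,
)


def parse_x_filter(header_line):
    """Return (version, folder) from an X-filter header, or None if it does not match."""
    match = X_FILTER.match(header_line)
    if match is None:
        return None
    return match.group("version"), match.group("folder")


def ifile_was_found(header_line):
    """True if the header carries a non-empty version string (hypothetical helper)."""
    parsed = parse_x_filter(header_line)
    return parsed is not None and parsed[0] != ""
```

A header for which ifile_was_found() returns False suggests that the filter script could not run ifile, which points to the PATH problem rather than to a crash.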

Another possible problem is that ifile is causing a segmentation fault during its execution. One way to detect this is to run ifile with the --log-file option and examine the ~/.ifile.log file. Each major operation (reading messages, reading/writing the database and calculating ratings) should print two lines to the file, one to indicate the start of the task and one to indicate its completion. If any operation has a starting line but no completion line, a segmentation fault most likely occurred. If you're not an experienced C hacker, your best bet is to send an e-mail to the ifile mailing list(s) describing your problem.

How can I install ifile across multiple platforms from a common source directory?

If you run "./configure --help" from the directory into which the ifile archive unpacks, you will notice the option --srcdir. This allows you to store executables and status files in a directory other than the source directory. If you have multiple platforms to compile for, create one subdirectory of the ifile source directory for each platform. Then, from each platform-specific directory, run "../configure --srcdir=../". This will create a Makefile which can be used to compile all the executable files. All platform-specific files are then stored in that platform's subdirectory rather than in the source directory itself.

If you wish to install files in a directory other than the one configure defaults to, use the "--prefix=DIR" option to specify where files should be installed. DIR should be a directory which may contain bin/, lib/, src/ and other such directories.

Where/How does ifile store information about my organizational preferences?

ifile keeps a file in your home directory named ".idata" which has information on the frequency of words in each of your mailboxes. Any time new mail is filtered, statistics on the words in the document are stored in this file. Any time mail is moved from one mailbox to another, this data file is updated accordingly.

When ifile makes a decision on where to filter a piece of mail, it does so by comparing the word frequencies in the mail to the folder frequencies in the database. ifile filters the mail into the folder which has the closest match.
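
To make the idea of a "closest match" concrete, here is a minimal sketch in Python. This FAQ does not spell out the scoring formula, so the sketch assumes a naive Bayes-style log-likelihood with add-one smoothing as a stand-in. The function names and data layout are hypothetical and are not taken from the ifile source.

```python
import math
from collections import Counter


def score_folders(message_words, folder_word_counts, folder_totals):
    """Score each folder by a smoothed log-likelihood of the message's words.

    folder_word_counts maps folder -> {word: frequency};
    folder_totals maps folder -> total word instances in that folder.
    """
    vocabulary = set()
    for counts in folder_word_counts.values():
        vocabulary.update(counts)
    vocab_size = max(len(vocabulary), 1)

    message_counts = Counter(message_words)
    scores = {}
    for folder, counts in folder_word_counts.items():
        total = folder_totals[folder]
        score = 0.0
        for word, n in message_counts.items():
            # Add-one smoothing (an assumption) keeps unseen words from
            # driving the probability to zero.
            probability = (counts.get(word, 0) + 1) / (total + vocab_size)
            score += n * math.log(probability)
        scores[folder] = score
    return scores


def best_folder(message_words, folder_word_counts, folder_totals):
    """Return the folder with the highest score, i.e. the closest match."""
    scores = score_folders(message_words, folder_word_counts, folder_totals)
    return max(scores, key=scores.get)
```

With made-up counts such as {"friends": {"party": 5, "beer": 3}, "work": {"meeting": 6, "deadline": 2}}, a message containing mostly "party" and "beer" scores highest for "friends".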

What is the format of the .idata file?

The top row is simply a list of your mailboxes. Later in the data file, they are referred to by number according to their order in the first row. Indices for the folders start at 0.

The second row is the total word instances for each folder (i.e. sum of frequencies of all words kept in .idata).

The third row is a count of the number of messages for which information is stored for each folder.

Each row following the first three is composed of three parts. The first is a word. The second is the 'age' of the word (i.e. the number of messages which have been entered into the system since the word was first seen; this is used for trimming out words which occur with low frequency). The third is a list of folder:frequency pairs. A pair is stored for a folder only if the word's frequency in that folder is non-zero. Frequencies which aren't mentioned in the data file are assumed to be zero.

See the tutorial for an example.
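
The layout described above could be read with something like the following Python sketch. It assumes that fields on each row are separated by whitespace and that each pair is written as index:frequency. This FAQ does not state the delimiters, so treat both as assumptions and check them against a real .idata file. The function name parse_idata is hypothetical.

```python
def parse_idata(text):
    """Parse .idata contents into a dictionary (delimiters are assumed)."""
    lines = [line for line in text.splitlines() if line.strip()]
    if len(lines) < 3:
        raise ValueError("expected at least three header rows")

    folders = lines[0].split()                           # row 1: mailbox names
    word_totals = [int(x) for x in lines[1].split()]     # row 2: word instances per folder
    message_counts = [int(x) for x in lines[2].split()]  # row 3: messages per folder

    words = {}
    for line in lines[3:]:
        fields = line.split()
        word, age = fields[0], int(fields[1])
        frequencies = {}
        for pair in fields[2:]:
            index, frequency = pair.split(":")
            # Folder indices start at 0 and refer to the order of row 1.
            frequencies[folders[int(index)]] = int(frequency)
        words[word] = {"age": age, "frequencies": frequencies}

    return {
        "folders": folders,
        "word_totals": word_totals,
        "message_counts": message_counts,
        "words": words,
    }
```

Because zero frequencies are not stored, look up a folder with something like entry["frequencies"].get(folder, 0) rather than indexing directly.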

Written by Jason Rennie.

ifile is copyright © 1996-2002 Jason D. M. Rennie

Last modified: Thu Jul 31 15:05:27 2003