ifile v0.1 Alpha

Released Sat Aug 3 20:49:01 EDT 1996

What is ifile?

ifile is an intelligent mail filtering program which uses the framework
provided by the mh mailing system and the slocal filtering system. It
uses a text-based learning algorithm named bayesian learning to
determine the user's preferences. This allows it to learn where the user
likes to have his or her mail filed without requiring a strict set of
rules to be laid out.

Does it actually work?

Yes. It's not perfect and it does take some time to learn a user's
preferences, but in my experience, it has done quite a good job learning
where I like my mail filtered. For example, within a day of posting the
first message about ifile on the exmh.users bboard, I received about 15
responses concerning the filter. Only three of those required me to do
any refiling into my exmh mailbox. Granted, most of those had the same
subject line and included similar quotes from my original message, but,
hey, at least it's a move in the right direction :)

What exactly does it do?

The ifile mail filtering system replaces two pieces of the system
commonly found with exmh. The first of those is the mh refile program,
which is completely replaced by irefile, one of the perl scripts which
come with this package. irefile provides the basic functionality of the
mh refile program and is sufficient to be used with exmh. However, it
should not be considered a complete replacement as it is significantly
slower and lacks many of the options of the mh refile program. It does
do one thing the mh refile program does not do: it records the word
frequencies of every message that you refile to a new folder.
This information is stored in your home directory in a file named
".idata" and provides the base of information which ifile works upon.

ifile also replaces the filtering. Better put, it uses the slocal
filtering system to call its own mail filtering program, ifilter. The
idea is simple enough that it could be called from any filtering
program, or even exmh itself (future version :), but the simplicity of
running through slocal won out in this case. The filtering program reads
in one message from standard input, parses it and then uses the
information in ".idata" to decide where to put it.

How do I know if I can actually use this system?

To be able to use this system, you must have the entire mh mailing
system installed on your machine. You (most likely) will need to have
root access on this machine to install the new refile program. You must
have the slocal mail filtering program installed on your machine. You
must have perl v4.036 or later installed on your system. It is also
highly recommended that you have exmh installed and use that as your
primary e-mail program.

With some minor changes to the exmh code, it would be quite possible to
get by without the root access or slocal requirements. I'd definitely be
interested in hearing about it if you manage to do so.

GOOD STUFF
vvvvvvvvvv

So, how do I install this wonderful system?

Unzip this package in the directory of your choice (I would suggest
/usr/local/ifile) and then run the install script (if you do not have
perl installed in /usr/bin, you will need to edit 'install.perl' and
change the first line to the proper directory). It will ask you a few
questions as to where the mh refile binary is, where perl resides on
your system and where you want the scripts installed. Once it has this
information, it will copy the scripts to the specified directory, back
up the current refile program and make a link to irefile. It will also
add two lines to your .maildelivery file (creating it if necessary)
which will have slocal run the ifilter program.
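
To give a rough feel for what happens each time slocal runs ifilter,
here is a minimal sketch of word-frequency based bayesian filing. It is
an illustration only, not the actual ifile code (which is written in
perl). The function names, the in-memory table standing in for ".idata",
the add-one smoothing and the equal weighting of folders are all
assumptions made for the example.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def record(counts, folder, text):
    """Add one refiled message's word frequencies to that folder's tally."""
    counts[folder].update(tokenize(text))


def classify(counts, text):
    """Return the folder whose recorded word frequencies best fit the message."""
    # Vocabulary = every word seen in any folder so far.
    vocab = set()
    for tally in counts.values():
        vocab.update(tally)

    words = tokenize(text)
    best_folder, best_score = None, -math.inf
    for folder, tally in counts.items():
        total = sum(tally.values())
        # Sum of log P(word | folder); add-one smoothing (an assumption)
        # keeps unseen words from driving the probability to zero.
        score = sum(
            math.log((tally[w] + 1) / (total + len(vocab))) for w in words
        )
        if score > best_score:
            best_folder, best_score = folder, score
    return best_folder


# Hypothetical stand-in for the data kept in ".idata".
counts = defaultdict(Counter)
record(counts, "exmh", "exmh presort option and refile bug")
record(counts, "quake", "quake server up tonight bring friends")

print(classify(counts, "question about the exmh refile option"))
```

Every refile adds to one folder's tally, which is why a folder needs a
few refilings before guesses for it become reliable.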

There is one additional action you must perform to finish all the
connections. Within exmh, you must set your Incorporate Mail -> Ways to
Inc option to "presort".

Is that it?

Yup. Once you have completed those two minor tasks, ifile will be
installed. At this point, you can sit back and wait for the mail to roll
in. As you receive it, simply refile any messages which are filed into
the wrong folder. Each mailbox will probably require a few refilings
before ifilter begins to understand your preferences for that folder.
Folders which contain messages very similar to each other will be easy
for ifile to learn. Folders which have less of a central theme (say an
'inbox' or 'other' folder) will be more difficult to learn.

Why is this version 0.1Alpha?

Don't worry over this too much. There are still a lot of improvements I
would like to make, but the essentials of the package do work reasonably
well.

Currently, the two programs are written in perl, and require a good
number of calculations and file i/o. Thus they are slow compared to the
programs they replace. I am planning to port the programs to C in a
future version to increase efficiency.

Additionally, the data file which the system uses can become fairly
large. My current data file is about 50k after about 20 refilings. If
you find yourself refiling often you may end up with a data file nearing
a meg.

Finally, the only configuration which I can guarantee this program will
work on is mine. I run RedHat Linux 3.0.3 with kernel v2.0.7 and exmh
v1.6.7. I would like to find out how many different systems this version
successfully runs on. The more systems it runs on, the more stable I can
advertise it as.

Who wrote this?

Why, I did! 'I' am Jason Rennie, a sophomore majoring in Computer
Science at Carnegie Mellon University.
I've been programming for most of my life, starting with BASIC as a
youth, moving through Pascal during high school and then widening my
horizons in college with such languages as C/C++, perl and Java. Since
installing it in December of '95, I have become a big-time linux fan,
running it almost exclusively (I still have DOS for those occasional
Quake sessions :)

I've been doing quite a bit of work with learning algorithms, especially
text-based learning, over the summer and decided it would be cool to
apply one of the ideas to mail filtering. So, I did.

How do I contact this computer geek?

By e-mail, of course!

    jr6b+@andrew.cmu.edu

You can also check out my web page at "http://syrinx.res.cmu.edu/".

Why such an uncreative name?

Well, what can I say. I haven't put a lot of thought into it. I'm
probably going to be doing some soul searching :) to see if I can figure
out a better one (don't be surprised to see the name 'ifile' gone in a
version or two). Or, if any of you out there can think of a better name,
I'd consider using it; just drop me a line. Hey, I would even mention
somewhere in the readme that credit for the name goes to you! (how's
that for instant fame :)

If you do find this program helpful and would like to support it and see
it grow into a powerful, full-featured mail filtering system, please
fill out the following short survey and mail it to jr6b@andrew.cmu.edu:

------------------------->8--------------------------

What operating system do you run? :

What version of exmh do you have? :

What version of perl is on your system? :

What is your suggestion for a better name? :

Did you have any problems installing ifile? If so, please describe them
here:

Did you encounter any problems while running ifile (lost mail - poor job
of filtering)? If so, please describe them here (copies of .idata and
/tmp/irefile.info and /tmp/ifilter.info would be really helpful!) :

Do you have any suggestions on how to improve ifile?
(this would be most helpful if you have specific code changes)

----------------------8<---------------------------

Thank you for your interest!!!! :)