Copyright © 1997-2009 Jari Aalto
License: This material may be distributed only subject to the terms and conditions set forth in GNU General Public License v2 or later; or, at your option, distributed under the terms of GNU Free Documentation License version 1.2 or later (GNU FDL).
This page documents part of the files included in the Procmail Module Library project. Each module can be used as a plug-in. Some can be used as subroutines in your own programs and some are self-contained recipes. There are no user- or site-specific settings in these modules.
The design goal was to minimize the use of external processes. The modules use pure procmail wherever possible, e.g. to get the date directly from the message without calling the expensive shell command date. It is important to remember that procmail is run on every incoming message, so every CPU tick spent counts.
Document control
This document has been automatically generated from the procmail files with 2 small perl programs in the following manner:
    % perl -S ripdoc.pl `ls pm-ja*.rc|sort` > pm-lib.raw
    % perl -S t2html.pl \
        --html-frame \
        --base http://freecode.net/projects/procmail-lib \
        --button-previous http://freecode.net/projects/procmail-lib \
        --title "Procmail module documentation" \
        --author "Jari Aalto" \
        --meta-keywords "procmail, sendmail, programming, library" \
        --meta-description "Procmail plug-in module documentation" \
        --name-uniq \
        --Out \
        pm-lib.txt
The Perl programs assume that the documentation sections have been written in Technical Text Format. The Perl program ripdoc.pl can be found at the CPAN entry http://www.cpan.org/modules/by-authors/id/J/JA/JARIAALTO/ and t2html.pl is available at project perl-text2html.
Copyright © 1997-2009 Jari Aalto
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details at <http://www.gnu.org/copyleft/gpl.html>.
This includerc extracts the various components of an email address from variable INPUT. You can do quite a lot of interesting things with your email address. One trick, if you don't have sendmail plus addressing capabilities, is to put the additional information into the RFC comment. E.g. if you read and follow up to posts in Usenet games groups, you could use:
    From: <login@site.com> (John Doe+usenet.games)
Or, if the local part of your email address (the characters before @) already signifies your first name and surname, you don't need to repeat it in the comment. However, place the special marker "+" to mark the additional information part for your procmail recipes:
    From: <first.surname@site.com> (+usenet.games)
The use of an RFC comment should work everywhere, because the RFC requires that comments are preserved along with the address information. If you had sendmail plus addressing capabilities, you would use:
    From: <login+usenet.games@site.com> (John Doe)
The idea is that the list information is readily available from the email. The following recipe derives the plus information and uses it directly as the mailbox in which to drop the message. If "The Editor: Emacs" means anything to you, you can program it to generate the appropriate From headers automatically when you send mail from the Gnus Mail/Newsreader MUA. Drop me a message if you need an example of how a piece of Emacs Lisp code makes those magic RFC plus addresses in the background while you compose the body of the message.
    RC_EMAIL = $PMSRC/pm-jaaddr.rc
    TOME     = "(login1|login2)"

    :0
    *$ ^TO\/.*$TOME.*
    {
        INPUT     = $MATCH
        INCLUDERC = $RC_EMAIL
        PLUS      = $COMMENT_PLUS

        # If COMMENT_PLUS was defined, we found "+"
        # address which contain "usenet.games". Save it to
        # folder.

        :0 :
        * PLUS ?? [a-z]
        $PLUS
    }
1998-05 David Hunt dh@west.net also mentioned that "you need to remember that some MTAs, (qmail for one, and soon vmail) use a dash ( - ) as the subaddress delimiter. So you'll want to allow for that in your code". For this reason the email part accepts both "-" and "+". The RFC comment however accepts only "+" and "--".
    "From: foo+procmail@this.site.com (Mr. foo)"    traditional
    "From: foo-procmail@this.site.com (Mr. foo)"    new styled
NOTE: M$SOFT mailers tend to send idiotic smart quotes "'Mr. foo'" and this recipe ignores these two quotes ["'] as if message had only the standard ["]
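The comment handling described above can be illustrated outside procmail. The following is a minimal Python sketch, not the module's procmail code: the function name comment_plus and its regular expressions are assumptions made for illustration. It takes the text in parentheses (or, failing that, between double quotes) as the comment and returns whatever follows "+" or "--" inside it.

```python
import re


def comment_plus(from_value):
    """Return (comment, plus_part) for a From-style header value.

    Hypothetical helper: the comment is the text inside parentheses or,
    if there are none, the text between double quotes.
    """
    paren = re.search(r"\(([^)]*)\)", from_value)
    if paren:
        comment = paren.group(1)
    else:
        quoted = re.search(r'"([^"]*)"', from_value)
        comment = quoted.group(1) if quoted else ""

    # The extra information follows "+" or "--" inside the comment.
    plus = re.search(r"(?:\+|--)(\S+)", comment)
    return comment, (plus.group(1) if plus else "")
```

For example, comment_plus("login@site.com (John Doe+usenet.games)") yields the comment "John Doe+usenet.games" and the plus part "usenet.games".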
    ADDRESS       "foo+procmail@this.site.com"
                                   the email address without <>
    ACCOUNT       "foo+procmail"   all characters before @
    ACCOUNT1      "foo"            characters before plus: account1+account2@site
                                   Note, if there is no "+", this is same as ACCOUNT.
    ACCOUNT2      "procmail"       _only_ set if plus found: account1+account2@site
    SITE          "this.site.com"  all characters after @
    DOMAIN        "site.com"       the main domain; preceding words in site are
                                   considered subdomain (local) addresses:
                                   sub.sub.domain.net
    SUB           "this.site"      all the sub-domain names without the NET part
    SUB1          "site"           the first subdomain counted from the _RIGHT_ after NET
    SUB2          "this"           second subdomain
    SUB3          ""               third subdomain
    SUB4          ""               fourth subdomain
    NET           "com"            last characters after last period (net, com, edu ...)
    COMMENT       Anything inside parentheses (Mr. Foo) or, if no parentheses
                  are found, anything between quotes "Mr. Foo"
    COMMENT_PLUS  Anything after the "+" in the comment, like
                  "Mr Foo+mail.usenet" --> "mail.usenet"
                  Note: some MTAs don't allow the + character, so use '--' as an
                  alternative: "Mr Foo--mail.usenet" --> "mail.usenet"
Additionally there are variables DOT1 and DOT2, which behave like ACCOUNT1 and ACCOUNT2, but with respect to a dotted firstname.surname type address:
    john.doe@site.com
    ACCOUNT1 = john.doe
    ACCOUNT2 = <empty>
    DOT1     = john
    DOT2     = doe
If there is a plus, ACCOUNT2 is defined:
    john.doe+foo@site.com
    ACCOUNT1 = john.doe
    ACCOUNT2 = foo
    DOT1     = john     (in respect to ACCOUNT1)
    DOT2     = doe      (in respect to ACCOUNT1)
Variable ERROR is set to "yes" if INPUT wasn't recognized or parsing the address failed.
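To make the variable definitions above concrete, here is a minimal Python sketch that splits an address into the same named pieces. It only mirrors the documented results for the examples above; the function name split_address, the choice to leave DOT1 and DOT2 empty for an undotted account, and the error rule are assumptions, and the real module does this with procmail regular expressions.

```python
import re


def split_address(address):
    """Split 'local@site' into the pieces named in the variable list."""
    account, sep, site = address.partition("@")
    if not sep or not account or "." not in site:
        return {"ERROR": "yes"}

    # The local part may carry a sub-address after "+" (or "-" on some MTAs).
    match = re.match(r"([^+-]+)[+-](.+)$", account)
    account1, account2 = match.groups() if match else (account, "")

    labels = site.split(".")
    subs = labels[:-1][::-1]            # subdomains counted from the right
    dots = account1.split(".", 1)       # firstname.surname, if dotted

    return {
        "ERROR": "no",
        "ADDRESS": address,
        "ACCOUNT": account,
        "ACCOUNT1": account1,
        "ACCOUNT2": account2,
        "SITE": site,
        "DOMAIN": ".".join(labels[-2:]),
        "SUB": ".".join(labels[:-1]),
        "SUB1": subs[0] if len(subs) > 0 else "",
        "SUB2": subs[1] if len(subs) > 1 else "",
        "NET": labels[-1],
        "DOT1": dots[0] if len(dots) == 2 else "",
        "DOT2": dots[1] if len(dots) == 2 else "",
    }
```

Calling split_address("foo+procmail@this.site.com") gives ACCOUNT1 "foo", ACCOUNT2 "procmail", DOMAIN "site.com" and SUB "this.site", matching the list above.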
PMSRC must point to source directory of procmail code. This subroutine will include pm-javar.rc from there.
    INPUT = string-to-parse
Read the From field and parse the address from it. This is a lot faster than using an external formail call.
    PMSRC   = $HOME/pm
    RC_ADDR = $PMSRC/pm-jaaddr.rc

    :0
    * ^From:\/.*@.*
    {
        INPUT = $MATCH

        # Turn off the logging while executing this part

        VERBOSE="off"
        INCLUDERC = $RC_ADDR
        VERBOSE="on"

        :0
        * ERROR ?? yes
        {
            # Hmm, no std email address found. Any other ideas?
        }
    }
Preserve the last N arriving messages in a separate sub-directory. This should be your safety-belt recipe that you put at the beginning of your .procmailrc.
Procmail saves the backup files with names like: msg.rcG msg.scG msg.3YS1, msg.4YS1, msg.VYS1, msg.fYS1 to the backup directory.
Note: this recipe always calls shell commands for each message you receive. That is needed to clean up the backup directory. If you receive only a small number of messages per day, the performance drop of your .procmailrc is not crucial. But if you store many messages per day, the shell calls may become a performance problem.
In that case, consider moving the cleanup to the pm-jacron.rc module (the cleanup is then run only once a day, not for every message).
John Gianni sent his simple backup script to Jari, who packaged and generalized the code. The code is reused with John's permission, and maintenance responsibility was transferred to Jari.
You only want to keep backups of messages that are not from mailing lists. You may want to use the TO_ macro to detect addresses better; this example matches against all headers.
    LISTS      = "(procmail|list-1|list-2)"
    JA_BUP_DIR = $HOME/Mail/backup/.    # Create the path too
    JA_BUP_MAX = 42                     # this should be enough

    :0
    *$ ! $LISTS
    {
        INCLUDERC = $PMSRC/pm-jabup.rc
    }
If you get many messages, please don't use this module. Instead see pm-jacron.rc where similar backup work is done better.
Given a string, this subroutine returns a number that represents the string: a cookie.
    INPUT         = "foo@site.com"
    JA_COOKIE_CMD = "md5"       # or chksum
    INCLUDERC     = $PMSRC/pm-jacookie1.rc
    cookie        = $OUTPUT
This recipe handles generating the cookie for new users, comparing the returned cookie against the original one, and passing known users through if they have already returned their cookie.
When you run automated scripts, e.g. to manage mailing lists where users can subscribe and unsubscribe, you had better install a safety measure so that someone cannot subscribe his enemy to 30 mailing lists.
The cookie is any continuous block of random characters that is sent to the person who wanted to use the service. He must send back the cookie before the service starts an action, like subscribe. If someone forges the From address to pretend to be someone else and then subscribes to a mailing list as that someone else, the cookie prevents this from happening.
The cookie is sent to someone-else, and he must return the cookie before the "subscribe" service is activated. Obviously this someone-else will not be interested in sending back the cookie and thus the forgery fails. Isn't that a simple but effective protection against misuse?
Unsolicited Bulk Email aka Spam is crawling from every possible domain thinkable, so you might think that a challenge-response policy could be deployed to regular email communication as well. The idea would be that unknown people are requested to "join" to a white list, before discussion is initiated with them. Bulk email shotguns do not reply to challenges (here: cookies), so confirmations are not returned. Individual people that want to talk, may want to return the cookies.
Sounds like a perfect Unsolicited Bulk Email shield? No more uninvited mail? Wrong. Don't use this module for that. The whole idea of challenge-response is flawed and causes trouble for every person who tries to contact you. Imagine 10 people using C-R systems; they would all need to authenticate themselves. Who is going to believe that he is not replying to a spammer who is collecting email addresses? And what about automatic messages that might be received: there is no artificial intelligence to separate "human" messages from automatically generated messages, so challenges just increase the overall mail traffic. Every C-R system doubles the mail traffic and becomes a spam problem by itself.
In short, don't use this module for implementing a C-R system to block regular mail to you.
By default the cookie generated uses CRC 32 cksum, but if you have md5, you should use it. The cookie is generated from the reply address and immediately stored to cookie database file with entry
    DATE FROM-a COOKIE-a
    DATE FROM-b COOKIE-b
If this was a new user, or an old user who has not registered his cookie yet, then the original message is sent back to the sender with instructions: "please place the magic string to Subject line and resend the message."
When the cookie is returned, a new line is added to the database, simply by adding a duplicate entry. The file now looks like this:
    DATE FROM-a COOKIE-a
    DATE FROM-b COOKIE-b
    DATE FROM-a COOKIE-a
When there are two or more identical entries, like FROM-a, the address is considered known and the person behind it "cleared".
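The database rule described above (two or more identical entries mean a cleared address) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the cookie is an MD5 digest of the address via hashlib rather than the output of the cksum or md5 commands the module calls, and an address with a single entry is simply reported as "new-user".

```python
import hashlib


def make_cookie(address):
    """Derive a cookie from the reply address (MD5 here; an assumption)."""
    return hashlib.md5(address.encode("utf-8")).hexdigest()


def count_entries(db_lines, address, cookie):
    """Count 'DATE FROM COOKIE' lines that match this address and cookie."""
    hits = 0
    for line in db_lines:
        fields = line.split()
        if len(fields) == 3 and fields[1] == address and fields[2] == cookie:
            hits += 1
    return hits


def classify(db_lines, address):
    """Return 'known-user' once the address has two or more entries."""
    hits = count_entries(db_lines, address, make_cookie(address))
    return "known-user" if hits >= 2 else "new-user"
```

The first entry is written when the cookie is sent out and the second when it comes back, so a single lookup in the file is enough to decide whether the sender has already been cleared.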
PMSRC must point to source directory of procmail code. This subroutine will include
ERROR will contain the effective action when this recipe file ends
key is an internal variable in this recipe file and will hold the cookie id in case of "new-user" and "key-mismatch". You may want to use it if you generate your own reply.
This is what I use to prevent unknown people from sending me UBE. It takes a bit extra, but they can easily return the message. Fill in the missing variables, this won't work out of the box for you.
    WORK      = "(domain1|domain2|domain3)"
    LISTS     = "(procmail|list-2|list-3|list-4)"
    VALID     = "(postmaster|abuse|$LISTS|$WORK)"
    RC_COOKIE = $PMSRC/pm-jacookie.rc
    UBE_SPOOL = $HOME/Mail/junk.ube.spool    # Save spam here

    :0
    *$ ! From:.*$VALID
    *$ ! ^FROM_DAEMON
    {
        JA_COOKIE_SEND = "yes"    # Activate it
        INCLUDERC      = $RC_COOKIE

        :0 :
        * ! ERROR ?? known-user
        $UBE_SPOOL

        # ... Past this point: it was user in whitelist, so the
        # recipes after this block will take care of it
    }
    RC_COOKIE = $PMSRC/pm-jacookie.rc

    ...Mailing lists handled here...
    ...Your work messages filed here..

    TO           = `formail -rt -zxTo:`    # We need this elsewhere
    JA_COOKIE_TO = $TO

    # For List-X all subscribe requests must
    # be confirmed

    :0
    * ^TO_()list-x
    * ^Subject: +subscribe\>
    {
        JA_COOKIE_SEND = "no"
        INCLUDERC      = $RC_COOKIE

        :0
        * ERROR ?? known-user
        {
            # User sent the subscribe request again, allow joining
            # immediately.
        }

        :0 E
        {
            # Because the Send was set to "no"; we're in charge
            # to send a reply to the user.
            # ...generate suitable message with formail -rt
        }
    }

    # End of example
Framework for all cron tasks that can be run once a day. This is a wrapper recipe for your cron task list: when the day changes, your cron includerc is called.
PMSRC must point to source directory of procmail code. This recipe will include
A file JA_CRON_RUN_FLAG which defaults to ~/.yymmdd.run is created when your includerc, that contains list of cron tasks, is run. If new mail arrives while your cron recipes are still running, you should prevent invoking the cron again by checking if this file exists. When all the cron tasks have been run, this flag file is removed. Remember to use "w" flag in your cron recipes where necessary to serialize the work.
Save backups to a separate directory, but do the cleaning only once a day. We do not keep backups of mailing list messages.
    LISTS      = "(procmail|list-1|list-2)"
    BACKUP_DIR = "$HOME/Mail/backup/."

    # Store backups: separate files to directory

    :0 c:
    *$ ! $LISTS
    $BACKUP_DIR

    # Run JA_CRON_RC once a day. It contains all daily cron tasks

    CRON_RC          = $PMSRC/pm-jacron.rc    # the framework
    JA_CRON_RC       = $PMSRC/pm-mycron.rc    # the tasks to do
    JA_CRON_RUN_FLAG = $HOME/.cron-running    # define this!

    # Do not enter here if message arrived at the same day when
    # the cron is already running. The CRON_RC takes care
    # of deleting the file when cron has finished.

    :0
    *$ ! ? $IS_EXIST $JA_CRON_RUN_FLAG
    {
        INCLUDERC = $CRON_RC
    }
The pm-mycron.rc file may contain anything. For example, to clean the backup directory you add these statements there:
    # rm dummy: if ls doesn't return files, make sure rm has
    #           at least one argument.
    #
    # ls -t:    list files; newest first
    #
    # sed:      chop $max newest files from the listing, leaving the
    #           old ones

    max = 32

    :0 hwic
    | cd $BACKUP_DIR && $RM -f dummy `ls -t msg.* | $SED -e 1,${max}d`

    # End of file pm-mycron.rc
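The selection done by the ls and sed pipeline above (sort newest first, skip the first max entries, delete the rest) is easy to get wrong by one, so here is the same rule as a small Python sketch. The function name stale_backups and the (name, mtime) input format are assumptions for illustration; the recipe itself keeps using the shell commands.

```python
def stale_backups(entries, keep=32):
    """Return the names to delete: everything except the `keep` newest.

    `entries` is a list of (name, mtime) pairs, for example collected
    with os.scandir() from the backup directory.
    """
    # Newest first, which is the order "ls -t" prints.
    newest_first = sorted(entries, key=lambda entry: entry[1], reverse=True)
    # Dropping the first `keep` items mirrors "sed -e 1,${max}d".
    return [name for name, _mtime in newest_first[keep:]]
```

With three files and keep=2, only the oldest file is returned for deletion; with fewer files than keep, the result is empty, which is the case the dummy argument to rm guards against in the shell version.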
When you send a message to an address that has delivery trouble, you get a DAEMON message back explaining the error. I usually want to save these daemon messages to a different folder and check the folder from time to time. A typical daemon message looks like this (shortened):
    From: Mail Delivery Subsystem <MAILER-DAEMON@my.domain.com>
    Subject: Warning: could not send message for past 4 hours

    The original message was received at...

    ----- Transcript of session follows -----
    Deferred: Connection timed out

    ----- Original message follows -----
    [YOUR MESSAGE AS YOU SENT IT WITH HEADERS]
When I read the subjects, I do not like the standard error messages; I would rather know to which address the delivery failed and what the original subject was. This small recipe changes the daemon message's Subject to
    Subject BRIEF-ERROR-REASON, SENT-TO-ADDRESS, ORIGINAL-SUBJECT
and from that you can immediately tell if you should be worried. E.g. if SENT-TO-ADDRESS was your friend's, then you want to take action immediately, but if it was your complaint about a UBE message to a postmaster, you don't want to bother reading that daemon message. Here are some real examples:
    fatal errors,postmaster,ABUSE (Was: Super Cool Site!)
    Host unknown,postmaster,ABUSE (Was: A-Credit Information)
    undeliverable,postmaster,Could you investigate this spam
    Warning-Returned,friend,Have you looked at this
PMSRC must point to source directory of procmail code. This subroutine needs scripts
Just add this recipe somewhere in your .procmailrc. The place where you put this daemon message trapper subroutine is crucial: think carefully about how you order your recipes. One suggested order could be: backup important messages, cron-subroutine, handle duplicates, DAEMON MESSAGES, plus addressed messages, server messages (file server, ping responder...), MAILING LISTS, send possible vacation replies only after all of the above, apply kill file, detect mime, save private messages and last FILTER UBE.
    PMSRC       = $HOME/pm
    RC_DAEMON   = $PMSRC/pm-jadaemon.rc
    DAEMON_MBOX = $HOME/Mail/junk.daemon.mbox

    ...

    INCLUDERC = $RC_DAEMON

    :0 :            # If that was a daemon message, save it
    * ERROR ?? yes
    $DAEMON_MBOX
This includerc parses the date from variable INPUT, which holds a string of the form
    "Week, Daynbr Month Year"
Example input
    "Tue, 31 Dec 1997"    -- with comma
    "Tue 31 Dec 1997"     -- without comma
Returned values
    YYYY = 4 digits
    YY   = 2 digits
    MON  = 3 characters
    MM   = 2 digits
    DAY  = 3 characters
    DD   = 2 digits
    hh   = 2 digits      If available
    mm   = 2 digits      If available
    ss   = 2 digits      If available
    TZ   = 5 characters  If available
Variable ERROR is set to yes if it couldn't recognize the INPUT and couldn't parse the basic YYYY, YY, MM, DD variables.
PMSRC must point to source directory of procmail code. This subroutine will include pm-javar.rc from there.
    INPUT = string-to-parse
The INPUT can have anything after "Week, dayNbr Month Year", or before it: you can pass a string like "Thu, 13 Nov 1997 11:43:23 +0200".
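As an illustration of the accepted input and the returned values, the following Python sketch parses the same "Week, Daynbr Month Year" shape with one regular expression. It is a stand-in for explanation only: the names WDMY and parse_wdmy, the exact pattern, and the zero-padding of a one-digit day are assumptions, not a transcription of the procmail code.

```python
import re

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

# Weekday, optional comma, day number, month name, year,
# then an optional hh:mm:ss and an optional +hhmm time zone.
WDMY = re.compile(
    r"(?P<DAY>[A-Z][a-z]{2}),?\s+(?P<DD>\d{1,2})\s+"
    r"(?P<MON>[A-Z][a-z]{2})\s+(?P<YYYY>\d{4})"
    r"(?:\s+(?P<hh>\d{2}):(?P<mm>\d{2}):(?P<ss>\d{2}))?"
    r"(?:\s+(?P<TZ>[+-]\d{4}))?"
)


def parse_wdmy(text):
    """Return the date variables as a dict, or ERROR=yes if not found."""
    match = WDMY.search(text)
    if not match or match.group("MON") not in MONTHS:
        return {"ERROR": "yes"}
    out = {key: (value or "") for key, value in match.groupdict().items()}
    out["DD"] = out["DD"].zfill(2)
    out["YY"] = out["YYYY"][2:]
    out["MM"] = "%02d" % (MONTHS.index(out["MON"]) + 1)
    out["ERROR"] = "no"
    return out
```

Because the pattern is searched for anywhere in the string, leading text such as the rest of a Received header does not get in the way, and the time and zone fields are simply empty when the input stops after the year.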
The first Received header will tell when the message was received by your mailserver. We parse the date and avoid calling expensive date command.
    PMSRC        = $HOME/pm
    RC_DATE_WDMY = $PMSRC/pm-jadate1.rc    #Week-Day-Month-Year parser

    # Get time from first header, it ends like this:
    #
    # Received: ... ; Thu, 13 Nov 1997 11:43:50 +0200

    :0
    *$ ^Received:.*;$s+\/...,$s+$d.*
    {
        INPUT = $MATCH

        # Turn off the logging while executing this part

        VERBOSE=off
        INCLUDERC = $RC_DATE_WDMY
        VERBOSE=on

        :0
        * ERROR ?? yes
        {
            # Use some other way to get the time or shout loudly
        }
    }
This includerc parses a date in the format "YYYY-MM-DD hh:mm:ss", like 1997-12-01, and sets the following variables whenever called:
    YYYY = 4 digits
    YY   = 2 digits
    MON  = 3 characters
    MM   = 2 digits
    DD   = 2 digits
    hh   = 2 digits      If available
    mm   = 2 digits      If available
    ss   = 2 digits      If available
Variable ERROR is set to yes if it couldn't recognize the INPUT and couldn't parse the basic YYYY, YY, MM, DD variables.
PMSRC must point to source directory of procmail code. This subroutine will include pm-javar.rc from there.
    INPUT = string-to-parse
The last string in INPUT that matches the number sequence NNNN-NN-NN is parsed.
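The "last match wins" rule is the part that is easy to overlook, so here is a short Python sketch of it. The names ISO and parse_iso and the month range check are assumptions made for this illustration; the module itself is written in procmail.

```python
import re

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

# NNNN-NN-NN, optionally followed by hh:mm:ss.
ISO = re.compile(r"(\d{4})-(\d{2})-(\d{2})(?:\s+(\d{2}):(\d{2}):(\d{2}))?")


def parse_iso(text):
    """Parse the last ISO-looking date in text into the date variables."""
    found = ISO.findall(text)
    if not found:
        return {"ERROR": "yes"}
    yyyy, month, dd, hh, minute, ss = found[-1]    # last match wins
    if not 1 <= int(month) <= 12:
        return {"ERROR": "yes"}
    return {
        "ERROR": "no",
        "YYYY": yyyy,
        "YY": yyyy[2:],
        "MON": MONTHS[int(month) - 1],
        "MM": month,
        "DD": dd,
        "hh": hh,
        "mm": minute,
        "ss": ss,
    }
```

If INPUT contains several dates, for instance a quoted date followed by the real one, only the final one is reported; the time fields stay empty when no hh:mm:ss follows it.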
    PMSRC       = $HOME/pm
    RC_DATE_ISO = $PMSRC/pm-jadate2.rc    # ISO date parser
    INPUT       = "This is 1800-10-11, a very old date"

    # Turn off the logging while executing this part

    VERBOSE="off"
    INCLUDERC=$RC_DATE_ISO
    VERBOSE="on"
This includerc parses the date from variable INPUT, which holds a string of the form
    "Week, Month dayNbr hh:mm:ss yyyy"
Example
    Tue Nov 25 19:32:57 1997
Returned values
    YYYY = 4 digits
    YY   = 2 digits
    MON  = 3 characters
    MM   = 2 digits
    DAY  = 3 characters
    DD   = 2 digits
    hh   = 2 digits
    mm   = 2 digits
    ss   = 2 digits
Variable ERROR is set to "yes" if it couldn't recognize the INPUT.
PMSRC must point to source directory of procmail code. This subroutine will include pm-javar.rc from there.
    INPUT = string-to-parse
The first Received header will tell when the message was received by the mailserver. Parse the date and avoid calling expensive date command.
    PMSRC        = $HOME/pm
    RC_DATE_WMDT = $PMSRC/pm-jadate4.rc    #Week-Month-Day-Time parser

    # Get time from X-From-Line: Which was added by my MDA
    # X-From-Line: procmail-request@informatik.rwth-aachen.de \
    #   Tue Nov 25 19:32:57 1997

    :0 c
    *$ ^X-From-Line:\/.*
    {
        INPUT = $MATCH

        # Turn off the logging while executing subroutine

        VERBOSE=off
        INCLUDERC = $RC_DATE_WMDT
        VERBOSE=on

        :0
        * ERROR ?? yes
        {
            # Use some other way to get the time or shout loudly
        }
    }
This subroutine calls the shell command date once and parses the values. This should be your last resort if you haven't got the date values by any other means. This subroutine assumes that the date command knows the following % specifier formats (HP-UX):
    Y   NNNN    year
    h   MON     month
    d   NN      day
    a   WEEK    Like "Mon"
    H   NN      hour
    M   NN      min
    S   NN      sec
Returned values
    DATE = RFC date in format "Mon, 1 Dec 1997 17:41:09"
           This is same as what you would see in From_
    YYYY = 4 digits
    YY   = 2 digits
    MON  = 3 characters
    MM   = 2 digits
    DAY  = 3 characters
    DD   = 2 digits
    hh   = 2 digits
    mm   = 2 digits
    ss   = 2 digits
Variable ERROR is set to "yes" if the values couldn't be set.
PMSRC must point to source directory of procmail code. This subroutine will include
The first Received line tells when the message was received by the MDA. If that fails, get the date from the system. If you send test messages to yourself, you don't usually put a From_ header in them, and thus there is no date information in 'dry run' tests.
    # Get time from first header, which is always same in my system
    # Received: ... ; Thu, 13 Nov 1997 11:43:50 +0200

    INCLUDERC = $PMSRC/pm-javar.rc    # to get $s $d definitions
    TODAY                             # Clear it

    :0
    *$ ^Received:.*;$s+\/...,$s+$d.*
    {
        INPUT     = $MATCH
        INCLUDERC = $PMSRC/pm-jadate1.rc
        TODAY     = "$YYYY-$MM-$DD"
    }

    # Check that variable did get set, if not then we have to call
    # another date subroutine: Call shell then to find out date
    #
    # You could also do this with ':0 E', but this is more
    # educational

    :0
    *$ ! $TODAY^0
    {
        INCLUDERC = $PMSRC/pm-jadate4.rc    # Get date from Shell then
        TODAY     = $YYYY-$MM-$DD
    }
This includerc parses the date from variable INPUT, which holds a string of the form
    "WeekDay Month dayNbr Year"
Example input
    "Fri Jun 19 18:51:56 1998"     -- without comma
    "Fri, Jun 19 18:51:56 1998"    -- with comma
Returned values
    YYYY = 4 digits
    YY   = 2 digits
    MON  = 3 characters
    MM   = 2 digits
    DAY  = 3 characters
    DD   = 2 digits
    hh   = 2 digits      If available
    mm   = 2 digits      If available
    ss   = 2 digits      If available
    TZ   = 5 characters  If available
Variable ERROR is set to "yes" if it couldn't recognize the INPUT and couldn't parse the basic YYYY,YY,MM,DD variables.
PMSRC must point to source directory of procmail code. This subroutine will include pm-javar.rc from there.
    INPUT = string-to-parse
The INPUT can have anything after "Week, dayNbr Month Year", or before it: you can pass a string like "Fri Jun 19 18:51:56 1998 11:43:23 +0200".
The first Received header will tell when the message was received by your mailserver. We parse the date and avoid calling expensive date command.
    PMSRC        = $HOME/pm
    RC_DATE_WDMY = $PMSRC/pm-jadate5.rc    #Week-Day-Month-Year parser

    # Get time from first header, it ends like this:

    :0
    *$ ()\/From .*
    {
        INPUT = $MATCH

        # Turn off the logging while executing this part

        VERBOSE=off
        INCLUDERC = $RC_DATE_WDMY
        VERBOSE=on

        :0
        * ERROR ?? yes
        {
            # Use some other way to get the time or shout loudly
        }
    }
This recipe scans several headers to find the date string. When a suitable header is found and the parsing has succeeded, the return variables are set. The date values reflect the arrival time of the message, not the sending time. If nothing works, a shell call to date is used as a last resort.
Returned values
    YYYY = 4 digits
    YY   = 2 digits
    MON  = 3 characters
    MM   = 2 digits
    DAY  = 3 characters
    DD   = 2 digits
    hh   = 2 digits      if available
    mm   = 2 digits      if available
    ss   = 2 digits      if available
PMSRC must point to source directory of procmail code. This subroutine will include
    INCLUDERC = $PMSRC/pm-jadate.rc
    # now we have all date variables that we need
    # $TODAY = $YYYY-$MM-$DD
This recipe stores duplicate messages in a separate folder.
PMSRC must point to source directory of procmail code. This subroutine will include
For simple usage, just put this somewhere after the backup recipes:
    RC_DUP = $PMSRC/pm-jadup.rc
    ...
    INCLUDERC = $RC_DUP
When you are testing, you send the same messages over and over to your procmailrc, so those messages should not be trapped by the duplicate check. You can call procmail with option "-a test", which sets the pseudo variable $1. The recipe below sets flag JA_ID_IGNORE to "yes" if a test is ongoing, so that the duplicate filter is bypassed.
    RC_DUP = $PMSRC/pm-jadup.rc
    ARG    = $1        # Copy pseudo variable to $ARG

    :0
    * ARG ?? test
    {
        JA_ID_IGNORE = "yes"
    }

    # Some microsoft product is known to send same message ids
    # over and over. If we detect one, turn off the duplicate test,
    # because it would trash every message.
    # <MAPI.Id.0016.00666479202020203030303430303034@MAPI.to.RFC822>

    :0
    * ! ^X-msmail
    * ! ^Message-ID: *<MAPI.*@MAPI.to.RFC822>
    {
        JA_ID_IGNORE = "yes"
    }

    # Run this command every time a duplicate message is found.
    # It writes a small log entry to MY_LOG

    INCLUDERC = $RC_DUP

    :0 hwic:
    * ERROR ?? yes
    | echo " [duplicate]" >> $BIFF
This simple includerc sets the variable BODY_EMPTY to "yes" or "no". You can file empty messages to a separate folder based on this value:

    INCLUDERC = $PMSRC/pm-jaempty.rc

    :0
    * BODY_EMPTY ?? yes
    the-empty-mail-folder

This module is designed mainly to be used as a part of other modules. If you just want to check for an empty message, a simpler recipe like this might be better:

    INCLUDERC = $PMSRC/pm-javar.rc

    :0 B:               # if body has only whitespace characters
    *$ ! $NSPC
    the-empty-mail-folder
(none)
This includerc extracts the most likely FROM address from the message. The headers are searched in the order Reply-to, From_, Sender, From; if none of them yields an address, formail is called as a last resort. You would usually use the returned value for logging purposes.
Avoiding the extra formail call can be useful if you receive a lot of messages per day.
Example input

    (none)

Returned values

    OUTPUT, containing the derived FROM field

PMSRC must point to the source directory of procmail code. This subroutine will include pm-javar.rc from there. You need procmail 3.11pre7 in order to use this subroutine (because of the formail -z switch).

    INCLUDERC = $PMSRC/pm-jafrom.rc
    FROM      = $OUTPUT     # now we have the 'best' FROM field
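The search order described above can be illustrated offline with a short Python sketch. This is not how the module works (it uses procmail conditions and only falls back to formail); the function name `best_from` and the sample addresses are hypothetical and serve only to show the priority Reply-To, From_, Sender, From.

```python
from email import message_from_string
from email.utils import parseaddr


def best_from(raw_message: str) -> str:
    """Return the first address found in Reply-To, From_, Sender, From."""
    msg = message_from_string(raw_message)

    reply_to = parseaddr(msg.get("Reply-To", ""))[1]
    if reply_to:
        return reply_to

    # From_ is the mbox envelope line: "From user@host <date>"
    envelope = (msg.get_unixfrom() or "").split()
    if len(envelope) >= 2 and "@" in envelope[1]:
        return envelope[1]

    for header in ("Sender", "From"):
        address = parseaddr(msg.get(header, ""))[1]
        if address:
            return address
    return ""


SAMPLE = (
    "From envelope@a.example Thu Sep 17 11:42:00 1998\n"
    "From: Foo Bar <foo@b.example>\n"
    "Sender: owner@c.example\n"
    "\n"
    "body\n"
)
print(best_from(SAMPLE))  # envelope@a.example
```

Running something like this against a few saved messages is a cheap way to predict what value to expect in OUTPUT before relying on it in log entries.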
This includerc makes it possible to control your message forwarding with a simple remote email message. Thanks to Era Eriksson and Timothy J Luoma, who gave the initial idea for this forwarding module on the procmail mailing list on 1997-10-07.
If you want to activate forwarding from the local site where this module is installed, you can simply write the forward address to the file pointed to by JA_FWD_FILE, which is ~/.forward-address by default.

    % echo Me@somewhere.com > ~/.forward-address

When you no longer need forwarding, remove that file. This module is not really meant for that purpose, though, because it is a lot easier to write

    :0
    ! Me@somewhere.com

as the first statement in your .procmailrc when you want to forward your mail to another account.
Suppose you're on the road and suddenly realize that you want your mail forwarded to your current account. You then send the following control message:

    Subject: forward-on password new-address@bar.com
    To: my-account@bar.com
    From: onTheRoad@some.com

That message is enough to get the mail forwarded to the address new-address@bar.com. This script replies to the From address, confirming that forwarding now points to "new-address@bar.com".
To turn forwarding off, the message is very similar, but the Subject header says

    Subject: forward-off password

No other fields are checked, not even Reply-To. In this case the confirmation message is sent directly back to the From address.
You may have no control over the headers of the email, e.g. when you send a GSM-Mail message from your phone to your account:

    EMAIL foo@bar.com FORWARD-ON PASSWORD new-address@bar.com

The email message looks like this:

    From: GenEmail <sms@FooBar.net>
    Date: Thu Sep 17, 11:42am +0200
    To: "'Foo.Bar'" foo@bar.com
    Subject: Message 03384874987

    FORWARD-ON PASSWORD new-address@bar.com

Instead of looking at the Subject field, this module can look at the first words of the body. See the variable JA_FWD_CONTROL_FIELD, which you would set to "body".
If you only have persistent accounts, set JA_FWD_FROM_MUST_MATCH to match the addresses that you have. The following setting says that only control messages sent from these addresses are accepted. Nobody else can change your forwarding settings.

    JA_FWD_FROM_MUST_MATCH = ".*(acc1@a.com|acc2@b.com)"

That is not bulletproof, because someone may in theory forge the From address. You should probably also set the following variable to match the accounts where the mail may legitimately be forwarded. Then, even if an impostor forges the From address, the email cannot be forwarded anywhere other than the valid locations.

    JA_FWD_TO_MUST_MATCH = $JA_FWD_FROM_MUST_MATCH

Consider also setting JA_FWD_PASSWORD_CASE to the procmail flag D, which makes the control word "forward-on" and the password case sensitive.
If you don't receive a confirmation message, your control message was ill-formed or you are not in the JA_FWD_FROM_MUST_MATCH list. No notification is sent on failure, so that an attacker cannot draw conclusions.
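The shape of the control line ("forward-on password address" or "forward-off password") can be checked offline with a small Python sketch. It only mirrors the format described above and is not the module's implementation; the function `parse_control` and its `case_sensitive` parameter (standing in for the effect of the D flag in JA_FWD_PASSWORD_CASE) are hypothetical names.

```python
def parse_control(subject, password, case_sensitive=True):
    """Return ("on", address), ("off", None), or None if the line is rejected."""
    words = subject.split()
    if len(words) < 2:
        return None
    command, given = words[0], words[1]
    if not case_sensitive:
        # Without the D flag, control word and password match in any case
        command, given, password = command.lower(), given.lower(), password.lower()
    if given != password:
        return None
    if command == "forward-on" and len(words) == 3:
        return ("on", words[2])
    if command == "forward-off" and len(words) == 2:
        return ("off", None)
    return None


print(parse_control("forward-on password new-address@bar.com", "password"))
print(parse_control("FORWARD-ON PASSWORD new-address@bar.com", "password",
                    case_sensitive=False))
```

A rejected line simply yields None, which matches the silent failure described above: nothing is reported back to the sender.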
PMSRC must point to the source directory of procmail code. This subroutine will include
You should preset all necessary variables before adding the includerc command to your .procmailrc. Here is one simple setup:

    #JA_FWD_SENDMAIL     = "tee $HOME/test.mail"  # Uncomment if testing
    JA_FWD_COPY          = no                     # no copies stored while forwarding
    JA_FWD_PASSWORD_CASE = "D"                    # case sensitive
    JA_FWD_PASSWORD      = "MyMagicString"
    JA_FWD_FROM          = $FROM                  # This is already known.

    INCLUDERC = $PMSRC/pm-jafwd.rc

When you set the forwarding from a remote site, be very careful when you type the forward address, or your mail ends up in somebody else's mailbox. I also recommend that you keep JA_FWD_COPY set to yes, so that your local account always keeps a copy of each forwarded message.
A step further would be conventional encryption of your forwarded messages with encrypt(1). This way even your top secret messages would be mostly safe even if they ended up in someone else's mailbox.
tinybm.el/&tags and tinytab.el for the 4 tab text placement.
This subroutine tries to detect and derive the mailing list name as it appears in some of the known methods that ezmlm, SmartList, listserv, majordomo etc. normally use. After this subroutine has been applied to a message, the variable LIST contains the mailing list name. The subroutine adaptively finds new mailing lists from the messages.
The alternative to subscribing to many mailing lists is to read them from web archives. An even better way is to use the NNTP server at http://www.gmane.org, which allows you to post as you would to a regular newsgroup. Consider using the NNTP interface; it may save you from receiving a lot of messages that can already be found on Gmane's server.
If you just want to jump in and use this module and you notice that some list isn't trapped, please set
If you want to make some list more unique, like if name "Alert" was detected as a list name, please set
If you can use sendmail-type PLUS addressing, you may not be interested in this module, because you have an alternative way to handle mailing list messages. The extra information after "+" is available to procmail scripts via the $ARG pseudo variable when procmail is the LDA. Suppose you want to subscribe to the procmail mailing list and save all messages to the folder list.procmail; then you would subscribe with the address:

    login+list.procmail@site.com

If your email host does not provide plus addressing, the traditional approach has been to add a recipe to ~/.procmailrc to catch each list. But that's manual work for every list. When you use this subroutine, you no longer need to write a separate mailing list recipe in your ~/.procmailrc every time you subscribe to a new mailing list. The detection of a new list happens automatically.
There are a lot of heuristics in this module, and there is one thing that you must note:

    If 'To:' domain is same as `From/Sender:/Reply-to:' domain
    then it is considered a mailing list message.

This causes certain messages to be treated as mailing list messages. The module can't possibly know that the following is not from a mailing list, because it doesn't know what a mailing list is, only what one probably looks like. This message is definitely categorized as a mailing list message, because From and even Reply-to have the same domain foo.bar.net as To.

    To: support@foo.bar.net
    From: message@foo.bar.net
    Reply-to: support@foo.bar.net
    Subject: Vmail See message to Eric

You must prevent checking messages like this by surrounding the call to this subroutine with a check statement:

    # Do not check these messages
    noList = "From.*(foo.bar.net|support.my.com)"

    :0
    *$ ! $noList
    {
        INCLUDERC = $RC_LIST
        # ... save message by examining variable LIST (which see)
    }
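To see in advance which of your saved messages would trip the same-domain rule, the rule can be approximated offline in Python. This is a simplified sketch of the stated heuristic only, not of the module's full detection logic; `looks_like_list` and `domain` are hypothetical helper names.

```python
from email import message_from_string
from email.utils import parseaddr


def domain(header_value):
    """Return the lowercased domain part of the first address, or ''."""
    address = parseaddr(header_value or "")[1]
    return address.partition("@")[2].lower()


def looks_like_list(raw_message):
    """True if the To domain equals the From, Sender or Reply-To domain."""
    msg = message_from_string(raw_message)
    to_domain = domain(msg.get("To"))
    if not to_domain:
        return False
    return any(domain(msg.get(name)) == to_domain
               for name in ("From", "Sender", "Reply-To"))


SUPPORT_MAIL = (
    "To: support@foo.bar.net\n"
    "From: message@foo.bar.net\n"
    "Reply-to: support@foo.bar.net\n"
    "Subject: Vmail See message to Eric\n"
    "\n"
)
print(looks_like_list(SUPPORT_MAIL))  # True
```

Messages for which this prints True but which are not list traffic are candidates for a guard such as the noList check.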
If you find mailing lists that this subroutine does not detect, but which could have been detected by looking at the headers in a standard way, please send an email to the maintainer. There may be cases where it is impossible to detect the mailing list, and in those cases you just have to add a new entry to your ~/.procmailrc. When you keep your procmail log running, you may see the message

    *** potential list ***

which is an indication that a new recipe could be added to this subroutine to detect that mailing list. If the message you received was from a mailing list, please send all the headers to the maintainer so that support can be added.
You can search for mailing lists that interest you at:

    http://www.lsoft.com/lists/listref.html

Mailman, a Python-based mailing list manager:

    http://www.list.org/

Bill Houle sent interesting headers which required more heuristics than was otherwise feasible for list detection. From the headers below it is practically impossible to derive the original list name. So the list name is constructed artificially by combining the login name from Reply-To with the first host name of the Errors-To field.

    Reply-To: news@doodle.foo.net
    Errors-To: bounced@doodle.foo.net

The list name formed is "news-doodle". So if you happen to see an odd name like this, which doesn't resemble the original list name, it may be due to poor headers that carry no clue about the real name. No problem; see below how to convert this name to a better mailbox name.
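The way such an artificial name is put together can be sketched in a few lines of Python. The sketch only reproduces the combination rule stated above (login from Reply-To, first host name from Errors-To); `artificial_list_name` is a hypothetical name, and the module itself does this with procmail matching.

```python
def artificial_list_name(reply_to, errors_to):
    """Combine the Reply-To login with the first host name of Errors-To."""
    login = reply_to.partition("@")[0]
    first_host = errors_to.partition("@")[2].partition(".")[0]
    return f"{login}-{first_host}"


print(artificial_list_name("news@doodle.foo.net", "bounced@doodle.foo.net"))
# news-doodle
```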
PMSRC must point to the source directory of procmail code. This subroutine will include the following extra module, which must have been installed.
This is a regexp of sender addresses to ignore, so that if To and From are identical, the message is not considered a list message. This is typical for system-generated messages that take the form:

    From: root@host (Cron Daemon)
    To: root@host

If set to "yes", the detected list name information is saved to a separate header. LIST_DETECTED is the original word grabbed from the headers and LIST is the final name after possible list name conversions. According to the RFC, the X- prefix can be used for user headers.

    X-List-Detected: $LIST_DETECTED mapped to $LIST
If the grabbed LIST matches this regexp at the end of the list name, the matching postfix is removed. Many list names traditionally look like list1-info, list2-beta, list3-L, and it would be preferable to see names like list1, list2 and list3. The default value will ditch "-(info|beta|L)".
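Before changing the postfix regexp it can help to try it against your list names offline. The Python sketch below assumes that the regexp is anchored at the end of the name and matched case-insensitively (procmail regexps ignore case unless the D flag is given); `kill_postfix` is a hypothetical helper, not part of the module.

```python
import re

DEFAULT_KILL_POSTFIX = "-(info|beta|L)"  # default value quoted in the text


def kill_postfix(list_name, postfix_re=DEFAULT_KILL_POSTFIX):
    """Remove a trailing postfix that matches postfix_re, ignoring case."""
    return re.sub(f"(?:{postfix_re})$", "", list_name, flags=re.IGNORECASE)


for name in ("list1-info", "list2-beta", "list3-L", "jde"):
    print(name, "->", kill_postfix(name))
```

Names without a matching postfix, such as jde, pass through unchanged.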
Just like the postfix variable: if this string is matched at the beginning of LIST, it is removed.
In some cases this list detection recipe "thinks" that the address picked is the list sender. You may have a dedicated address where all your mailing list mail arrives, named something like mailing-list@me.here.at. This effectively triggers the reasoning: "Ah, you have -list in the email address, so this message must be from a mailing list named 'mailing'." Of course it is not, and you have to stop the heuristics from making that assumption by defining a regexp that rejects the choice. For the above example, you would define:

    JA_LIST_DISREGARD_EMAIL = "mailing-list@me.here.at"

If you have several such addresses, just add them to the variable, separated with the normal regular expression OR operator "|".
This is an optional variable, which you can set to a regexp matching the mailing list's domain address if the list slipped through the tests in this module. Some lists send messages that don't carry enough information in the headers to determine their list status. If you narrow the group by setting JA_LIST_HEADER_REGEXP, then lists like the following, which identify themselves only through two headers, can be found:

    Reply-To: dispatch-faq@cnet.com
    From: CNET Digital Dispatch <dispatch@cnet.com>

For that list you would set

    JA_LIST_HEADER_REGEXP = "(@cnet\.com)"

Don't worry: all the other list detection recipes have already been tried by then, so this is the last test carried out, and JA_LIST_HEADER_REGEXP helps eliminate possible mishits.
You don't need to set this variable to include all mailing list domains, only those that were not trapped. The default value is:

    "(amazon\.com|bookpool\.com)"
If you're subscribed to many mailing lists that simply call themselves news or newsletter, it is impossible to differentiate A's news from B's news. This variable holds a regular expression that, if matched, prepends the first host name to the beginning of the list name, thus making the list name unique:

    news@some.com   --> some-news
    news@here.com   --> here-news

The default value matches lists that contain the word news, but you may need to extend it to match more.
Note: before using this feature, make sure your LINEBUF is big enough, say 4096; otherwise the variable's content is truncated.
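The renaming shown in the two arrows above can be tried offline with a small Python sketch. It assumes that the regexp is tested against the grabbed list name and that the first host name is taken from the list address; `make_unique` and its `unique_re` parameter are hypothetical names for illustration, not the module's variables.

```python
import re


def make_unique(address, list_name, unique_re="news"):
    """Prepend the first host name when list_name matches unique_re."""
    if not re.search(unique_re, list_name, re.IGNORECASE):
        return list_name
    first_host = address.partition("@")[2].partition(".")[0]
    return f"{first_host}-{list_name}" if first_host else list_name


print(make_unique("news@some.com", "news"))  # some-news
print(make_unique("news@here.com", "news"))  # here-news
```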
Often the grabbed LIST name is not what you would like to use for your mailbox name. You may want to make the name shorter or more descriptive, or categorize the messages according to a hierarchy. Let's say that you have subscribed to the following mailing lists:

    LIST            LIST name       Description of mailing list
    (as grabbed)    you want
    -------------------------------------------------------------
    jde             java.jde        Java Development Env
    java            java.lang       Java programming
    FLAMENCO        flamenco        Flamenco music
    tango-l         tango           Argentine Tango dancing
    tm-en-help      tm-en           Emacs TM mime package mailing list
    w3-beta         w3              Emacs WWW mailing list

First, remember that the variable JA_LIST_KILL_POSTFIX is applied first, so the actual LIST values appear as follows:

    jde, java, FLAMENCO, tango, tm-en, w3

Now we apply the conversion table by defining it as follows. The grabbed LIST name comes first, then space(s), the new name and a terminating comma. Repeat this for each list you want to convert.

    LIST CONVERSION[,LIST CONVERSION ...]

This gives us the table below. Notice that the entries tango-l and w3-beta were not included, because JA_LIST_KILL_POSTFIX already got rid of the postfixes. Also note how the uppercase match FLAMENCO is converted to a more suitable lowercase mailbox name. After you have set up this variable you can start saving messages to folders.

    JA_LIST_CONVERSION = "\
        jde         java.jde,\
        java        java.lang,\
        FLAMENCO    flamenco,\
        "

The list conversion is done with pure procmail means, so it is very fast. It also means that the conversion is limited to the FROM-STRING TO-STRING syntax. No wildcards or regular expressions are allowed.
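The FROM-STRING TO-STRING semantics can be checked offline before the table goes into ~/.procmailrc. The Python sketch below treats the table as comma-separated pairs and does an exact-string lookup, which is the behaviour described above; it is only an illustration, and `parse_conversion` and `convert` are hypothetical names.

```python
def parse_conversion(table):
    """Turn "old new,old new,..." into a dict; malformed entries are skipped."""
    mapping = {}
    for entry in table.split(","):
        fields = entry.split()
        if len(fields) == 2:
            mapping[fields[0]] = fields[1]
    return mapping


def convert(list_name, table):
    """Exact lookup: no wildcards, unknown names are returned unchanged."""
    return parse_conversion(table).get(list_name, list_name)


TABLE = "jde java.jde, java java.lang, FLAMENCO flamenco,"
for name in ("jde", "java", "FLAMENCO", "tango"):
    print(name, "->", convert(name, TABLE))
```

Because the lookup is exact, an entry for "java" does not affect "jde" or "javascript"; every name to be renamed needs its own pair.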
If you consider using an external process such as sed or perl to convert the grabbed list name to something else (when the JA_LIST_CONVERSION method is not enough), think again. An external process is launched for each incoming mailing list message. It is not unusual to receive 700 messages a day from various mailing lists, so you can imagine how much load an external process would add to the server. Use the grabbed mailing list name and the JA_LIST_CONVERSION table if you care about system load.
If you have many mailing lists that use uppercase names, it may be tedious to add each mailing list name to JA_LIST_CONVERSION. A possible alternative is to use the very efficient tr program to convert characters to lowercase. Again, think twice, because the extra process could be avoided by using JA_LIST_CONVERSION.

    :0
    * ! LIST ?? ^^^^
    {
        :0 D                    # still uppercase list name?
        * LIST ?? [A-Z]
        {
            LIST = `echo $LIST | tr A-Z a-z`
        }

        :0 :
        list.$LIST
    }
One important thing to keep in mind is that the headers may change when the mailing list manager sends out list messages. This means that the previously grabbed list name changes too. This is unfortunate, but it sometimes happens. Let's see an example. I was previously receiving messages from the Cygwin mailing list under the name gnu-win32:

    To: <gnu-win32@cygnus.com>, "Foo Bar" foo@example.com

However, one day that same list was grabbed under the name "cygwin", due to a new header:

    Mailing-List: contact cygwin-help@sourceware.cygnus.com; run by ezmlm

Mapping both names to the same mailbox name keeps the messages in one folder:

    JA_LIST_CONVERSION = "\
        gnu-win32   cygwin32,\
        cygwin      cygwin32,\
        "
Here is a recipe to save all your mailing lists to separate folders. If you subscribe to new lists or unsubscribe from lists, you don't need to change anything. The grabbed list name appears in the variable LIST.

    RC_LIST = $PMSRC/pm-jalist.rc   # name the subroutine
    ...
    # Handle all mailing lists with one subroutine and recipe
    # following it. Set also JA_LIST_CONVERSION before
    # calling this subroutine to convert the found list names.

    INCLUDERC = $RC_LIST
    imap      =                     # Kill var. Set to "/" to enable

    :0                              # if list name was grabbed
    * LIST ?? [a-z]
    {
        dummy = "Saving mailing list: $LIST"

        :0 w:
        ${imap+".INBOX."}list.$LIST$imap
    }

What's that IMAP thing there, you may wonder. Normally procmail delivers to a standard mailbox, so the name is something like '$MAILDIR/list.abc'. For IMAP, delivery must follow the principle "one file, one message", so procmail must deliver to a directory. That's what the added $imap is there for. It is also customary that IMAP folders are prefixed with ".INBOX", so the actual name becomes $MAILDIR/.INBOX.list.abc. For IMAP there should also be a proper MAILDIR=$HOME/Maildir setting.
The decoding scheme used here was originally presented by Peter Galbraith galbraith@mixing.qc.dfo.ca on the procmail mailing list around the end of 1997.
This subroutine assumes that the message has the MIME header Content-Type: text/plain and performs quoted-printable or base64 decoding on the whole message. Note that this recipe is not suitable for messages that have many MIME attachments.
Procmail is not designed to handle MIME attachments, and this recipe only applies to the whole body.
The pm-jamime-*.rc modules really stretch the limits, and any serious work should be delegated to appropriate Perl MIME modules. There is a Perl MIME module which allows you to manipulate MIME body parts rather elegantly. See http://www.perl.com/CPAN-local/authors/Eryq/ for MIME-tools.
See also mimedecode at ftp://ftp.dde.dk/pub/mimedecode.c, which is included in Debian Linux.
Perl or Python is not used, because both are CPU intensive. It would be too expensive for accounts or environments receiving hundreds of mails per day (for example from several mailing lists).
RFC 2047 makes it possible to use MIME iso-8859-1 extensions in mail headers.
    Subject: Re: [PIC]: RSA =?iso-8859-1?Q?encryption=B7=B7?=
    Subject: =?iso-8859-1?Q?=5BEE=5D:TV_&_video_IC=B4s_!!?=

There is also the base64 possibility (although rare):

    Subject: =?iso-8859-1?B?zvLi5fI6ICAgICAgTVBMQUIzLjQw?=

In the worst case there are even multiple ISO-encoded strings in the Subject. Yes, this is valid; the continued line starts with spaces to attach it to the previous line, just like in Received: headers. This subroutine will not touch headers that have multiple ISO tags; procmail is too limited for that.
    Subject: AW: Re: AW: neue =?ISO-8859-1?Q?M=F6glichkeiten_=28was_=C4hn?=
        =?ISO-8859-1?Q?lichkeiten_von_=DCbungen=29?=
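To see what such RFC 2047 encoded words stand for, they can be decoded offline with Python's standard `email.header.decode_header`. This is only a reading aid for the examples above and not what the subroutine does (it avoids Perl and Python on purpose); `decode_subject` is a hypothetical helper name.

```python
from email.header import decode_header


def decode_subject(value):
    """Decode RFC 2047 encoded words in a header value into plain text."""
    parts = []
    for data, charset in decode_header(value):
        if isinstance(data, bytes):
            # Unencoded fragments come back as bytes with charset None
            data = data.decode(charset or "ascii", errors="replace")
        parts.append(data)
    return "".join(parts)


print(decode_subject("Re: [PIC]: RSA =?iso-8859-1?Q?encryption=B7=B7?="))
print(decode_subject("=?iso-8859-1?Q?=5BEE=5D:TV_&_video_IC=B4s_!!?="))
print(decode_subject("=?iso-8859-1?B?zvLi5fI6ICAgICAgTVBMQUIzLjQw?="))
```

In the Q encoding an underscore stands for a space and =XX is a byte in hexadecimal; the B encoding is plain base64.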
Variable PMSRC must point to the source directory of procmail code. This subroutine will include
Instead of testing for the existence of text/plain in the body, you can force decoding by setting JA_MIME_DECODE_REGEXP to ".*".

    RC_MIME_DECODE = $PMSRC/pm-jamime-decode.rc

    :0
    * condition
    {
        JA_MIME_DECODE_REGEXP = ".*"
    }

    INCLUDERC = $RC_MIME_DECODE     # call subroutine.
Note: If you think this module can do miracles, it cannot. MIME messages are very complex in structure and all this module can do is detect simple attachments. It cannot be used as an all-purpose, all-detecting MIME attachment killer. But the part it can do is done efficiently, because most of the work is accomplished using procmail and resource-friendly awk.
There are many programs that add additional information to messages. Microsoft's mail program is one which may include a 7k application/ms-tnef attachment at the end of the message. Many other programs may do the same. This was the idea in 1997 when this module was written: to get rid of the extra cruft which should not land in the mailbox.
This recipe works like this: if the email's structure is

    --boundary
    message-text (maybe quoted-printable)
    --boundary
    some-unwanted-mime-attachment
    --boundary

then the attachment is killed from the body. The message-text part is also decoded if it was quoted-printable. This leaves clean text with no MIME anywhere. The MIME headers are modified as needed because of the conversion from multipart and possibly quoted-printable to plain text, and the final message looks like:

    message

But if the email's structure is anything else, for example if there were 3 MIME sections:

    --boundary
    message-text (maybe quoted-printable)
    --boundary
    some-attachment
    --boundary
    some-unwanted-mime-attachment
    --boundary

then the "unwanted" part is emptied by replacing it with one empty line. The message structure stays the same, but the killed "some-unwanted-mime-attachment" part is labelled as text/plain so that the MUA (Mail User Agent; the email reader program) can decode the MIME message correctly.
The following cases are included in this module. You need to enable the behavior separately before this module will start working.
A Lotus Notes message with an application/octet-stream attachment:

    Subject: message
    From: foo@bar.com
    X-Lotus-FromDomain: XXX COMPANIES
    Mime-Version: 1.0
    Boundary="0__=cieg4oHxUNf2h3evyOXIsHTGDpFfaZilTDCFhpZSgsw"
    Content-Type: multipart/mixed;
        Boundary="0__=cieg4oHxUNf2h3evyOXIsHTGDpFfaZilTDCFhpZSgsw"

    --0__=cieg4oHxUNf2h3evyOXIsHTGDpFfaZilTDCFhpZSgsw
    Content-type: application/octet-stream;
        name="PIC10898.PCX"
    Content-transfer-encoding: base64

    eJ8+IjsQAQaQCAAEAAAAAAABAAEAAQeQBgAIAAAA5AQAAAAAAADoAAEIgAcA
    b3NvZnQgTWFpbC5Ob3RlADEIAQ2ABAACAAAAAgACAAEEkAYAyAEAAAEAAAAQ
    <AND-THE-REST-OF-BASE64>

    --0__=cieg4oHxUNf2h3evyOXIsHTGDpFfaZilTDCFhpZSgsw--

A message with an application/ms-tnef attachment:

    Subject: message
    From: foo@bar.com
    MIME-Version: 1.0
    Content-Type: multipart/mixed;
        boundary="---- =_NextPart_000_01BD04D4.A5AC6B00"
    Lines: 158

    ------ =_NextPart_000_01BD04D4.A5AC6B00
    Content-Type: text/plain; charset="iso-8859-1"
    Content-Transfer-Encoding: quoted-printable

    <MESSAGE ITSELF IS HERE>

    ------ =_NextPart_000_01BD04D4.A5AC6B00
    Content-Type: application/ms-tnef
    Content-Transfer-Encoding: base64

    eJ8+IjsQAQaQCAAEAAAAAAABAAEAAQeQBgAIAAAA5AQAAAAAAADoAAEIgAcA
    b3NvZnQgTWFpbC5Ob3RlADEIAQ2ABAACAAAAAgACAAEEkAYAyAEAAAEAAAAQ
    <AND-THE-REST-OF-BASE64>

    ------ =_NextPart_000_01BD04D4.A5AC6B00--

A Microsoft Outlook Express multipart/alternative message that carries an HTML copy of the text:

    MIME-Version: 1.0
    Content-Type: multipart/alternative;
        boundary="----=_NextPart_000_003A_01BD16E2.C97E27B0"
    X-Mailer: Microsoft Outlook Express 4.72.2106.4
    X-MimeOLE: Produced By Microsoft MimeOLE V4.72.2106.4

    This is a multi-part message in MIME format.

    ------=_NextPart_000_003A_01BD16E2.C97E27B0
    Content-Type: text/plain;
        charset="iso-8859-1"
    Content-Transfer-Encoding: quoted-printable

    <ACTUAL TEXT>

    ------=_NextPart_000_003A_01BD16E2.C97E27B0
    Content-Type: text/html;
        charset="iso-8859-1"
    Content-Transfer-Encoding: quoted-printable

    <SAME IN HTML>

    ------=_NextPart_000_003A_01BD16E2.C97E27B0--

A Mozilla multipart/alternative message that carries an HTML copy of the text:

    X-Mailer: Mozilla 4.04 [en] (X11; U; Linux 2.0.33 i686)
    MIME-Version: 1.0
    Content-Type: multipart/alternative;
        boundary="------------69D9D579CF587DC8BB26C49C"

    --------------69D9D579CF587DC8BB26C49C
    Content-Type: text/plain; charset=us-ascii
    Content-Transfer-Encoding: 7bit

    <ACTUAL TEXT>

    --------------69D9D579CF587DC8BB26C49C
    Content-Type: text/html; charset=us-ascii
    Content-Transfer-Encoding: 7bit

    <SAME IN HTML>

    --------------69D9D579CF587DC8BB26C49C--

A text/x-vcard attachment:

    Content-Type: text/x-vcard; charset=us-ascii; name="vcard.vcf"
    Content-Transfer-Encoding: 7bit
    Content-Description: Card for Laird Nelson
    Content-Disposition: attachment; filename="vcard.vcf"

    begin: vcard
    fn: Laird Nelson
    n: Nelson;Laird
    org: Perot Systems Corporation
    email;internet: ljnelson@unix.amherst.edu
    title: Software Engineer
    tel;work: (617) 303-5059
    tel;fax: (617) 303-5293
    tel;home: (978) 741-3126
    note;quoted-printable:Information is for reference only;=0D=0A=
        please do not abuse it.
    x-mozilla-cpt: ;0
    x-mozilla-html: TRUE
    version: 2.1
    end: vcard
To handle base64-encoded messages, the package metamail must be installed on the system. It provides the program mimencode, which is used through the variable $MIME_BIN (see pm-javar.rc).
Variable $PMSRC must point to the source directory of procmail code. This subroutine will include
First of all, this is primarily a framework recipe to kill any kind of attachment. If you do not set JA_MIME_TYPE before calling this recipe, the recipe tries to determine the right value by itself. If the automatic detection fails, you need to preset the value of JA_MIME_TYPE beforehand.
Some messages may be malformed and not contain a proper "boundary" definition string in the header. There have been messages that have text/html attachments but no proper MIME headers. For those cases there is an additional variable that kills all text up to a matching line, regardless of message content.
That variable is the last resort if the standard MIME detection has failed, which means there was some problem in the sender's MUA that composed the message. It is dangerous, so don't set it lightly.
If you see an error message in the log file saying that awk failed:

    procmail: Executing awk, ...
    procmail: Error while writing to "awk"
    procmail: Rescue of unfiltered data succeeded

it means that the system's standard awk doesn't support the variable passing syntax. Do the following test:

    % awk '{print VAR; exit}' VAR="value" /etc/passwd

It should print "value". If not, see if you have nawk or gawk on the system; they should understand the variable passing syntax. The only change needed is to define the variable AWK somewhere at the top of ~/.procmailrc.

    AWK = "gawk"    # Better than standard "awk"
You should know that the variable JA_MIME_KILL_RE is used to wipe any lines that match that regexp. This is needed because of the MIME structure, where continued header lines exist in the body:

    ------=_NextPart_000_003A_01BD16E2.C97E27B0
    Content-Type: text/plain;
        charset="iso-8859-1"        << kill this line too

If you want to be absolutely sure that nothing valuable is killed accidentally (like a line of code in a programming language script), set this variable to a nonsense value that never matches:

    JA_MIME_KILL_RE = "match_it_never_I_hope"

Suppose you receive a new application/ms type attachment that the default settings don't cover. This is a new MIME type and you have to instruct this module to kill it. Add this and similar tests for other MIME types:

    myCustomMimeType = "application/ms"  # must be all lowercase

    :0
    *$ $myCustomMimeType
    {
        JA_MIME_TYPE = $myCustomMimeType
    }

    INCLUDERC = $PMSRC/pm-jamime-kill.rc

To kill text/html, PDF, PostScript and other attachments, add something like this to ~/.procmailrc. It demonstrates how the correct MIME types are detected:

    # .....................................................
    # 1) Uncomment following line if your standard "awk" is broken

    # AWK = "gawk"

    # .....................................................
    # 2) Set correct value for attachment killing

    :0
    * ^X-Lotus-FromDomain:
    {
        # Kill Lotus notes .pcx attachments
        JA_MIME_TYPE = "application/octet-stream"
    }

    :0
    * H ?? ^From:.*foo@example.com
    * B ?? ^Content-Type:.*text/html
    {
        # Kill html attachments
        JA_MIME_TYPE = "text/html"
    }

    # .....................................................
    # 3) Call module

    INCLUDERC = $PMSRC/pm-jamime-kill.rc
This includerc reads the MIME boundary string from the message if it exists. The boundary string is typically found in the Content-Type header.

    Mime-Version: 1.0
    Content-Type: multipart/mixed; boundary=9i9nmIyA2yEADZbW

In addition it defines a few other MIME variables; see the returned values. You can use these variables later in your MIME message processing.
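When a message does not behave as expected, it can be useful to check offline which boundary string it really declares. The Python sketch below uses the standard `email` package for that; it is a debugging aid only and unrelated to how the includerc extracts the value. `mime_boundary` is a hypothetical helper name.

```python
from email import message_from_string


def mime_boundary(raw_message):
    """Return the boundary parameter of Content-Type, or None if absent."""
    return message_from_string(raw_message).get_boundary()


HEADERS = (
    "Mime-Version: 1.0\n"
    "Content-Type: multipart/mixed; boundary=9i9nmIyA2yEADZbW\n"
    "\n"
)
print(mime_boundary(HEADERS))  # 9i9nmIyA2yEADZbW
```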
On 1998-07-28 Brett Glass brett@lariat.org reported on PM-L that there was a security exploit involving long attachment filenames: http://www.xray.mpe.mpg.de/mailing-lists/procmail/1998-07/msg00248.html
Here is a URL describing the matter:
http://www.sjmercury.com/business/microsoft/docs/security0728.htm
When you use this module to detect MIME messages, you can check the filename length with this recipe:

    # Recipe after calling $RC_MIME, this module,

    re       = ".........."         # regexp with 10 matches
    too_long = "$re$re$re$re"       # allow 40 characters maximum

    :0
    *$ $SUPREME^0 MIME_H_ATTACHMENT ?? $too_long
    *$ $SUPREME^0 MIME_B_ATTACHMENT ?? $too_long
    {
        dummy = "** Dangerously long mime attachment filename"
        dummy = "** $MIME_H_ATTACHMENT $MIME_B_ATTACHMENT"

        :0 :
        /var/spool/mail/MimeDanger
    }
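To audit an existing mailbox for such filenames, the same length check can be approximated offline in Python. This is a sketch built on the standard `email` package, with the 40 character limit taken from the recipe above; `long_filenames` is a hypothetical helper and not part of this module.

```python
from email import message_from_string


def long_filenames(raw_message, limit=40):
    """Return attachment filenames longer than limit characters."""
    names = []
    for part in message_from_string(raw_message).walk():
        name = part.get_filename()
        if name and len(name) > limit:
            names.append(name)
    return names


SUSPECT = (
    "Content-Type: application/octet-stream\n"
    'Content-Disposition: attachment; filename="' + "a" * 60 + '.txt"\n'
    "\n"
    "data\n"
)
print(long_filenames(SUSPECT))
```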
PMSRC must point to source directory of procmail code. This subroutine will include
INCLUDERC = $PMSRC/pm-jamime.rc
This subroutine assumes that the message has been handled by 'pm-jamime-decode.rc'. The purpose is to restore the Subject and From headers back to quoted-printable format so that messages can be safely saved through an IMAP system which may not handle 8-bit messages. If the message is stored directly to a mailbox and the Mail User Agent in use has no problems dealing with 8-bit characters, this module is not needed.
An example where this subroutine could be applied:
Perl or Python is not used, because both are CPU intensive. They would be too expensive for accounts or environments receiving hundreds of mails per day (for example from several mailing lists).
Variable PMSRC must point to source directory of procmail code. This subroutine will include
To fix the Subject header and then make it 7-bit clean again. Note that this may not be exactly what you want: the pm-jamime-decode.rc file does a little more than header handling (it also modifies the message body). Read the documentation of each file before using the following example.
    INCLUDERC = $PMSRC/pm-jasubject.rc
    INCLUDERC = $PMSRC/pm-jamime-decode.rc
    INCLUDERC = $PMSRC/pm-jamime-recode.rc
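To see what "7-bit clean" means for a header, the sketch below encodes an 8-bit header value as an RFC 2047 quoted-printable word with Python's standard `email.header` module. This is an offline illustration, not the module's code. The function name `recode_header` and the choice of the iso-8859-1 character set are assumptions made for the example.

```python
from email.header import Header, decode_header

def recode_header(text, charset="iso-8859-1"):
    """Encode an 8-bit header value as a 7-bit RFC 2047 encoded word."""
    return Header(text, charset).encode()

# "\xe4" and "\xe5" are a-umlaut and a-ring, typical 8-bit characters.
encoded = recode_header("H\xe4lsningar fr\xe5n Jari")
print(encoded)

# Decoding the encoded word gives the original 8-bit text back.
raw_bytes, used_charset = decode_header(encoded)[0]
print(raw_bytes.decode(used_charset))
```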
This module saves one simple file attachment (MIME) from the message. The message must define the following MIME headers. If "filename=" does not exist, the message is ignored.
    Mime-Version: <version>
    Content-Type: <type>
    Content-Disposition: attachment; filename="file.txt"
The filename part can also be on a separate line, provided that it is indented according to the standard rules:
    Mime-Version: <version>
    Content-Type: <type>
    Content-Disposition: attachment;
        filename="file.txt"
Procmail is not very suitable for saving MIME attachments and you should not think that this is the right tool for you. If you receive anything more than one attachment, this recipe does nothing, because that is out of our league and you need more heavyweight MIME tools. E.g. Perl CPAN has MIME libraries.
Note: when the attachment is in the body, it is simply written to disk and its location in the message is replaced with the text:
    Extracted to file:/users/foo/junk/<YYYY-MM-DD-hhmm>.file.txt.
The existing MIME headers that surround the attachment are left untouched, so don't try to press your Mail Agent's MIME buttons at that point. There is no such file in that spot if you set JA_MIME_SAVE_DEL to yes.
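The naming scheme of the saved file can be illustrated with a few lines of Python. This is a sketch of the pattern shown above only; the function name `extraction_note` is hypothetical, and the trailing period of the sample line is treated as sentence punctuation and left out.

```python
from datetime import datetime

def extraction_note(directory, filename, when):
    """Build the line that replaces the attachment in the message body."""
    stamp = when.strftime("%Y-%m-%d-%H%M")  # <YYYY-MM-DD-hhmm>
    return "Extracted to file:%s/%s.%s" % (directory, stamp, filename)

note = extraction_note("/users/foo/junk", "file.txt", datetime(1998, 7, 28, 14, 5))
print(note)
```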
PMSRC must point to source directory of procmail code. This subroutine includes library:
Because procmail uses LINEBUF when filtering messages, a core dump may happen if the attachment being filtered is bigger than the LINEBUF. The current setting accepts 524K attachments, but if you expect to get bigger than that, you want to increase JA_MIME_SAVE_LINEBUF.
Awk is used because it is much more system load friendly than perl. If you see an error message in the log file saying that awk failed:
    procmail: Executing awk, ...
    procmail: Error while writing to "awk"
    procmail: Rescue of unfiltered data succeeded
it means that the system's standard awk doesn't support the variable passing syntax. To verify that this is the case, run the following test:
    % awk '{print VAR; exit}' VAR="value" /etc/passwd
A proper awk should print "value". If not, see if you have nawk or gawk in your system, which should understand the variable passing syntax. To change the AWK, set the following variable somewhere at the top of your .procmailrc
    AWK = "gawk"    # if that works better than standard "awk"
** THIS MODULE IS OBSOLETE. THE NETMIND SERVICE NO LONGER EXISTS **
...Netmind, or The URL-minder, is a free, automatic Web-surfing robot that keeps track of changes to Web pages that are important to you. When the URL-minder detects changes in any of the Web pages you have registered, it sends you e-mail.
In other words, if you're interested in some URL, say an FAQ page and any updates to it, you can tell Netmind to monitor the page for you and it sends a message back every time the page changes.
This recipe "pretty formats" the announcement sent by Netmind by stripping the message to bare minimum. You usually aren't interested in 4k message which includes "Note from our sponsors", "Try the free online demo" etc. The things saved from the announcement message are:
[Note]
Please let Netmind send you one "pure" message first so that you have a hunch what it originally looks like. Then plug in this module and see how the original message is reduced.
[Thank you]
The Doctor What docwhat@holtje-christian-isdn.mis.tandem.com 1998-03-12 sent me a patch, where a) the body message is more informative, b) the URL is now included in the body for auto-click browsers, and c) MIME headers were removed.
PMSRC must point to source directory of procmail code. This subroutine will include
    INCLUDERC = $PMSRC/pm-janetmind.rc      # reformat the message

    :0:                                     # drop to folder
    * netmind
    url.mbox
This subroutine runs nslookup on the given INPUT address. This may be an effective way to test if the address is known to the Internet. You could use this information to determine if some automated reply to an address can be sent. The known truth is that you can't validate a whole email address
to_someone@foo.com |
but you can validate "foo.com"; that's the closest you get.
[Warning: If you don't use cache feature...]
Do not however use this module to regularly check all incoming from addresses with this subroutine for possible bogus UBE addresses, because calling nslookup
You can however check some messages that are likely UBE to verify your doubts.
PMSRC must point to source directory of procmail code. This subroutine will include
If the cache file can be read:
Following conditions trigger "maybe" and no "ns-error" is written into the cache.
If you are going to check some header field, like From:, please explode the content with pm-jaaddr.rc first. Suppose you have string:
"From: foo@ingrid.sps.mot.com (Yoshiaki foo)" |
You have to derive the address from the string and pass on the site name: read the From: field and extract the address from it.
    PMSRC       = $HOME/pm
    RC_NSLOOKUP = $PMSRC/pm-janslookup.rc   # name the subroutine

    :0
    * MAYBE_UBE ?? yes
    * ^From:\/.*
    {
        INPUT = $MATCH

        INCLUDERC = $RC_NSLOOKUP            # to nslookup

        :0
        * ERROR ?? yes
        {
            # Hm, nslookup failed, can't send anything back to this
            # address
        }
    }
Second example: check if the address is reachable before sending a reply.
    INPUT     = `$FORMAIL -rt -x To:`
    INCLUDERC = $RC_NSLOOKUP

    :0
    * ERROR ?? no
    {
        # okay, at least site address seems to be reachable
    }
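The idea of "validate the site, not the whole address" can be sketched in Python as follows. This is an illustration under assumptions, not the module's method: it resolves the host name with `socket.gethostbyname` instead of calling nslookup, it has no cache, and the names `site_of` and `site_is_known` are invented for the example. The resolver is a parameter so that the check can be tried without network access.

```python
import socket
from email.utils import parseaddr

def site_of(header_value):
    """Return the site part of an address header value such as a From: field."""
    _name, address = parseaddr(header_value)
    # Everything after the last "@"; the whole string if there is no "@".
    return address.rpartition("@")[2]

def site_is_known(site, resolver=socket.gethostbyname):
    """True if `resolver` can look up `site`, False if the lookup fails."""
    try:
        resolver(site)
    except OSError:  # socket.gaierror is a subclass of OSError
        return False
    return True

print(site_of("foo@ingrid.sps.mot.com (Yoshiaki foo)"))
```

A real run would call `site_is_known(site_of(value))` and let the default resolver query the name service.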
This subroutine digs an embedded message out of the body and replaces the current message with it. Copy the message to a folder before calling this subroutine if you need the original.
NOTE: This is a simple tool and its sole purpose is to derive simple embedded messages. Write a full-fledged Perl script if you want better extracting features. The AWK code used inside this procmail recipe will fail to find 30% of the cases, mostly due to non-standard ways of including the message. The recognized formats are as follows. Anything that differs from these is ignored or incorrectly parsed.
If you're subscribed to mailing lists that regularly send copies of an original message to the list, like forwarding spam to the SPAM-L mailing list at http://bounce.to/dmuth, then you'd like to extract the original embedded message, which you can then feed to your UBE filter to test if the shield holds.
    spam-l-request@peach.ease.lsoft.com
    subscribe SPAM-L <First name> <Last name>
This recipe takes a simplistic approach and tries its best to extract the embedded message. The idea for this recipe comes from Era Eriksson's posting "recipe to turn list postings back into original spam" 1998-06-25 in the Procmail mailing list.
When this recipe ends, the current message has been modified so that it is the original message. For example, if you receive:
    HEADER-1        # The poster
    body-1          # his comments
    HEADER-2        # The original embedded message
    body-2
    body-1          # And poster's signature or mailing list footer
The message now looks like
    HEADER-2
    body-2
    body-1
And you can save this as original message or feed it to your UBE filter and test if it detects it.
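The transformation can be sketched in Python as shown below. This is a simplified illustration and not the AWK code of the module: it assumes the embedded header starts with a From: line, reuses the regexp "^[> ]*From:.*[a-zA-Z]" that appears in this module's log output, and the names `extract_embedded` and `strip_quote` are hypothetical.

```python
import re

EMBEDDED_FROM = re.compile(r"^[> ]*From:.*[a-zA-Z]")
QUOTE_PREFIX = re.compile(r"^[> ]*")

def strip_quote(line, prefix):
    """Remove the quoting prefix (for example '> ') from one line."""
    if line.startswith(prefix):
        return line[len(prefix):]
    if line.rstrip() == prefix.rstrip():  # quoted empty line such as ">"
        return ""
    return line

def extract_embedded(raw_message):
    """Return the embedded message text, or None if none is found."""
    _header, _separator, body = raw_message.partition("\n\n")
    lines = body.split("\n")
    for index, line in enumerate(lines):
        if EMBEDDED_FROM.match(line):
            prefix = QUOTE_PREFIX.match(line).group(0)
            return "\n".join(strip_quote(text, prefix) for text in lines[index:])
    return None
```

Everything from the embedded From: line to the end of the body is kept, so the poster's signature or list footer stays at the end, as in the HEADER-2 example above.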
For some reason procmail kept dumping core if I wrote the code in a nicer format like below, but if I made it compact, it didn't dump core. Go figure. I'm not pleased that I had to sacrifice clarity, but there was no other way.
    [The good style]    [The forced compact style]
    if ()               if () { statement }
    {
        statement
    }
I have no explanation why this happens. The same AWK code would work just fine in most cases, and then came this message x and caused a core dump; if I fed in some other message, I didn't get a core dump. Total mystery to me. Don't let the log message fool you, this had nothing to do with the regexp "^[> ]*From:.*[a-zA-Z]". If I deleted one line from the AWK script, it worked ok; if I added it back, the core dump happened with that message x.
    procmail: Assigning "pfx=[> ]*"
    procmail: No match on "^[> ]*From:.*[a-zA-Z]"
    Segmentation fault (core dumped)
PMSRC must point to source directory of procmail includerc code. This subroutine needs module(s):
Let's assume that you want to feed all forwarded UBE that is posted to spam-l mailing list to your filter and see if it needs improving by checking the logs later. The forwarded UBE to the list is labelled "SPAM:" in the subject line.
    RC_LIST = $PMSRC/pm-jalist.rc   # mailing list detector
    RC_ORIG = $PMSRC/pm-jaorig.rc   # extract original
    RC_UBE  = $PMSRC/pm-jaube.rc    # UBE filter

    ...

    INCLUDERC = $RC_LIST            # defines variable `LIST'

    :0
    * ! LIST ?? ^^^^
    {
        :0                          # spam-l mailing list
        * LIST ?? spam
        * Subject: +SPAM:
        {
            INCLUDERC = $RC_ORIG    # Change it to UBE message

            # Ok, next feed it to filter, set some variables first
            # Log = Short log; What filters were applied to message
            # mbx = If message was trapped, save it here

            JA_UBE_LOG  = "$PMSRC/pm-ube.log"
            JA_UBE_MBOX = "junk.ube.ok.mbox"
            INCLUDERC   = $RC_UBE

            :0 :                    # If comes here, filter failed
            junk.ube.nok.mbx
        }

        :0 :                        # save normal list messages
        list.$LIST
    }
When I'm on a remote site and I don't seem to get through with telnet or even with Unix ping(1), I want to know if at least the mail server is up. I can send a ping message and the auto responder will reply immediately.
Sometimes, when you send a message to a person, it would be nice if you could test that the destination address is valid before sending a message to a black hole. If the receiver had a ping service like this running, you would know that you spelled the address right (instead of wondering for two weeks why you don't get a response). Nowadays the finger(1) command seems to be blocked many times.
This recipe answers a simple ping message like this:
    To: you@site.com
    Subject: ping
Recipe sends a short message back to the sender.
PMSRC must point to source directory of procmail code. This subroutine will include
    JA_PING_MBOX = $HOME/Mail/spool/ping.spool
    INCLUDERC    = $PMSRC/pm-japing.rc
Ahem, that pop3 is just to draw your attention. This module has nothing to do with pop3. The idea may resemble it though. This module listens for pop3 requests, and when it gets one, it sends the whole mailbox content as separate forwarded messages to the account from where you sent the request.
This is kind of "empty my mailbox in account X and send the messages to account Y".
You might have permanent forwarding on in account X, but if that is your secondary account, you can use this recipe to ask what messages have arrived there.
After you have configured your magic pop3 command, which is your password, simply send a message to account X, and this module initiates emptying the mailbox. Here is an example:
Subject: pOp3-send [mailbox] [kill] |
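The shape of the request line can be illustrated with a small Python parser. This is only a sketch of the syntax shown above: the function name `parse_pop3_request` is hypothetical, the command word is compared exactly because it doubles as your password, and the sketch merely reports whether the word "kill" was present without acting on it.

```python
def parse_pop3_request(subject, command="pOp3-send"):
    """Return (mailbox, kill) for a matching Subject value, else None."""
    words = subject.split()
    if not words or words[0] != command:
        return None
    arguments = words[1:]
    kill = "kill" in arguments
    mailboxes = [word for word in arguments if word != "kill"]
    mailbox = mailboxes[0] if mailboxes else None  # None: no mailbox named
    return mailbox, kill

print(parse_pop3_request("pOp3-send inbox kill"))
```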
PMSRC must point to source directory of procmail code. This subroutine will include
STATUS will contain mailbox name if valid pop3 request was received. You may wish to save the pop3 requests to separate folder. See example below.
You install this same setup for each site where you have an account. This is the account X, from where you want to empty the mailboxes.
    RC_POP3 = $PMSRC/pm-japop3.rc

    .. somewhere in your .procmailrc ..

    JA_POP3_SUBJECT_CMD = myPoPcmd
    INCLUDERC           = $RC_POP3

    # Save all pop3 requests to folder

    :0 :
    * STATUS ?? [a-z]
    mail.pop3-req.mbox

    # The MATCH will contain the host name from where the messages
    # were moved

    :0 :
    *$ X-Loop-Fwd:.*\.rc +\/$NSPC+
    mail.fwd.$MATCH.mbox
Return a random line or a line from a file. This subroutine uses the shell command awk, and possibly wc, to put as small a burden on the system as possible.
You must have awk that supports VAR=value assignment syntax outside the code block: that is, in the input line. I know no awk that would not have this feature, but at least you know now what it takes.
% awk '{print VAR; exit;}' VAR=1 /etc/passwd |
Try using GNU awk, if your standard awk didn't print 1 in above test. (Put this line to the top of your .procmailrc)
AWK = "gawk" |
If you intend to call this subroutine many times, please calculate the number of lines beforehand and pass it to this subroutine. If MAX is not set, wc is called every time to find out the line count.
    # Select random line from a file

    RC_RANDF = $PMSRC/pm-jarand.rc
    COOKIE   = $HOME/txt/cookie.lst

    ...somewhere..

    MAX=20
    FILE=$COOKIE
    INCLUDERC=$RC_RANDF
    # LINE contains randomly read line
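The role of MAX can be shown with a short Python sketch. It is an illustration of the idea, not the awk code of the module: the function name `random_line` is hypothetical, and counting the lines when `max_lines` is missing plays the part that wc plays in the subroutine.

```python
import io
import random

def random_line(handle, max_lines=None):
    """Return one random line from a seekable text stream, or None if empty."""
    if max_lines is None:
        # No line count given: count the lines first, then rewind.
        max_lines = sum(1 for _ in handle)
        handle.seek(0)
    if max_lines < 1:
        return None
    wanted = random.randint(1, max_lines)
    for number, line in enumerate(handle, start=1):
        if number == wanted:
            return line.rstrip("\n")
    return None

cookies = io.StringIO("first cookie\nsecond cookie\nthird cookie\n")
print(random_line(cookies, max_lines=3))
```

Passing the line count skips the extra pass over the file, which is the saving the paragraph above describes.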
This subroutine is part of the TPFS or MPFS file server. Check FILE for invalid filenames or other access problems.
This subroutine is part of the MPFS file server. Handle BOUNCES to mail server messages, e.g. if delivery failed due to a maximum byte limit.
    552 foo@site.com... Message is too large; 100000 bytes max
(none) This recipe examines the headers and body to see if the message is a daemon bounce.
This module is part of the MPFS file server. Send error notice: file didn't exist.
This subroutine is part of the MPFS file server. Compose headers for reply message using formail -rt.
Here is a dry-run example to test this module:
    % procmail DEFAULT=/dev/null VERBOSE=on LOGABSTRACT=all \
        FORMAIL=/opt/local/bin/formail \
        JA_SRV_FORMAIL_FROM=me@here \
        JA_SRV_CONTENT_TYPE=content-type \
        JA_SRV_XLOOP=xloop \
        $HOME/pm/pm-jasrv-from.rc \
        < $HOME/any-sample.email
(none)
This subroutine is part of the TPFS or MPFS file server. Run $CODE and return the results to the user. The subroutine is meant to be used for informational messages.
This subroutine is part of MPFS file server. Send out FILE as multipart MIME message. The message will always be base64 encoded before sending.
This is the MPFS (Mime Procmail File Server) and it can send MIME compliant messages with command
"send <ITEM> [WORD1] [WORD2]" |
Usually only the ITEM arg is used, and the rest of the words are for special uses like password and preventing file encoding. A typical request looks like:
Subject: send help # ask for file named 'help' |
You have to create a directory for the server where the files are kept. Usually I don't put the files there, but whenever I want to make a file available, I draw a hard or softlink to the real file.
    % mkdir $HOME/pm-server
    # Repeat this as needed for files you want to put available
    % cd $HOME/pm-server
    % ln -s $HOME/txt/interesting-file.txt interesting-file.txt
You define the server directory by setting
JA_SRV_FILE_DIR = $HOME/pm-server |
The short server log is written to file pointed by this variable:
JA_SRV_LOG = $HOME/pm-server.log |
The incoming "send" requests are stored in the mailbox pointed to by the following variable. The default value is /dev/null, but you may want to set it to ~/Mail/spool/log.srv.spool, which can be read as a newsgroup by Emacs Gnus [in Gnus, create the newsgroup with `G` m nnml log.srv]
JA_SRV_MSG_MBOX |
Tweak this variable to commands you want to allow shell to execute in server's directory. This tells when <ITEM> "ls" means command instead of file
JA_SRV_SH_COMMAND = "^(ls|what)$" |
That means that a request like this runs a command instead of returning a file:
Subject: send ls # run "ls" command and return results |
Be sure that the commands exist in your system. See man pages for more if you want to know what these commands do. Commands cannot take switches currently for security reasons. E.g. if you want to give access to "ls -la" listing, put a file "ls-la.txt" available in the directory, user can get it with "send ls-la.txt"
    ls      -- list directory
    file    -- print file type information.
    what    -- prints all @(#) tags from files
    ident   -- print all $ $ tags from files
Users want to get a help file with message "send help" and the help is just a file in your server directory. Be sure to supply it prior to any other files. You can always draw a link to a file if you don't want to name it that way (e.g. if you keep several server help files in a RCS tree)
    # draw symlink to `help'
    % ln -s $HOME/txt/srv-public-hlp.txt $HOME/server/help
The server accepts command in format
"send <ITEM> [CMD|PASSWORD]" |
Where ITEM can be any name as long as it starts with [^ .]. The regexp says: anything goes as long as FILE does not start with a space or a period. This gives you quite a lot of freedom to construct filenames. If you want to hand out the file:
.procmailrc |
You can't. Instead, make a link that points to plain "procmailrc" without the leading period. There are also additional checks against the possible security threat "../" like below; a user can't request such a file.
../../../gotcha or dir/../../gotcha |
The filename cannot contain special characters like [*?<>{}()].
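Taken together, the three ITEM rules above can be written as one small check. The Python sketch below only restates the rules as listed here; it is not the server's own code, and the name `is_valid_item` is made up for the illustration.

```python
import re

# The special characters named above: [*?<>{}()]
FORBIDDEN_CHARACTERS = re.compile(r"[*?<>{}()]")

def is_valid_item(item):
    """Apply the ITEM rules to a requested file name."""
    if not item or item[0] in " .":
        return False    # must start with [^ .]
    if "../" in item:
        return False    # no climbing out of the server directory
    if FORBIDDEN_CHARACTERS.search(item):
        return False    # no special characters
    return True

for name in ("help", ".procmailrc", "dir/../../gotcha", "file*.txt"):
    print(name, is_valid_item(name))
```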
[conversions]
If some of your files are big, it makes sense to send them in compressed base64 format; which in MIME world is called content-type gzip. You can set a regexp to enforce encoding for your big files before they are sent to user. The following setting will send all text files in compressed format to user.
JA_SRV_XGZIP_REGEXP = "\.txt" |
When the message is composed a header is inserted into the message telling how the message is to be decoded, in case user doesn't have decent MUA that can handle the MIME type:
X-comment: To decode, cat msg| mmencode -u| gzip -d > test.txt |
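The encoding and the decoding pipeline from the X-comment header can be reproduced with Python's standard `gzip` and `base64` modules. This is a sketch for experimenting with the format, not the server's implementation; the function names `encode_for_mail` and `decode_from_mail` are hypothetical.

```python
import base64
import gzip

def encode_for_mail(data):
    """Compress with gzip, then base64 encode into 76-character lines."""
    return base64.encodebytes(gzip.compress(data))

def decode_from_mail(text):
    """Reverse step, the counterpart of 'mmencode -u | gzip -d'."""
    return gzip.decompress(base64.decodebytes(text))

payload = b"interesting text\n" * 100
encoded = encode_for_mail(payload)
print(len(payload), "bytes become", len(encoded), "bytes of base64 text")
```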
[noconv and gz]
The WORD1 parameter after the FILE is optional and user can override base64 encoding and request plain file if he uses word "noconv".
Subject: send <FILE> [noconv|gzip] |
However, there are files where noconv must not be obeyed, like the compressed packages that you have made available in .zip, .gz, tar.gz or .tgz (GNU tar) format. The following variable controls when a file is always sent as base64:
JA_SRV_BASE64_ALWAYS |
If WORD1 is "gz" or "gzip", then gzip is explicitly requested. This may be desirable, because some of the text files in the server directory may be big and some accounts don't accept big messages. A typical bounce looks like:
    552 foo@site.com... Message is too large; 100000 bytes max
    554 foo@site.com Service unavailable
This kind of file server bounce message is handled in a separate module, which notifies the user that his account didn't accept the sent file.
[case sensitivity]
By default the request word ("send") and ITEM (filename) are not case sensitive, unless you set these flags:
    JA_SRV_F_CMD_CASE_SENSITIVE  = "yes"
    JA_SRV_F_FILE_CASE_SENSITIVE = "yes"
If the values are "no", then these are identical commands:
    Subject: Send Help
    Subject: SEND HELP
If you want to deliver big files, you had better not send them as one big message. That blocks the connection between every host along the path over which the big file is transferred. The solution is to use MIME multiparts that can be assembled back in the receiving MUA. (In case you don't have a multipart assembler, receive a Perl script to do it).
MIME multiparts are sent out if
When a file meets these criteria, it is read into the BODY of the message and base64 encoded. This all happens in memory, so watch the procmail logs to see if there are any problems with very big files (>30Meg). Next, if the base64 conversion succeeded, the composed message is handed to
JA_SRV_MIME_MULTI_SEND |
Which does the actual delivery and splitting. The default program used is splitmail. Make sure you have it or substitute the program with some equivalent one.
Sometimes you're making rearrangements in your file directory or doing some other maintenance and you are unable to respond to send requests. You can stop the server by setting
JA_SRV_IN_USE = "no" |
And when you want to enable the server again; just comment out the statement or assign yes. [The default is yes]. When this variable is set to no, the server sends a message from following variable as a response to any "send" request.
JA_SRV_IN_USE_NO_MSG |
You should be aware that this file server's implementation is public in nature. Anyone who asks for a file is allowed to get it. But it would be good if you could limit the access to documents with some simple way, like if you set up two file servers (see next chapter) where one is public and the other is interesting only to group of people. You can define a string that must be found in Subject field by setting the following variable
JA_SRV_PASSWORD = ".*" # default |
The default value will match anything in the subject, thus making the server public. But if you set it like this
    JA_SRV_PASSWORD = ".*123"
Then string "123" must be there somewhere in the line, like here
Subject: send <FILE> 123 |
Yes, "123" is actually a CMD definition, but it doesn't matter because there is no CMD 123. Subject now matches password and the server can be accessed. Of course the following is valid too.
Subject: send <FILE> noconv 123 |
If the password was wrong, the server won't tell. The message just lands in your mailbox in that case and you can investigate who tried to access the restricted server.
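How the command word, the case sensitivity flag and the password regexp interact can be sketched in Python. This is an illustration under assumptions and not the server's own matching code: the function name `parse_request` is hypothetical, the password is searched anywhere in the Subject value, and only the command word honours the case flag.

```python
import re

def parse_request(subject, command="send", password=".*", case_sensitive=False):
    """Return (item, extra_words) for an accepted request, else None."""
    if re.search(password, subject) is None:
        return None     # password regexp not found in the Subject value
    flags = 0 if case_sensitive else re.IGNORECASE
    match = re.match(r"\s*%s\s+(\S+)(.*)" % re.escape(command), subject, flags)
    if match is None:
        return None     # not a request for this server
    return match.group(1), match.group(2).split()

print(parse_request("send file.txt noconv 123", password=".*123"))
```

A request that fails either check returns None, which corresponds to the message simply landing in your mailbox.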
The default command string is "send", but you can change it and thus create multiple services. Here is one example, where you have set up two file servers where each has its own directory.
    # The public server

    JA_SRV_CMD_STRING   = "send"
    JA_SRV_FILE_DIR     = $HOME/server/public
    INCLUDERC           = $HOME/procmail/pm-jasrv.rc

    # Company server, only interests fellow workers.
    # Here "xyz-send" is just magic server request string.
    # Notice case sensitivity settings.

    JA_SRV_F_CMD_CASE_SENSITIVE = "yes"
    JA_SRV_CMD_STRING           = "xyz-send"
    JA_SRV_PASSWORD             = ".*12qw"
    JA_SRV_FILE_DIR             = $HOME/server/public/xyz-dir
    INCLUDERC                   = $HOME/procmail/pm-jasrv.rc
[basic Mime type note]
All basic files that you send must be US-ASCII, 7bit. At least that is the default MIME type used. See JA_SRV_CONTENT_TYPE. I once received following message back
    ----- Transcript of session follows -----
    554 foo@bar... Cannot send 8-bit data to 7-bit destination
    501 foo@bar... Data format error
because in the previous releases, the MIME type headers were not in the message saying that the content really was plain 7bit ascii.
[Sending the file as is]
Note, that the file is included "as is" without any extra start-of-file or end-of-file tags. This is possible, because the file is sent in MIME format.
[Using one line log entry]
It may look very spartan to print a single-line log entry. You see messages like the ones below in the file server log. Using a one-line entry instead of multi-line announcements makes it possible to write a small Perl tool to parse information from a single line. If you get many file server messages per day, it is quicker to look at the single-line entries too.
    [ja-srv1; sh file; Foo Bar foo@site.com;]
    [ja-srv1; send xxx-file.txt; Foo Bar foo@site.com;]
     |
     Server's request keywords (you may have multiple servers)
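As an example of how little code such a single-line entry needs, here is a Python sketch of a parser (the text suggests Perl; Python is used here only for illustration). The field names `server`, `request` and `requester` are labels chosen for this example, not names used by the server.

```python
def parse_log_entry(line):
    """Split '[server; request; requester;]' into its three fields."""
    inner = line.strip().strip("[]")
    fields = [field.strip() for field in inner.split(";")]
    # Raises ValueError if the line has fewer than three fields.
    server, request, requester = fields[:3]
    return {"server": server, "request": request, "requester": requester}

entry = parse_log_entry("[ja-srv1; send xxx-file.txt; Foo Bar foo@site.com;]")
print(entry["request"])
```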
[wish list]
(*) MIME multipart message's mime headers may need some adjustments.
(*) I rely on simple regexp to send out base64 or gzip files. The natural extension would be to use file size threshold: if file is bigger than N bytes, send it out with gzip. And further: if file is more than NN bytes, send it out as multi part MIME.
(*) In fact there are slight MIME type errors: .zip files should be sent as application/zip. If you have experience with MIME types, please contact me and help me to sort out proper MIME headers.
PMSRC must point to the source directory of procmail code. This subroutine will include many pm-jasrv-*.rc modules and other files from there.
Please test the File Server in your environment before you start using it every day. For example, I had some weird local problem where PATH had /usr/contrib/bin/, where gzip was supposed to be, but in spite of my tries procmail didn't find it along the path. Don't ask why. I now use the absolute binary name:
GZIP = /usr/contrib/bin/gzip |
In addition, if your messages are not sent to the recipient and you get this daemon message instead:
    ... Recipient names must be specified
then you have the setting SENDMAIL="sendmail", which is not enough. It must be
SENDMAIL = "sendmail -oi -t" |
This is my .procmailrc installation. Notice that the file server code is used only if you get a "send" request. On the other hand, this double wrapping is not at all necessary; you could just as well rely on the File Server's capability to detect the SEND request.
    PMSRC          = $HOME/pm           # directory where the procmail rc files are
    RC_FSRV        = $PMSRC/pm-jasrv.rc
    mySavedLOGFILE = $LOGFILE           # record file server actions elsewhere
    LOGFILE        = $PMSRC/pm-jasrv.log

    #   Listen "send" requests.

    :0
    * ^Subject: +send\>
    {
        JA_SRV_FILE_DIR = $HOME/fsrv        # Where to get the files
        JA_SRV_LOG      = $HOME/fsrv.log    # Write log here
        INCLUDERC       = $RC_FSRV          # Use file server now
    }

    LOGFILE = $mySavedLOGFILE
This subroutine is part of the MPFS file server. It checks whether a file server request is in JA_SRV_SUBJECT, using either a case-sensitive or a case-insensitive match.
To dry run this module, use the following skeleton. Substitute keywords as needed to reflect your system setup:
    % procmail DEFAULT=/dev/null VERBOSE=on LOGABSTRACT=all \
        PMSRC=$HOME/txt JA_SRV_CMD_STRING=send \
        JA_SRV_SUBJECT="send newbie_article.rtf noconv" \
        txt/pm-jasrv-req.rc < ~/test.mail
This subroutine is part of the MPFS file server. It sends the requested file. You can dry-run test this module with the following command: a) make sure that $HOME/test contains any simple email message b) define FORMAIL if it is not found in the PATH.
    % procmail DEFAULT=/dev/null VERBOSE=on LOGABSTRACT=all \
        PMSRC=$HOME/txt JA_SRV_LOG=/dev/null \
        FORMAIL=/opt/local/bin/formail \
        file=$HOME/test FILE=test WORD=WORD JA_SRV_FROM=foo@bar \
        SENDMAIL="tee -a $HOME/test.send" txt/pm-jasrv-send.rc < ~/test
The MIME headers previously selected here were:
    Content-type: application/octet-stream
    Content-transfer-encoding: x-gzip64
But defining your own CTE, such as x-gzip64, is strongly discouraged by the MIME RFCs. Most e-mail clients would be at a loss on how to handle it. Many would just bomb out and not even give you the opportunity to save the content to a file. A more correct MIME type is this, which is now used:
    Content-type: application/x-gzip
    Content-transfer-encoding: base64
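To see what these two headers promise the receiving client, here is a small Python sketch that builds such a body: the data is gzip-compressed and the result is encoded with standard base64. It is only an illustration of the header and body pairing; the function name `gzip_base64_part` is an assumption and the file server itself does this with external programs, not with Python.

```python
import base64
import gzip

def gzip_base64_part(data):
    """Return (headers, body) for sending data gzip-compressed as base64 text."""
    headers = [
        "Content-type: application/x-gzip",
        "Content-transfer-encoding: base64",
    ]
    # base64 output is pure 7-bit ASCII, broken into short lines
    body = base64.encodebytes(gzip.compress(data)).decode("ascii")
    return headers, body

headers, body = gzip_base64_part(b"hello file server\n")
print("\n".join(headers))
print()
print(body, end="")
```

A client that knows base64 and gzip can recover the file with the reverse steps, which is exactly what a private encoding such as x-gzip64 would have prevented.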
    o   FILE is the filename (chdir to directory is already done)
    o   `file' is the _absolute_ filename
    o   `WORD' is the next word from the subject line after the FILE word
    o   JA_SRV_CMD_STRING is a flag
    o   JA_SRV_F_SUBJ_NOTIFY is a flag
This subroutine stores the message to the file pointed to by MBOX. It is meant to be used with the other general purpose includerc files. This makes it possible to have centralized file storage handling for all your rc files.
A regular user doesn't get much out of this rc unless he mixes both gz and regular files in his .procmailrc.
Repeat: This module is the basis for general purpose procmail rc plug-ins that store a message to the mailbox pointed to by some rc configuration variable. A normal user can simply say in his .procmailrc:
    :0:
    mail.private
MBOX must have been set to point to the message storage.
MBOX_SUFFIX is the extension added to MBOX. Default is none.
MBOX_MH: if "yes", then deliver to an MH mailbox with `MBOX_MH_CMD', which is "rcvstore" by default.
otherwise
    RC_MBOX = $PMSRC/pm-jastore.rc

    :0
    * condition
    {
        MBOX      = $HOME/Mail/spool/junk.mbox
        INCLUDERC = $RC_MBOX
    }
NOTE: If you receive RFC 2047 encoded Subject headers like "Subject: =?ISO-8859-1?Q?=C4hnlichkeiten_von_=DCbungen?=", you must decode them before using this subroutine. Feed the message to pm-jamime-decode.rc first.
There are many different email programs out there that add their own reply characters to the subject field. The saddest programs usually come from the PC platform. E.g. Microsoft has gained a bad reputation due to its own standards.
There already is a de facto standard where a message should contain only a single Re: if it has been replied to (no matter how many times). This makes it possible to do efficient message threading using only the Subject and Date fields. Grepping for the same subject is also a lot easier than in this horrible mess (the whole subject is on one line):
Subject: re- Re^2: Re[32]: FW: Re: Re(15) Sv: Re[9]: -reply (fwd) [fwd] <fwd> fw: [FWD: [FWD:]] -subj subj: subj: subj- test |
This recipe standardizes any subject (like above) that has been replied to, to de facto format below. That is: "Any number of 'Re:' will be converted to single 'Re:' and any number of 'Fwd:' will be converted to single 'Fwd:'"
Subject: Re: test (fwd) |
If there is an In-Reply-To header in the message but no Re: in the subject line, one is added automatically. Some broken mailers forget to add the Re: to the Subject line.
This is yes by default, which causes the original subject to be saved under the header field X-Old-Subject. If you don't want that extra header generated, set this variable to no.
This is by default yes, which will kill extra forwarding indication words like (fwd) [fwd] <fwd> <f>. If you set this to no, then all the forwarding words are preserved. The de facto forward format is:
Subject: This subject (fwd) |
This subroutine's intention is to make the Subject more expressive by deleting redundant information. A simplistic approach has been taken, where the Subject consists of a list of words, each of which is marked either ok or delete. No attempt has been made to determine the structure of the Subject. You can see the algorithm better from an example:
Re: New subject (was Re: Old subject) |
Syntactically that should be treated like "New subject", ignoring anything between the parentheses. This is, however, not respected and not even attempted. The rule applied here is "One Re: is tolerated", so the subject won't change. It doesn't matter where "Re:" is.
But here the subject is changed. The rule applied is: delete all unwanted words and then add one Re: to the beginning if the OLD content had any reply indications.
Re: New subject (was Re: Old subject) --> Re: New subject (was Old subject) |
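The word-by-word rule can be sketched in Python as follows. This is not the module's awk code: the two regular expressions are assumptions that recognize only a handful of marker spellings (they do not cover every variant in the messy example earlier), and `has_in_reply_to` is a hypothetical parameter standing in for the In-Reply-To check.

```python
import re

# Reply markers such as "Re:", "re-", "Re^2:", "Re[32]:", "Re(15)", "Sv:".
REPLY = re.compile(
    r"^(?:re|sv)(?:(?:\^\d+|\[\d+\]|\(\d+\))[:-]?|[:-])$", re.IGNORECASE
)
# Forward markers such as "FW:", "(fwd)", "[fwd]", "<fwd>", "[FWD:".
FORWARD = re.compile(r"^(?:[\[(<]+fwd?[:\]>)]*|fwd?[:\]>)]+)$", re.IGNORECASE)

def normalize_subject(subject, has_in_reply_to=False):
    """Mark each word ok or delete, then add back one Re: and one (fwd)."""
    replied = has_in_reply_to
    forwarded = False
    kept = []
    for word in subject.split():
        if REPLY.match(word):
            replied = True          # delete the word, remember it was a reply
        elif FORWARD.match(word):
            forwarded = True        # delete the word, remember it was forwarded
        else:
            kept.append(word)       # ok
    parts = (["Re:"] if replied else []) + kept + (["(fwd)"] if forwarded else [])
    return " ".join(parts)

print(normalize_subject("Re: New subject (was Re: Old subject)"))
```

Because the words are judged one at a time, the "Re:" inside the parentheses is deleted just like a leading one, which matches the transformation shown above.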
Please check that the SHELL variable in your ~/.procmailrc is set to an sh derivative, such as /bin/sh or /bin/bash. This module won't work with other shells.
awk is small, effective, and much lighter than perl for little tasks. See the verbose log and make sure your awk understands the VAR="value" passing syntax. Change it to nawk or gawk if they work better than your standard awk.
    AWK = "gawk"    # you may need this; try also nawk
Let's say the Polish M$Outlook uses ODP: instead of the standard Re: and you want to handle that too. Then set:
    JA_SUBJECT_KILL = "odp:"    # NOTE: all lowercase
    JA_SUBJECT_SAVE = "no"
    INCLUDERC       = $PMSRC/pm-jasubject.rc
You can use JA_SUBJECT_KILL to delete any additional words from the subject line. E.g. if you have a good news-reader, you don't need the prefixes that some mailing lists add to the beginning:
    Subject: [LIST-xxx] the subject here
To remove that list prefix, simply match it:
JA_SUBJECT_KILL = "(list-xxx|list-yyy)" |
Important: The regexp must be all lowercase, because when match happens, the words have been converted to lowercase.
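The effect of the lowercase rule can be demonstrated with a short Python sketch. It assumes that each word is lowercased and then searched with the kill regexp; how the module applies the regexp internally is not shown here, so treat `kill_words` as a hypothetical model of the behavior.

```python
import re

def kill_words(subject, kill_regexp):
    """Drop every word whose lowercased form matches kill_regexp."""
    kept = [
        word
        for word in subject.split()
        if not re.search(kill_regexp, word.lower())
    ]
    return " ".join(kept)

# Lowercase regexp: the list prefix is removed.
print(kill_words("[LIST-xxx] the subject here", "(list-xxx|list-yyy)"))
# Uppercase regexp: nothing matches, because the words were lowercased first.
print(kill_words("[LIST-xxx] the subject here", "(LIST-xxx)"))
```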
You need nothing special, just include this recipe before you save message to folder.
INCLUDERC = $PMSRC/pm-jasubject.rc |
You can dry-run test this module with the following command and watch the output. Substitute the variables to match your system. Feed in an entire example mail that contains the Subject that needs correction.
    % procmail SHELL=/bin/sh AWK=gawk VERBOSE=on LOGABSTRACT=all \
        DEFAULT=/dev/null LOGFILE=$(tty) \
        JA_SUBJECT_KILL="(ace-users)" \
        PMSRC=/path/to/install/dir \
        /path/to/pm-jasubject.rc \
        < ~/test.mail
Thanks to Tony.Lam@Eng.Sun.Com for his creative improvement suggestions and sending code that this recipe didn't catch at first.
This includerc parses the time from the variable INPUT, which contains a string of the form
"hh:mm:ss" |
Example input
"Thu, 13 Nov 1997 11:43:23 +0200" |
Returned values
    hh = 2 digits
    mm = 2 digits
    ss = 2 digits
Variable ERROR is set to "yes" if it couldn't recognize the INPUT and couldn't parse all hh, mm, ss variables.
PMSRC must point to source directory of procmail code. This subroutine will include pm-javar.rc from there.
INPUT = string-to-parse |
The INPUT can be anything as long as it contains NN:NN:NN
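The contract is easy to model outside procmail. The Python sketch below finds the first NN:NN:NN in a string and reports an error otherwise; it mirrors the described inputs and outputs only, and returning an empty error string on success is an assumption, since the text above specifies only the "yes" case.

```python
import re

TIME = re.compile(r"(\d\d):(\d\d):(\d\d)")

def parse_time(input_string):
    """Return (hh, mm, ss, error); error is "yes" when no NN:NN:NN is found."""
    match = TIME.search(input_string)
    if match is None:
        return "", "", "", "yes"
    hh, mm, ss = match.groups()
    return hh, mm, ss, ""

print(parse_time("Thu, 13 Nov 1997 11:43:23 +0200"))
```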
Get the time of received message. The From_ header will always have the standard time stamp.
    PMSRC        = $HOME/pm
    RC_DATE_TIME = $PMSRC/pm-jatime.rc

    :0 c
    * ^From +\/.*
    {
        INPUT = $MATCH

        #   Turn off the logging while executing this part

        VERBOSE   = off
        INCLUDERC = $RC_DATE_TIME
        VERBOSE   = on

        :0
        * ERROR ?? yes
        {
            #   Should not ever happen, you have broken From_
        }
    }
This file is part of "pm-jaube.rc". This subroutine is called when a likely UBE message has been detected.
PMSRC must point to source directory of procmail code. This recipe file will include
Put all your UBE (aka spam) filters towards the end of your ~/.procmailrc. The idea is that valid messages are filed first (mailing lists, your work and private mail, bounces) and only the uncategorized messages are checked.
Now, if a 50-70 % hit rate is good enough as a starting point, go ahead and read more. This file is supposed to be the last resort, if you really do not have any better tool to analyze messages.
Are you sure you want to use this word list based checking?
Think twice before you use this subroutine. It knows nothing about the content of your mail. "It's all UBE unless proven otherwise" is the motto. The brutal search tracks words and phrases to find an indication of mass posting and traces of Unsolicited Bulk Email (UBE aka spam). Repeat: Read the first paragraph again before you consider putting this file into action. This filter WILL PASS through unwanted mail and it WILL catch good mail. This is rule based matching, so I suppose you know where you're putting your head with this. Ahem. Alerted? Good.
The Story
There was a man and a mail account. The account had limited space; he couldn't install any other programs because the disk quota would have been exceeded. The system administrators weren't interested in installing anything. The mail server ran behind a firewall and had an OS that was never heard of - it couldn't run other programs. Or if it could, the Bad system administrator was too scared to install extra programs on the host where the MTA ran. No joy, no means to stop incoming UBE. Right?
Wrong. There was procmail. The Bad system administrator didn't mention that ~/.procmailrc was honored - just that external programs were a no-no-no (Technical: the MDA host mounted user disks; the server ran on a separate host and couldn't use any of the user compiled programs. Statically linked ones filled up the man's disk space).
First line of defense, any defense would do. So, this rule based file was born. Nothing else was installed in that account and the happy word list based matching routine kept chewing mail, mail, mail. And the system administrator was happy - he nurtured the MTA host's CPU resources and noticed nothing alarming. All ticked like clockwork.
Life began again. After 1000 mail bombards a day, the account was usable again.
Motivation
If you can, use Bayesian filters and forget all rule based ones, word and phrase matching ones, and all other static filters. On the other hand, if you want a quick solution, even an imperfect one, until you have time to learn and set up other tools, this subroutine may be of interest.
The best part. You can carry this single file anywhere where procmail lives. No other files are needed. Setup couldn't be simpler.
The general consensus is that you should not send bounces. The UBE sender is not there, because the address is usually forged. Do not increase the network traffic. Instead, save the messages to folders and periodically check their contents. It's not nice to be forced to apologize for bounces sent to a wrong destination.
Procmail is picky about whitespace in continued lines: make sure there is not a single space left after the continuation backslash. Use a good editor or external programs to get rid of the trailing whitespace. In Emacs you would add this line to your ~/.emacs startup file: "(add-hook 'write-file-hooks 'delete-trailing-whitespace)"
    :0
    * ^Subject:.*(regexp\
        |and-more\
        |and-more\
        )
    {
        #   Process it
    }
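If you prefer to check an rc file from outside the editor, the test is a one-line regular expression: a backslash followed by blanks at the end of a line. The Python sketch below is one hypothetical way to do it (the function name is made up); it only reports line numbers and changes nothing.

```python
import re

# A backslash followed by blanks at end of line: the continuation is broken.
BAD_CONTINUATION = re.compile(r"\\[ \t]+$")

def find_bad_continuations(text):
    """Return 1-based numbers of lines with whitespace after a trailing backslash."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if BAD_CONTINUATION.search(line)
    ]

recipe = "\n".join([
    "* ^Subject:.*(regexp\\",
    "    |and-more\\  ",        # two stray spaces after the backslash
    "    |and-more\\",
    "    )",
])
print(find_bad_continuations(recipe))   # prints [2]
```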
Why are the regexps put into this file and not into a separate regexp file? Good question. It is possible to check a message's content with an external process, like grep, to see if any matches are found. This methodology is covered in the Procmail Tips section "Using grep with file lists to match messages" at <http://pm-tips.sf.net>. The reason why all the regexps are maintained inside this file is:
None. No dependencies to other procmail modules.
    PMSRC = "/path/to/procmail/lib"

    #   Exclude these addresses from tests

    VALID_FROM = "(my@address.example.com|word@here.example.com)"

    :0
    *$ ! ^From:.*$VALID_FROM
    *  ! ^FROM_DAEMON
    {
        INCLUDERC = $PMSRC/pm-jaube-keywords.rc

        #   Variable "ERROR" is set if message was UBE

        :0 :
        * ! ERROR ?? ^^^^
        junk.ube.spool
    }
The layout of this file is managed by Emacs packages tinyprocmail.el and tinytab.el for the 4 tab text placement. See project http://freecode.net/projects/emacs-tiny-tools/
Put all your Unsolicited Bulk Email (aka spam) filters towards the end of your ~/.procmailrc. The idea is that valid messages are filed first (mailing lists, your work and private mail, bounces) and only the uncategorized messages are checked last.
YOU CANNOT USE THIS PROCMAIL SUBROUTINE UNLESS YOU HAVE TRAINED THE BAYESIAN PROGRAM FIRST!
To train:
    $ mkdir $HOME/.annoyance
    $ DB=$HOME/.annoyance/dict.bin; DB2=$HOME/.annoyance/fdict.bin
    $ annoyance-filter --mail single.msg --prune --write $DB
    $ annoyance-filter --phrasemax 2 \
        --read $DB \
        --junk dir/to/bad/messages \
        --prune --write $DB
    $ annoyance-filter -v --read $DB --prune --fwrite $DB2
To check message:
    $ annoyance-filter --read $DB --test mail.msg
    $ annoyance-filter --fread $DB2 -v --class mail.msg
There are several Bayesian statistical analysis programs that study a message's tokens and then classify it into one of two categories: good or bad, or if you like, ham and spam. Not all Bayesian programs are the same, so if you want to achieve the magic 99.99% probability, the only way to do that is to chain several programs in series. No single program can solve UBE detection. This procmail subroutine implements a call interface to the program annoyance-filter, which must already have been installed.
The general consensus is that you should not send bounces. The UBE sender is not there, because the address is forged. Do not increase the network traffic; you will not do any good to anybody by bouncing messages, you just increase mail traffic even more. Instead, save the messages to folders and periodically check their contents.
If the annoyance-filter program is available, define this variable in your ~/.procmailrc. Use an absolute path to make the external shell call quick; it will reduce server load considerably.
    JA_UBE_ANNOYANCE_PRG = /usr/bin/annoyance-filter
If you do not have the program installed, do not leave the variable lying around, because it will keep this subroutine active. Calling a nonexistent program is not a good idea, so it is better to empty the variable if the program is not available.
None. No dependencies to other procmail modules.
    PMSRC = $HOME/procmail   # procmail recipe dir

    <other checks, mailing lists, work mail etc.>

    JA_UBE_ANNOYANCE_PRG     = "/usr/bin/nice -n 5 /usr/bin/annoyance-filter"
    JA_UBE_ANNOYANCE_SPAM_DB = $HOME/.annoyance/dict.db
    INCLUDERC                = $PMSRC/pm-jaube-prg-spamprobe.rc

    #   ERROR will contain the word "yes" if the message was spam

    :0 :
    * ERROR ?? yes
    junk.mbox
The layout of this file is managed by Emacs packages tinyprocmail.el and tinytab.el for the 4 tab text placement. See http://freecode.net/projects/emacs-tiny-tools/
Put all your Unsolicited Bulk Email (aka spam) filters towards the end of your ~/.procmailrc. The idea is that valid messages are filed first (mailing lists, your work and private mail, bounces) and only the uncategorized messages are checked last.
YOU CANNOT USE THIS PROCMAIL SUBROUTINE UNLESS YOU HAVE TRAINED THE BAYESIAN PROGRAM FIRST!
To train:
    $ ls spam/*.mail | xargs -n 1 bmf -s    # feed individual messages
    $ ls good/*.mail | xargs -n 1 bmf -n    # feed individual messages
To test
$ bmf -p < test.mail | less |
There are several Bayesian statistical analysis programs that study a message's tokens and then classify it into one of two categories: good or bad, or if you like, ham and spam. Not all Bayesian programs are the same, so if you want to achieve the magic 99.99% probability, the only way to do that is to chain several programs in series. No single program can solve UBE detection.
For a serious discussion of the strengths of the different programs, refer to the very good article "Spam Filters" by Sam Holden, 2004-08-16 <http://freecode.net/articles/view/964>. The article thoroughly evaluated the following programs:
This subroutine implements a call interface to the bmf program. Why would you need it? Because, unfortunately, bmf by default uses exactly the same headers as SpamAssassin, so the two cannot cooperate: bmf would overwrite existing SpamAssassin headers. This subroutine takes care of saving the previous headers and moving the bmf results to their own X-Spam-bmf-* headers.
The general consensus is that you should not send bounces. The UBE sender is not there, because the address is forged. Do not increase the network traffic; you will not do any good to anybody by bouncing messages, you just increase mail traffic even more. Instead, save the messages to folders and periodically check their contents.
If the bmf program is available, define this variable in your ~/.procmailrc. Use an absolute path to make the external shell call quick; it will reduce server load considerably.
    JA_UBE_BMF_PRG = "/usr/bin/bmf"
If you do not have the program installed, do not leave the variable lying around, because it will keep this subroutine active. Calling a nonexistent program is not a good idea, so it is better to empty the variable if the program is not available.
None. No dependencies to other procmail modules.
If headers were enabled, they will contain:
    X-Spam-bmf-Status: Yes, hits=1.000000 required=0.900000, tests=bmf
    X-Spam-bmf-Flag: YES
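The renaming step can be pictured with the Python sketch below, which moves generic X-Spam-* header lines into a program specific namespace. It is an illustration only: the subroutine does this inside procmail, the function name is hypothetical, and the sketch ignores folded (multi-line) headers and the saving of any previous SpamAssassin headers.

```python
def rename_spam_headers(header_lines, program):
    """Move X-Spam-* headers into the X-Spam-<program>-* namespace."""
    generic = "x-spam-"
    private = "x-spam-%s-" % program.lower()
    renamed = []
    for line in header_lines:
        lower = line.lower()
        # Leave other headers and already renamed headers untouched
        if lower.startswith(generic) and not lower.startswith(private):
            line = "X-Spam-%s-%s" % (program, line[len(generic):])
        renamed.append(line)
    return renamed

bmf_output = [
    "X-Spam-Status: Yes, hits=1.000000 required=0.900000, tests=bmf",
    "X-Spam-Flag: YES",
    "Subject: hi",
]
for line in rename_spam_headers(bmf_output, "bmf"):
    print(line)
```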
    PMSRC = $HOME/procmail   # procmail recipe dir

    <other checks, mailing lists, work mail etc.>

    JA_UBE_BMF_PRG = "/usr/bin/nice -n 5 /usr/bin/bmf"
    INCLUDERC      = $PMSRC/pm-jaube-prg-bmf.rc

    #   ERROR will contain the word "yes" if the program classified
    #   the message into the "bad" category.

    :0 :
    * ERROR ?? yes
    junk.mbox
The layout of this file is managed by Emacs packages tinyprocmail.el and tinytab.el for the 4 tab text placement. See project http://freecode.net/projects/emacs-tiny-tools/
Put all your Unsolicited Bulk Email (aka spam) filters towards the end of your ~/.procmailrc. The idea is that valid messages are filed first (mailing lists, your work and private mail, bounces) and only the uncategorized messages are checked last.
YOU CANNOT USE THIS PROCMAIL SUBROUTINE UNLESS YOU HAVE TRAINED THE BAYESIAN PROGRAM FIRST!
To train:
    $ rm -f ~/.bogofilter/*.db    # delete database
    $ bogofilter -B -n good.msg ...
    $ bogofilter -B -s spam.msg ...
There are several Bayesian statistical analysis programs that study a message's tokens and then classify it into one of two categories: good or bad, or if you like, ham and spam. Not all Bayesian programs are the same, so if you want to achieve the magic 99.99% probability, the only way to do that is to chain several programs in series. No single program can solve UBE detection. This procmail subroutine implements a call interface to the program bogofilter, which must already have been installed.
The general consensus is that you should not send bounces. The UBE sender is not there, because the address is forged. Do not increase the network traffic; you will not do any good to anybody by bouncing messages, you just increase mail traffic even more. Instead, save the messages to folders and periodically check their contents.
If the bogofilter program is available, define this variable in your ~/.procmailrc. Use an absolute path to make the external shell call quick; it will reduce server load considerably.
JA_UBE_BOGOFILTER_PRG = /usr/bin/bogofilter |
If you do not have the program installed, do not leave the variable lying around, because it will keep this subroutine active. Calling a nonexistent program is not a good idea, so it is better to empty the variable if the program is not available.
None. No dependencies to other procmail modules.
    PMSRC = $HOME/procmail   # procmail recipe dir

    <other checks, mailing lists, work mail etc.>

    JA_UBE_BOGOFILTER_PRG = "/usr/bin/nice -n 5 /usr/bin/bogofilter"
    INCLUDERC             = $PMSRC/pm-jaube-prg-bogofilter.rc

    #   ERROR will contain the reason if the program classified
    #   the message into the "bad" category.

    :0 :
    * ! ERROR ?? ^^^^
    junk.mbox
The layout of this file is managed by Emacs packages tinyprocmail.el and tinytab.el for the 4 tab text placement. See project http://freecode.net/projects/emacs-tiny-tools/
Put all your Unsolicited Bulk Email (aka spam) filters towards the end of your ~/.procmailrc. The idea is that valid messages are filed first (mailing lists, your work and private mail, bounces) and only the uncategorized messages are checked last.
YOU CANNOT USE THIS PROCMAIL SUBROUTINE UNLESS YOU HAVE TRAINED THE BAYESIAN PROGRAM FIRST!
To train:
    $ bsfilter --add-clean good.msg ...
    $ bsfilter --add-spam spam.msg ...
There are several Bayesian statistical analysis programs that study a message's tokens and then classify it into one of two categories: good or bad, or if you like, ham and spam. Not all Bayesian programs are the same, so if you want to achieve the magic 99.99% probability, the only way to do that is to chain several programs in series. No single program can solve UBE detection. This procmail subroutine implements a call interface to the program bsfilter, which must already have been installed.
The general consensus is that you should not send bounces. The UBE sender is not there, because the address is forged. Do not increase the network traffic; you will not do any good to anybody by bouncing messages, you just increase mail traffic even more. Instead, save the messages to folders and periodically check their contents.
If the bsfilter program is available, define this variable in your ~/.procmailrc. Use an absolute path to make the external shell call quick; it will reduce server load considerably.
JA_UBE_BSFILTER_PRG = /usr/bin/bsfilter |
If you do not have the program installed, do not leave the variable lying around, because it will keep this subroutine active. Calling a nonexistent program is not a good idea, so it is better to empty the variable if the program is not available.
None. No dependencies to other procmail modules.
    PMSRC = $HOME/procmail   # procmail recipe dir

    <other checks, mailing lists, work mail etc.>

    JA_UBE_BSFILTER_PRG = "/usr/bin/nice -n 5 /usr/bin/bsfilter"
    INCLUDERC           = $PMSRC/pm-jaube-prg-bsfilter.rc

    #   ERROR will contain the word "yes" if the message was spam

    :0 :
    * ERROR ?? yes
    junk.mbox
The layout of this file is managed by Emacs packages tinyprocmail.el and tinytab.el for the 4 tab text placement. See project http://freecode.net/projects/emacs-tiny-tools/
Put all your Unsolicited Bulk Email (aka spam) filters towards the end of your ~/.procmailrc. The idea is that valid messages are filed first (mailing lists, your work and private mail, bounces) and only the uncategorized messages are checked last.
YOU CANNOT USE THIS PROCMAIL SUBROUTINE UNLESS YOU HAVE TRAINED THE BAYESIAN PROGRAM FIRST!
To train:
    $ rm ~/.idata                           # delete database
    $ echo herbalife | ifile -i spam        # initialize database
    $ ifile -h -i good good.msg ...
    $ ifile -h -i spam spam.msg ...
There are several Bayesian statistical analysis programs that study a message's tokens and then classify it into one of two categories: good or bad, or if you like, ham and spam. Not all Bayesian programs are the same, so if you want to achieve the magic 99.99% probability, the only way to do that is to chain several programs in series. No single program can solve UBE detection. This procmail subroutine implements a call interface to the program ifile, which must already have been installed.
The general consensus is that you should not send bounces. The UBE sender is not there, because the address is forged. Do not increase the network traffic; you will not do any good to anybody by bouncing messages, you just increase mail traffic even more. Instead, save the messages to folders and periodically check their contents.
If the ifile program is available, define this variable in your ~/.procmailrc. Use an absolute path to make the external shell call quick; it will reduce server load considerably.
JA_UBE_IFILE_PRG = /usr/bin/ifile |
If you do not have the program installed, do not leave the variable lying around, because it will keep this subroutine active. Calling a nonexistent program is not a good idea, so it is better to empty the variable if the program is not available.
None. No dependencies to other procmail modules.
If header output is enabled, it will contain the folder name ifile thinks the message belongs to. Assuming that trained folders used for messages were spam and good, then the headers read:
    X-Spam-Ifile-Status: spam
    X-Spam-Ifile-Status: good

    PMSRC = $HOME/procmail   # procmail recipe dir

    <other checks, mailing lists, work mail etc.>

    JA_UBE_IFILE_PRG = "/usr/bin/nice -n 5 /usr/bin/ifile"
    INCLUDERC = $PMSRC/pm-jaube-prg-ifile.rc

    # The ERROR will contain the reason if the program classified
    # the message into the "bad" category.

    :0 :
    * ! ERROR ?? ^^^^
    junk.mbox
The layout of this file is managed by Emacs packages tinyprocmail.el and tinytab.el for the 4 tab text placement. See project http://freecode.net/projects/emacs-tiny-tools/
There are several Bayesian statistical analysis programs that study a message's tokens and then classify it into one of two categories: good or bad, or if you like, ham and spam. This module is a meta package that calls all the individual modules that interface to these Bayesian programs. The use is simple: define the programs that are available on your system and that you have trained (Bayesian programs need to be trained before use), and this module will query how those programs would classify the message.
PMSRC must point to source directory of procmail code. This subroutine will include
To activate the Bayesian program(s), define the path to them. The default value for all these variables is "", i.e. it is assumed that no programs have been installed or trained.
Optional variables to set:
All headers are canonicalized to X-Spam-<PROGRAM>- so e.g. in bogofilter's case, the default X-Bogocity header is changed to value X-Spam-Bogofilter-Status and so on. Summaries like below can then be generated:
    $ egrep -i '(Subject|From|^X-Spam.*Status)' *.mbox
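To get counts rather than a raw listing, the canonical status headers can be tallied with standard tools. The following is a minimal sketch, not part of the library: the helper name `count_spam_status` is hypothetical, and it assumes that the status headers appear one per line at the start of a line, as in the header examples in this document. The `-h` option of grep (suppress file names) is widely available but not required by POSIX.

```shell
# Hypothetical helper: tally X-Spam-<PROGRAM>-Status header lines
# found in the given mbox files, most frequent first.
# Usage: count_spam_status FILE...
count_spam_status() {
    grep -h -i '^X-Spam-[^:]*-Status:' "$@" |
        sort |          # group identical header lines together
        uniq -c |       # prefix each distinct line with its count
        sort -rn        # highest count first
}
```

A call such as `count_spam_status *.mbox` prints one line per distinct header value, each preceded by the number of times it occurred.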
    PMSRC = $HOME/procmail   # procmail recipe dir

    # ... other checks, mailing lists, work mail etc.

    # bogofilter and Bayesian Mail Filter available and trained. Use them.

    JA_UBE_BOGOFILTER_PRG = "/bin/nice -n 5 /bin/bogofilter"
    JA_UBE_BMF_PRG        = "/bin/nice -n 5 /bin/bmf"

    # Call the "umbrella" module, which will take care of
    # all the details.

    INCLUDERC = $PMSRC/pm-jaube-prg-runall.rc

    # ERROR is set if message was spam. The "()\/" logs reason.

    :0 :
    * ERROR ?? ^()\/.+
    junk.mbox
Put all your Unsolicited Bulk Email (aka spam) filters towards the end of your ~/.procmailrc. The idea is that valid messages are filed first (mailing lists, your work and private mail, bounces) and only the uncategorized messages are checked last.
YOU CANNOT USE THIS PROCMAIL SUBROUTINE UNLESS YOU HAVE TRAINED THE BAYESIAN PROGRAM FIRST!
To train:
    $ rm -f ~/.spamassassin/bayes*
    $ sa-learn $opt --local --no-rebuild --ham  good.msg ...
    $ sa-learn $opt --local --no-rebuild --spam spam.msg ...
    $ sa-learn --rebuild
There are several Bayesian statistical analysis programs that study a message's tokens and then classify it into one of two categories: good or bad, or if you like, ham and spam. Not all Bayesian programs are the same, so if you want to achieve the magic 99.99% probability, the only way to do that is to chain several programs serially. No single program can solve UBE detection. This procmail subroutine implements the call interface to the program spamassassin, which must already be installed.
The general consensus is that you should not send bounces. The UBE sender is not there, because the address is forged. You will not do anybody any good by bouncing messages; you just increase mail traffic even more. Instead, save the messages to folders and periodically check their contents.
If the spamassassin program is available, define this variable in your ~/.procmailrc. Use an absolute path to make the external shell call quick; it will reduce server load considerably.
    JA_UBE_SPAMASSASSIN_PRG = /usr/bin/spamassassin
If you do not have the program installed, do not leave the variable lying around, because it will keep this subroutine active. Calling a nonexistent program is not a good idea, so it is better to empty the variable if the program is not available.
None. No dependencies to other procmail modules.
    PMSRC = $HOME/procmail   # procmail recipe dir

    <other checks, mailing lists, work mail etc.>

    JA_UBE_SPAMASSASSIN_PRG = "/usr/bin/nice -n 5 /usr/bin/spamassassin"
    INCLUDERC = $PMSRC/pm-jaube-prg-spamassassin.rc

    # The ERROR will contain the reason if the program classified
    # the message into the "bad" category.

    :0 :
    * ! ERROR ?? ^^^^
    junk.mbox
Put all your Unsolicited Bulk Email (aka spam) filters towards the end of your ~/.procmailrc. The idea is that valid messages are filed first (mailing lists, your work and private mail, bounces) and only the uncategorized messages are checked last.
YOU CANNOT USE THIS PROCMAIL SUBROUTINE UNLESS YOU HAVE TRAINED THE BAYESIAN PROGRAM FIRST!
To train:
    $ spamoracle add -v -spam spam.msg ...    # feed individual messages
    $ spamoracle add -v -good good.msg ...    # feed individual messages
To test:
    $ spamoracle test mail.msg | less
There are several Bayesian statistical analysis programs that study a message's tokens and then classify it into one of two categories: good or bad, or if you like, ham and spam. Not all Bayesian programs are the same, so if you want to achieve the magic 99.99% probability, the only way to do that is to chain several programs serially. No single program can solve UBE detection.
Using Spamoracle as the sole spam protection is inefficient, because version 1.4 (2004-09-29) does not accept messages from stdin. Because of this, the message has to be written to a temporary file before calling Spamoracle, and afterwards the temporary file must be removed with rm. All three of these shell calls are needed for each message. If you have other detection programs, call them first to identify Unsolicited Bulk Email.
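The cost of the temporary-file workflow is easier to see when it is written out. The shell function below is only an illustration of the steps, not the module's actual code: the name `classify_via_tempfile` is hypothetical, the classifier command is passed in as arguments, and `mktemp` is assumed to be available (it is common but not part of POSIX).

```shell
# Hypothetical helper: run a file-only classifier on a message that
# arrives on stdin.
# Usage: classify_via_tempfile PROGRAM [ARGS...] < message
classify_via_tempfile() {
    tmp=$(mktemp) || return 1     # create the temporary file
    cat > "$tmp"                  # save the message read from stdin
    "$@" "$tmp"                   # run the classifier on the saved file
    rc=$?                         # remember the classifier's exit status
    rm -f "$tmp"                  # always remove the temporary file
    return "$rc"
}
```

With the test command shown earlier, a call would look like `classify_via_tempfile spamoracle test < mail.msg`. Every message pays for the extra processes, which is why cheaper detectors are best run first.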
The general consensus is that you should not send bounces. The UBE sender is not there, because the address is forged. You will not do anybody any good by bouncing messages; you just increase mail traffic even more. Instead, save the messages to folders and periodically check their contents.
If the spamoracle program is available, define this variable in your ~/.procmailrc. Use an absolute path to make the external shell call quick; it will reduce server load considerably.
    JA_UBE_SPAMORACLE_PRG = /usr/bin/spamoracle
If you do not have the program installed, do not leave the variable lying around, because it will keep this subroutine active. Calling a nonexistent program is not a good idea, so it is better to empty the variable if the program is not available.
None. No dependencies to other procmail modules.
If headers were enabled, they will contain these values. The score's values are spam probability 0.0 - 1.0 and the degree of similarity 0-15 of the message with the spam messages in the corpus.
    X-Spam-Spamoracle-Status: yes
    X-Spam-Spamoracle-Score: 1.00 -- 15
    X-Spam-Spamoracle-Details: refid:98 $$$$:98 surfing:98 asp:95 click:93
        cable:92 instantly:90 https:88 internet:87 www:86 U4:85 isn't:14
        month:81 com:75 surf:75
    X-Spam-Spamoracle-Attachments: cset="GB2312" type="application/octet-stream"
        name="Guangwen4.zip"

    PMSRC = $HOME/procmail   # procmail recipe dir

    <other checks, mailing lists, work mail etc.>

    JA_UBE_SPAMORACLE_PRG = "/usr/bin/nice -n 5 /usr/bin/spamoracle"
    INCLUDERC = $PMSRC/pm-jaube-prg-spamoracle.rc

    # The ERROR will contain the word "yes" if the program classified
    # the message into the "bad" category.

    :0 :
    * ERROR ?? yes
    junk.mbox
Put all your Unsolicited Bulk Email (aka spam) filters towards the end of your ~/.procmailrc. The idea is that valid messages are filed first (mailing lists, your work and private mail, bounces) and only the uncategorized messages are checked last.
YOU CANNOT USE THIS PROCMAIL SUBROUTINE UNLESS YOU HAVE TRAINED THE BAYESIAN PROGRAM FIRST!
To train:
    $ spamprobe -8 good good.msg ...
    $ spamprobe -8 spam spam.msg ...
Make sure there are no stale lock files, or spamprobe and this subroutine will hang indefinitely:
    $ rm -f ~/.spamprobe/lock
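Deleting the lock unconditionally can interfere with a spamprobe process that is still running. One cautious variant is to remove the lock only when it is clearly old. The sketch below is an illustration under stated assumptions: the helper name `remove_stale_lock` and the 30-minute default are made up for this example, and it relies on the `-mmin` test of find, which GNU and BSD find provide but POSIX does not require.

```shell
# Hypothetical helper: delete LOCKFILE only if it was last modified
# more than MINUTES minutes ago (default 30, an arbitrary choice).
# Usage: remove_stale_lock LOCKFILE [MINUTES]
remove_stale_lock() {
    lock=$1
    minutes=${2:-30}
    [ -f "$lock" ] || return 0        # no lock file, nothing to do
    # find prints the path only when the age test matches
    if [ -n "$(find "$lock" -mmin "+$minutes" 2>/dev/null)" ]; then
        rm -f "$lock"
    fi
}
```

For example, `remove_stale_lock ~/.spamprobe/lock 30` leaves a recent lock in place and removes one that has not been touched for more than half an hour.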
There are several Bayesian statistical analysis programs that study a message's tokens and then classify it into one of two categories: good or bad, or if you like, ham and spam. Not all Bayesian programs are the same, so if you want to achieve the magic 99.99% probability, the only way to do that is to chain several programs serially. No single program can solve UBE detection. This procmail subroutine implements the call interface to the program spamprobe, which must already be installed.
The general consensus is that you should not send bounces. The UBE sender is not there, because the address is forged. You will not do anybody any good by bouncing messages; you just increase mail traffic even more. Instead, save the messages to folders and periodically check their contents.
If the spamprobe program is available, define this variable in your ~/.procmailrc. Use an absolute path to make the external shell call quick; it will reduce server load considerably.
    JA_UBE_SPAMPROBE_PRG = /usr/bin/spamprobe
If you do not have the program installed, do not leave the variable lying around, because it will keep this subroutine active. Calling a nonexistent program is not a good idea, so it is better to empty the variable if the program is not available.
None. No dependencies to other procmail modules.
    PMSRC = $HOME/procmail   # procmail recipe dir

    <other checks, mailing lists, work mail etc.>

    JA_UBE_SPAMPROBE_PRG = "/usr/bin/nice -n 5 /usr/bin/spamprobe"
    INCLUDERC = $PMSRC/pm-jaube-prg-spamprobe.rc

    # The ERROR will contain the word "yes" if the message was spam

    :0 :
    * ERROR ?? yes
    junk.mbox
Put all your UBE (aka spam) filters towards the end of your ~/.procmailrc. The idea is that valid messages are filed first (mailing lists, your work and private mail, bounces) and only the uncategorized messages are checked.
If you think you can put this recipe as the first line of defence for your mail, you will be disappointed. Checking UBE with procmail's rule-based means does not work that way. The good messages must be sorted first (your mailing lists and your important work or personal messages), and only then can what is left be scanned by static rule-based tools like this procmail module. There are much better tools that are based on statistical analysis of messages. You really should consider using one or a combination of the Bayesian tools: Spamassassin, bogofilter, spamprobe, Bayesian Mail Filter, ifile etc.
Repeat: procmail rules are not the right tool for UBE control. Pattern matching rules can never keep up with the spammers. That said, if you:
only then consider this module or any other procmail-based spam filter. So, please don't set your expectations high. Spend some time with the configuration variables and check the result returned in variable ERROR carefully. Good luck.
Remember: this is not 100% accurate and there will always be some false hits, so don't just junk messages to /dev/null.
Originally Daniel Smith posted his spam.rc, where he had gathered many tips and heuristics for filtering UBE email. The filter here represents the work of many procmail users. The original filters were modified: some rules that caught legitimate messages were left out, and the package was made a bit more general so that it can be included via INCLUDERC in the standard way.
Thanks to Daniel and others, the days of UBE bombing can be reduced when this filter is active. Some UBE messages may still slip into the mailbox, but that is the problem with all static rule-based tools.
A good strategy for following incoming mail is to log the vital parts, like Date, From and Subject, to a log file, followed by the reason for what happened to the message. The ~/Mail/mail.log might look like:
    1997-12-08 work@example.com Extra Holiday $$$$$ [jaube; Marketing-Big-ExitCode; LEGAL, MONEY-MAKING PHENOMENON]
    1997-12-09 Denizen logger@example.com [RePol] hiding
    1997-12-09 david X dx@example.com Re: Send list to incoming folder
    1997-12-09 david X dx@example.com Re: Send list to incoming folder
    1997-12-09 OMC manager omcman@example.fi "Environments updated" [my; work-localenv]
    1997-12-09 doodle@example.org Re: Gnus (Emacs Newsreader) FAQ [my; emacs; Re: Gnus (Emacs Newsreader) FAQ ]
The first entry is a UBE message that was identified and saved to a folder. The next three messages were filed to mailing-list folders, and no [] action is displayed for them (left out due to the high volume of such messages). The second-to-last entry was an internal work message. In the last one, someone asked something about Emacs.
The basic incoming message log recipe could look like this. Variable TODAY is $YYYY-$MM-$DD, whose values are set after calling pm-jadate.rc. LISTS is a user-set variable for excluding mailing lists whose activity is not important. Variables FROM and SUBJECT are fields read from the message's headers.
    BIFF      = $HOME/Mail/mail.log
    INCLUDERC = $PMSRC/pm-jadate.rc
    ...

    :0 hwic:
    *$ ! $LISTS
    |echo "$TODAY $FROM $FSUBJECT" >> $BIFF
Here is a small Perl script that prints a summary of trapped UBE messages from a log like the one above. It gives a nice overview of which recipes catch most of the UBE messages.
    perl -ne '
        # Count each recipe name that follows "jaube; " in the log
        if (/jaube; ([^;\s\]]+)/) { $s{$1}++; $total++ }
        END {
            # Print: count, percentage of all UBE hits, recipe name
            for (keys %s) {
                printf "%d %d %s\n", $s{$_}, int($s{$_} / $total * 100), $_;
            }
        }' mail.log | sort -nr
Here are sample results from a two-month period. A total of 3248 UBE messages were caught.
    count   %  type
    ------------------------------------------
      554  17  Marketing-CountBigLetterWords
      457  14  Marketing
      422  12  Marketing-SelectedBigLetterWords
      349  10  AddrBogus-ToFrom
      263   8  FromReceived-Mismatch
      223   6  NoDirectAddress-ToCc
      216   6  HdrForgedPegasus
      164   5  AddrBogus-To
      151   4  MessageId
      102   3  BodyHtml
       73   2  Received-IPError
       63   1  Identical-FromTo
       53   1  AddrInvalid
       15   0  From-nslookup
        9   0  HdrReceivedTime
        7   0  HdrX-UIDL
        4   0  Marketing-headers
The general consensus is that you should not send bounces. The UBE sender is not there, because the address is usually forged. Do not increase the network traffic. Instead, save the messages to folders and periodically check their contents. It's not nice to be forced to apologize because you bounced a message to the wrong destination. DON'T BOUNCE. Forget all recipe examples that use HOST and EXITCODE and be a good Net citizen.
PMSRC must point to source directory of procmail code. This recipe file will include
Only a handful of the most important variables are described here. You really should read all the comments in the "user configured section" of this procmail module's code. Most of the defaults should work out of the box.
Alternatively, you can check the content of the header JA_UBE_HDR, which contains the results of the above variables. Possible values for ERROR are:
    AddrAOLinvalid
    AddrBogus-From
    AddrInvalid-From
    AddrInvalid-To
    AddrNumeric
    AddrNumericDomain
    AddrUbeLike
    BodyAttachment-FileIllegalAdditional
    BodyAttachment-FileIllegalMatch
    BodyAttachment-FileIllegalOther
    BodyAttachment-FileSuspect
    BodyCharacters-Illegal
    BodyHtml-NonMime
    BodyHtml-script
    BodyHtmlBase64
    BodyHtmlImage
    BodyHtmlTags
    BodyMimeCharset-Illegal
    EnvelopeFrom-Invalid
    From-nslookup
    FromReceived-Mismatch
    HdrForgedPegasus
    HdrReceived
    HdrReceivedTime
    HdrX-Distribution
    HdrX-UIDL
    Header-ApparentlyTo
    HeaderCharacters-Illegal
    HeaderMimeCharset-Illegal
    Html-base64
    Identical-FromTo
    Marketing-Body
    Marketing-CountBigLetterWords
    Marketing-SelectedBigLetterWords
    Marketing-Subject
    Marketing-SubjectGreeting
    MegaSpammer
    MessageId-Invalid
    MessageId-Empty
    NoDirectAddress-ToCc
    NotEnoughHeaders
    Received-IPError
    VirusBody
    VirusHeader
    # - All legitimate messages should already have been handled and
    #   saved before this recipe.
    # - Activate the filter only for messages that are not from a
    #   daemon and not from valid senders: like from "my" domain
    #   and mailing lists and from somewhere else.

    VALID_FROM = "(my@address.example.com|word@here.example.com)"

    :0
    *$ ! ^From:.*$VALID_FROM
    *  ! ^FROM_DAEMON
    {
        # Do not add extra headers. This saves an external shell call
        # (formail). Also do not try to kill the message content,
        # again saving one external call (awk). With these, the
        # recipe is faster and more CPU friendly.

        PM_JA_UBE_HDR                  = ""
        JA_UBE_ATTACHMENT_ILLEGAL_KILL = "no"

        INCLUDERC = $PMSRC/pm-jaube.rc

        # Variable "ERROR" is set if the message was UBE; record the
        # error to the log file with "()\/"

        :0 :
        * ERROR ?? ()\/[a-z].*
        {
            # Don't save those *.exe, *.zip UBE attachments

            :0
            * ERROR ?? attachment.*file
            /dev/null

            :0 :
            spam.mbox
        }
    }
There may be UBE messages that fool the FROM_DAEMON test, so you could also use a finer check. The standard daemon error message almost always has the sentence "Transcript of session follows" in the body. This recipe says: "Unless proven otherwise, I don't believe this is a daemon message, even if it looks like one". Add more "2^1" checks to raise the score for other valid daemon cases.
    :0
    *  -1^0 ^FROM_DAEMON
    * ! 2^1 B ?? Transcript of session follows
    {
        # ... Now call UBE checker
    }
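The two weighted conditions can be read as simple arithmetic, following procmail's scoring rules: a message matching ^FROM_DAEMON contributes -1, a body that lacks the transcript sentence contributes 2, and the block runs only when the total is positive. The shell sketch below merely tabulates that arithmetic. It is not part of the module, and the function name `ube_check_wanted` is hypothetical.

```shell
# Hypothetical illustration of the two weighted conditions.
#   $1 = 1 if the message matches ^FROM_DAEMON, else 0
#   $2 = 1 if the body contains "Transcript of session follows", else 0
# Prints the score; succeeds when the UBE checker block would run.
ube_check_wanted() {
    score=0
    if [ "$1" -eq 1 ]; then score=$((score - 1)); fi   # -1^0 ^FROM_DAEMON
    if [ "$2" -eq 0 ]; then score=$((score + 2)); fi   # ! 2^1 B ?? Transcript...
    echo "$score"
    [ "$score" -gt 0 ]                                 # positive score: recipe matches
}
```

A genuine bounce (daemon sender plus transcript, `ube_check_wanted 1 1`) ends up at -1 and is left alone, while a daemon-looking message without the transcript (`ube_check_wanted 1 0`) ends up at 1 and is passed to the UBE checker.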
Framework for all programs that need to reply to a message only once, usually known as the "vacation" feature. If you change the cache file, you can attach this recipe to any messages that you want to deal with only once.
PMSRC must point to source directory of procmail code. This subroutine will include
To turn on the vacation feature, create the file ~/.vac; the recipe below then activates vacation. If vacation is not active, the cache file is removed (automatic cleanup). VERBOSE is also turned off while you're on vacation, so that your procmail log does not fill up.
So when you go on vacation, you 'touch ~/.vac' and update ~/.vacation.msg. When you come back, you 'rm ~/.vac'. That's it.
IMPORTANT: If you are subscribed to mailing lists, be sure to file messages from those services first and put the vacation recipe only after the list or bot messages. Also add sufficient "!" conditions in order not to reply to other "bot" service messages.
    JA_VAC_ID_CACHE = $HOME/.pm-vac.cache

    :0
    *$ ? $IS_EXIST $HOME/.vac
    {
        VERBOSE   = off
        JA_VAC    = "yes"
        JA_VAC_RC = $PMSRC/pm-myvac.rc    # my vacation recipe
        INCLUDERC = $PMSRC/pm-javac.rc    # framework
    }

    :0 E    # else
    * ? $IS_EXIST $JA_VAC_ID_CACHE
    {
        dummy = `$RM -f $JA_VAC_ID_CACHE`
    }
Here is an example of a pm-myvac.rc recipe:
    # Change subject

    :0 fhw
    * ^Subject: *\/[^ ].*
    | $FORMAIL -I "Subject: vacation (was: $MATCH)"

    :0 fb       # put message to body
    | $CAT $HOME/.vacation.msg

    :0          # Send it
    | $SENDMAIL
This file defines common variables that you can use in a recipe's condition line. Procmail does not know about escape sequences like \t or \n, so it is much more readable to use variables as substitutes for common regular expression atoms. In this file, the variable names follow the well-known Perl regular expression names, so that $s is almost like the Perl expression \s (whitespace) and $S is almost equivalent to \S (non-whitespace). Similarly, $d is \d (digit) and $D resembles \D (non-digit). Note that the condition line must start with "*$ ", where "$" expands the variables:
    :0
    *$ $s+something+$s+$d+$a+
The equivalent without variables (you don't see the tabs and spaces here):
    :0      # Space + tab
    * [ ]+something+[ ]+[0-9]+[a-z]+
In addition, all system-dependent variables are defined in this module. For example, if you have GNU awk, it is strongly suggested that you set:
    AWK = "/path/to/gawk"    # in Linux, this would be /usr/bin/awk
You can define these variables before or after the module; just make sure the binaries reflect your operating system's paths. In general, if you "port" your setup to several systems, don't include absolute paths. On the other hand, if your setup stays in one place, using absolute paths will speed up execution by a factor of 3 or more (depending on how long your PATH is).
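When a setup stays on one host, the absolute paths can be looked up once and pasted into ~/.procmailrc. The following is a small sketch, assuming a POSIX shell; the helper name `procmail_assign` is hypothetical, and its output is only a suggested assignment line to review before use.

```shell
# Hypothetical helper: print a procmailrc assignment that binds
# VARIABLE to the location of PROGRAM as found via the current PATH.
# Usage: procmail_assign VARIABLE PROGRAM
procmail_assign() {
    # "command -v" prints the path of an external program; for shell
    # builtins it prints just the name, so check the result by eye.
    prog_path=$(command -v "$2") || return 1   # fails if PROGRAM is not found
    printf '%s = "%s"\n' "$1" "$prog_path"
}
```

Running `procmail_assign AWK gawk` prints an assignment line for gawk if it is installed, and returns a non-zero status otherwise.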
See pm-tips.txt file for full explanation or look at the source code.
    SPC WSPC NSPC SPCL          # Whitespace, Non Whitespace, W+linefeed
    \s \d \D \w \W and \a \A    # perl equivalents
To speed up procmail and save CPU cycles, this module defines the variable JA_FROM_DAEMON, which caches the result of the ^FROM_DAEMON test. You can refer to JA_FROM_DAEMON as you would to its big brother FROM_DAEMON. The advantage is that procmail has already computed the result once, so repeated FROM_DAEMON regexp tests, which are expensive, are avoided. Variable JA_FROM_DAEMON_match contains "" or the matched daemon text. Use
    *$ $JA_FROM_DAEMON
or the familiar
    *$ ! $JA_FROM_DAEMON
instead of the regexp parsing with
    * ^FROM_DAEMON
and
    * ! ^FROM_DAEMON
JA_FROM_MAILER works like the JA_FROM_DAEMON variable, but with respect to FROM_MAILER. The matched text is in JA_FROM_MAILER_MATCH.
For your .procmailrc, you can simply put this, because you want to load the variables at startup:
    PMSRC     = "/path/to/install/location/of/this/library"
    INCLUDERC = $PMSRC/pm-javar.rc
If you're developing your own modules that use these variables, put these lines at the beginning of the module. The recipe checks whether the WSPC variable contains a space; if it does not, the variable definitions are loaded. If the variable is already defined, the file is not loaded again. The test is similar to #ifdef ... #endif in C/C++ or a conditional "import" command in other languages.
    :0
    * ! WSPC ?? [ ]
    {
        INCLUDERC = $PMSRC/pm-javar.rc
    }
After this file loads, you can refer to any module with $RC_JA_MODULE. For example, to call the UBE module in your code, you would use the following. See the end of this file for all defined module names.
    INCLUDERC = $RC_JA_UBE