Document id

License: This material may be distributed only subject to the terms and conditions set forth in GNU General Public License v2 or later; or, at your option, distributed under the terms of GNU Free Documentation License version 1.2 or later (GNU FDL).

This page documents part of the files included in the Procmail Module Library project. Each module can be used as a plug-in. Some can be used as subroutines in your own programs and some are self-contained recipes. There are no user- or site-specific settings in these modules.

The design goal was to minimize the use of external processes. The modules use pure procmail wherever possible, e.g. to read the date directly from the message without calling the expensive shell date command. It is important to remember that procmail is run on every incoming message and every CPU tick spent counts.

Document control

This document has been automatically generated from the procmail files with 2 small perl programs in the following manner:

    % perl -S ripdoc.pl `ls pm-ja*.rc|sort` > pm-lib.raw
    % perl -S t2html.pl \
        --html-frame \
        --base http://freecode.net/projects/procmail-lib \
        --button-previous http://freecode.net/projects/procmail-lib \
        --title "Procmail module documentation" \
        --author "Jari Aalto" \
        --meta-keywords "procmail, sendmail, programming, library" \
        --meta-description "Procmail plug-in module documentation" \
        --name-uniq \
        --Out \
        pm-lib.txt

The Perl programs assume that the documentation sections have been written in Technical Text Format. The program ripdoc.pl can be found at the CPAN entry http://www.cpan.org/modules/by-authors/id/J/JA/JARIAALTO/ and t2html.pl is available at the project perl-text2html.

Pm-jaaddr.rc – extract 'foo@some.com' email address from variable INPUT

File id

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details at <http://www.gnu.org/copyleft/gpl.html>.

Description

This includerc extracts the various components of an email address from variable INPUT. You can do quite a lot of interesting things with your email address. One trick, if you don't have sendmail plus addressing capabilities, is to put the additional information in the RFC comment. E.g. if you read and follow up on posts in Usenet games groups, you could use:

From: login@site.com (John Doe+usenet.games)

Or if the localpart of your email address (the characters before @) already spells out your first name and surname, you don't need to repeat them in the comment. However, place the special marker "+" to mark the additional information part for your procmail recipes:

From: first.surname@site.com (+usenet.games)

The use of an RFC comment should work everywhere because the RFC requires that comments are preserved along with the address information. If you had sendmail plus addressing capabilities, you would have used:

From: login+usenet.games@site.com (John Doe)

The idea is that the list information is readily available from the email. The following recipe derives the plus information and uses it directly as the mailbox where the message is dropped. If The Editor: Emacs, means anything to you, you can program it to generate the appropriate From headers automatically when you send mail from the Gnus Mail/Newsreader MUA. Drop me a message if you need an example of how a piece of Emacs Lisp code makes those magic RFC plus addresses in the background while you compose the body of the message.

    RC_EMAIL = $PMSRC/pm-jaaddr.rc
    TOME     = "(login1|login2)"

    :0
    *$ ^TO\/.*$TOME.*
    {
        INPUT     = $MATCH
        INCLUDERC = $RC_EMAIL
        PLUS      = $COMMENT_PLUS

        # If COMMENT_PLUS was defined, we found "+"
        # address which contain "usenet.games". Save it to
        # folder.

        :0 :
        * PLUS ?? [a-z]
        $PLUS
    }

Notes

1998-05 David Hunt dh@west.net also mentioned that "you need to remember that some MTAs, (qmail for one, and soon vmail) use a dash ( - ) as the subaddress delimiter. So you'll want to allow for that in your code". For this reason the email part accepts both "-" and "+". The RFC comment however accepts only "+" and "--".

Example input

    "From: foo+procmail@this.site.com (Mr. foo)"   traditional
    "From: foo-procmail@this.site.com (Mr. foo)"   new styled

NOTE: M$SOFT mailers tend to send idiotic smart quotes "'Mr. foo'" and this recipe ignores these two quotes ["'] as if message had only the standard ["]

Returned values

    ADDRESS         "foo+procmail@this.site.com"
                    containing the email address without <>
    ACCOUNT         "foo+procmail"
                    all characters before @
    ACCOUNT1        "foo"
                    characters before plus: account1+account2@site
                    Note, if there is no "+", this is same as ACCOUNT.
    ACCOUNT2        "procmail"
                    _only_ set if plus found: account1+account2@site
    SITE            "this.site.com"
                    all characters after @
    DOMAIN          "site.com"
                    the main domain, preceding words in site are
                    considered subdomain (local) addresses.
                    sub.sub.domain.net
    SUB             "this.site"
                    all the sub-domain names without the NET part.
    SUB1            "site"
                    The first subdomain counted from the _RIGHT_ after NET
    SUB2            "this"
                    Second subdomain.
    SUB3            ""
                    Third subdomain.
    SUB4            ""
                    Fourth subdomain.
    NET             "com"
                    last characters after last period ( net,com,edu ...)
    COMMENT         Anything inside parentheses (Mr. Foo) or if no
                    parentheses found, then anything between quotes
                    "Mr. Foo"
    COMMENT_PLUS    Anything after the "+" in the comment, like
                    "Mr Foo+mail.usenet" --> "mail.usenet"
                    Note: some MTAs don't allow the + character, so use
                    alternatively '--': "Mr Foo--mail.usenet" --> "mail.usenet"

Additionally there are variables DOT1 and DOT2, which behave like ACCOUNT1 and ACCOUNT2, but with respect to a dotted firstname.surname type address:

    john.doe@site.com

    ACCOUNT1 = john.doe
    ACCOUNT2 = <empty>
    DOT1     = john
    DOT2     = doe

If there is a plus, ACCOUNT2 is defined:

    john.doe+foo@site.com

    ACCOUNT1 = john.doe
    ACCOUNT2 = foo
    DOT1     = john     (in respect to ACCOUNT1)
    DOT2     = doe      (in respect to ACCOUNT1)

Variable ERROR is set to "yes" if INPUT wasn't recognized or parsing the address failed.
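
To make the returned values concrete, here is a minimal Python sketch of the same split. It is an illustration only and not the module's implementation, which is written in procmail; the function name parse_address and the regular expressions are assumptions made for this example, and SUB1 to SUB4 and DOT1/DOT2 are left out for brevity.

```python
import re

def parse_address(text):
    """Split the first address found in text into named parts (illustrative only)."""
    m = re.search(r"([\w.+-]+)@([\w.-]+\.[A-Za-z]+)", text)
    if not m:
        return {"ERROR": "yes"}
    account, site = m.group(1), m.group(2)
    # Sub-address delimiter in the email part: "+" or "-".
    parts = re.split(r"[+-]", account, maxsplit=1)
    labels = site.split(".")
    out = {
        "ERROR": "no",
        "ADDRESS": account + "@" + site,
        "ACCOUNT": account,
        "ACCOUNT1": parts[0],
        "ACCOUNT2": parts[1] if len(parts) > 1 else "",
        "SITE": site,
        "DOMAIN": ".".join(labels[-2:]),
        "SUB": ".".join(labels[:-1]),
        "NET": labels[-1],
    }
    # COMMENT: text inside parentheses, else text between double quotes.
    c = re.search(r"\(([^)]*)\)", text) or re.search(r'"([^"]*)"', text)
    comment = c.group(1) if c else ""
    out["COMMENT"] = comment
    # COMMENT_PLUS: anything after "+" or "--" inside the comment.
    p = re.search(r"(?:\+|--)(.*)", comment)
    out["COMMENT_PLUS"] = p.group(1) if p else ""
    return out
```

Calling parse_address on the example input "From: foo+procmail@this.site.com (Mr. foo)" yields the ADDRESS, ACCOUNT, SITE, DOMAIN, SUB and NET values listed in the table above.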

Required settings

PMSRC must point to source directory of procmail code. This subroutine will include pm-javar.rc from there.

Call arguments (variables to set before calling)

INPUT = string-to-parse

Usage example

Read the From field and parse the address from it. This is a lot faster than using an external formail call.

    PMSRC   = $HOME/pm
    RC_ADDR = $PMSRC/pm-jaaddr.rc

    :0
    * ^From:\/.*@.*
    {
        INPUT = $MATCH

        # Turn off the logging while executing this part

        VERBOSE="off"
        INCLUDERC = $RC_ADDR
        VERBOSE="on"

        :0
        * ERROR ?? yes
        {
            # Hmm, no std email address found. Any other ideas?
        }
    }

Pm-jabup.rc – Keep backup of N arriving messages in separate directory

File id

Description

Preserve last N arriving messages in a separate sub-directory. This should be your safety-belt recipe that you put to the beginning of your .procmailrc.

Procmail saves the backup files with names like: msg.rcG msg.scG msg.3YS1, msg.4YS1, msg.VYS1, msg.fYS1 to the backup directory.

Note: this recipe will always call shell commands for each message you receive. That is needed for cleaning the backup directory. If you receive only a small number of messages per day, the performance drop of your .procmailrc is not crucial. But if you store many messages per day, then the shell calls may be a performance problem.

In that case, consider moving the cleanup to the pm-jacron.rc module (the cleanup is then run only once a day, not for every message).

John Gianni sent his simple backup script to Jari, who packaged and generalized the code. The code is reused with John's permission and the maintenance responsibility was transferred to Jari.

Required settings

(none)

Call arguments (variables to set before calling)

JA_BUP_MAX, How many messages to keep at maximum. 32 is default
JA_BUP_DIR, Where to store the messages. $HOME/Mail/bup by default
JA_BUP_FILES, regexp to match the saved files. Procmail default.
JA_BUP_CHECK_DIR. Once you have verified that this recipe works, that directories are ok, please set this flag to "no" to prevent running unnecessary test command for each email.

Usage example

You only want to keep a backup of messages that are not from mailing lists. You may want to use the TO_ macro to detect addresses better; this example matches against all headers.

    LISTS       = "(procmail|list-1|list-2)"
    JA_BUP_DIR  = $HOME/Mail/backup/.       # Create the path too
    JA_BUP_MAX  = 42                        # this should be enough

    :0
    *$ ! $LISTS
    {
        INCLUDERC = $PMSRC/pm-jabup.rc
    }

If you get many messages, please don't use this module. Instead see pm-jacron.rc where similar backup work is done better.

Pm-jacookie1.rc – Generate unique id from INPUT variable.

File id

Description

When given a string, this subroutine returns a unique number representing a string, a cookie.

Required settings

(none)

Call arguments (variables to set before calling)

INPUT, String from which the magic-cookie is calculated
JA_COOKIE_CMD: shell command to read INPUT and return decimal or hex cookie string as one continuous block of characters. It defaults to HP-UX cksum, but your system may have md5 or chksum

Return values

Variable OUTPUT will contain the cookie.
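
The idea of turning a string into a checksum cookie can be sketched in Python as below. This is a hedged illustration, not what the module runs: the module pipes INPUT to the shell command in JA_COOKIE_CMD, while this sketch uses Python's zlib.crc32 and hashlib.md5. Note that zlib.crc32 is a different CRC variant than POSIX cksum, so the numbers will not match cksum output. The name make_cookie is an assumption for this example.

```python
import hashlib
import zlib

def make_cookie(text, method="crc32"):
    """Return a checksum string for text as one continuous block of characters."""
    data = text.encode("utf-8")
    if method == "md5":
        # Hex digest, 32 characters long.
        return hashlib.md5(data).hexdigest()
    # Decimal CRC-32 (zlib variant, NOT identical to POSIX cksum).
    return str(zlib.crc32(data))
```

The same input always gives the same cookie, which is what lets a later message from the same address be compared against the stored value.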

Example usage

    INPUT           = "foo@site.com"
    JA_COOKIE_CMD   = "md5"         # or chksum
    INCLUDERC       = $PMSRC/pm-jacookie1.rc
    cookie          = $OUTPUT

Pm-jacookie.rc – Handle cookie (unique id) confirmations

File id

You should have received a copy of the GNU General Public License along with this program. If not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.

Visit <http://www.gnu.org/copyleft/gpl.html>

Overview of features

Each user must register himself to the cookie cache before he is considered "known"
Unless the user returns the generated cookie string, which is typically a decimal or hex number, he is not considered known and should not have access to the services you provide.
Can be used as a "doorbell" spam/UBE shield

Description

This recipe handles generating the cookie to new users, comparing the returned cookie against the original one and passing known users through if they already had returned their cookie.

When you run automated scripts, e.g. to manage mailing lists where users can subscribe and unsubscribe, you had better install a safety measure so that someone cannot subscribe his enemy to 30 mailing lists.

The cookie is any continuous block of random characters that is sent to the person who wants to use the service. He must send back the cookie before the service starts an action, like subscribe. If someone forges the From address to pretend to be someone else and then subscribes as that someone else to a mailing list, the cookie prevents this from happening.

The cookie is sent to someone-else, and he must return the cookie before the "subscribe" service is activated. Obviously this someone-else will not be interested in sending back the cookie and thus the forgery fails. Isn't that a simple but effective protection against misuse?

Should I use this as Challenge-Response Spam shield?

Unsolicited Bulk Email, aka spam, is crawling from every domain imaginable, so you might think that a challenge-response policy could be deployed for regular email communication as well. The idea would be that unknown people are requested to "join" a white list before discussion is initiated with them. Bulk email shotguns do not reply to challenges (here: cookies), so confirmations are not returned. Individual people who want to talk may want to return the cookies.

Sounds like a perfect Unsolicited Bulk Email shield? No more uninvited mail? Wrong. Don't use this module for that. The whole idea of challenge-response is flawed and causes trouble for every person who tries to contact you. Imagine 10 people using C-R systems; they would all need to authenticate themselves. Who is going to believe that he is not replying to a spammer who is collecting email addresses? And what about automatic messages that might be received? There is no artificial intelligence to separate "human" messages from automatically generated messages, so challenges just increase the overall mail traffic. Every C-R system doubles the mail traffic and becomes a spam problem by itself.

In short, don't use this module for implementing a C-R system to block regular mail to you.

How it works

By default the cookie generated uses CRC 32 cksum, but if you have md5, you should use it. The cookie is generated from the reply address and immediately stored to cookie database file with entry

    DATE FROM-a COOKIE-a
    DATE FROM-b COOKIE-b

If this was a new user or an old user, who has not registered his cookie yet, then the original message is sent back to the sender with instructions: "please place the magic string in the Subject line and resend the message."

When the cookie is returned, a new line is added to the database, simply by adding a duplicate entry. The file now looks like this:

    DATE FROM-a COOKIE-a
    DATE FROM-b COOKIE-b
    DATE FROM-a COOKIE-a

When there are two or more identical entries, like FROM-a, the address is considered known and the person behind it "cleared".
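
The bookkeeping described in this section can be modelled in a few lines of Python. This is a sketch under assumptions, not the module's code: the database is an in-memory list of (date, sender, cookie) tuples instead of a file, the function name check_sender is made up for the example, and the rule that a returned cookie is detected by finding it in the Subject is inferred from the instructions sent to the user.

```python
def check_sender(db, sender, cookie, subject, date):
    """Classify a message and update db, a list of (date, sender, cookie) entries."""
    count = sum(1 for _, who, _ in db if who == sender)
    if count >= 2:
        # Two or more entries: the cookie was returned earlier.
        return "known-user"
    if count == 0:
        # First message from this sender: store the entry.
        db.append((date, sender, cookie))
        return "new-user"
    if cookie in subject:
        # Cookie came back: register by adding the duplicate entry.
        db.append((date, sender, cookie))
        return "known-user"
    # Seen before, but this message does not carry the cookie.
    return "key-mismatch"
```

The three return strings correspond to the values that the module reports in variable ERROR.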

Required settings

PMSRC must point to source directory of procmail code. This subroutine will include

pm-javar.rc
pm-jadate.rc
pm-jacookie1.rc
pm-jastore.rc

Call arguments (variables to set before calling)

JA_COOKIE_SEND, flag. Default is "yes". Set to "no" if you want to take full control of the message returned to the user. You can check variable ERROR and use key, which holds the unique cookie
JA_COOKIE_CACHE, cache to determine if this is new user or not.
JA_COOKIE_AUTO_KEY, flag. If set to "yes", the cookie is initially put in the Subject when the message is bounced back. The receiver only has to press "r" to reply to send the cookie and message back (convenient). Set this flag to "no" when you want to avoid accidents, e.g. when the receiver is about to subscribe to a mailing list: he then has to manually insert the cookie into the subject. But keep the flag at "yes" if you use this module to get your friends registered easily.
JA_COOKIE_KEYS, the cookie database. Email address and person's access cookie.
JA_COOKIE_RC, subroutine to generate the cookie id from INPUT. By default uses CRC 32.
JA_COOKIE, the string from which the cookie will be generated. If you have already derived the return address of the sender, you should assign a value to this to save an unnecessary formail call.

Returned values

ERROR will contain the effective action when this recipe file ends

"new-user", This is first message from sender.
"known-user", message has email that has been "cleared" ie. cookie had been returned and user registered.
"key-mismatch", This is at least second message from sender. But he didn't send the confirmation in this message.

key is an internal variable in this recipe file and will hold the cookie id in case of "new-user" and "key-mismatch". You may want to use it if you generate your own reply.

Example usage for UBE shield

This is what I use to prevent unknown people from sending me UBE. It takes a bit extra, but they can easily return the message. Fill in the missing variables, this won't work out of the box for you.

    WORK        = "(domain1|domain2|domain3)"
    LISTS       = "(procmail|list-2|list-3|list-4)"
    VALID       = "(postmaster|abuse|$LISTS|$WORK)"

    RC_COOKIE   = $PMSRC/pm-jacookie.rc
    UBE_SPOOL   = $HOME/Mail/junk.ube.spool     # Save spam here

    :0
    *$ ! From:.*$VALID
    *$ ! ^FROM_DAEMON
    {
        JA_COOKIE_SEND = "yes"      # Activate it
        INCLUDERC      = $RC_COOKIE

        :0 :
        * ! ERROR ?? known-user
        $UBE_SPOOL

        # ... Past this point: it was user in whitelist, so the
        # recipes after this block will take care of it
    }

Example usage for subscriptions

    RC_COOKIE = $PMSRC/pm-jacookie.rc

    ...Mailing lists handled here...
    ...Your work messages filed here..

    TO              = `formail -rt -zxTo:`  # We need this elsewhere
    JA_COOKIE_TO    = $TO

    # For List-X all subscribe requests must
    # be confirmed

    :0
    * ^TO_()list-x
    * ^Subject: +subscribe\>
    {
        JA_COOKIE_SEND = "no"
        INCLUDERC      = $RC_COOKIE

        :0
        * ERROR ?? known-user
        {
            # User sent the subscribe request again, allow joining
            # immediately.
        }

        :0 E
        {
            # Because the Send was set to "no"; we're in charge
            # to send a reply to the user.
            # ...generate suitable message with formail -rt
        }
    }

    # End of example

Pm-jacron.rc – Procmail: Run cron once a day

File id

Description

Framework for all cron tasks that can be run once a day. This is a wrapper recipe for your cron task list: when the day changes, your cron includerc is called.

Required settings

PMSRC must point to source directory of procmail code. This recipe will include

pm-javar.rc
pm-jadate.rc

Call arguments (variables to set before calling)

JA_CRON_RUN_FLAG, You must define this flag file.
JA_CRON_DATE_FILE, File where the date information, last cron run, is kept. Defaults to $HOME/.yymmdd
JA_CRON_RC, your includerc which is run when cron triggers.

A file JA_CRON_RUN_FLAG which defaults to ~/.yymmdd.run is created when your includerc, that contains list of cron tasks, is run. If new mail arrives while your cron recipes are still running, you should prevent invoking the cron again by checking if this file exists. When all the cron tasks have been run, this flag file is removed. Remember to use "w" flag in your cron recipes where necessary to serialize the work.
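
The interplay of the date file and the run flag can be sketched in Python as follows. This is an illustration of the once-a-day idea only, not the module's procmail code; the function name run_daily, the callable tasks and the plain-text date file format are assumptions for the example.

```python
import os

def run_daily(date_file, run_flag, today, tasks):
    """Run tasks() at most once per day; return True if it ran."""
    if os.path.exists(run_flag):
        # A previous cron run is still in progress: do not start another.
        return False
    last = ""
    if os.path.exists(date_file):
        with open(date_file) as fh:
            last = fh.read().strip()
    if last == today:
        # Cron already ran today.
        return False
    with open(date_file, "w") as fh:
        fh.write(today + "\n")
    open(run_flag, "w").close()      # create the flag file
    try:
        tasks()
    finally:
        os.remove(run_flag)          # flag is removed when the tasks are done
    return True
```

The first message of a new day triggers the tasks; every later message that day finds the stored date unchanged and skips them.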

Return values

(none)

Usage example

Save backups to a separate directory, but do the cleaning only once a day. We do not keep backups of mailing list messages.

    LISTS       = "(procmail|list-1|list-2)"
    BACKUP_DIR  = "$HOME/Mail/backup/."

    # Store backups: separate files to directory

    :0 c:
    *$ ! $LISTS
    $BACKUP_DIR

    # Run JA_CRON_RC once a day. It contains all daily cron tasks

    CRON_RC             = $PMSRC/pm-jacron.rc   # the framework
    JA_CRON_RC          = $PMSRC/pm-mycron.rc   # the tasks to do
    JA_CRON_RUN_FLAG    = $HOME/.cron-running   # define this!

    # Do not enter here if message arrived at the same day when
    # the cron is already running. The CRON_RC takes care
    # of deleting the file when cron has finished.

    :0
    *$ ! ? $IS_EXIST $JA_CRON_RUN_FLAG
    {
        INCLUDERC = $CRON_RC
    }

The pm-mycron.rc file may contain anything. For example, to clean the backup directory, you add these statements there:

    # rm dummy: if ls doesn't return files, make sure rm has
    #           at least one argument.
    #
    # ls -t: list files; newest first
    #
    # sed: chop $max newest files from the listing, leaving the
    #      old ones

    max = 32

    :0 hwic
    | cd $BACKUP_DIR && $RM -f dummy `ls -t msg.* | $SED -e 1,${max}d`

    # End of file pm-mycron.rc
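
For readers less used to the ls/sed/rm idiom, the same cleanup (keep the newest N backup files, delete the rest) can be written as a short Python sketch. It is an equivalent illustration, not part of the library; the function name prune_backups and its parameters are assumptions for this example.

```python
import glob
import os

def prune_backups(directory, keep=32, pattern="msg.*"):
    """Delete all but the `keep` newest files matching pattern; return deleted paths."""
    files = glob.glob(os.path.join(directory, pattern))
    # Newest first, like "ls -t".
    files.sort(key=os.path.getmtime, reverse=True)
    old = files[keep:]               # everything after the newest `keep` files
    for path in old:
        os.remove(path)
    return old
```

If fewer than `keep` files exist, the slice is empty and nothing is removed, which is the case the "rm dummy" trick handles in the shell version.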

Pm-jadaemon.rc – Handle DAEMON messages by changing subject

File id

Description

When you send a message to an address that has delivery trouble, you get a DAEMON message back explaining the error. I usually want to save these daemon messages to a different folder and check the folder from time to time. A typical daemon message looks like this (shortened):

    From: Mail Delivery Subsystem MAILER-DAEMON@my.domain.com
    Subject: Warning: could not send message for past 4 hours

    The original message was received at...

    ----- Transcript of session follows -----
    Deferred: Connection timed out

    ----- Original message follows -----
    [YOUR MESSAGE AS YOU SENT IT WITH HEADERS]

Well, when I read the subjects, I do not like the standard error messages; I would also like to know to which address the delivery failed and what the original subject was. This small recipe changes the daemon message's Subject to

Subject BRIEF-ERROR-REASON, SENT-TO-ADDRESS, ORIGINAL-SUBJECT

and from that you can immediately tell if you should be worried. E.g. if SENT-TO-ADDRESS was your friend's, then you want to take action immediately, but if it was your UBE complaint to a postmaster, you don't want to bother reading that daemon message. Here are some real examples:

    fatal errors,postmaster,ABUSE (Was: Super Cool Site!)
    Host unknown,postmaster,ABUSE (Was: A-Credit Information)
    undeliverable,postmaster,Could you investigate this spam
    Warning-Returned,friend,Have you looked at this

Required settings

PMSRC must point to source directory of procmail code. This subroutine needs the script

pm-javar.rc

Call arguments (variables to set before calling)

JA_DAEMON_SAVE. This is by default yes which causes the original subject to be saved under header field X-Old-Daemon-Subject. If you don't want that extra header generated, set this variable to no
JA_DAEMON_REGEXP, which messages to trigger

Return values

Variable ERROR will be set to "yes" if a daemon message was handled; otherwise the value is "no".

Usage example

Just add this recipe somewhere in your .procmailrc. The place where you put this daemon message trapper subroutine is crucial: think carefully about how you order your recipes. One suggested order could be: backup important messages, cron-subroutine, handle duplicates, DAEMON MESSAGES, plus addressed message, server message (file server, ping responder...), MAILING LISTS, send possible vacation replies only after all above, apply kill file, detect mime, save private messages and last FILTER UBE.

    PMSRC       = $HOME/pm
    RC_DAEMON   = $PMSRC/pm-jadaemon.rc
    DAEMON_MBOX = $HOME/Mail/junk.daemon.mbox
    ...
    INCLUDERC   = $RC_DAEMON

    :0 :                # If that was a daemon message, save it
    * ERROR ?? yes
    $DAEMON_MBOX

Pm-jadate1.rc – 'Tue, 31 Dec 1997' date parser from variable INPUT

File id

Description

This includerc parses date from variable INPUT which has string

"Week, Daynbr Month Year"

Example input

    "Tue, 31 Dec 1997"      -- with comma
    "Tue 31 Dec 1997"       -- without comma

Returned values

    YYYY    = 4 digits
    YY      = 2 digits
    MON     = 3 characters
    MM      = 2 digits
    DAY     = 3 characters
    DD      = 2 digits
    hh      = 2 digits      If available
    mm      = 2 digits      If available
    ss      = 2 digits      If available
    TZ      = 5 characters  If available

Variable ERROR is set to yes if it couldn't recognize the INPUT and couldn't parse the basic YYYY, YY, MM, DD variables.

Required settings

PMSRC must point to source directory of procmail code. This subroutine will include pm-javar.rc from there.

Call arguments (variables to set before calling)

INPUT = string-to-parse

The INPUT can have anything after "Week, dayNbr Month Year", or before it: you can pass a string like "Thu, 13 Nov 1997 11:43:23 +0200".
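
As a rough model of what this parser extracts, the following Python sketch pulls the same fields out of a string with one regular expression. It is not the module's procmail code; the name parse_wdmy, the exact pattern and the zero-padding of DD are assumptions made for this illustration.

```python
import re

MONTHS = "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split()

DATE_RE = re.compile(
    r"(?P<DAY>[A-Z][a-z]{2}),?\s+(?P<DD>\d{1,2})\s+"
    r"(?P<MON>[A-Z][a-z]{2})\s+(?P<YYYY>\d{4})"
    r"(?:\s+(?P<hh>\d\d):(?P<mm>\d\d):(?P<ss>\d\d))?"   # optional time
    r"(?:\s+(?P<TZ>[+-]\d{4}))?"                        # optional time zone
)

def parse_wdmy(text):
    """Parse 'Week, Daynbr Month Year [hh:mm:ss [TZ]]' found anywhere in text."""
    m = DATE_RE.search(text)
    if not m or m.group("MON") not in MONTHS:
        return {"ERROR": "yes"}
    out = {k: (v or "") for k, v in m.groupdict().items()}
    out["DD"] = out["DD"].zfill(2)
    out["MM"] = "%02d" % (MONTHS.index(out["MON"]) + 1)
    out["YY"] = out["YYYY"][2:]
    out["ERROR"] = "no"
    return out
```

Fields that are not present in the input, such as the time in "Tue 31 Dec 1997", come back as empty strings.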

Usage example

The first Received header will tell when the message was received by your mailserver. We parse the date and avoid calling expensive date command.

    PMSRC           = $HOME/pm
    RC_DATE_WDMY    = $PMSRC/pm-jadate1.rc  #Week-Day-Month-Year parser

    # Get time from first header, it ends like this:
    #
    # Received: ... ; Thu, 13 Nov 1997 11:43:50 +0200

    :0
    *$ ^Received:.*;$s+\/...,$s+$d.*
    {
        INPUT = $MATCH

        # Turn off the logging while executing this part

        VERBOSE=off
        INCLUDERC = $RC_DATE_WDMY
        VERBOSE=on

        :0
        * ERROR ?? yes
        {
            # Use some other way to get the time or shout loudly
        }
    }

Pm-jadate2.rc – 'YYYY-MM-DD' ISO date parser from variable INPUT

File id

Description

This includerc parses date in format "YYYY-MM-DD hh:mm:ss" like 1997-12-01 and sets following variables whenever called

    YYYY    = 4 digits
    YY      = 2 digits
    MON     = 3 characters
    MM      = 2 digits
    DD      = 2 digits
    hh      = 2 digits      If available
    mm      = 2 digits      If available
    ss      = 2 digits      If available

Variable ERROR is set to yes if it couldn't recognize the INPUT and couldn't parse the basic YYYY, YY, MM, DD variables.

Required settings

PMSRC must point to source directory of procmail code. This subroutine will include pm-javar.rc from there.

Call arguments (variables to set before calling)

INPUT = string-to-parse

Last string in INPUT that matches number sequence NNNN-NN-NN is parsed.
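
The "last match wins" rule can be illustrated with a small Python sketch. It is only a model of the behaviour described here, not the module's code; the name parse_iso and the regular expression are assumptions for the example.

```python
import re

MONTHS = "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split()
ISO_RE = re.compile(r"(\d{4})-(\d{2})-(\d{2})(?:\s+(\d{2}):(\d{2}):(\d{2}))?")

def parse_iso(text):
    """Parse the LAST 'YYYY-MM-DD [hh:mm:ss]' occurrence in text."""
    matches = list(ISO_RE.finditer(text))
    if not matches:
        return {"ERROR": "yes"}
    yyyy, mm, dd, hh, mi, ss = matches[-1].groups()
    if not 1 <= int(mm) <= 12:
        return {"ERROR": "yes"}
    return {
        "ERROR": "no",
        "YYYY": yyyy, "YY": yyyy[2:],
        "MON": MONTHS[int(mm) - 1], "MM": mm, "DD": dd,
        # Time fields are empty when the input has no time part.
        "hh": hh or "", "mm": mi or "", "ss": ss or "",
    }
```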

Usage example

    PMSRC       = $HOME/pm
    RC_DATE_ISO = $PMSRC/pm-jadate2.rc  # ISO date parser

    INPUT = "This is 1800-10-11, a very old date"

    # Turn off the logging while executing this part

    VERBOSE="off"
    INCLUDERC=$RC_DATE_ISO
    VERBOSE="on"

Pm-jadate3.rc – 'Tue Nov 25 19:32:57' date parser from variable INPUT

File id

Description

This includerc parses date from variable INPUT which has string

"Week, Month dayNbr hh:mm:ss yyyy",

Example

Tue Nov 25 19:32:57 1997

Returned values

    YYYY    = 4 digits
    YY      = 2 digits
    MON     = 3 characters
    MM      = 2 digits
    DAY     = 3 characters
    DD      = 2 digits
    hh      = 2 digits
    mm      = 2 digits
    ss      = 2 digits

Variable ERROR is set to "yes" if it couldn't recognize the INPUT.

Required settings

PMSRC must point to source directory of procmail code. This subroutine will include pm-javar.rc from there.

Call arguments (variables to set before calling)

INPUT = string-to-parse

Usage example

The first Received header will tell when the message was received by the mailserver. Parse the date and avoid calling expensive date command.

    PMSRC           = $HOME/pm
    RC_DATE_WMDT    = $PMSRC/pm-jadate3.rc  #Week-Month-Day-Time parser

    # Get time from X-From-Line: Which was added by my MDA
    # X-From-Line: procmail-request@informatik.rwth-aachen.de \
    #   Tue Nov 25 19:32:57 1997

    :0 c
    *$ ^X-From-Line:\/.*
    {
        INPUT = $MATCH

        # Turn off the logging while executing subroutine

        VERBOSE=off
        INCLUDERC = $RC_DATE_WMDT
        VERBOSE=on

        :0
        * ERROR ?? yes
        {
            # Use some other way to get the time or shout loudly
        }
    }

Pm-jadate4.rc – make RFC 'Mon, 1 Dec 1997 17:41:09' and parse values

File id

Description

This subroutine calls the shell command date once and parses the values. This should be your last resort if you haven't got the date values by any other means. This subroutine assumes that the date command knows the following % specifier formats (HP-UX)

    Y   NNNN    year
    h   MON     month
    d   NN      day
    a   WEEK    Like "Mon"
    H   NN      hour
    M   NN      min
    S   NN      sec

Returned values

    DATE    = RFC date in format "Mon, 1 Dec 1997 17:41:09"
              This is same as what you would see in From_
    YYYY    = 4 digits
    YY      = 2 digits
    MON     = 3 characters
    MM      = 2 digits
    DAY     = 3 characters
    DD      = 2 digits
    hh      = 2 digits
    mm      = 2 digits
    ss      = 2 digits

Variable ERROR is set to "yes" if values couldn't be set

Required settings

PMSRC must point to source directory of procmail code. This subroutine will include

pm-javar.rc
pm-jadate1.rc

Call arguments (variables to set before calling)

(none)

Usage example

The first Received line will tell when the message was received by the MDA. If that fails, then get the date from the system. If you send test messages to yourself, you don't usually put a From_ header in them and thus there is no date information in 'dry run' tests.

    # Get time from first header, which is always same in my system
    # Received: ... ; Thu, 13 Nov 1997 11:43:50 +0200

    INCLUDERC = $PMSRC/pm-javar.rc  # to get $s $d definitions
    TODAY                           # Clear it

    :0
    *$ ^Received:.*;$s+\/...,$s+$d.*
    {
        INPUT     = $MATCH
        INCLUDERC = $PMSRC/pm-jadate1.rc
        TODAY     = "$YYYY-$MM-$DD"
    }

    # Check that variable did get set, if not then we have to call
    # another date subroutine: Call shell then to find out date
    #
    # You could also do this with ':0 E', but this is more
    # educational

    :0
    *$ ! $TODAY^0
    {
        INCLUDERC = $PMSRC/pm-jadate4.rc    # Get date from Shell then
        TODAY     = $YYYY-$MM-$DD
    }

Pm-jadate5.rc – 'Fri Jun 19 18:51:56 1998' date parser from var INPUT

File id

Description

This includerc parses date from variable INPUT which has string

"WeekDay Month dayNbr Year"

Example input

    "Fri Jun 19 18:51:56 1998"      -- without comma
    "Fri, Jun 19 18:51:56 1998"     -- with comma

Returned values

Variable ERROR is set to "yes" if it couldn't recognize the INPUT and couldn't parse the basic YYYY,YY,MM,DD variables.

Required settings

PMSRC must point to source directory of procmail code. This subroutine will include pm-javar.rc from there.

Call arguments (variables to set before calling)

INPUT = string-to-parse

The INPUT can have anything after "WeekDay Month dayNbr Year", or before it: you can pass a string like "Fri Jun 19 18:51:56 1998 11:43:23 +0200".

Usage example

The From_ line will tell when the message was received by your mailserver. We parse the date from it and avoid calling the expensive date command.

    PMSRC           = $HOME/pm
    RC_DATE_WDMY    = $PMSRC/pm-jadate5.rc  #Week-Day-Month-Year parser

    # Get time from first header, it ends like this:

    :0
    *$ ()\/From .*
    {
        INPUT = $MATCH

        # Turn off the logging while executing this part

        VERBOSE=off
        INCLUDERC = $RC_DATE_WDMY
        VERBOSE=on

        :0
        * ERROR ?? yes
        {
            # Use some other way to get the time or shout loudly
        }
    }

Pm-jadate.rc – Read date from the message hdrs: From_, Received:

File id

Description

This recipe will scan several headers to find the date string. When a suitable header is found and the parsing has succeeded, the return variables are set. The date values reflect the arrival time of the message, not the sending time. If nothing works, a shell call to date is used as a last resort.

Returned values

    YYYY    = 4 digits
    YY      = 2 digits
    MON     = 3 characters
    MM      = 2 digits
    DAY     = 3 characters
    DD      = 2 digits
    hh      = 2 digits      if available
    mm      = 2 digits      if available
    ss      = 2 digits      if available

Required settings

PMSRC must point to source directory of procmail code. This subroutine will include

pm-javar.rc
pm-jadate1.rc
pm-jadate3.rc
pm-jadate4.rc

Call arguments (variables to set before calling)

(none)

Usage example

    INCLUDERC = $PMSRC/pm-jadate.rc
    # now we have all date variables that we need
    # $TODAY = $YYYY-$MM-$DD

Pm-jadup.rc – Procmail: Handle duplicates; store to separate folder

File id

Description

This recipe stores duplicate messages to a separate folder.

Required settings

PMSRC must point to source directory of procmail code. This subroutine will include

pm-javar.rc
pm-jastore.rc

Call arguments (variables to set before calling)

JA_ID_CACHE, Where to keep the Message-Id cache.
JA_ID_CACHE_SIZE, how big the cache is, default is 8192
JA_ID_MBOX, where to store duplicate messages when delivering message to duplicate mbox.
JA_ID_IGNORE, if set to "yes", then ignore duplicate check

Return values

Variable ERROR is set to "yes" if duplicate message was trapped, otherwise value is "no"
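
The principle behind a size-limited Message-Id cache can be sketched in Python as below. How the module itself maintains JA_ID_CACHE is not shown in this section, so treat this purely as an in-memory illustration: the class name IdCache, the byte accounting and the oldest-first eviction are assumptions for the example.

```python
from collections import deque

class IdCache:
    """Remember recent Message-Ids, bounded by an approximate size in bytes."""

    def __init__(self, max_bytes=8192):
        self.max_bytes = max_bytes
        self.ids = deque()
        self.size = 0

    def is_duplicate(self, message_id):
        """Return True if message_id was seen before; otherwise record it."""
        if message_id in self.ids:
            return True
        self.ids.append(message_id)
        self.size += len(message_id) + 1      # +1 for a separator byte
        # Drop the oldest ids once the cache grows past its limit.
        while self.size > self.max_bytes and len(self.ids) > 1:
            old = self.ids.popleft()
            self.size -= len(old) + 1
        return False
```

A bounded cache means that a very old duplicate can slip through once its id has been evicted, which is the trade-off behind a setting such as JA_ID_CACHE_SIZE.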

Usage Example

For simple usage, just put this somewhere after backup recipes

    RC_DUP = $PMSRC/pm-jadup.rc
    ...
    INCLUDERC = $RC_DUP

When you are testing messages, you send them over and over to procmailrc, which means that the same message should not be trapped by the duplicate check. You can call procmail with option "-a test" which will set pseudo variable $1. The recipe below sets flag JA_ID_IGNORE to "yes" if a test is ongoing and the duplicate filter should be bypassed.

    RC_DUP  = $PMSRC/pm-jadup.rc
    ARG     = $1            # Copy pseudo variable to $ARG

    :0
    * ARG ?? test
    {
        JA_ID_IGNORE = "yes"
    }

    # Some microsoft product is known to send same message ids
    # over and over. If we detect one, turn off the duplicate test,
    # because it would trash every message.
    # <MAPI.Id.0016.00666479202020203030303430303034@MAPI.to.RFC822>

    :0
    * ^X-msmail
    * ^Message-ID: *MAPI.*@MAPI.to.RFC822
    {
        JA_ID_IGNORE = "yes"
    }

    # Run this command every time a duplicate message is found.
    # It writes a small log entry to MY_LOG

    INCLUDERC = $RC_DUP

    :0 hwic:
    * ERROR ?? yes
    | echo " [duplicate]" >> $BIFF

Pm-jaempty.rc – check if message body is empty

File id

Description

This simple includerc sets variable BODY_EMPTY to "yes" or "no". You can file empty messages to a separate folder based on this value when the includerc is called like this:

    INCLUDERC = $PMSRC/pm-jaempty.rc

    :0
    * BODY_EMPTY ?? yes
    the-empty-mail-folder

This is designed more to be part of other modules. If you just want to check for an empty message, a simpler recipe like this might be better:

    INCLUDERC = $PMSRC/pm-javar.rc

    :0 B:       # if body has only whitespace characters
    *$ ! $NSPC
    the-empty-mail-folder

Required settings

(none)

Pm-jafrom.rc – get message's best FROM field without calling `formail'

File id

Description

This includerc extracts the most likely FROM address from the message. The order of the search is Reply-to, From_, Sender, From and if none found, then as a last resort, call formail. You would usually use the returned value for logging purposes.

Avoiding an extra formail call could be useful if you receive a lot of messages per day.
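
The search order can be illustrated with Python's standard email package. This is a sketch of the priority list only, not the module's procmail code: the function name best_from is an assumption, the sketch returns a bare address rather than the whole field, and it returns an empty string where the module would fall back to formail.

```python
from email import message_from_string
from email.utils import parseaddr

def best_from(raw):
    """Return the most likely sender address: Reply-To, From_, Sender, From."""
    msg = message_from_string(raw)
    candidates = [msg.get("Reply-To")]
    unixfrom = msg.get_unixfrom()    # the "From user@host date" envelope line
    if unixfrom:
        fields = unixfrom.split()
        candidates.append(fields[1] if len(fields) > 1 else None)
    candidates += [msg.get("Sender"), msg.get("From")]
    for value in candidates:
        if value:
            addr = parseaddr(value)[1]
            if "@" in addr:
                return addr
    return ""                        # nothing usable found
```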

Example input

(none)

Returned values

OUTPUT, containing the derived FROM field

Required settings

PMSRC must point to source directory of procmail code. This subroutine will include pm-javar.rc from there. You need procmail 3.11pre7 in order to use this subroutine (due to the formail -z switch).

Call arguments (variables to set before calling)

(none)

Usage example

INCLUDERC = $PMSRC/pm-jafrom.rc
FROM      = $OUTPUT             # now we have the 'best' FROM field

Pm-jafwd.rc – Controlling forwarding remotely

File id

Overview of features

Requires latest procmail (formail -z switch)
You can send forward-on and forward-off control messages via email to control the forwarding at a remote site.

Description

This includerc makes it possible to control your message forwarding via a simple remote email message. Thanks to Era Eriksson and Timothy J Luoma, who gave the initial idea for this forwarding module on the procmail mailing list 1997-10-07.

Activating the forwarding by hand

If you want to activate the forwarding from the local site where this module is, then you could simply write the forward address to the file pointed by JA_FWD_FILE which is ~/.forward-address by default.

% echo Me@somewhere.com > ~/.forward-address

and when you no longer need forwarding, remove that file. But really, this module is not meant for that purpose, because it is a lot easier to write

:0
! Me@somewhere.com

as a first statement in your .procmailrc when you want to forward your mail to another account.

Activating the forwarding by remote email

Suppose you're on the road and suddenly realize that you want your mail forwarded to the current account. Then you send the following control message:

Subject: forward-on password new-address@bar.com
To: my-account@bar.com
From: onTheRoad@some.com

That message is enough to get the mail forwarded to the address new-address@bar.com. This script will respond to the From address that the current forwarding is now pointing to address "new-address@bar.com".

Deactivating forwarding by remote email

The message is very similar, but the Subject header says

Subject: forward-off password

And no other fields are checked. Not even Reply-To. In this case the confirmation message is sent directly back to From address.

Activating forwarding via body message

If for some reason you have no control over the headers of the email, the control words can go in the message body. For example, you might send this GSM-Mail message from your phone to your account:

EMAIL foo@bar.com FORWARD-ON PASSWORD new-address@bar.com

The email message looks like this:

From: GenEmail sms@FooBar.net
Date: Thu Sep 17, 11:42am +0200
To: "'Foo.Bar'" foo@bar.com
Subject: Message 03384874987

FORWARD-ON PASSWORD new-address@bar.com

Instead of looking at the Subject field, you can get this module to look at the first words in the body field. See variable JA_FWD_CONTROL_FIELD which you want to set to "body".
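As a minimal sketch of that setup, the fragment below switches the control field to the body before calling the module. The password value is the same placeholder used elsewhere in this documentation; any other variables the module reads keep their defaults here, which is an assumption of this example.

```procmail
# Sketch: read the control words from the message body
# instead of the Subject header.

JA_FWD_CONTROL_FIELD = "body"
JA_FWD_PASSWORD      = "MyMagicString"      # placeholder password

INCLUDERC = $PMSRC/pm-jafwd.rc
```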

Restricting the control message acceptance

If you only have persistent accounts, then you should set the JA_FWD_FROM_MUST_MATCH to match those addresses that you have. The following setting says that only control messages sent from these addresses are accepted. Nobody else can change your forwarding settings.

JA_FWD_FROM_MUST_MATCH = ".*(acc1@a.com|acc2@b.com)"

Hm, that is not bulletproof, because someone may in theory forge the From address. You should probably also set the following variable to point to accounts where the mail can legitimately be forwarded to. Then, even if an impostor forges the From address, he can't get the email forwarded anywhere other than to the valid locations.

JA_FWD_TO_MUST_MATCH = $JA_FWD_FROM_MUST_MATCH

Consider also setting JA_FWD_PASSWORD_CASE to Procmail flag D which causes your control word "forward-on" and password to be case sensitive.

Diagnostics

If you don't receive confirmation message, then your control message was ill formed or you're not in the JA_FWD_FROM_MUST_MATCH list. There is no notification sent on failure, so that no attacker can draw conclusions.

Required settings

PMSRC must point to source directory of procmail code. This subroutine will include

pm-javar.rc

Installation

You should preset all necessary variables prior to adding the includerc command to your .procmailrc. Here is one simple setup:

#JA_FWD_SENDMAIL     = "tee $HOME/test.mail"    # Uncomment if testing
JA_FWD_COPY          = no                       # no copies stored while forwarding
JA_FWD_PASSWORD_CASE = "D"                      # case sensitive
JA_FWD_PASSWORD      = "MyMagicString"
JA_FWD_FROM          = $FROM                    # This is already known.

INCLUDERC = $PMSRC/pm-jafwd.rc

Comments from the author

Please realise that when you set the forwarding from a remote site, you must be very careful when you type in the forward address, or your mail ends up in somebody else's mailbox. I also recommend that you keep JA_FWD_COPY set to yes so that your local account always keeps a copy of each forwarded message.

A step further would be conventionally encrypting your forwarded messages with encrypt(1). This way even your top secret messages would be mostly safe even if they end up in someone else's mailbox.

File layout

tinybm.el/&tags and tinytab.el for the 4 tab text placement.

Pm-jalist.rc – Subroutine to detect mailing LIST from message.

File id

Description

This subroutine tries to detect and derive the mailing list name as it appears in some of the known methods that ezmlm, smartlist, listserv, majordomo etc. normally use. After this subroutine has been applied to a message, the variable LIST contains the mailing list name. The subroutine adaptively finds new mailing lists from the messages.

The alternative to subscribing to many mailing lists is to read them from web archives. An even better way is to use the NNTP server at http://www.gmane.org which allows you to post as you would to a regular newsgroup. Consider using the NNTP interface; it may save you from receiving a lot of messages that can already be found on Gmane's server.

Quick start

If you just want to jump in and use this module and you notice that some list isn't trapped, please set

JA_LIST_HEADER_REGEXP to match the From: field

If you want to make some list more unique, like if name "Alert" was detected as a list name, please set

JA_LIST_MAKE_UNIQUE to match the list name, like "Alert". After that the list name will be converted to HOST-LIST format.

Sendmail plus type method for list subscription

If you can use sendmail type PLUS addressing capabilities, you may not be interested in this module, because you have an alternative way to handle mailing list messages. The extra information after "+" is available to procmail scripts via the $1 pseudo variable (typically copied to a variable such as ARG) when procmail is the LDA. Let's suppose you want to subscribe to the procmail mailing list and want to save all messages to folder list.procmail. Then you'd subscribe with an address whose plus part names the folder, of the form login+list.procmail@your.domain.
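A minimal sketch of the receiving side follows. It assumes procmail is the LDA and that the MTA passes the text after "+" as the first argument; the address in the comment is a placeholder, and the recipe is not part of this module.

```procmail
# Subscribed as: login+list.procmail@example.com   (placeholder address)

ARG = $1                        # copy pseudo variable to a regular variable

:0 :
* ARG ?? ^^list\.[-a-z0-9.]+^^
$ARG
```

The condition accepts only plus parts that begin with "list." and consist of harmless characters, so an arbitrary plus string cannot name an arbitrary file.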

If your email host does not provide plus addressing, the traditional approach has been to add a piece of recipe to ~/.procmailrc to catch each list. But that's manual work for every list. When you use this subroutine, you no longer need to write separate mailing list recipes to your ~/.procmailrc every time you subscribe to a new mailing list. The detection of a new list will happen automatically.

What you need to know before using this module

There are a lot of heuristics going on in this module, and there is one thing that you must note:

If 'To:' domain is same as `From/Sender:/Reply-to:' domain then it is considered a mailing list message.

This causes certain messages to be treated as mailing list messages. The module can't possibly know that the following is not from a mailing list, because it doesn't know "what a mailing list is", only "what one probably looks like". The following is definitely categorized as a mailing list message, because From and even Reply-to have the same domain foo.bar.net as To.

To: support@foo.bar.net
From: message@foo.bar.net
Reply-to: support@foo.bar.net
Subject: Vmail See message to Eric

You must prevent checking messages like this by surrounding call to this subroutine with a check statement:

# Do not check these messages
noList = "From.*(foo.bar.net|support.my.com)"

:0
*$ ! $noList
{
    INCLUDERC = $RC_LIST
    # ... save message by examining variable LIST (which see)
}
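To see why the sample message above is caught, the same-domain rule can be pictured roughly as follows. This is an illustration of the idea only, not the module's code; toDomain and sameDomain are hypothetical variable names and the domain pattern is intentionally loose.

```procmail
# Sketch only: not the code of pm-jalist.rc.
# toDomain and sameDomain are hypothetical variable names.

sameDomain = "no"

:0
* ^To:.*@\/[-a-z0-9.]+
{
    toDomain = $MATCH           # e.g. foo.bar.net

    :0
    *$ ^(From|Sender|Reply-To):.*@$\toDomain
    { sameDomain = "yes" }
}
```

The $\toDomain form expands the variable with regular expression characters disarmed, so the dots in the domain are matched literally.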

Ask for help

If you find mailing lists that this subroutine does not detect, but which could have been detected by looking at the headers in a standard way, please send an email to the maintainer. There may be cases where it is impossible to detect the mailing list and in those cases you just have to carve a new entry to your ~/.procmailrc. When you keep your procmail log running, you may see the message

*** potential list ***

This is an indication that some new recipe could be added to this subroutine to detect that mailing list. If the message you received was from a mailing list, please send all the headers to the maintainer so that support can be added.

Code note: Errors-To

Bill Houle sent interesting headers which required adding more heuristics than usual to solve the list detection. From the headers below it is practically impossible to derive the original list name. So, the list name is artificially constructed by combining Reply-To's LOGIN with the first host name of the Errors-To field:

Reply-To: news@doodle.foo.net
Errors-To: bounced@doodle.foo.net

The list name formed is "news-doodle". So, if you happen to see an odd name like this which doesn't resemble the original list name, it may be due to poor headers that have no clue about the real name. No problem, check below how you would convert this name to a better mailbox name.
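The construction can be sketched in plain procmail as below. This is not the module's code: "login" is a hypothetical helper variable, and the patterns assume bare addresses without display names or angle brackets.

```procmail
# Sketch only: not the code of pm-jalist.rc.
# "login" is a hypothetical helper variable; assumes bare addresses.

:0
* ^Reply-To: *\/[-a-z0-9_.+]+
{
    login = $MATCH              # part before the "@"

    :0
    * ^Errors-To:.*@\/[-a-z0-9]+
    { LIST = "$login-$MATCH" }  # login + first host name
}
```

With the two headers shown above, this should yield the same "news-doodle" name.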

Required settings

PMSRC must point to source directory of procmail code. This subroutine will include the following extra module, which must have been installed.

pm-javar.rc

Variable JA_LIST_FROM_TO_IGNORE

This is a regexp of sender addresses to ignore, so that if To and From are identical, the message is not considered a list message. This is typical for system generated messages that take the form:

From: root@host (Cron Daemon)
To: root@host

Variable JA_LIST_SAVE

If set to "yes" then the list name information detected is saved to a separate header. The LIST_DETECTED is the original grabbed word from the headers and the 'LIST' is the final name after possible list name conversions. According to RFC, the X- prefix can be used for user-defined headers.

X-List-Detected: $LIST_DETECTED mapped to $LIST

Variable JA_LIST_KILL_POSTFIX

If the grabbed LIST matches this regexp at the end of the list name, then the postfix match will be removed. Many list names are traditionally like list1-info, list2-beta, list3-L and it would be preferable to see names like list1, list2 and list3. The default value will ditch "-(info|beta|L)".

Variable JA_LIST_KILL_PREFIX

Just like the postfix variable. If this string is matched at the beginning of the LIST, it is removed.

Variable JA_LIST_DISREGARD_EMAIL

In some cases this list detection recipe "thinks" that the address picked is the list sender. You may have a dedicated address where all your mailing list mails arrive and you have named it like mailing-list@me.here.at, which will effectively trigger: Ah, you have -list in the email address, so this message must be from a mailing list named 'mailing'. Of course it is not, and you have to prevent the heuristics from making such an assumption by defining a regexp that rejects a possible choice. For the above example, you would define:

JA_LIST_DISREGARD_EMAIL = "mailing-list@me.here.at"

If you have several such addresses, just add them to the variable separating with normal regular expression "|" OR statement.

Variable JA_LIST_HEADER_REGEXP

This is optional variable, which you can set to match regexp of the mailing list domain address if it slipped through the tests in this module. There are some lists that send messages that don't carry enough information in headers to determine their list status. If you narrow the group by setting JA_LIST_HEADER_REGEXP, then for example lists like these, that identify themselves only through two headers, can be found:

Reply-To: dispatch-faq@cnet.com
From: CNET Digital Dispatch dispatch@cnet.com

For that list you would set

JA_LIST_HEADER_REGEXP = "(@cnet\.com)"

Don't worry. All the other list detection recipes have already been tried, so this is the last test that is carried out, and variable JA_LIST_HEADER_REGEXP helps eliminate possible mishits.

You don't need to set this variable to include all mailing list domains, only those that were not trapped. The default value for this is:

"(amazon\.com|bookpool\.com)"

Variable JA_LIST_MAKE_UNIQUE

If you're subscribed to many mailing lists that simply say that they are news or newsletter, it will be impossible to differentiate A news from B news. This variable holds a regular expression that, if matched, prepends the first host name to the beginning of the list name, thus making the list unique:

news@some.com --> some-news
news@here.com --> here-news

The default value matches lists that contain word news, but you may need to set this to more matches.

Variable JA_LIST_CONVERSION

Note: before using this feature, make sure your LINEBUF is big enough, say 4096 or otherwise the variable's content is truncated.

Many times the grabbed LIST name is not what you would like to use for your mailbox name. You may want to make the name shorter or more descriptive, or categorize the messages according to a hierarchy. Let's say that you have subscribed to the following mailing lists:

LIST            LIST name   Description of mailing list
(as grabbed)    you want
-------------------------------------------------------------
jde             java.jde    Java Development Env
java            java.lang   Java programming
FLAMENCO        flamenco    Flamenco music
tango-l         tango       Argentine Tango dancing
tm-en-help      tm-en       Emacs TM mime package mailing list
w3-beta         w3          Emacs WWW mailing list

First, remember that the variable JA_LIST_KILL_POSTFIX is first applied, so the actual LIST appears as follows:

jde, java, FLAMENCO, tango, tm-en, w3

Ok, now we apply the conversion table by defining it as follows. The grabbed LIST name is first, then comes space(s), the new name and a terminating comma. Repeat this for each list you want to convert.

LIST CONVERSION[,LIST CONVERSION ...]

This gives us table below: notice that entries tango-l, w3-beta were not included, because the JA_LIST_KILL_POSTFIX already got rid of the postfixes. Also note how the uppercase match FLAMENCO is converted to more suitable lowercase mailbox name. After you have set up this variable you can start saving messages to folders.

JA_LIST_CONVERSION = "\
jde         java.jde,\
java        java.lang,\
FLAMENCO    flamenco,\
"

The list conversion is done with pure procmail means, so it is very fast. It also means that the conversion is limited to FROM-STRING TO-STRING syntax. No wild cards or regular expressions are allowed.
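To illustrate what a lookup by "pure procmail means" can look like, here is a sketch that searches such a table with a single condition. It is not the module's actual implementation; "table" is a hypothetical helper variable, and the sketch assumes the table format described above.

```procmail
# Sketch only: not the code of pm-jalist.rc.
# "table" is a hypothetical helper variable.

table = ",$JA_LIST_CONVERSION"  # leading comma marks the start of an entry

:0 D                            # D: keep the comparison case sensitive
*$ table ?? ,$\LIST +\/[^ ,]+
{ LIST = $MATCH }               # text between the spaces and the next comma
```

Because the entry is found with one regular expression match on a variable, no external process is started, which is the point the paragraph above makes.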

If you consider using an external process, like sed or perl, to convert the grabbed list name to something else (when the JA_LIST_CONVERSION method was not enough), think again. For each incoming mailing list message you launch an external process. It is not unusual to receive 700 messages a day from various mailing lists, so you can imagine how much load any external process would add to the server. Use the grabbed mailing list name and the JA_LIST_CONVERSION table if you care about system load.

If you have many mailing lists that use uppercase names, it may be tedious to add each mailing list name to JA_LIST_CONVERSION. A possible alternative is to use the very efficient tr program to convert characters to lowercase. Again, think twice, because any extra process could be avoided if JA_LIST_CONVERSION was used.

:0
* ! LIST ?? ^^^^
{
    :0 D                        # still uppercase list name?
    * LIST ?? [A-Z]
    {
        LIST = `echo $LIST | tr A-Z a-z`
    }

    :0 :
    list.$LIST
}

List name is not always the same

One important thing to keep in mind is that when a mailing list manager sends out list messages, the headers may change. This means that the list name grabbed previously changes too. This is unfortunate, but it sometimes happens. Let's see an example. I was previously receiving messages from the Cygwin mailing list named gnu-win32

To: gnu-win32@cygnus.com, "Foo Bar" foo@example.com

However, one day that same list was grabbed under name "cygwin", due to new header

Mailing-List: contact cygwin-help@sourceware.cygnus.com; run by ezmlm

Now I had two list names that both should be going to the same mailbox. No worries, just add new entry to the translate table to convert the new list name to mailbox name:

JA_LIST_CONVERSION = "\
gnu-win32   cygwin32,\
cygwin      cygwin32,\
"

Example: basic installation

Here is a recipe to save all your mailing lists to separate folders. If you subscribe to new lists or unsubscribe from lists, you don't need to change anything. The grabbed list name will appear in variable LIST.

RC_LIST = $PMSRC/pm-jalist.rc   # name the subroutine
...
# Handle all mailing lists with one subroutine and recipe
# following it. Set also JA_LIST_CONVERSION before
# calling this subroutine to convert the found list names.

INCLUDERC = $RC_LIST
imap =                          # Kill var. Set to "/" to enable

:0                              # if list name was grabbed
* LIST ?? [a-z]
{
    dummy = "Saving mailing list: $LIST"

    :0 w:
    ${imap:+".INBOX."}list.$LIST$imap
}

What's that IMAP thing there, you may wonder. Normally procmail delivers to standard mailbox, so the name is something like '$MAILDIR/list.abc'. For IMAP, the delivery must happen using principle "one file, one message", so procmail must deliver to a directory. That's what the added $imap is there for. It is also customary that IMAP folders are prefixed with ".INBOX", so the actual name becomes $MAILDIR/.INBOX.list.abc. For IMAP there should also be proper MAILDIR=$HOME/Maildir setting.

Pm-jamime-decode.rc – decode MIME body contents; quoted-printable, base64

File id

Documentation

The decoding scheme used here was originally presented by Peter Galbraith galbraith@mixing.qc.dfo.ca on the procmail mailing list near the end of 1997.

This subroutine supposes that the header has MIME header Content-Type: text/plain and performs quoted-printable or base64 decoding on the whole message. Note that if you receive messages that have many mime attachments, then this recipe is not suitable for them.

Procmail is not designed to handle mime attachments and this recipe only applies to whole body.

The pm-jamime-*.rc is really stretching the limits and any serious work should be delegated to appropriate Perl MIME modules. There is a Perl MIME module which will allow you to manipulate MIME body parts rather elegantly. See http://www.perl.com/CPAN-local/authors/Eryq/ for MIME-tools.

See also mimedecode at ftp://ftp.dde.dk/pub/mimedecode.c which is included in Debian Linux.

Notes

Perl or python is not used, because both are CPU intensive. It would be too expensive for accounts or environments receiving hundreds of mails per day (like from several mailing lists).

RFC 2047 gives the possibility to use MIME iso-8859-1 extensions for mail headers.

Subject: Re: [PIC]: RSA =?iso-8859-1?Q?encryption=B7=B7?=
Subject: =?iso-8859-1?Q?=5BEE=5D:TV_&_video_IC=B4s_!!?=

There is also base64 possibility (although rare):

Subject: =?iso-8859-1?B?zvLi5fI6ICAgICAgTVBMQUIzLjQw?=

In the worst possible case there are even multiple ISO encoded strings in the subject. Yes, this is valid; the continued line includes spaces at the front to keep it with the original, just like in Received: headers. This subroutine will not touch headers that have multiple ISO tags, because procmail is too limited for that.

Subject: AW: Re: AW: neue =3D?ISO-8859-1?Q?M=3DF6glichkeiten_=3D28was_=3D=C4hn?=3D
    =3D?ISO-8859-1?Q?lichkeiten_von_=3DDCbungen=3D29?=3D

Required settings

Variable PMSRC must point to source directory of procmail code. This subroutine will include

pm-javar.rc, pm-jamime.rc
Programs $MIME_BIN, $MIME_BIN_QP and $MIME_BIN_64 must have been installed (see pm-javar.rc).

Call arguments (variables to set before calling)

JA_MIME_DECODE_TREAT_SUBJECT, default "yes". Decode Subject header by removing mime.
JA_MIME_DECODE_TREAT_FROM, default "no". Decode From header by removing mime.
JA_MIME_DECODE_TREAT_BODY, default "no". Decode body of message by removing quoted-printable from a message that contains only one part. Messages with multiple parts are not handled.

Return values

PM_JAMIME_COMPLEX_SUBJECT is set to "yes" if the Subject header contains ISO encoding several times, which this module cannot handle. The flag indicates that some other program should handle the message.

Examples

Instead of testing the existence of text/plain in the body, you can force decoding by setting JA_MIME_DECODE_REGEXP to ".*".

RC_MIME_DECODE = $PMSRC/pm-jamime-decode.rc

:0
* condition
{
    JA_MIME_DECODE_REGEXP = ".*"
}

INCLUDERC = $RC_MIME_DECODE     # call subroutine.

Pm-jamime-kill.rc – General MIME attachment killer (vcards, html)

File id

Description

Note: If you think this module can do miracles, it cannot. MIME messages are very complex in structure and all this module can do is to detect simple attachments. It cannot be used as an all-purpose, all-detecting MIME attachment killer. But the part it can do is done efficiently, because most of the things are accomplished using procmail and resource friendly awk.

There are many programs that add additional information to the messages. Microsoft's mail program is one which may include a 7k application/ms-tnef attachment to the end of the message. Many other programs may do the same. This was the idea in 1997 when this module was written: to get rid of the extra cruft which should not land in the mailbox.

This recipe works like this: If email's structure is

--boundary
message-text (maybe quoted-printable)
--boundary
some-unwanted-mime-attachment
--boundary

then the attachment is killed from the body. The message-text part is also decoded if it was quoted printable. This leaves clean text with no MIME anywhere. MIME headers will be modified as needed due to the conversion from multi part and possibly quoted printable to plain text, and the final message looks like:

message

But if email's structure is anything else, like if there were 3 mime sections:

--boundary
message-text (maybe quoted-printable)
--boundary
some-attachment
--boundary
some-unwanted-mime-attachment
--boundary

then the "unwanted" part is emptied by replacing it with one empty line. The message structure stays the same, but the killed "some-unwanted-mime-attachment" part is labelled as text/plain so that the MUA (Mail User Agent; the email reader program) can decode the MIME message correctly.

Applications for other mime attachments

The following cases are included in this module. You need to separately enable the behavior before this module will start working.

Lotus Notes attachment.
Microsoft Express attachment. It sends a copy of message in HTML format.
Mozilla's Netscape attachment. It sends a copy of message in HTML.
Vcard attachments.
Openmail attachment. It sends 10-20 line base64 attachments WINMAIL.DAT.

Example of lotus notes attachment

Subject: message
From: foo@bar.com
X-Lotus-FromDomain: XXX COMPANIES
Mime-Version: 1.0
Boundary="0__=cieg4oHxUNf2h3evyOXIsHTGDpFfaZilTDCFhpZSgsw"
Content-Type: multipart/mixed; Boundary="0__=cieg4oHxUNf2h3evyOXIsHTGDpFfaZilTDCFhpZSgsw"

--0__=cieg4oHxUNf2h3evyOXIsHTGDpFfaZilTDCFhpZSgsw
Content-type: application/octet-stream; name="PIC10898.PCX"
Content-transfer-encoding: base64

eJ8+IjsQAQaQCAAEAAAAAAABAAEAAQeQBgAIAAAA5AQAAAAAAADoAAEIgAcA
b3NvZnQgTWFpbC5Ob3RlADEIAQ2ABAACAAAAAgACAAEEkAYAyAEAAAEAAAAQ
<AND-THE-REST-OF-BASE64>

--0__=cieg4oHxUNf2h3evyOXIsHTGDpFfaZilTDCFhpZSgsw--

Example of MS Explorer's ms-tnef message

Subject: message
From: foo@bar.com
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="---- =_NextPart_000_01BD04D4.A5AC6B00"
Lines: 158

------ =_NextPart_000_01BD04D4.A5AC6B00
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

<MESSAGE ITSELF IS HERE>

------ =_NextPart_000_01BD04D4.A5AC6B00
Content-Type: application/ms-tnef
Content-Transfer-Encoding: base64

eJ8+IjsQAQaQCAAEAAAAAAABAAEAAQeQBgAIAAAA5AQAAAAAAADoAAEIgAcA
b3NvZnQgTWFpbC5Ob3RlADEIAQ2ABAACAAAAAgACAAEEkAYAyAEAAAEAAAAQ
<AND-THE-REST-OF-BASE64>

------ =_NextPart_000_01BD04D4.A5AC6B00--

Example of MS Express's HTML message

MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="----=_NextPart_000_003A_01BD16E2.C97E27B0"
X-Mailer: Microsoft Outlook Express 4.72.2106.4
X-MimeOLE: Produced By Microsoft MimeOLE V4.72.2106.4

This is a multi-part message in MIME format.

------=_NextPart_000_003A_01BD16E2.C97E27B0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

<ACTUAL TEXT>

------=_NextPart_000_003A_01BD16E2.C97E27B0
Content-Type: text/html; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

<SAME IN HTML>

------=_NextPart_000_003A_01BD16E2.C97E27B0--

Example of Netscape's HTML attachment

X-Mailer: Mozilla 4.04 [en] (X11; U; Linux 2.0.33 i686)
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="------------69D9D579CF587DC8BB26C49C"

--------------69D9D579CF587DC8BB26C49C
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

<ACTUAL TEXT>

--------------69D9D579CF587DC8BB26C49C
Content-Type: text/html; charset=us-ascii
Content-Transfer-Encoding: 7bit

<SAME IN HTML>

--------------69D9D579CF587DC8BB26C49C--

Example of Netscape's vcard attachment.

Content-Type: text/x-vcard; charset=us-ascii; name="vcard.vcf"
Content-Transfer-Encoding: 7bit
Content-Description: Card for Laird Nelson
Content-Disposition: attachment; filename="vcard.vcf"

begin: vcard
fn: Laird Nelson
n: Nelson;Laird
org: Perot Systems Corporation
email;internet: ljnelson@unix.amherst.edu
title: Software Engineer
tel;work: (617) 303-5059
tel;fax: (617) 303-5293
tel;home: (978) 741-3126
note;quoted-printable:Information is for reference only;=0D=0A=
please do not abuse it.
x-mozilla-cpt: ;0
x-mozilla-html: TRUE
version: 2.1
end: vcard

Required settings

To handle base64 encoded messages, a package called metamail must have been installed on the system. It provides the program mimencode, which is used through variable $MIME_BIN (see pm-javar.rc).

Variable $PMSRC must point to source directory of procmail code. This subroutine will include

pm-javar.rc
pm-jamime.rc

Call arguments (variables to set before calling)

First of all, this is primarily a framework recipe to kill any kind of attachment. If you do not set JA_MIME_TYPE before calling this recipe, recipe will try to determine the right value by itself. If the automatic detection fails you need to preset the value of JA_MIME_TYPE beforehand.

JA_MIME_TYPE is a case sensitive AWK REGEXP. Always use lowercase letters in this regexp because the line is lowercased before match is made. This regexp determines if the kill recipe is applied to the message or not. Value is set for MS explorer, MS express, Netscape and Lotus Notes etc. messages by default.
JA_MIME_KILL_RE, additional REGEXP to kill lines from the message. Value is case sensitive awk regexp and by default matches Lotus notes tag: name="XXX.PCX".
JA_MIME_EXTRA_HEADER, name of header added to the message if the MIME portion was killed. Default value is "X-Mime-Type-Killed".

It may be possible that some messages are malformed and that they do not contain a proper "boundary" definition string in the header. There have been messages that have text/html attachments, but no proper Mime headers. For those cases there is an additional variable that will kill all text up to the matching line regardless of message content.

JA_MIME_KILL2_RE is set to "text/html|application/ms-tnef". Update this to match attachments you receive. Set variable to "" if you don't want to change the body of non-compliant MIME message.

That variable is the last resort if the standard MIME detection failed. There must have been some problem in the sender's MUA that composed message. It's dangerous, so make sure you don't set it lightly.

Possible conflict with your awk

If you see an error message in the log file saying that awk failed:

procmail: Executing awk, ...
procmail: Error while writing to "awk"
procmail: Rescue of unfiltered data succeeded

it means that the system's standard awk doesn't support the variable passing syntax. Do the following test:

% awk '{print VAR; exit}' VAR="value" /etc/passwd

It should print "value". If not, then see if you have nawk or gawk in the system. They should understand the variable passing syntax. The only change needed is to define variable AWK somewhere at the top of ~/.procmailrc.

AWK = "gawk" # Better than standard "awk"

Warnings

You should know that the variable JA_MIME_KILL_RE is used to wipe any lines that match that regexp. This is due to MIME structure where continuing header lines exist in the body:

------=_NextPart_000_003A_01BD16E2.C97E27B0
Content-Type: text/plain;
    charset="iso-8859-1"        << kill this line too

If you want to be absolutely sure that anything valuable won't be accidentally killed (like a code line in programming language scripts), you should set this variable to a nonsense value that never matches:

JA_MIME_KILL_RE = "match_it_never_I_hope"

Usage example: Customizing the attachment killing

Suppose you receive a new application/ms type attachment that the default settings don't cover. This is a new mime type and you have to instruct this module to kill it. Add this and similar tests for other mime types:

myCustomMimeType = "application/ms"     # must be all lowercase

:0                                      # the attachment type appears in the body
*$ B ?? $myCustomMimeType
{
    JA_MIME_TYPE = $myCustomMimeType
}

INCLUDERC = $PMSRC/pm-jamime-kill.rc

Usage example

To kill text/html or pdf, postscript and others add something like this to ~/.procmailrc. It demonstrates how the correct MIME types are detected:

# .....................................................
# 1) Uncomment following line if your standard "awk" is broken
# AWK = "gawk"

# .....................................................
# 2) Set correct value for attachment killing

:0
* ^X-Lotus-FromDomain:
{
    # Kill Lotus notes .pcx attachments
    JA_MIME_TYPE = "application/octet-stream"
}

:0
* H ?? ^From:.*foo@example.com
* B ?? ^Content-Type:.*text/html
{
    # Kill html attachments
    JA_MIME_TYPE = "text/html"
}

# .....................................................
# 3) Call module

INCLUDERC = $PMSRC/pm-jamime-kill.rc

Pm-jamime.rc – subroutine to read mime boundary etc. variables

File id

Documentation

This includerc reads the MIME boundary string from the message if it exists. The boundary string is typically found in the Content-Type header.

Mime-Version: 1.0
Content-Type: multipart/mixed; boundary=9i9nmIyA2yEADZbW

In addition it will define a few other mime variables. See the returned values. You use these variables later in your MIME message processing.

Mime Notes

1998-07-28 Brett Glass brett@lariat.org reported in PM-L that there was security exploit in long attachment filenames: http://www.xray.mpe.mpg.de/mailing-lists/procmail/1998-07/msg00248.html

And here is the url to the matter:

http://www.sjmercury.com/business/microsoft/docs/security0728.htm

When you use this module to detect mime messages, you can check the filename length with recipe:

# Recipe after calling $RC_MIME, this module,

re       = ".........."         # regexp matching 10 characters
too_long = "$re$re$re$re"       # allow 40 characters maximum

:0
*$ $SUPREME^0 MIME_H_ATTACHMENT ?? $too_long
*$ $SUPREME^0 MIME_B_ATTACHMENT ?? $too_long
{
    dummy = "** Dangerously long mime attachment filename"
    dummy = "** $MIME_H_ATTACHMENT $MIME_B_ATTACHMENT"

    :0 :
    /var/spool/mail/MimeDanger
}

Required settings

PMSRC must point to source directory of procmail code. This subroutine will include

pm-javar.rc

Call arguments (variables to set before calling)

(none)

Return values

Variable MIME is set to "yes" or "no" depending on whether the message has a mime version string.
MIME_VER contains the mime version string from the header.
MIME_TYPE contains the Content-Type from the header.
MIME_CTE contains Content-Transfer-Encoding from the header.
MIME_H_QP is "yes" if Content-Transfer-Encoding: quoted-printable is in the header.
MIME_B_QP is "yes" if Content-Transfer-Encoding: quoted-printable is found from the body.
MIME_BOUNDARY contains the boundary string, which is used to differentiate mime sections in the body.
MIME_BOUNDARY_COUNT is the number of boundary strings found in the body. The value is 3 if there are two mime sections, 4 if there are three, and so on: MIME_BOUNDARY_COUNT - 1 = count of sections.
MIME_H_ATTACHMENT contains the filename if there was an attachment filename in the header. Content-Disposition: attachment; filename="..."
MIME_B_ATTACHMENT contains the filename of a file attachment in the body. Note however that this is the match of the first string in the body. There may be several attachments. MIME_B_ATTACHMENT_FILE_COUNT tells you how many filenames are in the body.
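
The relation between MIME_BOUNDARY and MIME_BOUNDARY_COUNT can be illustrated with a small Python sketch. This is not the module's code (the module itself is pure procmail); the function name `mime_boundary_info` and the way delimiter lines are recognized are assumptions made for illustration only.

```python
import re

def mime_boundary_info(headers: str, body: str):
    """Return (boundary, boundary_count, section_count) for a message."""
    # Look for boundary=... in the headers; the value may be quoted.
    m = re.search(r'boundary="?([^";\s]+)"?', headers, re.IGNORECASE)
    if not m:
        return None, 0, 0
    boundary = m.group(1)
    # A delimiter line is "--" + boundary; the last one has a trailing "--".
    delimiters = ("--" + boundary, "--" + boundary + "--")
    count = sum(1 for line in body.splitlines() if line.rstrip() in delimiters)
    return boundary, count, max(count - 1, 0)

headers = (
    "Mime-Version: 1.0\n"
    "Content-Type: multipart/mixed; boundary=9i9nmIyA2yEADZbW\n"
)
body = (
    "--9i9nmIyA2yEADZbW\n"
    "first part\n"
    "--9i9nmIyA2yEADZbW\n"
    "second part\n"
    "--9i9nmIyA2yEADZbW--\n"
)
boundary, count, sections = mime_boundary_info(headers, body)
```

With two sections the body holds three delimiter lines, which agrees with the "count - 1" rule given above.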

Usage example

INCLUDERC = $PMSRC/pm-jamime.rc

Pm-jamime-recode.rc – re-encode MIME Header: Subject, From as quoted-printable

File id

Documentation

This subroutine supposes that the message has been handled by 'pm-jamime-decode.rc'. The purpose is to restore the Subject and From headers back to quoted-printable format so that messages can be safely saved through an IMAP system which may not handle 8-bit messages. If the message is stored directly to a mailbox and the Mail User Agent has no problems dealing with 8-bit characters, this module is not needed.

An example where this subroutine could be applied:

Feed message to pm-jamime-decode.rc
Feed message to pm-jasubject.rc (to clean multiple Re: Re: Fwd ..)
Restore From/Subject encodings with pm-jamime-recode.rc
Save message to mailbox

Notes

Perl or python is not used, because both are CPU intensive. It would be too expensive for accounts or environments receiving hundreds of mails per day (like from several mailing lists).

Required settings

Variable PMSRC must point to source directory of procmail code. This subroutine will include

pm-javar.rc
pm-jamime.rc
Program $MIME_BIN_QP_E must have been installed (see pm-javar.rc).
pm-jamime-decode.rc must have been called and message must contain headers X-Mime-Header-Decoded-*

Call arguments (variables to set before calling)

JA_MIME_RECODE_TREAT_SUBJECT, default "yes". Decode Subject header by removing mime.
JA_MIME_RECODE_TREAT_FROM, default "no". Decode From header by removing mime.

Return values

(none)

Examples

To fix the Subject header and then make it 7bit clean again. Note, this may not be exactly what you want. The pm-jamime-decode.rc file does a little more than From/Header handling (it also modifies the message body). Read the documentation of each file before using the following example:

INCLUDERC = $PMSRC/pm-jasubject.rc
INCLUDERC = $PMSRC/pm-jamime-decode.rc
INCLUDERC = $PMSRC/pm-jamime-recode.rc

Pm-jamime-save.rc – save message's MIME attachment (one file) to a file

File id

Documentation

This module saves one simple file attachment (MIME) from the message. The message must define the following MIME headers. If "filename=" does not exist, then the message is ignored.

Mime-Version: <version>
Content-Type: <type>
Content-Disposition: attachment; filename="file.txt"

The filename can also be on a separate line, provided that it is indented according to standard rules:

Mime-Version: <version>
Content-Type: <type>
Content-Disposition: attachment;
  filename="file.txt"

Procmail is not very suitable for saving MIME attachments and you should not think that this is the right tool for you. If you receive anything more than 1 attachment, this recipe does nothing, because that's out of our league and you need some more heavyweight mime tools. E.g. Perl CPAN has MIME libraries.

Note: when the attachment is in the body, it is simply written to disk and the location in the message is replaced with text:

Extracted to file:/users/foo/junk/<YYYY-MM-DD-hhmm>.file.txt.

The existing mime headers that surround the attachment are left untouched, so don't try to press your Mail Agent's MIME buttons at that point. There is no such file in that spot if you set JA_MIME_SAVE_DEL to yes.

Required settings

PMSRC must point to source directory of procmail code. This subroutine includes library:

pm-javar.rc
pm-jamime.rc
pm-jamime-decode.rc
pm-jadate.rc (which will call other pm-jadate*.rc files)

Call arguments (variables to set before calling)

JA_MIME_SAVE_DIR, point this to directory where you want to store attachments.
JA_MIME_SAVE_DECODE, set this to "yes" if you want the attachment to be decoded before it is written to disk. This usually opens quoted printable or base64 encoding.
JA_MIME_SAVE_DEL, set this to "yes" if you want to remove the attachment from the body of the message after it has been filed. Be very careful if you use this option. If you keep a backup cache of incoming mail, then you might try "yes".
JA_MIME_SAVE_OVERWRITE, set this to "yes" if it's okay to overwrite to an existing filename found from attachment. If you get periodic attachments always with same name, then you would want to set this to yes.

Core dump note

Because procmail uses LINEBUF when filtering messages, a core dump may happen if the attachment being filtered is bigger than the LINEBUF. The current setting accepts 524K attachments, but if you expect to get bigger than that, you want to increase JA_MIME_SAVE_LINEBUF.

Possible conflict with your awk

Awk is used because it is much more system load friendly than perl. If you see an error message in the log file saying that awk failed:

procmail: Executing awk, ...
procmail: Error while writing to "awk"
procmail: Rescue of unfiltered data succeeded

it means that the system's standard awk doesn't support the variable passing syntax. To verify that this is the case, run following test:

% awk '{print VAR; exit}' VAR="value" /etc/passwd

The proper awk should print "value". If not, then see if you have nawk or gawk in your system, which should understand the variable passing syntax. To change the AWK, you need to set following variable somewhere at the top of your .procmailrc

AWK = "gawk" # if that works better than standard "awk"

Return values

(none)

Pm-janetmind.rc – handle http://minder.netmind.com/ messages

File id

Description

** THIS MODULE IS OBSOLETE. THE NETMIND SERVICE NO LONGER EXISTS **

http://minder.netmind.com/

...Netmind, or The URL-minder is a free, automatic Web-surfing robot that keeps track of changes to Web pages that are important to you. When the URL-minder detects changes in any of the Web pages you have registered, it sends you e-mail.

In other words, if you're interested in some URL, say a FAQ page and any updates to it, you can tell Netmind to monitor the page changes for you and it sends a message back every time the page changes.

This recipe "pretty formats" the announcement sent by Netmind by stripping the message to bare minimum. You usually aren't interested in 4k message which includes "Note from our sponsors", "Try the free online demo" etc. The things saved from the announcement message are:

The changed url, which is moved to subject
Cancellation url pointer
url to the lists of your monitored urls
your id number

[Note]

Please let Netmind send you one "pure" message first so that you know what it originally looks like. Then plug in this module and see how the original message is reduced.

[Thank you]

The Doctor What docwhat@holtje-christian-isdn.mis.tandem.com 1998-03-12 sent me a patch, where a) body message is more informative b) URL is now included in the body for auto-click browsers c) mime headers were removed.

Required settings

PMSRC must point to source directory of procmail code. This subroutine will include

pm-javar.rc
If you set variable JA_NETMIND_SUBJECT to "yes", then the changed url http pointer is put to subject line.

Usage example

INCLUDERC = $PMSRC/pm-janetmind.rc  # reformat the message

:0:                                 # drop to folder
* netmind
url.mbox

Pm-janslookup.rc – run nslookup on variable INPUT

File id

Description

This subroutine runs nslookup on the given INPUT address. This may be an effective way to test if the address is known to the Internet. You could use this information to determine if some automated reply to an address can be sent. The known truth is that you can't validate the whole email address

to_someone@foo.com

but you can validate "foo.com"; that's the closest you get.

[Warning: If you don't use cache feature...]

Do not, however, use this subroutine to regularly check all incoming From addresses for possible bogus UBE addresses, because calling nslookup

may be slow: building the connection and querying the results may take several seconds (sometimes; usually it's quite fast)
consumes quite a lot of resources.

You can however check some messages that are likely UBE to verify your doubts.

Required settings

PMSRC must point to source directory of procmail code. This subroutine will include

pm-javar.rc
pm-jaaddr.rc

Call arguments (variables to set before calling)

INPUT, the address (only strict domain part) which is checked. E.g. "this.domain.com". See examples for more. If this string contains "@" character, then additional subroutine pm-jaaddr.rc is called to extract the domain name: INPUT = "John Doe foo@site.com" --> INPUT = "site.com"
JA_NSLOOKUP_CACHE, filename. If exists, cache is used and updated.
JA_NSLOOKUP_FORCE, if "yes", then cache is not used but a forced nslookup is performed.
JA_NSLOOKUP_OPT is currently empty, but you could see if you want to use "-querytype=MX". However this option may give you the response "No mail exchanger (MX) records available", which is flagged as nslookup failure.
JA_NSLOOKUP_SERVER, optional, the server to use for nslookup

If the cache file can be read:

Each entry has format "address.com ns-error". The error indication is added to the line if the nslookup failed when address was checked. Otherwise line contains "address.com".
The cache is always checked first. If there is no entry matching the current address, only then is nslookup called and new entry added to cache.
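
The cache-first behaviour described above can be sketched in Python as follows. This only illustrates the idea; the function names, the injected `lookup` callable (standing in for the nslookup call) and the example domains are hypothetical, and the "maybe" state is left out.

```python
from typing import Callable, Dict

def parse_cache(text: str) -> Dict[str, bool]:
    """Map each cached domain to True (lookup failed) or False (lookup ok)."""
    cache = {}
    for line in text.splitlines():
        fields = line.split()
        if fields:
            # Line format: "address.com" or "address.com ns-error"
            cache[fields[0]] = "ns-error" in fields[1:]
    return cache

def check_domain(domain: str, cache: Dict[str, bool],
                 lookup: Callable[[str], bool], force: bool = False) -> str:
    """Return "yes" if the domain is in error, "no" otherwise."""
    if force or domain not in cache:
        # `lookup` returns True on success; store the failure flag in the cache.
        cache[domain] = not lookup(domain)
    return "yes" if cache[domain] else "no"

cache = parse_cache("good.example\nbad.example ns-error\n")
calls = []

def fake_lookup(domain: str) -> bool:
    calls.append(domain)
    return True

r1 = check_domain("bad.example", cache, fake_lookup)  # cached failure, no lookup
r2 = check_domain("new.example", cache, fake_lookup)  # not cached, looked up
```

Passing `force=True` corresponds to JA_NSLOOKUP_FORCE: the cache entry is ignored and refreshed.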

Return values

Variable ERROR will be set to "yes" if nslookup failed or to "no" if nslookup succeeded. It can also contain "maybe" if nslookup returned "No address (A) records available"
ERROR_MATCH contains one line lookup failure reason.

Following conditions trigger "maybe" and no "ns-error" is written into the cache.

"No address (A) records available for xxx"

Usage example

If you are going to check some header field, like From:, please explode the content with pm-jaaddr.rc first. Suppose you have string:

"From: foo@ingrid.sps.mot.com (Yoshiaki foo)"

You have to derive the address from string and pass the site name: Read From: field and address from it.

PMSRC       = $HOME/pm
RC_NSLOOKUP = $PMSRC/pm-janslookup.rc   # name the subroutine

:0
*  MAYBE_UBE ?? yes
*  ^From:\/.*
{
    INPUT     = $MATCH
    INCLUDERC = $RC_NSLOOKUP            # to nslookup

    :0
    * ERROR ?? yes
    {
        # Hm, nslookup failed, can't send anything back to this
        # address
    }
}

Second example, check if the address is reachable before sending reply

INPUT     = `$FORMAIL -rt -x To:`
INCLUDERC = $RC_NSLOOKUP

:0
* ERROR ?? no
{
    # okay, at least site address seems to be reachable
}

Pm-jaorig.rc – Extract embedded original message (simple recipe)

File id

Documentation

This subroutine digs embedded message from the body and replaces current message with it. Copy the message to folder before calling this subroutine if you need original.

NOTE: This is simple tool and the sole purpose is to derive simple embedded messages. Write full fledged perl script if you want better extracting features. The used AWK inside this procmail recipe will fail to find 30% of the cases, mostly due to non-standard way of including the message. The recognized formats are as follows. Anything that differs from these are ignored or incorrectly parsed.

Message is embedded left flushed "as is". With full headers or Minimum of From: Subject Received
The embedded message is quoted with > with optional one space.

Where you would use this module

If you're subscribed to mailing lists that regularly send copies of original messages to the list, like forwarding spam to the SPAM-L mailing list at http://bounce.to/dmuth, then you'd like to extract the original embedded message, which you can then feed to your UBE filter to test if the shield holds.

spam-l-request@peach.ease.lsoft.com subscribe SPAM-L <First name> <Last name>

This recipe takes a simplistic approach and tries its best to extract the embedded message. The idea for this recipe comes from Era Eriksson's posting "recipe to turn list postings back into original spam" 1998-06-25 in the Procmail mailing list.

Body must contain headers
Remove all > quotations.
Extract everything to the end of message. (There are no means to get rid of the signatures that the forwarding poster or list server may have attached.)

How the message is extracted

When this recipe ends, the current message has been modified so that it is the original message. Like if you would receive:

HEADER-1        # The poster
body-1          # his comments
HEADER-2        # The original embedded message
body-2
body-1          # And poster's signature or mailing list footer

The message now looks like

HEADER-2
body-2
body-1

And you can save this as original message or feed it to your UBE filter and test if it detects it.
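
The extraction idea can be sketched in Python as below. The module itself uses procmail and awk; this sketch only borrows the "^[> ]*From:.*[a-zA-Z]" regexp mentioned in this documentation, and the function name and sample message are made up for illustration.

```python
import re

# An embedded header line, possibly quoted with ">" and spaces.
HEADER_START = re.compile(r"^[> ]*From:.*[a-zA-Z]")
# One or more ">" quote marks, each with an optional following space.
QUOTE_PREFIX = re.compile(r"^(?:> ?)*")

def extract_embedded(body: str) -> str:
    """Return the body from the first embedded From: header on, unquoted."""
    lines = body.splitlines()
    for i, line in enumerate(lines):
        if HEADER_START.match(line):
            # Keep everything to the end of the message, minus the quoting.
            return "\n".join(QUOTE_PREFIX.sub("", rest) for rest in lines[i:])
    return ""  # no embedded message found

body = (
    "Look at this spam:\n"
    "\n"
    "> From: spammer@example.com\n"
    "> Subject: Buy now\n"
    ">\n"
    "> body-2\n"
    "\n"
    "-- \n"
    "poster signature\n"
)
original = extract_embedded(body)
```

As in the recipe, the poster's trailing signature stays in the result, because nothing marks where the embedded message ends.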

Code note: procmail or awk core dump

For some reason procmail kept dumping core if I wrote the code in a nicer format like below, but if I made it compact, then it didn't dump core. Go figure. I'm not pleased that I had to sacrifice clarity, but there was no other way.

[The good style]      [The forced compact style]

if ()                 if () { statement }
{ statement }

I have no explanation why this happens. The same AWK code would work just fine in most cases, and then came this message x and caused the core dump; if I fed some other message, I didn't get a core dump. Total mystery to me. Don't let the log message fool you, this had nothing to do with regexp "^[> ]*From:.*[a-zA-Z]". If I deleted one line from the AWK script, it worked ok; if I added it back, the core dump happened with that message x.

procmail: Assigning "pfx=[> ]*"
procmail: No match on "^[> ]*From:.*[a-zA-Z]"
Segmentation fault (core dumped)

Required settings

PMSRC must point to source directory of procmail includerc code. This subroutine needs module(s):

pm-javar.rc

Call arguments (variables to set before calling)

(none)

Usage Example

Let's assume that you want to feed all forwarded UBE that is posted to spam-l mailing list to your filter and see if it needs improving by checking the logs later. The forwarded UBE to the list is labelled "SPAM:" in the subject line.

RC_LIST = $PMSRC/pm-jalist.rc   # mailing list detector
RC_ORIG = $PMSRC/pm-jaorig.rc   # extract original
RC_UBE  = $PMSRC/pm-jaube.rc    # UBE filter

...

INCLUDERC = $RC_LIST            # defines variable `LIST'

:0
* ! LIST ?? ^^^^
{
    :0                          # spam-l mailing list
    * LIST ?? spam
    * Subject: +SPAM:
    {
        INCLUDERC = $RC_ORIG    # Change it to UBE message

        # Ok, next feed it to filter, set some variables first
        # Log = Short log; What filters were applied to message
        # mbx = If message was trapped, save it here

        JA_UBE_LOG  = "$PMSRC/pm-ube.log"
        JA_UBE_MBOX = "junk.ube.ok.mbox"
        INCLUDERC   = $RC_UBE

        :0 :                    # If comes here, filter failed
        junk.ube.nok.mbx
    }

    :0 :                        # save normal list messages
    list.$LIST
}

Pm-japing.rc – reply shortly to message "Subject: ping"; account ok

File id

Description

When I'm on a remote site and I don't seem to get through with telnet or even with Unix ping(1), I want to know if at least the mail server is up. I can send a ping message and the auto responder will reply immediately.

Sometimes, when you send a message to a person, it would be nice if you could test that the destination address is valid before sending a message to a black hole. If the receiver had a ping service running, like this one, then you would know that you spelled the address right (instead of wondering for two weeks why you don't get a response). Nowadays the finger(1) command seems to be blocked many times.

This recipe answers to simple ping message like this:

To: you@site.com
Subject: ping

Recipe sends a short message back to the sender.

Required settings

PMSRC must point to source directory of procmail code. This subroutine will include

pm-javar.rc
pm-jastore.rc

Call arguments (variables to set before calling)

(see Usage)

Usage example

JA_PING_MBOX = $HOME/Mail/spool/ping.spool
INCLUDERC    = $PMSRC/pm-japing.rc

Pm-japop3.rc – Remotely download messages by mail command request

File id

Description

Ahem, that pop3 is just to draw your attention. This module has nothing to do with pop3. The idea may resemble it though. This module listens for pop3 requests, and when it gets one, it sends the whole mailbox content as separate forwarded messages to the account from where you sent the request.

This is kinda "empty my mailbox in account X and send the messages to account Y"

You might have permanent forwarding on in account X, but if that is your secondary account, you can ask what messages have arrived there with this recipe.

After you have configured your magic pop3 command, which is your password, simply send a message to account X, and this module initiates emptying the mailbox. Here is an example:

Subject: pOp3-send [mailbox] [kill]

mailbox, is optional folder name which you want to process. it is $DEFAULT if not given in subject. Use absolute path if you specify one.
if word kill is found, the mailbox will be emptied after forwarding. If the word is not found, messages are preserved.
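
How such a Subject line breaks down can be sketched in Python. This is an illustration only, not the recipe's own parsing: the function name, the sample paths and the rule "first remaining word is the mailbox" are assumptions.

```python
from typing import Optional, Tuple

def parse_pop3_subject(subject: str, command: str,
                       default_mailbox: str) -> Optional[Tuple[str, bool]]:
    """Return (mailbox, kill) for a valid request, or None otherwise."""
    words = subject.split()
    # The command must be the first word; the comparison is case sensitive.
    if not words or words[0] != command:
        return None
    rest = words[1:]
    kill = "kill" in rest
    folders = [word for word in rest if word != "kill"]
    # Fall back to the default mailbox when no folder name is given.
    mailbox = folders[0] if folders else default_mailbox
    return mailbox, kill

request = parse_pop3_subject(
    "pOp3-send /home/me/Mail/list.mbox kill", "pOp3-send", "/var/mail/me"
)
```

One consequence of this simple scheme is that a folder literally named "kill" cannot be requested.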

Required settings

PMSRC must point to source directory of procmail code. This subroutine will include

pm-javar.rc
pm-japop3.rc # Phew! We include ourself, we are recursive.

Call arguments (variables to set before calling)

JA_POP3_SUBJECT_CMD is your personal access command string. If this is the first word in the subject line, forwarding starts. This string is case sensitive.
JA_POP3_TMP is the file where the mailbox is moved before starting to forward the messages. Do not point this to your $HOME, because that may exceed the quota.
JA_POP3_TO_MUST_MATCH must contain a regexp that matches the email addresses to which the pop3 messages are allowed to be sent. BE SURE TO DEFINE this. If you have accounts X,Y,Z where you want to receive pop3 messages, set this regexp to match those sites' email addresses.
JA_POP3_LOGFILE is the log where you can see how the forked procmail processes send each pop3 mail. You may want to set this to different location than your default $LOGFILE.

Return value

STATUS will contain mailbox name if valid pop3 request was received. You may wish to save the pop3 requests to separate folder. See example below.

Example usage

You install this same setup for each site where you have account. This is the account X, from where you want to empty the mailboxes.

RC_POP3 = $PMSRC/pm-japop3.rc

.. somewhere in your .procmailrc ..

JA_POP3_SUBJECT_CMD = myPoPcmd
INCLUDERC           = $RC_POP3

# Save all pop3 requests to folder

:0 :
* STATUS ?? [a-z]
mail.pop3-req.mbox

In account Y, from where you send the pop3 requests, the following code stores the received messages to a separate folder:

# The MATCH will contain the host name from where the messages
# were moved

:0 :
*$ X-Loop-Fwd:.*\.rc +\/$NSPC+
mail.fwd.$MATCH.mbox

Pm-jarandf.rc – pick (rand)om line from (f)ile

File id

Description

Return a random line from a file. This subroutine uses shell command awk and possibly wc to be as small a burden to the system as possible.

Required settings

You must have awk that supports VAR=value assignment syntax outside the code block: that is, in the input line. I know no awk that would not have this feature, but at least you know now what it takes.

% awk '{print VAR; exit;}' VAR=1 /etc/passwd

Try using GNU awk, if your standard awk didn't print 1 in above test. (Put this line to the top of your .procmailrc)

AWK = "gawk"

Call arguments (variables to set before calling)

If you intend to call this subroutine many times, then please calculate the number of lines beforehand and pass it to this subroutine. If MAX is not set, then wc is called every time to find out the line count.

FILE, from what file to select. Make sure this exists; existence is not checked here.
[MAX] optional, number of lines in the FILE.
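
The role of MAX can be illustrated with a short Python sketch. The real subroutine uses awk and wc; here `pick_random_line`, its parameters and the sample lines are hypothetical, and counting the list length stands in for the wc call.

```python
import random
from typing import Optional, Sequence

def pick_random_line(lines: Sequence[str], max_lines: Optional[int] = None,
                     rng: Optional[random.Random] = None) -> str:
    """Return one random line; max_lines plays the role of MAX."""
    rng = rng or random.Random()
    # Counting the lines is the costly step (the wc call); skip it when given.
    count = max_lines if max_lines is not None else len(lines)
    index = rng.randrange(count)  # 0 <= index < count
    return lines[index]

cookies = ["first cookie", "second cookie", "third cookie"]
line = pick_random_line(cookies, max_lines=len(cookies), rng=random.Random(1))
```

If the passed count is smaller than the real line count, only the first lines can ever be picked, so MAX should be kept in sync with FILE.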

Returned value

variable LINE

Example usage

# Select random line from a file

RC_RANDF = $PMSRC/pm-jarandf.rc
COOKIE   = $HOME/txt/cookie.lst

...somewhere..

MAX=20 FILE=$COOKIE INCLUDERC=$RC_RANDF
# LINE contains randomly read line

Pm-jasrv-check.rc – check FILE validity, subroutine for File Server

File id

Description

This subroutine is part of the TPFS or MPFS file server. Check FILE for invalid filenames or other access problems.

Input

JA_SRV_F_FILE_CASE_SENSITIVE, flag
FILE, filename to check. possibly converted to lowercase.

Output

stat, set to "ok" if filename is acceptable. Otherwise contains brief error reason.

Pm-jasrv-daemon.rc – server request check, subroutine for File Server

File id

Description

This subroutine is part of the MPFS file server. Handle BOUNCES to mail server messages, eg if delivery failed due to maximum byte limit.

552 foo@site.com... Message is too large; 100000 bytes max

Input

(none) This recipe examines headers and body to see if it's daemon bounce.

Output

stat, set to "daemon" if message was handled.

Pm-jasrv-err.rc – send message, subroutine for File Server

File id

Description

This module is part of the MPFS file server. Send error notice: file didn't exist.

Input

FILE, file or command that did not exist.

Output

fld, additional field to be added to the saved mbox log message

Pm-jasrv-from.rc – compose reply, subroutine for File Server

File id

Description

This subroutine is part of the MPFS file server. Compose headers for reply message using formail -rt.

Here is dry run example to test this module

% procmail DEFAULT=/dev/null VERBOSE=on LOGABSTRACT=all \
    FORMAIL=/opt/local/bin/formail \
    JA_SRV_FORMAIL_FROM=me@here \
    JA_SRV_CONTENT_TYPE=content-type \
    JA_SRV_XLOOP=xloop \
    $HOME/pm/pm-jasrv-from.rc \
    < $HOME/any-sample.email

Input

JA_SRV_FORMAIL_FROM, JA_SRV_XLOOP
JA_SRV_CONTENT_TYPE

Output

(none)

Pm-jasrv-msg.rc – send message, subroutine for File Server

File id

Description

This subroutine is part of the TPFS or MPFS file server. Run $CODE and return results to the user. Subroutine is meant to be used for informational messages.

Input

code, code to run in shell
stat, status message for user

Pm-jasrv-multi.rc – send multipart MIME message, subroutine for FileSrv

File id

Description

This subroutine is part of MPFS file server. Send out FILE as multipart MIME message. The message will always be base64 encoded before sending.

Input

JA_SRV_MIME_MULTI_SEND, command to which the composed message is fed and which handles sending it as multipart MIME.
JA_SRV_MULTIPART_THRESHOLD is the chunk size for splitting mail.
FILE, only filename part. Included in MIME headers.
file, absolute path to send

Pm-jasrv.rc – MIME capable Procmail File server

File id

Description

This is the MPFS (Mime Procmail File Server) and it can send MIME compliant messages with command

"send <ITEM> [WORD1] [WORD2]"

Usually only the ITEM arg is used, and the rest of the words are for special uses like password and preventing file encoding. A typical request looks like:

Subject: send help # ask for file named 'help'

Overview of features

MIME types gzip and text/plain are supported.
.gz .zip etc. files are sent out as base64 attachments
.gz .tar.gz files that exceed 100K are sent out as MIME multiparts
requires procmail 3.11+ and MATCH operator \/
requires mmencode and gzip executables to be present in PATH.

Install: server file directory

You have to create a directory for the server where the files are kept. Usually I don't put the files there, but whenever I want to make a file available, I draw a hard or softlink to the real file.

% mkdir $HOME/pm-server

# Repeat this as needed for files you want to put available
% cd $HOME/pm-server
% ln -s $HOME/txt/interesting-file.txt interesting-file.txt

You define the server directory by setting

JA_SRV_FILE_DIR = $HOME/pm-server

The short server log is written to file pointed by this variable:

JA_SRV_LOG = $HOME/pm-server.log

The incoming "send" requests are stored to mailbox pointed by following variable. The default value is /dev/null, but you may want to set it to ~/Mail/spool/log.srv.spool which can be read as a newsgroup by Emacs Gnus [In Gnus create newsgroup with `G` m nnml log.srv]

JA_SRV_MSG_MBOX

Install: special files

Tweak this variable to commands you want to allow shell to execute in server's directory. This tells when <ITEM> "ls" means command instead of file

JA_SRV_SH_COMMAND = "^(ls|what)$"

That means that a request like this is treated as a command:

Subject: send ls # run "ls" command and return results

Be sure that the commands exist in your system. See man pages for more if you want to know what these commands do. Commands cannot take switches currently for security reasons. E.g. if you want to give access to "ls -la" listing, put a file "ls-la.txt" available in the directory, user can get it with "send ls-la.txt"

ls     -- list directory
file   -- print file type information.
what   -- prints all @(#) tags from files
ident  -- print all $ $ tags from files

Install: file `help'

Users want to get a help file with message "send help" and the help is just a file in your server directory. Be sure to supply it prior to any other files. You can always draw a link to a file if you don't want to name it that way (e.g. if you keep several server help files in a RCS tree)

# draw symlink to `help'
% ln -s $HOME/txt/srv-public-hlp.txt $HOME/server/help

Basic usage in details

The server accepts command in format

"send <ITEM> [CMD|PASSWORD]"

Where ITEM can be any name as long as it starts with [^ .]. The regexp says: Anything goes as long as FILE does not start with space or period. This gives you quite a lot of freedom to construct filenames. If you want to hand out the file:

.procmailrc

You can't. Instead make a link that points to plain "procmailrc" without the leading period. There are also additional checks against the possible security threat "../" like below; the user can't request such a file.

../../../gotcha or dir/../../gotcha

The filename cannot contain special characters like [*?<>{}()].
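
Taken together, the filename rules above amount to a check like the following Python sketch. It is not the server's code (that lives in pm-jasrv-check.rc); the function name and the wording of the returned reasons are invented for illustration.

```python
import re

# Characters the server refuses anywhere in a filename.
SPECIAL_CHARS = re.compile(r"[*?<>{}()]")

def check_item(item: str) -> str:
    """Return "ok" or a brief reason why the requested ITEM is refused."""
    # ITEM must start with [^ .]: no leading space or period.
    if not item or item[0] in " .":
        return "starts with space or period"
    # Refuse any parent directory component such as dir/../../gotcha.
    if ".." in item.split("/"):
        return "contains ../"
    if SPECIAL_CHARS.search(item):
        return "contains special characters"
    return "ok"

results = {name: check_item(name) for name in
           ["help", ".procmailrc", "dir/../../gotcha", "file*.txt"]}
```

Note that "../../../gotcha" is already refused by the first rule, because it starts with a period.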

Advanced usage

[conversions]

If some of your files are big, it makes sense to send them in compressed base64 format; which in MIME world is called content-type gzip. You can set a regexp to enforce encoding for your big files before they are sent to user. The following setting will send all text files in compressed format to user.

JA_SRV_XGZIP_REGEXP = "\.txt"

When the message is composed a header is inserted into the message telling how the message is to be decoded, in case user doesn't have decent MUA that can handle the MIME type:

X-comment: To decode, cat msg| mmencode -u| gzip -d > test.txt

[noconv and gz]

The WORD1 parameter after the FILE is optional and user can override base64 encoding and request plain file if he uses word "noconv".

Subject: send <FILE> [noconv|gzip]

However, there are files where noconv must not be obeyed, like the compressed packages that you have put available in .zip, .gz, tar.gz or .tgz (GNU tar) format. The following variable controls when a file is always sent as base64:

JA_SRV_BASE64_ALWAYS

If the WORD1 is "gz" or "gzip", then gzip is explicitly requested. This may be desirable, because some of the text files in the server directory may be big and some accounts don't accept big messages. A typical bounce looks like:

552 foo@site.com... Message is too large; 100000 bytes max
554 foo@site.com Service unavailable

These kind of file server bounce messages are handled in separate module which notifies the user that his account didn't accept the sent file.

[case sensitivity]

By default the request word ("send") and ITEM (filename) are not case sensitive, unless you set these flags:

JA_SRV_F_CMD_CASE_SENSITIVE  = "yes"
JA_SRV_F_FILE_CASE_SENSITIVE = "yes"

If values are "no", then these are identical commands:

Subject: Send Help
Subject: SEND HELP

Multi part mime messages

If you want to deliver big files, you had better not send them as one big message. That blocks the connection between every host along the path over which the big file is transferred. The solution is to use MIME multiparts that can be assembled back in the receiving MUA. (In case you don't have a multipart assembler, get a Perl script to do it.)

MIME multiparts are sent out if

Filename matches JA_SRV_BASE64_ALWAYS, typically tar.gz, zip
Filesize is bigger than JA_SRV_MULTIPART_THRESHOLD, where default chunk size is 100K.

When a file meets these criteria, it is read to the BODY of the message and base64 encoded. This all happens in memory, so watch the procmail logs for problems with very big files (>30Meg). Next, if the base64 conversion succeeded, the composed message is handed to

JA_SRV_MIME_MULTI_SEND

Which does the actual delivery and splitting. The default program used is splitmail. Make sure you have it or substitute the program with some equivalent one.
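
The encode-then-split flow can be pictured with the Python sketch below. It does not reproduce what splitmail does (real multipart messages also need their own MIME headers); the function name, the sample data and the chunk size of 500 are assumptions chosen to keep the example small.

```python
import base64
from typing import List

def encode_and_split(data: bytes, chunk_size: int) -> List[str]:
    """Base64 encode data in memory and cut the text into chunk_size pieces."""
    encoded = base64.b64encode(data).decode("ascii")
    return [encoded[i:i + chunk_size]
            for i in range(0, len(encoded), chunk_size)]

payload = bytes(range(256)) * 4        # 1024 bytes of sample data
parts = encode_and_split(payload, 500)

# The receiver must join all parts before decoding.
restored = base64.b64decode("".join(parts))
```

Because the whole encoded text is held in memory before splitting, memory use grows with file size, which is the reason for the warning about very big files.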

Stopping server

Sometimes you're making rearrangements in your file directory or doing some other maintenance and you are unable to respond to send requests. You can stop the server by setting

JA_SRV_IN_USE = "no"

And when you want to enable the server again; just comment out the statement or assign yes. [The default is yes]. When this variable is set to no, the server sends a message from following variable as a response to any "send" request.

JA_SRV_IN_USE_NO_MSG

Using password to validate file requests

You should be aware that this file server's implementation is public in nature. Anyone who asks for a file is allowed to get it. But it would be good if you could limit the access to documents with some simple way, like if you set up two file servers (see next chapter) where one is public and the other is interesting only to group of people. You can define a string that must be found in Subject field by setting the following variable

JA_SRV_PASSWORD = ".*" # default

The default value will match anything in the subject, thus making the server public. But if you set it like this

JA_SRV_PASSWORD = ".*123"

Then string "123" must be there somewhere in the line, like here

Subject: send <FILE> 123

Yes, "123" is actually a CMD definition, but it doesn't matter because there is no CMD 123. Subject now matches password and the server can be accessed. Of course the following is valid too.

Subject: send <FILE> noconv 123

If the password was wrong, server won't tell it. The message just lands to your mailbox in that case and you can investigate who tried to access the restricted server.
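
The effect of the password regexp can be tried out with a few lines of Python. This is only an approximation: procmail's regexp dialect differs from Python's `re` module, although the two simple patterns used here behave the same in both. The function name is hypothetical.

```python
import re

def password_ok(subject: str, password_regexp: str = ".*") -> bool:
    """True if the Subject text matches the password regexp."""
    return re.search(password_regexp, subject) is not None

public = password_ok("send help")                      # default: anyone
granted = password_ok("send help noconv 123", ".*123")
denied = password_ok("send help", ".*123")
```

Since the password is only a substring match on the Subject line, it keeps casual requests out but is not real access control.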

Changing server's command string (multiple servers)

The default command string is "send", but you can change it and thus create multiple services. Here is one example, where you have set up two file servers where each has its own directory.

# The public server

JA_SRV_CMD_STRING = "send"
JA_SRV_FILE_DIR   = $HOME/server/public
INCLUDERC         = $HOME/procmail/pm-jasrv.rc

# Company server, only interests fellow workers.
# Here "xyz-send" is just magic server request string.
# Notice case sensitivity settings.

JA_SRV_F_CMD_CASE_SENSITIVE = "yes"
JA_SRV_CMD_STRING           = "xyz-send"
JA_SRV_PASSWORD             = ".*12qw"
JA_SRV_FILE_DIR             = $HOME/server/public/xyz-dir
INCLUDERC                   = $HOME/procmail/pm-jasrv.rc

Notes from the author

[basic Mime type note]

All basic files that you send must be US-ASCII, 7bit. At least that is the default MIME type used. See JA_SRV_CONTENT_TYPE. I once received following message back

----- Transcript of session follows -----
554 foo@bar... Cannot send 8-bit data to 7-bit destination
501 foo@bar... Data format error

because in the previous releases, the MIME type headers were not in the message saying that the content really was plain 7bit ascii.

[Sending the file as is]

Note, that the file is included "as is" without any extra start-of-file or end-of-file tags. This is possible, because the file is sent in MIME format.

[Using one line log entry]

It may look very spartan to print a single line log entry. You see messages like the ones below in the file server log. Using a one line entry instead of multi line announcements makes it possible to write a small perl tool to parse information from a single line. If you get many file server messages per day, it is quicker to look at the single line entries too.

[ja-srv1; sh file; Foo Bar foo@site.com;]
[ja-srv1; send xxx-file.txt; Foo Bar foo@site.com;]
          |
          Server's request keywords (you may have multiple servers)
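As a rough illustration of how such single line entries can be picked apart, here is a shell/awk sketch (not the perl tool mentioned above). The function name parse_srv_log is made up for this example, and the "[server; request; sender;]" field layout is assumed from the two sample lines only.

```shell
# Sketch only: split "[server; request; sender;]" log lines into
# tab-separated fields. parse_srv_log is a hypothetical helper name.
parse_srv_log() {
    awk -F';' '
    /^\[.*\]$/ {
        server = $1; request = $2; sender = $3
        sub(/^\[/, "", server)      # drop the opening bracket
        sub(/^ +/, "", request)     # trim leading blanks
        sub(/^ +/, "", sender)
        printf "%s\t%s\t%s\n", server, request, sender
    }' "$@"
}
```

Reading from a log file and piping the result through "cut -f2 | sort | uniq -c" would then give a quick count of the requests seen.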

[wish list]

(*) MIME multipart message's mime headers may need some adjustments.

(*) I rely on simple regexp to send out base64 or gzip files. The natural extension would be to use file size threshold: if file is bigger than N bytes, send it out with gzip. And further: if file is more than NN bytes, send it out as multi part MIME.

(*) In fact there are slight MIME type errors: .zip files should be sent as application/zip. If you have experience with MIME types, please contact me and help me sort out the proper MIME headers.
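The size threshold idea from the wish list could look roughly like the following shell sketch. Nothing like this exists in the file server; the function name pick_transfer and both default limits are invented purely for illustration.

```shell
# Sketch only: choose a transfer method from the file size.
# pick_transfer and both limits are hypothetical, not module settings.
pick_transfer() {
    file=$1
    gzip_limit=${2:-50000}          # "N bytes" in the wish list
    multipart_limit=${3:-500000}    # "NN bytes" in the wish list
    size=$(( $(wc -c < "$file") ))  # arithmetic strips wc padding
    if [ "$size" -gt "$multipart_limit" ]; then
        echo "multipart"
    elif [ "$size" -gt "$gzip_limit" ]; then
        echo "gzip"
    else
        echo "plain"
    fi
}
```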

Required settings

PMSRC must point to the source directory of procmail code. This subroutine will include many pm-jasrv-*.rc modules and other files from there.

Please test the File Server in your environment before you start using it every day. For example, I had a weird local problem where PATH included /usr/contrib/bin/, where gzip was supposed to be, but despite my attempts procmail didn't find it along the path. Don't ask why. I now use the absolute binary name:

GZIP = /usr/contrib/bin/gzip

In addition, if your messages are not sent to the recipient and you get this daemon message instead:

... Recipient names must be specified

That's because you have the setting SENDMAIL="sendmail", which is not enough. It must be

SENDMAIL = "sendmail -oi -t"

Usage example

This is my .procmailrc installation. Notice that the file server code is used only if you get a "send" request. On the other hand, this double wrapping is not at all necessary; you could just as well rely on the File Server's capability to detect the SEND request.

PMSRC          = $HOME/pm   # directory where the procmail rc files are
RC_FSRV        = $PMSRC/pm-jasrv.rc

mySavedLOGFILE = $LOGFILE   # record file server actions elsewhere
LOGFILE        = $PMSRC/pm-jasrv.log

# Listen "send" requests.

:0
* ^Subject: +send\>
{
    JA_SRV_FILE_DIR = $HOME/fsrv        # Where to get the files
    JA_SRV_LOG      = $HOME/fsrv.log    # Write log here
    INCLUDERC       = $RC_FSRV          # Use file server now
}

LOGFILE = $mySavedLOGFILE

Pm-jasrv-req.rc – server request check, subroutine for File Server

File id

Description

This subroutine is part of the MPFS file server. It checks whether a file server request is in JA_SRV_SUBJECT, using either a case sensitive or a case insensitive match.

To dry-run this module, use the following skeleton. Substitute keywords as needed to reflect your system setup:

% procmail DEFAULT=/dev/null VERBOSE=on LOGABSTRACT=all \
    PMSRC=$HOME/txt JA_SRV_CMD_STRING=send \
    JA_SRV_SUBJECT="send newbie_article.rtf noconv" \
    txt/pm-jasrv-req.rc < ~/test.mail

Input

JA_SRV_F_CMD_CASE_SENSITIVE; if "yes" then server request is case sensitive.
JA_SRV_FORMAIL_FROM; the email From field.

Output

stat, set to "ok" if request is accepted

Pm-jasrv-send.rc – server request check, subroutine for File Server

File id

Description

This subroutine is part of the MPFS file server. It sends the requested file. You can dry-run this module with the following command: a) make sure that $HOME/test contains any simple email message; b) define FORMAIL if it is not found along the path.

% procmail DEFAULT=/dev/null VERBOSE=on LOGABSTRACT=all \
    PMSRC=$HOME/txt JA_SRV_LOG=/dev/null \
    FORMAIL=/opt/local/bin/formail \
    file=$HOME/test FILE=test WORD=WORD JA_SRV_FROM=foo@bar \
    SENDMAIL="tee -a $HOME/test.send" txt/pm-jasrv-send.rc < ~/test

Note:

The MIME headers previously used here were:

Content-type: application/octet-stream
Content-transfer-encoding: x-gzip64

But defining your own CTE such as x-gzip64 is strongly discouraged by the MIME RFCs. Most e-mail clients would be at a loss on how to handle these. Many would just bomb out and not even give you the opportunity to save it to a file. A more correct MIME type is this, which is now used:

Content-type: application/x-gzip
Content-transfer-encoding: base64

Input

FILE is the filename (chdir to directory is already done). `file' is the _absolute_ filename. `WORD' is the next word from the subject line after the FILE word.
JA_SRV_CMD_STRING is flag
JA_SRV_F_SUBJ_NOTIFY is flag

Output

FILE_ERROR is set to "yes" if file is not found.

Pm-jastore.rc – Store message to inbox or gzip inbox

File id

Description

This subroutine stores the message to the file pointed to by MBOX. It is meant to be used with the other general purpose includerc files. This makes it possible to have centralized file storage handling for all your rc files.

A regular user doesn't get much out of this rc unless he mixes both gz and regular files in his .procmailrc.

Repeat: This module is the basis for general purpose procmail rc plug-ins to store a message to the mailbox pointed to by some rc configuration variable. A normal user can simply say in his .procmailrc:

:0:
mail.private

Required settings

(none)

Call arguments (variables to set before calling)

MBOX must have been set to point to message storage.
MBOX_SUFFIX is extension added to MBOX. Default is none.
MBOX_MH if "yes" then deliver to MH mailbox with `MBOX_MH_CMD' which is "rcvstore" by default.

If MBOX_MH is "yes", the message is delivered to the MH mailbox using MBOX_MH_CMD; otherwise:

If MBOX is some.mbox the message is stored as is.
If MBOX is some.mbox.gz the message is gzipped to folder.
If MBOX is some-dir/. then deliver as individual files
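The selection rules above can be summarized with a small shell sketch. It only reports which storage method would apply and delivers nothing; mbox_kind is a hypothetical helper name and is not part of the module, whose real logic is written in procmail.

```shell
# Sketch only: report which storage method the rules above would pick.
# mbox_kind is a hypothetical helper; it does not deliver anything.
mbox_kind() {
    mbox=$1
    mh=${2:-no}                     # stands in for MBOX_MH
    if [ "$mh" = "yes" ]; then
        echo "mh"                   # hand over to MBOX_MH_CMD
        return
    fi
    case $mbox in
        */.)  echo "directory" ;;   # some-dir/.    individual files
        *.gz) echo "gzip" ;;        # some.mbox.gz  gzipped folder
        *)    echo "plain" ;;       # some.mbox     stored as is
    esac
}
```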

Example usage

RC_MBOX = $PMSRC/pm-jastore.rc

:0
* condition
{
    MBOX      = $HOME/Mail/spool/junk.mbox
    INCLUDERC = $RC_MBOX
}

Pm-jasubject.rc – Subject field cleaner and canonicalizer (Re:)

File id

Description

NOTE: If you receive RFC 2047 encoded Subject headers like "Subject: =?ISO-8859-1?Q?=C4hnlichkeiten_von_=DCbungen?=", you must first decode them before using this subroutine. Feed the message to pm-jamime-decode.rc first.

There are many different Email programs out there that add their own reply characters to the subject field. The saddest programs usually come from the PC platform. E.g. Microsoft has gained a lot of bad reputation due to its own standards.

MS Explorer can use localized reply strings, e.g. Vs: or vast: which seems to be Finnish Vastaus.
MS product Outlook (??) can be configured similarly. I have received Swedish Sv: -Svar for Svaring (eng: reply)
MS mail uses FW: in forwarded mails.
Intelligent MUAs try to keep count of replies with Re2: or Re[2]
Japanese MUA Denshin 8 Go V321.1b7 has sent Re^2:
Some MUAs use Re>
Lotus notes (in French version) uses Ref:
Some MS product sends UQ:
XXX uses -reply
Forwarding schemes: (fwd) [fwd] <fwd> fw: [FWD: [FWD:]]
Subject references: -subj subj- subj:

There already is a de facto standard where a message should contain only a single Re: if it has been replied to (no matter how many times). This makes it possible to do efficient message threading by using only the Subject and date fields. And grepping for the same subjects is a lot easier than from this horrible mess. Note that all text is on one line, the subject has been broken only for visual reasons:

Subject: re- Re^2: Re[32]: FW: Re: Re(15) Sv: Re[9]: -reply (fwd) [fwd] <fwd> fw: [FWD: [FWD:]] -subj subj: subj: subj- test

This recipe standardizes any subject (like above) that has been replied to, to the de facto format below. That is: "Any number of 'Re:' will be converted to a single 'Re:' and any number of 'Fwd:' will be converted to a single 'Fwd:'"

Subject: Re: test (fwd)

About In-Reply-To header

If there is an In-Reply-To header in the message but no Re: in the subject line, one is added automatically. Some broken mailers forget to add the Re: to the Subject line.

Variable JA_SUBJECT_SAVE

This is by default yes which causes the original subject to be saved under header field X-Old-Subject. If you don't want that extra header generated, set this variable to no

Variable JA_SUBJECT_FWD_KILL

This is by default yes, which will kill extra forwarding indication words like (fwd) [fwd] <fwd> <f>. If you set this to no, then all the forwarding words are preserved. The de facto forward format is:

Subject: This subject (fwd)

Code note

This subroutine's intention is to make the Subject more expressive by deleting redundant information. A simplistic approach has been taken where the Subject consists of a list of words, each of which is marked either ok or delete. No attempt has been made to determine the structure of the Subject. You can see the algorithm better from an example:

Re: New subject (was Re: Old subject)

That should be treated syntactically like "New subject", ignoring anything between the parentheses. This is however not respected and not even tried. The rule applied here is "One Re: is tolerated", so the subject won't change. It doesn't matter where "Re:" is.

But here the subject is changed. The rule applied is: Delete all unwanted words and then add one Re: to the beginning if the OLD content had any reply indications.

Re: New subject (was Re: Old subject)
--> Re: New subject (was Old subject)
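The word list idea can be shown with a much reduced shell/awk sketch. This is not the module's code: canon_subject is a hypothetical name, only a few reply markers (re:, re[2]:, re^2:, sv:, vs: and similar) are recognized, and forwarding words are left alone.

```shell
# Sketch only: mark each word "ok" or "delete", then put a single
# "Re:" in front if any reply marker was deleted.
# canon_subject is a hypothetical helper name.
canon_subject() {
    printf '%s\n' "$1" | awk '
    {
        out = ""
        replied = 0
        for (i = 1; i <= NF; i++) {
            word = tolower($i)      # match against lowercase words
            if (word ~ /^re[^a-z]+$/ || word ~ /^(sv|vs):$/) {
                replied = 1         # attribute: delete
                continue
            }
            out = (out == "" ? $i : out " " $i)   # attribute: ok
        }
        if (replied)
            out = "Re: " out
        print out
    }'
}
```

Because the words are lowercased before matching, the patterns in this sketch are written in lowercase as well.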

IMPORTANT notice

Please check that the SHELL variable setting in your ~/.procmailrc is an sh derivative, /bin/sh or /bin/bash. This module won't work with other shells.

Awk usage note

awk is small, effective and much lighter than perl for little tasks. See the verbose log and make sure your awk understands the VAR="value" passing syntax. Change it to nawk or gawk if they work better than your standard awk.

AWK = "gawk" # you may need this, try also nawk

Customizations

Let's say Polish MS Outlook uses ODP: instead of the standard re: and you want to handle that too. Then set:

JA_SUBJECT_KILL = "odp:" # NOTE: all lowercase
JA_SUBJECT_SAVE = "no"
INCLUDERC       = $PMSRC/pm-jasubject.rc

You can use JA_SUBJECT_KILL to delete any additional words from the subject line. E.g. if you have a good news-reader, you don't need the prefixes that some mailing lists add to the beginning

Subject: [LIST-xxx] the subject here

To remove that list prefix, you simply match it

JA_SUBJECT_KILL = "(list-xxx|list-yyy)"

Important: The regexp must be all lowercase, because when match happens, the words have been converted to lowercase.

Example usage

You need nothing special, just include this recipe before you save message to folder.

INCLUDERC = $PMSRC/pm-jasubject.rc

Debugging

You can dry-run this module with the following command and watch the output. Substitute variables as they are in your system. Feed it the content of an entire example mail that contains the Subject needing correction.

% procmail SHELL=/bin/sh AWK=gawk VERBOSE=on LOGABSTRACT=all \
    DEFAULT=/dev/null LOGFILE=$(tty) \
    JA_SUBJECT_KILL="(ace-users)" \
    PMSRC=/path/to/install/dir \
    /path/to/pm-jasubject.rc \
    < ~/test.mail

Thank you

Thanks to Tony.Lam@Eng.Sun.Com for his creative improvement suggestions and sending code that this recipe didn't catch at first.

Pm-jatime.rc – "hh:mm:ss" time parser from variable INPUT

File id

Description

This includerc parses the time from variable INPUT, which contains a string of the form

"hh:mm:ss"

Example input

"Thu, 13 Nov 1997 11:43:23 +0200"

Returned values

hh = 2 digits
mm = 2 digits
ss = 2 digits

Variable ERROR is set to "yes" if it couldn't recognize the INPUT and couldn't parse all hh, mm, ss variables.

Required settings

PMSRC must point to source directory of procmail code. This subroutine will include pm-javar.rc from there.

Call arguments (variables to set before calling)

INPUT = string-to-parse

The INPUT can be anything as long as it contains NN:NN:NN
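To get a feel for what is extracted, the same idea can be tried outside procmail with this shell sketch. It is not how the module works internally (the module uses procmail's own matching); parse_time is a hypothetical helper that prints "hh mm ss" and fails when no NN:NN:NN is found, much like ERROR being set to "yes".

```shell
# Sketch only: print "hh mm ss" for an NN:NN:NN found in $1.
# Returns non-zero if the string contains no such time.
parse_time() {
    printf '%s\n' "$1" |
        sed -n 's/.*\([0-9][0-9]\):\([0-9][0-9]\):\([0-9][0-9]\).*/\1 \2 \3/p' |
        grep .
}
```

For the example input above, parse_time "Thu, 13 Nov 1997 11:43:23 +0200" prints "11 43 23".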

Usage example

Get the time of received message. The From_ header will always have the standard time stamp.

PMSRC        = $HOME/pm
RC_DATE_TIME = $PMSRC/pm-jatime.rc

:0 c
* ^From +\/.*
{
    INPUT = $MATCH

    # Turn off the logging while executing this part

    VERBOSE=off
    INCLUDERC = $RC_DATE_TIME
    VERBOSE=on

    :0
    * ERROR ?? yes
    {
        # Should not ever happen, you have broken From_
    }
}

Pm-jaube1.rc – Jari's UBE filter. Subroutine 1

File id

Documentation

This file is part of "pm-jaube.rc". This subroutine is called when a likely UBE message has been detected.

Required settings

PMSRC must point to source directory of procmail code. This recipe file will include

pm-jastore.rc

Pm-jaube-keywords.rc – Bare bones word list based UBE filter

File id

Warning

Put all your UBE (aka spam) filters towards the end of your ~/.procmailrc. The idea is that valid messages are filed first (mailing lists, your work and private mail, bounces) and only the uncategorized messages are checked.

Now, if 50-70 % hit rate is good enough for your starting point, go ahead and read more. This file is supposed to be the last resort, if you really do not have any better tool to analyze messages.

Overview of features

Word and phrase based matching
50-70 % success rate. 90 % never achieved. That's a guarantee.
Extremely fast and a dream to CPU resources. Implemented in pure procmail and you can almost hear the humming sound of its regular expression engine shredding UBE messages to pieces.

Description

Are you sure you want to use this word list based checking?

Think twice before you use this subroutine. It knows nothing about the content of your mail. "It's all UBE unless proven otherwise" is the motto. The brutal search tracks words and phrases to find an indication of mass posting and traces of Unsolicited Bulk Email (UBE aka spam). Repeat: Read the first paragraph again before you consider putting this file into action. This filter WILL PASS through unwanted mail and it WILL catch good mail. This is rule based matching, so I suppose you know where you're putting your head with this. Ahem. Alerted? Good.

The Story

There was a man and a mail account. The account had limited space; he couldn't install any other programs because the disk quota would have been exceeded. System administrators weren't interested in installing anything. The mail server ran behind a firewall and had an OS that was never heard of - it couldn't run other programs. Or if it could, the Bad system administrator was too scared to install extra programs on the host where the MTA ran. No joy – no means to stop incoming UBE – Right?

Wrong. There was procmail. The Bad system administrator didn't mention that ~/.procmailrc was honored - just that the external programs were a no-no-no (Technical: the MDA host mounted user disks; the server ran on a separate host and couldn't use any of the user compiled programs. Statically linked ones filled up the man's disk space).

First line of defense, any defense would do. So, this rule based file was born. Nothing else was installed in that account and the happy word list based matching routine kept chewing mail, mail, mail. And the system administrator was happy - he nurtured the MTA host's CPU resources and noticed nothing alarming. All ticked like clockwork.

Life began again. After 1000 mail bombards a day, the account was usable again.

Motivation

If you can, use the Bayesian filters and forget all rule based ones, word and phrase matching based ones; all static filters. On the other hand, if you want a quick solution, even an imperfect one, until you have time to learn and set up other tools, this subroutine may be of interest.

The best part. You can carry this single file anywhere where procmail lives. No other files are needed. Setup couldn't be simpler.

About bouncing message back

The general consensus is, that you should not send bounces. The UBE sender is not there, because the address is usually forged. Do not increase the network traffic. Instead save the messages to folders and periodically check their contents. It's not nice to be forced to apologize about bounces to a wrong destination.

Code Note

Procmail is picky about whitespace in continued lines; make sure there is not a single space left after the continuation backslash. Use a good editor or an external program to get rid of the whitespace. In Emacs you would add this line to your ~/.emacs startup file: "(add-hook 'write-file-hooks 'delete-trailing-whitespace)"

:0
* ^Subject:.*(regexp\
    |and-more\
    |and-more\
    )
{
    # Process it
}
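If you do not use an editor hook, a quick check from the shell can point out the offending lines. The following is only a sketch; find_trailing_ws is a made-up helper name that lists every line where whitespace follows a final backslash.

```shell
# Sketch only: list lines (with line numbers) that have blanks or
# tabs after a continuation backslash. $1 is the rc file to check.
find_trailing_ws() {
    grep -n '\\[[:space:]][[:space:]]*$' "$1"
}
```

The command prints nothing and returns non-zero when the file is clean.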

Why are the regexps put into this file and not into a separate regexp file? Good question. It is possible to check a message's content with an external process, like grep, to see if any matches are found. This kind of methodology is covered in Procmail Tips section "Using grep with file lists to match messages" at <http://pm-tips.sf.net>. The reason why all the regexps are maintained inside this file is:

Simplicity. One file - no extra configuration files or regexp databases.
Self standing. Does not call external processes, so it's a little faster than possible grep and fgrep solution.

Required settings

None. No dependencies to other procmail modules.

Call arguments (variables to set before calling)

JA_UBE_KEYWORD_HEADER, if set, then the results are put into the message's headers. By default this variable is not defined, to save an external formail process call. Suggestion: "X-Spam-JaubeKwd"; without trailing colon.

Return values

ERROR_STATUS is set to the word "Bad", otherwise empty.
ERROR is set to a short descriptive word that indicates which rule was matched. Values: Header-FromKeywords, Header-SubjectKeywords and Body-Keywords
ERROR_MATCH is set to some words that happened to trigger UBE catch rule.

Usage example

PMSRC = "/path/to/procmail/lib"

# Exclude these addresses from tests

VALID_FROM = "(my@address.example.com|word@here.example.com)"

:0
*$ ! ^From:.*$VALID_FROM
*  ! ^FROM_DAEMON
{
    INCLUDERC = $PMSRC/pm-jaube-keywords.rc

    # Variable "ERROR" is set if message was UBE

    :0 :
    * ! ERROR ?? ^^^^
    junk.ube.spool
}

File layout

The layout of this file is managed by Emacs packages tinyprocmail.el and tinytab.el for the 4 tab text placement. See project http://freecode.net/projects/emacs-tiny-tools/

Pm-jaube-prg-spamprobe.rc – Interface to Annoyance Filter program

File id

Warning

Put all your Unsolicited Bulk Email (aka spam) filters towards the end of your ~/.procmailrc. The idea is that valid messages are filed first (mailing lists, your work and private mail, bounces) and only the uncategorized messages are checked last.

YOU CANNOT USE THIS PROCMAIL SUBROUTINE UNLESS YOU HAVE TRAINED THE BAYESIAN PROGRAM FIRST!

To train:

$ mkdir $HOME/.annoyance
$ DB=$HOME/.annoyance/dict.bin; DB2=$HOME/.annoyance/fdict.bin
$ annoyance-filter --mail single.msg --prune --write $DB
$ annoyance-filter --phrasemax 2 \
    --read $DB \
    --junk dir/to/bad/messages \
    --prune --write $DB
$ annoyance-filter -v --read $DB --prune --fwrite $DB2

To check message:

$ annoyance-filter --read $DB --test mail.msg
$ annoyance-filter --fread $DB2 -v --class mail.msg

Overview of features

Implements interface to http://sourceforge.net/projects/annoyancefilter/ project. See article "Training Annoyance Filter to combat spam" by Corrado Cau at http://www.newsforge.com/software/03/10/24/2046238.shtml?tid=74
variable ERROR is set if the message was UBE.
Results are available by default in header X-Spam-Annoyance-Status.

Description

There are several Bayesian statistical analysis programs that study the message's tokens and then classify it into one of two categories: good or bad, or if you like, ham and spam. The Bayesian programs are not all the same, so if you want to achieve the magic 99.99% probability, the only way to do that is to chain several programs serially. No single program can solve UBE detection. This procmail subroutine implements a call interface to the program annoyance-filter, which must already be installed.

About bouncing message back

The general consensus is that you should not send bounces. The UBE sender is not there, because the address is forged. You will not do any good to anybody by bouncing messages; you just increase mail traffic even more. Instead, save the messages to folders and periodically check their contents.

Required settings

If annoyance-filter program is available, define this variable in your ~/.procmailrc. Use absolute path to make the external shell quick; it'll save server load considerably.

JA_UBE_ANNOYANCE_PRG = /usr/bin/annoyance-filter

If you do not have the program installed, do not leave the variable lying around, because it will keep this subroutine active. Calling a non-existing program is not a good idea, so it is better to empty the variable if the program is not available.

Required settings

None. No dependencies to other procmail modules.

Call arguments (variables to set before calling)

JA_UBE_ANNOYANCE_PRG, path to the program [required].
JA_UBE_ANNOYANCE_SPAM_DB, path to the dictionary database [required]. E.g. $HOME/.annoyance/dict.db.
JA_UBE_ANNOYANCE_SPAM_DB_OPT, type of dictionary to read. Default is "--read", but this could be fast dictionary option "--fread".
JA_UBE_ANNOYANCE_HEADER, the header name where the results are put. If not defined, no header is added. Defaults to X-Spam-Annoyance-Status
JA_UBE_ANNOYANCE_FORCE, if set to yes then call program no matter what. Normally if there already is X-Spam-Annoyance-Status header, it is assumed that the message has already been checked and no new checking is needed.
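The "already checked" rule behind the FORCE variable can be pictured with the following shell sketch. The module itself decides this with procmail conditions; needs_check is a hypothetical helper written only to illustrate the logic.

```shell
# Sketch only: decide whether a message still needs to be classified.
# needs_check is a hypothetical helper; $1 = message file,
# $2 = header name without colon, $3 = force flag ("yes" or empty).
needs_check() {
    [ "$3" = "yes" ] && return 0
    # Inspect only the header block, i.e. everything up to the
    # first empty line.
    if sed '/^$/q' "$1" | grep -i "^$2:" >/dev/null; then
        return 1        # header present: already checked
    fi
    return 0
}
```

A header name that appears only in the message body does not count, because the search stops at the first empty line.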

Return values

ERROR, is set to the return value of the program.

Usage example

PMSRC = $HOME/procmail # procmail recipe dir

<other checks, mailing lists, work mail etc.>

JA_UBE_ANNOYANCE_PRG     = "/usr/bin/nice -n 5 /usr/bin/annoyance-filter"
JA_UBE_ANNOYANCE_SPAM_DB = $HOME/.annoyance/dict.db
INCLUDERC                = $PMSRC/pm-jaube-prg-spamprobe.rc

# The ERROR will contain word "yes" if message was spam

:0 :
* ERROR ?? yes
junk.mbox

File layout

The layout of this file is managed by Emacs packages tinyprocmail.el and tinytab.el for the 4 tab text placement. See http://freecode.net/projects/emacs-tiny-tools/

Pm-jaube-prg-bmf.rc – Interface to Bayesian Mail Filter program

File id

Warning

YOU CANNOT USE THIS PROCMAIL SUBROUTINE UNLESS YOU HAVE TRAINED THE BAYESIAN PROGRAM FIRST!

To train:

$ ls spam/*.mail | xargs -n 1 bmf -s # feed individual messages
$ ls good/*.mail | xargs -n 1 bmf -n # feed individual messages

To test:

$ bmf -p < test.mail | less

Overview of features

Implements interface to http://www.sf.net/projects/bmf "Bayesian Mail Filter" project. The called binary is "bmf", hence the name of this subroutine. The bmf program uses well-known statistical analysis, which is much more reliable than any hand made procmail scripts could ever achieve.
Variable ERROR is set if the message was UBE.
Results are available in headers X-Spam-bmf-Status and X-Spam-bmf-Flag for further analysis.

Description

There are several Bayesian statistical analysis programs that study the message's tokens and then classify it into one of two categories: good or bad, or if you like, ham and spam. The Bayesian programs are not all the same, so if you want to achieve the magic 99.99% probability, the only way to do that is to chain several programs serially. No single program can solve UBE detection.

For a serious discussion of the strengths of the different programs, refer to a very good article "Spam Filters" by Sam Holden, 2004-08-16 <http://freecode.net/articles/view/964>. The article thoroughly evaluated the following programs:

Bayesian Mail Filter (bayesian)
Bogofilter (bayesian)
dbacl (bayesian; multiple wordlists)
Quick Spam Filter (bayesian)
SpamAssassin (perl matching + bayesian)
SpamProbe (bayesian)
SPASTIC (procmail recipes)

This subroutine implements a call interface to the bmf program. Why would you need it? Because unfortunately bmf by default uses exactly the same headers as spamassassin and the two cannot co-operate: bmf would overwrite existing spamassassin headers. This subroutine takes care of saving previous headers and moving the bmf results to their own X-Spam-bmf-* headers.

About bouncing message back

Required settings

If bmf program is available, define this variable in your ~/.procmailrc. Use absolute path to make the external shell quick; it'll save server load considerably.

JA_UBE_BMF_PRG = "/usr/bin/bmf"

Required settings

None. No dependencies to other procmail modules.

Call arguments (variables to set before calling)

JA_UBE_BMF_PRG, path to program
JA_UBE_BMF_HEADER_PREFIX, the header name where the results are put. If not defined, no headers are added. Default value is X-Spam-bmf.
JA_UBE_BMF_FORCE, if set to yes then call program no matter what. Normally if there already are X-Spam-bmf-* headers, it is assumed that the message has already been checked and no new checking is needed.

Return values

ERROR, is set to a short reason why the UBE recipe triggered. It contains the content of the X-Spam-bmf-Status header, which you can check for values.
ERROR_MATCH contains detailed content of X-Spam-bmf-Status header.

If headers were enabled, they will contain:

X-Spam-bmf-Status: Yes, hits=1.000000 required=0.900000, tests=bmf
X-Spam-bmf-Flag: YES

Usage example

PMSRC = $HOME/procmail # procmail recipe dir

<other checks, mailing lists, work mail etc.>

JA_UBE_BMF_PRG = "/usr/bin/nice -n 5 /usr/bin/bmf"
INCLUDERC      = $PMSRC/pm-jaube-prg-bmf.rc

# The ERROR will contain word "yes" if the program classified
# the message into "bad" category.

:0 :
* ERROR ?? yes
junk.mbox

File layout

The layout of this file is managed by Emacs packages tinyprocmail.el and tinytab.el for the 4 tab text placement. See project http://freecode.net/projects/emacs-tiny-tools/

Pm-jaube-prg-bogofilter – Interface to bogofilter program

File id

Warning

YOU CANNOT USE THIS PROCMAIL SUBROUTINE UNLESS YOU HAVE TRAINED THE BAYESIAN PROGRAM FIRST!

To train:

$ rm -f ~/.bogofilter/*.db # delete database
$ bogofilter -B -n good.msg ...
$ bogofilter -B -s spam.msg ...

Overview of features

Implements interface to http://www.sf.net/projects/bogofilter project.
variable ERROR is set if message was likely spam.
Results are available by default in header X-Spam-Bogofilter-Status.

Description

There are several Bayesian statistical analysis programs that study the message's tokens and then classify it into one of two categories: good or bad, or if you like, ham and spam. The Bayesian programs are not all the same, so if you want to achieve the magic 99.99% probability, the only way to do that is to chain several programs serially. No single program can solve UBE detection. This procmail subroutine implements a call interface to the program bogofilter, which must already be installed.

About bouncing message back

Required settings

If bogofilter program is available, define this variable in your ~/.procmailrc. Use absolute path to make the external shell quick; it'll save server load considerably.

JA_UBE_BOGOFILTER_PRG = /usr/bin/bogofilter

Required settings

None. No dependencies to other procmail modules.

Call arguments (variables to set before calling)

JA_UBE_BOGOFILTER_PRG, path to the program
JA_UBE_BOGOFILTER_HEADER_NEW, the header name where the results are put. If not defined, no header is added. Defaults to X-Spam-Bogofilter-Status
JA_UBE_BOGOFILTER_FORCE, if set to yes then call program no matter what. Normally if there already is X-Spam-* header, it is assumed that the message has already been checked and no new checking is needed.

Return values

ERROR, is set to the return value of program if message was spam.
ERROR_INFO, is set if case is "unsure".

Usage example

PMSRC = $HOME/procmail # procmail recipe dir

<other checks, mailing lists, work mail etc.>

JA_UBE_BOGOFILTER_PRG = "/usr/bin/nice -n 5 /usr/bin/bogofilter"
INCLUDERC             = $PMSRC/pm-jaube-prg-bogofilter.rc

# The ERROR will contain the reason if the program classified
# the message into "bad" category.

:0 :
* ! ERROR ?? ^^^^
junk.mbox

File layout

The layout of this file is managed by Emacs packages tinyprocmail.el and tinytab.el for the 4 tab text placement. See project http://freecode.net/projects/emacs-tiny-tools/

Pm-jaube-prg-bsfilter.rc – Interface to Bsfilter program

File id

Warning

YOU CANNOT USE THIS PROCMAIL SUBROUTINE UNLESS YOU HAVE TRAINED THE BAYESIAN PROGRAM FIRST!

To train:

$ bsfilter --add-clean good.msg ...
$ bsfilter --add-spam spam.msg ...

Overview of features

Implements interface to project http://packages.debian.org/testing/mail/bsfilter
variable ERROR is set if the message was UBE.
Results are available by default in header X-Spam-Bsfilter-Status.

Description

About bouncing message back

Required settings

If bsfilter program is available, define this variable in your ~/.procmailrc. Use absolute path to make the external shell quick; it'll save server load considerably.

JA_UBE_BSFILTER_PRG = /usr/bin/bsfilter

Required settings

None. No dependencies to other procmail modules.

Call arguments (variables to set before calling)

JA_UBE_BSFILTER_PRG, path to the program.
JA_UBE_BSFILTER_HEADER, the header prefix name where the results are put. If not defined, no header is added. Defaults to X-Spam-Bsfilter-
JA_UBE_BSFILTER_FORCE, if set to yes then call program no matter what. Normally if there already is X-Spam-Bsfilter- header, it is assumed that the message has already been checked and no new checking is needed.

Return values

ERROR, is set to the return value of bsfilter program.

Usage example

PMSRC = $HOME/procmail # procmail recipe dir

<other checks, mailing lists, work mail etc.>

JA_UBE_BSFILTER_PRG = "/usr/bin/nice -n 5 /usr/bin/bsfilter"
INCLUDERC           = $PMSRC/pm-jaube-prg-bsfilter.rc

# The ERROR will contain word "yes" if message was spam

:0 :
* ERROR ?? yes
junk.mbox

File layout

The layout of this file is managed by Emacs packages tinyprocmail.el and tinytab.el for the 4 tab text placement. See project http://freecode.net/projects/emacs-tiny-tools/

Pm-jaube-prg-ifile – Interface to ifile program

File id

Warning

YOU CANNOT USE THIS PROCMAIL SUBROUTINE UNLESS YOU HAVE TRAINED THE BAYESIAN PROGRAM FIRST!

To train:

$ rm ~/.idata # delete database
$ echo herbalife | ifile -i spam # initialize database
$ ifile -h -i good good.msg ...
$ ifile -h -i spam spam.msg ...

Overview of features

Implements interface to http://freecode.net/projects/ifile project.
variable ERROR is set to the result of the ifile check. This usually holds the name of the "folder" that ifile was trained with. E.g. if Unsolicited Bulk Email was trained with "ifile -i spam", then the return value is "spam".
Results are available by default in header X-Spamifile.

Description

About bouncing message back

Required settings

If ifile program is available, define this variable in your ~/.procmailrc. Use absolute path to make the external shell quick; it'll save server load considerably.

JA_UBE_IFILE_PRG = /usr/bin/ifile

Required settings

None. No dependencies to other procmail modules.

Call arguments (variables to set before calling)

JA_UBE_IFILE_PRG, path to the program
JA_UBE_IFILE_HEADER, the header name where the results are put. If not defined, no header is added. Defaults to X-Spam-Ifile-Status
JA_UBE_IFILE_FORCE, if set to yes then call program no matter what. Normally if the header already exists, it is assumed that the message has already been checked and no new checking is needed.

Return values

ERROR, is set to the return value of ifile program.

If header output is enabled, it will contain the folder name ifile thinks the message belongs to. Assuming that trained folders used for messages were spam and good, then the headers read:

X-Spam-Ifile-Status: spam
X-Spam-Ifile-Status: good

Usage example

PMSRC = $HOME/procmail # procmail recipe dir

<other checks, mailing lists, work mail etc.>

JA_UBE_IFILE_PRG = "/usr/bin/nice -n 5 /usr/bin/ifile"
INCLUDERC        = $PMSRC/pm-jaube-prg-ifile.rc

# The ERROR will contain the reason if the program classified
# the message into "bad" category.

:0 :
* ! ERROR ?? ^^^^
junk.mbox

File layout

The layout of this file is managed by Emacs packages tinyprocmail.el and tinytab.el for the 4 tab text placement. See project http://freecode.net/projects/emacs-tiny-tools/

Pm-jaube-prg-runall.rc – Interface to all Bayesian filter programs

File id

Overview of features

To detect spam reliably, run all Bayesian programs one by one to see if any of them classifies the message as spam.
Programs supported: bogofilter, spamprobe, Bayesian Mail Filter, Annoyance Filter, Bsfilter, Spamoracle and Spamassassin.

Description

There are several Bayesian based statistical analysis programs that study the message's tokens and then classify it into two categories: good or bad, or if you like, ham and spam. This module is a meta package which calls all the individual modules that interface to these Bayesian programs. The use is simple: define the programs that are available in your system and which you have trained (Bayesian programs need to be trained before use), and this module will query how those programs would classify the message.

Required settings

PMSRC must point to source directory of procmail code. This subroutine will include

pm-javar.rc
pm-jaube-prg-spamprobe.rc
pm-jaube-prg-spamoracle.rc
pm-jaube-prg-annoyance-filter.rc
pm-jaube-prg-bsfilter.rc
pm-jaube-prg-bmf.rc

Call arguments (variables to set before calling)

To activate Bayesian program(s), define the path to them. The default value for all these variables is "", i.e. it is assumed that no programs have been installed or trained.

JA_UBE_BOGOFILTER_PRG, path to bogofilter program.
JA_UBE_SPAMPROBE_PRG, path to spamprobe program.
JA_UBE_BMF_PRG, path to Bayesian Mail Filter bmf program.
JA_UBE_SPAMASSASSIN_PRG, path to spamassassin program. If daemon version is available, set this to spamc program.
JA_UBE_SPAMORACLE_PRG, path to spamoracle program.
JA_UBE_ANNOYANCE_PRG, path to annoyance-filter program. You must also set JA_UBE_ANNOYANCE_SPAM_DB to fast dictionary database location.
JA_UBE_BSFILTER_PRG, path to bsfilter program.

Optional variables to set:

JA_UBE_BOGOFILTER_OPT. Default is "-p" passthrough. Option "-e" will report exit code to procmail.
JA_UBE_SPAMASSASSIN_OPT. Default is "".
JA_UBE_SPAMASSASSIN_MAX_SIZE. Default is 256000 (256k). Spamassassin is a Perl program, which is slow at startup, so checking e.g. long attachments consumes a lot of resources. Keep this value relatively small.
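
A possible configuration is sketched below, assuming that bogofilter and the spamc client are installed in /usr/bin (both paths are hypothetical). It lowers the Spamassassin size limit so that large messages skip the slow check.

```procmail
# Illustrative values only; both paths are assumptions.
JA_UBE_BOGOFILTER_PRG        = "/usr/bin/bogofilter"
JA_UBE_BOGOFILTER_OPT        = "-p"      # documented default, shown for clarity
JA_UBE_SPAMASSASSIN_PRG      = "/usr/bin/spamc"
JA_UBE_SPAMASSASSIN_MAX_SIZE = 100000    # bytes; smaller than the default

INCLUDERC = $PMSRC/pm-jaube-prg-runall.rc
```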

Important notes

All headers are canonicalized to X-Spam-<PROGRAM>- so e.g. in bogofilter's case, the default X-Bogocity header is renamed to X-Spam-Bogofilter-Status and so on. Summaries like the one below can then be generated:

$ egrep -i '(Subject|From|^X-Spam.*Status)' *.mbox

Return values

The first word of ERROR is the name of the program that classified the message as spam: bogofilter, bmf (Bayesian Mail Filter), spamassassin etc. It is followed by a semicolon ";" and the detailed return status from the program.
ERROR_INFO is set only in bogofilter's case if it thinks the message is neither spam nor ham ("Unsure").
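
Because ERROR starts with the program name, a recipe can pick that name out with procmail's "\/" match operator. The following is only a sketch: the variable PROGRAM and the folder unsure.mbox are hypothetical names, and the second recipe assumes that ERROR stays empty while ERROR_INFO is non-empty in bogofilter's "Unsure" case.

```procmail
# ERROR has the form "<program>; <details>" as described above.
# Capture everything before the first semicolon into MATCH.

:0
* ERROR ?? ^^\/[^;]+
{
    PROGRAM = $MATCH          # hypothetical helper variable

    LOG = "spam classified by: $PROGRAM
"

    :0 :
    junk.mbox
}

# Assumption: ERROR is empty and ERROR_INFO is non-empty
# when bogofilter reported "Unsure".

:0 :
* ERROR ?? ^^^^
* ERROR_INFO ?? .
unsure.mbox
```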

Usage example

    PMSRC = $HOME/procmail   # procmail recipe dir

    # ... other checks, mailing lists, work mail etc.

    # bogofilter and Bayesian Mail Filter available and trained. Use them.

    JA_UBE_BOGOFILTER_PRG = "/bin/nice -n 5 /bin/bogofilter"
    JA_UBE_BMF_PRG        = "/bin/nice -n 5 /bin/bmf"

    # Call the "umbrella" module, which will take care of
    # all the details.

    INCLUDERC = $PMSRC/pm-jaube-prg-runall.rc

    # ERROR is set if message was spam. The "()\/" logs reason.

    :0 :
    * ERROR ?? ^()\/.+
    junk.mbox

File layout

The layout of this file is managed by Emacs packages tinyprocmail.el and tinytab.el for the 4 tab text placement. See project http://freecode.net/projects/emacs-tiny-tools/

Pm-jaube-prg-spamassassin.rc – Interface to spamassassin program

File id

Warning

YOU CANNOT USE THIS PROCMAIL SUBROUTINE UNLESS YOU HAVE TRAINED THE BAYESIAN PROGRAM FIRST!

To train:

    $ rm -f ~/.spamassassin/bayes*
    $ sa-learn $opt --local --no-rebuild --ham good.msg ...
    $ sa-learn $opt --local --no-rebuild --spam spam.msg ...
    $ sa-learn --rebuild

Overview of features

Implements interface to http://www.spamassassin.org/ project.
Variable ERROR is set if the message was spam.
Results are available in default headers (X-Spam-*).

Description

There are several Bayesian based statistical analysis programs that study the message's tokens and then classify it into two categories: good or bad, or if you like, ham and spam. Not all Bayesian programs are the same, so if you want to achieve the magic 99.99% probability, the only way to do that is to chain several programs serially. There is no single program that can solve UBE detection. This procmail subroutine implements a call interface to the program spamassassin, which must already have been installed.

About bouncing message back

Required settings

If the spamassassin program is available, define this variable in your ~/.procmailrc. Use an absolute path so that the external call starts quickly; this reduces server load considerably.

JA_UBE_SPAMASSASSIN_PRG = /usr/bin/spamassassin

Required settings

None. No dependencies to other procmail modules.

Call arguments (variables to set before calling)

JA_UBE_SPAMASSASSIN_PRG, path to the program
JA_UBE_SPAMASSASSIN_MIN_SIZE, minimum message size. Default is 100 bytes.
JA_UBE_SPAMASSASSIN_MAX_SIZE, maximum message size. Default is 256000 bytes (about 256k).
JA_UBE_SPAMASSASSIN_FORCE, if set to "yes", call the program no matter what. Normally, if there already is an X-Spam-* header, it is assumed that the message has already been checked and no new check is needed.
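
The size limits can be pictured as a procmail size test. The following is only a sketch of the idea and is not taken from the module's code: procmail's "<" and ">" conditions compare the total message length in bytes. MIN and MAX are hypothetical helper variables that mirror the documented defaults.

```procmail
# Sketch only: how a size window can be expressed in procmail.

MIN = 100       # hypothetical helper variables
MAX = 256000

:0
*$ > $MIN
*$ < $MAX
{
    # The message is inside the window; an expensive
    # checker could be called at this point.
    LOG = "size within limits, message would be checked
"
}
```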

Return values

ERROR, is set to the return value of program if message was spam.
ERROR_INFO, is set if case is "unsure".

Usage example

    PMSRC = $HOME/procmail   # procmail recipe dir

    # ... other checks, mailing lists, work mail etc.

    JA_UBE_SPAMASSASSIN_PRG = "/usr/bin/nice -n 5 /usr/bin/spamassassin"
    INCLUDERC               = $PMSRC/pm-jaube-prg-spamassassin.rc

    # ERROR will contain the reason if the program classified
    # the message into the "bad" category.

    :0 :
    * ! ERROR ?? ^^^^
    junk.mbox

File layout

The layout of this file is managed by Emacs packages tinyprocmail.el and tinytab.el for the 4 tab text placement. See project http://freecode.net/projects/emacs-tiny-tools/

Pm-jaube-prg-spamoracle.rc – Interface to Spamoracle program

File id

Warning

YOU CANNOT USE THIS PROCMAIL SUBROUTINE UNLESS YOU HAVE TRAINED THE BAYESIAN PROGRAM FIRST!

To train:

    $ spamoracle add -v -spam spam.msg ...   # feed individual messages
    $ spamoracle add -v -good good.msg ...   # feed individual messages

To test

$ spamoracle test mail.msg | less

Overview of features

Implements interface to http://freecode.net/projects/spamoracle OCaml language based Bayesian Mail program.
Variable ERROR is set to "yes" if the message was UBE.
Results are available in headers X-Spam-Spamoracle-Status, X-Spam-Spamoracle-Score, X-Spam-Spamoracle-Details and X-Spam-Spamoracle-Attachment for further analysis.

Description

Using Spamoracle as the sole spam protection is inefficient, because version 1.4 (2004-09-29) does not accept messages from stdin. Because of this, the message has to be written to a temporary file before calling Spamoracle, and the temporary file must later be removed with rm. All three shell calls are needed for each message. If you have other detection programs, call them first to identify Unsolicited Bulk Email.

About bouncing message back

Required settings

If the spamoracle program is available, define this variable in your ~/.procmailrc. Use an absolute path so that the external call starts quickly; this reduces server load considerably.

JA_UBE_SPAMORACLE_PRG = /usr/bin/spamoracle

Required settings

None. No dependencies to other procmail modules.

Call arguments (variables to set before calling)

JA_UBE_SPAMORACLE_PRG, path to program
JA_UBE_SPAMORACLE_HEADER_PREFIX, the header name prefix where the results are put. If not defined, no headers are added. Default value is X-Spam-Spamoracle.
JA_UBE_SPAMORACLE_FORCE, if set to "yes", call the program no matter what. Normally, if there already are X-Spam-Spamoracle- headers, it is assumed that the message has already been checked and no new check is needed.
JA_UBE_SPAMORACLE_REGEXP, regexp to match for spam probability. The default value will match a probability of 0.8 with 5 interesting words. The match is tried against the X-Spam-Spamoracle-Score header.
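
A stricter, purely hypothetical regexp is sketched below. It assumes that the score is matched in the "probability -- degree" form that the X-Spam-Spamoracle-Score header uses (for example "1.00 -- 15"); the module's default value is not reproduced here.

```procmail
# Hypothetical stricter setting: probability 0.9 or higher
# and a similarity degree of 10 to 15.
# "[.]" is used instead of a backslash-escaped dot to avoid
# quoting surprises inside the double-quoted string.

JA_UBE_SPAMORACLE_REGEXP = "(0[.]9[0-9]*|1[.]0*) -- 1[0-5]"

INCLUDERC = $PMSRC/pm-jaube-prg-spamoracle.rc
```

With this value a score of "0.95 -- 12" would match, while "0.85 -- 15" or "0.99 -- 9" would not.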

Return values

ERROR, value "yes" if JA_UBE_SPAMORACLE_REGEXP matched.
ERROR_MATCH contains detailed content of X-Spam-Spamoracle-Score header.

If headers were enabled, they will contain these values. The score's values are spam probability 0.0 - 1.0 and the degree of similarity 0-15 of the message with the spam messages in the corpus.

    X-Spam-Spamoracle-Status: yes
    X-Spam-Spamoracle-Score: 1.00 -- 15
    X-Spam-Spamoracle-Details: refid:98 $$$$:98 surfing:98 asp:95 click:93
        cable:92 instantly:90 https:88 internet:87 www:86 U4:85 isn't:14
        month:81 com:75 surf:75
    X-Spam-Spamoracle-Attachments: cset="GB2312" type="application/octet-stream"
        name="Guangwen4.zip"

Usage example

    PMSRC = $HOME/procmail   # procmail recipe dir

    # ... other checks, mailing lists, work mail etc.

    JA_UBE_SPAMORACLE_PRG = "/usr/bin/nice -n 5 /usr/bin/spamoracle"
    INCLUDERC             = $PMSRC/pm-jaube-prg-spamoracle.rc

    # ERROR will contain the word "yes" if the program classified
    # the message into the "bad" category.

    :0 :
    * ERROR ?? yes
    junk.mbox

File layout

The layout of this file is managed by Emacs packages tinyprocmail.el and tinytab.el for the 4 tab text placement. See project http://freecode.net/projects/emacs-tiny-tools/

Pm-jaube-prg-spamprobe.rc – Interface to Spamprobe program

File id

Warning

YOU CANNOT USE THIS PROCMAIL SUBROUTINE UNLESS YOU HAVE TRAINED THE BAYESIAN PROGRAM FIRST!

To train:

    $ spamprobe -8 good good.msg ...
    $ spamprobe -8 spam spam.msg ...

Make sure there are no stale lock files, or spamprobe and this subroutine will hang indefinitely:

$ rm -f ~/.spamprobe/lock

Overview of features

Implements interface to http://freecode.net/projects/spamprobe/ project.
Variable ERROR is set if the message was UBE.
Results are available by default in header X-Spam-Spamprobe-Status.

Description

There are several Bayesian based statistical analysis programs that study the message's tokens and then classify it into two categories: good or bad, or if you like, ham and spam. Not all Bayesian programs are the same, so if you want to achieve the magic 99.99% probability, the only way to do that is to chain several programs serially. There is no single program that can solve UBE detection. This procmail subroutine implements a call interface to the program spamprobe, which must already have been installed.

About bouncing message back

Required settings

If the spamprobe program is available, define this variable in your ~/.procmailrc. Use an absolute path so that the external call starts quickly; this reduces server load considerably.

JA_UBE_SPAMPROBE_PRG = /usr/bin/spamprobe

Required settings

None. No dependencies to other procmail modules.

Call arguments (variables to set before calling)

JA_UBE_SPAMPROBE_PRG, path to the program.
JA_UBE_SPAMPROBE_HEADER, the header name where the results are put. If not defined, no header is added. Defaults to X-Spam-Spamprobe-Status.
JA_UBE_SPAMPROBE_FORCE, if set to "yes", call the program no matter what. Normally, if there already is an X-Spam-Spamprobe-Status header, it is assumed that the message has already been checked and no new check is needed.

Return values

ERROR, is set to the return value of spamprobe program.

Usage example

    PMSRC = $HOME/procmail   # procmail recipe dir

    # ... other checks, mailing lists, work mail etc.

    JA_UBE_SPAMPROBE_PRG = "/usr/bin/nice -n 5 /usr/bin/spamprobe"
    INCLUDERC            = $PMSRC/pm-jaube-prg-spamprobe.rc

    # ERROR will contain the word "yes" if the message was spam

    :0 :
    * ERROR ?? yes
    junk.mbox

File layout

The layout of this file is managed by Emacs packages tinyprocmail.el and tinytab.el for the 4 tab text placement. See project http://freecode.net/projects/emacs-tiny-tools/

Pm-jaube.rc – Unsolicited Bulk Email (UBE) filter.

File id

Warning

Are you sure you want to use procmail for UBE?

If you think you can put this recipe as a first line of defence for your mail, you will be disappointed. Checking UBE with procmail's rule based means does not work that way. The good messages must be sorted first (like your mailing lists and your important work messages or messages from friends) and only then can what is left be scanned by static rule based tools, like this procmail module. There are much better tools that are based on statistical analysis of messages. You really should consider using one or a combination of the Bayesian tools: Spamassassin, bogofilter, spamprobe, Bayesian Mail Filter, ifile etc.

Repeat: procmail rules are not the right tool for UBE control. Pattern matching rules can never keep up with the spammers. That said, if you:

Can bear a 70-80 % UBE detection rate.
Can bear 10 % false hits; you need to check your UBE folder regularly for messages that do not belong there.
Have an account that does not get a large number of UBE messages.
Or if procmail is all you have in the system.

only then consider this module or any other procmail based spam filter. So, please don't set your expectations high. Spend a good amount of time with the configuration variables and check the result returned in variable ERROR carefully. Good luck.

Overview of features

Requires procmail 3.11pre7+
You don't need external files such as site block lists; the heuristics catch most of the UBE messages. Just plug in this module and you have a UBE shield active.
Header based filtering: minimum headers, Pegasus bulk mail, X-uidl validity check, bogus From-To combination.
Address based filtering: numeric address, invalid address (e.g. me@myMarketing.global), UBE-like (friend, remove request).
Text filtering: no HTML accepted, common advertising slogans, unnecessarily many capitalized words, HTML message body detection.
And many more checks that are not listed here.

Remember: this is not 100% and there will always be some mishits, so don't just junk messages to /dev/null.

Description

Originally Daniel Smith posted his spam.rc, where he had gathered many tips and heuristics to filter UBE email. This filter represents the work of many procmail users. The original filters were modified, some rules that caught legitimate messages were left out, and the package was made a bit more general so that it could be included via INCLUDERC in the standard way.

Thanks to Daniel and others, the UBE bomb days can be reduced when this filter is active. Some UBE messages may still slip into the mailbox, but that's the problem with all static rule based tools.

Logging the events

A good strategy to follow incoming mail is to log the vital parts like Date, From, Subject to some log file together with a reason telling what happened to the message. The ~/Mail/mail.log might look like:

    1997-12-08 work@example.com Extra Holiday $$$$$ [jaube; Marketing-Big-ExitCode; LEGAL, MONEY-MAKING PHENOMENON]
    1997-12-09 Denizen logger@example.com [RePol] hiding
    1997-12-09 david X dx@example.com Re: Send list to incoming folder
    1997-12-09 david X dx@example.com Re: Send list to incoming folder
    1997-12-09 OMC manager omcman@example.fi "Environments updated" [my; work-localenv]
    1997-12-09 doodle@example.org Re: Gnus (Emacs Newsreader) FAQ [my; emacs; Re: Gnus (Emacs Newsreader) FAQ ]

First is a UBE message that was identified and saved to a folder. The next 3 messages were filed to mailing-list folders and no [] action is displayed for them (left out due to the high volume of these messages). The second last was an internal work message. Lastly, someone asked something about Emacs.

The basic incoming message log recipe could be like this. Variable TODAY is $YYYY-$MM-$DD, whose values are set after calling pm-jadate.rc. LISTS is a user set variable to exclude mailing lists whose activity is not important. Variables FROM and SUBJECT are fields read from the message's headers.

    BIFF      = $HOME/Mail/mail.log
    INCLUDERC = $PMSRC/pm-jadate.rc
    ...

    :0 hwic:
    *$ ! $LISTS
    |echo "$TODAY $FROM $FSUBJECT" >> $BIFF

Here is a small Perl script to print a summary of trapped UBE messages from a log like the one above. It gives a nice overview of which recipes catch most of the UBE messages.

    perl -ne '/jaube; ([^;\s]+)/ and $s{$1}++;
        END {
            # Total count of all trapped messages
            $sum += $_ for values %s;
            # Print: count, percentage of total, recipe name
            for (keys %s) {
                printf "%d %d %s\n", $s{$_}, int($s{$_} / $sum * 100), $_;
            }
        }' mail.log | sort -nr

Here are sample results from a two month period. A total of 3248 UBE messages were caught.

    count   %  type
    ------------------------------------------
      554  17  Marketing-CountBigLetterWords
      457  14  Marketing
      422  12  Marketing-SelectedBigLetterWords
      349  10  AddrBogus-ToFrom
      263   8  FromReceived-Mismatch
      223   6  NoDirectAddress-ToCc
      216   6  HdrForgedPegasus
      164   5  AddrBogus-To
      151   4  MessageId
      102   3  BodyHtml
       73   2  Received-IPError
       63   1  Identical-FromTo
       53   1  AddrInvalid
       15   0  From-nslookup
        9   0  HdrReceivedTime
        7   0  HdrX-UIDL
        4   0  Marketing-headers

About bouncing message back

Required settings

PMSRC must point to source directory of procmail code. This recipe file will include

pm-javar.rc
pm-janslookup.rc
pm-jaaddr.rc

Call arguments (variables to set before calling)

Only a handful of the most important variables are described here. You really should read all the comments placed in the "user configured section" of this procmail module's code. Most of the defaults should work out of the box.

JA_UBE_VALID_ADDR, your email addresses or other valid addresses which indicate that the mail is addressed directly to you.
JA_UBE_HDR, if non-empty, a new header is added which tells which recipe was triggered. The header is not added to the message if there is nothing to report; i.e. the message passed all tests.
Various flags: some of the UBE detecting recipes give more false hits than they catch real UBE. Experiment yourself and turn on or off the recipes according to the kind of UBE messages you receive.
JA_UBE_MAX_BIG_WORDS, the maximum count of big letter words tolerated in the message. The current count of 5 is rather conservative and it is suggested that you increase it to avoid too many false hits. Alternatively update JA_UBE_CAPS_OK to include accepted words.
JA_UBE_APPARENTLY_TO_MAX, how many Apparently-To headers are tolerated. Default is 3.
JA_UBE_MAX_HTML_TAGS, maximum count of HTML tags allowed in the body.
JA_UBE_ATTACHMENT_ILLEGAL_KILL, if set to "yes" (default), then an illegal attachment is ripped off the body. This is a brute-force way to truncate the message abruptly to save mailbox space. You still see the headers for tracking, but the body is gone. The regexp to test is set in JA_UBE_ATTACHMENT_ILLEGAL_REGEXP.
JA_UBE_ATTACHMENT_SUSPECT_KILL, if set to "yes" (default "no"), kill suspicious characters in the attachment filename. The regexp to test is set in JA_UBE_ATTACHMENT_SUSPECT_NAME_REGEXP.
JA_UBE_CHARSET_LEGAL, if set, accept only these characters. The default value detects messages with 7-bit characters only (English). For other languages you may want to set this to something like $CHAR_7BIT_SET$CHAR_LIST_FINLAD for Finnish. See pm-javar.rc for available character sets.
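
For illustration, a few of the variables above could be set like this before the module is called. The values are assumptions chosen only to show the syntax, not recommendations.

```procmail
# Illustrative values; adjust to the mail you receive.
JA_UBE_MAX_BIG_WORDS           = 10     # more tolerant than the stated 5
JA_UBE_APPARENTLY_TO_MAX       = 3      # documented default
JA_UBE_ATTACHMENT_ILLEGAL_KILL = "no"   # keep message bodies intact

INCLUDERC = $PMSRC/pm-jaube.rc
```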

Return values

ERROR_STATUS, status word of checks. Value "Good" or "Bad"
ERROR, is set to short ube trigger recipe reason
ERROR_MATCH, is set to some MATCH that happened while triggering UBE message.

Alternatively, check the content of the header defined in JA_UBE_HDR, which contains the results of the above variables. Possible values for ERROR are:

    AddrAOLinvalid
    AddrBogus-From
    AddrInvalid-From
    AddrInvalid-To
    AddrNumeric
    AddrNumericDomain
    AddrUbeLike
    BodyAttachment-FileIllegalAdditional
    BodyAttachment-FileIllegalMatch
    BodyAttachment-FileIllegalOther
    BodyAttachment-FileSuspect
    BodyCharacters-Illegal
    BodyHtml-NonMime
    BodyHtml-script
    BodyHtmlBase64
    BodyHtmlImage
    BodyHtmlTags
    BodyMimeCharset-Illegal
    EnvelopeFrom-Invalid
    From-nslookup
    FromReceived-Mismatch
    HdrForgedPegasus
    HdrReceived
    HdrReceivedTime
    HdrX-Distribution
    HdrX-UIDL
    Header-ApparentlyTo
    HeaderCharacters-Illegal
    HeaderMimeCharset-Illegal
    Html-base64
    Identical-FromTo
    Marketing-Body
    Marketing-CountBigLetterWords
    Marketing-SelectedBigLetterWords
    Marketing-Subject
    Marketing-SubjectGreeting
    MegaSpammer
    MessageId-Invalid
    MessageId-Empty
    NoDirectAddress-ToCc
    NotEnoughHeaders
    Received-IPError
    VirusBody
    VirusHeader
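
The return values can drive the filing decision after the module has run. The recipe below is a sketch under the assumption that ERROR_STATUS holds exactly "Bad" for a trapped message; the folder names virus.mbox and spam.mbox are hypothetical.

```procmail
:0
* ERROR_STATUS ?? ^^Bad^^
{
    # Record the reason in the procmail log.
    LOG = "[jaube; $ERROR; $ERROR_MATCH]
"

    # Reasons starting with "Virus" go to their own folder.
    :0 :
    * ERROR ?? ^^Virus
    virus.mbox

    # Everything else that was flagged.
    :0 :
    spam.mbox
}
```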

Usage example

    # - All legitimate messages should already have been handled
    #   and saved before this recipe.
    # - Activate the filter only for messages that are not from
    #   daemon and not from valid senders: like from "my" domain
    #   and mailing lists and from somewhere else.

    VALID_FROM = "(my@address.example.com|word@here.example.com)"

    :0
    *$ ! ^From:.*$VALID_FROM
    *  ! ^FROM_DAEMON
    {
        # Do not add extra headers. This saves external shell call
        # (formail). Also do not try to kill the message content,
        # again saving one external call (awk). With these, the
        # recipe is faster and more CPU friendly.

        JA_UBE_HDR                     = ""
        JA_UBE_ATTACHMENT_ILLEGAL_KILL = "no"

        INCLUDERC = $PMSRC/pm-jaube.rc

        # Variable "ERROR" is set if message was UBE, record error
        # to log file with "()\/"

        :0 :
        * ERROR ?? ()\/[a-z].*
        {
            # Don't save those *.exe, *.zip UBE attachments

            :0
            * ERROR ?? attachment.*file
            /dev/null

            :0 :
            spam.mbox
        }
    }

There may be UBE messages that fool the FROM_DAEMON test, so you could also use a finer check. The standard daemon error message almost always has the sentence "Transcript of session follows" in the body. This recipe says: "Unless proven otherwise, I don't believe this is a daemon message even if it looks like one". Add more "2^1" checks to raise the score for other valid daemon cases.

    :0
    * -1^0 ^FROM_DAEMON
    * ! 2^1 B ?? Transcript of session follows
    {
        # ... Now call UBE checker
    }

File layout

The layout of this file is managed by Emacs packages tinyprocmail.el and tinytab.el for the 4 tab text placement. See project http://freecode.net/projects/emacs-tiny-tools/

Pm-javac.rc – Procmail: Vacation framework recipe (id-cache)

File id

Description

Framework for all programs that need to reply to messages only once. This is usually known as the "vacation" feature. If you change the cache file, you can attach this recipe to any messages that you want to deal with only once.

Required settings

PMSRC must point to source directory of procmail code. This subroutine will include

pm-javar.rc

Call arguments (variables to set before calling)

JA_VAC, To activate vacation, set value to "yes"
JA_VAC_RC, When new message-id is found, run this includerc
JA_VAC_ID_CACHE, Remember to clear this file when you start the vacation.

Usage example

To turn on the vacation feature, create the ~/.vac file and the recipe below activates vacation. If the vacation is not active, then the cache file is removed (automatic cleanup). VERBOSE is also turned off when you're on vacation, so that your procmail log will not get filled.

So when you go on vacation, you 'touch ~/.vac' and update ~/.vacation.msg. When you come back, you 'rm ~/.vac'. That's it.

IMPORTANT: If you are subscribed to mailing lists, be sure to file messages from those services first and put the vacation recipe only after the list or bot messages. Also add sufficient "!" conditions in order not to reply to other "bot" service messages.

    JA_VAC_ID_CACHE = $HOME/.pm-vac.cache

    :0
    *$ ? $IS_EXIST $HOME/.vac
    {
        VERBOSE   = off
        JA_VAC    = "yes"
        JA_VAC_RC = $PMSRC/pm-myvac.rc    # my vacation recipe
        INCLUDERC = $PMSRC/pm-javac.rc    # framework
    }

    :0 E    # else
    * ? $IS_EXIST $JA_VAC_ID_CACHE
    {
        dummy = `$RM -f $JA_VAC_ID_CACHE`
    }
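
The extra "!" conditions mentioned in the IMPORTANT note could look like the sketch below. The conditions use procmail's built-in FROM_DAEMON and FROM_MAILER macros plus a Precedence check; which conditions you need depends on your mail, so treat this list as an assumption and not as a complete set.

```procmail
:0
*$ ? $IS_EXIST $HOME/.vac
* ! ^FROM_DAEMON
* ! ^FROM_MAILER
* ! ^Precedence:.*(bulk|junk|list)
{
    VERBOSE   = off
    JA_VAC    = "yes"
    JA_VAC_RC = $PMSRC/pm-myvac.rc    # my vacation recipe
    INCLUDERC = $PMSRC/pm-javac.rc    # framework
}
```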

Here is an example of a pm-myvac.rc recipe:

    # Change subject

    :0 fhw
    * ^Subject: *\/[^ ].*
    | $FORMAIL -I "Subject: vacation (was: $MATCH)"

    :0 fb    # put message to body
    | $CAT $HOME/.vacation.msg

    :0       # Send it
    | $SENDMAIL

Pm-javar.rc – Global variable definitions

File id

Description

This file defines common variables that you can use in the recipe's condition line. Procmail does not know about escape sequences like \t or \n and it is therefore much more readable to use variables as substitutes for common regular expression atoms. In this file, the variable names represent the well known Perl regular expression names, so that $s is almost like Perl expression \s (whitespace) and $S is almost equivalent to \S (non-whitespace). Similarly, $d is \d (digit) and $D resembles \D (non-digit). Note that the condition line must start with "*$ ", where "$" expands the variables:

    :0
    *$ $s+something+$s+$d+$a+

The equivalent without variables (you don't see the tabs and spaces here):

    :0
    #   Space + tab
    * [ ]something[ ][0-9]+[a-z]+

In addition, all system dependent variables are defined in this module. For example, if you have GNU awk, it is strongly suggested that you set:

AWK = "/path/to/gawk" # in Linux, this would be /usr/bin/awk

You can define these variables before or after the module; just make sure the binaries reflect your operating system's paths. In general, if you "port" your setup to several systems, don't include absolute paths. On the other hand, if your setup stays in the same place, using absolute paths will speed up execution by a factor of 3 or more (depending on how long your PATH is).

Standard variables defined

See pm-tips.txt file for full explanation or look at the source code.

    SPC WSPC NSPC SPCL          # Whitespace, Non Whitespace, W+linefeed
    \s \d \D \w \W and \a \A    # perl equivalents

Special variable JA_FROM_DAEMON

To save CPU cycles, this module defines variable JA_FROM_DAEMON, which caches the result of the ^FROM_DAEMON test. You can refer to JA_FROM_DAEMON as you would to its big brother FROM_DAEMON. Because procmail has already computed the result, repeated FROM_DAEMON regexp tests, which are expensive, are avoided. Variable JA_FROM_DAEMON_MATCH contains "" or the matched daemon text. Use:

*$ $JA_FROM_DAEMON

or the familiar

*$ ! $JA_FROM_DAEMON

Instead of using the regexp parsing with

* ^FROM_DAEMON

and

* ! ^FROM_DAEMON

Special variable JA_FROM_MAILER

Works like the JA_FROM_DAEMON variable but with respect to FROM_MAILER. The matched text is in JA_FROM_MAILER_MATCH.
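
As a short sketch, both cached variables can be combined in one recipe. It assumes that pm-javar.rc has already been loaded and that PMSRC is set; calling pm-jaube.rc inside the block is only an example of an expensive check that should be skipped for daemon and mailer messages.

```procmail
# Assumes pm-javar.rc was included earlier, so that both
# cache variables are defined.

:0
*$ ! $JA_FROM_DAEMON
*$ ! $JA_FROM_MAILER
{
    INCLUDERC = $PMSRC/pm-jaube.rc
}
```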

Usage example

For your .procmailrc, you can simply put this, because you want to load the variables at startup

    PMSRC     = "/path/to/install/location/of/this/library"
    INCLUDERC = $PMSRC/pm-javar.rc

If you're developing your own modules that use these variables, put these lines at the beginning of the module. The recipe checks whether the WSPC variable includes a space; if it does not, the variable definitions are loaded. If the variable is already defined, the file is not loaded. The test is similar to #ifdef ... #endif in the C/C++ languages or a conditional "import" command in other languages.

    :0
    * ! WSPC ?? [ ]
    {
        INCLUDERC = $PMSRC/pm-javar.rc
    }

Defined modules

After this file loads, you can refer to any module with $RC_JA_MODULE. See the end of this file for all defined module names. E.g. to call the UBE filter module in your code, you would use the following:

INCLUDERC = $RC_JA_UBE