Email Filtering

 

There are many different ways you may wish to process your incoming email. For instance, you may wish to put all email from a certain address directly into a mail folder rather than seeing it in your inbox; or to run all email from some other address through a text processing script you have written before saving it to a folder; or to pick out email that has a certain phrase in the Subject, and forward it to someone else as well as keeping a copy for yourself. One key thing you may want to do is pick out mail that is likely to be spam (junk mail). Another common use is to automatically respond to email whilst on vacation.

Filtering Spam (Junk Mail)

Many mail systems now pass all mail through a filter that adds tags indicating various warnings. You can use these tags to filter likely spam into one or more separate folders, which you can then check regularly and clear of unwanted junk quickly.

All email is tagged by a rule-based system run by OUCS that matches well-known spam words and characteristics and assigns a score indicating how 'spammy' it thinks the message is. Mail can then be filtered into one (or more) folders based on the score.

This type of system has no learning mechanism (beyond software upgrades), so if a message is misclassified nothing immediately corrects the error for subsequent similar messages.

Setting Up Filtering

Many mail clients have options to set up mail filtering (e.g. pine, mozilla, thunderbird, evolution, balsa, outlook, eudora). There are also separate applications, such as procmail, that do the filtering regardless of the mail client you use. A generic solution like this has the benefit that your filtering keeps working, with no extra effort, if you change mail client or use more than one. Also, procmail filters each email as it arrives, whereas an email client has to filter all new messages when it starts, which can be slow.

Reading through the procmail examples should give you an understanding of what you can filter on; you can then choose whether to set up mail client specific filters or use a generic solution such as procmail.

Using Procmail (UNIX/Linux systems)

The file where you put instructions for procmail to follow is called .procmailrc (note the "."), and must live in your home directory. To make sure that procmail is called when an email arrives for you, you need to have a file (again in your home directory) called .forward, containing the line

"|/usr/bin/procmail"

If you already have some things in your .forward file for the purposes of simple email forwarding, note that it's OK to have the line for procmail in there too (although you may prefer to have procmail do the forwarding too, see last example).
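As a hedged sketch of the point above, the following shell function appends the procmail line to a .forward file only when it is not already present, so any existing forwarding entries are left alone. The function name add_procmail_forward and the scratch directory are illustrative assumptions; on a real account you would pass "$HOME" instead.

```shell
#!/bin/sh
# Append the procmail pipe line to .forward unless it is already there.
# $1 is the directory holding .forward (normally "$HOME").
add_procmail_forward() {
    forward_file="$1/.forward"
    line='"|/usr/bin/procmail"'
    # -x: match the whole line, -F: fixed string, -q: quiet
    if ! grep -qxF "$line" "$forward_file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$forward_file"
    fi
}

# Demonstration against a scratch directory, so no real file is touched.
demo_dir=$(mktemp -d)
add_procmail_forward "$demo_dir"
add_procmail_forward "$demo_dir"   # second call adds nothing
cat "$demo_dir/.forward"
```

Appending rather than overwriting is a deliberate choice here: it keeps whatever simple forwarding you had already configured.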

The syntax of the .procmailrc file is described in full in the procmailrc man page, and examples are given in the procmailex man page.

It is recommended that you always test any new rule (recipe) you add to confirm that it works as you expect.

Note: In order for the mail server software to be able to read your .procmailrc file it needs access to your home directory. You can grant this with the command:

chmod 711 ~/
For further information on setting access permissions see the page on setting file and directory permissions.
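To see what mode 711 actually grants, the sketch below applies it to a scratch directory (an assumption made so that your real home directory is not touched) and prints the resulting permission string. The owner keeps full access, while group and others get only the execute bit, which lets them reach a file whose name they already know without being able to list the directory.

```shell
#!/bin/sh
# Apply mode 711 to a scratch directory and show the permission string.
scratch=$(mktemp -d)
chmod 711 "$scratch"
# The first ten characters of "ls -ld" are the file type and permission bits.
perms=$(ls -ld "$scratch" | cut -c1-10)
echo "$perms"    # prints drwx--x--x
```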

Procmail file beginning

A .procmailrc file typically begins

#Next two lines are generally required
MAILDIR=/var/mail/$LOGNAME/
DEFAULT=/var/mail/$LOGNAME/
LOGFILE=$HOME/.procmail-log

These lines tell procmail which directory in your account holds your mail folders (e.g. Mail) and which file to use for logging what it is doing.

After these lines you typically add one or more rules to filter mail as required. See the example rules below.

Example procmail filter rules

Note: Recipes are run in order, so it is important to put, say, a spam filter before a vacation responder so that you only respond to legitimate mail. In most cases you will want any spam filter first.

Simple match on an address in From field
#Note this is a zero on next line
:0H:
* ^From:.*manager [-at-] mybank [dot] com
.Finance/

This rule matches any header line that begins From: followed by any number of characters (.*) followed by the address manager [-at-] mybank [dot] com.

The matching messages are stored in the folder called Finance within the MAILDIR as defined at the start of the file.

Simple match on Subject field
:0H:
* ^Subject:.*Skiing
.Skiing/

This matches any message with the word Skiing in the Subject header and stores it in a folder called Skiing.

Match on From and Subject fields
:0H:
* ^From:.*manager [-at-] work [dot] com
* ^Subject:.*pay
.Salary/

This matches any message with the word pay in the Subject header and coming from manager [-at-] work [dot] com. It stores it in a folder called Salary.
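One way to sanity-check header conditions like those above before trusting them with real mail is to approximate them with grep. This is only a sketch, not procmail itself: it assumes the obfuscated address stands for manager@work.com, it uses grep -i because procmail matches case-insensitively by default, and it does not join folded (continued) header lines as procmail does. The function name headers_match and the sample message are hypothetical.

```shell
#!/bin/sh
# Succeed only if every pattern given after the file name matches some
# header line. Headers are everything before the first blank line.
headers_match() {
    msg=$1
    shift
    headers=$(sed '/^$/q' "$msg")
    for pattern in "$@"; do
        printf '%s\n' "$headers" | grep -qiE -e "$pattern" || return 1
    done
    return 0
}

# A made-up message used purely for the demonstration.
sample=$(mktemp)
cat > "$sample" <<'EOF'
From: Payroll <manager@work.com>
To: me@example.org
Subject: Your pay for June

The body mentions Skiing, which a header rule ignores.
EOF

# Both conditions must hold, just as several "*" lines are ANDed in a recipe.
headers_match "$sample" '^From:.*manager@work\.com' '^Subject:.*pay' \
    && echo "would go to Salary"
headers_match "$sample" '^Subject:.*Skiing' \
    || echo "Skiing rule does not match"
```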

Simple match on message body
:0B:
* .*computer
.IT/

Unlike the previous examples this matches the body (0B) rather than the headers (0H). It matches any message that contains the word computer in the body and stores it in a folder called IT.
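The difference between a header match and a body match can be illustrated with a rough grep approximation. As before this is a sketch under assumptions: body_matches is a hypothetical helper, the sample message is invented, and the body is taken to be everything after the first blank line.

```shell
#!/bin/sh
# Succeed if the pattern matches somewhere in the message body,
# i.e. in the lines after the first blank line.
body_matches() {
    sed '1,/^$/d' "$1" | grep -qiE -e "$2"
}

sample=$(mktemp)
cat > "$sample" <<'EOF'
From: helpdesk@example.org
Subject: Maintenance window

Your Computer will restart tonight.
EOF

body_matches "$sample" '.*computer' && echo "would go to IT"
# "Maintenance" appears only in the Subject header, so a body rule misses it.
body_matches "$sample" 'Maintenance' || echo "header text is not searched"
```

Because the pattern is not anchored, a leading .* makes no difference to whether it matches; computer alone would select the same messages.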

Spam Filter
:0H:
* ^X-Oxmail-Spam-Level: \*\*\*\*\*
.Spam-jail/

The rule matches the X-Oxmail-Spam-Level header that indicates it is probably spam and saves the message to a folder called Spam-jail for you to inspect later. Typically you want this rule to appear first to trap all spam before doing anything else!

The filter gives a score in stars, so you can choose to match more or fewer stars according to how effective you feel it is. The rule above matches messages with a score of 5 or more stars.
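Counting escaped stars by hand is error-prone, so a small helper can generate the condition for a chosen threshold. This is an illustrative sketch: star_pattern and at_least_stars are hypothetical names, and grep -E stands in for procmail's own matcher. Since the pattern is not anchored at its end, a header with more stars than the threshold still matches, which is why the rule means "5 or more".

```shell
#!/bin/sh
# Print the header regex meaning "at least $1 stars".
star_pattern() {
    n=$1
    stars=''
    while [ "$n" -gt 0 ]; do
        stars="$stars\\*"      # append one escaped star: \*
        n=$((n - 1))
    done
    printf '%s\n' "^X-Oxmail-Spam-Level: $stars"
}

# Succeed if header line $1 carries at least $2 stars.
at_least_stars() {
    printf '%s\n' "$1" | grep -qE -e "$(star_pattern "$2")"
}

star_pattern 5
at_least_stars 'X-Oxmail-Spam-Level: *******' 5 && echo "7 stars: matches"
at_least_stars 'X-Oxmail-Spam-Level: ***' 5 || echo "3 stars: no match"
```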

Multi-level Spam Filter
# OXMAIL Spam filter High
:0H:
* ^X-Oxmail-Spam-Level: \*\*\*\*\*\*\*\*\*\*\*\*\*\*\*
.Spam-jail-15/

# OXMAIL Spam filter Medium
:0H:
* ^X-Oxmail-Spam-Level: \*\*\*\*\*\*\*\*\*\*
.Spam-jail-10/

# OXMAIL Spam filter Normal
:0H:
* ^X-Oxmail-Spam-Level: \*\*\*\*\*
.Spam-jail-05/

Using the spam score you can filter email into more than one folder depending on how highly it scores. Because recipes run in order and the highest threshold is tested first, the above rules filter mail with 15 or more stars into one folder, 10 to 14 into another and 5 to 9 into another.

Tests using this filter show that almost all email scoring 10 or more is spam.

You might use this form of separation to allow you to check the lower scoring spam more frequently for misclassified email whereas you only check the high scoring one weekly or monthly.
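The banding produced by the three recipes can be mimicked in a few lines of shell, which may help when deciding on thresholds. This is a sketch only: spam_folder is a hypothetical helper, and the label inbox is an assumption standing in for whatever later rules or the default delivery would do with the message.

```shell
#!/bin/sh
# Print the folder a message would land in, given its spam-level header line.
# The highest threshold is checked first, mirroring the order of the recipes.
spam_folder() {
    level=$(printf '%s\n' "$1" |
        sed -n 's/^X-Oxmail-Spam-Level: \(\**\).*/\1/p' | head -n 1)
    stars=${#level}            # number of stars captured (0 if no header)
    if [ "$stars" -ge 15 ]; then
        echo "Spam-jail-15"
    elif [ "$stars" -ge 10 ]; then
        echo "Spam-jail-10"
    elif [ "$stars" -ge 5 ]; then
        echo "Spam-jail-05"
    else
        echo "inbox"           # hypothetical label for "not caught here"
    fi
}

spam_folder 'X-Oxmail-Spam-Level: ************'
spam_folder 'X-Oxmail-Spam-Level: **'
```

If the recipes were listed lowest threshold first, the 5-star rule would catch everything and the other two folders would stay empty.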

In principle you can instruct procmail to delete the message directly. You could use this approach to delete very high scoring spam only, e.g. replace the 15 score rule above with

# OXMAIL Spam filter High
:0H:
* ^X-Oxmail-Spam-Level: \*\*\*\*\*\*\*\*\*\*\*\*\*\*\*
/dev/null

If you do this the only trace of the messages will be the log entries in the procmail log file. If a message is incorrectly deleted you cannot get it back!
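Since the log is the only record of discarded mail, it can be worth scanning it occasionally. The sketch below assumes the usual procmail log abstract of three lines per message (a "From " line, an indented " Subject:" line and a further indented "  Folder:" line); check your own log before relying on it, as the exact layout is an assumption here. The function name list_discarded and the sample log entries are invented for the demonstration.

```shell
#!/bin/sh
# List sender and subject of every message the log shows going to /dev/null.
list_discarded() {
    awk '
        /^From /     { from = $2 }
        /^ Subject:/ { subject = $0; sub(/^ Subject: */, "", subject) }
        /^  Folder: \/dev\/null/ {
            print from "\t" subject
            from = ""; subject = ""
        }
    ' "$1"
}

# A made-up log excerpt; a real one would live at $HOME/.procmail-log.
log=$(mktemp)
cat > "$log" <<'EOF'
From spammer@example.com  Tue Jun  1 09:15:02 2004
 Subject: Cheap pills
  Folder: /dev/null    2310
From friend@example.org  Tue Jun  1 09:20:41 2004
 Subject: Lunch?
  Folder: Finance    1804
EOF

list_discarded "$log"
```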

Vacation Responder
SHELL=/bin/sh

:0 Whc: $HOME/.vacation.lock
* $^(To:.*$LOGNAME|CC:.*$LOGNAME)
* !^FROM_DAEMON
* !^List-
* !^(Mailing-List|Approved-By|BestServHost|Resent-(Message-ID|Sender)):
* !^Sender: (.*-errors@|owner-)
* !^X-[^:]*-List:
* !^X-(Authentication-Warning|Loop|Sent-To|(Listprocessor|Mailman)-Version):
* !$^From +$LOGNAME(@| |$)
| /usr/bin/formail -rD 8192 $HOME/.vacation.cache

:0 ehc
| (/usr/bin/formail -rI"Precedence: junk" \
 -A"X-Loop: $LOGNAME [-at-] maths [dot] ox [dot] ac [dot] uk" ; \
 cat $HOME/.vacation.msg ) | $SENDMAIL -t

This is a pair of rules that acts as a vacation auto responder in the same way as the vacation program. Using this procmail rule after the spam filter rule, instead of the standard vacation program, means an autoresponse is only sent to messages not tagged as spam. All you need to do is write a .vacation.msg file as the response (see vacation message page). When you return, remember to comment out this rule and delete the address cache file .vacation.cache.
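The role of the cache file in the first rule is to make sure each correspondent gets only one reply. The idea can be sketched in plain shell as below. This is an emulation for illustration, not what formail does internally: should_reply is a hypothetical helper, the cache here is a simple list of addresses, and unlike formail's 8192-byte cache it never discards old entries.

```shell
#!/bin/sh
# Succeed (and remember the sender) the first time an address is seen;
# fail on every later message from the same address.
should_reply() {
    sender=$1
    cache=$2
    if grep -qxF -e "$sender" "$cache" 2>/dev/null; then
        return 1               # already answered this sender
    fi
    printf '%s\n' "$sender" >> "$cache"
    return 0
}

cache=$(mktemp)
for sender in alice@example.org bob@example.org alice@example.org; do
    if should_reply "$sender" "$cache"; then
        echo "reply to $sender"
    else
        echo "skip $sender (already answered)"
    fi
done
```

Deleting the cache when you return, as advised above, means the next absence starts with a clean slate and everyone is notified again.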

Autorespond to all incoming mail without saving it
SHELL=/bin/sh

:0 h
* $^(To:.*$LOGNAME|CC:.*$LOGNAME)
* !^FROM_DAEMON
* !^List-
* !^(Mailing-List|Approved-By|BestServHost|Resent-(Message-ID|Sender)):
* !^Sender: (.*-errors@|owner-)
* !^X-[^:]*-List:
* !^X-(Authentication-Warning|Loop|Sent-To|(Listprocessor|Mailman)-Version):
* !$^From +$LOGNAME(@| |$)
| (/usr/bin/formail -rI"Precedence: junk" \
 -A"X-Loop: $LOGNAME [-at-] maths [dot] ox [dot] ac [dot] uk" ; \
 echo "Mail delivery suspended until 15th June 2004."; \
 echo "Please resend after this date.") | $SENDMAIL -t

This is a variation on a vacation responder which replies to all messages asking people to resend when you are back. No copy of the message is saved. This may be useful to stop mail building up while you are away.

Forward copies of emails below a certain size to another address
:0c
* < 4000
! me [-at-] other [dot] address

This forwards a copy of any message less than 4000 bytes to the address me [-at-] other [dot] address while delivering the original as normal. One might place this rule after the spam filter rule and before any other rules to forward short emails to a second account while away from the department.

If you leave out the size check line (* < 4000) then the rule would forward all messages that have not been matched by a previous rule. This might be useful if you want to forward messages to another account after first filtering out the spam.
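To get a feel for which messages fall under a size limit, you can compare the byte count of a saved message with the threshold. The sketch below assumes each message is stored in its own file; smaller_than is a hypothetical helper and both sample files are generated only for the demonstration.

```shell
#!/bin/sh
# Succeed if the file named by $1 is smaller than $2 bytes.
smaller_than() {
    size=$(wc -c < "$1" | tr -d ' ')
    [ "$size" -lt "$2" ]
}

short_msg=$(mktemp)
long_msg=$(mktemp)
printf 'Subject: hi\n\nSee you at 3pm.\n' > "$short_msg"
# 500 lines of 11 bytes each (10 digits plus newline) gives 5500 bytes.
awk 'BEGIN { for (i = 0; i < 500; i++) print "0123456789" }' > "$long_msg"

smaller_than "$short_msg" 4000 && echo "short message would be forwarded"
smaller_than "$long_msg" 4000 || echo "long message would not be forwarded"
```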

Mail client specific filter setup

  • Pine - press S to enter the setup menu and then press R for Rules. Next press F to enter the filter rules settings.
  • Mozilla Thunderbird - The rules setup is found under the Tools -> Message Filters menu item.
  • Outlook/Outlook Express - The rules setup is found under Tools in the menu bar.