Why just another greylist for exim?
        How does it work?
        Pitfalls for IP based greylisting
        Efficiency
        ACL entry
        The Perl script itself
        Performance
        Maintenance
        Download
        Open Issues
        Other known issues

Looking for the "old" version (based on files in some directory structure)? I moved it to grey-sh.

Why just another greylist for exim?

There are a lot of greylist implementations around (e.g. a greylist daemon: http://www.google.com/search?q=greylistd).

I just needed a really simple one. And a "one size fits all" implementation.

My first attempt was a solution based on files in the local file system (see grey-sh). After it ate all the inodes of my secondary MX (which has only a 2 GB filesystem), I started playing with Exim's ability to incorporate Perl functions. I know that there's

I'm quite sure that the current (Perl based) solution is almost as simple as the older one, though it's slower (see Performance).

And yes, I thought about SQL based versions too. MySQL seemed to be too much effort to set up (because it should be really simple), and SQLite is currently not shipped as part of the Exim packages compiled for Debian GNU/Linux. Another point: the configuration wouldn't read as cleanly as it does now.

How does it work?

I implemented a small Perl function which stores its data in a DBM file. Each greylisted item (key) gets two associated timestamps (values): one for the time the item was inserted and one for the time the item was last used. The latter timestamp is for cleanup purposes only.

The unseen function returns "yes" if the item is seen for the first time or if not enough time has passed since it was first seen. In all other cases it returns "no". This greylist works on whatever items you want, preferably on the sender IP address (in $sender_host_address), the sender address (in $sender_address), or some tuple (sender address + recipient address).
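The decision described above can be modelled in a few lines. This is only a Python sketch of the logic, not the Perl script itself: the default delay of 600 seconds, the file name greylist-sketch, the "created used" value layout and the choice to update the last-used timestamp only when an item passes are all assumptions made for illustration.

```python
import dbm
import time


def unseen(item, delay=600, db_path="greylist-sketch", now=None):
    """Return "yes" while `item` is still greylisted, "no" once it may pass.

    `delay`, `db_path` and the value layout are illustrative assumptions,
    not the defaults of the real script.
    """
    now = int(time.time()) if now is None else int(now)
    key = item.encode()
    with dbm.open(db_path, "c") as db:  # "c": create the file if missing
        if key not in db:
            # First contact: remember when the item was inserted.
            db[key] = f"{now} {now}".encode()
            return "yes"
        created, _used = (int(x) for x in db[key].split())
        if now - created < delay:
            # Seen before, but not enough time has passed yet.
            return "yes"
        # Old enough: let it pass and refresh the last-used timestamp.
        db[key] = f"{created} {now}".encode()
        return "no"
```

Calling unseen("<a@example.org>:<b@example.net>") twice in quick succession returns "yes" both times; a call after the delay has elapsed returns "no".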

Pitfalls for IP based greylisting

If you expect to get mail from hosts with changing IP addresses, this greylist may not be for you. But you may consider using the sender's (mail) address instead.

If you expect to get mail from providers with a large number of outgoing mail servers, it may take a while until your greylist has learned about all of them.

If you're hosting several domains you should expect a lower efficiency, because the same sending IP address may contact you again, not as part of a retry but simply to deliver mail for some other domain. To avoid this you may use a tuple ($sender_host_address plus $domain) for greylisting. But I do not know whether all MTAs can handle temporary errors on just some recipient addresses.

Update (6 Feb 07): I've dropped IP based greylisting, since I saw in real life that e.g. web.de uses a different outgoing server for each attempt to send the same message.
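A small illustration of why the key choice matters. The Python sketch below uses made-up addresses (documentation ranges and example domains) and two hypothetical helper functions; the pair key merely imitates the "<sender>:<recipient>" shape used in the ACL entry section.

```python
def ip_key(sender_host_address):
    """Greylist key based on the sending IP only (hypothetical helper)."""
    return sender_host_address


def pair_key(sender_address, recipient_address):
    """Greylist key based on sender and recipient address (hypothetical helper)."""
    return f"<{sender_address}>:<{recipient_address}>"


# Three delivery attempts for the same message, each from a different
# outgoing server of the same provider (all values invented).
attempts = [
    ("192.0.2.10", "alice@example.org", "bob@example.net"),
    ("192.0.2.27", "alice@example.org", "bob@example.net"),
    ("192.0.2.93", "alice@example.org", "bob@example.net"),
]

ip_keys = {ip_key(ip) for ip, _, _ in attempts}
pair_keys = {pair_key(sender, rcpt) for _, sender, rcpt in attempts}

# Every retry looks like a brand-new item to an IP based greylist,
# while the sender/recipient key stays the same across retries.
print(len(ip_keys), "distinct IP keys,", len(pair_keys), "distinct pair key")
```

With an IP based key each retry would start its own grey period, so the message might never get through; with the pair key the second or third attempt matches the stored entry.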

Efficiency

Here are some live statistics for a combined (sender address, recipient address) based greylist (updated once per hour):

ACL entry

It's up to you where your greylist rule gets applied. The HELO test may be a good choice. But if you have trusted (authenticated) hosts, you might want to give them a chance to authenticate first. For this reason I use this rule in the RCPT test:

acl_check_rcpt:
        accept  authenticated = *
        defer   condition     = ${perl{unseen}{<$sender_address>:<$local_part@$domain>}}
                log_message   = greylisted ($sender_host_address)
        ...

For passing more parameters (grey time, DBM-File) please read the comments in the script.

The Perl script itself

It's quite simple, straightforward and hopefully without any serious bugs. You'll find it here: https://gitea.schlittermann.de/IUS/libexim-grey-perl

You have to load the script into your Exim. Check if your Exim is built with Perl support (exim -v -bV should show you "Perl" among some other features). For further information follow the linked docs: https://gitea.schlittermann.de/IUS/libexim-grey-perl#documentation

Performance

Sure, there is a performance hit, because as soon as Exim starts evaluating the ${perl{...}} statement, it has to load/link the Perl libraries. On my system I got about 30ms for loading plain Exim and about 130ms for loading Exim plus the Perl library. (See exim-lookup-benchmark.txt for some more comparisons.)

But if you consider that you have to scan about 80% fewer messages for spam or viruses, you can live with these 100ms :) I think there is some room for optimization, but this is left as an exercise to the reader.
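The trade-off can be put into numbers. The Python sketch below is a rough back-of-the-envelope model, assuming that every incoming message pays the full Perl loading overhead and that the 80% of messages which no longer get scanned would each have cost the same scan time; both are simplifications, and the function names are made up.

```python
# Figures quoted in the text: ~30 ms for plain Exim, ~130 ms with Perl loaded.
PERL_OVERHEAD_S = 0.130 - 0.030
# Share of messages that no longer reach the scanner ("about 80%").
AVOIDED_SHARE = 0.80


def break_even_scan_time(overhead_s=PERL_OVERHEAD_S, avoided_share=AVOIDED_SHARE):
    """Scan time per message above which greylisting saves time overall."""
    return overhead_s / avoided_share


def net_saving_per_message(scan_time_s, overhead_s=PERL_OVERHEAD_S,
                           avoided_share=AVOIDED_SHARE):
    """Average time saved per incoming message (negative means a net loss)."""
    return avoided_share * scan_time_s - overhead_s


if __name__ == "__main__":
    print(f"break-even scan time: {break_even_scan_time() * 1000:.0f} ms")
    print(f"net saving at 1 s per scan: {net_saving_per_message(1.0) * 1000:.0f} ms")
```

Under these assumptions the overhead pays off as soon as a spam or virus scan takes longer than about 125 ms per message.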

Maintenance

At the above URL (or in the package) you'll find an exigrey script. It can be used for listing your items, creating some statistics and for cleaning up. (Try --help.)

# exigrey --list
66.135.209.199  :       1167598436 1167600030 (1594s 2006-12-31 21:53:56 2006-12-31 22:20:30)
85.221.200.142  :       1167598622 1167598622 (   0s 2006-12-31 21:57:02 2006-12-31 21:57:02)
62.57.33.50     :       1167598674 1167598674 (   0s 2006-12-31 21:57:54 2006-12-31 21:57:54)
...

# exigrey --stats
...

# exigrey --cleanup 7
....
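To give an idea of what such a cleanup does, here is a Python sketch, not the exigrey code. It assumes that the argument to --cleanup is a maximum age in days (the page does not say), that each value is stored as "created used" in epoch seconds, and that items are expired based on the last-used timestamp; the function name is made up.

```python
import dbm
import time


def cleanup(db_path, max_age_days, now=None):
    """Delete items whose last-used timestamp is older than `max_age_days`.

    Returns the number of removed items. The storage layout
    ("created used" as text) is an assumption for this sketch.
    """
    now = int(time.time()) if now is None else int(now)
    cutoff = now - max_age_days * 86400  # 86400 seconds per day
    removed = 0
    with dbm.open(db_path, "w") as db:  # "w": the file must already exist
        # Copy the key list first, since entries are deleted while looping.
        for key in list(db.keys()):
            _created, used = (int(x) for x in db[key].split())
            if used < cutoff:
                del db[key]
                removed += 1
    return removed
```

Run against the three items listed above with a maximum age of 7 days, such a cleanup would remove all of them once a week has passed without any of them being used again.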

Download

There is almost nothing to download. You may access the SVN repository with your favorite SVN client or with some web browser directly: https://svn.schlittermann.de/pub/exigrey/trunk/. Everything you need from there is the script exim-exigrey.pl and the exigrey utility for the maintenance.

Or you might download the current snapshot (updated once per hour) from here: http://pu.schlittermann.de/var/exigrey.tar.gz.

A Debian package is available too (it's far from perfect, but you might get the idea). If there is more demand, I'd polish the package (docs, examples, etc.). The packaged version is available from: http://apt.schlittermann.de/debian-ius/pool/exigrey/.

Open Issues

~~locking~~ solved!

Other known issues

One issue is not specific to this approach, nor even to greylisting, but to returning temporary errors in general: some Exchange 2003 servers seem to have problems with 4xx responses. (I do not want to discuss here whether Exchange is a useful MTA platform if it gets confused by 4xx responses. Rather, I'd like to "kill" those Exchange admins who try to explain why greylisting is a bad thing. They do not understand that temporary errors can happen at *any* time, not just because of greylisting.) But you may enjoy the following article: http://blogs.technet.com/dmelanchthon/archive/2007/07/19/probleme-mit-greylisting.aspx