POPFile Spam Filtering for Fetchmail on Ubuntu
The power outage earlier in the week managed to coincide with an email server problem. I worked through the original problem, but it drew my attention to the email server and made me think about the things I had on my fix-this-when-you-can list.
Attention is often like that. I've said repeatedly about marketing that if you get the attention of your customers, you'd better be prepared for it. If they've been irritated with your service and you send them a flyer or a bill, you've just given them a reminder to call and complain.
Anyway, since moving to the Linux mail server, I haven't had POPFile doing my spam filtering, and I've really missed it. I used to run it on my Windows mail server, but just never got around to setting it up on the Linux one.
So, last night, I decided to set it up. There were a couple of tricks to it that I thought were worth sharing.
First, a description of my setup going into this. The mail server itself is a VMware virtual machine running Ubuntu Server (which means no GUI, which is important later). Fetchmail grabs email from a bunch of POP3 mailboxes and funnels it into a Courier IMAP mailbox for me.
Since POPFile is implemented as a POP3 proxy, that Fetchmail retrieval was to be the injection point for getting this into the existing setup.
First, I installed POPFile, which is available via aptitude:
sudo aptitude install popfile
That installs the tool, as well as its web interface on port 7070 (check the docs, as this port has changed in the past). I opened a browser on the host machine and got a "connection refused" error. I even tried forwarding ports in and still got the message. I had to think about that for a while before I realized what was going on.
See, by default, POPFile only allows "local" access to the admin panel. That's a problem when the admin panel is web-based and the server has no GUI to run a browser in.
So, to get around that, I installed the "links" command-line browser:
sudo aptitude install links
I opened the control panel at http://localhost:7070 in links and changed the setting:
Accept HTTP (User Interface) connections from remote machines (requires POPFile restart)
to "Yes" and rebooted the server.
I was then able to open the panel at http://192.168.0.13:7070 from any computer on my network. That enabled me to set up my buckets (plenty of docs on the POPFile site for setting up this stuff) and get POPFile ready to start filtering.
I also shut off the subject modification for all of the buckets. More than once, I've seen someone reply to a message whose subject now reads "[junk] Are You Coming To The Project Meeting?" and someone gets upset.
Fortunately, if you leave the other indicators on, you can use the header that POPFile adds, X-Text-Classification, for any client-side filtering you may need to do.
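If you want to script against that header rather than rely on a mail client's filter rules, here is a minimal sketch of pulling the bucket name out of a single message file. The function name popfile_bucket is made up for illustration, and it assumes each message is stored as its own file (as in a Maildir) with the header spelled exactly as above.

```shell
# Print the POPFile bucket recorded in one message file.
# Only the header section is scanned (everything before the first blank
# line), so a matching line in the body is ignored.
# popfile_bucket is a hypothetical helper name, not part of POPFile.
popfile_bucket() {
    tr -d '\r' < "$1" \
        | sed -n -e '/^$/q' -e 's/^X-Text-Classification:[[:space:]]*//p' \
        | head -n 1
}
```

Calling popfile_bucket on a message file prints the bucket name, or nothing if the message never went through POPFile.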
At any rate, it was time to switch over my Fetchmail retrieval.
In my home directory, the .fetchmailrc file contains the entries for my POP3 accounts. For each account, you point Fetchmail at the POP3 proxy instead of the real server and fold the real server's name into the username, separated by a colon. New entries look like this:
poll 127.0.0.1 port 7071 protocol pop3 username "mail.wynia.org:j@wynia.org" password "PASSWORD"
That's it. The next time you run fetchmail manually or from a cron job, it should start pulling through POPFile and you'll see the messages show up in the history tab.
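With more than a couple of accounts, typing those entries by hand gets error-prone, so here is a rough sketch of generating them. The function name popfile_poll_entry is hypothetical, and the proxy address and port (127.0.0.1, 7071) are simply the ones from my setup above, so adjust them if yours differ.

```shell
# Build a .fetchmailrc entry that routes one POP3 account through POPFile.
# Arguments: real POP3 server, account user name, password.
# The proxy host and port are hard-coded to the values used in this post.
# popfile_poll_entry is a hypothetical helper name.
popfile_poll_entry() {
    printf 'poll 127.0.0.1 port 7071 protocol pop3 username "%s:%s" password "%s"\n' \
        "$1" "$2" "$3"
}

# Example: print the entry for one account.
popfile_poll_entry mail.wynia.org j@wynia.org PASSWORD
```

The output can be pasted into .fetchmailrc in place of the entry that pointed straight at the real server.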
I recommend letting it run for several hundred messages without doing anything else. All of those messages will go through as "unclassified". Then, you can spend 15 minutes going through and classifying all of them at once.
This cuts down dramatically on the wild swings between everything being marked as spam and nothing being marked as spam that can happen otherwise as you start using it.
So far, it's chugging along great and I'm back to dealing with much less spam again.

