On Fri, Jun 25, 2004 at 07:36:09PM +0100, Graham wrote:
On Friday 25 June 2004 19:04, Dennis Dryden wrote:
I'm getting about 98% spam at the moment and I was wondering if there was a quick and easy way to set up some spam-filtering software (like SpamAssassin). Will it involve setting up sendmail or anything scary like that?
What distro are you using, and what mail server is installed on it by default? Also, how do you get your email? Do you download it via POP3, IMAP, or some other method? To answer the question effectively it would be handy to know these things. That said, you have a few options: the first is to integrate it with your mail delivery agent; the second is to run it from a procmail (or similar) configuration when you collect your email; and the final option is to have your mail client run the program when it collects the email (I don't know if your software has this option).
If you're an 'ordinary' home user the simplest approach is to spam-bin anything not coming from someone in your address book. Quite easy to do if you're using KMail - details on request.
That sounds a bit crazy to me. Quite often I get email from people who are not in my address book...
I find Bayesian filters such as SpamAssassin and Bogofilter remove only about 80% of spam (I'm guessing - haven't checked lately), which still leaves a lot. If they can do better than that they're liable to tag wanted mail as spam from time to time, which can be inconvenient because it means wading through the stuff from time to time. Nonetheless it's worth adding one to do some of the grunt work. As with the above, you can add this to KMail on a client machine, or if you run your own mailserver you can do it there.
I find that with a well-trained SpamAssassin I get 0 false positives and perhaps 1 spam a day gets through. I just checked my spam folder, and it works out that I get on average 219 spams a day, so you are looking at a false positive rate of 0% and a failure rate of less than 0.5% (this has been the case for the more than 2 years I have been running SpamAssassin).
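For anyone who wants to check the arithmetic, here is a rough sketch of it. It assumes the 219 a day is what lands in the spam folder and that the one that slips through is on top of that; the variable names are just for illustration.

```python
# Figures quoted above; treating 219 as the caught count is an assumption.
caught_per_day = 219      # average number of spams in the spam folder per day
missed_per_day = 1        # "perhaps 1 spam a day gets through"
false_positives = 0       # wanted mail wrongly tagged as spam

total_spam = caught_per_day + missed_per_day
miss_rate = missed_per_day / total_spam

print(f"miss rate: {miss_rate:.2%}")   # prints "miss rate: 0.45%"
```

If the 219 already includes the missed message, the rate is 1/219 instead, which is still under 0.5%.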
What I tend to do is bin anything with a SpamAssassin score over 10 straight into the spam folder, as I have /never/ had a false positive that high (I could probably lower that value, too). Anything with a score between 5 and 10 gets put into another folder that gets checked once a day.
I would say that the best (and probably only) way to install SpamAssassin is to spend a month collecting and saving *all* of your spam and ham email, then get SpamAssassin running and immediately train it on that spam/ham lexicon. My spam lexicon now goes back two years, so I can effectively retrain it, or any other Bayesian anti-spam tool I wish to try; the current archive is about 100 megs with bzip compression. When I originally ran it without doing this first, I will admit that it did capture quite a bit of ham (mainly newsletters and the like, as they contained lots of spammy features).
The other thing you can of course do, if you get problems, is whitelist things so that they never get eaten.
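To make the score-based sorting and the whitelist idea concrete, here is a minimal sketch in Python. It is not how my own setup is wired; it only assumes that SpamAssassin has already tagged the message with an X-Spam-Status header containing "score=" (newer versions) or "hits=" (older 2.x versions). The function names, the folder names and the example address are made up for illustration.

```python
import re
from email.utils import parseaddr

# Pulls the numeric score out of an X-Spam-Status header value.
# Accepts both the "score=" and the older "hits=" spelling.
_SCORE_RE = re.compile(r"\b(?:score|hits)=(-?\d+(?:\.\d+)?)")


def spam_score(status_header):
    """Return the score from an X-Spam-Status value, or None if absent."""
    match = _SCORE_RE.search(status_header or "")
    return float(match.group(1)) if match else None


def choose_folder(status_header, from_header="", whitelist=frozenset(),
                  review_at=5.0, bin_above=10.0):
    """Pick a folder name (hypothetical names) for one message."""
    # Whitelisted senders are never filed as spam, whatever the score.
    _, address = parseaddr(from_header)
    if address.lower() in whitelist:
        return "inbox"
    score = spam_score(status_header)
    if score is None or score < review_at:
        return "inbox"
    if score > bin_above:
        return "spam"            # over 10: straight into the spam folder
    return "spam-review"         # 5 to 10: the folder checked once a day


if __name__ == "__main__":
    print(choose_folder("Yes, score=12.3 required=5.0"))
    print(choose_folder("Yes, hits=7.5 required=5.0"))
    print(choose_folder("Yes, score=12.3 required=5.0",
                        "News <news@example.org>", {"news@example.org"}))
```

The two thresholds are parameters so that the "over 10" cut-off can be lowered later without touching the logic.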
adam