SpamAssassin False Positives

Nick Heppleston

16 Feb 2004

11:14 a.m.

Morning ALUGers! I've got an interesting one with false positives in SpamAssassin and I'm hoping someone here might have used or even had the same problem.

I have a client who runs a precision instrument business for the scientific industry. They have a number of clients out in Japan and legitimate e-mails from them are being flagged as spam.

Our default score for spam is 4 and these messages are coming in ranked at 4.495, hence they are being flagged as spam. I'd be happy to change the setting to mark all messages with a score higher than 5 as spam for this client, but that wouldn't stop a small number of spam messages (10) I've received today alone in my own Inbox.

We have identified some phrases we believe might be causing the high ranking, but I don't actually know how to train SA that these phrases are OK and shouldn't be marked as spam. Apart from increasing (or is it decreasing) the spam ranking for the client's domain to 5, I can't see a way around this.

Any suggestions?

Nick

-- Nick Heppleston 07989 581766 | nickheppleston@gmx.co.uk

Graham Trott

16 Feb 2004

8 p.m.

On Monday 16 February 2004 11:13, Nick Heppleston wrote:

> Morning ALUGers! I've got an interesting one with false positives in SpamAssassin and I'm hoping someone here might have used or even had the same problem.
>
> I have a client who runs a precision instrument business for the scientific industry. They have a number of clients out in Japan and legitimate e-mails from them are being flagged as spam.
>
> Our default score for spam is 4 and these messages are coming in ranked at 4.495, hence they are being flagged as spam. I'd be happy to change the setting to mark all messages with a score higher than 5 as spam for this client, but that wouldn't stop a small number of spam messages (10) I've received today alone in my own Inbox.
>
> We have identified some phrases we believe might be causing the high ranking, but I don't actually know how to train SA that these phrases are OK and shouldn't be marked as spam. Apart from increasing (or is it decreasing) the spam ranking for the client's domain to 5, I can't see a way around this.
>
> Any suggestions?
>
> Nick

Although I have SpamAssassin working with Postfix, I can't figure out how to train it. So I'm using bogofilter on the client, as a KMail filter, to catch the increasing amount of stuff SpamAssassin lets through. Bogofilter is easy to train (just throw a couple of KMail folders at it), catches most stuff and hasn't yet given me a single false positive. I can thoroughly recommend it as a client-side solution.

Nick's message prompted me to see whether I could use it on the server in place of SpamAssassin. A quick search came up with

http://cvs.sourceforge.net/viewcvs.py/bogofilter/bogofilter/doc/integrating-with-postfix?rev=1.3

I intend to have a go with it - has anyone else already done so?

-- GT

Chris Glover

9:20 p.m.

To train SpamAssassin, type

sa-learn --spam --mbox <mboxfilename>

or

sa-learn --ham --mbox <mboxfilename>

This works with standard Unix mail spool files (/var/spool/mail). It supports other formats as well.

man sa-learn is your friend.
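
For anyone wanting to script this, here is a minimal sketch of a wrapper around the two commands above. The function name train_mbox and the SA_LEARN override variable are assumptions made for illustration and are not part of SpamAssassin; only sa-learn and its --spam, --ham and --mbox options come from the commands above. Bear in mind that the Bayes classifier is not used until it has learned a minimum number of both ham and spam messages (200 of each by default in current releases), so a handful of messages will not change any scores.

```shell
#!/bin/sh
# Sketch only: train_mbox and SA_LEARN are illustrative names.
# Set SA_LEARN=echo to preview the command instead of running it.
SA_LEARN="${SA_LEARN:-sa-learn}"

# train_mbox spam|ham <mboxfile>
train_mbox() {
    kind=$1
    mbox=$2
    case "$kind" in
        spam|ham) ;;
        *) echo "usage: train_mbox spam|ham <mboxfile>" >&2; return 2 ;;
    esac
    if [ ! -r "$mbox" ]; then
        echo "train_mbox: cannot read $mbox" >&2
        return 1
    fi
    # Word splitting of $SA_LEARN is intentional so it can hold a command.
    $SA_LEARN "--$kind" --mbox "$mbox"
}

# Example (paths are placeholders):
#   train_mbox spam "$HOME/mail/caught-spam"
#   train_mbox ham  "$HOME/mail/japan-customers"
```

Training on a folder of the legitimate Japanese mail as ham is the part that addresses the false positives; the spam folder keeps the two sides balanced.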

HTH

Chris

-- Chris ---------------------------------- E Mail chris@glovercc.clara.co.uk

Nick Heppleston

17 Feb 2004

10:47 a.m.

Many thanks for the feedback. Unfortunately, they forward all their mail onto a second SMTP server - Exchange!! So they're not holding the mail in mbox format.

Any suggestions given this quirk?

Nick

On Mon, 2004-02-16 at 21:19, Chris Glover wrote:

> To train SpamAssassin, type
>
> sa-learn --spam --mbox <mboxfilename>
>
> or
>
> sa-learn --ham --mbox <mboxfilename>
>
> This works with standard Unix mail spool files (/var/spool/mail). It supports other formats as well.
>
> man sa-learn is your friend.
>
> HTH
>
> Chris

-- Nick Heppleston 07989 581766 | nickheppleston@gmx.co.uk

adam＠thebowery.co.uk

11:49 a.m.

On Tue, Feb 17, 2004 at 10:46:02AM +0000, Nick Heppleston wrote:

> Many thanks for the feedback. Unfortunately, they forward all their mail onto a second SMTP server - Exchange!! So they're not holding the mail in mbox format.
>
> Any suggestions given this quirk?

You need to do 3 or 4 things.

First off, you should perhaps only ditch mail with a score above 10 in SpamAssassin, just to make sure that you are getting no false positives.

Then you want to send all mail with a score between 4 and 10 on to the end user with a big subject tag of "POSSIBLE SPAM" or whatever, so that they can sort or filter this mail themselves.

Then you want to create 2 accounts on the Exchange server, notspam@ and spam@, so that people can forward mail to them and have it bounced back into the mail system at the beginning. notspam@ forwards to something that removes the SpamAssassin tagging and learns the mail as ham; the other one just learns the mail as spam and ditches it.
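
The score bands and the two feedback mailboxes could be sketched as two small shell functions. This is only an illustration: the function names, the TAG_AT, DROP_ABOVE, SA_LEARN and SA_STRIP variables, and the idea that the mail server pipes each forwarded message into learn_feedback are all assumptions, and how spam@ and notspam@ get wired to a script depends on the mail setup. The commands used are spamassassin -d (remove SpamAssassin markup) and sa-learn --spam / --ham reading one message on standard input.

```shell
#!/bin/sh
# Sketch only; names and thresholds are illustrative.
TAG_AT="${TAG_AT:-4}"            # tag as possible spam from this score up
DROP_ABOVE="${DROP_ABOVE:-10}"   # ditch only mail scoring above this
SA_LEARN="${SA_LEARN:-sa-learn}"
SA_STRIP="${SA_STRIP:-spamassassin -d}"

# classify_score <score>  ->  prints deliver, tag or discard
classify_score() {
    awk -v s="$1" -v tag="$TAG_AT" -v drop="$DROP_ABOVE" 'BEGIN {
        if (s + 0 > drop + 0)       print "discard"
        else if (s + 0 >= tag + 0)  print "tag"
        else                        print "deliver"
    }'
}

# learn_feedback spam|notspam   (one message on standard input)
learn_feedback() {
    case "$1" in
        # spam@: learn as spam; the message is not delivered anywhere
        spam)    $SA_LEARN --spam ;;
        # notspam@: strip the SpamAssassin markup, then learn as ham
        notspam) $SA_STRIP | $SA_LEARN --ham ;;
        *) echo "usage: learn_feedback spam|notspam" >&2; return 2 ;;
    esac
}
```

With the figure from this thread, classify_score 4.495 prints tag, so the Japanese customers' mail would arrive with a subject marker rather than being discarded. One caveat: a message forwarded inline from a mail client usually loses its original headers, so forwarding as an attachment (and unpacking it before learning) is likely to train better.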

Oh, and I guess that SpamAssassin is picking up on emails from Japan because they are written in kanji or something, and so giving them a higher score. If you look at which tests fired on these mails, you could configure SpamAssassin to give a lower score for those tests.
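
As an illustration of lowering the scores for particular tests, here is a minimal sketch that writes a small override file. The function name write_overrides, the output path, the example domain and the rule names passed in are all placeholders: the real rule names have to be read from the SpamAssassin report on one of the misclassified mails. The directives themselves (ok_locales, whitelist_from and score) are standard SpamAssassin configuration options.

```shell
#!/bin/sh
# Sketch only: write_overrides and its arguments are illustrative.
# write_overrides <outfile> <sender-domain> <score> <rule>...
write_overrides() {
    out=$1
    domain=$2
    new_score=$3
    shift 3
    {
        echo "# Treat Japanese character sets as expected, not foreign"
        echo "ok_locales en ja"
        echo "# Trust mail claiming to come from this customer domain"
        echo "whitelist_from *@$domain"
        # One score line per rule that fired on the legitimate mail
        for rule in "$@"; do
            echo "score $rule $new_score"
        done
    } > "$out"
}

# Example with placeholder names:
#   write_overrides ./japan.cf customer.example.jp 0.1 RULE_ONE RULE_TWO
```

The resulting lines would be merged into local.cf or the relevant user_prefs, and spamassassin --lint can check the syntax afterwards. Since whitelist_from trusts a From address that can be forged, whitelist_from_rcvd is the stricter alternative.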

Adam

-- jabberid = quinophex@jabber.earth.li AFFS || http://www.affs.org.uk/ || Not a filesystem
