Hello,
I might need to join the spamassassin mailing list for this Q, but just in case anyone here can help first:
I've got a mail set-up where exim4 hands mail to spamassassin before delivering to mailboxes local on the server. Users put any missed spam into missed-spam folders, and misfiled ham into missed-ham folders, and a cron job runs regularly to allow sa-learn to run learning from these folders.
The problem is - it says it is learning, but it isn't. It is letting through handfuls of the same spam over and over. It's as if SA is running without paying any attention to the bayes-db, which would be weird as that is what I thought was a core integral part of spamassassin. Am I missing something in the basic setup?
An example header of a missed spam shows something like:
X-Spam_report: Spam detection software, running on the system "example.co.uk", has NOT identified this incoming email as spam.
-then a few lines further down:
3.5 BAYES_99 BODY: Bayes spam probability is 99 to 100% [score: 1.0000]
0.2 BAYES_999 BODY: Bayes spam probability is 99.9 to 100% [score: 1.0000]
Any ideas? It's driving me nuts.
Thanks, Jenny
On 25/02/2021 08:47, Jenny Hopkins wrote:
Hi!
Sometimes incoming mail carries fake SpamAssassin headers to try to fool you into thinking it's not spam. I don't suppose that's what is happening here.
In /etc/spamassassin:
Has v320.pre got
    loadplugin Mail::SpamAssassin::Plugin::Bayes
Has v310.pre got
    # AutoLearnThreshold - threshold-based discriminator for Bayes auto-learning
    #
    loadplugin Mail::SpamAssassin::Plugin::AutoLearnThreshold
Has local.cf got
    use_bayes 1
    bayes_auto_learn 1
if so then I guess it should work.
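The three checks above can be scripted. This is only a sketch: it assumes the /etc/spamassassin layout mentioned above, and the function names (has_active_line, check_sa_config) are made up for illustration. It only tests that each line is present and not commented out; it says nothing about whether Bayes is working.

```shell
#!/bin/sh
# has_active_line FILE REGEX (hypothetical helper):
# succeeds if FILE contains a non-commented line that starts with REGEX.
has_active_line() {
    grep -Eq "^[[:space:]]*$2" "$1" 2>/dev/null
}

# check_sa_config [DIR] (hypothetical helper): checks the settings listed above.
# Prints one line per missing setting and returns non-zero if any is missing.
check_sa_config() {
    sa_dir="${1:-/etc/spamassassin}"
    sa_rc=0
    has_active_line "$sa_dir/v320.pre" \
        'loadplugin[[:space:]]+Mail::SpamAssassin::Plugin::Bayes([[:space:]]|$)' \
        || { echo "v320.pre: Bayes plugin not loaded"; sa_rc=1; }
    has_active_line "$sa_dir/v310.pre" \
        'loadplugin[[:space:]]+Mail::SpamAssassin::Plugin::AutoLearnThreshold([[:space:]]|$)' \
        || { echo "v310.pre: AutoLearnThreshold plugin not loaded"; sa_rc=1; }
    has_active_line "$sa_dir/local.cf" 'use_bayes[[:space:]]+1([[:space:]]|$)' \
        || { echo "local.cf: use_bayes 1 not set"; sa_rc=1; }
    has_active_line "$sa_dir/local.cf" 'bayes_auto_learn[[:space:]]+1([[:space:]]|$)' \
        || { echo "local.cf: bayes_auto_learn 1 not set"; sa_rc=1; }
    return "$sa_rc"
}

# Usage:
#   check_sa_config /etc/spamassassin && echo "Bayes settings look active"
```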
I seem to remember there's some sort of "gotcha" about global or per-user filtering. As the main user of email on this machine, I'm really the only one reporting spam & ham. Spamassassin runs as a daemon as root.
I report spam using something similar to
    cd /home/USERACCOUNT/mail
    sudo sa-learn --mbox --spam SPAM_TRAINING_FOLDER
(replace --mbox with the appropriate option if your folders aren't in mbox format)
Because I've sudo-ed it, it goes to the root's training database.
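One way to see which database a given sa-learn run actually touches is to compare the message counters before and after learning. `sa-learn --dump magic` prints the database's counters, including nspam and nham; the sketch here just pulls those two numbers out. The function name bayes_counts is made up, and the column layout is assumed from typical output, so treat it as a starting point rather than a definitive check.

```shell
#!/bin/sh
# bayes_counts (hypothetical helper): reads `sa-learn --dump magic` output on
# stdin and prints the number of learned spam and ham messages.
# Assumes lines of the form:
#   0.000          0        208          0  non-token data: nspam
bayes_counts() {
    awk '$NF == "nspam" { spam = $3 }
         $NF == "nham"  { ham  = $3 }
         END { printf "nspam=%d nham=%d\n", spam, ham }'
}

# Usage: run once as root and once as whichever user the daemon runs as,
# then again after learning, and see which set of numbers moves:
#   sa-learn --dump magic | bayes_counts
```

If the counters only change for a user that the daemon is not running as, the learning is probably going into a database the daemon never reads.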
I hope that's of help. If not, good luck!
Steve
On Thu, 25 Feb 2021 at 17:58, steve-ALUG@hst.me.uk wrote:
I got very excited when I saw a reply, but unfortunately the answer to all of the above is yes.
Hmm - I have a feeling my sa-learn is running as the user of the mailbox. I'll check that out in the morning.
I'll let you know the outcome of changing sa-learn to run under sudo.
Many thanks, Steve!
Jenny
On 25/02/2021 18:42, Jenny Hopkins wrote:
A quick Google: if you want to run it as a per-user system, this post suggests how: https://www.nesono.com/node/391
Whereas this one starts from "Bayes is not working": https://stackoverflow.com/questions/42707466/spamassassin-bayes-not-working
It suggests checking the logs for errors and, in the case discussed, points to spamassassin components running with insufficient permissions. That is plausible here too. Checking the log files is always a good place to start!
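As a rough sketch of that log check: the function name is made up, the list of trouble words is a guess rather than a list of real SpamAssassin messages, and the log path in the usage comment is an assumption that depends on how syslog is set up.

```shell
#!/bin/sh
# bayes_log_problems FILE (hypothetical helper): prints log lines that mention
# bayes together with a word that usually signals trouble.
bayes_log_problems() {
    grep -i 'bayes' "$1" | grep -Ei 'cannot|denied|failed|error|lock'
}

# Usage (adjust the path to wherever your mail log lives):
#   bayes_log_problems /var/log/mail.log
```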
Good luck.
Steve
Hi
Can you post a set of message headers here, anonymised? All I'm really interested in is the headers related to spamassassin.
It looks like you might not be getting a high enough score for the message to be treated as spam. Do you get a total score?
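To illustrate the point about the total: the two Bayes rules quoted in the original message add up to 3.5 + 0.2 = 3.7, which on its own is below SpamAssassin's default required_score of 5.0, so the other rules in the report decide whether the message gets over the line. A small sketch for adding up the per-rule scores from a pasted report follows; the function name is made up and the line format is assumed from the report lines quoted earlier.

```shell
#!/bin/sh
# sum_rule_scores (hypothetical helper): reads report lines such as
#   3.5 BAYES_99 BODY: Bayes spam probability is 99 to 100%
# on stdin and prints the sum of the leading scores.
# Lines that do not start with "<number> <RULE_NAME>" are ignored.
sum_rule_scores() {
    awk '$1 ~ /^-?[0-9]+(\.[0-9]+)?$/ && $2 ~ /^[A-Z][A-Z0-9_]*$/ { total += $1 }
         END { printf "%.1f\n", total }'
}

# Usage: paste the report lines into a file and run
#   sum_rule_scores < report.txt
```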
On Thu, 25 Feb 2021 at 22:56, Ben Whyall ben@whyall-systems.co.uk wrote:
Hello,
Thanks so much for the responses. I've pasted up everything I know here, including headers: https://pastebin.com/nvZjqEzL
It looks as though I already changed sa-learn to run as root.
The last entry: with three spams placed in missed-spam, running the script reports 0 learned and then deletes them (so it is looking in the right folder). As for the bayes_journal error message, all I could find out is that the file is created on the fly.
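A note on the "0 learned" result: sa-learn keeps a record of messages it has already learned from and skips them unless told to forget them, so a run that examines 3 messages and learns from 0 can simply mean those messages had been learned before (for example by auto-learning or an earlier run). Whether that is what is happening here is only a guess. The sketch below splits sa-learn's summary line into its parts; the function name is made up and the line format is assumed to be the usual "Learned tokens from N message(s) (M message(s) examined)".

```shell
#!/bin/sh
# learn_summary (hypothetical helper): reads sa-learn's summary line on stdin,
# assumed to look like
#   Learned tokens from 0 message(s) (3 message(s) examined)
# and prints how many messages were learned, examined and skipped.
learn_summary() {
    sed -n 's/^Learned tokens from \([0-9][0-9]*\) message(s) (\([0-9][0-9]*\) message(s) examined).*/\1 \2/p' |
    while read -r learned examined; do
        echo "learned=$learned examined=$examined skipped=$((examined - learned))"
    done
}

# Usage (folder name as in the cron script; adjust to suit):
#   sa-learn --spam missed-spam | learn_summary
```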
Thanks, Jenny
On 26/02/2021 10:01, Jenny Hopkins wrote:
Interesting. Your database is in a different place to mine. Mine's in /root/.spamassassin.
Also, I'm not using sa-exim but tweaks inside the exim config file. The sa-exim website says it hasn't been maintained since 2006.
See here: http://marc.merlins.org/linux/exim/sa.html That page also describes other ways of integrating spamassassin directly into exim.
Looking at this https://github.com/docker-mailserver/docker-mailserver/issues/365
I surmise that your spamassassin is running as user spamassassin. I think the alternative is running it as root.
Look at
sudo ps -Af | grep spamassassin
what user is it running as?
If it's NOT running as root, then sudo sa-learn will probably not work, because it will update root's database rather than the spamassassin user's one in /var/lib/spamassassin.
If you're going to continue with the non-root user (I'm guessing it's "spamassassin"), then you'll have to adjust your learn script, perhaps to something like
    sudo -u spamassassin -H sa-learn .....
or use sa-learn --dbpath SOMEPATH
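A slightly more targeted version of the ps check, as a sketch: it lists only the user names that own a process whose command line mentions spamd. The function name is made up, and matching on "spamd" is an assumption about how the daemon shows up in the process list.

```shell
#!/bin/sh
# spamd_users_from_ps (hypothetical helper): reads "USER COMMAND..." lines on
# stdin and prints the distinct users running something that looks like spamd.
spamd_users_from_ps() {
    awk '{ user = $1; $1 = ""; if ($0 ~ /spamd/) print user }' | sort -u
}

# Usage on a live system:
#   ps -eo user= -o args= | spamd_users_from_ps
#
# If a non-root user turns up, learning could then be pointed at that user's
# database. --dbpath takes the path in bayes_path form (directory plus file
# name prefix); SOMEPATH is left as a placeholder here:
#   sudo sa-learn --dbpath SOMEPATH --spam SPAM_TRAINING_FOLDER
```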
Hmmm....
puzzling
Steve
Hi
So it looks like there is a problem writing to the bayes_journal, I'm guessing because of the user that spamd is running as.
However, it's definitely running Bayes on that message, as it has given it a score of .492334.
It's also only just missing the threshold for being detected as spam, scoring 4.6 against a required 4.9.
Looking at your config, you haven't changed any of the rule scores.
It may help to adjust the X-Spam-Status header so that it shows the score of each rule as well.
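To make near misses like this easier to spot, a sketch that pulls the score and threshold out of a status header and prints the gap. The function name is made up, and the header is assumed to use the common "score=... required=..." layout; headers added by exim itself may be laid out differently.

```shell
#!/bin/sh
# spam_margin (hypothetical helper): reads a header line such as
#   X-Spam-Status: No, score=4.6 required=4.9 tests=...
# on stdin and prints how far the score is from the threshold.
spam_margin() {
    sed -n 's/.*score=\(-\{0,1\}[0-9.]*\).*required=\(-\{0,1\}[0-9.]*\).*/\1 \2/p' |
    awk '{ printf "score=%.1f required=%.1f margin=%.1f\n", $1, $2, $2 - $1 }'
}

# Usage:
#   grep -i '^X-Spam-Status:' some-message | spam_margin
```

A small positive margin on spam that keeps getting through suggests that nudging one rule score, or the threshold itself, might be enough.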
Ben
Hello All,
So I sat down and tried making spamassassin write to individual databases in /home/$user/spamassassin. I followed this guide that Steve suggested: https://www.nesono.com/node/391
I adapted it to fit the Debian structure, i.e. none of the vhome thing. However, it all fell over when I tried to amend the entry in /etc/default/spamassassin as suggested:
    OPTIONS="--create-prefs --max-children 5 --helper-home-dir --virtual-config-dir=/vhome/users/%u/spamassassin -x -u vmail"
Again, I tried to adapt this for Debian, but %u wasn't picked up, so sa-learn couldn't locate the dbs. It also fell over when I tried to tell spamassassin that the bayes_path was in each home directory.
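The %u placeholder appears to be something spamd expands for --virtual-config-dir, not something sa-learn or the shell will fill in, so a learning script would probably have to spell out each user's path itself. A sketch of doing that by hand follows. Everything in it is an assumption: the helper names are made up, the database prefix "bayes" under /home/USER/spamassassin follows the per-user layout described above, and the missed-spam folder path is a guess. It only prints the command so it can be checked before anything is run.

```shell
#!/bin/sh
# expand_user TEMPLATE USER (hypothetical helper): replaces every %u in
# TEMPLATE with USER, mimicking by hand what the %u placeholder stands for.
expand_user() {
    printf '%s\n' "$1" | sed "s|%u|$2|g"
}

# learn_command USER (hypothetical helper): prints, without running it, an
# sa-learn call for one user's missed-spam folder and per-user database.
# Both paths are assumptions; adjust them to the real layout.
learn_command() {
    db=$(expand_user '/home/%u/spamassassin/bayes' "$1")
    echo "sa-learn --dbpath $db --spam /home/$1/mail/missed-spam"
}

# Usage: print the command for each user and inspect it first
#   for u in alice bob; do learn_command "$u"; done
```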
Any ideas before I completely give up?
Many thanks for your continued patience. Jenny
On Sat, 27 Feb 2021 at 22:23, Ben Whyall ben@whyall-systems.co.uk wrote: