So it looks like there is a problem writing to the bayes_journal, I am guessing because of the user that spamd is running as.
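
A rough way to check that guess is to compare who owns the Bayes files with the user the spamd processes run as. This is only a sketch: the directory is an assumption (the default per-user location, ~/.spamassassin), so point BAYES_DIR at wherever bayes_path in your config actually puts the files. owner_of is just a made-up helper name, and the stat syntax is the GNU one you get on Debian.

```shell
#!/bin/sh
# Hypothetical helper: print the owning user of a file (GNU stat syntax).
owner_of() {
    stat -c '%U' "$1"
}

# Assumed location: the default per-user Bayes directory.
# Override BAYES_DIR to match bayes_path in your own configuration.
BAYES_DIR="${BAYES_DIR:-$HOME/.spamassassin}"

for f in "$BAYES_DIR"/bayes_*; do
    [ -e "$f" ] || continue    # nothing matched the glob, skip
    printf '%s\towner=%s\n' "$f" "$(owner_of "$f")"
done

# Show the user each spamd process runs as, for comparison.
# The [s] keeps grep from matching its own command line.
ps -eo user,args | grep '[s]pamd'
```

If the owner of the bayes_* files is not the user shown for spamd, that would fit the journal write error.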

However, it's definitely running Bayes on that message, as it has given it a score of .492334.

It's also only just missing the threshold to be detected as spam, scoring 4.6 instead of 4.9.

Looking at your config, you haven't changed any of the rule scores.

It may help to adjust the X-Spam-Status header so that it includes the score for each rule too.
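
In the meantime, one way to see per-rule scores for a single message is to run it through spamassassin by hand in test mode. Treat this as a sketch: missed-spam.eml is a hypothetical file name for a saved copy of one missed spam (headers included), and the result only matches what spamd does if you run it as the same user, with the same config and Bayes database.

```shell
#!/bin/sh
# Hypothetical file name: a saved copy of one missed spam, headers included.
MSG="${MSG:-missed-spam.eml}"

if command -v spamassassin >/dev/null 2>&1 && [ -r "$MSG" ]; then
    # -t (test mode) appends the full report, one line per rule with its score.
    # -D bayes writes Bayes debug output to stderr; it is saved separately here.
    spamassassin -D bayes -t < "$MSG" > report.txt 2> bayes-debug.txt
    STATUS=ran
else
    echo "spamassassin or $MSG not found; nothing to do" >&2
    STATUS=skipped
fi
```

report.txt should then end with the content analysis, listing each rule that hit and its score, and bayes-debug.txt should show whether the Bayes database could be opened.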

Ben

On 26/02/2021 10:01:43, Jenny Hopkins <hopkins.jenny@gmail.com> wrote:
On Thu, 25 Feb 2021 at 22:56, Ben Whyall wrote:
>
> Hi
>
> Can you post a set of message headers here, anonymised? All I'm really interested in is the headers related to SpamAssassin.
>
> It looks like you might not be getting a high enough score for the message to be treated as spam. Do you get a total score?
>
>
> On 25/02/2021 18:43:35, Jenny Hopkins wrote:
>
> On Thu, 25 Feb 2021 at 17:58, wrote:
> >
> > On 25/02/2021 08:47, Jenny Hopkins wrote:
> > > Hello,
> > >
> > > I might need to join the spamassassin mailing list for this Q, but
> > > just in case anyone here can help first:
> > >
> > > I've got a mail set-up where exim4 hands mail to spamassassin before
> > > delivering to mailboxes local on the server. Users put any missed
> > > spam into missed-spam folders, and misfiled ham into missed-ham
> > > folders, and a cron job runs regularly to allow sa-learn to run
> > > learning from these folders.
> > >
> > > The problem is - it says it is learning, but it isn't. It is letting
> > > through handfuls of the same spam over and over. It's as if SA is
> > > running without paying any attention to the bayes-db, which would be
> > > weird as that is what I thought was a core integral part of
> > > spamassassin. Am I missing something in the basic setup?
> > >
> > > An example header of a missed spam shows something like:
> > >
> > > X-Spam_report: Spam detection software, running on the system "example.co.uk",
> > > has NOT identified this incoming email as spam.
> > >
> > > -then a few lines further down:
> > >
> > > 3.5 BAYES_99 BODY: Bayes spam probability is 99 to
> > > 100% [score: 1.0000]
> > > 0.2 BAYES_999 BODY: Bayes spam probability is 99.9 to
> > > 100% [score: 1.0000]
> > >
> > > Any ideas? It's driving me nuts.
> >
> >
> > Hi!
> >
> > Sometimes incoming mail has fake Spamassassin headers to try and fool
> > you that it's not spam. I don't suppose that's the case in this case.
> >
> > in /etc/spamassassin
> >
> > has v320.pre got
> > loadplugin Mail::SpamAssassin::Plugin::Bayes
> >
> >
> > has v310.pre got
> >
> > # AutoLearnThreshold - threshold-based discriminator for Bayes auto-learning
> > #
> > loadplugin Mail::SpamAssassin::Plugin::AutoLearnThreshold
> >
> >
> > has local.cf got
> >
> > use_bayes 1
> > bayes_auto_learn 1
> >
> > if so then I guess it should work.
> >
>
> I got very excited when I saw a reply, but unfortunately the answer to
> all of the above is yes.
>
> > I seem to remember there's some sort of "gotcha" about global or per-user
> > filtering.
> > As the main user of email on this machine, I'm really the only one
> > reporting spam & ham.
> > Spamassassin runs as a daemon as root.
> >
> > I report spam using something similar to
> >
> > cd /home/USERACCOUNT/mail
> > sudo sa-learn --mbox --spam SPAM_TRAINING_FOLDER
> >
> > (replace mbox with other parameter if not using mbox format)
> >
> > Because I've sudo-ed it, it goes to root's training database.
> >
>
> Hmm - I have a feeling my sa-learn is running as the user of the
> mailbox. I'll check that out in the morning.
>
> >
> > I hope that's of help. If not, good luck!
> >
>
Hello,

Thanks so much for the responses. I've pasted up everything I know
here, including headers:
https://pastebin.com/nvZjqEzL

It looks as though I already changed sa-learn to run as root.

The last entry - with three spams placed in missed-spam then running
the script, it reports 0 learning then deletes them (so it's looking
in the right folder).
The bayes_journal - I tried finding out about that error message and
could only come up with that it was created on the fly.

Thanks,
Jenny