Hi folks,
My *buntu server runs a mail server. At 00:00 on some days, it complains that it can't reach the spam analyser or the virus analyser, even though both are still running and, as far as I can tell, always have been.
Other than staying up to see what's running at midnight, is there a way of monitoring what's running at a particular time? I know you can see what's slowing things down at boot time using "systemd-analyze blame", or using pybootchartgui. Is there a way to do something like this around midnight?
Also, off the top of your collective heads, can you think of any services that update or hog the processor at 00:00?
How are scheduled tasks started, so I can have a root around and see what's happening?
I can think of various ways: 1) a program (daemon) that runs constantly and may update at a specific time, 2) crontab, 3) "at" perhaps. Any others?
Any comments or advice appreciated.
Steve
On 23/02/2021 18:15, steve-ALUG@hst.me.uk wrote:
I can think of various ways.
- A program (daemon) can constantly run, and may update at a specific time.
- crontab
- "at" perhaps.
Any others?
I've also found/remembered
/etc/cron.d /etc/cron.daily /etc/cron.hourly /etc/cron.monthly /etc/cron.weekly
I've also thought that logrotate may be running...
Any thoughts?
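A minimal sketch for walking the system-wide cron locations listed above. The function name `list_cron_sources` and its optional directory argument are illustrative assumptions, not part of cron; the argument exists only so the function can be tried against a scratch directory.

```shell
#!/bin/sh
# Print every file cron would look at under an etc-style directory.
# Defaults to /etc; pass another directory to experiment safely.
list_cron_sources() {
    etc_dir="${1:-/etc}"
    for path in "$etc_dir/crontab" \
                "$etc_dir/cron.d" "$etc_dir/cron.hourly" "$etc_dir/cron.daily" \
                "$etc_dir/cron.weekly" "$etc_dir/cron.monthly"; do
        if [ -f "$path" ]; then
            # The main system crontab is a single file.
            echo "$path"
        elif [ -d "$path" ]; then
            # The cron.* entries are directories of job files.
            find "$path" -type f | sort
        fi
    done
}
```

Calling `list_cron_sources` with no argument lists the real files under /etc. Per-user crontabs and queued "at" jobs live elsewhere; `sudo crontab -l -u USER` and `atq` should show those.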
On 23/02/2021 18:15, steve-ALUG@hst.me.uk wrote:
I can think of various ways.
- A program (daemon) can constantly run, and may update at a specific time.
- crontab
- "at" perhaps.
Any others?
Any comments or advice appreciated.
Hi, I had something similar, and as a quick'n'dirty I started a tmux and ran something like:

    while true ; do ps auxwww > $(date +"ps-%Y%m%d%H%M%S.log") ; sleep 30 ; done

This led me to finding a regular spamhammer causing >200 spamassassin processes to fork...
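The same idea can be sketched as a bounded loop, so it stops by itself instead of filling the disk. The function name `snapshot_processes` and its three arguments are assumptions made for illustration; only `ps auxwww`, `date` and `sleep` come from the one-liner above.

```shell
#!/bin/sh
# Take COUNT snapshots of the process table, INTERVAL seconds apart,
# writing one timestamped file per snapshot into OUT_DIR.
snapshot_processes() {
    out_dir="$1"
    count="$2"
    interval="$3"
    mkdir -p "$out_dir" || return 1
    i=0
    while [ "$i" -lt "$count" ]; do
        # The counter suffix keeps names unique even within one second.
        ps auxwww > "$out_dir/ps-$(date +%Y%m%d%H%M%S)-$i.log"
        i=$((i + 1))
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}
```

For example, `snapshot_processes /var/tmp/ps-snaps 120 30` started a little before midnight would cover roughly an hour at 30-second intervals.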
On Tue, 23 Feb 2021 18:15:26 +0000 steve-ALUG@hst.me.uk allegedly wrote:
My *buntu server runs a mail server. At 00:00 on some days, it complains that it can't reach the spam analyser or the virus analyser, even though both are still running and, as far as I can tell, always have been.
Other than staying up to see what's running at midnight, is there a way of monitoring what's running at a particular time? I know you can see what's slowing things down at boot time using "systemd-analyze blame", or using pybootchartgui. Is there a way to do something like this around midnight?
Also, off the top of your collective heads, can you think of any services that update or hog the processor at 00:00?
Steve
What (if anything) does syslog say? Have you checked your (r)syslog.conf to see where cron (or any odd daemon) is logging?
Mick
--------------------------------------------------------------------- Mick Morgan gpg fingerprint: FC23 3338 F664 5E66 876B 72C0 0A1F E60B 5BAD D312 https://baldric.net/about-trivia ---------------------------------------------------------------------
Log rotation?
'grep cron /var/log/syslog' and look for entries around 0000
'sudo apt install sysstat' and use 'sar -q' to look at historic load average around 0000. This is lightweight and recommended on a server.
If all else fails, temporarily turn on "process accounting" and 'apt install acct'.
Google it, since I forget the details, but I remember it's heavyweight!
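A rough sketch of the "look for entries around 0000" step. It assumes the traditional syslog line format, where the third field is the time (for example "Feb 22 00:37:01 host ..."); the function name `syslog_between` and its arguments are hypothetical.

```shell
#!/bin/sh
# Print lines of LOG_FILE whose HH:MM timestamp (third field) lies
# between START and END inclusive. The window may wrap past midnight.
syslog_between() {
    log_file="$1"
    start="$2"   # e.g. 23:55
    end="$3"     # e.g. 00:40
    awk -v s="$start" -v e="$end" '
        {
            t = substr($3, 1, 5)
            # Skip lines that do not carry a time in the third field.
            if (t !~ /^[0-9][0-9]:[0-9][0-9]$/) next
            if (s <= e) keep = (t >= s && t <= e)
            else        keep = (t >= s || t <= e)   # window wraps midnight
            if (keep) print
        }' "$log_file"
}
```

Something like `syslog_between /var/log/syslog 23:55 00:40 | grep -i cron` then narrows the view to cron activity. For load history, `sar -q -s 23:50:00 -e 23:59:59` restricts sar's report to a time window in a similar way.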
main@lists.alug.org.uk http://www.alug.org.uk/ https://lists.alug.org.uk/mailman/listinfo/main Unsubscribe? See message headers or the web site above!
On 24/02/2021 17:06, Steve Mynott wrote:
Log rotation?
'grep cron /var/log/syslog' and look for entries around 0000
'sudo apt install sysstat' and use 'sar -q' to look at historic load average around 0000. This is lightweight and recommended on a server.
I'll try this & see what happens. Thanks!
If all else fails, temporarily turn on "process accounting" and 'apt install acct'.
Google it, since I forget the details, but I remember it's heavyweight!
I shall save this one for a bit, as I have a few leads at the moment.
Steve
On 24/02/2021 14:17, mick wrote:
What (if anything) does syslog say? Have you checked your (r)syslog.conf to see where cron (or any odd daemon) is logging?
Mick
Thanks Mick,
That's a start. rsyslog.conf has no uncommented lines for cron, so cron logs to syslog.
Syslog says, the last time the error happened,
00:35 systemd[1]: Starting Discard unused blocks on filesystems from /etc/fstab...
00:36 exim4: ALERT: exim paniclog /var/log/exim4/paniclog has non-zero size, mail system possibly broken
00:36 systemd[1]: Starting exim4-base housekeeping...
00:36 systemd[1]: Starting Daily man-db regeneration...
00:37 systemd[1]: fstrim.service: Succeeded.
00:37 systemd[1]: Finished Discard unused blocks on filesystems from /etc/fstab.
00:37 exim[339807]: 2021-02-22 00:37 MAIL_HEADER_ID failed to write to main log: length=91 result=-1 errno=9 (Bad file descriptor)
00:37 exim[339807]: write failed on panic log: length=116 result=-1 errno=9 (Bad file descriptor)
00:37 systemd[1]: exim4-base.service: Succeeded.
00:37 systemd[1]: Finished exim4-base housekeeping.
00:37 systemd[1]: Starting Rotate log files...
So a shed-load of things were getting started. These seem to have been started by systemd using config files in /etc/systemd/system/timers.target.wants
Some of the systemd timers specify a time to run; the rest don't, and just say something like daily. The daily entries are started at 00:00, so I guess systemd is trying to run multiple jobs at the same time.
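A small sketch for picking out which timers use a bare "daily" schedule, which systemd treats as shorthand for *-*-* 00:00:00. The function name `daily_timers` is made up for illustration, and the default directory is the timers.target.wants path mentioned above.

```shell
#!/bin/sh
# List timer unit files in UNIT_DIR whose schedule is exactly "daily".
daily_timers() {
    unit_dir="${1:-/etc/systemd/system/timers.target.wants}"
    # -l prints only the names of matching files.
    grep -l -E '^OnCalendar=[[:space:]]*daily[[:space:]]*$' \
        "$unit_dir"/*.timer 2>/dev/null
}
```

`systemctl list-timers --all` gives the complementary view: the next and last run time of every timer, which also reflects any randomised delay a unit asks for.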
I shall try changing the OnCalendar=daily to OnCalendar=*-*-* 01:09:00 to see if it works better at a different time.
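One way to make that change without editing the packaged unit file is a drop-in override, sketched below. The function name `write_timer_override` and the scratch-directory argument are assumptions for illustration. The empty `OnCalendar=` line matters: without it, systemd adds the new time to the existing schedule rather than replacing it.

```shell
#!/bin/sh
# Write a drop-in that replaces a timer's schedule.
# ROOT defaults to /etc/systemd/system; pass a scratch dir to experiment.
write_timer_override() {
    timer="$1"        # e.g. logrotate.timer (example unit name)
    calendar="$2"     # e.g. "*-*-* 01:09:00"
    root="${3:-/etc/systemd/system}"
    dir="$root/$timer.d"
    mkdir -p "$dir" || return 1
    {
        echo '[Timer]'
        echo 'OnCalendar='             # clear the schedule from the unit file
        echo "OnCalendar=$calendar"    # then set the new one
    } > "$dir/override.conf"
}
```

After writing to the real location (as root), `systemctl daemon-reload` followed by `systemctl list-timers` should show the new next-run time. `systemctl edit NAME.timer` creates the same kind of drop-in interactively.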
Thanks for starting me in the right direction. I shall see where it leads.
I didn't know systemd ran scheduled tasks too!
Steve
On Wed, 24 Feb 2021 21:09:35 +0000 steve-ALUG@hst.me.uk allegedly wrote:
I didn't know systemd ran scheduled tasks too!
In my view systemd does a shit load of stuff it has no rational business being involved in. (DNS FFS?)
Feature creep on a gargantuan scale. (But then I'm a beardy old skool unix fan.)
Mick
On 24/02/2021 21:09, steve-ALUG@hst.me.uk wrote: [SNIP]
My change has successfully changed the restart time. Time will tell if this fixes my problem.
Steve
On 25/02/2021 19:24, steve-ALUG@hst.me.uk wrote:
On 24/02/2021 21:09, steve-ALUG@hst.me.uk wrote: [SNIP]
My change has successfully changed the restart time. Time will tell if this fixes my problem.
<chagrin mode>
OK, so having successfully managed to change the time, the next time I got an error it occurred at the newly scheduled time. Gosh, what's happening at that time, thinks me.
Then I think a bit more. The email's being generated at that time, but the error is happening at some other time. It would have helped if I had looked at the contents of the error email, and not the timestamp of the error email.
Numpty!
</chagrin mode>
Steve
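To read the failure times from the log itself rather than from the alert email, something like the sketch below may help. It assumes paniclog lines start with "YYYY-MM-DD HH:MM:SS", as exim log lines normally do; the function name `panic_hours` is hypothetical, and the default path is the one from the syslog excerpt earlier in the thread.

```shell
#!/bin/sh
# Count log entries per hour of day, using the timestamp at the
# start of each line ("YYYY-MM-DD HH:MM:SS ...").
panic_hours() {
    log_file="${1:-/var/log/exim4/paniclog}"
    awk '$1 ~ /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]$/ &&
         $2 ~ /^[0-9][0-9]:/ {
             # Keep only the hour, so entries bucket by hour of day.
             print substr($2, 1, 2) ":00"
         }' "$log_file" | sort | uniq -c
}
```

Each output line is a count followed by an hour bucket, so a cluster at one hour points at when the errors really occur, independent of when the alert was sent.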