[ALUG] Apache fell over

Brett Parker iDunno at sommitrealweird.co.uk
Wed Oct 24 13:50:03 BST 2007


On Wed, Oct 24, 2007 at 01:04:13PM +0100, Matthew wrote:
> I'm running Apache 2.2.3 on Debian Etch. It generally runs happily and I
> haven't seen any issues since I upgraded the hosting several months ago,
> but at the weekend it died and I can't work out why.
> 
> I have two things I was hoping people might be able to help with.
> 
> /var/log/apache2/error.log includes:
> [Sun Oct 21 06:26:54 2007] [warn] child process 11300 still did not exit,
> sending a SIGTERM
> [Sun Oct 21 06:26:56 2007] [warn] child process 11300 still did not exit,
> sending a SIGTERM
> [Sun Oct 21 06:26:58 2007] [warn] child process 11300 still did not exit,
> sending a SIGTERM
> [Sun Oct 21 06:27:00 2007] [error] child process 11300 still did not exit,
> sending a SIGKILL
> [Sun Oct 21 06:27:01 2007] [notice] caught SIGTERM, shutting down
> 
> The last row before that was from a week before, and shouldn't be part of
> the same incident. 6am is definitely not a peak time for this box. I
> couldn't find anything in any other logs that seemed relevant.
> 
> Anyone got any idea what that was, or where I can look for more detail?

Hmm, that looks like about the right time for a logrotate run, which
includes a graceful restart of Apache... I'd guess that there was
something making Apache very very unhappy - but it's hard to tell from
that!

> Secondly, I really ought to have some mechanism to check on Apache (and
> MySQL) periodically and make sure they're running, either restarting them
> or alerting me (although I can be difficult to get hold of).
> What do people recommend for this? Is it as simple as having cron run a
> script to start the daemon if it's not already on the stored PID, or are
> there issues I should be wary of?

I'd test the services as opposed to the PIDs that you think they should
have... throw a test page up on Apache with some known content and then
test against that every once in a while... I'm also not one for
automatically restarting services - it masks potential issues which then
may go unnoticed for a long time. We use Nagios + Clickatell (SMS
service) to monitor our systems - with a period between about 11.30pm
and 6.30am where SMSes aren't sent (thank god!).
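
As a rough illustration of the "test page with known content" idea, here
is a minimal Python sketch. The URL and the marker string are made up for
the example (they aren't from any real setup), and the exit codes just
follow the usual Nagios plugin convention of 0 for OK and 2 for CRITICAL.
It only reports, it doesn't restart anything.

```python
import http.client
import sys
import urllib.error
import urllib.request


def check_page(url, expected, timeout=10):
    """Fetch url and report whether the body contains the expected marker.

    Returns a (ok, detail) tuple. This tests the service itself rather
    than trusting a stored PID.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", errors="replace")
    except (urllib.error.URLError, http.client.HTTPException, OSError) as exc:
        # Covers refused connections, timeouts, HTTP errors and bad replies.
        return False, f"request failed: {exc}"
    if expected not in body:
        # The server answered, but not with the page we put there.
        return False, "page served but expected marker missing"
    return True, "ok"


if __name__ == "__main__":
    # Hypothetical values: point these at your own test page and marker.
    url = sys.argv[1] if len(sys.argv) > 1 else "http://localhost/alive.html"
    marker = sys.argv[2] if len(sys.argv) > 2 else "apache-alive-marker"
    ok, detail = check_page(url, marker)
    print(("OK" if ok else "CRITICAL") + " - " + detail)
    sys.exit(0 if ok else 2)
```

Run from cron or as a Nagios check command, something like this only
tells you that the page is or isn't being served correctly - what you do
with the alert is up to you.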

Hope that helps,
-- 
Brett Parker



