Apache fell over


Matthew

24 Oct 2007

1:04 p.m.

I'm running Apache 2.2.3 on Debian Etch. It generally runs happily and I haven't seen any issues since I upgraded the hosting several months ago, but at the weekend it died and I can't work out why.

I have two things I was hoping people might be able to help with.

/var/log/apache2/error.log includes:

[Sun Oct 21 06:26:54 2007] [warn] child process 11300 still did not exit, sending a SIGTERM
[Sun Oct 21 06:26:56 2007] [warn] child process 11300 still did not exit, sending a SIGTERM
[Sun Oct 21 06:26:58 2007] [warn] child process 11300 still did not exit, sending a SIGTERM
[Sun Oct 21 06:27:00 2007] [error] child process 11300 still did not exit, sending a SIGKILL
[Sun Oct 21 06:27:01 2007] [notice] caught SIGTERM, shutting down

The last row before that was from a week before, and shouldn't be part of the same incident. 6am is definitely not a peak time for this box. I couldn't find anything in any other logs that seemed relevant.

Anyone got any idea what that was, or where I can look for more detail?

Secondly, I really ought to have some mechanism to check on Apache (and MySQL) periodically and make sure they're running, either restarting them or alerting me (although I can be difficult to get hold of). What do people recommend for this? Is it as simple as having cron run a script to start the daemon if it's not already on the stored PID, or are there issues I should be wary of?

Thanks, Matthew


Brett Parker

24 Oct 2007

1:50 p.m.

On Wed, Oct 24, 2007 at 01:04:13PM +0100, Matthew wrote:

> I'm running Apache 2.2.3 on Debian Etch. It generally runs happily and I haven't seen any issues since I upgraded the hosting several months ago, but at the weekend it died and I can't work out why.
>
> I have two things I was hoping people might be able to help with.
>
> /var/log/apache2/error.log includes:
>
> [Sun Oct 21 06:26:54 2007] [warn] child process 11300 still did not exit, sending a SIGTERM
> [Sun Oct 21 06:26:56 2007] [warn] child process 11300 still did not exit, sending a SIGTERM
> [Sun Oct 21 06:26:58 2007] [warn] child process 11300 still did not exit, sending a SIGTERM
> [Sun Oct 21 06:27:00 2007] [error] child process 11300 still did not exit, sending a SIGKILL
> [Sun Oct 21 06:27:01 2007] [notice] caught SIGTERM, shutting down
>
> The last row before that was from a week before, and shouldn't be part of the same incident. 6am is definitely not a peak time for this box. I couldn't find anything in any other logs that seemed relevant.
>
> Anyone got any idea what that was, or where I can look for more detail?

Hmm, that looks at around the right time for a logrotate, which includes a graceful to apache... I'd guess that there was something making apache very very unhappy - but it's hard to tell from that!
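One way to check the logrotate theory is to compare the timestamps in the error log with the time the daily cron jobs start. The sketch below parses entries in the format Matthew quoted and reports how long after a reference time each one was logged. The 06:25 start time is an assumption based on Debian's stock /etc/crontab, so check the actual value on the box; the function names are made up for illustration.

```python
import re
from datetime import datetime, time

# Apache 2.2 error log prefix, e.g. "[Sun Oct 21 06:26:54 2007] [warn] ..."
ENTRY = re.compile(r"^\[(?P<stamp>[^\]]+)\] \[(?P<level>\w+)\] (?P<msg>.*)$")

# Assumption: cron.daily (and so logrotate) starts at 06:25, as in Debian's
# default /etc/crontab. Adjust to match the machine being checked.
CRON_DAILY_START = time(6, 25)


def parse_entry(line):
    """Return (timestamp, level, message) for one error log line, or None."""
    match = ENTRY.match(line.strip())
    if match is None:
        return None
    stamp = datetime.strptime(match.group("stamp"), "%a %b %d %H:%M:%S %Y")
    return stamp, match.group("level"), match.group("msg")


def seconds_after_cron(stamp, start=CRON_DAILY_START):
    """Seconds between the assumed cron.daily start and the log entry."""
    reference = datetime.combine(stamp.date(), start)
    return (stamp - reference).total_seconds()


if __name__ == "__main__":
    sample = [
        "[Sun Oct 21 06:26:54 2007] [warn] child process 11300 still did not exit, sending a SIGTERM",
        "[Sun Oct 21 06:27:01 2007] [notice] caught SIGTERM, shutting down",
    ]
    for line in sample:
        stamp, level, msg = parse_entry(line)
        print(f"{seconds_after_cron(stamp):6.0f}s after cron.daily: [{level}] {msg}")
```

Entries that land a minute or two after the cron.daily start fit the logrotate explanation, although they do not prove it.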

> Secondly, I really ought to have some mechanism to check on Apache (and MySQL) periodically and make sure they're running, either restarting them or alerting me (although I can be difficult to get hold of). What do people recommend for this? Is it as simple as having cron run a script to start the daemon if it's not already on the stored PID, or are there issues I should be wary of?

I'd test the services as opposed to the PIDs that you think they should have... throw a test page up on apache with some known content and then test against that every once in a while... I'm also not one for automatically restarting services - it masks potential issues which then may go unnoticed for a long time. We use Nagios + Clickatell (SMS service) to monitor our systems - with a period between about 11.30pm and 6.30am where SMSes aren't sent (thank god!).

Hope that helps,

-- Brett Parker
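Brett's suggestion of testing the service rather than the PID could be sketched roughly as follows. This is not his Nagios setup, just a minimal stand-alone illustration: the URL and marker text are hypothetical placeholders for a test page you would create yourself, and the quiet-hours window approximates the one he mentions.

```python
import http.client
import urllib.request
from datetime import time

# Hypothetical values: a test page with known content that you put up yourself.
TEST_URL = "http://localhost/healthcheck.html"
EXPECTED = "apache-is-alive"

# Quiet period for SMS alerts, roughly as described above (23:30 to 06:30).
QUIET_START = time(23, 30)
QUIET_END = time(6, 30)


def page_is_healthy(url=TEST_URL, expected=EXPECTED, timeout=10):
    """Fetch the test page and check that it contains the known content."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace")
    except (OSError, http.client.HTTPException):
        # Covers refused connections, timeouts, HTTP error statuses and
        # malformed responses: all of them count as "not healthy".
        return False
    return expected in body


def in_quiet_hours(now, start=QUIET_START, end=QUIET_END):
    """True if `now` (a datetime.time) is inside a window that spans midnight."""
    return now >= start or now < end
```

A cron job could call page_is_healthy() every few minutes and send an alert only when it returns False and in_quiet_hours() says it is acceptable to do so. Whether to hold alerts back overnight or escalate them another way is a policy decision the sketch does not make.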

Safe Hammad

4:49 p.m.

> Hmm, that looks at around the right time for a logrotate, which includes a graceful to apache... I'd guess that there was something making apache very very unhappy - but it's hard to tell from that!
>
> > Secondly, I really ought to have some mechanism to check on Apache (and MySQL) periodically and make sure they're running, either restarting them or alerting me (although I can be difficult to get hold of). What do people recommend for this? Is it as simple as having cron run a script to start the daemon if it's not already on the stored PID, or are there issues I should be wary of?
>
> I'd test the services as opposed to the PIDs that you think they should have... throw a test page up on apache with some known content and then test against that every once in a while... I'm also not one for automatically restarting services - it masks potential issues which then may go unnoticed for a long time. We use nagios + clickatell (sms service) to monitor our systems - with a period between about 11.30pm and 6.30am where smses aren't sent (thank god!).

I've battled with exactly this on a virtual server I look after. There are numerous bug reports, fixes and workarounds out there for the taking. The only certain thing is that YMMV.

My fix was probably related to the fact that apache is running in a virtual server, but here it is FWIW:

/tmp was defined as a tmpfs in /etc/fstab. After commenting this out (bearing in mind the performance impact) and rebooting, I've not had any problems (yet?).
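To see whether a machine has the same /tmp setup, the fstab entries can be inspected programmatically. The sketch below works on fstab text passed in as a string; the sample entries are hypothetical, and the function names are invented for illustration. It only reports the configuration and says nothing about whether tmpfs is actually the cause of a given failure.

```python
def fstab_entries(text):
    """Yield (device, mount_point, fs_type) for each entry in fstab text."""
    for line in text.splitlines():
        # Drop comments, so a commented-out entry is ignored.
        line = line.split("#", 1)[0].strip()
        fields = line.split()
        if len(fields) >= 3:
            yield fields[0], fields[1], fields[2]


def tmp_is_tmpfs(text):
    """True if the fstab text mounts /tmp as tmpfs."""
    return any(mount == "/tmp" and fs_type == "tmpfs"
               for _, mount, fs_type in fstab_entries(text))


if __name__ == "__main__":
    # Hypothetical fstab contents; read /etc/fstab on a real system instead.
    sample = (
        "proc   /proc  proc   defaults  0 0\n"
        "tmpfs  /tmp   tmpfs  defaults  0 0\n"
    )
    print("/tmp on tmpfs:", tmp_is_tmpfs(sample))
```

Note that /etc/fstab describes what is mounted at boot. On a running Linux system, /proc/mounts shows what is mounted right now and uses the same first three columns.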

Here's where I started my journey: http://www.bytemark.co.uk/page/Live/support/tech/inside/apachereload

HTH

Safe

Mark Rogers

1 Nov 2007

10:26 a.m.

Safe Hammad wrote:

> I've battled with exactly this on a virtual server I look after. There are numerous bug reports, fixes and workarounds out there for the taking. The only certain thing is that YMMV.

We've also had this on an Ubuntu server.

Our issue appears to have been related to the time taken to restart Apache. For us the fix was simple: in /etc/logrotate.d/apache2 change

/etc/init.d/apache2 restart > /dev/null

to

/etc/init.d/apache2 reload > /dev/null
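If the same change has to be made on several machines, it can be scripted. The following is only a sketch: it rewrites the text of a logrotate configuration held in a string, the sample fragment is hypothetical (the real file may look different), and the result should be reviewed before anything is written back to /etc/logrotate.d/apache2.

```python
import re

# Matches the init script invocation followed by the word "restart".
RESTART = re.compile(r"(/etc/init\.d/apache2\s+)restart\b")


def use_reload(config_text):
    """Return the config with 'apache2 restart' replaced by 'apache2 reload'."""
    return RESTART.sub(r"\1reload", config_text)


if __name__ == "__main__":
    # Hypothetical postrotate fragment, for illustration only.
    sample = (
        "postrotate\n"
        "    /etc/init.d/apache2 restart > /dev/null\n"
        "endscript\n"
    )
    print(use_reload(sample))
```

Running the function on a file that already uses reload leaves it unchanged, so it is safe to apply more than once.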

I'll regret saying this, but our server which was dying on about 1 in every 2 log rotates hasn't died since we made this change. I hope saying that doesn't jinx us!

You'll find a bit more info here: https://bugs.launchpad.net/ubuntu/+source/apache2/+bug/111709 including some comments from me from when I was having, and then fixing, this problem. See also: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=400455

-- Mark Rogers // More Solutions Ltd (Peterborough Office) // 0845 45 89 555 Registered in England (0456 0902) at 13 Clarke Rd, Milton Keynes, MK1 1LG

MJ Ray

24 Oct 2007

1:53 p.m.

"Matthew" matthew@somewhatunlikely.com wrote: [...]

> Secondly, I really ought to have some mechanism to check on Apache (and MySQL) periodically and make sure they're running, either restarting them or alerting me (although I can be difficult to get hold of). What do people recommend for this? Is it as simple as having cron run a script to start the daemon if it's not already on the stored PID, or are there issues I should be wary of?

You can do it that way, but there are many pitfalls to beware of, and such auto-starters can make mild problems much, much worse in some situations. Nevertheless, I often use monit for such tasks.

Hope that helps,

-- MJ Ray http://mjr.towers.org.uk/email.html tel:+44-844-4437-237 - Webmaster-developer, statistician, sysadmin, online shop builder, consumer and workers co-operative member http://www.ttllp.co.uk/ - Writing on koha, debian, sat TV, Kewstoke http://mjr.towers.org.uk/
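One of the pitfalls of an auto-starter is that it can restart a broken service over and over, hiding the fault or making it worse. A common mitigation, which is an assumption of this sketch rather than something proposed in the thread, is to cap the number of automatic restarts in a time window and switch to alerting once the cap is reached. The limit, window and function name below are hypothetical.

```python
RESTART_LIMIT = 3          # hypothetical: at most 3 automatic restarts...
WINDOW_SECONDS = 3600      # ...within any one-hour window


def decide(service_up, restart_times, now,
           limit=RESTART_LIMIT, window=WINDOW_SECONDS):
    """Return 'ok', 'restart' or 'alert' for one watchdog run.

    restart_times is a list of earlier automatic restart times (seconds
    since the epoch); now is the current time in the same units.
    """
    if service_up:
        return "ok"
    recent = [t for t in restart_times if now - t < window]
    if len(recent) >= limit:
        # Restarting again is unlikely to help and may hide the real fault.
        return "alert"
    return "restart"
```

The caller is expected to record the time of each restart it performs and pass that history back in on the next run, for example by keeping it in a small state file.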

Wayne Stallwood

25 Oct 2007

12:42 a.m.

On Wed, 2007-10-24 at 13:04 +0100, Matthew wrote:

> Secondly, I really ought to have some mechanism to check on Apache (and MySQL) periodically and make sure they're running, either restarting them or alerting me (although I can be difficult to get hold of). What do people recommend for this? Is it as simple as having cron run a script to start the daemon if it's not already on the stored PID, or are there issues I should be wary of?

Seconding what Brett said: back around 2000-2001, when I was responsible for a subscription-based online game server and some e-commerce running at a co-lo, I had a script running from another location that connected to various ports to gather known content or responses. If it didn't like what it saw, it emailed a one-liner (mysql on host x down) to an SMS gateway we ran.

Of course I thought this was really clever until the first time the damn thing went off at 3 AM.

The advantage, of course, is that there are several ways for the service to be down while the daemon is still running, connectivity or routing issues included. Also, if you monitor this from the same box, what happens if the whole box goes down?
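A remote check along the lines Wayne describes might look something like the sketch below, meant to be run from a different machine. It only tests whether each port accepts a TCP connection, which is weaker than his approach of checking for known content or responses. The host names are hypothetical, and delivering the alert lines by email or SMS is left out.

```python
import socket

# Hypothetical hosts and ports; replace with the services you care about.
CHECKS = [
    ("apache", "www.example.com", 80),
    ("mysql", "db.example.com", 3306),
]


def port_is_open(host, port, timeout=5.0):
    """Try a TCP connection; True if something accepts it in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, timed out or unresolvable all count as down.
        return False


def alert_line(name, host):
    """One-line message in the style quoted above."""
    return f"{name} on host {host} down"


def run_checks(checks=CHECKS):
    """Return the alert lines for every check that failed."""
    return [alert_line(name, host)
            for name, host, port in checks
            if not port_is_open(host, port)]
```

Because the check runs elsewhere, it also notices when the whole box or its network connection disappears, which a local cron job cannot report.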


main@lists.alug.org.uk
