Ever since I built my latest box I've had *very* occasional total crashes. By "very occasional" I mean once a week or even less. They only ever occurred when I was actively using the system; it runs all the time as a disk server for our home LAN and never died in the middle of the night or while we were away.
Just recently (yesterday and the day before) it crashed more frequently, two or three times, and I decided to try to work out what the problem was. A quick Google showed that I was getting kernel panics: when the system froze, the Caps Lock and Scroll Lock lights flashed, which apparently indicates a kernel panic. There was absolutely no clue in the logs anywhere, not a murmur of anything wrong before the burst of messages for the reboot.
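For anyone wanting to repeat the log check, here is a minimal sketch of how one might scan a kernel log for the usual signs of trouble. The keyword list and the function name `find_panic_hints` are illustrative assumptions, not a complete catalogue of what a kernel can print, and a hard freeze (as in this case) may well leave nothing behind to find.

```python
import re

# Assumed list of phrases that commonly show up around kernel trouble;
# it is illustrative only and far from exhaustive.
PANIC_HINTS = re.compile(
    r"kernel panic|\boops\b|\bBUG:|machine check|call trace",
    re.IGNORECASE,
)


def find_panic_hints(lines):
    """Return (line_number, text) pairs for lines matching a trouble keyword."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if PANIC_HINTS.search(line)
    ]
```

On an Ubuntu box the kernel log usually lives in `/var/log/kern.log` (an assumption about the setup), so the function can be fed an open file, for example `find_panic_hints(open("/var/log/kern.log", errors="replace"))`.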
A while ago (when I first built the system) it seemed as if it might be an Intel video driver issue, but there have been a few updates since then, so I decided to see if there might be a hardware problem.
First I looked at the temperatures. Nothing untoward there: the CPU was at 32 degrees and the motherboard at 20 degrees.
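The temperature check can also be scripted. The sketch below assumes a kernel that exposes sensors through the hwmon sysfs interface, where each `temp*_input` file holds a reading in millidegrees Celsius. The function names and the 70 degree default limit are made up for illustration; sensible limits depend on the hardware.

```python
from pathlib import Path


def read_hwmon_temps(root="/sys/class/hwmon"):
    """Return {"chip/sensor": degrees_celsius} for every temp*_input under root."""
    temps = {}
    for chip in sorted(Path(root).glob("hwmon*")):
        name_file = chip / "name"
        # Fall back to the directory name if the chip does not report a name.
        chip_name = name_file.read_text().strip() if name_file.exists() else chip.name
        for sensor in sorted(chip.glob("temp*_input")):
            try:
                millidegrees = int(sensor.read_text().strip())
            except (OSError, ValueError):
                continue  # unreadable or malformed sensor file, skip it
            label = sensor.name[: -len("_input")]
            temps[f"{chip_name}/{label}"] = millidegrees / 1000.0
    return temps


def too_hot(temps, limit_c=70.0):
    """Return only the readings above limit_c (the default is an arbitrary example)."""
    return {name: value for name, value in temps.items() if value > limit_c}
```

Calling `too_hot(read_hwmon_temps())` on a healthy machine should return an empty dictionary; on a system without hwmon support `read_hwmon_temps()` simply finds nothing.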
Then I tried running the memtest you can get to from the Grub menu. Aha! Test 5 (the block copy test) produced some errors. I tried moving the DIMMs around to see if that helped, but it just moved the errors around and didn't fix them, so it seemed there was a real error.
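To give a feel for what a block copy test does, here is a much simplified sketch: fill a buffer with a known pattern, copy it back and forth one block at a time, then check that nothing changed. This is not memtest's actual algorithm, and a Python script running under an operating system cannot exercise physical RAM the way memtest does; the names, sizes and pattern are all invented for illustration.

```python
def pattern_byte(i):
    """Deterministic test pattern (an arbitrary choice; repeats every 256 bytes)."""
    return (i * 167 + 13) & 0xFF


def find_mismatches(data, expected):
    """Return the offsets at which two equal-length byte strings differ."""
    return [i for i, (a, b) in enumerate(zip(data, expected)) if a != b]


def block_copy_test(size=1 << 20, block=4096, rounds=8):
    """Copy a known pattern between two halves of a buffer, block by block,
    and return the offsets that no longer match the pattern afterwards."""
    expected = bytes(pattern_byte(i) for i in range(size))
    buf = bytearray(2 * size)
    buf[:size] = expected
    src, dst = 0, size
    for _ in range(rounds):
        for off in range(0, size, block):
            n = min(block, size - off)  # the last block may be shorter
            buf[dst + off:dst + off + n] = buf[src + off:src + off + n]
        src, dst = dst, src  # the fresh copy becomes the next source
    return find_mismatches(buf[src:src + size], expected)
```

An empty list from `block_copy_test()` means every copied byte still matched the pattern. The real test matters because it moves large amounts of data through the memory hardware, which is the kind of load that showed up the errors here.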
Then I looked at the Asus web site (and my motherboard manual) to see if there were any clues there. The latest BIOS upgrade says "1. Improve the compatibility with some memory." Hmm, I wonder if that means me?
So a BIOS upgrade seemed a good idea. First looks suggested that it might be a bit difficult because I don't have a floppy disk drive in the system. However, I was pleasantly surprised to find that the BIOS upgrade utility built into the system BIOS, which you get to by hitting ALT/F2 at boot, can read the BIOS file from CD and USB as well as from a floppy (this wasn't very well documented anywhere), so I needed neither a floppy disk nor an MS operating system. I just wrote the updated BIOS file to a CD and the system did the rest.
... and now I appear to have error-free memory according to memtest! :-)
It remains to be seen whether it *was* the memory problem causing my kernel panics, but it does seem likely.
So, thank you Asus for finding the bug and fixing it, and for providing utilities etc. that will work on Linux. They also have Linux drivers on their web site (not that I needed them; Ubuntu detected everything without problems).