Wayne Stallwood wrote:
Thanks Jon for your response, that was not a feature I was aware of.
Anyway, after taking the machine in question offline over the weekend and running memtest, we found an error on one of the DIMMs (it took 6 passes before it started to show). Given that we can't find much else wrong, the theory is that an intermittent/thermal memory issue may have caused the disk corruption: although this server is ECC capable, it did not have ECC RAM installed.
Running now with new memory and crossed fingers.
Wayne,
I have a live mail filter that exhibits this behaviour on one partition, and I can't afford to take it offline for 5 minutes, never mind a weekend. It filters well over 500K messages a month of customers' email (yes, the second one is nearly ready!).
I'd like to hear if your new memory fixes the problem, as the only cure at the moment is a cron reboot at 4am.
Cheers, Laurie.