Revisiting the 1st post in this thread...
On 08/10/13 16:52, Mark Rogers wrote:
I have 4x2TB disks configured for RAID5.
Initially they were in a USB3 external caddy but this never worked correctly - the raid kept dropping offline before it completed building the array.
So at this point it could be disk, or caddy, or USB issue...
I then switched to eSATA (same caddy)
So at this point it could be disk or caddy, but not USB...
and that improved things but I still failed to build the array completely. So they're now in a new HP microserver.
New microserver? New as in brand new? If yes, then seeing the same problems means it's unlikely to be the new server or the caddy, which leaves the disk as the suspect... unless it's a compatibility issue between your OS and the disk, or between the RAID software and the disk.
Until now I assumed the issues were connectivity but since I still have the same problem there must be a disk issue of some kind.
Seems to be a fair conclusion.
However SMART is reporting healthy even after longer self tests.
Which, to me, seems OK.
Do I just write this off as a duff disk or can I investigate this further?
syslog reports thus:
Oct 4 01:30:09 backup kernel: [49309.671201] sd 4:0:0:0: [sdd] Unhandled error code
Oct 4 01:30:09 backup kernel: [49309.671217] sd 4:0:0:0: [sdd]
Oct 4 01:30:09 backup kernel: [49309.671222] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
Oct 4 01:30:09 backup kernel: [49309.671229] sd 4:0:0:0: [sdd] CDB:
Oct 4 01:30:09 backup kernel: [49309.671233] Read(10): 28 00 d9 2e b1 f0 00 04 00 00
Oct 4 01:30:09 backup kernel: [49309.671253] end_request: I/O error, dev sdd, sector 3643716080
Oct 4 01:30:09 backup kernel: [49309.671264] md/raid:md0: read error not correctable (sector 3643714032 on sdd1).
Oct 4 01:30:09 backup kernel: [49309.671274] md/raid:md0: Disk failure on sdd1, disabling device.
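As a sanity check on that log, the Read(10) CDB bytes can be decoded by hand to confirm they point at the same sector the kernel complains about. A rough Python sketch follows; the function name decode_read10 is my own and not from any library, and it only assumes the standard Read(10) layout (opcode 0x28, a 4-byte big-endian block address in bytes 2-5, a 2-byte transfer length in bytes 7-8).

```python
def decode_read10(cdb_hex):
    """Decode a SCSI Read(10) CDB given as space-separated hex bytes.

    Returns (lba, blocks). Hypothetical helper for illustration only.
    """
    cdb = bytes.fromhex(cdb_hex.replace(" ", ""))
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a Read(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")     # bytes 2-5: logical block address
    blocks = int.from_bytes(cdb[7:9], "big")  # bytes 7-8: transfer length in blocks
    return lba, blocks


# CDB copied from the syslog excerpt above.
lba, blocks = decode_read10("28 00 d9 2e b1 f0 00 04 00 00")
print(lba, blocks)       # 3643716080 1024
# Difference between the whole-disk sector (sdd) and the md-reported sector (sdd1).
print(lba - 3643714032)  # 2048
```

So the failed command was a 1024-block read starting at exactly the sector in the "I/O error" line. The 2048-sector difference from the figure md reports for sdd1 would fit a partition that starts at sector 2048, though that is my inference and not something the log states.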
[SNIP] So it works, then it stops working. Could it be a timeout issue? I.e. it writes to the disk, then spends a while writing to the other one; in the meantime, the first one has powered down and won't spin back up again?
Try disabling power saving, disks spinning down etc. Grasping at straws, is the power supply at your location reliable? Could it be brown-outs messing things up? Are you using the latest O/S, with all the patches installed?
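On the timeout theory: since the log says DRIVER_TIMEOUT, it may be worth looking at how long the kernel waits for the disk before giving up on a command. A minimal Python sketch, assuming the usual Linux sysfs layout where the value lives in /sys/block/<dev>/device/timeout (in seconds); read_scsi_timeout is a hypothetical helper name, and "sdd" is just the device from your log.

```python
import os


def read_scsi_timeout(dev, sysfs_root="/sys"):
    """Return the kernel command timeout in seconds for a disk, or None.

    Assumes the standard sysfs path block/<dev>/device/timeout; returns
    None if the file is missing, unreadable, or not an integer.
    """
    path = os.path.join(sysfs_root, "block", dev, "device", "timeout")
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


# Prints the timeout for sdd, or None if the device is not present.
print(read_scsi_timeout("sdd"))
```

As far as I know the usual default is 30 seconds. If a drive that has spun down needs longer than that to come back, a timeout like the one in your log would be one possible outcome, but this is a guess to test and not a diagnosis.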
Good luck!
Steve