On 20 October 2013 16:33, Wayne Stallwood <ALUGlist@digimatic.co.uk> wrote:
Unfortunately because drive sizes have increased faster than the uncorrected read error rate, the statistical likelihood of recovering completely from a failed drive when you have member sizes of say 2TB is now so low there is almost no point counting RAID 5 as fault tolerant.
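Putting a rough number on that, if my arithmetic is right: consumer SATA drives are commonly specced at one unrecoverable read error per 10^14 bits read (that's the usual vendor figure, not something I've measured). Rebuilding, say, a four-disk RAID5 of 2TB members means reading all 6TB on the three survivors, which works out at roughly a 38% chance of hitting at least one URE somewhere during the rebuild. A quick sanity check:

  awk 'BEGIN {
    p = 1e-14            # assumed URE rate: 1 per 10^14 bits read
    bits = 3 * 2e12 * 8  # three surviving 2TB members, in bits
    printf "P(at least one URE during rebuild) ~ %.0f%%\n", (1 - exp(-bits * p)) * 100
  }'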
There are lots of ways data can get lost that RAID of any kind doesn't help with (viruses, accidental deletion, etc.). In my view, RAID is a convenience that can save the hassle of rebuilding everything from backups. Of course the backups are on RAID too, but at least there are then multiple copies.
RAID1 has the advantage, from a data-recovery point of view, that each disk should contain all the data, so tools like photorec stand a decent chance of recovering a lot of it even after a failure. RAID5 stripes the data across members, so a lone disk doesn't give you that.
I.e.: if a disk fails, you have a copy on the second. If that also fails, you have your backups. If those are out of date or have failed too, you still have two disks you stand a good chance of recovering data from. If not, well...
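If it ever comes to that, I'd image the failing member first and carve from the image, rather than hammering the dying disk with photorec directly. Only a sketch, with /dev/sdb and the filenames as placeholders:

  # grab the easy sectors first, skipping the slow handling of bad areas
  ddrescue -n /dev/sdb sdb.img sdb.map
  # then go back and retry the bad areas a few times
  ddrescue -r3 /dev/sdb sdb.img sdb.map
  # finally carve files out of the image (interactive)
  photorec sdb.img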
I'm not really sure what better options there are. In light of Jonathan's comments:
I build storage (SANs) for a living and our most recent software release no longer allows RAID 5 on the large SATA drives, due to the increased risk of a double disk failure during rebuild. It's worth noting the same applies to a double disk RAID 1 set as well.
.. I think maybe I'll change my strategy: rather than having two disks in RAID1 and replacing one every year, I'll add a new disk to the mirror every year until I've reached the capacity of the hardware (4 disks).
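In mdadm terms I think that's just the following (assuming the mirror is /dev/md0 and the new disk's partition is /dev/sdd1, both placeholders):

  # add the new disk, then widen the mirror to use it as an active member
  mdadm --add /dev/md0 /dev/sdd1
  mdadm --grow /dev/md0 --raid-devices=3
  # watch the resync
  cat /proc/mdstat

A 3-way (eventually 4-way) mirror should also survive more simultaneous failures than replacing disks one-for-one would.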
Aside: you will recall the RAID5 array that I was moving everything off due to the disks having 1.8M cycles (cf. an expected lifetime of 300k and a design lifetime of 1M)? One of the disks has started to report errors via SMART. (The data has all been transferred to my 2x3TB RAID1 array; I guess it was probably the process of copying it that triggered the errors SMART detected.) At the moment the RAID5 array is still showing healthy, and indeed even the disk itself (sdc) shows healthy in SMART, but my guess would be that it's on its way out?
smartctl output: http://pastebin.com/ufMMHdFu
syslog: http://pastebin.com/RbXywUui
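The obvious next step is presumably a long self-test and keeping an eye on the usual trouble attributes (sdc here, obviously substitute your own device):

  # kick off a long self-test; takes hours but the disk stays usable
  smartctl -t long /dev/sdc
  # afterwards, check the self-test log and the attributes that matter most
  smartctl -l selftest /dev/sdc
  smartctl -A /dev/sdc | egrep 'Reallocated_Sector|Current_Pending|Offline_Uncorrectable'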
And finally: I just checked the SMART data for my two new ("identical") disks: http://pastebin.com/D09ZkVz9 Can anyone explain why the output from the two disks is so different from each other, given they're the same model?
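One thing that might explain it, I gather: the values naturally diverge with usage (power-on hours, load cycles and so on), and even "identical" drives can ship with different firmware revisions, which can change what gets reported. Worth comparing the identity blocks (device names are placeholders):

  # model, serial and firmware revision for each drive
  smartctl -i /dev/sda
  smartctl -i /dev/sdb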