On 20 October 2013 16:33, Wayne Stallwood <ALUGlist@digimatic.co.uk> wrote:
Unfortunately because drive sizes have increased faster than the uncorrected read error rate, the statistical likelihood of recovering completely from a failed drive when you have member sizes of say 2TB is now so low there is almost no point counting RAID 5 as fault tolerant.
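Putting a rough number on that, if my arithmetic is right: consumer SATA drives are commonly specced at one unrecoverable read error per 10^14 bits read (that's the usual vendor figure, not something I've measured). Rebuilding, say, a four-disk RAID5 of 2TB members means reading all 6TB on the three survivors, which works out at roughly a 38% chance of hitting at least one URE somewhere during the rebuild. A quick sanity check:

  awk 'BEGIN {
    p = 1e-14            # assumed URE rate: 1 per 10^14 bits read
    bits = 3 * 2e12 * 8  # three surviving 2TB members, in bits
    printf "P(at least one URE during rebuild) ~ %.0f%%\n", (1 - exp(-bits * p)) * 100
  }'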
There are lots of ways data can get lost that RAID of any kind doesn't help with (viruses, accidental deletion, etc.). In my view, RAID is a convenience that can save the hassle of rebuilding everything from backups. Of course the backups are on RAID too, but at least there are then multiple copies.
RAID1 has the advantage, from a data-recovery point of view, that each disk should contain all the data, so tools like photorec stand a decent chance of recovering a lot of it even after a failure. RAID5 stripes the data across members, so a lone disk doesn't give you that.
I.e.: if a disk fails, you have a copy on the second. If that also fails, you have your backups. If those are out of date or have failed too, you still have two disks you stand a good chance of recovering data from. If not, well...
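If it ever comes to that, I'd image the failing member first and carve from the image, rather than hammering the dying disk with photorec directly. Only a sketch, with /dev/sdb and the filenames as placeholders:

  # grab the easy sectors first, skipping the slow handling of bad areas
  ddrescue -n /dev/sdb sdb.img sdb.map
  # then go back and retry the bad areas a few times
  ddrescue -r3 /dev/sdb sdb.img sdb.map
  # finally carve files out of the image (interactive)
  photorec sdb.img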
I'm not really sure what better options there are. In light of Jonathan's comments:
I build storage (SANs) for a living and our most recent software release no longer allows RAID 5 on the large SATA drives, due to the increased risk of a double disk failure during rebuild. It's worth noting the same applies to a double disk RAID 1 set as well.
.. I think maybe I'll change my strategy: rather than having two disks in RAID1 and replacing one every year, I'll add a new disk to the mirror every year until I've reached the capacity of the hardware (4 disks).
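In mdadm terms I think that's just the following (assuming the mirror is /dev/md0 and the new disk's partition is /dev/sdd1, both placeholders):

  # add the new disk, then widen the mirror to use it as an active member
  mdadm --add /dev/md0 /dev/sdd1
  mdadm --grow /dev/md0 --raid-devices=3
  # watch the resync
  cat /proc/mdstat

A 3-way (eventually 4-way) mirror should also survive more simultaneous failures than replacing disks one-for-one would.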
Aside: you will recall the RAID5 array that I was moving everything off due to the disks having 1.8M cycles (cf. an expected lifetime of 300k and a design lifetime of 1M)? One of the disks has started to report errors via SMART. (The data has all been transferred to my 2x3TB RAID1 array; I guess it was probably the process of copying it that triggered the errors SMART detected.) At the moment the RAID5 array is still showing healthy, and indeed even the disk itself (sdc) shows healthy in SMART, but my guess would be that it's on its way out?
smartctl output: http://pastebin.com/ufMMHdFu
syslog: http://pastebin.com/RbXywUui
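The obvious next step is presumably a long self-test and keeping an eye on the usual trouble attributes (sdc here, obviously substitute your own device):

  # kick off a long self-test; takes hours but the disk stays usable
  smartctl -t long /dev/sdc
  # afterwards, check the self-test log and the attributes that matter most
  smartctl -l selftest /dev/sdc
  smartctl -A /dev/sdc | egrep 'Reallocated_Sector|Current_Pending|Offline_Uncorrectable'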
And finally: I just checked the SMART data for my two new ("identical") disks: http://pastebin.com/D09ZkVz9 Can anyone explain why the output from the two disks is so different from each other, given they're the same model?
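One thing that might explain it, I gather: the values naturally diverge with usage (power-on hours, load cycles and so on), and even "identical" drives can ship with different firmware revisions, which can change what gets reported. Worth comparing the identity blocks (device names are placeholders):

  # model, serial and firmware revision for each drive
  smartctl -i /dev/sda
  smartctl -i /dev/sdb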