I have an old WD NAS that I'm trying to resurrect.
It's actually mostly working and I can see the web configuration GUI and log in to it with ssh. However, its RAID setup has got broken somehow.
It was configured to use its two 1TB disk drives in JBOD (spanning) mode to give what is supposed to look like one 2TB disk. However, something has gone wrong and it would appear that only one of the two disk drives is configured in the RAID array.
It's not possible to un-RAID it (I hate RAID), so one can't just configure it to have two separate disks. The web configuration GUI just shows that both drives are OK/good but that the second drive isn't joining the RAID configuration properly.
So how is software RAID usually configured and started up? I can't find anything that looks as if it's "start up the RAID software" in any of the init files in /etc. So how's it done?
The 'symptom' I see is that /dev/md4 isn't created and mounted; all the rest are OK (i.e. /dev/md0, /dev/md1, /dev/md2 and /dev/md3).
On 07/03/2020 21:10, Chris Green wrote:
> I have an old WD NAS that I'm trying to resurrect.
> It's actually mostly working and I can see the web configuration GUI and log in to it with ssh. However, its RAID setup has got broken somehow.
> It was configured to use its two 1TB disk drives in JBOD (spanning) mode to give what is supposed to look like one 2TB disk. However, something has gone wrong and it would appear that only one of the two disk drives is configured in the RAID array.
> It's not possible to un-RAID it (I hate RAID), so one can't just configure it to have two separate disks. The web configuration GUI just shows that both drives are OK/good but that the second drive isn't joining the RAID configuration properly.
If it's a RAID device you've bought, then you may be limited to the tools that the supplier provided with it. Worst case is that you have to somehow recover the info off the disk, format it, re-add it to the array, then restore the data.
> So how is software RAID usually configured and started up? I can't find anything that looks as if it's "start up the RAID software" in any of the init files in /etc. So how's it done?
It depends. You can have software RAID, hardware RAID or "fake hardware RAID". Software RAID is controlled by a utility called mdadm and a config file, /etc/mdadm/mdadm.conf. It is usual to control it with mdadm commands rather than messing with the config file.
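For reference, on a stock Linux box the usual way to look at and start software RAID is something like the following (the WD firmware may well do the equivalent from its own startup scripts rather than anything obvious in /etc):

    # show which arrays the kernel has assembled and their state
    cat /proc/mdstat

    # show the details of one running array
    mdadm --detail /dev/md0

    # print ARRAY lines describing the running arrays; these are what would
    # normally go into /etc/mdadm/mdadm.conf if a config file is used at all
    mdadm --detail --scan

    # assemble every array that can be found from the members' on-disk superblocks
    mdadm --assemble --scan

At boot the arrays are normally brought up either by the kernel's RAID autodetect or by an init script/initramfs running mdadm --assemble --scan.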
It may be possible to take the drives out of the WD device, use mdadm to assemble and mount one or both of the drives, rescue the data, then format the bad disk and shove it back into the WD device. There are LOTS of websites on mdadm, e.g. https://raid.wiki.kernel.org/index.php/Linux_Raid
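If you do go down that route on another Linux box, a rough sketch would be something like this (the device names here are only examples and will differ on your system):

    # see whether the plugged-in partition carries a RAID superblock
    mdadm --examine /dev/sdX4

    # assemble a one-disk (degraded) array from that single member and start it anyway
    mdadm --assemble --run /dev/md9 /dev/sdX4

    # mount it read-only while copying the data off
    mount -o ro /dev/md9 /mnt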
Hardware RAID: you're limited to whatever the manufacturer has given you.
Fake hardware RAID: the hardware implements some or all of mdadm's job in "hardware", possibly by putting mdadm into ROM. In effect, it's using mdadm. If it's fake hardware RAID, you may be able to recover things with mdadm.
Good luck.
However, if you "hate RAID", then perhaps don't use it? You could use Linux Logical Volumes (LVM). You can add disks to create a large "pseudo-disk": https://en.wikipedia.org/wiki/Logical_Volume_Manager_%28Linux%29
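A rough sketch of spanning two disks with LVM, purely for illustration (the device, volume group and volume names are made up):

    # mark both data partitions as LVM physical volumes
    pvcreate /dev/sdb1 /dev/sdc1

    # pool them into a single volume group
    vgcreate bigvg /dev/sdb1 /dev/sdc1

    # create one logical volume using all the space and put a filesystem on it
    lvcreate -l 100%FREE -n bigvol bigvg
    mkfs.ext4 /dev/bigvg/bigvol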
Alternatively, try Btrfs, one of the newer file systems. You can add multiple disks to one filesystem and it has a bunch of cool features. However, last time I looked it didn't have a full set of disk recovery tools (fsck etc.) for my liking. https://en.wikipedia.org/wiki/Btrfs
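For what it's worth, the multi-disk case in Btrfs looks something like this (again, the device names are only examples):

    # one filesystem across two disks: data spread as "single", metadata mirrored
    mkfs.btrfs -d single -m raid1 /dev/sdb /dev/sdc

    # mounting either device gives the whole filesystem
    mount /dev/sdb /mnt

    # more disks can be added later and the data rebalanced
    btrfs device add /dev/sdd /mnt
    btrfs balance start /mnt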
Steve
On Sun, Mar 08, 2020 at 04:47:31PM +0000, steve-ALUG@hst.me.uk wrote:
> On 07/03/2020 21:10, Chris Green wrote:
> > I have an old WD NAS that I'm trying to resurrect.
> > It's actually mostly working and I can see the web configuration GUI and log in to it with ssh. However, its RAID setup has got broken somehow.
> > It was configured to use its two 1TB disk drives in JBOD (spanning) mode to give what is supposed to look like one 2TB disk. However, something has gone wrong and it would appear that only one of the two disk drives is configured in the RAID array.
> > It's not possible to un-RAID it (I hate RAID), so one can't just configure it to have two separate disks. The web configuration GUI just shows that both drives are OK/good but that the second drive isn't joining the RAID configuration properly.
> If it's a RAID device you've bought, then you may be limited to the tools that the supplier provided with it. Worst case is that you have to somehow recover the info off the disk, format it, re-add it to the array, then restore the data.
> > So how is software RAID usually configured and started up? I can't find anything that looks as if it's "start up the RAID software" in any of the init files in /etc. So how's it done?
> It depends. You can have software RAID, hardware RAID or "fake hardware RAID". Software RAID is controlled by a utility called mdadm and a config file, /etc/mdadm/mdadm.conf. It is usual to control it with mdadm commands rather than messing with the config file.
Yes, it has software RAID but there are no configuration files named anything like mdadm (or bits of it) in /etc. However, the mdadm command is available.
> It may be possible to take the drives out of the WD device, use mdadm to assemble and mount one or both of the drives, rescue the data, then format the bad disk and shove it back into the WD device. There are LOTS of websites on mdadm, e.g. https://raid.wiki.kernel.org/index.php/Linux_Raid
I considered this (the drives are easy enough to take out) but see below.
> Hardware RAID: you're limited to whatever the manufacturer has given you.
> Fake hardware RAID: the hardware implements some or all of mdadm's job in "hardware", possibly by putting mdadm into ROM. In effect, it's using mdadm. If it's fake hardware RAID, you may be able to recover things with mdadm.
Neither of these.
> Good luck.
> However, if you "hate RAID", then perhaps don't use it? You could use Linux Logical Volumes (LVM). You can add disks to create a large "pseudo-disk": https://en.wikipedia.org/wiki/Logical_Volume_Manager_%28Linux%29
You can't "un-RAID" the NAS, unfortunately. Both drives were already set up as JBOD, so RAID was doing nothing useful except adding a layer of useless complexity.
However, it turned out to be a simple fix. One of the RAID block device nodes had disappeared for some reason: there was no /dev/md4 even though there was /dev/md0 through to /dev/md20 (except for 4). I just used mknod to recreate the /dev/md4 block device and everything works again; both 1TB disk drives seem intact and error-free.
Goodness knows why /dev/md4 disappeared; it's not disappeared again even though I have rebooted the NAS a few times.
Thanks for all the information about RAID, always useful.
On Sun, 8 Mar 2020 at 17:08, Chris Green cl@isbd.net wrote:
> Yes, it has software RAID but there are no configuration files named anything like mdadm (or bits of it) in /etc. However, the mdadm command is available.
I know enough about mdadm that I could probably muddle through if your hardware was in front of me but not enough to explain or help much over email.
But I can say that RAID configuration data is stored within the metadata on the array members themselves, and that whilst mdadm can have a configuration file it doesn't need one.
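That on-disk metadata can be inspected directly, which is how mdadm manages without a config file; for example (the partition name is just an example):

    # dump the RAID superblock from one member partition
    mdadm --examine /dev/sda4

    # list, in mdadm.conf format, every array found from superblocks alone
    mdadm --examine --scan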
What does: cat /proc/mdstat tell you?
On Mon, Mar 09, 2020 at 09:41:12AM +0000, Mark Rogers wrote:
> On Sun, 8 Mar 2020 at 17:08, Chris Green cl@isbd.net wrote:
> > Yes, it has software RAID but there are no configuration files named anything like mdadm (or bits of it) in /etc. However, the mdadm command is available.
> I know enough about mdadm that I could probably muddle through if your hardware was in front of me but not enough to explain or help much over email.
> But I can say that RAID configuration data is stored within the metadata on the array members themselves, and that whilst mdadm can have a configuration file it doesn't need one.
> What does: cat /proc/mdstat tell you?
A 'cat /proc/mdstat' produced:-
Personalities : [linear] [raid0] [raid1]
md4 : active raid1 sda4[0]
      973522816 blocks [2/1] [U_]

md1 : active raid1 sdb2[0] sda2[1]
      256960 blocks [2/2] [UU]

md3 : active raid1 sdb3[0] sda3[1]
      987904 blocks [2/2] [UU]

md2 : active raid1 sdb4[0]
      973522816 blocks [2/1] [U_]

md0 : active raid1 sdb1[0] sda1[1]
      1959808 blocks [2/2] [UU]
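Incidentally, [2/1] [U_] just means an array with two slots of which only one is active. Since md2 and md4 each contain a single data partition from one disk (sdb4 and sda4 respectively), it looks as though the firmware builds its spanned volume out of two one-member mirrors, so that state is probably normal here rather than a fault. It can be cross-checked with, for example:

    # show size, state and members of each data array
    mdadm --detail /dev/md2
    mdadm --detail /dev/md4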
Which led me to the answer (eventually): I was seeing the error (in the GUI) 'Failed to mount /dev/md4', but from the above md4 seemed to be OK. On looking in /dev I saw that /dev/md4 didn't exist; there was just:-
brw-r----- 1 root root 9, 0 Sep 29 2011 /dev/md0
brw-r----- 1 root root 9, 1 Sep 29 2011 /dev/md1
brw-r----- 1 root root 9, 2 Sep 29 2011 /dev/md2
brw-r----- 1 root root 9, 3 Sep 29 2011 /dev/md3
brw-r----- 1 root root 9, 5 Sep 29 2011 /dev/md5
brw-r----- 1 root root 9, 6 Sep 29 2011 /dev/md6
brw-r----- 1 root root 9, 7 Sep 29 2011 /dev/md7
brw-r----- 1 root root 9, 8 Sep 29 2011 /dev/md8
brw-r----- 1 root root 9, 9 Sep 29 2011 /dev/md9
So I added the missing block device using mknod (fortunately I had root shell access to the NAS) and rebooted, and everything now works perfectly. I have no idea why /dev/md4 disappeared though; it has survived several further reboots now, so its disappearance seems to have been a one-off oddity.
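From the listing above the md nodes use major number 9 with the minor number matching the array number, so recreating the missing node was just something along these lines:

    # recreate the block device node for md4: block device, major 9, minor 4
    mknod /dev/md4 b 9 4

    # match the permissions of the other md nodes
    chmod 640 /dev/md4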
Thanks for the help all.
On Mon, 9 Mar 2020 at 11:06, Chris Green cl@isbd.net wrote:
> Personalities : [linear] [raid0] [raid1]
> md4 : active raid1 sda4[0]
>       973522816 blocks [2/1] [U_]
> [...]
> Which led me to the answer (eventually): I was seeing the error (in the GUI) 'Failed to mount /dev/md4', but from the above md4 seemed to be OK. On looking in /dev I saw that /dev/md4 didn't exist, [...]
Very odd; I don't see how mdstat could have been telling you that md4 was active (albeit missing one of its components) without md4 existing, but I'm glad it led you to a solution!