On 15/06/2024 19:38, Phil Ashby wrote:
Firstly, thanks for your help so far, Phil, I really appreciate it. Sadly, the wheeze didn't work.
Position in a nutshell:
I have an LSI megaraid card that has suddenly stopped working after more than a year in a W11 machine.
Said machine will no longer see the card at all, so:
  - I tried a new card
  - I tried a new mobo
No change.
I stuck the card in an old machine from my parts bin with a PCIe slot and it saw the card's BIOS, enabling me to "recover" the RAID.
So that's where I am, and the steps I've followed are below:
I booted the machine with the card installed with both Ubuntu and Gentoo live, and it crapped out with an error on the megaraid_sas module as follows:
---- cut here ----
[ 4.961795] megaraid_sas 0000:01:00.0: Init cmd return status FAILED for SCSI host 6
[ 4.967678] megaraid_sas 0000:01:00.0: Failed from megasas_init_fw 6539
---- cut here ----
I have been completely unable to find a way past this...
Here are all relevant messages from dmesg:
---- cut here ----
[ 4.923697] megaraid_sas 0000:01:00.0: BAR:0x1 BAR's base_addr(phys):0x00000000fe900000 mapped virt_addr:0x(____ptrval____)
[ 4.923707] megaraid_sas 0000:01:00.0: FW now in Ready state
[ 4.924144] megaraid_sas 0000:01:00.0: 63 bit DMA mask and 32 bit consistent mask
[ 4.925232] megaraid_sas 0000:01:00.0: firmware supports msix : (96)
[ 4.929531] megaraid_sas 0000:01:00.0: requested/available msix 5/5 poll_queue 0
[ 4.930028] megaraid_sas 0000:01:00.0: current msix/online cpus : (5/4)
[ 4.930475] megaraid_sas 0000:01:00.0: RDPQ mode : (disabled)
[ 4.930935] megaraid_sas 0000:01:00.0: Current firmware supports maximum commands: 272 LDIO threshold: 237
[ 4.931316] ACPI: video: Video Device [VGA1] (multi-head: yes rom: no post: no)
[ 4.933681] megaraid_sas 0000:01:00.0: Performance mode :Latency (latency index = 1)
[ 4.934186] megaraid_sas 0000:01:00.0: FW supports sync cache : Yes
[ 4.934666] megaraid_sas 0000:01:00.0: megasas_disable_intr_fusion is called outbound_intr_mask:0x40000009
[ 4.961795] megaraid_sas 0000:01:00.0: Init cmd return status FAILED for SCSI host 6
[ 4.967678] megaraid_sas 0000:01:00.0: Failed from megasas_init_fw 6539
[ 5.998178] GPT:Primary header thinks Alt. header is not at the end of the disk.
[ 5.998789] GPT:6862219 != 60062499
[ 5.999361] GPT:Alternate GPT header not at the end of the disk.
[ 5.999936] GPT:6862219 != 60062499
[ 6.000510] GPT: Use GNU Parted to correct GPT errors.
---- cut here ----
# lspci
01:00.0 RAID bus controller: Broadcom / LSI MegaRAID SAS-3 3008 [Fury] (rev 02)
So the system sees the card.
Booting with the 4 SSDs and no RAID card, Ubuntu could see that the 4 SSDs were all part of a RAID, but wouldn't mount them as such.
I rebooted with a Gentoo Live, and it saw the SSDs and assembled the RAID giving me /dev/md126. However, I couldn't actually get at the data (as follows):
---- cut here ----
# fdisk -l
.
.
.
Disk /dev/md126: 1.82 TiB, 1999307276288 bytes, 3904897024 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 65536 bytes / 131072 bytes
Disklabel type: gpt
Disk identifier: E28C6D9E-3DE3-4B20-A8DF-7B5C2300025B

Device        Start        End    Sectors  Size Type
/dev/md126p1     34      32767      32734   16M Microsoft reserved
/dev/md126p2  32768 3904894975 3904862208  1.8T Microsoft basic data

Partition 1 does not start on physical sector boundary.
#
# mkdir /mnt/l
# mount /dev/md126p2 /mnt/l
Found restart area in incorrect position in $LogFile.
The disk contains an unclean file system (0, 0).
Metadata kept in Windows cache, refused to mount.
Falling back to read-only mount because the NTFS partition is in an
unsafe state. Please resume and shutdown Windows fully (no hibernation
or fast restarting.)
Could not mount read-write, trying read-only
# ls -al /mnt/l
total 8
drwxrwxrwx 1 root root    0 Jun 21  2023 '$RECYCLE.BIN'
drwxrwxrwx 1 root root 4096 May 15 02:32 .
drwxr-xr-x 1 root root   60 Jun 17 12:10 ..
drwxrwxrwx 1 root root 4096 Apr 23  2023 Broadcom-RAID-notes
drwxrwxrwx 1 root root    0 Jun 19  2023 Dental
drwxrwxrwx 1 root root    0 Jun 21  2023 JDDental
drwxrwxrwx 1 root root    0 Apr 23  2023 'System Volume Information'
drwxrwxrwx 1 root root    0 Apr 25  2023 WINNT
livecd ~ # ls -al /mnt/l/Dental/
total 164
drwxrwxrwx 1 root root      0 Jun 19  2023 .
drwxrwxrwx 1 root root   4096 May 15 02:32 ..
drwxrwxrwx 1 root root 163840 May 13 22:35 Dental
livecd ~ # ls -al /mnt/l/Dental/Dental/
ls: reading directory '/mnt/l/Dental/Dental/': Input/output error
total 0
---- cut here ----
So I can't get to the data.
I tried to install W11 on this machine, as the RAID was created and the data used under W11, but it just kept telling me that it didn't meet their minimum spec, which isn't true, so I gave up on that. It appears that there is no such thing as a live W11 CD (there might be a bodge-up; will investigate).
Then I tried Phil's suggestion:
# mkdir /mnt/1
# mount /dev/sda1 /mnt/1
mount: /mnt/1: special device /dev/sda1 does not exist.
       dmesg(1) may have more information after failed mount system call.
# sfdisk -l /dev/sda
Disk /dev/sda: 931.51 GiB, 1000204886016 bytes, 1953525168 sectors
Disk model: WDC WDS100T1R0A
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x00000000

Device     Boot Start        End    Sectors Size Id Type
/dev/sda1           1 4294967295 4294967295   2T ee GPT
So, 4294967295*512 = 2199023255040
# losetup -o 2199023255040 -b512 -f /dev/sda
# losetup -a
/dev/loop1: [0005]:173 (/dev/sda), offset 2199023255040
/dev/loop0: [2115]:1002 (/run/initramfs/live/image.squashfs)
# mount -o ro /dev/loop1 /mnt/1
mount: /mnt/1: can't read superblock on /dev/loop1.
       dmesg(1) may have more information after failed mount system call.
#
---- cut here ----
dmesg said:
[ 1466.244037] mount: attempt to access beyond end of device loop1: rw=4096, sector=2, nr_sectors = 2 limit=0
[ 1466.244053] EXT4-fs (loop1): unable to read superblock
---- cut here ----
#
# fdisk -l
...
Disk /dev/md126: 1.82 TiB, 1999307276288 bytes, 3904897024 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 65536 bytes / 131072 bytes
Disklabel type: gpt
Disk identifier: E28C6D9E-3DE3-4B20-A8DF-7B5C2300025B

Device        Start        End    Sectors  Size Type
/dev/md126p1     34      32767      32734   16M Microsoft reserved
/dev/md126p2  32768 3904894975 3904862208  1.8T Microsoft basic data

Partition 1 does not start on physical sector boundary.
---- cut here ----
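
As a sanity check on the losetup offset arithmetic above, here is a minimal shell sketch. It uses only the numbers already shown in the sfdisk output for /dev/sda; the variable names are made up for illustration. It compares the offset that was passed to losetup with the size of the disk, and also prints what the Start column times the sector size would give, purely for comparison.

```shell
# Values copied from the "sfdisk -l /dev/sda" output above.
sector_size=512
disk_bytes=1000204886016     # size of /dev/sda in bytes
part_start=1                 # "Start" column for /dev/sda1
part_sectors=4294967295      # "Sectors" column for /dev/sda1

# Offset as calculated in the text: sector count * sector size.
offset_used=$((part_sectors * sector_size))
# For comparison only: first sector of the entry * sector size.
offset_start=$((part_start * sector_size))

echo "offset used:       $offset_used"
echo "offset from start: $offset_start"
echo "disk size:         $disk_bytes"

# Flag an offset that lies at or beyond the end of the disk.
if [ "$offset_used" -ge "$disk_bytes" ]; then
    echo "offset used is past the end of the disk"
fi
```

The offset used (2199023255040) is larger than the 1000204886016-byte disk, which would be consistent with the loop device reporting limit=0 in dmesg. Note that /dev/sda1 here is the type ee protective MBR entry, so byte 512 is where a GPT header would normally sit, not necessarily the start of a filesystem.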
I've been using clones (byte copy with dd) of the original disks, so I can play around with them, but I can't risk the original disks...
I'm still flummoxed.
Cheers, Laurie.