After a power failure the server rebooted with one software RAID array (md122) in a degraded state:
State : clean, degraded
Number Major Minor RaidDevice State
0 8 1 0 active sync /dev/sda1
- 0 0 1 removed
Running gdisk on the affected drive /dev/sdc reports:
GPT fdisk (gdisk) version 1.0.7
Problem opening /dev/sdc for reading! Error is 2.
The specified file does not exist!
Error 2 (ENOENT) means the device node is gone entirely, so smartctl obviously doesn't respond either, and I can't read the serial number of the defunct disk to identify it.
In the past I lost all data on a RAID array while replacing a drive, so this time I want to be very careful to give mdadm the right instructions in the right order! What is the proper procedure (or where should I look to find it), and how can I identify which drive to replace? All drives in this machine are WD Blue SSDs.
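One way to identify the dead drive without pulling every disk is to list the serial numbers of the drives that still respond; the physical disk whose serial appears in none of the listings is the one to pull. A sketch using standard tools (lsblk, smartctl, and the /dev/disk/by-id symlinks; exact output varies by system):

```shell
#!/bin/sh
# List serials of all drives the kernel still sees:
lsblk -d -o NAME,MODEL,SERIAL

# Per-drive via SMART; the vanished /dev/sdc will simply error out:
for d in /dev/sd?; do
    printf '%s: ' "$d"
    smartctl -i "$d" | grep -i 'serial' || echo '(no response)'
done

# /dev/disk/by-id encodes model and serial in the symlink names:
ls -l /dev/disk/by-id/ | grep -v -- '-part'
```

Matching the surviving serials against the labels on the drives you can reach narrows the failed one down by elimination.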
Thanks. Yes, I had thought of that, but it means pulling all the other disks, because the serial numbers are printed on the inside. I was hoping there might be an easier way! Given unfortunate prior experiences with RAID, I am hoping someone will tell me exactly how to replace this disk. In the past the failed disk didn't go to "removed" status until I issued an mdadm --fail command and then a --remove. Has this been automated?
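For reference, the standard mdadm member-replacement sequence looks roughly like the sketch below. Device names (/dev/md122, /dev/sda, /dev/sdc, /dev/sdc1) are taken from this thread and are assumptions; verify them on your own machine before running anything destructive.

```shell
#!/bin/sh
# 1. Make sure the vanished member is really out of the array. Since the
#    kernel already shows it as "removed", these are likely no-ops, but
#    they are safe: "detached" matches devices that no longer exist.
mdadm /dev/md122 --fail detached
mdadm /dev/md122 --remove detached

# 2. After physically swapping the disk, replicate the partition table
#    from the surviving member's disk onto the new one, then randomize
#    the copy's GUIDs so the two disks don't collide:
sgdisk -R /dev/sdc /dev/sda
sgdisk -G /dev/sdc

# 3. Add the new partition back; the kernel starts the rebuild itself:
mdadm /dev/md122 --add /dev/sdc1

# 4. Confirm the rebuild is running:
cat /proc/mdstat
```

When a drive drops off the bus entirely (as here), md can mark the slot "removed" on its own, which is why no manual --fail was needed this time; the "detached" keyword covers the case where it hasn't.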
Followed those instructions and the RAID array is now active again and resyncing. A number of questions arose from this exercise, but I'll put them in a separate post, as I think many people will run into the same ones.
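For anyone following along, resync progress can be watched with standard commands (array name /dev/md122 as in this thread):

```shell
#!/bin/sh
# /proc/mdstat shows a progress bar and ETA during resync;
# watch refreshes it every 2 seconds until interrupted:
watch -n 2 cat /proc/mdstat

# One-shot detailed view, including the rebuild percentage:
mdadm --detail /dev/md122
```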
Thanks to both of you; without that pointer I might have made a pig’s ear out of the whole recovery.