After a power failure the server rebooted with one software RAID array (md122) in a degraded state:
State : clean, degraded
Number Major Minor RaidDevice State
0 8 1 0 active sync /dev/sda1
- 0 0 1 removed
Running gdisk on the affected drive /dev/sdc reports:
GPT fdisk (gdisk) version 1.0.7
Problem opening /dev/sdc for reading! Error is 2.
The specified file does not exist!
Error 2 (ENOENT) means the device node is gone entirely, so smartctl obviously doesn't respond either, and I can't read the serial number of the defunct disk to identify it.
In the past I lost all data on a RAID array while replacing a drive, so this time I want to be very careful to give mdadm the right instructions in the right order! What is the proper procedure (or where should I look to find it), and how can I identify which drive to replace? All drives in this machine are WD Blue SSDs.
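One way to identify the dead drive without pulling every disk is to list the serial numbers of the drives that still respond; the physical disk whose serial appears in none of the listings is the one to pull. A sketch using standard tools (lsblk, smartctl, and the /dev/disk/by-id symlinks; exact output varies by system):

```shell
#!/bin/sh
# List serials of all drives the kernel still sees:
lsblk -d -o NAME,MODEL,SERIAL

# Per-drive via SMART; the vanished /dev/sdc will simply error out:
for d in /dev/sd?; do
    printf '%s: ' "$d"
    smartctl -i "$d" | grep -i 'serial' || echo '(no response)'
done

# /dev/disk/by-id encodes model and serial in the symlink names:
ls -l /dev/disk/by-id/ | grep -v -- '-part'
```

Matching the surviving serials against the labels on the drives you can reach narrows the failed one down by elimination.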
Thanks. Yes, I had thought of that, but it means pulling all the other disks, because the serial numbers are printed on the inside. I was hoping there might be an easier way! Given unfortunate prior experiences with RAID, I am hoping someone will tell me exactly how to replace this disk. In the past the failed disk didn't go to "removed" status until I issued an mdadm --fail command and then a --remove. Has this been automated?
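For reference, the standard mdadm member-replacement sequence looks roughly like the sketch below. Device names (/dev/md122, /dev/sda, /dev/sdc, /dev/sdc1) are taken from this thread and are assumptions; verify them on your own machine before running anything destructive.

```shell
#!/bin/sh
# 1. Make sure the vanished member is really out of the array. Since the
#    kernel already shows it as "removed", these are likely no-ops, but
#    they are safe: "detached" matches devices that no longer exist.
mdadm /dev/md122 --fail detached
mdadm /dev/md122 --remove detached

# 2. After physically swapping the disk, replicate the partition table
#    from the surviving member's disk onto the new one, then randomize
#    the copy's GUIDs so the two disks don't collide:
sgdisk -R /dev/sdc /dev/sda
sgdisk -G /dev/sdc

# 3. Add the new partition back; the kernel starts the rebuild itself:
mdadm /dev/md122 --add /dev/sdc1

# 4. Confirm the rebuild is running:
cat /proc/mdstat
```

When a drive drops off the bus entirely (as here), md can mark the slot "removed" on its own, which is why no manual --fail was needed this time; the "detached" keyword covers the case where it hasn't.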
Followed those instructions and the RAID array is now active again and resyncing. A number of questions arose from this exercise, but I'll put them in a separate post, as I think many people will run into the same ones.
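For anyone following along, resync progress can be watched with standard commands (array name /dev/md122 as in this thread):

```shell
#!/bin/sh
# /proc/mdstat shows a progress bar and ETA during resync;
# watch refreshes it every 2 seconds until interrupted:
watch -n 2 cat /proc/mdstat

# One-shot detailed view, including the rebuild percentage:
mdadm --detail /dev/md122
```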
Thanks to both of you; without that pointer I might have made a pig’s ear out of the whole recovery.