Rocky Linux 9.3: NVMe device order changes every reboot

I ran into the following situation:

  1. Ten U.2-type NVMe drives are installed.
  2. Rocky Linux 9.3 is installed, kernel version 5.14.0-362.18.1.el9_3.x86_64.
  3. The NVMe device order changes with each reboot.
  4. On Rocky Linux 9.2, kernel version 5.14.0-284.11.el9_2.x86_64, the NVMe device order was stable without any special settings.

Settings and Rules

  1. udev rules (udevadm) cannot be used.
  2. The ASPM option is disabled in the BIOS.
  3. After a reboot, the devices in `nvme list` must appear in the same order as NVMe bays 0 through 9.
  4. SR-IOV is disabled.
  5. UUIDs cannot be used.

I know that a UUID- or udev-based approach is the realistic solution, but the customer says those are difficult to use and manage, so I am looking for other options. The issue is still unresolved, which is why I am posting this question.

I have tried applying a number of kernel options to fix the shuffled `nvme list` ordering, but the symptoms do not improve.

I ask for your help.
Thank you.

For a while now, mounting partitions in /etc/fstab has used UUID=, which solves the problems caused by changing disk order. Obviously, if they are using /dev/nvmeXXXX paths in fstab, this will be a problem, which is why mounting disks in fstab via UUID has long been the preferred approach.

Can we ask what exactly the problem is when the disk ordering changes? If fstab is configured correctly with UUIDs, the system will always boot, and in that situation the device ordering change is purely cosmetic. So I'm curious what the actual problem is.

Hi iwalker

Thank you for your response.

Except for nvme0n1, where the OS is installed, the remaining NVMe drives have no partitions and are not registered in fstab.

The output of the `nvme list` command must remain constant across reboots.

This is the customer’s testing process.

That is their mistake.

Names like sda and eth0 are assigned during device enumeration at boot. They are not persistent. The observation that they "were always the same in the past" is pure luck. One can roll a nat 20 many times in a row.

Filesystem UUIDs, LVM, RAID, and MAC addresses, among others, are identifiers stored on the devices themselves. They are persistent. The question is: do those blank devices have any?

Do run:

ls -l /dev/disk/by-*/

If that output contains any persistent IDs, then tell the customer to use those.
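To make the resolution step concrete: each entry under /dev/disk/by-id is a symlink whose name encodes the model and serial, pointing at whatever kernel name the drive received on this boot. Below is a minimal sketch that simulates that layout in a temporary directory, so the resolution step is visible without real hardware; the by-id name is hypothetical, and on a real system you would use an actual entry from /dev/disk/by-id instead.

```shell
#!/bin/sh
# Simulate a /dev/disk/by-id style layout in a temp dir. On a real system,
# replace "$byid" with /dev/disk/by-id and use the real link names from ls.
byid=$(mktemp -d)
# Stand-in for a real by-id symlink; the target would be e.g. ../../nvme2n1.
ln -s /dev/null "$byid/nvme-FAKE_MODEL_SERIAL0001"
# readlink -f resolves the stable name to this boot's kernel device node.
dev=$(readlink -f "$byid/nvme-FAKE_MODEL_SERIAL0001")
echo "$dev"
```

On real hardware the same `readlink -f` call turns the stable by-id name into the current /dev/nvmeXnY node, so a script can always address the same physical drive no matter how enumeration ordered them.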

If the issue is that a random order is random, then the solution is not necessarily to enforce an arbitrary order, but to find the least inconvenient way to live with the chaos.

You might want to look at the nvme-cli package. With it you can run `nvme id-ctrl -v /dev/nvmeNNNN` to gather information about a disk, which might help you perform whatever task you are trying to do.
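As a sketch of pulling the stable identifier out of that output: `nvme id-ctrl` prints an `sn` line containing the serial number burned into the drive, which never changes across reboots. The sample text below is made up, but the `sn : …` line shape matches nvme-cli's output, so the same awk extraction should work on live output.

```shell
#!/bin/sh
# Hypothetical excerpt of `nvme id-ctrl /dev/nvme0` output.
sample='NVME Identify Controller:
vid       : 0x144d
sn        : S4EWNX0N123456
mn        : SAMSUNG MZWLL1T6HAJQ-00005'
# The sn line splits into fields "sn", ":", serial -> print the third field.
serial=$(printf '%s\n' "$sample" | awk '$1 == "sn" {print $3}')
echo "$serial"
```

On a live system you could loop over the controller nodes (for example `for c in /dev/nvme[0-9]; do nvme id-ctrl "$c" | awk '$1 == "sn" {print $3}'; done`) to map each boot's kernel names back to physical drives by serial.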

I use it for Packer builds where I need to partition drives in AWS and want to map the NVMe device name to what Amazon lists as the device name.

And why don't you just use the "label" option? It is reliable and very simple to use.
It is also available in grub2 commands.

Even though I forgot labels in my list, it does not change anything. A label and a UUID are both properties of a filesystem. The OP does not have filesystems on the block devices, not even a partition table from which to get a partition UUID.

I don't think so. If GRUB sees the labels, no OS has been launched yet.

What "labels"? GRUB can certainly search by filesystem label, but it cannot find labels on blank drives.

Then what are we discussing?
If the disk is blank, there is no need to address it in any way, whether by UUID, /dev/ path, label, or anything else (I don't know what else it could be).
And once you create a partition, it is given a UUID, and you can label it and use it.
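To illustrate that last point without touching a real drive, here is a minimal sketch using a loopback image file as a stand-in for a freshly created partition: create a filesystem, give it a label, and read the label back with blkid, just as one would on something like /dev/nvme1n1p1. The label name is made up; the commands assume e2fsprogs and util-linux are installed.

```shell
#!/bin/sh
# A small image file standing in for a freshly created partition.
img=$(mktemp)
truncate -s 64M "$img"
# -F: allow operating on a regular file; -L: set the filesystem label.
mkfs.ext4 -q -F -L nvme-bay1 "$img"
# blkid reads the label straight from the filesystem superblock.
label=$(blkid -s LABEL -o value "$img")
echo "$label"
rm -f "$img"
```

Once the real partition carries a label, an fstab line such as `LABEL=nvme-bay1 /mnt/bay1 ext4 defaults 0 0` mounts it regardless of which /dev/nvmeXnY name the kernel assigned on that boot.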

I do agree with that logic.

That you would have to ask @dadf.

Thank you for your many answers to solve the problem.

The above is not applicable due to circumstances.

I don't understand, but I can't help it. I'll just close this case.